DeepsetFileUploader
Takes documents from a pipeline and uploads them to deepset AI Platform as TXT files.
Basic Information
- Type: deepset_cloud_custom_nodes.augmenters.deepset_file_uploader.DeepsetFileUploader
- Components it can connect with:
  - Components accepting a list of Document objects as input and output.
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | The documents to upload. |
| raise_on_failure | bool | False | Whether to raise an error if the documents fail to upload. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | A list of uploaded documents. |
Overview
DeepsetFileUploader is used in indexes. You can pair it with a web crawling component to upload the crawled documents to a workspace, or use it to upload any other documents the index creates. Use the workspace parameter to set the workspace where the created files are saved.
Usage Example
Initiating the Component
components:
  DeepsetFileUploader:
    type: deepset_cloud_custom_nodes.augmenters.deepset_file_uploader.DeepsetFileUploader
    init_parameters:
Using the Component in a Pipeline
This is an example of an index where DeepsetFileUploader receives documents from DeepsetFirecrawlWebScraper, uploads them to deepset AI Platform, and sends them to DocumentWriter to write into a document store:

components:
  FileClassifier:
    type: haystack.components.routers.file_type_router.FileTypeRouter
    init_parameters:
      mime_types:
        - text/csv
        - application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
  OutputAdapter:
    type: haystack.components.converters.output_adapter.OutputAdapter
    init_parameters:
      template: |-
        {% set str_list = [] %}
        {% for document in documents %}
          {% set _ = str_list.append(document.content) %}
        {% endfor %}
        {{ str_list }}
      output_type: typing.List[str]
  DocumentWriter:
    type: haystack.components.writers.document_writer.DocumentWriter
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          embedding_dim: 768
          similarity: cosine
      policy: NONE
  DeepsetCSVRowsToDocumentsConverter:
    type: deepset_cloud_custom_nodes.converters.csv_rows_to_documents.DeepsetCSVRowsToDocumentsConverter
    init_parameters:
      content_column: urls
      encoding: utf-8
  DeepsetFirecrawlWebScraper:
    type: deepset_cloud_custom_nodes.crawler.firecrawl.DeepsetFirecrawlWebScraper
    init_parameters: {}
  XLSXToDocument:
    type: deepset_cloud_custom_nodes.converters.xlsx.XLSXToDocument
    init_parameters:
      document_per: sheet
      content_column: content
      sheet_name:
  DocumentJoiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate
      weights:
      top_k:
      sort_by_score: true
  DeepsetFileUploader:
    type: deepset_cloud_custom_nodes.augmenters.deepset_file_uploader.DeepsetFileUploader
    init_parameters:
      workspace:
      api_key:
        type: env_var
        env_vars:
          - DEEPSET_CLOUD_API_KEY
        strict: false
      write_mode: OVERWRITE
      base_url: https://api.cloud.deepset.ai/api/v1
connections:
  - sender: OutputAdapter.output
    receiver: DeepsetFirecrawlWebScraper.urls
  - sender: FileClassifier.application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
    receiver: XLSXToDocument.sources
  - sender: DocumentJoiner.documents
    receiver: OutputAdapter.documents
  - sender: XLSXToDocument.documents
    receiver: DocumentJoiner.documents
  - sender: DeepsetCSVRowsToDocumentsConverter.documents
    receiver: DocumentJoiner.documents
  - sender: FileClassifier.text/csv
    receiver: DeepsetCSVRowsToDocumentsConverter.sources
  - sender: DeepsetFirecrawlWebScraper.documents
    receiver: DeepsetFileUploader.documents
  - sender: DeepsetFileUploader.documents
    receiver: DocumentWriter.documents
max_runs_per_component: 100
metadata: {}
inputs:
  files:
    - FileClassifier.sources
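The connections in this index are listed out of order, so the data flow can be hard to follow. The sketch below copies the sender and receiver pairs into plain Python and sorts the components topologically to show where DeepsetFileUploader sits in the flow. The helper names `component` and `execution_order` are made up for this illustration and are not part of Haystack or deepset AI Platform.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# (sender, receiver) pairs copied from the `connections` section of the index.
CONNECTIONS = [
    ("OutputAdapter.output", "DeepsetFirecrawlWebScraper.urls"),
    ("FileClassifier.application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
     "XLSXToDocument.sources"),
    ("DocumentJoiner.documents", "OutputAdapter.documents"),
    ("XLSXToDocument.documents", "DocumentJoiner.documents"),
    ("DeepsetCSVRowsToDocumentsConverter.documents", "DocumentJoiner.documents"),
    ("FileClassifier.text/csv", "DeepsetCSVRowsToDocumentsConverter.sources"),
    ("DeepsetFirecrawlWebScraper.documents", "DeepsetFileUploader.documents"),
    ("DeepsetFileUploader.documents", "DocumentWriter.documents"),
]


def component(socket: str) -> str:
    """Return the component name, which is everything before the first dot."""
    return socket.split(".", 1)[0]


def execution_order(connections):
    """Order components so that every sender comes before its receivers."""
    predecessors = {}
    for sender, receiver in connections:
        predecessors.setdefault(component(sender), set())
        predecessors.setdefault(component(receiver), set()).add(component(sender))
    return list(TopologicalSorter(predecessors).static_order())


order = execution_order(CONNECTIONS)
print(" -> ".join(order))
```

In the resulting order, FileClassifier comes first, DeepsetFileUploader runs after DeepsetFirecrawlWebScraper, and DocumentWriter comes last. The two converters can appear in either order because they do not depend on each other.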
Parameters
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| workspace | str | | The name of the workspace to upload the documents to. |
| api_key | Secret | Secret.from_env_var('DEEPSET_CLOUD_API_KEY') | The API key. |
| write_mode | Literal['KEEP', 'OVERWRITE', 'FAIL'] | OVERWRITE | The write mode for the upload. For the possible values, see https://docs.cloud.deepset.ai/reference/upload_file_api_v1_workspaces__workspace_name__files_post |
| base_url | str | https://api.cloud.deepset.ai/api/v1 | The base URL for the API. |
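To show how the init parameters might combine, here is a minimal sketch that builds an upload URL from workspace, write_mode, and base_url. It rests on assumptions: the `/workspaces/<workspace>/files` path is inferred from the reference link in the table, passing write_mode as a query parameter is a guess, and `build_upload_url` and `my-workspace` are hypothetical names used only for illustration. The sketch does not send a request and is not the component's actual implementation.

```python
from typing import Literal, get_args
from urllib.parse import quote, urlencode

# The three values listed for write_mode in the init parameters table.
WriteMode = Literal["KEEP", "OVERWRITE", "FAIL"]

DEFAULT_BASE_URL = "https://api.cloud.deepset.ai/api/v1"


def build_upload_url(
    workspace: str,
    write_mode: str = "OVERWRITE",
    base_url: str = DEFAULT_BASE_URL,
) -> str:
    """Hypothetical helper: compose the file upload URL for a workspace."""
    if write_mode not in get_args(WriteMode):
        raise ValueError(f"write_mode must be one of {get_args(WriteMode)}")
    # Assumption: endpoint path inferred from the API reference link.
    path = f"{base_url.rstrip('/')}/workspaces/{quote(workspace, safe='')}/files"
    # Assumption: write_mode is sent as a query parameter.
    return f"{path}?{urlencode({'write_mode': write_mode})}"


print(build_upload_url("my-workspace"))
```

Validating write_mode before building the URL means a typo such as `overwrite` fails early instead of reaching the API.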
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | The documents to upload. |
| raise_on_failure | bool | False | Whether to raise an error if the documents fail to upload. |
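The effect of raise_on_failure can be pictured with a small stand-in. This sketch assumes that a failed upload is skipped when raise_on_failure is False and raised as an error when it is True, which matches the parameter description but is not confirmed against the component's source. `Doc`, `upload_all`, and `upload_one` are hypothetical names, and `Doc` only mimics the parts of a Haystack Document needed here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Doc:
    """Stand-in for a Haystack Document (illustration only)."""
    id: str
    content: str


def upload_all(
    documents: List[Doc],
    upload_one: Callable[[Doc], None],
    raise_on_failure: bool = False,
) -> Dict[str, List[Doc]]:
    """Upload each document and return the ones that succeeded."""
    uploaded = []
    for doc in documents:
        try:
            upload_one(doc)
        except Exception as exc:
            if raise_on_failure:
                # Stop at the first failure and surface it to the caller.
                raise RuntimeError(f"Upload failed for document {doc.id}") from exc
            continue  # Assumed behavior: skip the failed document.
        uploaded.append(doc)
    return {"documents": uploaded}


def flaky_upload(doc: Doc) -> None:
    """Simulated uploader that fails for one document."""
    if doc.id == "b":
        raise IOError("simulated upload error")


docs = [Doc("a", "first page"), Doc("b", "second page")]
result = upload_all(docs, flaky_upload)
print([d.id for d in result["documents"]])
```

With the default setting, the rest of the index keeps running and later components receive only the documents that uploaded successfully. Setting raise_on_failure to True is the stricter choice when a partial upload should stop the run.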