DeepsetFileDownloader
Use DeepsetFileDownloader to download files with the extensions you specify and store them in the local file system.
Basic Information
- Pipeline type: Query
- Type: deepset_cloud_custom_nodes.augmenters.deepset_file_downloader.DeepsetFileDownloader
- Components it can connect with:
  - Rankers: It can receive documents from Rankers and download them.
  - DeepsetPDFDocumentToBase64Image: It can send the downloaded PDFs to DeepsetPDFDocumentToBase64Image so that it can turn them into images.
Inputs
Name | Type | Description |
---|---|---|
documents | List of documents | The documents to download. |
Outputs
Name | Type | Description |
---|---|---|
documents | List of documents | The list of downloaded documents with the file path set in the meta field. |
Overview
DeepsetFileDownloader is a helper component used in visual question answering pipelines. It downloads the PDF files containing images and sends them to the DeepsetPDFDocumentToBase64Image component, which converts them into images the LLM can consume.
Usage Example
This is an example of a visual question answering pipeline:
components:
  ...
  ranker:
    type: haystack.components.rankers.transformers_similarity.TransformersSimilarityRanker
    init_parameters:
      model: "BAAI/bge-reranker-v2-m3"
      top_k: 5
      model_kwargs:
        torch_dtype: "torch.float16"
      tokenizer_kwargs:
        model_max_length: 1024
  image_downloader:
    type: deepset_cloud_custom_nodes.augmenters.deepset_file_downloader.DeepsetFileDownloader
    init_parameters:
      file_extensions:
        - ".pdf"
  pdf_to_image:
    type: deepset_cloud_custom_nodes.converters.pdf_to_image.DeepsetPDFDocumentToBase64Image
    init_parameters:
      detail: "high"
  ...
connections:
  ...
  - sender: ranker.documents
    receiver: image_downloader.documents
  - sender: image_downloader.documents
    receiver: pdf_to_image.documents
  # pdf_to_image is usually connected to PromptBuilder, which receives the converted images
  ...
Parameters
Init Parameters
These are the parameters you can configure in Pipeline Builder:
Parameter | Type | Possible Values | Description |
---|---|---|---|
file_extensions | List of strings | Default: None | The extensions of the files to download, for example ".pdf". |
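As an illustration of the file_extensions parameter, the fragment below sketches a downloader configured with more than one extension. The component name image_downloader follows the usage example on this page. The ".png" entry is a hypothetical addition that only shows the list syntax; this page confirms ".pdf" only, so treat any other extension as an assumption to verify before relying on it.

```yaml
image_downloader:
  type: deepset_cloud_custom_nodes.augmenters.deepset_file_downloader.DeepsetFileDownloader
  init_parameters:
    file_extensions:
      - ".pdf"   # extension used in the usage example above
      - ".png"   # hypothetical second entry, shown only to illustrate the list syntax
```

Each entry is a string that includes the leading dot, matching the format used in the usage example.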
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
Parameter | Type | Description |
---|---|---|
documents | List of Document objects | Documents to download. |