SentenceTransformersDocumentImageEmbedder

Compute image embeddings for a list of documents using Sentence Transformers models.

Basic Information

Type: haystack.components.embedders.image.SentenceTransformersDocumentImageEmbedder
Components it can connect with:
- DocumentWriter: SentenceTransformersDocumentImageEmbedder can send embedded documents to DocumentWriter for storage.
- Retriever: SentenceTransformersDocumentImageEmbedder can receive documents from a Retriever for embedding.

Inputs

Parameter	Type	Default	Description
documents	List[Document]		Documents to embed. Each document must have a valid file path in its metadata pointing to an image or PDF file.
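
As a rough illustration of what "a valid file path in its metadata" means, the sketch below builds the path the component would need to open, using the `file_path_meta_field` and `root_path` init parameters. The helper `resolve_image_path` is hypothetical and written only for this example. It is not part of the component's API, and the actual resolution logic may differ.

```python
from pathlib import PurePosixPath
from typing import Any, Dict, Optional


def resolve_image_path(
    meta: Dict[str, Any],
    file_path_meta_field: str = "file_path",
    root_path: Optional[str] = None,
) -> PurePosixPath:
    """Hypothetical helper: build the path to a document's image or PDF file."""
    if file_path_meta_field not in meta:
        raise ValueError(f"Document metadata has no '{file_path_meta_field}' field")
    file_path = PurePosixPath(meta[file_path_meta_field])
    # Without root_path, the stored path is used as-is (expected to be absolute).
    # With root_path, the stored path is joined onto it.
    return file_path if root_path is None else PurePosixPath(root_path) / file_path


# Example metadata, assumed for illustration.
meta = {"file_path": "cats/cat.jpg"}
resolved = resolve_image_path(meta, root_path="/data/images")
print(resolved)
```

A document whose metadata lacks the configured field has no path to resolve, which is why the sketch raises an error in that case.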

Outputs

Parameter	Type	Default	Description
documents	List[Document]		Documents with embeddings stored in the `embedding` field. Each document also includes metadata about the embedding source.

Overview

SentenceTransformersDocumentImageEmbedder uses Sentence Transformers models that can embed both text and images. It stores the calculated embeddings in the `embedding` field of each document.

SentenceTransformersDocumentImageEmbedder supports both direct image files and PDF documents by extracting specific pages as images. It uses pre-trained models that can embed images and text into the same vector space, making it suitable for multimodal applications.

The component automatically handles image preprocessing, including resizing and format conversion, and can process both individual images and PDF pages. Each processed document includes metadata indicating the embedding source type.
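
Because images and text land in the same vector space, a text query embedding can be compared directly with image embeddings. The sketch below ranks images against a query by cosine similarity. The vectors are made-up three-dimensional toy values, not real model output, and `cosine_similarity` is a helper written only for this example.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy embeddings, assumed for illustration (real embeddings have hundreds of dimensions).
image_embeddings = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "invoice.pdf": [0.0, 0.2, 0.9],
}
text_query_embedding = [0.8, 0.2, 0.1]

# Sort file names from most to least similar to the text query.
ranked = sorted(
    image_embeddings,
    key=lambda name: cosine_similarity(text_query_embedding, image_embeddings[name]),
    reverse=True,
)
print(ranked)
```

In a real pipeline, the text query would be embedded with the same model so that both sides of the comparison come from one vector space.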

Authentication

SentenceTransformersDocumentImageEmbedder uses the Hugging Face Hub to download models. You need to provide an API token to download private models. Connect deepset to your Hugging Face account to use private models hosted on Hugging Face:

Add Workspace-Level Integration

Click your profile icon and choose Settings.
Go to Workspace > Integrations.
Find the provider you want to connect and click Connect next to it.
Enter the API key and any other required details.
Click Connect. You can use this integration in pipelines and indexes in the current workspace.

Add Organization-Level Integration

Click your profile icon and choose Settings.
Go to Organization > Integrations.
Find the provider you want to connect and click Connect next to it.
Enter the API key and any other required details.
Click Connect. You can use this integration in pipelines and indexes in all workspaces in the current organization.

Usage Example

Initializing the Component

components:
  SentenceTransformersDocumentImageEmbedder:
    type: haystack.components.embedders.image.sentence_transformers_doc_image_embedder.SentenceTransformersDocumentImageEmbedder
    init_parameters:
      file_path_meta_field: file_path
      root_path: "/data/images"
      model: sentence-transformers/clip-ViT-B-32
      device:
      token:
      batch_size: 32
      progress_bar: true
      normalize_embeddings: false
      trust_remote_code: false
      local_files_only: false
      model_kwargs:
      tokenizer_kwargs:
      config_kwargs:
      precision: float32
      encode_kwargs:
      backend: torch

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

Parameter	Type	Default	Description
file_path_meta_field	str	file_path	The metadata field in the Document that contains the file path to the image or PDF.
root_path	Optional[str]	None	The root directory path where document files are located. If provided, file paths in document metadata will be resolved relative to this path. If None, file paths are treated as absolute paths.
model	str	sentence-transformers/clip-ViT-B-32	The Sentence Transformers model to use for calculating embeddings. Must be able to embed images and text into the same vector space. Compatible models include clip-ViT-B-32, clip-ViT-L-14, clip-ViT-B-16, and others.
device	Optional[ComponentDevice]	None	The device to use for loading the model. Overrides the default device.
token	Optional[Secret]	None	The API token to download private models from Hugging Face.
batch_size	int	32	Number of documents to embed at once.
progress_bar	bool	True	If `True`, shows a progress bar when embedding documents.
normalize_embeddings	bool	False	If `True`, the embeddings are normalized using L2 normalization, so that each embedding has a norm of 1.
trust_remote_code	bool	False	If `False`, allows only Hugging Face verified model architectures. If `True`, allows custom models and scripts.
local_files_only	bool	False	If `True`, does not attempt to download the model from Hugging Face Hub and only looks at local files.
model_kwargs	Optional[Dict[str, Any]]	None	Additional keyword arguments for `AutoModelForSequenceClassification.from_pretrained` when loading the model.
tokenizer_kwargs	Optional[Dict[str, Any]]	None	Additional keyword arguments for `AutoTokenizer.from_pretrained` when loading the tokenizer.
config_kwargs	Optional[Dict[str, Any]]	None	Additional keyword arguments for `AutoConfig.from_pretrained` when loading the model configuration.
precision	Literal	float32	The precision to use for the embeddings. All non-float32 precisions are quantized embeddings.
encode_kwargs	Optional[Dict[str, Any]]	None	Additional keyword arguments for `SentenceTransformer.encode` when embedding documents.
backend	Literal	torch	The backend to use for the Sentence Transformers model. Choose from "torch", "onnx", or "openvino".
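
To make the `normalize_embeddings` option concrete, here is a minimal sketch of L2 normalization in plain Python. The function `l2_normalize` is a hypothetical helper for illustration only. The component delegates normalization to Sentence Transformers, so this shows the math, not the actual implementation.

```python
import math
from typing import List


def l2_normalize(vector: List[float]) -> List[float]:
    """Scale a vector so that its L2 norm (Euclidean length) is 1."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # A zero vector has no direction, so return it unchanged.
        return list(vector)
    return [x / norm for x in vector]


embedding = [3.0, 4.0]  # toy vector with L2 norm 5
normalized = l2_normalize(embedding)
print(normalized)  # [0.6, 0.8]
```

With unit-length embeddings, the dot product of two vectors equals their cosine similarity, which can simplify similarity scoring downstream.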

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

Parameter	Type	Default	Description
documents	List[Document]		Documents to embed. Each document must have a valid file path in its metadata pointing to an image or PDF file.
