SentenceTransformersDocumentEmbedder
Calculate document embeddings using Sentence Transformers models. The model runs locally, so no external API calls are made during embedding. Use this component in indexing pipelines to embed documents before writing them to a document store.
Embedding Models in Query Pipelines and Indexes
The embedding model you use to embed documents in your indexing pipeline must be the same as the one you use to embed the query in your query pipeline. For example, if you use CohereDocumentEmbedder to embed your documents, use CohereTextEmbedder with the same model to embed your queries.
When using custom embedding models, enable GPU acceleration in your index settings if your index is slow:
- Go to Indexes and click the index that contains the SentenceTransformersDocumentEmbedder component. You're redirected to the Index Details page.
- Go to Settings and click the GPU Acceleration toggle to turn it on.
For details, see GPU Acceleration.
Key Features
- Downloads and runs Sentence Transformers models locally — no external API required.
- Stores embeddings in the embedding field of each document.
- Supports embedding document metadata fields alongside document text.
- Supports a wide range of backends: PyTorch, ONNX, and OpenVINO.
- Compatible with private Hugging Face models via API token authentication.
Configuration
- Drag the SentenceTransformersDocumentEmbedder component onto the canvas from the Component Library.
- Click the component to open the configuration panel.
- On the General tab:
- Enter the model name. Specify the path to a local model or the ID of the model on Hugging Face.
- Go to the Advanced tab to configure device, token, batch_size, prefix, suffix, normalize_embeddings, trust_remote_code, model_kwargs, tokenizer_kwargs, and config_kwargs.
Connections
SentenceTransformersDocumentEmbedder accepts a list of documents as input and outputs the same documents with embeddings added.
Typically, you place this component in an indexing pipeline after a document splitter and before a DocumentWriter.
Usage Example
Using the component in a pipeline
This index uses SentenceTransformersDocumentEmbedder to embed documents before writing them to a document store:
components:
TextFileToDocument:
type: haystack.components.converters.txt.TextFileToDocument
init_parameters:
encoding: utf-8
DocumentSplitter:
type: haystack.components.preprocessors.document_splitter.DocumentSplitter
init_parameters:
split_by: sentence
split_length: 5
split_overlap: 1
document_embedder:
type: haystack.components.embedders.sentence_transformers_document_embedder.SentenceTransformersDocumentEmbedder
init_parameters:
model: sentence-transformers/all-mpnet-base-v2
token:
type: env_var
env_vars:
- HF_API_TOKEN
- HF_TOKEN
strict: false
prefix:
suffix:
batch_size: 32
progress_bar: true
normalize_embeddings: false
meta_fields_to_embed:
embedding_separator: "\n"
trust_remote_code: false
DocumentWriter:
type: haystack.components.writers.document_writer.DocumentWriter
init_parameters:
document_store:
type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
init_parameters:
hosts:
- ${OPENSEARCH_HOST}
http_auth:
- ${OPENSEARCH_USER}
- ${OPENSEARCH_PASSWORD}
use_ssl: true
verify_certs: false
policy: WRITE
connections:
- sender: TextFileToDocument.documents
receiver: DocumentSplitter.documents
- sender: DocumentSplitter.documents
receiver: document_embedder.documents
- sender: document_embedder.documents
receiver: DocumentWriter.documents
inputs:
files:
- TextFileToDocument.sources
Parameters
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | Documents to embed. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | The input documents with their embedding field populated. |
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | str | sentence-transformers/all-mpnet-base-v2 | The model to use for calculating embeddings. Pass a local path or ID of the model on Hugging Face. |
| device | Optional[ComponentDevice] | None | The device to use for loading the model. Overrides the default device. |
| token | Optional[Secret] | Secret.from_env_var(['HF_API_TOKEN', 'HF_TOKEN'], strict=False) | The API token to download private models from Hugging Face. |
| prefix | str | "" | A string to add at the beginning of each document text. Can be used to prepend the text with an instruction, as required by some embedding models, such as E5 and bge. |
| suffix | str | "" | A string to add at the end of each document text. |
| batch_size | int | 32 | Number of documents to embed at once. |
| progress_bar | bool | True | If True, shows a progress bar when embedding documents. |
| normalize_embeddings | bool | False | If True, the embeddings are normalized using L2 normalization, so that each embedding has a norm of 1. |
| meta_fields_to_embed | Optional[List[str]] | None | List of metadata fields to embed along with the document text. |
| embedding_separator | str | \n | Separator used to concatenate the metadata fields to the document text. |
| trust_remote_code | bool | False | If False, allows only Hugging Face verified model architectures. If True, allows custom models and scripts. |
| local_files_only | bool | False | If True, does not attempt to download the model from Hugging Face Hub and only looks at local files. |
| truncate_dim | Optional[int] | None | The dimension to truncate sentence embeddings to. None does no truncation. If the model wasn't trained with Matryoshka Representation Learning, truncating embeddings can significantly affect performance. |
| model_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for AutoModelForSequenceClassification.from_pretrained when loading the model. Refer to specific model documentation for available kwargs. |
| tokenizer_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for AutoTokenizer.from_pretrained when loading the tokenizer. Refer to specific model documentation for available kwargs. |
| config_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for AutoConfig.from_pretrained when loading the model configuration. |
| precision | Literal['float32', 'int8', 'uint8', 'binary', 'ubinary'] | float32 | The precision to use for the embeddings. All non-float32 precisions are quantized embeddings. Quantized embeddings are smaller and faster to compute, but may have a lower accuracy. They are useful for reducing the size of the embeddings of a corpus for semantic search, among other tasks. |
| encode_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for SentenceTransformer.encode when embedding documents. This parameter is provided for fine customization. Be careful not to clash with already set parameters and avoid passing parameters that change the output type. |
| backend | Literal['torch', 'onnx', 'openvino'] | torch | The backend to use for the Sentence Transformers model. Choose from "torch", "onnx", or "openvino". Refer to the Sentence Transformers documentation for more information on acceleration and quantization options. |
| revision | Optional[str] | None | The specific model version to use. It can be a branch name, a tag name, or a commit ID for a stored model on Hugging Face. This enables pinning to a particular model version for reproducibility and stability. |
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | Documents to embed. |