
SentenceTransformersSparseTextEmbedder

Embed text strings, such as queries, using sparse embedding models from Sentence Transformers. The model runs locally, so no external API calls are made during embedding. Use this component in query pipelines to embed user queries for sparse retrieval.

Embedding Models in Query Pipelines and Indexes

The embedding model you use to embed documents in your indexing pipeline must be the same as the embedding model you use to embed the query in your query pipeline.

In other words, the embedders in your indexing and query pipelines must match. For example, if you use CohereDocumentEmbedder to embed your documents, use CohereTextEmbedder with the same model to embed your queries.
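For sparse retrieval, the indexing-side counterpart of this component is SentenceTransformersSparseDocumentEmbedder. A minimal sketch of the pairing, as two separate pipeline definitions (the component names and the document-embedder type path shown here are assumptions for illustration):

```yaml
# Indexing pipeline: embed documents with the sparse *document* embedder
components:
  sparse_document_embedder:
    type: haystack.components.embedders.sentence_transformers_sparse_document_embedder.SentenceTransformersSparseDocumentEmbedder
    init_parameters:
      model: prithivida/Splade_PP_en_v2   # must match the query-side model
```

```yaml
# Query pipeline: embed queries with the sparse *text* embedder, same model
components:
  sparse_text_embedder:
    type: haystack.components.embedders.sentence_transformers_sparse_text_embedder.SentenceTransformersSparseTextEmbedder
    init_parameters:
      model: prithivida/Splade_PP_en_v2
```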

Key Features

  • Downloads and runs sparse Sentence Transformers models locally — no external API required.
  • Based on SPLADE (SParse Lexical AnD Expansion), combining the benefits of learned sparse representations with efficient sparse retrieval.
  • Outputs a SparseEmbedding where each non-zero value represents the importance weight of a term in the vocabulary.
  • Compatible with private Hugging Face models via API token authentication.
  • Works with sparse embedding retrievers such as QdrantSparseEmbeddingRetriever.
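To make the SparseEmbedding output named above concrete: each index points into the model's vocabulary and each value is that term's importance weight, with absent positions implicitly zero. A purely illustrative serialization (the indices and values below are made up, not real model output):

```yaml
# Illustrative SparseEmbedding — not actual model output
sparse_embedding:
  indices: [2171, 7592, 30301]   # positions in the model's vocabulary
  values: [1.42, 0.87, 0.05]     # importance weights for those terms
```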

Configuration

  1. Drag the SentenceTransformersSparseTextEmbedder component onto the canvas from the Component Library.
  2. Click the component to open the configuration panel.
  3. On the General tab:
    1. Enter the model name. Specify the path to a local model or the ID of the model on Hugging Face.
  4. Go to the Advanced tab to configure device, token, prefix, suffix, trust_remote_code, model_kwargs, tokenizer_kwargs, config_kwargs, backend, and revision.
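The Advanced-tab options above correspond to the component's init parameters in pipeline YAML. A sketch showing a few of them (the values are examples, not recommendations):

```yaml
sparse_text_embedder:
  type: haystack.components.embedders.sentence_transformers_sparse_text_embedder.SentenceTransformersSparseTextEmbedder
  init_parameters:
    model: prithivida/Splade_PP_en_v2
    prefix: ""
    suffix: ""
    trust_remote_code: false
    backend: torch          # or onnx / openvino
    revision: null          # pin a branch, tag, or commit ID if needed
```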

Connections

SentenceTransformersSparseTextEmbedder accepts a text string as input. It outputs sparse_embedding — a sparse vector representation of the input text.

Typically, you connect the pipeline Input component to the text input and send sparse_embedding to a sparse embedding retriever.

To use private models from Hugging Face, connect the platform to Hugging Face first. For details, see Use Hugging Face Models.

Usage Example


```yaml
components:
  sparse_text_embedder:
    type: haystack.components.embedders.sentence_transformers_sparse_text_embedder.SentenceTransformersSparseTextEmbedder
    init_parameters:
      model: prithivida/Splade_PP_en_v2 # SPLADE model for sparse embeddings
      prefix: ""
      suffix: ""

  sparse_retriever:
    type: haystack_integrations.components.retrievers.qdrant.retriever.QdrantSparseEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.qdrant.document_store.QdrantDocumentStore
        init_parameters:
          location: ${QDRANT_HOST}
          api_key: ${QDRANT_API_KEY}
          index: default
          use_sparse_embeddings: true
          return_embedding: false
      top_k: 10
      scale_score: false

  prompt_builder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: |-
        You are a helpful assistant.
        Answer the question based on the provided documents.
        If the documents don't contain enough information, say so.

        Documents:
        {% for document in documents %}
        Document[{{ loop.index }}]:
        {{ document.content }}
        {% endfor %}

        Question: {{ question }}
        Answer:

  llm:
    type: haystack.components.generators.openai.OpenAIGenerator
    init_parameters:
      api_key: {"type": "env_var", "env_vars": ["OPENAI_API_KEY"], "strict": false}
      model: gpt-4o
      generation_kwargs:
        max_tokens: 500
        temperature: 0.0

  answer_builder:
    type: haystack.components.builders.answer_builder.AnswerBuilder
    init_parameters: {}

connections:
  - sender: sparse_text_embedder.sparse_embedding
    receiver: sparse_retriever.query_sparse_embedding
  - sender: sparse_retriever.documents
    receiver: prompt_builder.documents
  - sender: sparse_retriever.documents
    receiver: answer_builder.documents
  - sender: prompt_builder.prompt
    receiver: llm.prompt
  - sender: llm.replies
    receiver: answer_builder.replies

max_runs_per_component: 100

inputs:
  query:
    - sparse_text_embedder.text
    - prompt_builder.question
    - answer_builder.query
  filters:
    - sparse_retriever.filters

outputs:
  documents: sparse_retriever.documents
  answers: answer_builder.answers
```

Parameters

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | str | | Text to embed. |

Outputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| sparse_embedding | SparseEmbedding | | The sparse embedding of the input text. |

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | str | prithivida/Splade_PP_en_v2 | The model to use for calculating sparse embeddings. Specify the path to a local model or the ID of the model on Hugging Face. For available models, check Hugging Face. |
| device | Optional[ComponentDevice] | None | Overrides the default device used to load the model. |
| token | Optional[Secret] | | An API token to use private models from Hugging Face. |
| prefix | str | "" | A string to add at the beginning of each text to embed. Some models benefit from a prefix or suffix; for example, the prithivida/Splade_PP_en_v2 model may benefit from the prefix "query: ". |
| suffix | str | "" | A string to add at the end of each text to embed. |
| trust_remote_code | bool | False | If True, permits custom models and scripts. |
| local_files_only | bool | False | If True, only looks at local files without downloading from Hugging Face Hub. |
| model_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for AutoModelForSequenceClassification.from_pretrained when loading the model. Refer to specific model documentation for available kwargs. |
| tokenizer_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for AutoTokenizer.from_pretrained when loading the tokenizer. Refer to specific model documentation for available kwargs. |
| config_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for AutoConfig.from_pretrained when loading the model configuration. |
| backend | Literal["torch", "onnx", "openvino"] | torch | The backend to use for the Sentence Transformers model. Choose from torch, onnx, or openvino. Refer to the Sentence Transformers documentation for more information on acceleration and quantization options. |
| revision | Optional[str] | None | The specific model version to use. It can be a branch name, a tag name, or a commit ID for a stored model on Hugging Face. |
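If your model benefits from a query prefix, as noted for prithivida/Splade_PP_en_v2 above, you can set it through the prefix parameter. A sketch (whether the prefix actually helps is model-dependent; check the model card):

```yaml
sparse_text_embedder:
  type: haystack.components.embedders.sentence_transformers_sparse_text_embedder.SentenceTransformersSparseTextEmbedder
  init_parameters:
    model: prithivida/Splade_PP_en_v2
    prefix: "query: "   # prepended to every query before embedding
```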

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | str | | Text to embed. |