FastembedRanker
Rank documents based on their similarity to the query using Fastembed models.
Key Features
- Uses Fastembed reranking models for semantic similarity ranking.
- Sorts documents from most to least relevant to improve retrieval quality.
- Configurable top_k to control the number of returned documents.
- Supports data-parallel processing for faster ranking on large datasets.
- Compatible with metadata field embedding for richer context.
Configuration
- Drag the FastembedRanker component onto the canvas from the Component Library.
- Click the component to open the configuration panel.
- Configure the parameters as needed.
Connections
FastembedRanker receives a query string and a list of documents, typically from a Retriever or DocumentJoiner. It outputs the same documents ranked from most to least relevant to the query. Connect its output to ChatPromptBuilder or AnswerBuilder for downstream processing.
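The input and output contract described above can be illustrated with a small, library-free sketch. Everything in it is an assumption made for illustration: `Doc` is a hypothetical stand-in for a Haystack Document, and `overlap_score` is a toy word-overlap scorer that takes the place of the Fastembed reranking model. It only shows the shape of the data flow (query and documents in, ranked documents out), not how the component is implemented.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a Haystack Document."""
    content: str
    score: Optional[float] = None


def overlap_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words that appear in the text.
    A real reranker would score the (query, text) pair with a model instead."""
    query_words = set(re.findall(r"\w+", query.lower()))
    text_words = set(re.findall(r"\w+", text.lower()))
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


def rank(query: str, documents: List[Doc], top_k: int) -> Dict[str, List[Doc]]:
    """Score each document against the query, sort descending, keep top_k."""
    scored = [Doc(d.content, overlap_score(query, d.content)) for d in documents]
    scored.sort(key=lambda d: d.score, reverse=True)
    return {"documents": scored[:top_k]}


docs = [
    Doc("Bananas are rich in potassium."),
    Doc("Berlin is the capital of Germany."),
    Doc("Paris is the capital of France."),
]
result = rank("capital of France", docs, top_k=2)
for d in result["documents"]:
    print(f"{d.score:.2f}  {d.content}")
```

The returned dictionary uses a `documents` key to mirror the output socket that the pipeline connects to ChatPromptBuilder and the answer builder.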
Usage Example
This query pipeline uses FastembedRanker to rerank documents after retrieval:
components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'default'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20
      fuzziness: 0
  FastembedRanker:
    type: haystack_integrations.components.rankers.fastembed.ranker.FastembedRanker
    init_parameters:
      model_name: Xenova/ms-marco-MiniLM-L-6-v2
      top_k: 5
      cache_dir:
      threads:
      batch_size: 64
      parallel:
      local_files_only: false
      meta_fields_to_embed:
      meta_data_separator: "\n"
  ChatPromptBuilder:
    type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
    init_parameters:
      template:
        - role: system
          content: "You are a helpful assistant answering questions based on the provided documents."
        - role: user
          content: "Documents:\n{% for doc in documents %}\n{{ doc.content }}\n{% endfor %}\n\nQuestion: {{ query }}"
  OpenAIChatGenerator:
    type: haystack.components.generators.chat.openai.OpenAIChatGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - OPENAI_API_KEY
        strict: false
      model: gpt-4o-mini
  OutputAdapter:
    type: haystack.components.converters.output_adapter.OutputAdapter
    init_parameters:
      template: '{{ replies[0] }}'
      output_type: List[str]
  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      reference_pattern: acm
connections:
  - sender: bm25_retriever.documents
    receiver: FastembedRanker.documents
  - sender: FastembedRanker.documents
    receiver: ChatPromptBuilder.documents
  - sender: FastembedRanker.documents
    receiver: answer_builder.documents
  - sender: ChatPromptBuilder.prompt
    receiver: OpenAIChatGenerator.messages
  - sender: OpenAIChatGenerator.replies
    receiver: OutputAdapter.replies
  - sender: OutputAdapter.output
    receiver: answer_builder.replies
inputs:
  query:
    - bm25_retriever.query
    - FastembedRanker.query
    - ChatPromptBuilder.query
    - answer_builder.query
  filters:
    - bm25_retriever.filters
outputs:
  documents: FastembedRanker.documents
  answers: answer_builder.answers
max_runs_per_component: 100
metadata: {}
Parameters
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | The input query to compare the documents to. |
| documents | List[Document] | | A list of documents to be ranked. |
| top_k | Optional[int] | None | The maximum number of documents to return. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | A list of documents sorted from most similar to least similar to the query. |
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model_name | str | Xenova/ms-marco-MiniLM-L-6-v2 | Fastembed model name. Check the list of supported models in the Fastembed documentation. |
| top_k | int | 10 | The maximum number of documents to return. |
| cache_dir | Optional[str] | None | The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory. |
| threads | Optional[int] | None | The number of threads a single onnxruntime session can use. |
| batch_size | int | 64 | Number of strings to encode at once. |
| parallel | Optional[int] | None | If greater than 1, data-parallel encoding is used. This is recommended for offline encoding of large datasets. If 0, all available cores are used. If None, data-parallel processing is disabled and the default onnxruntime threading is used instead. |
| local_files_only | bool | False | If True, only use the model files in the cache_dir. |
| meta_fields_to_embed | Optional[List[str]] | None | List of meta fields that should be concatenated with the document content for reranking. |
| meta_data_separator | str | \n | Separator used to concatenate the meta fields to the Document content. |
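To make the interaction between meta_fields_to_embed and meta_data_separator concrete, here is a minimal sketch of how the text passed to the reranker could be assembled. The helper `build_rerank_text` is hypothetical and not part of the component's API. Placing the meta values before the content and skipping missing or empty fields are assumptions made for this illustration.

```python
from typing import Any, Dict, List, Optional


def build_rerank_text(
    content: str,
    meta: Dict[str, Any],
    meta_fields_to_embed: Optional[List[str]] = None,
    meta_data_separator: str = "\n",
) -> str:
    """Hypothetical helper: join selected meta values and the document content."""
    fields = meta_fields_to_embed or []
    # Assumption: fields that are missing or empty are skipped.
    values = [str(meta[key]) for key in fields if meta.get(key)]
    # Assumption: meta values come first, followed by the content.
    return meta_data_separator.join(values + [content])


text = build_rerank_text(
    content="The ranker sorts documents by relevance.",
    meta={"title": "FastembedRanker", "section": "Overview", "page": None},
    meta_fields_to_embed=["title", "section", "page"],
)
print(text)
```

With meta_fields_to_embed left at None, the helper returns the content unchanged, which matches the default behavior described in the table.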
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | The input query to compare the documents to. |
| documents | List[Document] | | A list of documents to be ranked. |
| top_k | Optional[int] | None | The maximum number of documents to return. |
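Because top_k appears both as an init parameter (default 10) and as a run parameter (default None), it helps to spell out how the two might combine. The sketch below assumes that a top_k passed at query time takes precedence and that the init value is used otherwise. The function names and the positive-value check are hypothetical and are not taken from the component's source.

```python
from typing import List, Optional, Tuple


def resolve_top_k(init_top_k: int, run_top_k: Optional[int] = None) -> int:
    """Pick the effective top_k: the run-time value wins when it is given."""
    effective = init_top_k if run_top_k is None else run_top_k
    if effective <= 0:
        # Assumed validation for this sketch.
        raise ValueError(f"top_k must be > 0, but got {effective}")
    return effective


def truncate(
    ranked: List[Tuple[str, float]],
    init_top_k: int,
    run_top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Keep only the first top_k entries of an already sorted result list."""
    return ranked[: resolve_top_k(init_top_k, run_top_k)]


# Hypothetical (document id, score) pairs, already sorted by score.
ranked = [("doc-a", 0.91), ("doc-b", 0.74), ("doc-c", 0.32), ("doc-d", 0.05)]
default_result = truncate(ranked, init_top_k=3)
override_result = truncate(ranked, init_top_k=3, run_top_k=1)
print(default_result)
print(override_result)
```

Under this assumption, a pipeline configured with top_k: 5 returns at most five documents unless a query-time top_k is supplied.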