FastembedRanker
Rank documents based on their similarity to the query using Fastembed models.
Key Features
- Ranks documents from most to least semantically relevant to the query.
- Uses Fastembed reranking models for fast, efficient reranking.
- Configurable number of results returned via top_k.
- Supports embedding metadata fields alongside document content for richer reranking.
- Improves quality of generated answers by surfacing the most relevant documents at the top.
Configuration
- Drag the FastembedRanker component onto the canvas from the Component Library.
- Click on the component to open the configuration panel.
- On the General tab:
  - Set the reranking model name. You can find supported models in the Fastembed documentation.
  - Set top_k to control how many documents to return.
- Go to the Advanced tab to configure additional settings such as batch_size, parallel, and meta_fields_to_embed.
Connections
FastembedRanker accepts a query string and a list of documents as inputs. Connect it after a retriever or DocumentJoiner in a query pipeline.
It outputs a ranked list of documents sorted from most to least relevant. Connect its documents output to ChatPromptBuilder, AnswerBuilder, or another downstream component.
Source Code
To check this component's source code, open ranker.py in the Haystack Core Integrations repository.
Usage Examples
Basic Configuration
FastembedRanker:
  type: haystack_integrations.components.rankers.fastembed.ranker.FastembedRanker
  init_parameters:
    model_name: Xenova/ms-marco-MiniLM-L-6-v2
    top_k: 5
    batch_size: 64
    local_files_only: false
    meta_data_separator: "\n"
This query pipeline uses FastembedRanker to rerank documents after retrieval:
components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'default'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20
      fuzziness: 0
  FastembedRanker:
    type: haystack_integrations.components.rankers.fastembed.ranker.FastembedRanker
    init_parameters:
      model_name: Xenova/ms-marco-MiniLM-L-6-v2
      top_k: 5
      cache_dir:
      threads:
      batch_size: 64
      parallel:
      local_files_only: false
      meta_fields_to_embed:
      meta_data_separator: "\n"
  ChatPromptBuilder:
    type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
    init_parameters:
      template:
        - role: system
          content: "You are a helpful assistant answering questions based on the provided documents."
        - role: user
          content: "Documents:\n{% for doc in documents %}\n{{ doc.content }}\n{% endfor %}\n\nQuestion: {{ query }}"
  OpenAIChatGenerator:
    type: haystack.components.generators.chat.openai.OpenAIChatGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - OPENAI_API_KEY
        strict: false
      model: gpt-4o-mini
  OutputAdapter:
    type: haystack.components.converters.output_adapter.OutputAdapter
    init_parameters:
      template: '{{ replies[0] }}'
      output_type: List[str]
  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      reference_pattern: acm
connections:
  - sender: bm25_retriever.documents
    receiver: FastembedRanker.documents
  - sender: FastembedRanker.documents
    receiver: ChatPromptBuilder.documents
  - sender: FastembedRanker.documents
    receiver: answer_builder.documents
  - sender: ChatPromptBuilder.prompt
    receiver: OpenAIChatGenerator.messages
  - sender: OpenAIChatGenerator.replies
    receiver: OutputAdapter.replies
  - sender: OutputAdapter.output
    receiver: answer_builder.replies
inputs:
  query:
    - bm25_retriever.query
    - FastembedRanker.query
    - ChatPromptBuilder.query
    - answer_builder.query
  filters:
    - bm25_retriever.filters
outputs:
  documents: FastembedRanker.documents
  answers: answer_builder.answers
max_runs_per_component: 100
metadata: {}
Parameters
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | The input query to compare the documents to. |
| documents | List[Document] | | A list of documents to be ranked. |
| top_k | Optional[int] | None | The maximum number of documents to return. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | A list of documents sorted from most similar to least similar to the query. |
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model_name | str | Xenova/ms-marco-MiniLM-L-6-v2 | Fastembed model name. Check the list of supported models in the Fastembed documentation. |
| top_k | int | 10 | The maximum number of documents to return. |
| cache_dir | Optional[str] | None | The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory. |
| threads | Optional[int] | None | The number of threads a single onnxruntime session can use. |
| batch_size | int | 64 | Number of strings to encode at once. |
| parallel | Optional[int] | None | If > 1, data-parallel encoding is used; recommended for offline encoding of large datasets. If 0, all available cores are used. If None, data-parallel processing is not used and the default onnxruntime threading applies instead. |
| local_files_only | bool | False | If True, only use the model files in the cache_dir. |
| meta_fields_to_embed | Optional[List[str]] | None | List of meta fields that should be concatenated with the document content for reranking. |
| meta_data_separator | str | \n | Separator used to concatenate the meta fields to the Document content. |
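To make the interaction between meta_fields_to_embed and meta_data_separator more concrete, here is a minimal Python sketch of how the text passed to the reranking model might be assembled. It assumes that the selected meta values are placed before the document content and that missing or empty fields are skipped. The Doc class and the build_rerank_text helper are hypothetical names used only for illustration; they are not part of the Fastembed or Haystack API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a Haystack Document (illustration only)."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def build_rerank_text(
    doc: Doc,
    meta_fields_to_embed: Optional[List[str]] = None,
    meta_data_separator: str = "\n",
) -> str:
    """Join the selected meta values and the content into one string.

    Assumption: meta values come first, and fields that are missing
    or empty are skipped.
    """
    fields = meta_fields_to_embed or []
    meta_values = [str(doc.meta[key]) for key in fields if doc.meta.get(key)]
    return meta_data_separator.join([*meta_values, doc.content])


doc = Doc(
    content="Rerankers reorder retrieved documents.",
    meta={"title": "Ranking basics"},
)
# "author" is not present in doc.meta, so only the title is prepended.
text = build_rerank_text(doc, meta_fields_to_embed=["title", "author"])
print(text)
```

With the default separator, the title and the content end up on separate lines of the same string, so the model can take both into account when scoring the document against the query.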
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | The input query to compare the documents to. |
| documents | List[Document] | | A list of documents to be ranked. |
| top_k | Optional[int] | None | The maximum number of documents to return. |
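Because top_k appears both as an init parameter and as a run() parameter, it can help to see how the two might combine. The following Python sketch assumes that a run-time top_k of None falls back to the init-time value and that higher scores mean higher relevance. The rejection of non-positive values is likewise an assumption. The rank_with_top_k function and the scores are hypothetical and only illustrate the idea; the actual scores come from the reranking model.

```python
from typing import List, Optional, Tuple


def rank_with_top_k(
    scored_docs: List[Tuple[str, float]],
    init_top_k: int = 10,
    run_top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Sort (document, score) pairs by descending score and keep the top k.

    Assumption: a run-time top_k of None means "use the init-time value".
    """
    top_k = init_top_k if run_top_k is None else run_top_k
    if top_k <= 0:
        raise ValueError("top_k must be greater than 0")
    ranked = sorted(scored_docs, key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


# Made-up scores for illustration only.
scored = [("doc_a", 0.12), ("doc_b", 0.87), ("doc_c", 0.45)]

# No run-time value: the init-time top_k of 2 applies.
default_result = rank_with_top_k(scored, init_top_k=2)

# A run-time value of 1 takes precedence over the init-time value.
override_result = rank_with_top_k(scored, init_top_k=2, run_top_k=1)
```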