FastembedRanker

Rank documents based on their similarity to the query using Fastembed models.

Key Features

  • Ranks documents from most to least semantically relevant to the query.
  • Uses Fastembed reranking models for fast, efficient reranking.
  • Configurable number of results returned via top_k.
  • Supports embedding metadata fields alongside document content for richer reranking.
  • Improves quality of generated answers by surfacing the most relevant documents at the top.

Configuration

  1. Drag the FastembedRanker component onto the canvas from the Component Library.
  2. Click on the component to open the configuration panel.
  3. On the General tab:
    • Set the reranking model name. You can find supported models in the Fastembed documentation.
    • Set top_k to control how many documents to return.
  4. Go to the Advanced tab to configure additional settings such as batch_size, parallel, and meta_fields_to_embed.

Connections

FastembedRanker accepts a query string and a list of documents as inputs. Connect it after a retriever or DocumentJoiner in a query pipeline.

It outputs a ranked list of documents sorted from most to least relevant. Connect its documents output to ChatPromptBuilder, AnswerBuilder, or another downstream component.
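
The input and output contract described above can be illustrated with a small, self-contained sketch. It does not use Fastembed: the word-overlap `toy_score` function is a hypothetical stand-in for the reranking model, and `Doc` and `rerank` are illustrative names, not part of the component's API. The sketch only shows the shape of the data: a query and a list of documents go in, and the same documents come out sorted from most to least relevant.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a Haystack Document."""
    content: str
    score: Optional[float] = None


def toy_score(query: str, text: str) -> float:
    """Stand-in scorer: fraction of query words that appear in the text.

    A real reranking model scores the (query, text) pair with a neural network.
    """
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


def rerank(query: str, documents: List[Doc]) -> List[Doc]:
    """Score every document against the query and sort by descending score."""
    for doc in documents:
        doc.score = toy_score(query, doc.content)
    return sorted(documents, key=lambda d: d.score, reverse=True)


# Documents as they might arrive from a retriever, in arbitrary order.
retrieved = [
    Doc("Paris is the capital of France"),
    Doc("Munich is a city in Germany"),
    Doc("Berlin is the capital of Germany"),
]
ranked = rerank("capital of Germany", retrieved)
```

In a pipeline, the list that corresponds to `ranked` is what the documents output passes to downstream components.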

Source Code

To check this component's source code, open ranker.py in the Haystack Core Integrations repository.

Usage Examples

Basic Configuration

FastembedRanker:
  type: haystack_integrations.components.rankers.fastembed.ranker.FastembedRanker
  init_parameters:
    model_name: Xenova/ms-marco-MiniLM-L-6-v2
    top_k: 5
    batch_size: 64
    local_files_only: false
    meta_data_separator: "\n"

This query pipeline uses FastembedRanker to rerank documents after retrieval:

components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'default'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20
      fuzziness: 0

  FastembedRanker:
    type: haystack_integrations.components.rankers.fastembed.ranker.FastembedRanker
    init_parameters:
      model_name: Xenova/ms-marco-MiniLM-L-6-v2
      top_k: 5
      cache_dir:
      threads:
      batch_size: 64
      parallel:
      local_files_only: false
      meta_fields_to_embed:
      meta_data_separator: "\n"

  ChatPromptBuilder:
    type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
    init_parameters:
      template:
        - role: system
          content: "You are a helpful assistant answering questions based on the provided documents."
        - role: user
          content: "Documents:\n{% for doc in documents %}\n{{ doc.content }}\n{% endfor %}\n\nQuestion: {{ query }}"

  OpenAIChatGenerator:
    type: haystack.components.generators.chat.openai.OpenAIChatGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - OPENAI_API_KEY
        strict: false
      model: gpt-4o-mini

  OutputAdapter:
    type: haystack.components.converters.output_adapter.OutputAdapter
    init_parameters:
      template: '{{ replies[0] }}'
      output_type: List[str]

  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      reference_pattern: acm

connections:
  - sender: bm25_retriever.documents
    receiver: FastembedRanker.documents
  - sender: FastembedRanker.documents
    receiver: ChatPromptBuilder.documents
  - sender: FastembedRanker.documents
    receiver: answer_builder.documents
  - sender: ChatPromptBuilder.prompt
    receiver: OpenAIChatGenerator.messages
  - sender: OpenAIChatGenerator.replies
    receiver: OutputAdapter.replies
  - sender: OutputAdapter.output
    receiver: answer_builder.replies

inputs:
  query:
    - bm25_retriever.query
    - FastembedRanker.query
    - ChatPromptBuilder.query
    - answer_builder.query
  filters:
    - bm25_retriever.filters

outputs:
  documents: FastembedRanker.documents
  answers: answer_builder.answers

max_runs_per_component: 100

metadata: {}

Parameters

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | | The input query to compare the documents to. |
| documents | List[Document] | | A list of documents to be ranked. |
| top_k | Optional[int] | None | The maximum number of documents to return. |

Outputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| documents | List[Document] | | A list of documents sorted from most similar to least similar to the query. |

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model_name | str | Xenova/ms-marco-MiniLM-L-6-v2 | Fastembed model name. Check the list of supported models in the Fastembed documentation. |
| top_k | int | 10 | The maximum number of documents to return. |
| cache_dir | Optional[str] | None | The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory. |
| threads | Optional[int] | None | The number of threads a single onnxruntime session can use. |
| batch_size | int | 64 | Number of strings to encode at once. |
| parallel | Optional[int] | None | If greater than 1, uses data-parallel encoding, which is recommended for offline encoding of large datasets. If 0, uses all available cores. If None, uses the default onnxruntime threading instead of data-parallel processing. |
| local_files_only | bool | False | If True, only use the model files in the cache_dir. |
| meta_fields_to_embed | Optional[List[str]] | None | List of meta fields that should be concatenated with the document content for reranking. |
| meta_data_separator | str | "\n" | Separator used to concatenate the meta fields to the Document content. |
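
To make meta_fields_to_embed and meta_data_separator more concrete, here is a minimal sketch of the concatenation they describe. It assumes that the selected metadata values are placed before the document content, joined with the separator, and that missing or empty fields are skipped. The `build_rerank_text` helper is hypothetical, and the exact ordering used by the component may differ.

```python
from typing import Any, Dict, List, Optional


def build_rerank_text(
    content: str,
    meta: Dict[str, Any],
    meta_fields_to_embed: Optional[List[str]] = None,
    meta_data_separator: str = "\n",
) -> str:
    """Join the selected metadata values and the content into one string."""
    fields = meta_fields_to_embed or []
    # Assumption: fields that are missing or empty are skipped.
    meta_values = [str(meta[key]) for key in fields if meta.get(key)]
    return meta_data_separator.join([*meta_values, content])


# "author" is empty, so only "title" is prepended to the content.
text = build_rerank_text(
    content="Rerankers reorder retrieved documents.",
    meta={"title": "Ranking basics", "author": ""},
    meta_fields_to_embed=["title", "author"],
)

# With no meta fields configured, the content is used unchanged.
plain = build_rerank_text(content="Rerankers reorder retrieved documents.", meta={})
```

Under these assumptions, the string in `text` is what would be compared against the query, so a descriptive field such as a title can influence the ranking.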

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | | The input query to compare the documents to. |
| documents | List[Document] | | A list of documents to be ranked. |
| top_k | Optional[int] | None | The maximum number of documents to return. |
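
As a rough illustration of how the init-time and run-time top_k values interact, the sketch below assumes that a top_k passed to run() takes precedence and that the init value applies when the run-time value is None. The `resolve_top_k` and `select_top` helpers and the score values are hypothetical, written only for this example; they are not the component's implementation.

```python
from typing import List, Optional, Tuple


def resolve_top_k(init_top_k: int, run_top_k: Optional[int] = None) -> int:
    """Assumed precedence: the run-time value wins when it is provided."""
    return init_top_k if run_top_k is None else run_top_k


def select_top(scored: List[Tuple[str, float]], top_k: int) -> List[str]:
    """Sort (document, score) pairs by descending score and keep the first top_k."""
    ordered = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in ordered[:top_k]]


# Made-up relevance scores, for illustration only.
scored_docs = [("doc_a", 0.12), ("doc_b", 0.87), ("doc_c", 0.45), ("doc_d", 0.66)]

# No run-time value: the init-time top_k of 3 applies.
default_result = select_top(scored_docs, resolve_top_k(init_top_k=3))

# A run-time value of 1 overrides the init-time setting.
override_result = select_top(scored_docs, resolve_top_k(init_top_k=3, run_top_k=1))
```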