FastembedRanker

Rank documents based on their similarity to the query using Fastembed models.

Key Features

  • Uses Fastembed reranking models for semantic similarity ranking.
  • Sorts documents from most to least relevant to improve retrieval quality.
  • Configurable top_k to control the number of returned documents.
  • Supports data-parallel processing for faster ranking on large datasets.
  • Compatible with metadata field embedding for richer context.
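The data-parallel option in the list above is controlled by the parallel init parameter. As a rough illustration only, the sketch below shows one way its documented values (None, 0, and values greater than 1) could be mapped to a worker count. The helper name resolve_worker_count is hypothetical, and treating 1 or negative values as "no data-parallel processing" is an assumption; this is not the Fastembed implementation.

```python
import os
from typing import Optional


def resolve_worker_count(parallel: Optional[int]) -> Optional[int]:
    """Hypothetical helper: map the `parallel` setting to a worker count.

    Returns None when data-parallel processing is not used.
    """
    if parallel is None:
        # Documented: no data-parallel processing, default onnxruntime threading.
        return None
    if parallel == 0:
        # Documented: use all available cores.
        return os.cpu_count() or 1
    if parallel > 1:
        # Documented: data-parallel encoding with this many workers.
        return parallel
    # Assumption: 1 and negative values are not described, so fall back
    # to no data-parallel processing.
    return None


print(resolve_worker_count(None))
print(resolve_worker_count(4))
```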

Configuration

  1. Drag the FastembedRanker component onto the canvas from the Component Library.
  2. Click the component to open the configuration panel.
  3. Configure the parameters as needed.

Connections

FastembedRanker receives a query string and a list of documents, typically from a Retriever or DocumentJoiner. It outputs the same documents ranked from most to least relevant to the query. Connect its output to ChatPromptBuilder or AnswerBuilder for downstream processing.

Usage Example

This query pipeline uses FastembedRanker to rerank documents after retrieval:

components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'default'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20
      fuzziness: 0

  FastembedRanker:
    type: haystack_integrations.components.rankers.fastembed.ranker.FastembedRanker
    init_parameters:
      model_name: Xenova/ms-marco-MiniLM-L-6-v2
      top_k: 5
      cache_dir:
      threads:
      batch_size: 64
      parallel:
      local_files_only: false
      meta_fields_to_embed:
      meta_data_separator: "\n"

  ChatPromptBuilder:
    type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
    init_parameters:
      template:
        - role: system
          content: "You are a helpful assistant answering questions based on the provided documents."
        - role: user
          content: "Documents:\n{% for doc in documents %}\n{{ doc.content }}\n{% endfor %}\n\nQuestion: {{ query }}"

  OpenAIChatGenerator:
    type: haystack.components.generators.chat.openai.OpenAIChatGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - OPENAI_API_KEY
        strict: false
      model: gpt-4o-mini

  OutputAdapter:
    type: haystack.components.converters.output_adapter.OutputAdapter
    init_parameters:
      template: '{{ replies[0] }}'
      output_type: List[str]

  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      reference_pattern: acm

connections:
  - sender: bm25_retriever.documents
    receiver: FastembedRanker.documents
  - sender: FastembedRanker.documents
    receiver: ChatPromptBuilder.documents
  - sender: FastembedRanker.documents
    receiver: answer_builder.documents
  - sender: ChatPromptBuilder.prompt
    receiver: OpenAIChatGenerator.messages
  - sender: OpenAIChatGenerator.replies
    receiver: OutputAdapter.replies
  - sender: OutputAdapter.output
    receiver: answer_builder.replies

inputs:
  query:
    - bm25_retriever.query
    - FastembedRanker.query
    - ChatPromptBuilder.query
    - answer_builder.query
  filters:
    - bm25_retriever.filters

outputs:
  documents: FastembedRanker.documents
  answers: answer_builder.answers

max_runs_per_component: 100

metadata: {}

Parameters

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | | The input query to compare the documents to. |
| documents | List[Document] | | A list of documents to be ranked. |
| top_k | Optional[int] | None | The maximum number of documents to return. |

Outputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| documents | List[Document] | | A list of documents sorted from most similar to least similar to the query. |

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model_name | str | Xenova/ms-marco-MiniLM-L-6-v2 | Fastembed model name. Check the list of supported models in the Fastembed documentation. |
| top_k | int | 10 | The maximum number of documents to return. |
| cache_dir | Optional[str] | None | The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH environment variable. Defaults to fastembed_cache in the system's temp directory. |
| threads | Optional[int] | None | The number of threads a single onnxruntime session can use. |
| batch_size | int | 64 | Number of strings to encode at once. |
| parallel | Optional[int] | None | If > 1, data-parallel encoding is used, which is recommended for offline encoding of large datasets. If 0, all available cores are used. If None, data-parallel processing is not used and the default onnxruntime threading applies instead. |
| local_files_only | bool | False | If True, only use the model files in the cache_dir. |
| meta_fields_to_embed | Optional[List[str]] | None | List of meta fields that should be concatenated with the document content for reranking. |
| meta_data_separator | str | \n | Separator used to concatenate the meta fields to the Document content. |
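To make the interaction between meta_fields_to_embed and meta_data_separator more concrete, here is a minimal sketch of how the text passed to the reranking model could be assembled. The Doc class and the build_ranker_inputs function are hypothetical stand-ins written for this example, and placing the meta values before the content is an assumption; the component's actual implementation may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for a Haystack Document (content plus meta)."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def build_ranker_inputs(
    documents: List[Doc],
    meta_fields_to_embed: List[str],
    meta_data_separator: str = "\n",
) -> List[str]:
    """Join the selected meta values and the content into one string per document."""
    texts = []
    for doc in documents:
        # Keep only requested meta fields that are present and non-empty.
        meta_values = [
            str(doc.meta[key]) for key in meta_fields_to_embed if doc.meta.get(key)
        ]
        # Assumption: meta values come first, followed by the document content.
        texts.append(meta_data_separator.join([*meta_values, doc.content]))
    return texts


docs = [
    Doc("Berlin is the capital of Germany.", {"title": "Berlin", "lang": "en"}),
    Doc("Paris is the capital of France."),
]
inputs = build_ranker_inputs(docs, meta_fields_to_embed=["title"])
print(inputs)
```

In this sketch, a document without the requested meta field is passed through with its content only, so mixed document sets do not need special handling.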

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | | The input query to compare the documents to. |
| documents | List[Document] | | A list of documents to be ranked. |
| top_k | Optional[int] | None | The maximum number of documents to return. |
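Because top_k appears both as an init parameter and as a run() parameter, it can help to see how the two might interact. The sketch below assumes that a top_k passed at query time takes precedence over the configured one and that a non-positive value is rejected; both behaviors are assumptions for illustration. The rank function and the word_overlap scorer are hypothetical, with word_overlap standing in for the Fastembed reranking model.

```python
from typing import Callable, List, Optional


def word_overlap(query: str, text: str) -> float:
    """Toy relevance score: number of query words that also occur in the text."""
    return float(len(set(query.lower().split()) & set(text.lower().split())))


def rank(
    query: str,
    documents: List[str],
    score_fn: Callable[[str, str], float] = word_overlap,
    init_top_k: int = 10,
    run_top_k: Optional[int] = None,
) -> List[str]:
    """Sort documents from most to least relevant and keep at most top_k."""
    # Assumption: a run-time top_k overrides the value set at init time.
    top_k = init_top_k if run_top_k is None else run_top_k
    if top_k <= 0:
        raise ValueError(f"top_k must be > 0, but got {top_k}")
    scored = [(score_fn(query, doc), doc) for doc in documents]
    # Highest score first; Python's sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


docs = ["the cat sat on the mat", "dogs bark loudly", "a cat and a dog"]
configured = rank("cat on mat", docs, init_top_k=2)
overridden = rank("cat on mat", docs, init_top_k=2, run_top_k=1)
print(configured)
print(overridden)
```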