FastembedRanker

Rank documents based on their similarity to the query using Fastembed models.

Basic Information

  • Type: haystack_integrations.components.rankers.fastembed.ranker.FastembedRanker
  • Components it can connect with:
    • Retrievers: Receives documents from a Retriever or DocumentJoiner in a query pipeline.
    • Builders: Sends ranked documents to ChatPromptBuilder or AnswerBuilder.

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | | The input query to compare the documents to. |
| documents | List[Document] | | A list of documents to be ranked. |
| top_k | Optional[int] | None | The maximum number of documents to return. |

Outputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| documents | List[Document] | | A list of documents sorted from most similar to least similar to the query. |

Overview

FastembedRanker ranks documents by their semantic similarity to a query using Fastembed reranking models, sorting them from most to least relevant.

Use this component after a Retriever to improve the quality of retrieved documents before passing them to a Generator. Reranking helps surface the most relevant documents at the top, improving the quality of generated answers.
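Conceptually, reranking boils down to scoring each (query, document) pair with a cross-encoder and keeping the highest-scoring documents. This minimal, stdlib-only sketch illustrates that sort-and-truncate step; the scores are stand-ins for what a model such as Xenova/ms-marco-MiniLM-L-6-v2 would compute, and the `rerank` helper is hypothetical, not part of the FastembedRanker API.

```python
from typing import List, Tuple

def rerank(scored_docs: List[Tuple[str, float]], top_k: int) -> List[str]:
    """Sort documents by relevance score (descending) and keep the top_k.

    `scored_docs` pairs each document's content with the relevance score a
    cross-encoder would assign it for the query.
    """
    ranked = sorted(scored_docs, key=lambda pair: pair[1], reverse=True)
    return [content for content, _score in ranked[:top_k]]

# Stand-in scores; a real reranker computes these from (query, document) pairs.
docs = [("budget report", 0.12), ("fastembed guide", 0.87), ("meeting notes", 0.45)]
print(rerank(docs, top_k=2))  # ['fastembed guide', 'meeting notes']
```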

Usage Example

This query pipeline uses FastembedRanker to rerank documents after retrieval:

```yaml
components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'default'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20
      fuzziness: 0

  FastembedRanker:
    type: haystack_integrations.components.rankers.fastembed.ranker.FastembedRanker
    init_parameters:
      model_name: Xenova/ms-marco-MiniLM-L-6-v2
      top_k: 5
      cache_dir:
      threads:
      batch_size: 64
      parallel:
      local_files_only: false
      meta_fields_to_embed:
      meta_data_separator: "\n"

  ChatPromptBuilder:
    type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
    init_parameters:
      template:
        - role: system
          content: "You are a helpful assistant answering questions based on the provided documents."
        - role: user
          content: "Documents:\n{% for doc in documents %}\n{{ doc.content }}\n{% endfor %}\n\nQuestion: {{ query }}"

  OpenAIChatGenerator:
    type: haystack.components.generators.chat.openai.OpenAIChatGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - OPENAI_API_KEY
        strict: false
      model: gpt-4o-mini

  OutputAdapter:
    type: haystack.components.converters.output_adapter.OutputAdapter
    init_parameters:
      template: '{{ replies[0] }}'
      output_type: List[str]

  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      reference_pattern: acm

connections:
  - sender: bm25_retriever.documents
    receiver: FastembedRanker.documents
  - sender: FastembedRanker.documents
    receiver: ChatPromptBuilder.documents
  - sender: FastembedRanker.documents
    receiver: answer_builder.documents
  - sender: ChatPromptBuilder.prompt
    receiver: OpenAIChatGenerator.messages
  - sender: OpenAIChatGenerator.replies
    receiver: OutputAdapter.replies
  - sender: OutputAdapter.output
    receiver: answer_builder.replies

inputs:
  query:
    - bm25_retriever.query
    - FastembedRanker.query
    - ChatPromptBuilder.query
    - answer_builder.query
  filters:
    - bm25_retriever.filters

outputs:
  documents: FastembedRanker.documents
  answers: answer_builder.answers

max_runs_per_component: 100

metadata: {}
```

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model_name | str | Xenova/ms-marco-MiniLM-L-6-v2 | Fastembed model name. Check the list of supported models in the Fastembed documentation. |
| top_k | int | 10 | The maximum number of documents to return. |
| cache_dir | Optional[str] | None | The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory. |
| threads | Optional[int] | None | The number of threads a single onnxruntime session can use. |
| batch_size | int | 64 | Number of strings to encode at once. |
| parallel | Optional[int] | None | If > 1, data-parallel encoding is used; recommended for offline encoding of large datasets. If 0, use all available cores. If None, don't use data-parallel processing; use default onnxruntime threading instead. |
| local_files_only | bool | False | If True, only use the model files in the cache_dir. |
| meta_fields_to_embed | Optional[List[str]] | None | List of meta fields that should be concatenated with the document content for reranking. |
| meta_data_separator | str | \n | Separator used to concatenate the meta fields to the Document content. |
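To illustrate how meta_fields_to_embed and meta_data_separator interact, here is a hedged, stdlib-only sketch of the likely behavior: the selected meta values are joined with the separator and prepended to the document content before reranking. The `text_to_rerank` helper is hypothetical, for illustration only.

```python
from typing import Dict, List

def text_to_rerank(content: str, meta: Dict[str, str],
                   meta_fields_to_embed: List[str],
                   meta_data_separator: str = "\n") -> str:
    # Collect the requested meta fields that are present and non-empty,
    # then join them with the content using the separator.
    fields = [str(meta[f]) for f in meta_fields_to_embed if meta.get(f) is not None]
    return meta_data_separator.join(fields + [content])

meta = {"title": "Fastembed guide", "author": "Jane"}
print(text_to_rerank("Rerankers sort documents.", meta, ["title"]))
# Fastembed guide
# Rerankers sort documents.
```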

Run Method Parameters

These are the parameters you can configure for the run() method. You can pass these parameters at query time through the API, in Playground, or when running a job.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | | The input query to compare the documents to. |
| documents | List[Document] | | A list of documents to be ranked. |
| top_k | Optional[int] | None | The maximum number of documents to return. |
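Since top_k appears in both the init and run parameters, a run-time value presumably takes precedence over the value set at initialization, which is the usual pattern for Haystack rankers. A tiny sketch of that assumed precedence rule (the `effective_top_k` helper is hypothetical):

```python
from typing import Optional

def effective_top_k(init_top_k: int, run_top_k: Optional[int] = None) -> int:
    # A top_k passed to run() overrides the one set in init_parameters;
    # otherwise the init value applies.
    return run_top_k if run_top_k is not None else init_top_k

print(effective_top_k(10))             # 10 (init value applies)
print(effective_top_k(10, run_top_k=3))  # 3 (run-time override)
```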