FastembedRanker
Rank documents based on their similarity to the query using Fastembed models.
Basic Information
- Type: haystack_integrations.components.rankers.fastembed.ranker.FastembedRanker
- Components it can connect with:
  - Retrievers: Receives documents from a Retriever or DocumentJoiner in a query pipeline.
  - Builders: Sends ranked documents to ChatPromptBuilder or AnswerBuilder.
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str |  | The input query to compare the documents to. |
| documents | List[Document] |  | A list of documents to be ranked. |
| top_k | Optional[int] | None | The maximum number of documents to return. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| documents | List[Document] | A list of documents sorted from most similar to least similar to the query. |
Overview
FastembedRanker ranks documents by their semantic similarity to a query using Fastembed reranking models, sorting them from most to least relevant.
Use this component after a Retriever to improve the quality of retrieved documents before passing them to a Generator. Reranking helps surface the most relevant documents at the top, improving the quality of generated answers.
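For quick experiments outside a pipeline, here is a minimal Python sketch. It assumes the fastembed-haystack integration package is installed; the import path follows the component type above, and the example documents are illustrative only:

```python
# Minimal standalone sketch; assumes `pip install fastembed-haystack`.
from haystack import Document
from haystack_integrations.components.rankers.fastembed import FastembedRanker

ranker = FastembedRanker(model_name="Xenova/ms-marco-MiniLM-L-6-v2", top_k=2)
ranker.warm_up()  # downloads and loads the model (done automatically in a pipeline)

docs = [
    Document(content="Paris is the capital of France."),
    Document(content="Berlin is the capital of Germany."),
    Document(content="The Eiffel Tower is in Paris."),
]

result = ranker.run(query="What is the capital of France?", documents=docs)
for doc in result["documents"]:
    print(doc.score, doc.content)  # sorted from most to least relevant
```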
Usage Example
This query pipeline uses FastembedRanker to rerank documents after retrieval:
```yaml
components:
bm25_retriever:
type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
init_parameters:
document_store:
type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
init_parameters:
hosts:
index: 'default'
max_chunk_bytes: 104857600
embedding_dim: 768
return_embedding: false
method:
mappings:
settings:
create_index: true
http_auth:
use_ssl:
verify_certs:
timeout:
top_k: 20
fuzziness: 0
FastembedRanker:
type: haystack_integrations.components.rankers.fastembed.ranker.FastembedRanker
init_parameters:
model_name: Xenova/ms-marco-MiniLM-L-6-v2
top_k: 5
cache_dir:
threads:
batch_size: 64
parallel:
local_files_only: false
meta_fields_to_embed:
meta_data_separator: "\n"
ChatPromptBuilder:
type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
init_parameters:
template:
- role: system
content: "You are a helpful assistant answering questions based on the provided documents."
- role: user
content: "Documents:\n{% for doc in documents %}\n{{ doc.content }}\n{% endfor %}\n\nQuestion: {{ query }}"
OpenAIChatGenerator:
type: haystack.components.generators.chat.openai.OpenAIChatGenerator
init_parameters:
api_key:
type: env_var
env_vars:
- OPENAI_API_KEY
strict: false
model: gpt-4o-mini
OutputAdapter:
type: haystack.components.converters.output_adapter.OutputAdapter
init_parameters:
template: '{{ replies[0] }}'
output_type: List[str]
answer_builder:
type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
init_parameters:
reference_pattern: acm
connections:
- sender: bm25_retriever.documents
receiver: FastembedRanker.documents
- sender: FastembedRanker.documents
receiver: ChatPromptBuilder.documents
- sender: FastembedRanker.documents
receiver: answer_builder.documents
- sender: ChatPromptBuilder.prompt
receiver: OpenAIChatGenerator.messages
- sender: OpenAIChatGenerator.replies
receiver: OutputAdapter.replies
- sender: OutputAdapter.output
receiver: answer_builder.replies
inputs:
query:
- bm25_retriever.query
- FastembedRanker.query
- ChatPromptBuilder.query
- answer_builder.query
filters:
- bm25_retriever.filters
outputs:
documents: FastembedRanker.documents
answers: answer_builder.answers
max_runs_per_component: 100
metadata: {}
```
Parameters
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model_name | str | Xenova/ms-marco-MiniLM-L-6-v2 | Fastembed model name. Check the list of supported models in the Fastembed documentation. |
| top_k | int | 10 | The maximum number of documents to return. |
| cache_dir | Optional[str] | None | The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory. |
| threads | Optional[int] | None | The number of threads a single onnxruntime session can use. |
| batch_size | int | 64 | Number of strings to encode at once. |
| parallel | Optional[int] | None | If > 1, data-parallel encoding is used, recommended for offline encoding of large datasets. If 0, use all available cores. If None, don't use data-parallel processing, use default onnxruntime threading instead. |
| local_files_only | bool | False | If True, only use the model files in the cache_dir. |
| meta_fields_to_embed | Optional[List[str]] | None | List of meta fields that should be concatenated with the document content for reranking. |
| meta_data_separator | str | \n | Separator used to concatenate the meta fields to the Document content. |
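To illustrate meta_fields_to_embed and meta_data_separator: the listed meta fields are concatenated with the document content before scoring. A hedged sketch follows; the title meta field is a hypothetical example, not something the component requires:

```python
# Sketch: rerank on a meta field plus content. "title" is hypothetical;
# use whatever meta fields your documents actually carry.
from haystack import Document
from haystack_integrations.components.rankers.fastembed import FastembedRanker

ranker = FastembedRanker(
    model_name="Xenova/ms-marco-MiniLM-L-6-v2",
    meta_fields_to_embed=["title"],
    meta_data_separator="\n",  # the model scores "<title>\n<content>"
)
ranker.warm_up()

docs = [
    Document(content="Install with pip.", meta={"title": "Installation"}),
    Document(content="Call run() with a query.", meta={"title": "Usage"}),
]
result = ranker.run(query="How do I install it?", documents=docs)
```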
Run Method Parameters
These are the parameters you can configure for the run() method. You can pass these parameters at query time through the API, in Playground, or when running a job.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | The input query to compare the documents to. | |
| documents | List[Document] | A list of documents to be ranked. | |
| top_k | Optional[int] | None | The maximum number of documents to return. |
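Since top_k defaults to None here, the init-time value applies unless you pass one at query time. A minimal sketch, under the assumption that a query-time top_k overrides the init-time setting:

```python
# Sketch: top_k passed to run() takes precedence over the init-time top_k
# (assumed from the tables above, where run()'s top_k defaults to None).
from haystack import Document
from haystack_integrations.components.rankers.fastembed import FastembedRanker

ranker = FastembedRanker(top_k=10)  # default cap set at init
ranker.warm_up()

docs = [Document(content=f"Example document {i}") for i in range(10)]
result = ranker.run(query="example", documents=docs, top_k=3)
assert len(result["documents"]) == 3  # query-time cap wins
```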