VoyageRanker

Rank documents by relevance to a query using Voyage AI reranking models. Rerankers are typically used after initial retrieval (like BM25 or embedding-based retrieval) to refine the results before passing them to a language model.

Key Features

  • Reranks retrieved documents by relevance using Voyage AI's reranking models.
  • Configurable number of results with top_k.
  • Adds an optional prefix and suffix to each document's text before ranking.
  • Optionally includes selected metadata fields alongside the document content when ranking.
  • Configurable timeout and retry behavior.

Configuration

  1. Drag the VoyageRanker component onto the canvas from the Component Library.
  2. Click on the component to open the configuration panel.
  3. On the General tab:
    • Connect Haystack Platform to your Voyage AI account on the Integrations page. For detailed instructions, see Use Voyage AI Models.
    • Select the reranking model to use.
    • Set top_k to control the number of ranked documents to return.
  4. Go to the Advanced tab to configure timeout, max_retries, truncate, meta_fields_to_embed, and meta_data_separator.
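The Init Parameters table states that, when left empty, timeout and max_retries are inferred from the VOYAGE_TIMEOUT and VOYAGE_MAX_RETRIES environment variables, or fall back to 30 and 5. A minimal sketch of that fallback order, assuming integer values; the helper names are hypothetical and are not part of the component:

```python
import os
from typing import Optional


def resolve_setting(explicit: Optional[int], env_var: str, default: int) -> int:
    """Return the explicit value, else the environment variable, else the default."""
    if explicit is not None:
        return explicit
    # os.environ values are strings, so convert before returning.
    return int(os.environ.get(env_var, default))


# Hypothetical helpers; env var names and defaults are taken from the parameter table.
def resolve_timeout(timeout: Optional[int] = None) -> int:
    return resolve_setting(timeout, "VOYAGE_TIMEOUT", 30)


def resolve_max_retries(max_retries: Optional[int] = None) -> int:
    return resolve_setting(max_retries, "VOYAGE_MAX_RETRIES", 5)
```

A value set on the Advanced tab corresponds to the explicit argument here and takes precedence over the environment.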

Connections

VoyageRanker receives a query string (typically the pipeline's query input) and a list of documents from retrievers like OpenSearchBM25Retriever. It sends the ranked documents to PromptBuilder or other downstream components for answer generation.
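The retrieve-then-rerank flow can be illustrated with a small, self-contained sketch. It does not call Voyage AI: the token_overlap function is a toy stand-in for the reranking model, and documents are plain dictionaries rather than Haystack Document objects. All names here are hypothetical.

```python
from typing import Callable, Dict, List, Optional


def token_overlap(query: str, text: str) -> float:
    """Toy relevance score: share of query tokens that appear in the text."""
    query_tokens = set(query.lower().split())
    text_tokens = set(text.lower().split())
    return len(query_tokens & text_tokens) / len(query_tokens) if query_tokens else 0.0


def rerank(
    query: str,
    documents: List[Dict[str, str]],
    score_fn: Callable[[str, str], float],
    top_k: Optional[int] = None,
) -> List[Dict[str, str]]:
    """Sort candidates from most to least relevant and keep at most top_k."""
    ranked = sorted(documents, key=lambda d: score_fn(query, d["content"]), reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Candidates as they might arrive from a retriever.
candidates = [
    {"content": "Paris is the capital of France"},
    {"content": "Berlin is in Germany"},
    {"content": "The capital of Germany is Berlin"},
]
top_docs = rerank("capital of Germany", candidates, token_overlap, top_k=2)
print([d["content"] for d in top_docs])
# → ['The capital of Germany is Berlin', 'Paris is the capital of France']
```

In a real pipeline, the scoring step is performed by the selected Voyage AI reranking model, and only the top_k ranked documents continue to the prompt.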

Usage Examples

Basic Configuration

ranker:
  type: haystack_integrations.components.rankers.voyage.ranker.VoyageRanker
  init_parameters:
    api_key:
      type: env_var
      env_vars:
        - VOYAGE_API_KEY
      strict: false
    model: rerank-2
    top_k: 8
    meta_data_separator: "\n"

This is an example RAG pipeline with VoyageRanker for document reranking:

components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'default'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 50
      fuzziness: 0

  query_embedder:
    type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
    init_parameters:
      normalize_embeddings: true
      model: intfloat/e5-base-v2

  embedding_retriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'default'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 50

  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate

  ranker:
    type: haystack_integrations.components.rankers.voyage.ranker.VoyageRanker
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - VOYAGE_API_KEY
        strict: false
      model: rerank-2
      truncate:
      top_k: 8
      prefix:
      suffix:
      timeout:
      max_retries:
      meta_fields_to_embed:
      meta_data_separator: "\n"

  meta_field_grouping_ranker:
    type: haystack.components.rankers.meta_field_grouping_ranker.MetaFieldGroupingRanker
    init_parameters:
      group_by: file_id
      subgroup_by:
      sort_docs_by: split_id

  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      reference_pattern: acm

  PromptBuilder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: "You are a helpful assistant answering the user's questions based on the provided documents.\nDo not use your own knowledge.\n\nProvided documents:\n{% for document in documents %}\nDocument [{{ loop.index }}]:\n{{ document.content }}\n{% endfor %}\n\nQuestion: {{ query }}\nAnswer:"

  generator:
    type: haystack.components.generators.chat.openai.OpenAIChatGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - OPENAI_API_KEY
        strict: false
      model: gpt-4o
      generation_kwargs:
        max_tokens: 1000
        temperature: 0.7

connections:
  - sender: bm25_retriever.documents
    receiver: document_joiner.documents
  - sender: query_embedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: embedding_retriever.documents
    receiver: document_joiner.documents
  - sender: document_joiner.documents
    receiver: ranker.documents
  - sender: ranker.documents
    receiver: meta_field_grouping_ranker.documents
  - sender: meta_field_grouping_ranker.documents
    receiver: answer_builder.documents
  - sender: meta_field_grouping_ranker.documents
    receiver: PromptBuilder.documents
  - sender: PromptBuilder.prompt
    receiver: generator.messages
  - sender: generator.replies
    receiver: answer_builder.replies

inputs:
  query:
    - "bm25_retriever.query"
    - "query_embedder.text"
    - "ranker.query"
    - "answer_builder.query"
    - "PromptBuilder.query"
  filters:
    - "bm25_retriever.filters"
    - "embedding_retriever.filters"

outputs:
  documents: "meta_field_grouping_ranker.documents"
  answers: "answer_builder.answers"

max_runs_per_component: 100

metadata: {}

Parameters

Inputs

| Parameter | Type | Description |
| --- | --- | --- |
| query | str | The query to rank documents against. |
| documents | List[Document] | A list of documents to rank. |
| top_k | Optional[int] | Maximum number of documents to return. Overrides the value set at initialization. |

Outputs

| Parameter | Type | Description |
| --- | --- | --- |
| documents | List[Document] | Documents ranked by relevance, sorted from most to least relevant. |

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | Secret | Secret.from_env_var('VOYAGE_API_KEY') | The Voyage AI API key. It can be explicitly provided or automatically read from the environment variable VOYAGE_API_KEY. |
| model | str | rerank-2 | The name of the Voyage reranking model to use. See the Voyage Rerankers documentation for available models. |
| truncate | Optional[bool] | None | Whether to truncate the input text to fit within the context length. If None, truncates slightly over-length text but raises an error for significantly over-length text. |
| top_k | Optional[int] | None | The number of most relevant documents to return. If not specified, returns all documents. |
| prefix | str | "" | A string to add to the beginning of each text. |
| suffix | str | "" | A string to add to the end of each text. |
| timeout | Optional[int] | None | Timeout for Voyage AI client calls. If not set, it is inferred from the VOYAGE_TIMEOUT environment variable or set to 30. |
| max_retries | Optional[int] | None | Maximum retries if Voyage AI returns an internal error. If not set, it is inferred from the VOYAGE_MAX_RETRIES environment variable or set to 5. |
| meta_fields_to_embed | Optional[List[str]] | None | List of metadata fields to include when ranking documents. |
| meta_data_separator | str | "\n" | Separator used to concatenate metadata fields to the document content. |
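To make the interaction of prefix, suffix, meta_fields_to_embed, and meta_data_separator concrete, here is a minimal sketch of how the text for one document could be assembled before ranking. It is an illustration, not the component's source: the function name is hypothetical, documents are reduced to a content string plus a metadata dictionary, and the ordering (metadata values first, then content, with prefix and suffix wrapped around the result) is an assumption.

```python
from typing import Any, Dict, List, Optional


def build_ranker_text(
    content: str,
    meta: Dict[str, Any],
    meta_fields_to_embed: Optional[List[str]] = None,
    meta_data_separator: str = "\n",
    prefix: str = "",
    suffix: str = "",
) -> str:
    """Join the selected metadata values and the content, then add prefix and suffix."""
    fields = meta_fields_to_embed or []
    # Skip fields that are missing or empty so they do not add stray separators.
    meta_values = [str(meta[f]) for f in fields if meta.get(f) is not None]
    body = meta_data_separator.join(meta_values + [content])
    return prefix + body + suffix


text = build_ranker_text(
    content="Body text",
    meta={"title": "Guide", "page": 3},
    meta_fields_to_embed=["title", "author"],  # "author" is absent and is skipped
    meta_data_separator=" | ",
    prefix="passage: ",
)
print(text)
```

With the default separator "\n", each selected metadata value ends up on its own line above the document content.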

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | | The query to rank documents against. |
| documents | List[Document] | | A list of documents to rank. |
| top_k | Optional[int] | None | Maximum number of documents to return. Overrides the value set at initialization. |
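The precedence between the two top_k settings can be sketched as follows: a top_k passed at query time overrides the value set at initialization, and if neither is set, all ranked documents are returned. The function names are hypothetical and the snippet only models the rule described in the tables.

```python
from typing import List, Optional


def effective_top_k(init_top_k: Optional[int], run_top_k: Optional[int] = None) -> Optional[int]:
    """The run-time value wins when provided; otherwise use the init value."""
    return run_top_k if run_top_k is not None else init_top_k


def apply_top_k(ranked: List[str], top_k: Optional[int]) -> List[str]:
    """Return every document when top_k is None, else the first top_k."""
    return ranked if top_k is None else ranked[:top_k]


ranked_docs = ["doc_a", "doc_b", "doc_c", "doc_d"]

# Initialized with top_k=3, overridden to 2 at query time.
print(apply_top_k(ranked_docs, effective_top_k(init_top_k=3, run_top_k=2)))
# No value anywhere: all documents are returned.
print(apply_top_k(ranked_docs, effective_top_k(init_top_k=None)))
```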