VoyageRanker

Rank documents by relevance to a query using Voyage AI reranking models.

Basic Information

Type: haystack_integrations.components.rankers.voyage.ranker.VoyageRanker
Components it can connect with:
- OpenSearchBM25Retriever: VoyageRanker can receive documents from a retriever.
- PromptBuilder: VoyageRanker can send ranked documents to a prompt builder.

Inputs

Parameter	Type	Default	Description
query	str		The query to rank documents against.
documents	List[Document]		A list of documents to rank.
top_k	Optional[int]	None	Maximum number of documents to return. Overrides the value set at initialization.

Outputs

Parameter	Type	Default	Description
documents	List[Document]		Documents ranked by relevance, sorted from most to least relevant.

Overview

Use VoyageRanker to rerank documents based on their relevance to a query. This component uses Voyage AI's reranking models to score and sort documents, improving retrieval quality by surfacing the most relevant results.

Rerankers are typically used after an initial retrieval step (such as BM25 or embedding-based retrieval) to refine the results before passing them to a language model.
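The retrieve-then-rerank pattern can be sketched in plain Python. This is only an illustration of the two stages: `first_stage`, `toy_score`, and `rerank` are hypothetical helpers, and `toy_score` is a simple word-overlap stand-in, not a Voyage AI model.

```python
from typing import Callable, List, Tuple


def first_stage(query: str, corpus: List[str], top_k: int) -> List[str]:
    """Cheap, recall-oriented retrieval: order documents by shared query terms."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(doc.lower().split())), doc) for doc in corpus]
    hits = [pair for pair in scored if pair[0] > 0]
    # Stable sort: documents with equal scores keep their corpus order.
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in hits[:top_k]]


def toy_score(query: str, doc: str) -> float:
    """Stand-in for a reranking model: share of document words that are query terms."""
    terms = set(query.lower().split())
    words = doc.lower().split()
    return sum(word in terms for word in words) / len(words) if words else 0.0


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_k: int,
) -> List[Tuple[float, str]]:
    """Score every candidate against the query and keep the top_k best."""
    scored = [(score_fn(query, doc), doc) for doc in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


corpus = [
    "documents about cooking pasta at home with many extra words",
    "voyage reranker models sort documents",
    "weather report",
]
query = "rerank documents"

# Stage 1 returns a broad candidate set; stage 2 reorders it and trims it.
candidates = first_stage(query, corpus, top_k=2)
reranked = rerank(query, candidates, toy_score, top_k=1)
```

Here the first stage returns the pasta document ahead of the reranker document because both share one query term, and the second stage moves the shorter, more focused document to the top.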

Authorization

You need a Voyage AI API key to use this component. Connect Haystack Platform to your Voyage AI account on the Integrations page. For detailed instructions, see Use Voyage AI Models.
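When the key is supplied through an environment variable, as with the default `Secret.from_env_var('VOYAGE_API_KEY')`, resolution roughly follows the logic below. This is a stdlib-only sketch of that behavior as we understand it, not the library's code: `resolve_env_secret` and its `environ` argument are hypothetical names introduced for illustration.

```python
import os
from typing import Mapping, Optional, Sequence


def resolve_env_secret(
    env_vars: Sequence[str],
    strict: bool = True,
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the value of the first environment variable that is set.

    With strict=True a missing key raises an error; with strict=False
    the function returns None and leaves the failure to the caller.
    """
    env = os.environ if environ is None else environ
    for name in env_vars:
        value = env.get(name)
        if value is not None:
            return value
    if strict:
        raise ValueError(f"None of these environment variables are set: {list(env_vars)}")
    return None


# A fake environment keeps the example deterministic.
found = resolve_env_secret(["VOYAGE_API_KEY"], environ={"VOYAGE_API_KEY": "example-key"})
missing = resolve_env_secret(["VOYAGE_API_KEY"], strict=False, environ={})
```

With `strict: false`, a missing key does not stop the lookup itself, so the problem would only show up later, when the component calls Voyage AI.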

Usage Example

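This example RAG pipeline retrieves up to 50 documents each from a BM25 retriever and an embedding retriever, joins them, and uses VoyageRanker to keep the 8 most relevant ones before building the prompt: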

components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'default'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 50
      fuzziness: 0

  query_embedder:
    type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
    init_parameters:
      normalize_embeddings: true
      model: intfloat/e5-base-v2

  embedding_retriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'default'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 50

  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate

  ranker:
    type: haystack_integrations.components.rankers.voyage.ranker.VoyageRanker
    init_parameters:
      api_key:
        type: env_var
        env_vars:
        - VOYAGE_API_KEY
        strict: false
      model: rerank-2
      truncate:
      top_k: 8
      prefix:
      suffix:
      timeout:
      max_retries:
      meta_fields_to_embed:
      meta_data_separator: "\n"

  meta_field_grouping_ranker:
    type: haystack.components.rankers.meta_field_grouping_ranker.MetaFieldGroupingRanker
    init_parameters:
      group_by: file_id
      subgroup_by:
      sort_docs_by: split_id

  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      reference_pattern: acm

  PromptBuilder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: "You are a helpful assistant answering the user's questions based on the provided documents.\nDo not use your own knowledge.\n\nProvided documents:\n{% for document in documents %}\nDocument [{{ loop.index }}]:\n{{ document.content }}\n{% endfor %}\n\nQuestion: {{ query }}\nAnswer:"

  generator:
    type: haystack.components.generators.chat.openai.OpenAIChatGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
        - OPENAI_API_KEY
        strict: false
      model: gpt-4o
      generation_kwargs:
        max_tokens: 1000
        temperature: 0.7

connections:
- sender: bm25_retriever.documents
  receiver: document_joiner.documents
- sender: query_embedder.embedding
  receiver: embedding_retriever.query_embedding
- sender: embedding_retriever.documents
  receiver: document_joiner.documents
- sender: document_joiner.documents
  receiver: ranker.documents
- sender: ranker.documents
  receiver: meta_field_grouping_ranker.documents
- sender: meta_field_grouping_ranker.documents
  receiver: answer_builder.documents
- sender: meta_field_grouping_ranker.documents
  receiver: PromptBuilder.documents
- sender: PromptBuilder.prompt
  receiver: generator.messages
- sender: generator.replies
  receiver: answer_builder.replies

inputs:
  query:
  - "bm25_retriever.query"
  - "query_embedder.text"
  - "ranker.query"
  - "answer_builder.query"
  - "PromptBuilder.query"
  filters:
  - "bm25_retriever.filters"
  - "embedding_retriever.filters"

outputs:
  documents: "meta_field_grouping_ranker.documents"
  answers: "answer_builder.answers"

max_runs_per_component: 100

metadata: {}
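The configuration above leaves `timeout` and `max_retries` empty. According to the parameter descriptions, the component then falls back to the `VOYAGE_TIMEOUT` and `VOYAGE_MAX_RETRIES` environment variables, and finally to 30 and 5. A minimal sketch of that fallback order, using a hypothetical helper `resolve_int_setting` that is not part of the integration:

```python
import os
from typing import Mapping, Optional


def resolve_int_setting(
    explicit: Optional[int],
    env_name: str,
    default: int,
    environ: Optional[Mapping[str, str]] = None,
) -> int:
    """Pick the explicit value, else the environment variable, else the default."""
    if explicit is not None:
        return explicit
    env = os.environ if environ is None else environ
    raw = env.get(env_name)
    return int(raw) if raw is not None else default


# Nothing configured anywhere: the documented defaults apply.
timeout_default = resolve_int_setting(None, "VOYAGE_TIMEOUT", 30, environ={})
retries_default = resolve_int_setting(None, "VOYAGE_MAX_RETRIES", 5, environ={})

# The environment variable is used only when the parameter is empty.
timeout_env = resolve_int_setting(None, "VOYAGE_TIMEOUT", 30, environ={"VOYAGE_TIMEOUT": "10"})
timeout_explicit = resolve_int_setting(60, "VOYAGE_TIMEOUT", 30, environ={"VOYAGE_TIMEOUT": "10"})
```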

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

Parameter	Type	Default	Description
api_key	Secret	Secret.from_env_var('VOYAGE_API_KEY')	The Voyage AI API key. It can be explicitly provided or automatically read from the environment variable VOYAGE_API_KEY.
model	str	rerank-2	The name of the Voyage reranking model to use. See the Voyage Rerankers documentation for available models.
truncate	Optional[bool]	None	Whether to truncate the input text to fit within the context length. If `None`, truncates slightly over-length text but raises an error for significantly over-length text.
top_k	Optional[int]	None	The number of most relevant documents to return. If not specified, returns all documents.
prefix	str	""	A string to add to the beginning of each text.
suffix	str	""	A string to add to the end of each text.
timeout	Optional[int]	None	Timeout for Voyage AI client calls. If not set, it is inferred from the `VOYAGE_TIMEOUT` environment variable or set to 30.
max_retries	Optional[int]	None	Maximum number of retries if Voyage AI returns an internal error. If not set, it is inferred from the `VOYAGE_MAX_RETRIES` environment variable or set to 5.
meta_fields_to_embed	Optional[List[str]]	None	List of metadata fields to include when ranking documents.
meta_data_separator	str	"\n"	Separator used to concatenate metadata fields to the document content.
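The `prefix`, `suffix`, `meta_fields_to_embed`, and `meta_data_separator` parameters together determine the text that is ranked for each document. The sketch below shows one plausible way they combine. It assumes metadata values are placed before the content, which the table does not state, and `Doc` and `prepare_texts` are hypothetical names, not the integration's API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Sequence


@dataclass
class Doc:
    """Minimal stand-in for a Haystack Document."""
    content: Optional[str]
    meta: Dict[str, Any] = field(default_factory=dict)


def prepare_texts(
    docs: Sequence[Doc],
    meta_fields_to_embed: Sequence[str] = (),
    meta_data_separator: str = "\n",
    prefix: str = "",
    suffix: str = "",
) -> List[str]:
    """Build the string that would be ranked for each document."""
    texts = []
    for doc in docs:
        # Assumption: selected metadata values come first, then the content.
        meta_values = [
            str(doc.meta[key])
            for key in meta_fields_to_embed
            if doc.meta.get(key) is not None
        ]
        body = meta_data_separator.join([*meta_values, doc.content or ""])
        texts.append(prefix + body + suffix)
    return texts


docs = [Doc("Rerankers reorder results.", {"title": "Ranking", "file_id": 7})]
texts = prepare_texts(
    docs,
    meta_fields_to_embed=["title"],
    meta_data_separator=" | ",
    prefix="passage: ",
)
# texts == ["passage: Ranking | Rerankers reorder results."]
```

Only the fields listed in `meta_fields_to_embed` are included, so `file_id` is left out of the ranked text in this example.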

Run Method Parameters

These are the parameters you can configure for the component's run() method. You can pass these parameters at query time through the API, in Playground, or when running a job.

Parameter	Type	Default	Description
query	str		The query to rank documents against.
documents	List[Document]		A list of documents to rank.
top_k	Optional[int]	None	Maximum number of documents to return. Overrides the value set at initialization.
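Because `top_k` can be set both at initialization and at run time, it helps to see the precedence spelled out. This is a sketch of the documented override rule only: `effective_top_k` and `truncate_ranked` are hypothetical helpers, not methods of VoyageRanker.

```python
from typing import List, Optional, Sequence


def effective_top_k(init_top_k: Optional[int], run_top_k: Optional[int]) -> Optional[int]:
    """The run-time value wins; otherwise fall back to the init value."""
    return run_top_k if run_top_k is not None else init_top_k


def truncate_ranked(
    ranked: Sequence[str],
    init_top_k: Optional[int],
    run_top_k: Optional[int],
) -> List[str]:
    """Cut an already sorted result list; None means return everything."""
    k = effective_top_k(init_top_k, run_top_k)
    return list(ranked) if k is None else list(ranked[:k])


ranked = ["doc_a", "doc_b", "doc_c"]  # assumed sorted from most to least relevant

from_init = truncate_ranked(ranked, init_top_k=2, run_top_k=None)
overridden = truncate_ranked(ranked, init_top_k=2, run_top_k=1)
unlimited = truncate_ranked(ranked, init_top_k=None, run_top_k=None)
```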
