NvidiaRanker

Rank documents based on their similarity to the query using NVIDIA models.

Basic Information

Type: haystack_integrations.components.rankers.nvidia.ranker.NvidiaRanker
Components it can connect with:
- Retriever: NvidiaRanker receives documents from Retriever.
- PromptBuilder: NvidiaRanker can send the ranked documents to PromptBuilder to be used in a prompt.
- Any component that outputs documents or accepts documents as input.

Inputs

Parameter	Type	Default	Description
query	str		The query to rank the documents against.
documents	List[Document]		The list of documents to rank.
top_k	Optional[int]	None	The number of documents to return.

Outputs

Parameter	Type	Default	Description
documents	List[Document]		List of documents most similar to the query in descending order of similarity.

Overview

NvidiaRanker ranks documents based on their semantic similarity to the user query. It uses ranking models provided by NVIDIA NIMs. The default model for this ranker is nvidia/nv-rerankqa-mistral-4b-v3.

Documents are ordered from most to least semantically relevant to the query.

You can also specify the top_k parameter to set the maximum number of documents to return.

Authorization

You need an NVIDIA API key to use this component. Connect Haystack Enterprise Platform to NVIDIA on the Integrations page. For detailed instructions, see Use NVIDIA Models.

Usage Example

Using the Component in a Pipeline

This is an example of a document search pipeline where NvidiaRanker receives joined documents from both a keyword and a semantic retriever. It then ranks the documents based on their similarity to the user query and outputs them as the final result.

components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
          - ${OPENSEARCH_HOST}
          http_auth:
          - ${OPENSEARCH_USER}
          - ${OPENSEARCH_PASSWORD}
          embedding_dim: 1024
          similarity: cosine
          index: ''
          max_chunk_bytes: 104857600
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          timeout:
      top_k: 20
  embedding_retriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
          - ${OPENSEARCH_HOST}
          http_auth:
          - ${OPENSEARCH_USER}
          - ${OPENSEARCH_PASSWORD}
          embedding_dim: 1024
          similarity: cosine
          index: ''
          max_chunk_bytes: 104857600
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          timeout:
      top_k: 20
  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate
  NvidiaTextEmbedder:
    type: haystack_integrations.components.embedders.nvidia.text_embedder.NvidiaTextEmbedder
    init_parameters:
      api_key:
        type: env_var
        env_vars:
        - NVIDIA_API_KEY
        strict: true
      model: nvidia/nv-embedqa-e5-v5
      api_url: https://integrate.api.nvidia.com/v1
      prefix: ''
      suffix: ''
      truncate:
      timeout:
  NvidiaRanker:
    type: haystack_integrations.components.rankers.nvidia.ranker.NvidiaRanker
    init_parameters:
      api_key:
        type: env_var
        env_vars:
        - NVIDIA_API_KEY
        strict: true
      model: nvidia/nv-rerankqa-mistral-4b-v3
      api_url: https://integrate.api.nvidia.com/v1
      top_k: 10
      truncate:
      query_prefix: ''
      document_prefix: ''
      meta_fields_to_embed:
      embedding_separator: "\n"
      timeout:

connections:
- sender: bm25_retriever.documents
  receiver: document_joiner.documents
- sender: embedding_retriever.documents
  receiver: document_joiner.documents
- sender: NvidiaTextEmbedder.embedding
  receiver: embedding_retriever.query_embedding
- sender: document_joiner.documents
  receiver: NvidiaRanker.documents

max_runs_per_component: 100

metadata: {}

inputs:
  query:
  - bm25_retriever.query
  - NvidiaTextEmbedder.text
  - NvidiaRanker.query
  filters:
  - bm25_retriever.filters
  - embedding_retriever.filters

outputs:
  documents: NvidiaRanker.documents

top_k parameter

In the example above, the top_k values for the retrievers and the ranker are different. The retrievers' top_k specifies how many documents they return. The ranker then orders these documents.

You can set the same or a smaller top_k value for the ranker. The ranker's top_k is the number of documents it returns (if it's the last component in the pipeline) or forwards to the next component. In the pipeline example above, the ranker is the last component, so running the pipeline returns the top 10 documents, as set by the ranker's top_k.

Adjusting the top_k values can help you optimize performance. In this case, a smaller top_k value for the retrievers means fewer documents for the ranker to process, which can speed up the pipeline.
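
To make the document counts concrete, here is a minimal sketch of the flow in plain Python. It does not call Haystack or NVIDIA: `Doc`, `retrieve`, `join`, `rank`, and `toy_score` are hypothetical stand-ins, and the word-overlap score only takes the place of the reranking model. It assumes that the joiner keeps one copy of each document returned by both retrievers.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Doc:
    """Hypothetical stand-in for a Haystack Document."""
    id: str
    content: str


def toy_score(query: str, doc: Doc) -> float:
    """Toy relevance: fraction of query words that appear in the document."""
    query_words = set(query.lower().split())
    doc_words = set(doc.content.lower().split())
    return len(query_words & doc_words) / max(len(query_words), 1)


def retrieve(corpus: List[Doc], top_k: int) -> List[Doc]:
    """Pretend retriever: returns the first top_k documents it can see."""
    return corpus[:top_k]


def join(*doc_lists: List[Doc]) -> List[Doc]:
    """Concatenate retriever outputs, keeping one copy per document id."""
    seen: Dict[str, Doc] = {}
    for docs in doc_lists:
        for doc in docs:
            seen.setdefault(doc.id, doc)
    return list(seen.values())


def rank(query: str, docs: List[Doc], top_k: int) -> List[Doc]:
    """Order by descending score, then keep at most top_k documents."""
    ordered = sorted(docs, key=lambda d: toy_score(query, d), reverse=True)
    return ordered[:top_k]


corpus = [Doc(id=str(i), content=f"document {i} about topic {i % 7}") for i in range(100)]

keyword_hits = retrieve(corpus[:60], top_k=20)   # ids 0..19
semantic_hits = retrieve(corpus[10:], top_k=20)  # ids 10..29
candidates = join(keyword_hits, semantic_hits)   # 30 unique documents
ranked = rank("topic 3", candidates, top_k=10)

print(len(keyword_hits), len(semantic_hits), len(candidates), len(ranked))
# prints: 20 20 30 10
```

The ranker scores every candidate it receives, so the retrievers' top_k values drive the ranking cost, while the ranker's top_k only limits how many documents are passed on.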

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

Parameter	Type	Default	Description
model	Optional[str]	None	Ranking model to use.
truncate	Optional[Union[RankerTruncateMode, str]]	None	Truncation strategy to use. Accepts "NONE", "END", or a RankerTruncateMode value. If not set, the NIM's default applies.
api_key	Optional[Secret]	Secret.from_env_var('NVIDIA_API_KEY')	API key for the NVIDIA NIM.
api_url	str	os.getenv('NVIDIA_API_URL', DEFAULT_API_URL)	Custom API URL for the NVIDIA NIM.
top_k	int	5	Number of documents to return.
query_prefix	str	''	A string to add at the beginning of the query text before ranking. Use it to prepend the query with an instruction, as required by reranking models like `bge`.
document_prefix	str	''	A string to add at the beginning of each document before ranking. Use it to prepend the document with an instruction, as required by reranking models like `bge`.
meta_fields_to_embed	Optional[List[str]]	None	List of metadata fields to embed with the document.
embedding_separator	str	\n	Separator to concatenate metadata fields to the document.
timeout	Optional[float]	None	Timeout for request calls. If not set, it is inferred from the `NVIDIA_TIMEOUT` environment variable or defaults to 60.
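
The prefix, `meta_fields_to_embed`, and `embedding_separator` parameters all shape the text that the model scores. The sketch below illustrates one plausible way they combine. It assumes the usual Haystack convention of placing the selected metadata values before the document content, and the helper names `build_document_text` and `build_query_text` are hypothetical, not part of the component's API.

```python
from typing import Any, Dict, List, Optional


def build_document_text(
    content: str,
    meta: Dict[str, Any],
    meta_fields_to_embed: Optional[List[str]] = None,
    embedding_separator: str = "\n",
    document_prefix: str = "",
) -> str:
    """Hypothetical helper: assemble the text that is scored for one document."""
    fields = meta_fields_to_embed or []
    # Keep only the requested metadata fields that are present and not None.
    meta_values = [str(meta[key]) for key in fields if meta.get(key) is not None]
    # Assumption: metadata values come first, then the content.
    return document_prefix + embedding_separator.join(meta_values + [content])


def build_query_text(query: str, query_prefix: str = "") -> str:
    """Hypothetical helper: prepend the optional instruction to the query."""
    return query_prefix + query


doc_text = build_document_text(
    content="NIMs expose ranking models over HTTP.",
    meta={"title": "NVIDIA NIM overview", "page": 3},
    meta_fields_to_embed=["title", "author"],  # "author" is absent, so it is skipped
    document_prefix="passage: ",
)
query_text = build_query_text("What is a NIM?", query_prefix="query: ")

print(repr(doc_text))
# prints: 'passage: NVIDIA NIM overview\nNIMs expose ranking models over HTTP.'
print(repr(query_text))
# prints: 'query: What is a NIM?'
```

With the default empty prefixes and no metadata fields, the text sent for ranking is just the document content.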

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

Parameter	Type	Default	Description
query	str		The query to rank the documents against.
documents	List[Document]		The list of documents to rank.
top_k	Optional[int]	None	The number of documents to return.
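
Because top_k appears both as an init parameter (default 5) and as a run parameter (default None), it helps to see how the two might interact. The sketch below assumes that a value passed at query time takes precedence and that None falls back to the init value. `resolve_top_k` is a hypothetical helper written for illustration, not a function of the component.

```python
from typing import Optional


def resolve_top_k(init_top_k: int, run_top_k: Optional[int] = None) -> int:
    """Hypothetical helper: pick the effective top_k for one run.

    Assumption: a run-time value overrides the init value, and None means
    "use the value the component was configured with".
    """
    return init_top_k if run_top_k is None else run_top_k


print(resolve_top_k(init_top_k=10))               # prints: 10
print(resolve_top_k(init_top_k=10, run_top_k=3))  # prints: 3
```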
