NvidiaRanker

Ranks documents by their semantic similarity to the query using NVIDIA NIM ranking models.

Key Features

  • Ranks documents using NVIDIA NIM ranking models.
  • Supports both self-hosted NVIDIA NIM models and NVIDIA API Catalog models.
  • Orders documents from most to least semantically relevant to the query.
  • Configurable top_k to return only the most relevant documents.
  • Supports prefix strings for query and document text, required by some ranking models.
  • Default model: nvidia/nv-rerankqa-mistral-4b-v3.
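
As a minimal sketch of the prefix option listed above, the fragment below sets query_prefix and document_prefix in the ranker's init_parameters. The prefix strings are placeholders chosen for illustration, not values recommended for this model. Use the strings your model's documentation specifies, and leave both empty for models that need no prefix.

```yaml
NvidiaRanker:
  type: haystack_integrations.components.rankers.nvidia.ranker.NvidiaRanker
  init_parameters:
    model: nvidia/nv-rerankqa-mistral-4b-v3
    # Placeholder prefixes for illustration only; check your model's documentation.
    query_prefix: 'query: '
    document_prefix: 'passage: '
    top_k: 5
```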

Configuration

Authentication

Connect Haystack Platform to NVIDIA on the Integrations page. For detailed instructions, see Use NVIDIA Models.

  1. Drag the NvidiaRanker component onto the canvas from the Component Library.
  2. Click the component to open the configuration panel.
  3. On the General tab:
    1. Enter the name of the NVIDIA ranking model to use, such as nvidia/nv-rerankqa-mistral-4b-v3.
  4. Go to the Advanced tab to configure the API key, API URL, top_k, and truncation mode.
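
The Advanced tab settings correspond to init_parameters in the pipeline YAML. The sketch below shows what they might look like for a self-hosted NIM. The api_url value is a hypothetical local address, so replace it with your own endpoint. The truncate value END is one of the two string options listed in the Init Parameters table.

```yaml
NvidiaRanker:
  type: haystack_integrations.components.rankers.nvidia.ranker.NvidiaRanker
  init_parameters:
    model: nvidia/nv-rerankqa-mistral-4b-v3
    # Hypothetical address of a self-hosted NIM; replace with your endpoint.
    api_url: http://localhost:8000/v1
    api_key:
      type: env_var
      env_vars:
        - NVIDIA_API_KEY
      strict: true
    top_k: 5
    # "NONE" or "END"; leave empty to use the NIM's default.
    truncate: END
```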

Connections

NvidiaRanker accepts a query string, a documents list, and an optional top_k as inputs. It outputs the ranked documents list sorted from most to least relevant.

Connect a Retriever or DocumentJoiner to its documents input. Connect its documents output to PromptBuilder or use it as the pipeline's final output.
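
A sketch of the connections described above, for the case where the ranked documents feed a prompt instead of being the final output. It assumes a PromptBuilder component named prompt_builder whose template uses a documents variable; both names are illustrative and depend on your pipeline.

```yaml
connections:
  # Joined retriever results go into the ranker.
  - sender: document_joiner.documents
    receiver: NvidiaRanker.documents
  # Assumption: prompt_builder exists and its template has a `documents` variable.
  - sender: NvidiaRanker.documents
    receiver: prompt_builder.documents
```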

top_k parameter

The top_k values for the retrievers and the ranker serve different purposes. The retrievers' top_k specifies how many documents they return. The ranker's top_k is the number of documents it returns to the next component or as pipeline output.

Set the ranker's top_k to the same value as the retrievers' top_k or to a smaller one. Lowering the retrievers' top_k gives the ranker fewer documents to process, which can improve pipeline performance.
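
To make the relationship concrete, here is an abbreviated fragment (only the top_k settings are shown, so it is not a complete pipeline) using the same component names as the usage example in the next section. With two retrievers returning up to 20 documents each, the ranker scores at most 40 joined documents and passes on the 10 most relevant.

```yaml
components:
  bm25_retriever:
    init_parameters:
      top_k: 20   # up to 20 keyword matches
  embedding_retriever:
    init_parameters:
      top_k: 20   # up to 20 semantic matches
  NvidiaRanker:
    init_parameters:
      top_k: 10   # the ranker scores at most 40 documents and returns the best 10
```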

Usage Example

This is an example of a document search pipeline where NvidiaRanker receives joined documents from both a keyword and a semantic retriever and returns the ranked documents as the final result.

components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 1024
          similarity: cosine
          index: ''
          max_chunk_bytes: 104857600
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          timeout:
      top_k: 20
  embedding_retriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 1024
          similarity: cosine
          index: ''
          max_chunk_bytes: 104857600
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          timeout:
      top_k: 20
  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate
  NvidiaTextEmbedder:
    type: haystack_integrations.components.embedders.nvidia.text_embedder.NvidiaTextEmbedder
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - NVIDIA_API_KEY
        strict: true
      model: nvidia/nv-embedqa-e5-v5
      api_url: https://integrate.api.nvidia.com/v1
      prefix: ''
      suffix: ''
      truncate:
      timeout:
  NvidiaRanker:
    type: haystack_integrations.components.rankers.nvidia.ranker.NvidiaRanker
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - NVIDIA_API_KEY
        strict: true
      model: nvidia/nv-rerankqa-mistral-4b-v3
      api_url: https://integrate.api.nvidia.com/v1
      top_k: 10
      truncate:
      query_prefix: ''
      document_prefix: ''
      meta_fields_to_embed:
      embedding_separator: "\n"
      timeout:

connections:
  - sender: bm25_retriever.documents
    receiver: document_joiner.documents
  - sender: embedding_retriever.documents
    receiver: document_joiner.documents
  - sender: NvidiaTextEmbedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: document_joiner.documents
    receiver: NvidiaRanker.documents

max_runs_per_component: 100

metadata: {}

inputs:
  query:
    - bm25_retriever.query
    - NvidiaTextEmbedder.text
    - NvidiaRanker.query
  filters:
    - bm25_retriever.filters
    - embedding_retriever.filters

outputs:
  documents: NvidiaRanker.documents

Parameters

Inputs

| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | The query to rank the documents against. |
| documents | List[Document] | | The list of documents to rank. |
| top_k | Optional[int] | None | The number of documents to return. |

Outputs

| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | List of documents most similar to the query in descending order of similarity. |

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
|---|---|---|---|
| model | Optional[str] | None | Ranking model to use. |
| truncate | Optional[Union[RankerTruncateMode, str]] | None | Truncation strategy to use. Can be "NONE", "END", or RankerTruncateMode. Defaults to NIM's default. |
| api_key | Optional[Secret] | Secret.from_env_var('NVIDIA_API_KEY') | API key for the NVIDIA NIM. |
| api_url | str | os.getenv('NVIDIA_API_URL', DEFAULT_API_URL) | Custom API URL for the NVIDIA NIM. |
| top_k | int | 5 | Number of documents to return. |
| query_prefix | str | | A string to add at the beginning of the query text before ranking. Use it to prepend the query with an instruction, as required by reranking models like bge. |
| document_prefix | str | | A string to add at the beginning of each document before ranking. Use it to prepend the document with an instruction, as required by reranking models like bge. |
| meta_fields_to_embed | Optional[List[str]] | None | List of metadata fields to embed with the document. |
| embedding_separator | str | \n | Separator used to concatenate metadata fields to the document. |
| timeout | Optional[float] | None | Timeout for request calls. If not set, it is inferred from the NVIDIA_TIMEOUT environment variable or defaults to 60. |

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | The query to rank the documents against. |
| documents | List[Document] | | The list of documents to rank. |
| top_k | Optional[int] | None | The number of documents to return. |