NvidiaRanker
Ranks documents by their semantic similarity to the query using NVIDIA NIM ranking models.
Key Features
- Ranks documents by semantic similarity to the query using NVIDIA NIM ranking models.
- Documents are ordered from most to least semantically relevant to the query.
- Default model: nvidia/nv-rerankqa-mistral-4b-v3.
- Configurable top_k to limit the number of returned documents.
- Supports custom query and document prefixes.
Configuration
- Drag the NvidiaRanker component onto the canvas from the Component Library.
- Click on the component to open the configuration panel.
- On the General tab:
  - Connect Haystack Platform to your NVIDIA account on the Integrations page. For instructions, see Use NVIDIA Models.
  - Select the ranking model.
  - Set top_k to control the number of documents returned.
- Go to the Advanced tab to configure truncate, query_prefix, document_prefix, meta_fields_to_embed, embedding_separator, timeout, and api_url.
Connections
NvidiaRanker accepts a query string through its query input and a list of documents through its documents input, with an optional top_k override at runtime. It outputs ranked documents through its documents output, sorted from most to least relevant.
Connect a Retriever's (or DocumentJoiner's) documents output to NvidiaRanker's documents input. Then connect its documents output to PromptBuilder or use it as the pipeline's final output.
top_k parameter
In pipelines with both a retriever and a ranker, the retriever and the ranker each have their own top_k value. The retriever's top_k specifies how many documents it returns. The ranker then orders these documents.
You can set the same or a smaller top_k value for the ranker. The ranker's top_k is the number of documents it returns (if it's the last component in the pipeline) or forwards to the next component.
Adjusting the top_k values can help you optimize performance. A smaller top_k for the retriever means fewer documents to process for the ranker, which can speed up the pipeline.
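To make the interplay of the two top_k values concrete, here is a minimal Python sketch. It does not call Haystack or NVIDIA: retrieve and rank are hypothetical stand-ins, and the word-overlap score is only a placeholder for the relevance score a ranking model would produce.

```python
from typing import List, Tuple


def retrieve(corpus: List[str], top_k: int) -> List[str]:
    """Hypothetical retriever stand-in: returns the first top_k documents."""
    return corpus[:top_k]


def rank(query: str, documents: List[str], top_k: int) -> List[Tuple[str, int]]:
    """Hypothetical ranker stand-in: scores by shared words, keeps the best top_k."""
    query_words = set(query.lower().split())
    scored = [(doc, len(query_words & set(doc.lower().split()))) for doc in documents]
    # list.sort() is stable, so documents with equal scores keep their retrieval order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


corpus = [f"document {i} about topic {i % 7}" for i in range(100)]
corpus[15] = "how to rerank documents with a ranking model"

candidates = retrieve(corpus, top_k=20)  # the ranker scores 20 documents, not 100
results = rank("rerank documents", candidates, top_k=10)  # 10 documents move on

print(len(candidates), len(results))
print(results[0][0])
```

In this sketch, lowering the retriever's top_k shrinks the list the ranker has to score, while the ranker's top_k only trims what is passed on afterwards.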
Source Code
To check this component's source code, open ranker.py in the Haystack Core Integrations repository.
Usage Examples
Basic Configuration
NvidiaRanker:
  type: haystack_integrations.components.rankers.nvidia.ranker.NvidiaRanker
  init_parameters:
    api_key:
      type: env_var
      env_vars:
        - NVIDIA_API_KEY
      strict: true
    model: nvidia/nv-rerankqa-mistral-4b-v3
    api_url: https://integrate.api.nvidia.com/v1
    top_k: 10
    query_prefix: ''
    document_prefix: ''
    embedding_separator: "\n"
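The query_prefix, document_prefix, and embedding_separator settings, together with the meta_fields_to_embed init parameter, shape the text that is sent for ranking. The Python sketch below shows one plausible way these pieces combine. It is an assumption about the assembly order, not the component's actual code; build_document_text, build_query_text, and the "passage: " prefix are hypothetical names and values chosen for illustration.

```python
from typing import Dict, List, Optional


def build_document_text(
    content: str,
    meta: Dict[str, object],
    meta_fields_to_embed: Optional[List[str]] = None,
    embedding_separator: str = "\n",
    document_prefix: str = "",
) -> str:
    """Assumed assembly order: prefix + selected metadata values + content."""
    fields = meta_fields_to_embed or []
    # Skip metadata fields that are missing or empty (falsy).
    meta_values = [str(meta[key]) for key in fields if meta.get(key)]
    return document_prefix + embedding_separator.join([*meta_values, content])


def build_query_text(query: str, query_prefix: str = "") -> str:
    """Assumed behaviour: the prefix is simply prepended to the query."""
    return query_prefix + query


text = build_document_text(
    content="Rerankers reorder retrieved documents.",
    meta={"title": "Ranking basics", "author": ""},
    meta_fields_to_embed=["title", "author"],
    document_prefix="passage: ",  # hypothetical instruction prefix
)
print(repr(text))
```

With empty prefixes and no metadata fields, as in the basic configuration, the sketch passes the document content and the query through unchanged.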
This is an example of a document search pipeline where NvidiaRanker receives joined documents from both a keyword and a semantic retriever and returns the ranked documents as the final result.
components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 1024
          similarity: cosine
          index: ''
          max_chunk_bytes: 104857600
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          timeout:
      top_k: 20
  embedding_retriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 1024
          similarity: cosine
          index: ''
          max_chunk_bytes: 104857600
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          timeout:
      top_k: 20
  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate
  NvidiaTextEmbedder:
    type: haystack_integrations.components.embedders.nvidia.text_embedder.NvidiaTextEmbedder
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - NVIDIA_API_KEY
        strict: true
      model: nvidia/nv-embedqa-e5-v5
      api_url: https://integrate.api.nvidia.com/v1
      prefix: ''
      suffix: ''
      truncate:
      timeout:
  NvidiaRanker:
    type: haystack_integrations.components.rankers.nvidia.ranker.NvidiaRanker
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - NVIDIA_API_KEY
        strict: true
      model: nvidia/nv-rerankqa-mistral-4b-v3
      api_url: https://integrate.api.nvidia.com/v1
      top_k: 10
      truncate:
      query_prefix: ''
      document_prefix: ''
      meta_fields_to_embed:
      embedding_separator: "\n"
      timeout:
connections:
  - sender: bm25_retriever.documents
    receiver: document_joiner.documents
  - sender: embedding_retriever.documents
    receiver: document_joiner.documents
  - sender: NvidiaTextEmbedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: document_joiner.documents
    receiver: NvidiaRanker.documents
max_runs_per_component: 100
metadata: {}
inputs:
  query:
    - bm25_retriever.query
    - NvidiaTextEmbedder.text
    - NvidiaRanker.query
  filters:
    - bm25_retriever.filters
    - embedding_retriever.filters
outputs:
  documents: NvidiaRanker.documents
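The data flow of this pipeline can be sketched in plain Python. The sketch is illustrative only: Doc, join_concatenate, and rerank are hypothetical stand-ins, the join is assumed to keep one entry per document ID (the highest-scored one), and word overlap replaces the score a ranking model would assign.

```python
from typing import Dict, List, NamedTuple


class Doc(NamedTuple):
    id: str
    content: str
    score: float


def join_concatenate(*document_lists: List[Doc]) -> List[Doc]:
    """Assumed join behaviour: merge all lists, keeping the best-scored copy per id."""
    best: Dict[str, Doc] = {}
    for documents in document_lists:
        for doc in documents:
            if doc.id not in best or doc.score > best[doc.id].score:
                best[doc.id] = doc
    return list(best.values())


def rerank(query: str, documents: List[Doc], top_k: int) -> List[Doc]:
    """Hypothetical stand-in for the ranker: word overlap replaces the model score."""
    words = set(query.lower().split())
    rescored = [
        doc._replace(score=float(len(words & set(doc.content.lower().split()))))
        for doc in documents
    ]
    return sorted(rescored, key=lambda doc: doc.score, reverse=True)[:top_k]


bm25_hits = [Doc("a", "nvidia nim ranking models", 7.2), Doc("b", "opensearch index settings", 5.1)]
embedding_hits = [Doc("a", "nvidia nim ranking models", 0.83), Doc("c", "ranking documents by relevance", 0.79)]

joined = join_concatenate(bm25_hits, embedding_hits)
ranked = rerank("ranking models", joined, top_k=2)
print([doc.id for doc in ranked])
```

Keyword and embedding scores generally sit on different scales, so the joined list is not meaningfully ordered until the ranker assigns every document a score of its own.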
Parameters
Inputs
| Parameter | Type | Description |
|---|---|---|
| query | str | The query to rank the documents against. |
| documents | List[Document] | The list of documents to rank. |
| top_k | Optional[int] | The number of documents to return. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| documents | List[Document] | List of documents most similar to the query in descending order of similarity. |
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | Optional[str] | None | Ranking model to use. |
| truncate | Optional[Union[RankerTruncateMode, str]] | None | Truncation strategy to use. Can be "NONE", "END", or RankerTruncateMode. Defaults to NIM's default. |
| api_key | Optional[Secret] | Secret.from_env_var('NVIDIA_API_KEY') | API key for the NVIDIA NIM. |
| api_url | str | os.getenv('NVIDIA_API_URL', DEFAULT_API_URL) | Custom API URL for the NVIDIA NIM. |
| top_k | int | 5 | Number of documents to return. |
| query_prefix | str | "" | A string to add at the beginning of the query text before ranking. Use it to prepend the text with an instruction, as required by reranking models like bge. |
| document_prefix | str | "" | A string to add at the beginning of each document before ranking. Use it to prepend the document with an instruction, as required by reranking models like bge. |
| meta_fields_to_embed | Optional[List[str]] | None | List of metadata fields to embed with the document. |
| embedding_separator | str | \n | Separator to concatenate metadata fields to the document. |
| timeout | Optional[float] | None | Timeout for request calls. If not set, it is inferred from the NVIDIA_TIMEOUT environment variable or defaults to 60. |
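The fallback order described for timeout can be sketched as follows. This is a minimal illustration that assumes the environment variable is parsed as a float; resolve_timeout and DEFAULT_TIMEOUT are hypothetical names, not part of the component.

```python
import os
from typing import Mapping, Optional

DEFAULT_TIMEOUT = 60.0  # default value given in the timeout row of the table


def resolve_timeout(timeout: Optional[float] = None, env: Mapping[str, str] = os.environ) -> float:
    """Explicit value first, then NVIDIA_TIMEOUT, then the default of 60."""
    if timeout is not None:
        return timeout
    # Assumption: the environment variable holds a number and is parsed as a float.
    return float(env.get("NVIDIA_TIMEOUT", DEFAULT_TIMEOUT))


print(resolve_timeout(10.0, env={}))
print(resolve_timeout(env={"NVIDIA_TIMEOUT": "30"}))
print(resolve_timeout(env={}))
```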
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | The query to rank the documents against. |
| documents | List[Document] | | The list of documents to rank. |
| top_k | Optional[int] | None | The number of documents to return. |
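Because the run-time top_k defaults to None, a reasonable reading is that the value set at initialization applies unless a query-time value is passed. The sketch below illustrates that assumed precedence; effective_top_k is a hypothetical helper, not a function of the component.

```python
from typing import Optional


def effective_top_k(init_top_k: int, run_top_k: Optional[int] = None) -> int:
    """Use the run-time value when one is passed, else fall back to the init value."""
    return init_top_k if run_top_k is None else run_top_k


print(effective_top_k(10))     # no query-time override
print(effective_top_k(10, 3))  # query-time override
```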