NvidiaRanker
Ranks documents by their semantic similarity to the query using NVIDIA NIM ranking models.
Key Features
- Ranks documents using NVIDIA NIM ranking models.
- Supports both self-hosted NVIDIA NIM models and NVIDIA API Catalog models.
- Orders documents from most to least semantically relevant to the query.
- Configurable `top_k` to return only the most relevant documents.
- Supports prefix strings for query and document text, required by some ranking models.
- Default model: `nvidia/nv-rerankqa-mistral-4b-v3`.
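Conceptually, the ranker takes a query and a list of documents and reorders them by relevance. The stdlib-only Python sketch below illustrates that behavior with a naive token-overlap score standing in for the NIM model's semantic similarity; the scoring function is an invented placeholder, not the real API:

```python
def rank(query: str, documents: list[str], top_k: int = 5) -> list[str]:
    """Toy stand-in for a semantic ranker: score each document by how
    many query tokens it contains, then return the top_k best matches."""
    query_tokens = set(query.lower().split())
    scored = [
        (len(query_tokens & set(doc.lower().split())), doc)
        for doc in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

docs = [
    "NVIDIA NIM serves ranking models.",
    "Bananas are yellow.",
    "Ranking models reorder documents by query relevance.",
]
print(rank("ranking models for documents", docs, top_k=2))
```

The real component performs the scoring with the configured NIM model, but the contract is the same: documents in, a relevance-ordered `top_k` slice out.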
Configuration
Connect Haystack Platform to NVIDIA on the Integrations page. For detailed instructions, see Use NVIDIA Models.
- Drag the `NvidiaRanker` component onto the canvas from the Component Library.
- Click the component to open the configuration panel.
- On the General tab, enter the name of the NVIDIA ranking model to use, such as `nvidia/nv-rerankqa-mistral-4b-v3`.
- Go to the Advanced tab to configure the API key, API URL, `top_k`, and truncation mode.
Connections
NvidiaRanker accepts a `query` string, a `documents` list, and an optional `top_k` as inputs. It outputs the ranked `documents` list, sorted from most to least relevant.
Connect a Retriever or DocumentJoiner to its `documents` input. Connect its `documents` output to PromptBuilder or use it as the pipeline's final output.
top_k parameter
The `top_k` values for the retrievers and the ranker serve different purposes. The retrievers' `top_k` specifies how many documents they return. The ranker's `top_k` is the number of documents it returns to the next component or as pipeline output.
You can set the same or a smaller `top_k` value for the ranker. A smaller `top_k` for the retrievers means fewer documents for the ranker to process, which can improve pipeline performance.
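To make the interaction concrete, here is a stdlib-only sketch of the flow: two retrievers each return their own `top_k` candidates, a concatenate-style join deduplicates them, and the ranker re-scores the joined list and keeps its own `top_k`. All document names and scores are invented for illustration; in the real pipeline they come from OpenSearch and the NVIDIA NIM model.

```python
# Invented retriever results: (document name, retriever score).
bm25_results = [("doc_a", 7.1), ("doc_b", 5.4), ("doc_c", 3.3)]
embedding_results = [("doc_b", 0.91), ("doc_d", 0.88), ("doc_e", 0.80)]

# DocumentJoiner with join_mode=concatenate keeps one copy of each document.
seen, joined = set(), []
for name, _ in bm25_results + embedding_results:
    if name not in seen:
        seen.add(name)
        joined.append(name)

# The ranker re-scores the joined candidates and keeps its own top_k.
ranker_scores = {"doc_a": 0.2, "doc_b": 0.9, "doc_c": 0.1, "doc_d": 0.7, "doc_e": 0.5}
top_k = 3
ranked = sorted(joined, key=lambda d: ranker_scores[d], reverse=True)[:top_k]
print(ranked)  # ['doc_b', 'doc_d', 'doc_e']
```

Note that the retrievers' `top_k` (here 3 each) bounds how many candidates the ranker sees, while the ranker's `top_k` bounds what leaves the component.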
Usage Example
This is an example of a document search pipeline where NvidiaRanker receives joined documents from both a keyword and a semantic retriever and returns the ranked documents as the final result.
```yaml
components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 1024
          similarity: cosine
          index: ''
          max_chunk_bytes: 104857600
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          timeout:
      top_k: 20
  embedding_retriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 1024
          similarity: cosine
          index: ''
          max_chunk_bytes: 104857600
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          timeout:
      top_k: 20
  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate
  NvidiaTextEmbedder:
    type: haystack_integrations.components.embedders.nvidia.text_embedder.NvidiaTextEmbedder
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - NVIDIA_API_KEY
        strict: true
      model: nvidia/nv-embedqa-e5-v5
      api_url: https://integrate.api.nvidia.com/v1
      prefix: ''
      suffix: ''
      truncate:
      timeout:
  NvidiaRanker:
    type: haystack_integrations.components.rankers.nvidia.ranker.NvidiaRanker
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - NVIDIA_API_KEY
        strict: true
      model: nvidia/nv-rerankqa-mistral-4b-v3
      api_url: https://integrate.api.nvidia.com/v1
      top_k: 10
      truncate:
      query_prefix: ''
      document_prefix: ''
      meta_fields_to_embed:
      embedding_separator: "\n"
      timeout:

connections:
  - sender: bm25_retriever.documents
    receiver: document_joiner.documents
  - sender: embedding_retriever.documents
    receiver: document_joiner.documents
  - sender: NvidiaTextEmbedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: document_joiner.documents
    receiver: NvidiaRanker.documents

max_runs_per_component: 100

metadata: {}

inputs:
  query:
    - bm25_retriever.query
    - NvidiaTextEmbedder.text
    - NvidiaRanker.query
  filters:
    - bm25_retriever.filters
    - embedding_retriever.filters

outputs:
  documents: NvidiaRanker.documents
```
Parameters
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | The query to rank the documents against. |
| documents | List[Document] | | The list of documents to rank. |
| top_k | Optional[int] | None | The number of documents to return. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | List of documents most similar to the query, in descending order of similarity. |
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | Optional[str] | None | Ranking model to use. |
| truncate | Optional[Union[RankerTruncateMode, str]] | None | Truncation strategy to use. Can be "NONE", "END", or RankerTruncateMode. Defaults to NIM's default. |
| api_key | Optional[Secret] | Secret.from_env_var('NVIDIA_API_KEY') | API key for the NVIDIA NIM. |
| api_url | str | os.getenv('NVIDIA_API_URL', DEFAULT_API_URL) | Custom API URL for the NVIDIA NIM. |
| top_k | int | 5 | Number of documents to return. |
| query_prefix | str | '' | A string to add at the beginning of the query text before ranking. Use it to prepend the text with an instruction, as required by reranking models like bge. |
| document_prefix | str | '' | A string to add at the beginning of each document before ranking. You can use it to prepend the document with an instruction, as required by embedding models like bge. |
| meta_fields_to_embed | Optional[List[str]] | None | List of metadata fields to embed with the document. |
| embedding_separator | str | \n | Separator to concatenate metadata fields to the document. |
| timeout | Optional[float] | None | Timeout for request calls. If not set, it is inferred from the NVIDIA_TIMEOUT environment variable or set to 60 by default. |
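To illustrate `meta_fields_to_embed` and `embedding_separator`: the selected metadata values are prepended to each document's content, joined by the separator, before the text is sent to the model. The stdlib-only sketch below mirrors that assembly step; the field names are made up for the example.

```python
def build_ranking_text(content, meta, meta_fields_to_embed, embedding_separator="\n"):
    """Prepend selected metadata values to the document content,
    joined by the separator, mirroring how the ranker assembles the
    text it scores."""
    values = [str(meta[f]) for f in meta_fields_to_embed if meta.get(f) is not None]
    return embedding_separator.join(values + [content])

doc_meta = {"title": "NIM ranking", "source": "docs"}  # hypothetical metadata
text = build_ranking_text("Ranks documents by relevance.", doc_meta, ["title"])
print(text)  # "NIM ranking\nRanks documents by relevance."
```

Embedding metadata such as a title this way can help the model rank documents whose content alone is ambiguous.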
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | The query to rank the documents against. |
| documents | List[Document] | | The list of documents to rank. |
| top_k | Optional[int] | None | The number of documents to return. |
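A `top_k` passed at query time takes precedence over the value configured at initialization. The stdlib-only toy class below models that precedence; it is not the real NvidiaRanker API, and the ranking itself is omitted.

```python
class ToyRanker:
    """Toy model of init-time vs run-time top_k precedence."""
    def __init__(self, top_k=5):
        self.top_k = top_k

    def run(self, query, documents, top_k=None):
        # A run-time top_k, when provided, overrides the init-time value.
        k = top_k if top_k is not None else self.top_k
        return {"documents": documents[:k]}  # ranking step omitted for brevity

ranker = ToyRanker(top_k=2)
print(ranker.run("q", ["a", "b", "c"]))           # uses init top_k=2
print(ranker.run("q", ["a", "b", "c"], top_k=3))  # run-time override
```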