NvidiaRanker
Rank documents based on their similarity to the query using NVIDIA models.
Basic Information
- Type: haystack_integrations.components.rankers.nvidia.ranker.NvidiaRanker
- Components it can connect with:
  - Retriever: NvidiaRanker receives documents from a Retriever.
  - PromptBuilder: NvidiaRanker can send the ranked documents to PromptBuilder to be used in a prompt.
  - Any component that outputs documents or accepts documents as input.
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | The query to rank the documents against. |
| documents | List[Document] | | The list of documents to rank. |
| top_k | Optional[int] | None | The number of documents to return. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | List of documents most similar to the query in descending order of similarity. |
Overview
NvidiaRanker ranks documents based on their semantic similarity to the user query. It uses ranking models provided by NVIDIA NIMs. The default model for this ranker is nvidia/nv-rerankqa-mistral-4b-v3.
Documents are ordered from most to least semantically relevant to the query.
You can also specify the top_k parameter to set the maximum number of documents to return.
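To make the ordering and the effect of top_k concrete, here is a minimal sketch in plain Python. The `Doc` class and the `toy_relevance` scorer are hypothetical stand-ins: the real component gets its scores from an NVIDIA reranking model, not from word overlap. As a simplification, `top_k=None` in this sketch returns every document.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a Haystack Document."""
    content: str
    score: float = 0.0


def toy_relevance(query: str, text: str) -> float:
    """Hypothetical scorer: fraction of query words that appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def rank(
    query: str,
    documents: List[Doc],
    top_k: Optional[int] = None,
    scorer: Callable[[str, str], float] = toy_relevance,
) -> List[Doc]:
    """Score every document, sort from most to least relevant, keep at most top_k."""
    scored = [Doc(d.content, scorer(query, d.content)) for d in documents]
    scored.sort(key=lambda d: d.score, reverse=True)
    return scored if top_k is None else scored[:top_k]


docs = [
    Doc("Bananas are yellow"),
    Doc("Berlin is the capital of Germany"),
    Doc("Paris is the capital of France"),
]
ranked = rank("capital of France", docs, top_k=2)
for doc in ranked:
    print(f"{doc.score:.2f}  {doc.content}")
```

The input order does not matter: the document that matches the query best comes first, and top_k only caps how many of the ordered documents are returned.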
Authorization
You need an NVIDIA API key to use this component. Connect Haystack Enterprise Platform to NVIDIA on the Integrations page. For detailed instructions, see Use NVIDIA Models.
Usage Example
Using the Component in a Pipeline
This is an example of a document search pipeline where NvidiaRanker receives joined documents from both a keyword and a semantic retriever. It then ranks the documents based on their similarity to the user query and outputs them as the final result.
components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 1024
          similarity: cosine
          index: ''
          max_chunk_bytes: 104857600
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          timeout:
      top_k: 20
  embedding_retriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 1024
          similarity: cosine
          index: ''
          max_chunk_bytes: 104857600
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          timeout:
      top_k: 20
  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate
  NvidiaTextEmbedder:
    type: haystack_integrations.components.embedders.nvidia.text_embedder.NvidiaTextEmbedder
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - NVIDIA_API_KEY
        strict: true
      model: nvidia/nv-embedqa-e5-v5
      api_url: https://integrate.api.nvidia.com/v1
      prefix: ''
      suffix: ''
      truncate:
      timeout:
  NvidiaRanker:
    type: haystack_integrations.components.rankers.nvidia.ranker.NvidiaRanker
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - NVIDIA_API_KEY
        strict: true
      model: nvidia/nv-rerankqa-mistral-4b-v3
      api_url: https://integrate.api.nvidia.com/v1
      top_k: 10
      truncate:
      query_prefix: ''
      document_prefix: ''
      meta_fields_to_embed:
      embedding_separator: "\n"
      timeout:

connections:
  - sender: bm25_retriever.documents
    receiver: document_joiner.documents
  - sender: embedding_retriever.documents
    receiver: document_joiner.documents
  - sender: NvidiaTextEmbedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: document_joiner.documents
    receiver: NvidiaRanker.documents

max_runs_per_component: 100

metadata: {}

inputs:
  query:
    - bm25_retriever.query
    - NvidiaTextEmbedder.text
    - NvidiaRanker.query
  filters:
    - bm25_retriever.filters
    - embedding_retriever.filters

outputs:
  documents: NvidiaRanker.documents
top_k parameter
In the example above, the top_k values for the retrievers and the ranker are different. Each retriever's top_k specifies how many documents that retriever returns. The ranker then orders these documents.
You can set the same or a smaller top_k value for the ranker. The ranker's top_k is the number of documents it returns (if it's the last component in the pipeline) or forwards to the next component. In the pipeline example above, the ranker is the last component, so the output you get when you run the pipeline is the top 10 documents, as set by the ranker's top_k.
Adjusting the top_k values can help you optimize performance. In this case, a smaller top_k value for the retrievers means fewer documents for the ranker to process, which can speed up the pipeline.
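The document counts at each stage can be traced with a small simulation. This is a sketch under stated assumptions, not the platform's implementation: `retrieve`, `join_concatenate`, and `rank` are hypothetical stand-ins that only model how many documents flow through, and the joiner is assumed to keep each document once when both retrievers return it.

```python
from typing import Dict, List, Tuple


def retrieve(hits: List[str], top_k: int) -> List[str]:
    """Stand-in for a retriever: return at most top_k document IDs."""
    return hits[:top_k]


def join_concatenate(*result_lists: List[str]) -> List[str]:
    """Stand-in for the joiner: concatenate results, keeping each ID once (assumption)."""
    seen: Dict[str, None] = {}
    for results in result_lists:
        for doc_id in results:
            seen.setdefault(doc_id, None)
    return list(seen)


def rank(doc_ids: List[str], top_k: int) -> List[str]:
    """Stand-in for the ranker: the order is a placeholder, only the count matters here."""
    return doc_ids[:top_k]


def funnel(retriever_top_k: int, ranker_top_k: int) -> Tuple[int, int]:
    """Return (documents the ranker must score, documents the pipeline outputs)."""
    keyword_hits = [f"kw-{i}" for i in range(50)]
    semantic_hits = [f"sem-{i}" for i in range(50)]
    joined = join_concatenate(
        retrieve(keyword_hits, retriever_top_k),
        retrieve(semantic_hits, retriever_top_k),
    )
    return len(joined), len(rank(joined, ranker_top_k))


print(funnel(retriever_top_k=20, ranker_top_k=10))  # values from the pipeline above
print(funnel(retriever_top_k=5, ranker_top_k=10))   # smaller retriever top_k
```

With the values from the pipeline above, the ranker scores up to 40 documents and returns 10. Lowering the retrievers' top_k to 5 cuts the ranker's workload to 10 documents while the output size stays the same.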
Parameters
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | Optional[str] | None | Ranking model to use. |
| truncate | Optional[Union[RankerTruncateMode, str]] | None | Truncation strategy to use. Can be "NONE", "END", or RankerTruncateMode. Defaults to NIM's default. |
| api_key | Optional[Secret] | Secret.from_env_var('NVIDIA_API_KEY') | API key for the NVIDIA NIM. |
| api_url | str | os.getenv('NVIDIA_API_URL', DEFAULT_API_URL) | Custom API URL for the NVIDIA NIM. |
| top_k | int | 5 | Number of documents to return. |
| query_prefix | str | | A string to add at the beginning of the query text before ranking. Use it to prepend the text with an instruction, as required by reranking models like bge. |
| document_prefix | str | | A string to add at the beginning of each document before ranking. You can use it to prepend the document with an instruction, as required by embedding models like bge. |
| meta_fields_to_embed | Optional[List[str]] | None | List of metadata fields to embed with the document. |
| embedding_separator | str | \n | Separator to concatenate metadata fields to the document. |
| timeout | Optional[float] | None | Timeout for request calls. If not set, it is inferred from the NVIDIA_TIMEOUT environment variable or defaults to 60. |
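The following sketch illustrates how a few of these parameters could interact, based only on the descriptions in the table. Both helper functions are hypothetical and are not part of the NvidiaRanker API. Placing the metadata values before the document content is an assumption, as is skipping metadata fields that are missing or empty.

```python
import os
from typing import Any, Dict, List, Optional


def build_ranker_text(
    content: str,
    meta: Dict[str, Any],
    meta_fields_to_embed: Optional[List[str]] = None,
    embedding_separator: str = "\n",
    document_prefix: str = "",
) -> str:
    """Compose the text that would be ranked for one document (illustrative only)."""
    fields = meta_fields_to_embed or []
    # Assumption: fields that are missing or empty are skipped.
    meta_values = [str(meta[field]) for field in fields if meta.get(field)]
    # Assumption: metadata values come first, then the document content.
    return document_prefix + embedding_separator.join([*meta_values, content])


def resolve_timeout(timeout: Optional[float] = None) -> float:
    """Explicit value first, then the NVIDIA_TIMEOUT environment variable, then 60."""
    if timeout is not None:
        return timeout
    return float(os.environ.get("NVIDIA_TIMEOUT", 60.0))


text = build_ranker_text(
    content="NvidiaRanker orders documents by relevance.",
    meta={"title": "Rankers", "lang": "en"},
    meta_fields_to_embed=["title"],
    document_prefix="passage: ",
)
print(repr(text))
print(resolve_timeout(5.0))
```

Only the fields listed in meta_fields_to_embed are added to the text, so the `lang` value in this example is left out.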
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | The query to rank the documents against. |
| documents | List[Document] | | The list of documents to rank. |
| top_k | Optional[int] | None | The number of documents to return. |
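Outside the platform, the same run() parameters can be passed from Python. The sketch below assumes the `nvidia-haystack` integration is installed and that NVIDIA_API_KEY is set in the environment. `rank_texts` and `effective_top_k` are hypothetical helpers, and the fallback from a run-time top_k of None to the init value is an assumption about the component's behavior, not something this page states.

```python
import os
from typing import List, Optional


def effective_top_k(run_top_k: Optional[int], init_top_k: int = 5) -> int:
    """Assumed fallback: a run-time top_k wins, None falls back to the init value."""
    return init_top_k if run_top_k is None else run_top_k


def rank_texts(query: str, texts: List[str], top_k: Optional[int] = None) -> List[str]:
    """Rank plain strings with NvidiaRanker and return them, most relevant first."""
    # Imported lazily so this sketch can be loaded without the integration installed.
    from haystack import Document
    from haystack_integrations.components.rankers.nvidia import NvidiaRanker

    # api_key is left at its default, which reads NVIDIA_API_KEY from the environment.
    ranker = NvidiaRanker(model="nvidia/nv-rerankqa-mistral-4b-v3", top_k=5)
    ranker.warm_up()
    result = ranker.run(
        query=query,
        documents=[Document(content=text) for text in texts],
        top_k=top_k,  # query-time override of the init value
    )
    return [doc.content for doc in result["documents"]]


# The API call only runs when executed as a script with an API key available.
if __name__ == "__main__" and os.environ.get("NVIDIA_API_KEY"):
    sample_texts = [
        "Paris is the capital of France.",
        "Berlin is the capital of Germany.",
        "Bananas are yellow.",
    ]
    print(rank_texts("What is the capital of France?", sample_texts, top_k=2))
```

Passing top_k to run() changes the number of returned documents for that call only, while the value set at initialization stays in place for later calls.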