DeepsetNvidiaRanker
Rank documents by their relevance to the query using NVIDIA Triton.
Basic Information
- Type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
- Components it most often connects to:
  - Retrievers: DeepsetNvidiaRanker can receive documents from a Retriever and then rank them.
  - PromptBuilder: DeepsetNvidiaRanker can send the ranked documents to PromptBuilder, which adds them to the prompt for the LLM.
  - Any component that outputs a list of documents or accepts a list of documents as input.
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | The input query to compare the documents to. |
| documents | List[Document] | | A list of documents to be ranked. |
| top_k | int | None | None |
| scale_score | bool | None | None |
| calibration_factor | float | None | None |
| score_threshold | float | None | None |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | A list of documents closest to the query, sorted from most similar to least similar. |
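
To make the output contract concrete, here is a minimal sketch of the ordering described in the table: documents sorted by score from highest to lowest, then cut to at most `top_k`. `ScoredDoc` and `order_ranked` are hypothetical names used only for illustration. They are not part of the component's API, and the real ranking happens on the platform.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScoredDoc:
    """Hypothetical stand-in for a Haystack Document that carries a ranker score."""
    content: str
    score: float


def order_ranked(docs: List[ScoredDoc], top_k: Optional[int] = None) -> List[ScoredDoc]:
    """Sort by score, highest first, and keep at most top_k documents."""
    ranked = sorted(docs, key=lambda d: d.score, reverse=True)
    return ranked if top_k is None else ranked[:top_k]


docs = [ScoredDoc("a", 0.12), ScoredDoc("b", 0.91), ScoredDoc("c", 0.47)]
top_two = order_ranked(docs, top_k=2)
print([d.content for d in top_two])  # prints ['b', 'c']
```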
Overview
DeepsetNvidiaRanker uses the NVIDIA Triton Inference Server to rank documents by their similarity to the query, assigning similarity scores to each document.
This component runs on optimized hardware and is usable within Haystack Enterprise Platform only, which means it doesn't work if you export it to a local Python file. If you're planning to export, use TransformersSimilarityRanker instead.
The default reranker model for new pipelines is tomaarsen/Qwen3-Reranker-0.6B-seq-cls. Older models (intfloat/simlm-msmarco-reranker, BAAI/bge-reranker-v2-m3, and svalabs/cross-electra-ms-marco-german-uncased) are no longer available for new pipelines. If your existing pipeline uses one of these models, it continues to work without any changes, and the model appears labelled as legacy in the model list.
Usage Example
Basic Configuration
DeepsetNvidiaRanker:
  type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
  init_parameters:
    model: tomaarsen/Qwen3-Reranker-0.6B-seq-cls
    query_prefix: ''
    document_prefix: ''
    top_k: 10
    embedding_separator: \n
    scale_score: true
    calibration_factor: 1
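
The prefix and separator settings are easier to picture with a small sketch of how the text sent to the model might be assembled. This assumes the component follows the same convention as Haystack's open-source similarity rankers: selected metadata values are joined with the document content using `embedding_separator`, and the prefixes are prepended afterwards. `build_pair` is a hypothetical helper, not a function exposed by the component.

```python
from typing import Dict, List, Optional, Tuple


def build_pair(
    query: str,
    content: str,
    meta: Dict[str, object],
    query_prefix: str = "",
    document_prefix: str = "",
    meta_fields_to_embed: Optional[List[str]] = None,
    embedding_separator: str = "\n",
) -> Tuple[str, str]:
    """Return the (query text, document text) pair a reranker would score.

    Assumption: metadata fields that are missing or empty are skipped.
    """
    fields = meta_fields_to_embed or []
    meta_values = [str(meta[key]) for key in fields if key in meta and meta[key]]
    doc_text = embedding_separator.join(meta_values + [content])
    return query_prefix + query, document_prefix + doc_text


pair = build_pair(
    query="What is Triton?",
    content="Triton serves models.",
    meta={"title": "Inference servers"},
    meta_fields_to_embed=["title"],
)
print(pair)  # prints ('What is Triton?', 'Inference servers\nTriton serves models.')
```

With `meta_fields_to_embed` left at its default, only the document content (plus any prefix) is paired with the query.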
Using the Component in a Pipeline
This is an example of a query pipeline where DeepsetNvidiaRanker receives documents from a DocumentJoiner, ranks them, and then returns them as the pipeline output.
Here's the YAML configuration:
components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 768
          similarity: cosine
      top_k: 20
  embedding_retriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 768
          similarity: cosine
      top_k: 20
  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate
  DeepsetNvidiaTextEmbedder:
    type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
    init_parameters:
      model: intfloat/multilingual-e5-base
      prefix: ''
      suffix: ''
      truncate: null
      normalize_embeddings: false
      timeout: null
      backend_kwargs: null
  DeepsetNvidiaRanker:
    type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
    init_parameters:
      model: tomaarsen/Qwen3-Reranker-0.6B-seq-cls
      query_prefix: ''
      document_prefix: ''
      top_k: 10
      score_threshold: null
      meta_fields_to_embed: null
      embedding_separator: \n
      scale_score: true
      calibration_factor: 1
      timeout: null
      backend_kwargs: null

connections:
  - sender: bm25_retriever.documents
    receiver: document_joiner.documents
  - sender: embedding_retriever.documents
    receiver: document_joiner.documents
  - sender: DeepsetNvidiaTextEmbedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: document_joiner.documents
    receiver: DeepsetNvidiaRanker.documents

max_runs_per_component: 100

metadata: {}

inputs:
  query:
    - bm25_retriever.query
    - DeepsetNvidiaTextEmbedder.text
    - DeepsetNvidiaRanker.query
  filters:
    - bm25_retriever.filters
    - embedding_retriever.filters

outputs:
  documents: DeepsetNvidiaRanker.documents
Parameters
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | DeepsetNVIDIARankingModels | DeepsetNVIDIARankingModels.TOMAARSEN_QWEN3_RERANKER_0_6B_SEQ_CLS | The model to use for ranking. Choose the model from a list on the component card. Legacy models are labelled as legacy in the picker and are only available if your pipeline already uses them. |
| query_prefix | str | | String to prepend to queries. |
| document_prefix | str | | String to prepend to documents. |
| top_k | int | 10 | Maximum number of documents to return |
| batch_size | int | 40 | The number of documents to rank at once. |
| score_threshold | float | None | None |
| meta_fields_to_embed | List[str] | None | None |
| embedding_separator | str | \n | Separator used when concatenating metadata fields with the document content. |
| scale_score | bool | True | If True, scales raw logit predictions using a Sigmoid activation function. |
| calibration_factor | float | 1 | Calibration factor used with sigmoid(logits * calibration_factor). Only used if scale_score is True. |
| timeout | float | None | None |
| backend_kwargs | dict | None | None |
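
The interaction between `scale_score` and `calibration_factor` can be illustrated with a short sketch of the formula from the table, `sigmoid(logits * calibration_factor)`. The function name `scaled_score` is hypothetical. The component applies the scaling internally, and this only reproduces the arithmetic.

```python
import math


def scaled_score(logit: float, calibration_factor: float = 1.0) -> float:
    """Map a raw logit to (0, 1) with sigmoid(logit * calibration_factor)."""
    z = logit * calibration_factor
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # Equivalent form that avoids overflow for large negative inputs.
    e = math.exp(z)
    return e / (1.0 + e)


print(scaled_score(0.0))                      # prints 0.5
print(round(scaled_score(2.0), 4))            # prints 0.8808
print(round(scaled_score(2.0, 0.5), 4))       # prints 0.7311
```

A factor below 1 flattens the curve, so scores move toward 0.5. A factor above 1 pushes them toward 0 or 1. The order of the documents does not change as long as the factor is positive.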