DeepsetNvidiaRanker

Rank documents by their relevance to the query using NVIDIA Triton.

Basic Information

  • Type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
  • Components it most often connects to:
    • Retrievers: DeepsetNvidiaRanker can receive documents from a Retriever and then rank them.
    • PromptBuilder: DeepsetNvidiaRanker can send the ranked documents to PromptBuilder, which adds them to the prompt for the LLM.
    • Any component that outputs a list of documents or accepts a list of documents as input.

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | | The input query to compare the documents to. |
| documents | List[Document] | | A list of documents to be ranked. |
| top_k | int | None | None |
| scale_score | bool | None | None |
| calibration_factor | float | None | None |
| score_threshold | float | None | None |

Outputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| documents | List[Document] | | A list of documents closest to the query, sorted from most similar to least similar. |

Overview

DeepsetNvidiaRanker uses the NVIDIA Triton Inference Server to rank documents by their similarity to the query. It assigns a similarity score to each document.

This component runs on optimized hardware and works only within Haystack Enterprise Platform, so it doesn't work in a pipeline exported to a local Python file. If you plan to export your pipeline, use TransformersSimilarityRanker instead.

Default Model

The default reranker model for new pipelines is tomaarsen/Qwen3-Reranker-0.6B-seq-cls. Older models (intfloat/simlm-msmarco-reranker, BAAI/bge-reranker-v2-m3, and svalabs/cross-electra-ms-marco-german-uncased) are no longer available for new pipelines. If your existing pipeline uses one of these models, it continues to work without any changes, and the model appears labelled as legacy in the model list.

Usage Example

Basic Configuration

  DeepsetNvidiaRanker:
    type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
    init_parameters:
      model: tomaarsen/Qwen3-Reranker-0.6B-seq-cls
      query_prefix: ''
      document_prefix: ''
      top_k: 10
      embedding_separator: "\n"
      scale_score: true
      calibration_factor: 1
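
To illustrate what the prefix and separator settings in the configuration above control, here is a minimal Python sketch of how the text sent to a ranking model could be assembled. It is not the component's actual implementation: the function name `build_ranker_inputs` is hypothetical, and placing the metadata values before the document content is an assumption.

```python
from typing import Any, Dict, List, Optional, Tuple


def build_ranker_inputs(
    query: str,
    content: str,
    meta: Dict[str, Any],
    query_prefix: str = "",
    document_prefix: str = "",
    meta_fields_to_embed: Optional[List[str]] = None,
    embedding_separator: str = "\n",
) -> Tuple[str, str]:
    """Return the (query_text, document_text) pair a ranker would score."""
    # Keep only the requested metadata fields that are present and non-empty.
    meta_values = [
        str(meta[field])
        for field in (meta_fields_to_embed or [])
        if meta.get(field)
    ]
    # Assumption: metadata values come first, followed by the document content.
    document_text = embedding_separator.join(meta_values + [content])
    return query_prefix + query, document_prefix + document_text


pair = build_ranker_inputs(
    query="What is reranking?",
    content="Rerankers reorder retrieved documents.",
    meta={"title": "Ranking basics", "author": ""},
    query_prefix="query: ",
    document_prefix="passage: ",
    meta_fields_to_embed=["title", "author"],
)
print(pair[0])
print(pair[1])
```

With empty prefixes and no metadata fields, as in the basic configuration, the query and the document content pass through unchanged.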

Using the Component in a Pipeline

This is an example of a query pipeline where DeepsetNvidiaRanker receives documents from a DocumentJoiner, ranks them, and then returns them as the pipeline output.

Here's the YAML configuration:

components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 768
          similarity: cosine
      top_k: 20
  embedding_retriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 768
          similarity: cosine
      top_k: 20
  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate
  DeepsetNvidiaTextEmbedder:
    type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
    init_parameters:
      model: intfloat/multilingual-e5-base
      prefix: ''
      suffix: ''
      truncate: null
      normalize_embeddings: false
      timeout: null
      backend_kwargs: null
  DeepsetNvidiaRanker:
    type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
    init_parameters:
      model: tomaarsen/Qwen3-Reranker-0.6B-seq-cls
      query_prefix: ''
      document_prefix: ''
      top_k: 10
      score_threshold: null
      meta_fields_to_embed: null
      embedding_separator: "\n"
      scale_score: true
      calibration_factor: 1
      timeout: null
      backend_kwargs: null
connections:
  - sender: bm25_retriever.documents
    receiver: document_joiner.documents
  - sender: embedding_retriever.documents
    receiver: document_joiner.documents
  - sender: DeepsetNvidiaTextEmbedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: document_joiner.documents
    receiver: DeepsetNvidiaRanker.documents
max_runs_per_component: 100
metadata: {}
inputs:
  query:
    - bm25_retriever.query
    - DeepsetNvidiaTextEmbedder.text
    - DeepsetNvidiaRanker.query
  filters:
    - bm25_retriever.filters
    - embedding_retriever.filters
outputs:
  documents: DeepsetNvidiaRanker.documents
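
The data flow of this pipeline can be sketched in plain Python: two retrievers return candidate lists, the joiner merges them, and the ranker rescores and truncates the merged list. This is only an illustration under stated assumptions. Documents are represented as hypothetical dictionaries, `join_concatenate` assumes that duplicates are resolved by keeping the highest-scored copy, and `word_overlap` is a toy scorer standing in for the reranker model served by NVIDIA Triton.

```python
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]  # hypothetical stand-in for a Document: id, content, score


def join_concatenate(*doc_lists: List[Doc]) -> List[Doc]:
    """Merge result lists, keeping the highest-scored copy of each document id."""
    best: Dict[str, Doc] = {}
    for docs in doc_lists:
        for doc in docs:
            current = best.get(doc["id"])
            if current is None or doc["score"] > current["score"]:
                best[doc["id"]] = doc
    return list(best.values())


def rerank(
    query: str,
    docs: List[Doc],
    score_fn: Callable[[str, str], float],
    top_k: int = 10,
) -> List[Doc]:
    """Rescore every document against the query and keep the top_k best."""
    scored = [{**doc, "score": score_fn(query, doc["content"])} for doc in docs]
    scored.sort(key=lambda d: d["score"], reverse=True)
    return scored[:top_k]


def word_overlap(query: str, content: str) -> float:
    """Toy scorer (NOT the real model): share of query words found in content."""
    q_words = set(query.lower().split())
    c_words = set(content.lower().split())
    return len(q_words & c_words) / len(q_words) if q_words else 0.0


bm25_hits = [
    {"id": "a", "content": "triton serves ranking models", "score": 7.1},
    {"id": "b", "content": "cats sleep all day", "score": 3.0},
]
embedding_hits = [
    {"id": "a", "content": "triton serves ranking models", "score": 0.83},
    {"id": "c", "content": "ranking documents by relevance", "score": 0.79},
]

joined = join_concatenate(bm25_hits, embedding_hits)
ranked = rerank("ranking models", joined, word_overlap, top_k=2)
print([doc["id"] for doc in ranked])
```

The retriever scores only matter for deduplication in this sketch. The final order depends entirely on the scores the ranker assigns.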

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | DeepsetNVIDIARankingModels | DeepsetNVIDIARankingModels.TOMAARSEN_QWEN3_RERANKER_0_6B_SEQ_CLS | The model to use for ranking. Choose the model from a list on the component card. Legacy models are labelled as legacy in the picker and are only available if your pipeline already uses them. |
| query_prefix | str | | String to prepend to queries. |
| document_prefix | str | | String to prepend to documents. |
| top_k | int | 10 | Maximum number of documents to return. |
| batch_size | int | 40 | The number of documents to rank at once. |
| score_threshold | float | None | None |
| meta_fields_to_embed | List[str] | None | None |
| embedding_separator | str | \n | Separator used when concatenating metadata fields with the document content. |
| scale_score | bool | True | If True, scales raw logit predictions using a Sigmoid activation function. |
| calibration_factor | float | 1 | Calibration factor used with sigmoid(logits * calibration_factor). Only used if scale_score is True. |
| timeout | float | None | None |
| backend_kwargs | dict | None | None |
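
As a rough illustration of how scale_score, calibration_factor, and top_k interact, the following Python sketch post-processes a list of raw logits. It is a sketch, not the component's code. The helper `postprocess_scores` is hypothetical, and because the table gives no description for score_threshold, its behavior here (dropping documents that score below the threshold) is an assumption.

```python
import math
from typing import List, Optional, Tuple


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def postprocess_scores(
    logits: List[float],
    top_k: int = 10,
    scale_score: bool = True,
    calibration_factor: float = 1.0,
    score_threshold: Optional[float] = None,
) -> List[Tuple[int, float]]:
    """Return (document_index, score) pairs, best first."""
    # With scale_score, each logit becomes sigmoid(logit * calibration_factor).
    scores = [
        sigmoid(logit * calibration_factor) if scale_score else logit
        for logit in logits
    ]
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    # Assumption: documents scoring below the threshold are dropped.
    if score_threshold is not None:
        ranked = [pair for pair in ranked if pair[1] >= score_threshold]
    return ranked[:top_k]


logits = [2.0, -1.0, 0.0]
scaled = postprocess_scores(logits, top_k=2)
raw = postprocess_scores(logits, top_k=2, scale_score=False)
filtered = postprocess_scores(logits, score_threshold=0.6)
print(scaled)
print(raw)
print(filtered)
```

Scaling maps every logit into the range 0 to 1 without changing the order of the documents, so it matters mainly when you compare scores against a fixed threshold. For a positive calibration_factor, a larger value pushes scaled scores further toward 0 or 1.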