DeepsetNvidiaNIMRanker
Ranks documents by their relevance to the query using NVIDIA NIM models for inference.
Basic Information
- Type: deepset_cloud_custom_nodes.rankers.nvidia.nim_ranker.DeepsetNvidiaNIMRanker
- Components it most often connects to:
  - Retrievers: DeepsetNvidiaNIMRanker can receive documents from a Retriever and then rank them.
  - PromptBuilder: DeepsetNvidiaNIMRanker can send the ranked documents to PromptBuilder, which adds them to the prompt for the LLM.
  - Any component that outputs a list of documents or accepts a list of documents as input.
Inputs
Parameter | Type | Default | Description |
---|---|---|---|
query | str | Required | The input query to compare the documents to. |
documents | List[Document] | Required | A list of documents to be ranked. |
top_k | int \| None | None | The maximum number of documents to return. |
scale_score | bool \| None | None | If True, scales the raw logit predictions using a sigmoid activation function. If False, disables scaling of the raw logit predictions. |
calibration_factor | float \| None | None | Use this factor to calibrate probabilities with sigmoid(logits * calibration_factor). Used only if scale_score is True. |
score_threshold | float \| None | None | The minimum score for documents to be included in the result. |
Outputs
Parameter | Type | Description |
---|---|---|
documents | List[Document] | Documents closest to the query, sorted from most similar to least similar. |
Overview
DeepsetNvidiaNIMRanker uses NVIDIA NIM models to rank documents by their similarity to the query. It expects a model that takes query and document text as input and returns similarity scores.
This component runs on models provided by deepset on hardware optimized for performance. Unlike models hosted on platforms like Hugging Face, these models are not downloaded at query time. Instead, you choose a model upfront on the component card.
The optimized models are only available on deepset Cloud. To run this component on your own hardware, use TransformersSimilarityRanker instead.
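The scoring parameters (scale_score, calibration_factor, score_threshold, and top_k) interact as described in the tables below. The following sketch mimics that documented post-processing in plain Python; it illustrates the arithmetic only and does not call the actual NIM backend:

```python
import math

def postprocess(scored_docs, top_k=10, scale_score=True,
                calibration_factor=1.0, score_threshold=None):
    """Mimic the ranker's documented post-processing of raw logits.

    scored_docs: list of (document_text, raw_logit) pairs as a
    cross-encoder model would return them. Illustrative sketch only.
    """
    results = []
    for text, logit in scored_docs:
        if scale_score:
            # sigmoid(logits * calibration_factor), as documented
            score = 1.0 / (1.0 + math.exp(-logit * calibration_factor))
        else:
            score = logit
        results.append((text, score))
    # Sort from most similar to least similar
    results.sort(key=lambda pair: pair[1], reverse=True)
    if score_threshold is not None:
        results = [r for r in results if r[1] >= score_threshold]
    return results[:top_k]

ranked = postprocess([("doc a", 2.0), ("doc b", -1.0), ("doc c", 0.5)],
                     top_k=2, score_threshold=0.5)
# "doc b" is dropped by the threshold; "doc a" ranks first
```

Disabling scale_score returns the raw logits unchanged, which is useful when a downstream component expects unnormalized scores.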
Usage Example
Initializing the Component
components:
DeepsetNvidiaNIMRanker:
    type: deepset_cloud_custom_nodes.rankers.nvidia.nim_ranker.DeepsetNvidiaNIMRanker
init_parameters:
Using the Component in a Pipeline
In this example, the Ranker receives joined documents from a keyword and a semantic retriever and returns the ranked documents as pipeline output:
components:
bm25_retriever:
type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
init_parameters:
document_store:
type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
init_parameters:
use_ssl: true
verify_certs: false
hosts:
- ${OPENSEARCH_HOST}
http_auth:
- ${OPENSEARCH_USER}
- ${OPENSEARCH_PASSWORD}
embedding_dim: 768
similarity: cosine
index: ''
max_chunk_bytes: 104857600
return_embedding: false
method:
mappings:
settings:
create_index: true
timeout:
top_k: 20
embedding_retriever:
type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
init_parameters:
document_store:
type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
init_parameters:
use_ssl: true
verify_certs: false
hosts:
- ${OPENSEARCH_HOST}
http_auth:
- ${OPENSEARCH_USER}
- ${OPENSEARCH_PASSWORD}
embedding_dim: 768
similarity: cosine
index: ''
max_chunk_bytes: 104857600
return_embedding: false
method:
mappings:
settings:
create_index: true
timeout:
top_k: 20
document_joiner:
type: haystack.components.joiners.document_joiner.DocumentJoiner
init_parameters:
join_mode: concatenate
DeepsetNvidiaTextEmbedder:
type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
init_parameters:
model: intfloat/multilingual-e5-base
prefix: ''
suffix: ''
truncate:
normalize_embeddings: false
timeout:
backend_kwargs:
DeepsetNvidiaNIMRanker:
type: deepset_cloud_custom_nodes.rankers.nvidia.nim_ranker.DeepsetNvidiaNIMRanker
init_parameters:
model: nvidia/llama-3.2-nv-rerankqa-1b-v2
query_prefix: ''
document_prefix: ''
top_k: 10
batch_size: 40
score_threshold:
meta_fields_to_embed:
embedding_separator: \n
scale_score: true
calibration_factor: 1
timeout:
truncate:
backend_kwargs:
connections:
- sender: bm25_retriever.documents
receiver: document_joiner.documents
- sender: embedding_retriever.documents
receiver: document_joiner.documents
- sender: DeepsetNvidiaTextEmbedder.embedding
receiver: embedding_retriever.query_embedding
- sender: document_joiner.documents
receiver: DeepsetNvidiaNIMRanker.documents
max_runs_per_component: 100
metadata: {}
inputs:
query:
- bm25_retriever.query
- DeepsetNvidiaTextEmbedder.text
- DeepsetNvidiaNIMRanker.query
filters:
- bm25_retriever.filters
- embedding_retriever.filters
outputs:
documents: DeepsetNvidiaNIMRanker.documents
Parameters
Init Parameters
These are the parameters you can configure in Pipeline Builder:
Parameter | Type | Default | Description |
---|---|---|---|
model | DeepsetNvidiaNIMRankingModels | DeepsetNvidiaNIMRankingModels.NVIDIA_LLAMA_3_2_NV_RERANKQA_1B_V2 | The model to use for ranking. |
query_prefix | str | "" | String to prepend to queries. |
document_prefix | str | "" | String to prepend to documents. |
top_k | int | 10 | Maximum number of documents to return. |
batch_size | int | 40 | The number of documents to rank at once. |
score_threshold | float \| None | None | Minimum score threshold for returned documents to be included in the results. |
meta_fields_to_embed | List[str] \| None | None | List of metadata fields to include in the embedding. |
embedding_separator | str | \n | Separator for concatenating metadata fields. |
scale_score | bool | True | Whether to scale the scores using a sigmoid function. |
calibration_factor | float \| None | 1.0 | Factor to calibrate probabilities when scaling scores. |
timeout | float \| None | None | Timeout in seconds for the Triton server requests. |
truncate | Optional[EmbeddingTruncateMode] | None | Specifies how to truncate inputs longer than the maximum token length. Possible options are: START, END, NONE. If set to START, the input is truncated from the start. If set to END, the input is truncated from the end. If set to NONE, an error is returned if the input is too long. |
backend_kwargs | Dict[str, Any] \| None | None | Additional keyword arguments to pass to the backend. |
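meta_fields_to_embed and embedding_separator control the text the model actually sees for each document: selected metadata fields are concatenated with the document content before ranking. A minimal sketch of that concatenation, assuming it mirrors the behavior of Haystack's similarity rankers (the function name is illustrative, not part of the component's API):

```python
def text_to_rank(content, meta, meta_fields_to_embed=None,
                 embedding_separator="\n", document_prefix=""):
    """Build the per-document input string from its content plus the
    selected metadata fields, joined by the separator. Sketch only."""
    meta_fields_to_embed = meta_fields_to_embed or []
    parts = [str(meta[f]) for f in meta_fields_to_embed if meta.get(f)]
    parts.append(content)
    return document_prefix + embedding_separator.join(parts)

text = text_to_rank("NIM provides optimized inference.",
                    {"title": "NVIDIA NIM"},
                    meta_fields_to_embed=["title"])
# "NVIDIA NIM\nNIM provides optimized inference."
```

Including a title or section name this way often helps the model judge relevance when document bodies are short or ambiguous.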
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
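For example, you can override top_k and score_threshold for a single request by passing component parameters in the search request body. The endpoint path and payload shape below are assumptions based on deepset's search API; check the API reference for your workspace before relying on them:

```python
import json

# Hypothetical workspace and pipeline names
workspace = "my-workspace"
pipeline = "my-ranking-pipeline"
url = (f"https://api.cloud.deepset.ai/api/v1/workspaces/"
       f"{workspace}/pipelines/{pipeline}/search")

payload = {
    "queries": ["What is NVIDIA NIM?"],
    # Keys under "params" are component names from the pipeline YAML
    "params": {
        "DeepsetNvidiaNIMRanker": {
            "top_k": 5,
            "score_threshold": 0.2,
        }
    },
}

# Send with any HTTP client, e.g.:
# requests.post(url, json=payload,
#               headers={"Authorization": f"Bearer {API_KEY}"})
print(json.dumps(payload))
```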
Parameter | Type | Default | Description |
---|---|---|---|
query | str | Required | The input query to compare the documents to. |
documents | List[Document] | Required | A list of documents to be ranked. |
top_k | int \| None | None | The maximum number of documents to return. |
scale_score | bool \| None | None | If True, scales the raw logit predictions using a sigmoid activation function. If False, disables scaling of the raw logit predictions. |
calibration_factor | float \| None | None | Use this factor to calibrate probabilities with sigmoid(logits * calibration_factor). Used only if scale_score is True. |
score_threshold | float \| None | None | Use it to return only documents with a score above this threshold. |