DeepsetNvidiaRanker

Rank documents by their relevance to the query using NVIDIA Triton.

Basic Information

Type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
Components it most often connects to:
- Retrievers: DeepsetNvidiaRanker can receive documents from a Retriever and then rank them.
- PromptBuilder: DeepsetNvidiaRanker can send the ranked documents to PromptBuilder, which adds them to the prompt for the LLM.
- Any component that outputs a list of documents or accepts a list of documents as input.

Inputs

Name	Type	Default	Description
`query`	String	-	The query to rank the documents against.
`documents`	List of `Document` objects	-	The documents to be ranked.
`top_k`	Integer	`None`	The maximum number of documents to return.
`scale_score`	Boolean	`None`	Whether to scale the scores. Possible values: `True`: Scales the raw logit predictions using a sigmoid activation function. `False`: Returns the raw logit predictions without scaling.
`calibration_factor`	Float	`None`	The factor to calibrate probabilities with `sigmoid(logits * calibration_factor)`. Used only if `scale_score=True`.
`score_threshold`	Float	`None`	Returns only documents with a score above this threshold.
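
The effect of `scale_score` and `calibration_factor` can be illustrated with a minimal sketch. This is not the component's actual code: the function name `scale_scores` and the sample logits are hypothetical, and the sketch only mirrors the formula `sigmoid(logits * calibration_factor)` given in the table above.

```python
import math


def scale_scores(logits, scale_score=True, calibration_factor=1.0):
    """Illustrative only: mirrors sigmoid(logits * calibration_factor)."""
    if not scale_score:
        # Scaling disabled: the raw logits are returned unchanged.
        return list(logits)
    return [1.0 / (1.0 + math.exp(-x * calibration_factor)) for x in logits]


raw_logits = [2.0, 0.0, -2.0]  # hypothetical model outputs
scaled = scale_scores(raw_logits)
unscaled = scale_scores(raw_logits, scale_score=False)
```

With a positive calibration factor, larger values push the scaled scores toward 0 or 1, and smaller values pull them toward 0.5.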

Outputs

Name	Type	Description
`documents`	List of `Document` objects	The documents sorted by their similarity to the query, from the most similar to the least similar.
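
How sorting, `score_threshold`, and `top_k` could combine to produce this output is sketched below. The `Doc` class and the `select_ranked` function are hypothetical stand-ins, not part of the component. The order of the steps and the strict "above" comparison are assumptions based on the parameter descriptions on this page.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    # Hypothetical stand-in for a Document: content plus a similarity score.
    content: str
    score: float


def select_ranked(docs, top_k=None, score_threshold=None):
    """Sort by score (descending), then apply the optional limits."""
    ranked = sorted(docs, key=lambda d: d.score, reverse=True)
    if score_threshold is not None:
        # Assumption: only scores strictly above the threshold are kept.
        ranked = [d for d in ranked if d.score > score_threshold]
    if top_k is not None:
        # Keep at most top_k documents.
        ranked = ranked[:top_k]
    return ranked


docs = [Doc("a", 0.21), Doc("b", 0.93), Doc("c", 0.55)]
result = select_ranked(docs, top_k=2, score_threshold=0.5)
```

In this sketch, leaving both `top_k` and `score_threshold` as `None` returns all documents in ranked order.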

Overview

DeepsetNvidiaRanker uses the NVIDIA Triton Inference Server to assign a similarity score to each document and rank the documents by their similarity to the query.

This component runs on optimized hardware and works only in deepset AI Platform. It doesn't work if you export the pipeline to a local Python file. If you plan to export, use TransformersSimilarityRanker instead.

Usage Example

This is an example of a query pipeline where DeepsetNvidiaRanker receives documents from a DocumentJoiner, ranks them, and then returns them as the pipeline output.

Here's the YAML configuration:

components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 768
          similarity: cosine
      top_k: 20
  embedding_retriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 768
          similarity: cosine
      top_k: 20
  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate
  DeepsetNvidiaTextEmbedder:
    type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
    init_parameters:
      model: intfloat/multilingual-e5-base
      prefix: ''
      suffix: ''
      truncate: null
      normalize_embeddings: false
      timeout: null
      backend_kwargs: null
  DeepsetNvidiaRanker:
    type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
    init_parameters:
      model: intfloat/simlm-msmarco-reranker
      query_prefix: ''
      document_prefix: ''
      top_k: 10
      score_threshold: null
      meta_fields_to_embed: null
      embedding_separator: "\n"
      scale_score: true
      calibration_factor: 1
      timeout: null
      backend_kwargs: null
connections:
  - sender: bm25_retriever.documents
    receiver: document_joiner.documents
  - sender: embedding_retriever.documents
    receiver: document_joiner.documents
  - sender: DeepsetNvidiaTextEmbedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: document_joiner.documents
    receiver: DeepsetNvidiaRanker.documents
max_runs_per_component: 100
metadata: {}
inputs:
  query:
    - bm25_retriever.query
    - DeepsetNvidiaTextEmbedder.text
    - DeepsetNvidiaRanker.query
  filters:
    - bm25_retriever.filters
    - embedding_retriever.filters
outputs:
  documents: DeepsetNvidiaRanker.documents

Init Parameters

Parameter	Type	Default	Description
`model`	String	`intfloat/simlm-msmarco-reranker`	The model to use for ranking. Currently, only the `intfloat/simlm-msmarco-reranker` model is supported. Optional.
`query_prefix`	String	`""`	String to prepend to queries. Optional.
`document_prefix`	String	`""`	String to prepend to documents. Optional.
`top_k`	Integer	`10`	Maximum number of documents to return. Optional.
`score_threshold`	Float	`None`	Minimum score threshold for returned documents. Optional.
`meta_fields_to_embed`	List of strings	`None`	List of metadata fields to include in the embedding. Optional.
`embedding_separator`	String	`"\n"`	Separator for concatenating metadata fields. Optional.
`scale_score`	Boolean	`True`	Whether to scale the scores using a sigmoid function. Optional.
`calibration_factor`	Float	`1.0`	Factor to calibrate probabilities when scaling scores. Optional.
`timeout`	Float or None	`None`	Timeout in seconds for the Triton server requests. Optional.
`backend_kwargs`	Dictionary	`None`	Additional keyword arguments to pass to the backend. Optional.
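
To make `meta_fields_to_embed`, `embedding_separator`, and `document_prefix` more concrete, here is a minimal sketch of how the text passed to the ranking model might be assembled. The function `build_ranker_text` is hypothetical, and the exact assembly order (metadata values first, then the document content, all joined by the separator, with the prefix in front) is an assumption, not confirmed behavior of DeepsetNvidiaRanker.

```python
def build_ranker_text(content, meta, meta_fields_to_embed=None,
                      embedding_separator="\n", document_prefix=""):
    """Illustrative only: join selected metadata values with the content."""
    fields = meta_fields_to_embed or []
    # Keep only the requested metadata fields that are present and not None.
    parts = [str(meta[f]) for f in fields if meta.get(f) is not None]
    parts.append(content)
    # Assumption: the prefix goes in front of the joined text.
    return document_prefix + embedding_separator.join(parts)


text = build_ranker_text(
    content="Triton serves models over HTTP and gRPC.",
    meta={"title": "Triton overview", "lang": "en"},
    meta_fields_to_embed=["title"],
)
```

Here the `title` value ends up on its own line before the content, while `lang` is ignored because it is not listed in `meta_fields_to_embed`.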
