DeepsetNvidiaRanker

Rank documents by their relevance to the query using NVIDIA Triton.

Basic Information

  • Pipeline type: Query
  • Type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
  • Components it most often connects to:
    • Retrievers: DeepsetNvidiaRanker can receive documents from a Retriever and then rank them.
    • PromptBuilder: DeepsetNvidiaRanker can send the ranked documents to PromptBuilder, which adds them to the prompt for the LLM.
    • Any component that outputs a list of documents or accepts a list of documents as input.

Inputs

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| query | String | - | The query used for ranking documents by their similarity to the query. |
| documents | List of Document objects | - | The documents to be ranked. |
| top_k | Integer | None | The maximum number of documents to return. |
| scale_score | Boolean | None | Indicates if the score should be scaled. Possible values: True: Scales the raw logit predictions using a Sigmoid activation function. False: Disables scaling raw logit predictions. |
| calibration_factor | Float | None | The factor to calibrate probabilities with sigmoid(logits * calibration_factor). Used only if scale_score=True. |
| score_threshold | Float | None | Returns only documents with a score above this threshold. |
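
To make the interaction between scale_score, calibration_factor, score_threshold, and top_k concrete, here is a minimal Python sketch of the post-processing these inputs describe. It is an illustration only and not the component's actual implementation: the function name postprocess_scores and the idea of passing raw logits as a plain list are assumptions made for this example.

```python
import math
from typing import List, Optional, Tuple


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def postprocess_scores(
    logits: List[float],
    top_k: Optional[int] = None,
    scale_score: bool = True,
    calibration_factor: float = 1.0,
    score_threshold: Optional[float] = None,
) -> List[Tuple[int, float]]:
    """Hypothetical helper: return (document index, score) pairs, best first."""
    if scale_score:
        # Documented formula: sigmoid(logits * calibration_factor)
        scores = [sigmoid(logit * calibration_factor) for logit in logits]
    else:
        # Raw logits are kept as they are
        scores = list(logits)

    # Sort in descending order, most similar document first
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)

    # Keep only documents with a score above the threshold
    if score_threshold is not None:
        ranked = [(i, s) for i, s in ranked if s > score_threshold]

    # Return at most top_k documents
    if top_k is not None:
        ranked = ranked[:top_k]
    return ranked


if __name__ == "__main__":
    for index, score in postprocess_scores([2.0, -1.0, 0.0], score_threshold=0.4):
        print(index, round(score, 4))
```

Because the list is already sorted when the threshold and top_k are applied, the order of those two steps does not change the result in this sketch. Note that a threshold that makes sense for scaled scores (between 0 and 1) usually needs a different value when scale_score is False and raw logits are compared instead.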

Outputs

| Name | Type | Description |
| --- | --- | --- |
| documents | List of Document objects | The ranked documents sorted in descending order from the most similar to the query to the least similar. |

Overview

DeepsetNvidiaRanker uses the NVIDIA Triton Inference Server to rank documents by their similarity to the query, assigning a similarity score to each document.

This component runs on optimized hardware and works only within deepset Cloud, so it stops working if you export the pipeline to a local Python file. If you plan to export, use TransformersSimilarityRanker instead.

Usage Example

This is an example of a query pipeline where DeepsetNvidiaRanker receives documents from a DocumentJoiner, ranks them, and then returns them as the pipeline output.

Here's the YAML configuration:

components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 768
          similarity: cosine
      top_k: 20
  embedding_retriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 768
          similarity: cosine
      top_k: 20
  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate
  DeepsetNvidiaTextEmbedder:
    type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
    init_parameters:
      model: intfloat/multilingual-e5-base
      prefix: ''
      suffix: ''
      truncate: null
      normalize_embeddings: false
      timeout: null
      backend_kwargs: null
  DeepsetNvidiaRanker:
    type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
    init_parameters:
      model: intfloat/simlm-msmarco-reranker
      query_prefix: ''
      document_prefix: ''
      top_k: 10
      score_threshold: null
      meta_fields_to_embed: null
      embedding_separator: "\n"
      scale_score: true
      calibration_factor: 1
      timeout: null
      backend_kwargs: null
connections:
  - sender: bm25_retriever.documents
    receiver: document_joiner.documents
  - sender: embedding_retriever.documents
    receiver: document_joiner.documents
  - sender: DeepsetNvidiaTextEmbedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: document_joiner.documents
    receiver: DeepsetNvidiaRanker.documents
max_runs_per_component: 100
metadata: {}
inputs:
  query:
    - bm25_retriever.query
    - DeepsetNvidiaTextEmbedder.text
    - DeepsetNvidiaRanker.query
  filters:
    - bm25_retriever.filters
    - embedding_retriever.filters
outputs:
  documents: DeepsetNvidiaRanker.documents

Init Parameters

| Parameter | Type | Possible values | Description |
| --- | --- | --- | --- |
| model | String | Default: intfloat/simlm-msmarco-reranker | The model to use for ranking. Currently only the intfloat/simlm-msmarco-reranker model is supported. Required. |
| query_prefix | String | Default: "" | String to prepend to queries. Required. |
| document_prefix | String | Default: "" | String to prepend to documents. Required. |
| top_k | Integer | Default: 10 | Maximum number of documents to return. Required. |
| score_threshold | Float | Default: None | Minimum score threshold for returned documents. Required. |
| meta_fields_to_embed | List of strings | Default: None | List of metadata fields to include in embedding. Required. |
| embedding_separator | String | Default: "\n" | Separator for concatenating metadata fields. Required. |
| scale_score | Boolean | Default: True | Whether to scale the scores using a sigmoid function. Required. |
| calibration_factor | Float | Default: 1.0 | Factor to calibrate probabilities when scaling scores. Required. |
| timeout | Float or None | Default: None | Timeout in seconds for the Triton server requests. Required. |
| backend_kwargs | Dictionary | Default: None | Additional keyword arguments to pass to the backend. Required. |
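
The prefix, metadata, and separator parameters all affect the text that is sent to the ranking model. The following Python sketch shows one plausible way these parameters could combine into a query and document pair. It is an assumption for illustration, not confirmed behavior of DeepsetNvidiaRanker: the helper name build_ranker_inputs is hypothetical, and the choice to place metadata values before the document content is assumed.

```python
from typing import Any, Dict, List, Optional, Tuple


def build_ranker_inputs(
    query: str,
    content: str,
    meta: Dict[str, Any],
    query_prefix: str = "",
    document_prefix: str = "",
    meta_fields_to_embed: Optional[List[str]] = None,
    embedding_separator: str = "\n",
) -> Tuple[str, str]:
    """Hypothetical helper: build the (query, document) text pair for the model."""
    fields = meta_fields_to_embed or []

    # Collect the selected metadata values, skipping missing or empty fields
    parts = [str(meta[field]) for field in fields if meta.get(field) is not None]

    # Assumption: metadata values come first, followed by the document content
    parts.append(content)
    document_text = embedding_separator.join(parts)

    return query_prefix + query, document_prefix + document_text


if __name__ == "__main__":
    pair = build_ranker_inputs(
        query="What is Triton?",
        content="Triton is an inference server.",
        meta={"title": "NVIDIA Triton"},
        meta_fields_to_embed=["title"],
    )
    print(pair)
```

With the defaults (empty prefixes and no metadata fields), the pair is simply the unchanged query and document content, which is why these parameters can be left at their defaults unless the model expects a specific input format.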