RetrievalScoreAdjuster

Use RetrievalScoreAdjuster in your query pipelines to adjust the document scores assigned by EmbeddingRetriever or Ranker.


RetrievalScoreAdjuster adjusts the scores that EmbeddingRetriever or Ranker assign to documents. It can undo the score scaling these nodes applied and then rescale the scores with the sigmoid function. You can choose which raw score maps to the midpoint (a score of 50%) and how spread out the resulting scores are.
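
The relationship between unscaling and rescaling can be sketched in a few lines of Python. This is a minimal illustration, not the node's actual implementation: the helper names `sigmoid`, `logit`, and `rescale` are hypothetical, and the formula in `rescale` (the sigmoid of the distance from the midpoint multiplied by the spread factor) is an assumption consistent with the description above.

```python
import math


def sigmoid(x: float) -> float:
    """Map any real number to a score between 0 and 1 (numerically stable form)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def logit(p: float) -> float:
    """Inverse of sigmoid: recover a raw value from a score strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def rescale(raw_score: float, midpoint: float = 0.0, spread_factor: float = 1.0) -> float:
    """Assumed scaling: a raw score equal to the midpoint maps to exactly 0.5 (50%)."""
    return sigmoid(spread_factor * (raw_score - midpoint))


# Undo a sigmoid-scaled score, then rescale it around a custom midpoint.
scaled = sigmoid(3.0)  # a score that was scaled earlier in the pipeline
raw = logit(scaled)    # approximately 3.0 again
adjusted = rescale(raw, midpoint=2.0, spread_factor=1.5)
```

Under this assumption, a raw score at the midpoint always maps to 50%, and raising `spread_factor` moves every other score further away from 50%.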

Basic Information

Pipeline type: Used in query pipelines
Nodes that can precede it in a pipeline: Retriever, Ranker
Nodes that can follow it in a pipeline: Retriever, Ranker, Reader, PromptNode
Input: Documents
Output: Documents
Available node classes: RetrievalScoreAdjuster

Usage Example

Here's how you could configure RetrievalScoreAdjuster and use it in a pipeline:

components:
  - name: DocumentStore
    type: DeepsetCloudDocumentStore # The only supported document store in deepset Cloud
    params:
      similarity: cosine
  - name: Retriever # Selects the most relevant documents from the document store
    type: EmbeddingRetriever # Uses one Transformer model to encode the document and the query
    params:
      document_store: DocumentStore
      embedding_model: intfloat/multilingual-e5-base # Model optimized for semantic search
      model_format: sentence_transformers
      top_k: 20 # The number of results to return
  - name: Reranker
    type: SentenceTransformersRanker
    params:
      model_name_or_path: svalabs/cross-electra-ms-marco-german-uncased # German-language re-ranker
      top_k: 10
      scale_score: false
  - name: RetrievalScoreAdjuster
    type: RetrievalScoreAdjuster
    params:
      midpoint: 2
      spread_factor: 1.5
      ...
      
pipelines:
  - name: query
    nodes:
      - name: Retriever
        inputs: [Query]
      - name: Reranker
        inputs: [Retriever]
      - name: RetrievalScoreAdjuster
        inputs: [Reranker]
        ...

Parameters

Here are the parameters you can pass to RetrievalScoreAdjuster in pipeline YAML:

Parameter	Type	Possible Values	Description
`undo_scale_score`	Boolean	`True` `False` Default: `False`	Unscales the scores assigned by EmbeddingRetriever or Ranker using the logit function. Mandatory.
`scale_score`	Boolean	`True` `False` Default: `True`	Rescales the scores using the sigmoid function. Mandatory.
`midpoint`	Float	Default: `0.0`	Specifies the raw score that maps to 50% after scaling. Used if `scale_score=True`. Mandatory.
`spread_factor`	Float	Default: `1.0`	If `scale_score=True`, pushes scores closer or farther away from 50%. A higher spread factor results in scores closer to 0% or 100%. A lower spread factor results in scores closer to 50%. This value must be greater than 0. Mandatory.
`top_spread_factor`	Float	Default: `None`	If `scale_score=True`, pushes scores that are above `midpoint` closer or farther away from 100%. A higher `top_spread_factor` results in scores closer to 100%. A lower `top_spread_factor` results in scores closer to 50%. This value must be greater than 0. `top_spread_factor` takes precedence over `spread_factor` if both are specified. Optional.
`bottom_spread_factor`	Float	Default: `None`	If `scale_score=True`, pushes scores that are below `midpoint` closer or farther away from 0%. A higher `bottom_spread_factor` results in scores closer to 0%. A lower `bottom_spread_factor` results in scores closer to 50%. This value must be greater than 0. `bottom_spread_factor` takes precedence over `spread_factor` if both are specified. Optional.
`threshold`	Float	Default: `None`	Sets a minimum document score. If set, only documents whose score is above this threshold (after all adjustments) are returned. Optional.
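
How the side-specific spread factors and the threshold interact can be sketched in Python. This illustration is based only on the parameter descriptions above and is not the node's actual implementation: `adjust_scores` is a hypothetical helper, and both the sigmoid formula and the handling of a score exactly at `midpoint` are assumptions.

```python
import math
from typing import List, Optional


def _sigmoid(x: float) -> float:
    """Numerically stable sigmoid."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def adjust_scores(
    raw_scores: List[float],
    midpoint: float = 0.0,
    spread_factor: float = 1.0,
    top_spread_factor: Optional[float] = None,
    bottom_spread_factor: Optional[float] = None,
    threshold: Optional[float] = None,
) -> List[float]:
    """Hypothetical helper that mirrors the parameter table above."""
    # The side-specific factors take precedence over spread_factor when set.
    top = spread_factor if top_spread_factor is None else top_spread_factor
    bottom = spread_factor if bottom_spread_factor is None else bottom_spread_factor
    if top <= 0 or bottom <= 0:
        raise ValueError("Spread factors must be greater than 0.")

    adjusted = []
    for raw in raw_scores:
        # Assumption: a score exactly at the midpoint maps to 0.5 with either factor.
        factor = top if raw >= midpoint else bottom
        score = _sigmoid(factor * (raw - midpoint))
        # Keep only scores above the threshold, applied after all adjustments.
        if threshold is None or score > threshold:
            adjusted.append(score)
    return adjusted


# Stretch only the scores above the midpoint and drop everything at or below 50%.
kept = adjust_scores([4.0, 2.0, 0.0], midpoint=2.0, top_spread_factor=3.0, threshold=0.5)
```

In this sketch, only the first document survives the threshold, because a score exactly at 50% is not above it.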
