ReferencePredictor

Use this component in generative question answering pipelines to predict references for the generated answer.

ReferencePredictor adds references to the documents the LLM based its answer on. Pipelines that contain ReferencePredictor return answers with references next to them, so you can open a reference to check that the answer is grounded in the document and that the model didn't hallucinate.

(Image: An answer on the Search page with references next to each sentence.)

Basic Information

  • Pipeline type: Used in generative query pipelines.
  • Nodes that can precede it in a pipeline: PromptNode
  • Nodes that can follow it in a pipeline: PromptNode, but it's typically used as the last node in a pipeline
  • Input: Documents, Answers
  • Output: Answers

Usage Example

When using ReferencePredictor in a pipeline with PromptNode, make sure the prompt doesn't instruct the model to generate references. ReferencePredictor adds them itself.

Use this component in generative QA pipelines to show possible references for the answer. First, configure it in the components section, and then add it to your query pipeline. This pipeline uses the default settings of ReferencePredictor:

components:
  ...
  - name: PromptNode
    type: PromptNode
    params:
      default_prompt_template: deepset/question-answering
      max_length: 400 # The maximum number of tokens the generated answer can have
      model_kwargs: # Specifies additional model settings
        temperature: 0 # Lower temperature works best for fact-based QA
      model_name_or_path: gpt-3.5-turbo
  - name: ReferencePredictor
    type: ReferencePredictor # This component uses the default settings
    
pipelines:
  - name: query
    nodes:
      - name: BM25Retriever
        inputs: [Query]
      - name: EmbeddingRetriever
        inputs: [Query]
      - name: JoinResults
        inputs: [BM25Retriever, EmbeddingRetriever]
      - name: Reranker
        inputs: [JoinResults]
      - name: PromptNode
        inputs: [Reranker]
      - name: ReferencePredictor # ReferencePredictor comes after PromptNode and processes PromptNode's output
        inputs: [PromptNode]    
        ...
# here comes the indexing pipeline
Here's an example of a pipeline where ReferencePredictor uses a custom model and is configured to work with German documents:

components:
  ...
  - name: PromptNode
    type: PromptNode
    params:
      default_prompt_template: deepset/question-answering
      max_length: 400 # The maximum number of tokens the generated answer can have
      model_kwargs: # Specifies additional model settings
        temperature: 0 # Lower temperature works best for fact-based QA
      model_name_or_path: gpt-3.5-turbo
  - name: ReferencePredictor
    type: ReferencePredictor # This component is configured to work with German data
    params:
      model_name_or_path: svalabs/cross-electra-ms-marco-german-uncased
      language: de
      answer_window_size: 2
      answer_stride: 2

pipelines:
  - name: query
    nodes:
      - name: BM25Retriever
        inputs: [Query]
      - name: EmbeddingRetriever
        inputs: [Query]
      - name: JoinResults
        inputs: [BM25Retriever, EmbeddingRetriever]
      - name: Reranker
        inputs: [JoinResults]
      - name: PromptNode
        inputs: [Reranker]
      - name: ReferencePredictor # ReferencePredictor comes after PromptNode and processes PromptNode's output
        inputs: [PromptNode]
        ...
# here comes the indexing pipeline

Arguments

Here are the arguments you can set for ReferencePredictor:

  • model_name_or_path (String, mandatory): The name identifier of a model hosted on the Hugging Face Hub or the path to a locally saved model. Default: cross-encoder/ms-marco-MiniLM-L-6-v2.
  • model_version (String, optional): The version of the model to use. Default: None.
  • max_seq_len (Integer, mandatory): The maximum number of tokens the sequence text can have. The sequence text is the answer and the document span combined. Longer sequences are truncated. Default: 512.
  • language (String, mandatory): The language of the data for which you want to generate references. ReferencePredictor needs it to apply the correct sentence-splitting rules. Default: en.
  • batch_size (Integer, mandatory): The number of answer and document span pairs processed in one batch. Default: 16.
  • answer_window_size (Integer, mandatory): The length, in sentences, of the answer span for which ReferencePredictor generates a reference. With the default of 1, each answer span is one sentence, so a reference is generated for every sentence in the answer. With answer_window_size=2, each answer span contains two sentences and gets one reference. Default: 1.
  • answer_stride (Integer, mandatory): The number of sentences by which the window moves between adjacent answer spans. For example, with answer_window_size=3 (the answer span is three sentences) and answer_stride=1, the first answer span is sentences 1 to 3, the second is sentences 2 to 4, the third 3 to 5, and so on, so adjacent spans overlap by two sentences. Default: 1.
  • document_window_size (Integer, mandatory): The length, in sentences, of the document spans that answer spans are matched against. Setting it to 1 means each document span is one sentence; with the default of 3, each document span contains three sentences. Default: 3.
  • document_stride (Integer, mandatory): The number of sentences by which the window moves between adjacent document spans. For example, with document_window_size=3 (the document span is three sentences) and document_stride=1, the first document span is sentences 1 to 3, the second is sentences 2 to 4, the third 3 to 5, and so on. Default: 3.
  • use_auth_token (String or Boolean, optional): The token needed to access private models on Hugging Face. Use it only if you're using a private model hosted on Hugging Face. Default: None.
  • function_to_apply (String, mandatory): The activation function to apply on top of the logits. Possible values: sigmoid, softmax, none. Default: sigmoid.
  • min_score_2_label_thresholds (Dictionary, optional): The minimum prediction score threshold for each corresponding label. Default: None.
  • label_2_score_map (Dictionary, optional): If you're using a model with a multi-label prediction head, a dictionary mapping each label name to a float score (for example, positive: 0.75). This makes it possible to aggregate and compare scores later on. Default: None.
  • reference_threshold (Float, optional): The minimum score a prediction must reach to be included as a reference for an answer span. If you don't set a threshold, the model picks the reference with the maximum score. Default: None.
  • default_class (String, mandatory): The class to use if the predicted score doesn't match any threshold. Default: not_grounded.
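The window and stride arguments can be pictured as a sliding window over the sentences of the answer (or document). The sketch below is not ReferencePredictor's actual implementation; build_spans is a hypothetical helper that only illustrates how window size and stride interact:

```python
def build_spans(sentences, window_size, stride):
    """Group sentences into overlapping spans of window_size sentences,
    advancing the window start by stride sentences each time.
    Trailing sentences that don't fill a full window are dropped
    in this sketch."""
    last_start = max(len(sentences) - window_size, 0)
    return [
        sentences[start:start + window_size]
        for start in range(0, last_start + 1, stride)
    ]

sentences = ["s1", "s2", "s3", "s4", "s5"]

# answer_window_size=3, answer_stride=1: spans 1-3, 2-4, 3-5,
# so adjacent spans overlap by two sentences
print(build_spans(sentences, window_size=3, stride=1))
# [['s1', 's2', 's3'], ['s2', 's3', 's4'], ['s3', 's4', 's5']]

# Defaults (window 1, stride 1): one span, and one reference,
# per sentence
print(build_spans(sentences, window_size=1, stride=1))
# [['s1'], ['s2'], ['s3'], ['s4'], ['s5']]
```

Setting the stride equal to the window size, as in the German example above (answer_window_size=2, answer_stride=2), produces non-overlapping spans, so each sentence belongs to exactly one answer span.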