ReferencePredictor Parameters

Customize ReferencePredictor using its initialization parameters.

YAML Init Parameters

These are the parameters you can pass to this component in the pipeline YAML configuration:

Parameter

Type

Possible values

Description

model

String

Default: cross-encoder/ms-marco-MiniLM-L-6-v2

The name identifier of the model from Hugging Face or the path to a local model folder.
Required.

revision

String

Default: None

The revision of the model to be used.
Optional.

max_seq_len

Integer

Default: 512

The maximum sequence length in tokens. Longer sequences are truncated to this length before inference.
Required.

language

Language

Default: en

The language of the data that you want to generate references for. Needed to apply the right sentence splitting rules.
Required.

device

ComponentDevice

Default: None

The device on which the model is loaded. If None, the default device is automatically selected.
Optional.

batch_size

Integer

Default: 16

The batch size that should be used for inference.
Required.

answer_window_size

Integer

Default: 1

The number of sentences of an answer that should be included in one span for inference.
Required.

answer_stride

Integer

Default: 1

The stride size for the answer window, that is, how far the window moves forward between consecutive spans.
Required.

document_window_size

Integer

Default: 3

The number of sentences of a document that should be included in one span for inference.
Required.

document_stride

Integer

Default: 3

The stride size for the document window, that is, how far the window moves forward between consecutive spans.
Required.

token

Secret

Default: Secret.from_env_var("HF_API_TOKEN", strict=False)

The token to use as HTTP bearer authorization for remote files.
Optional.

function_to_apply

String

sigmoid, softmax, none
Default: sigmoid

The activation function to use on top of the logits.
Required.

min_score_2_label_thresholds

Dictionary

Default: None

The minimum prediction score threshold for each corresponding label.
Optional.

label_2_score_map

Dictionary

Default: None

If you're using a model with a multi-label prediction head, pass in a dictionary mapping label names to float values.
Optional.

reference_threshold

Integer

Default: None

The minimum score a prediction must reach to be included as a reference.
Optional.

default_class

String

Default: not_grounded

A fallback class that should be used if the predicted score doesn't match any threshold.
Required.

verifiability_model

String

Default: tstadel/answer-classification-setfit-v2-binary

The name identifier of the verifiability model on the Hugging Face Hub or the path to a local model folder.
Optional.

verifiability_revision

String

Default: None

The revision of the verifiability model to be used.
Optional.

verifiability_batch_size

Integer

Default: 32

The batch size that should be used for verifiability inference.
Required.

needs_verification_classes

List of strings

Default: ["needs_verification"]

The class names to be used to determine if a sentence needs verification.
Required.

use_split_rules

Boolean

True, False
Default: False

If True, additional rules for better splitting answers are applied to the sentence splitting tokenizer.
Required.

extend_abbreviations

Boolean

True, False
Default: False

If True, the abbreviations used by NLTK's PunktTokenizer are extended by a list of curated abbreviations, if available. If False, the default abbreviations are used.
Required.

model_kwargs

Dictionary

Default: None

Additional keyword arguments for the model.
Optional.

verifiability_model_kwargs

Dictionary

Default: None

Additional keyword arguments for the verifiability model.
Optional.
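To make the window and stride parameters more concrete, here is a minimal Python sketch of how sentences could be grouped into spans. It is not the component's actual implementation: the helper name sliding_windows, the sample sentences, and the handling of the last, shorter span are assumptions made for illustration. The values used match the defaults listed above (answer_window_size=1, answer_stride=1, document_window_size=3, document_stride=3).

```python
from typing import List


def sliding_windows(sentences: List[str], window_size: int, stride: int) -> List[List[str]]:
    """Group sentences into spans of up to `window_size`, advancing by `stride`.

    Hypothetical helper for illustration only. Keeping a shorter final span
    is an assumption, not documented behavior of ReferencePredictor.
    """
    if window_size < 1 or stride < 1:
        raise ValueError("window_size and stride must be positive integers")
    windows = []
    for start in range(0, len(sentences), stride):
        windows.append(sentences[start:start + window_size])
        # Stop once the current window already reaches the last sentence.
        if start + window_size >= len(sentences):
            break
    return windows


# Made-up sentences standing in for an already split document and answer.
doc_sentences = ["D1.", "D2.", "D3.", "D4.", "D5.", "D6.", "D7."]
answer_sentences = ["A1.", "A2."]

# Default document settings: 3 sentences per span, no overlap.
doc_spans = sliding_windows(doc_sentences, window_size=3, stride=3)
# Default answer settings: one sentence per span.
answer_spans = sliding_windows(answer_sentences, window_size=1, stride=1)

# Every answer span can then be paired with every document span for scoring.
pairs = [(a, d) for a in answer_spans for d in doc_spans]
print(len(doc_spans), len(answer_spans), len(pairs))
```

With a stride smaller than the window size, consecutive spans overlap. This gives the model more context per sentence but increases the number of span pairs to score.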
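The scoring parameters (function_to_apply, min_score_2_label_thresholds, default_class) can be pictured with the following Python sketch. It only illustrates the general idea of turning logits into scores and scores into labels. The helper names apply_activation and pick_label, the label names, the threshold values, and the assumption that the thresholds dictionary maps each label to its minimum score are all hypothetical and not taken from the component's source.

```python
import math
from typing import Dict, List


def apply_activation(logits: List[float], function_to_apply: str = "sigmoid") -> List[float]:
    """Turn raw logits into scores using one of the documented options."""
    if function_to_apply == "sigmoid":
        # Numerically stable sigmoid, applied to each logit independently.
        return [
            1.0 / (1.0 + math.exp(-x)) if x >= 0 else math.exp(x) / (1.0 + math.exp(x))
            for x in logits
        ]
    if function_to_apply == "softmax":
        # Softmax normalizes the logits so that the scores sum to 1.
        largest = max(logits)
        exps = [math.exp(x - largest) for x in logits]
        total = sum(exps)
        return [e / total for e in exps]
    if function_to_apply == "none":
        return list(logits)
    raise ValueError(f"Unsupported function_to_apply: {function_to_apply}")


def pick_label(score: float, thresholds: Dict[str, float], default_class: str = "not_grounded") -> str:
    """Return the label with the highest threshold the score reaches.

    Assumes `thresholds` maps label -> minimum score (an illustrative guess
    at the shape of min_score_2_label_thresholds). Falls back to
    `default_class` when no threshold is met.
    """
    matching = [(minimum, label) for label, minimum in thresholds.items() if score >= minimum]
    if not matching:
        return default_class
    return max(matching)[1]


# Hypothetical labels and thresholds, chosen only for this example.
thresholds = {"grounded": 0.8, "partially_grounded": 0.5}

scores = apply_activation([2.0, 0.0, -2.0], "sigmoid")
labels = [pick_label(score, thresholds) for score in scores]
print(labels)
```

In this sketch, a score that reaches no threshold receives the value of default_class, which mirrors the description of that parameter as a fallback class.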

REST API Runtime Parameters

There are no runtime parameters you can pass to this component when making a request to the Search REST API endpoint.