ReferencePredictor Parameters

Customize ReferencePredictor using init parameters.

YAML Init Parameters

These are the parameters you can pass to this component in the pipeline YAML configuration:

| Parameter | Type | Possible values | Description |
| --- | --- | --- | --- |
| model | String | Default: cross-encoder/ms-marco-MiniLM-L-6-v2 | The name identifier of the model from Hugging Face or the path to a local model folder. Required. |
| revision | String | Default: None | The revision of the model to be used. Optional. |
| max_seq_len | Integer | Default: 512 | The maximum number of tokens that a sequence should be truncated to before inference. Required. |
| language | Language | Default: en | The language of the data that you want to generate references for. Needed to apply the right sentence splitting rules. Required. |
| device | ComponentDevice | Default: None | The device on which the model is loaded. If None, the default device is automatically selected. Optional. |
| batch_size | Integer | Default: 16 | The batch size that should be used for inference. Required. |
| answer_window_size | Integer | Default: 1 | The number of sentences of an answer that should be included in one span for inference. Required. |
| answer_stride | Integer | Default: 1 | The stride size for answer window. Required. |
| document_window_size | Integer | Default: 3 | The number of sentences of a document that should be included in one span for inference. Required. |
| document_stride | Integer | Default: 3 | The stride size for document window. Required. |
| token | Secret | Default: Secret.from_env_var("HF_API_TOKEN", strict=False) | The token to use as HTTP bearer authorization for remote files. Optional. |
| function_to_apply | String | sigmoid, softmax, none. Default: sigmoid | The activation function to use on top of the logits. Required. |
| min_score_2_label_thresholds | Dictionary | Default: None | The minimum prediction score threshold for each corresponding label. Optional. |
| label_2_score_map | Dictionary | Default: None | If using a model with a multi label prediction head, pass in a dictionary mapping label names to a float value. Optional. |
| reference_threshold | Integer | Default: None | The minimum score threshold to determine if a prediction should be included as reference or not. Optional. |
| default_class | String | Default: not_grounded | A fallback class that should be used if the predicted score doesn't match any threshold. Required. |
| verifiability_model | String | Default: tstadel/answer-classification-setfit-v2-binary | The name identifier of the verifiability model to be used on the Hugging Face hub or the path to a local model folder. Optional. |
| verifiability_revision | String | Default: None | The revision of the verifiability model to be used. Optional. |
| verifiability_batch_size | Integer | Default: 32 | The batch size that should be used for verifiability inference. Required. |
| needs_verification_classes | List of strings | Default: ["needs_verification"] | The class names to be used to determine if a sentence needs verification. Required. |
| use_split_rules | Boolean | True, False. Default: False | If True, additional rules for better splitting answers are applied to the sentence splitting tokenizer. Required. |
| extend_abbreviations | Boolean | True, False. Default: False | If True, the abbreviations used by NLTK's PunktTokenizer are extended by a list of curated abbreviations if available. If False, the default abbreviations are used. Required. |
| model_kwargs | Dictionary | Default: None | Additional keyword arguments for the model. Optional. |
| verifiability_model_kwargs | Dictionary | Default: None | Additional keyword arguments for the verifiability model. Optional. |
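
As a rough illustration of how the init parameters listed above might be set in a pipeline YAML, here is a minimal sketch. The component name `reference_predictor`, the `type` import path, and the surrounding `components` / `init_parameters` layout are assumptions that this page does not confirm, so check them against your own pipeline definition. The parameter names and default values are taken from the list above. Only `use_split_rules` and `extend_abbreviations` are changed from their defaults.

```yaml
# Hypothetical pipeline fragment. The component name and the `type` path
# are assumptions; only the parameter names and defaults come from this page.
components:
  reference_predictor:
    type: deepset_cloud_custom_nodes.augmenters.reference_predictor.ReferencePredictor
    init_parameters:
      model: cross-encoder/ms-marco-MiniLM-L-6-v2
      max_seq_len: 512
      language: en
      batch_size: 16
      # Answer spans: one sentence per span, moving one sentence at a time.
      answer_window_size: 1
      answer_stride: 1
      # Document spans: three sentences per span, moving three sentences at a time.
      document_window_size: 3
      document_stride: 3
      function_to_apply: sigmoid
      default_class: not_grounded
      verifiability_model: tstadel/answer-classification-setfit-v2-binary
      verifiability_batch_size: 32
      needs_verification_classes:
        - needs_verification
      # Changed from the default (False) to enable the extra splitting rules.
      use_split_rules: true
      # Changed from the default (False) to use the curated abbreviation list.
      extend_abbreviations: true
```

Parameters left out of the fragment, such as `revision`, `device`, and `token`, would presumably fall back to the defaults listed above.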

REST API Runtime Parameters

There are no runtime parameters you can pass to this component when making a request to the Search REST API endpoint.