Customize ReferencePredictor using initialization parameters.
YAML Init Parameters
These are the parameters you can pass to this component in the pipeline YAML configuration:
Parameter | Type | Possible values | Description |
---|---|---|---|
model | String | Default: cross-encoder/ms-marco-MiniLM-L-6-v2 | The name identifier of the model from Hugging Face or the path to a local model folder. Required. |
revision | String | Default: None | The revision of the model to be used. Optional. |
max_seq_len | Integer | Default: 512 | The maximum number of tokens that a sequence should be truncated to before inference. Required. |
language | Language | Default: en | The language of the data that you want to generate references for. Needed to apply the right sentence splitting rules. Required. |
device | ComponentDevice | Default: None | The device on which the model is loaded. If None, the default device is automatically selected. Optional. |
batch_size | Integer | Default: 16 | The batch size that should be used for inference. Required. |
answer_window_size | Integer | Default: 1 | The number of sentences of an answer that should be included in one span for inference. Required. |
answer_stride | Integer | Default: 1 | The stride size for answer window. Required. |
document_window_size | Integer | Default: 3 | The number of sentences of a document that should be included in one span for inference. Required. |
document_stride | Integer | Default: 3 | The stride size for document window. Required. |
token | Secret | Default: Secret.from_env_var("HF_API_TOKEN", strict=False) | The token to use as HTTP bearer authorization for remote files. Optional. |
function_to_apply | String | sigmoid, softmax, none. Default: sigmoid | The activation function to use on top of the logits. Required. |
min_score_2_label_thresholds | Dictionary | Default: None | The minimum prediction score threshold for each corresponding label. Optional. |
label_2_score_map | Dictionary | Default: None | If you use a model with a multi-label prediction head, pass in a dictionary mapping label names to float values. Optional. |
reference_threshold | Integer | Default: None | The minimum score threshold to determine if a prediction should be included as reference or not. Optional. |
default_class | String | Default: not_grounded | A fallback class that should be used if the predicted score doesn't match any threshold. Required. |
verifiability_model | String | Default: tstadel/answer-classification-setfit-v2-binary | The name identifier of the verifiability model to be used on the Hugging Face hub or the path to a local model folder. Optional. |
verifiability_revision | String | Default: None | The revision of the verifiability model to be used. Optional. |
verifiability_batch_size | Integer | Default: 32 | The batch size that should be used for verifiability inference. Required. |
needs_verification_classes | List of strings | Default: ["needs_verification"] | The class names to be used to determine if a sentence needs verification. Required. |
use_split_rules | Boolean | True, False. Default: False | If True, additional rules for better splitting of answers are applied to the sentence splitting tokenizer. Required. |
extend_abbreviations | Boolean | True, False. Default: False | If True, the abbreviations used by NLTK's PunktTokenizer are extended by a list of curated abbreviations, if available. If False, the default abbreviations are used. Required. |
model_kwargs | Dictionary | Default: None | Additional keyword arguments for the model. Optional. |
verifiability_model_kwargs | Dictionary | Default: None | Additional keyword arguments for the verifiability model. Optional. |
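As an illustration of how these parameters might appear in a pipeline YAML, here is a minimal sketch. The component name `reference_predictor` and the `type` import path are assumptions, not confirmed by this page, so check them against your own pipeline template. The parameter names and values come from the table above and are set to their documented defaults.

```yaml
components:
  reference_predictor:  # hypothetical component name, choose your own
    # Assumed import path; verify it against your pipeline template.
    type: deepset_cloud_custom_nodes.augmenters.reference_predictor.ReferencePredictor
    init_parameters:
      model: cross-encoder/ms-marco-MiniLM-L-6-v2
      max_seq_len: 512
      language: en
      batch_size: 16
      # One answer sentence per span, moving forward one sentence at a time.
      answer_window_size: 1
      answer_stride: 1
      # Three document sentences per span; a stride of 3 means spans do not overlap.
      document_window_size: 3
      document_stride: 3
      function_to_apply: sigmoid
      default_class: not_grounded
      verifiability_model: tstadel/answer-classification-setfit-v2-binary
      verifiability_batch_size: 32
      needs_verification_classes:
        - needs_verification
      use_split_rules: false
      extend_abbreviations: false
```

Parameters left out of the sketch, such as `revision`, `device`, and `model_kwargs`, fall back to the defaults listed in the table.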
REST API Runtime Parameters
There are no runtime parameters you can pass to this component when making a request to the Search REST API endpoint.