Check the init and runtime parameters you can specify for the different reader types available in deepset Cloud.
YAML Init Parameters
These are the parameters you can specify in pipeline YAML:
FARM Reader Parameters
Parameter | Type | Possible Values | Description |
---|---|---|---|
model_name_or_path | String | Example: deepset/bert-base-cased-squad2 | Specifies the model that the reader should use. Type a path to a locally saved model or the name of a public model from Hugging Face. For a list of available models, see Hugging Face Models. Mandatory. |
model_version | String | Tag name, branch name, or commit hash | Specifies the version of the model from the Hugging Face model hub. Optional. |
context_window_size | Integer | Default: 150 | Specifies the size of the window that defines how many of the surrounding characters are considered as the context of an answer text. Used when displaying the context around the answer. Mandatory. |
batch_size | Integer | Default: 50 | Specifies the number of samples that the model receives in one batch for inference. Memory consumption is much lower in inference mode, so we recommend increasing the batch size until all samples fit into a single batch. Mandatory. |
use_gpu | Boolean | True (default), False | Uses GPU if available. Mandatory. |
devices | A list of strings and torch devices | Default: None | A list of torch devices (for example cuda, cpu, mps) to limit inference to. Supports a list containing torch device objects or strings (for example [torch.device('cuda:0'), "mps", "cuda:1"]). If you set use_gpu=False, the devices parameter is not used, and a single CPU device is used for inference. Optional. |
no_ans_boost | Float | Default: 0.0 | Specifies how much the no_answer logit is increased. If set to 0, it is unchanged. If set to a negative number, there's a lower chance of no_answer being predicted. If set to a positive number, there is an increased chance of no_answer. Mandatory. |
return_no_answer | Boolean | True False (default) | Includes no_answer predictions in the results. Mandatory. |
top_k | Integer | Default: 10 | Specifies the maximum number of answers to return. Mandatory. |
top_k_per_candidate | Integer | Default: 3 | Specifies the number of answers to extract for each candidate document coming from the retriever. This is not the number of final answers that you receive (see top_k). FARM includes no_answer in the sorted list of predictions. Mandatory. |
top_k_per_sample | Integer | Default: 1 | Specifies the number of answers to extract from each small text passage that the model can process at once. You usually want a small value here, as bigger values slow down inference. Mandatory. |
num_processes | Integer | Default: None | Specifies the number of processes for multiprocessing.Pool. When set to 0, disables multiprocessing. When set to None, the inferencer determines the optimum number of processes. To debug the language model, you may need to disable multiprocessing. Optional. |
max_seq_len | Integer | Default: 256 | Specifies the maximum sequence length of one input text for the model. Mandatory. |
doc_stride | Integer | Default: 128 | Specifies the length of the striding window for splitting long texts (used if len(text) > max_seq_len). Mandatory. |
progress_bar | Boolean | True (default), False | Shows a tqdm progress bar. You may want to disable it in production deployments to keep the logs clean. Mandatory. |
duplicate_filtering | Integer | Default: 0 | Specifies how to handle duplicate answers. Answers are filtered based on their position, considering both the start and the end positions. The higher the value, the farther apart two answers can be and still be filtered out as duplicates. 0 filters out only exact duplicates. -1 turns off duplicate removal. Mandatory. |
use_confidence_scores | Boolean | True (default), False | Sets the type of score that is returned with every predicted answer. True: returns a scaled confidence score with a value between 0 and 1. False: returns an unscaled, raw score which is the sum of the start and end logits from the model for the predicted span. Using confidence scores can change the ranking of no_answer compared to using the unscaled raw scores. Mandatory. |
confidence_threshold | Float | Default: None | Filters out predictions below confidence_threshold. The value should be between 0 and 1. Optional. |
proxies | Dictionary | A dictionary of proxy servers. Example: {'http': 'some.proxy:1234'} | Specifies a dictionary of proxy servers to use for downloading external models. Optional. |
local_files_only | Boolean | True False (default) | Forces checking for local files only and forbids downloads. Mandatory. |
force_download | Boolean | True False (default) | Forces a download even if the model exists locally in the cache. Mandatory. |
use_auth_token | A union of string and Boolean | Default: None | Specifies the API token used to download private models from Hugging Face. If set to True, the local token is used. You must create it by running transformers-cli login. For more information, see Hugging Face. Optional. |
max_query_length | Integer | Default: 64 | Specifies the maximum number of tokens the query can have. Mandatory. |
model_kwargs | Dictionary | Default: None | Additional keyword arguments passed to AutoModelForQuestionAnswering.from_pretrained when loading the model specified in model_name_or_path. For details on what kwargs you can pass, see the model's documentation. Optional. |
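For reference, this is how a FARMReader could be declared in pipeline YAML. This is a minimal sketch: the component name Reader and the parameter values are illustrative, and the reader must still be connected to a retriever in the pipelines section of your file.

```yaml
components:
  - name: Reader                 # illustrative component name
    type: FARMReader
    params:
      model_name_or_path: deepset/bert-base-cased-squad2
      top_k: 10                  # maximum number of answers to return
      return_no_answer: true     # include no_answer predictions
      max_seq_len: 256
```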
TableReader Parameters
Parameter | Type | Possible Values | Description |
---|---|---|---|
model_name_or_path | String | google/tapas-base-finetuned-wtq (default), google/tapas-base-finetuned-wikisql-supervised, deepset/tapas-large-nq-hn-reader, deepset/tapas-large-nq-reader | Specifies the model that the reader should use. For a list of available models, see Hugging Face Table Question Answering Models. Mandatory. |
model_version | String | Tag name, branch name, or commit hash | Specifies the version of the model from the Hugging Face model hub. Optional. |
tokenizer | String | | Specifies the name of the tokenizer. Usually the same as the model. Optional. |
use_gpu | Boolean | True (default)False | Uses GPU. Falls back on CPU if GPU is unavailable. Mandatory. |
top_k | Integer | Default: 10 | Specifies the number of answers to return. Mandatory. |
top_k_per_candidate | Integer | Default: 3 | Specifies the number of answers to extract for each candidate table coming from the retriever. Mandatory. |
return_no_answer | Boolean | True False (default) | Includes no_answer predictions in the results. Only applicable to the deepset/tapas-large-nq-hn-reader and deepset/tapas-large-nq-reader models. Mandatory. |
max_seq_len | Integer | Default: 256 | Specifies the maximum sequence length of one input table for the model. If the number of tokens of the query and the table exceeds max_seq_len, the table is truncated by removing rows until the input size fits the model. Mandatory. |
use_auth_token | A union of string and Boolean | Default: None | The API token to use to download private models from Hugging Face. When set to True, uses the token generated when running transformers-cli login (stored in ~/.huggingface). For more information, see Hugging Face. Optional. |
devices | A list of strings and torch devices | Default: None | A list of torch devices to limit inference to. Supports a list containing torch device objects, for example: [torch.device('cuda:0'), "mps", "cuda:1"]. If you set use_gpu=False, the devices parameter is not used and a single CPU device is used for inference. Optional. |
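For example, a TableReader component in pipeline YAML might look like this. Again, a minimal sketch; the component name and parameter values are illustrative.

```yaml
components:
  - name: TableReader            # illustrative component name
    type: TableReader
    params:
      model_name_or_path: google/tapas-base-finetuned-wtq
      top_k: 10                  # number of answers to return
      max_seq_len: 256
```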
TransformersReader Parameters
Parameter | Type | Possible Values | Description |
---|---|---|---|
model_name_or_path | String | Default: distilbert-base-uncased-distilled-squad | Specifies the model that the reader should use. Can be a path to a locally saved model or the name of a public model on Hugging Face. For a list of available models, see Hugging Face Models. Mandatory. |
model_version | String | Tag name, branch name, or commit hash | Specifies the version of the model from the Hugging Face model hub. Optional. |
tokenizer | String | | Specifies the name of the tokenizer (usually the same as the model). Optional. |
context_window_size | Integer | Default: 70 | Specifies the size of the window that defines how many of the surrounding characters are considered as the context of an answer text. Used when displaying the context around the answer. Mandatory. |
use_gpu | Boolean | True (default), False | Uses GPU if available. Mandatory. |
top_k | Integer | Default: 10 | Specifies the maximum number of answers to return. Mandatory. |
top_k_per_candidate | Integer | Default: 3 | Specifies the number of answers to extract for each candidate document coming from the retriever. This is not the number of final answers that you receive (see top_k). no_answer can be included in the sorted list of predictions. Mandatory. |
return_no_answers | Boolean | True False (default) | Includes no_answer predictions in the results. Mandatory. |
max_seq_len | Integer | Default: 256 | Specifies the maximum sequence length of one input text for the model. Mandatory. |
doc_stride | Integer | Default: 128 | Specifies the length of the striding window for splitting long texts (used if len(text) > max_seq_len). Mandatory. |
batch_size | Integer | Default: 16 | Specifies the number of samples that the model receives in one batch for inference. Memory consumption is much lower in inference mode, so we recommend increasing the batch size until all samples fit into a single batch. Mandatory. |
use_auth_token | A union of string and Boolean | Default: None | The API token to use to download private models from Hugging Face. When set to True, uses the token generated when running transformers-cli login (stored in ~/.huggingface). For more information, see Hugging Face. Optional. |
devices | A list of strings and torch devices | Default: None | List of torch devices (for example cuda, cpu, mps) to limit inference to. Supports a list containing torch device objects or strings (for example [torch.device('cuda:0'), "mps", "cuda:1"]). If you set use_gpu=False, the devices parameter is not used, and a single CPU device is used for inference. Optional. |
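As an example, a TransformersReader component in pipeline YAML might look like this. A minimal sketch; the component name and parameter values are illustrative.

```yaml
components:
  - name: Reader                 # illustrative component name
    type: TransformersReader
    params:
      model_name_or_path: distilbert-base-uncased-distilled-squad
      context_window_size: 70    # characters of context around the answer
      batch_size: 16
```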
REST API Runtime Parameters
You can pass the following parameters to all Readers at runtime using the Search API endpoint:
Parameter | Type | Possible Values | Description |
---|---|---|---|
top_k | Integer | Default: 10 | Specifies the maximum number of answers to return. Mandatory. |
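For example, here's how you might override top_k for a reader at query time with the Search API. This is a sketch: the workspace and pipeline names and the API key are placeholders, and the node name Reader in params is an assumption that must match the reader's name in your pipeline YAML.

```bash
# Override the reader's top_k for a single search request
curl -X POST \
  "https://api.cloud.deepset.ai/api/v1/workspaces/<workspace>/pipelines/<pipeline>/search" \
  -H "Authorization: Bearer <api-key>" \
  -H "Content-Type: application/json" \
  -d '{"queries": ["What is a reader?"], "params": {"Reader": {"top_k": 5}}}'
```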