Reader

The reader is the core component that fetches the right answers to your queries. There are several types of readers that you can use in your search system.

deepset Cloud uses readers that are:

  • Built on the latest transformer-based language models
  • Strong in their grasp of semantics
  • Sensitive to syntactic structure
  • State-of-the-art in question-answering (QA) tasks like SQuAD and Natural Questions

Our readers contain all the components of end-to-end, open-domain QA systems, including:

  • Loading of model weights
  • Tokenization
  • Embedding computation
  • Span prediction
  • Candidate aggregation

If you use a reader in your pipeline, it highlights phrases and sentences as answers to your query.

Usage

Readers are usually combined with retrievers in pipelines. To define a reader:

# Import the reader:
from haystack.nodes import FARMReader

# Specify the model that you want to use with your reader:
model = "deepset/roberta-base-squad2"

# Specify the reader:
reader = FARMReader(model, use_gpu=True)
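
You can also run the reader on its own to try it out. Here's a minimal sketch, assuming the FARMReader defined above; the query and document text are made up for illustration:

# Wrap your text in Document objects:
from haystack.schema import Document

docs = [Document(content="Python is a programming language created by Guido van Rossum.")]

# The reader returns a dictionary with the extracted answers:
result = reader.predict(query="Who created Python?", documents=docs, top_k=3)
for answer in result["answers"]:
    print(answer.answer, answer.score)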

Or in YAML:

components:
  - name: MyReader
    type: FARMReader
    params:
      model: "deepset/roberta-base-squad2"
      use_gpu: True
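
In Python, wiring the reader together with a retriever typically looks like the following sketch. The in-memory document store, the BM25 retriever, and the sample document are assumptions made for illustration; substitute the store and retriever that your system uses:

from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import BM25Retriever, FARMReader
from haystack.pipelines import ExtractiveQAPipeline

# An in-memory store with BM25 enabled, for illustration only:
document_store = InMemoryDocumentStore(use_bm25=True)
document_store.write_documents([{"content": "A reader extracts answer spans from documents."}])

# The retriever narrows down candidate documents; the reader extracts answer spans:
retriever = BM25Retriever(document_store=document_store)
reader = FARMReader("deepset/roberta-base-squad2", use_gpu=True)

pipeline = ExtractiveQAPipeline(reader=reader, retriever=retriever)
prediction = pipeline.run(
    query="What does a reader do?",
    params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 5}},
)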

Models

A reader always takes a model as an argument. deepset Cloud readers handle loading model weights, so to use a pre-trained QA model with your reader, simply provide its Hugging Face model hub name.

There are plenty of models out there, and it can be difficult to select one. Here are some tips to help you get started; a short sketch of switching between models follows the list:

  • RoBERTa (base): An optimized variant of BERT and a great starting point. Can be handled by any machine with a single NVIDIA V100 GPU.
    • PRO: Strong all-round model
    • CON: There are faster and more accurate models
    • HUB NAME: deepset/roberta-base-squad2
  • MiniLM: A cleverly distilled model that sacrifices some accuracy for speed. Recommended if you prioritize speed and GPU memory over accuracy. Outperforms BERT base on SQuAD.
    • PRO: Inference speed up to 50% faster than BERT base
    • CON: Doesn't match the best base-sized models in accuracy.
    • HUB NAME: deepset/minilm-uncased-squad2
  • ALBERT (XXL): A large and powerful SotA model. If you want the best performance and you have the computational resources, this is the model for you.
    • PRO: Better accuracy than any other open-source model in QA.
    • CON: Needs a lot of computational power which makes it impractical in most use cases.
    • HUB NAME: ahotrod/albert_xxlargev1_squad2_512
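
Switching between these models is just a matter of passing a different hub name to the reader. For example, a minimal sketch using the distilled MiniLM model mentioned above:

from haystack.nodes import FARMReader

# Trade a little accuracy for speed by choosing the distilled model:
fast_reader = FARMReader("deepset/minilm-uncased-squad2", use_gpu=True)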

📘 Models for TableReader

The nq-reader models used with TableReader can provide confidence scores but cannot handle questions that need aggregation over multiple cells. The answers are sorted by a general table score first, and then by answer span scores.

If you want to learn more about models, see Language Models.

Reader Types

TableReader

This reader retrieves answers to your questions even if they are buried in a table. It is designed to use the TAPAS model by Google (google/tapas-base-finetuned-wtq). This model can return a single cell as an answer or can pick a set of cells and then aggregate them to get the final answer. It uses the Hugging Face transformers framework.
For a full list of models available for this reader, see Hugging Face Models.
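
As a rough sketch of how this looks in Python, assuming Haystack's TableReader and the TAPAS model named above; the table contents and query are made up for illustration:

import pandas as pd
from haystack.nodes import TableReader
from haystack.schema import Document

table_reader = TableReader(model_name_or_path="google/tapas-base-finetuned-wtq")

# Tables are passed to the reader as pandas DataFrames wrapped in Documents:
table = pd.DataFrame({"Actor": ["Brad Pitt", "Leonardo DiCaprio"], "Age": ["59", "48"]})
documents = [Document(content=table, content_type="table")]

prediction = table_reader.predict(query="How old is Brad Pitt?", documents=documents)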

Usage

These are the arguments that you can specify for TableReader:

  • model_name_or_path (String): Mandatory. Path to a saved model or the name of a public model. Specifies the model that the reader uses. For a list of available models, see Hugging Face Models.
  • model_version (String): Tag name, branch name, or commit hash. Specifies the version of the model from the Hugging Face model hub.
  • tokenizer (String): Specifies the name of the tokenizer. Usually the same as the model.
  • use_gpu (Boolean): Uses GPU. Falls back on CPU if GPU is unavailable.
  • top_k (Integer): Specifies the number of answers to return.
  • top_k_per_candidate (Integer): Specifies the number of answers to extract for each candidate table coming from the retriever.
  • return_no_answer (Boolean): Includes the no_answer prediction in the results. Only applicable with nq-reader models.
  • max_seq_len (Integer): Specifies the maximum sequence length of one input table for the model. If the number of tokens of the query and the table exceeds max_seq_len, the table is truncated by removing rows until the input fits the model.
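
For instance, here's a hedged sketch that sets a few of these arguments; the values are illustrative, not recommendations:

from haystack.nodes import TableReader

table_reader = TableReader(
    model_name_or_path="google/tapas-base-finetuned-wtq",
    use_gpu=True,
    top_k=5,          # return up to five answers
    max_seq_len=256,  # truncate tables that exceed this token budget
)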

FARMReader

A transformer-based reader for extractive QA that uses the FARM framework. You can only use encoder models with FARMReader, such as BERT, ELECTRA, RoBERTa, ALBERT, XLM, DistilBERT, or DeBERTa.

Main Features

  • Removes duplicates
  • Uses the tokenizers from the Hugging Face transformers library
  • Start and end logits are summed and not normalized
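
On the last point: whether you get raw logit sums or normalized scores is controlled by the use_confidence_scores argument described below. A minimal sketch of the two modes:

from haystack.nodes import FARMReader

# Raw scores: the unnormalized sum of the start and end logits, in [-inf, +inf]:
raw_reader = FARMReader("deepset/roberta-base-squad2", use_confidence_scores=False)

# Scaled confidence scores in [0, 1] instead:
confident_reader = FARMReader("deepset/roberta-base-squad2", use_confidence_scores=True)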

Usage

These are the arguments that you can specify for FARMReader:

  • model_name_or_path (String): Mandatory. Path to a saved model or the name of a public model, for example deepset/bert-base-cased-squad2. Specifies the model that the reader uses. For a list of available models, see Hugging Face Models.
  • model_version (String): Tag name, branch name, or commit hash. Specifies the version of the model from the Hugging Face model hub.
  • context_window_size (Integer): A number of characters. Specifies the size of the window that defines how many of the surrounding characters are considered as the context of an answer text. Used when displaying the context around the answer.
  • batch_size (Integer): Specifies the number of samples that the model receives in one batch for inference. Memory consumption is lower in inference mode, so we recommend that you use a single batch.
  • use_gpu (Boolean): Uses GPU if available.
  • no_ans_boost (Float): 0 (default), a negative number, or a positive number. Specifies how much the no_answer logit is increased. If set to 0, it is unchanged. If set to a negative number, there's a lower chance of no_answer being predicted. If set to a positive number, there's a higher chance of no_answer.
  • return_no_answer (Boolean): Includes no_answer predictions in the results.
  • top_k (Integer): Specifies the maximum number of answers to return.
  • top_k_per_candidate (Integer): Specifies the number of answers to extract for each candidate document coming from the retriever. This is not the number of final answers that you receive (see top_k). FARM includes no_answer in the sorted list of predictions.
  • top_k_per_sample (Integer): Specifies the number of answers to extract from each small text passage that the model can process at once. You usually want a small value here, as larger values slow down inference.
  • num_processes (Integer): 0 or None. Specifies the number of processes for multiprocessing.Pool. When set to 0, multiprocessing is disabled. When set to None, the inferencer determines the optimum number of processes. To debug the language model, you may need to disable multiprocessing.
  • max_seq_len (Integer): Specifies the maximum sequence length of one input text for the model.
  • doc_stride (Integer): Specifies the length of the striding window for splitting long texts (used if len(text) > max_seq_len).
  • progress_bar (Boolean): Shows a tqdm progress bar. You may want to disable it in production deployments to keep the logs clean.
  • duplicate_filtering (Integer): Specifies how to handle duplicates. Answers are filtered based on their position; both start and end positions are considered. The higher the value, the farther apart two answers can be and still be filtered out as duplicates. 0 corresponds to exact duplicates. -1 turns off duplicate removal.
  • use_confidence_scores (Boolean): Sets the type of score that is returned with every predicted answer. If set to True, a scaled confidence score between 0 and 1 is returned. If set to False, an unscaled, raw score in [-inf, +inf] is returned; this is the sum of the start and end logits from the model for the predicted span.
  • proxies (Dictionary): Specifies a dictionary of proxy servers to use for downloading external models. Example: {'http': 'some.proxy:1234', 'http://hostname': 'my.proxy:3111'}
  • local_files_only (Boolean): Forces checking for local files only and forbids downloads.
  • force_download (Boolean): Forces a download even if the model exists locally in the cache.
  • use_auth_token (Boolean): Specifies the API token used to download private models from Hugging Face. If set to True, the local token is used. You must create it using transformers-cli login. For more information, see Hugging Face.
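
Putting a few of these together, here's a hedged construction sketch; the values are illustrative choices, not tuning advice:

from haystack.nodes import FARMReader

reader = FARMReader(
    model_name_or_path="deepset/roberta-base-squad2",
    use_gpu=True,
    top_k=5,                # return up to five final answers
    return_no_answer=True,  # allow a no_answer prediction in the results
    max_seq_len=384,
    doc_stride=128,
    duplicate_filtering=0,  # filter exact duplicates only
)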

TransformersReader

An alternative to the FARMReader that uses the Transformers library directly. It has fewer features than the FARMReader, so we only recommend it if you want to bypass FARM. For a comparison of the two readers, see FARM vs Transformers.

Usage

These are the arguments that you can specify for TransformersReader:

  • model_name_or_path (String): Mandatory. Path to a saved model or the name of a public model, for example deepset/bert-base-cased-squad2. Specifies the model that the reader uses. For a list of available models, see Hugging Face Models.
  • model_version (String): Tag name, branch name, or commit hash. Specifies the version of the model from the Hugging Face model hub.
  • tokenizer (String): Specifies the name of the tokenizer. Usually the same as the model.
  • context_window_size (Integer): A number of characters. Specifies the size of the window that defines how many of the surrounding characters are considered as the context of an answer text. Used when displaying the context around the answer.
  • batch_size (Integer): Specifies the number of samples that the model receives in one batch for inference. Memory consumption is lower in inference mode, so we recommend that you use a single batch.
  • use_gpu (Boolean): Uses GPU if available.
  • return_no_answer (Boolean): Includes no_answer predictions in the results.
  • top_k (Integer): Specifies the maximum number of answers to return.
  • top_k_per_candidate (Integer): Specifies the number of answers to extract for each candidate document coming from the retriever. This is not the number of final answers that you receive (see top_k); no_answer can be included in the sorted list of predictions.
  • max_seq_len (Integer): Specifies the maximum sequence length of one input text for the model.
  • doc_stride (Integer): Specifies the length of the striding window for splitting long texts (used if len(text) > max_seq_len).
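
To close with a hedged construction sketch; the model, tokenizer, and top_k values are illustrative:

from haystack.nodes import TransformersReader

reader = TransformersReader(
    model_name_or_path="deepset/bert-base-cased-squad2",
    tokenizer="deepset/bert-base-cased-squad2",
    use_gpu=True,
    top_k=5,  # return up to five answers
)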