Reader
The reader is the core component that fetches the right answers for you. There are several types of readers that you can use in your search system.
deepset Cloud uses readers that are:
- Built on the latest transformer-based language models
- Strong in their grasp of semantics
- Sensitive to syntactic structure
- State-of-the-art on question-answering (QA) benchmarks like SQuAD and Natural Questions
Our readers contain all the components of end-to-end, open-domain QA systems, including:
- Loading of model weights
- Tokenization
- Embedding computation
- Span prediction
- Candidate aggregation
If you use a reader in your pipeline, it highlights phrases and sentences as answers to your query.
Usage
Readers are usually combined with retrievers in pipelines. To define a reader:
# Import the reader:
from haystack.nodes import FARMReader

# Specify the model that you want to use with your reader:
model = "deepset/roberta-base-squad2"

# Specify the reader:
reader = FARMReader(model, use_gpu=True)
Or in YAML:
components:
  - name: MyReader
    type: FARMReader
    params:
      model: "deepset/roberta-base-squad2"
      use_gpu: True
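Once defined, the reader usually sits behind a retriever in a query pipeline. Here is a minimal sketch, assuming a Haystack 1.x environment; the in-memory store, the TfidfRetriever, and the sample document are illustrative placeholders for your own components:

# Set up a small document store with one sample document (illustrative only):
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import TfidfRetriever, FARMReader
from haystack.pipelines import ExtractiveQAPipeline

document_store = InMemoryDocumentStore()
document_store.write_documents([{"content": "Paris is the capital of France."}])

# The retriever narrows down candidate documents; the reader extracts answer spans:
retriever = TfidfRetriever(document_store=document_store)
reader = FARMReader("deepset/roberta-base-squad2", use_gpu=True)

pipeline = ExtractiveQAPipeline(reader=reader, retriever=retriever)
result = pipeline.run(
    query="What is the capital of France?",
    params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 3}},
)
print(result["answers"])

The node names "Retriever" and "Reader" in params are the defaults that ExtractiveQAPipeline assigns to its two nodes.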
Models
A reader always takes a model as an argument. deepset Cloud readers handle loading model weights, so to use a pre-trained QA model with your reader, simply provide its Hugging Face model hub name.
There are plenty of models out there and it can be difficult to select one. Here are some tips to help you get started:
- RoBERTa (base): An optimized variant of BERT and a great starting point. Can be handled by any machine with a single NVIDIA V100 GPU.
- PRO: Strong all-round model
- CON: There are faster and more accurate models
- HUB NAME:
deepset/roberta-base-squad2
- MiniLM: A cleverly distilled model that sacrifices some accuracy for speed. Recommended if you prioritize speed and GPU memory over accuracy. Outperforms BERT base on SQuAD.
- PRO: Inference speed up to 50% faster than BERT base
- CON: Doesn't match the best base-sized models in accuracy.
- HUB NAME:
deepset/minilm-uncased-squad2
- ALBERT (XXL): A large and powerful SotA model. If you want the best performance and you have the computational resources, this is the model for you.
- PRO: Better accuracy than any other open-source model in QA.
- CON: Needs a lot of computational power which makes it impractical in most use cases.
- HUB NAME:
ahotrod/albert_xxlargev1_squad2_512
Models for TableReader
The nq-reader models used with TableReader can provide confidence scores but cannot handle questions that need aggregation over multiple cells. The answers are sorted by a general table score first, and then by answer span scores.
If you want to learn more about models, see Language Models.
Reader Types
TableReader
This reader retrieves answers to your questions even if they are buried in a table. It is designed to use the TAPAS model by Google (google/tapas-base-finetuned-wtq). This model can return a single cell as an answer or pick a set of cells and then aggregate them to get the final answer. It uses the Hugging Face transformers framework.
For a full list of models available for this reader, see Hugging Face Models.
Usage
These are the arguments that you can specify for TableReader:
Argument | Type | Possible Values | Description |
---|---|---|---|
model_name_or_path | String | Path to a saved model or the name of a public model | Mandatory. Specifies the model that the reader should use. For a list of available models, see Hugging Face Models. |
model_version | String | Tag name, branch name, or commit hash | Specifies the version of the model from the Hugging Face model hub. |
tokenizer | String | Name of the tokenizer | Specifies the tokenizer to use. Usually the same as the model. |
use_gpu | Boolean | True/False | Uses GPU. Falls back on CPU if GPU is unavailable. |
top_k | Integer | | Specifies the number of answers to return. |
top_k_per_candidate | Integer | | Specifies the number of answers to extract for each candidate table coming from the retriever. |
return_no_answer | Boolean | True/False | Includes the no_answer prediction in the results. Only applicable with nq-reader models. |
max_seq_len | Integer | | Specifies the maximum sequence length of one input table for the model. If the number of tokens of the query and the table exceeds max_seq_len, the table is truncated by removing rows until the input size fits the model. |
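To illustrate the arguments above, here is a minimal sketch of running TableReader over a small table. It assumes a Haystack 1.x environment with pandas installed; the table contents are made up for the example:

import pandas as pd
from haystack import Document
from haystack.nodes import TableReader

# Load the TAPAS-based reader:
table_reader = TableReader(model_name_or_path="google/tapas-base-finetuned-wtq", top_k=1)

# Tables are passed as pandas DataFrames wrapped in documents with content_type="table":
table = pd.DataFrame(
    {
        "City": ["Berlin", "Paris", "Madrid"],
        "Population": ["3644826", "2161000", "3223334"],
    }
)
document = Document(content=table, content_type="table")

# The reader picks a single cell or aggregates several cells into the final answer:
prediction = table_reader.predict(query="Which city has the largest population?", documents=[document])
print(prediction["answers"])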
FARMReader
A transformer-based reader for extractive QA that uses the FARM framework. You can only use encoder models with FARMReader, such as BERT, ELECTRA, RoBERTa, ALBERT, XLM, DistilBERT, or DeBERTa.
Main Features
- Removes duplicates
- Uses the tokenizers from the Hugging Face transformers library
- Start and end logits are summed and not normalized
Usage
These are the arguments that you can specify for FARMReader:
Argument | Type | Possible Values | Description |
---|---|---|---|
model_name_or_path | String | Path to a saved model or the name of a public model. Example: deepset/bert-base-cased-squad2 | Mandatory. Specifies the model that the reader should use. For a list of available models, see Hugging Face Models. |
model_version | String | Tag name, branch name, or commit hash | Specifies the version of the model from the Hugging Face model hub. |
context_window_size | Integer | The number of characters | Specifies the size of the window that defines how many surrounding characters are considered as the context of an answer. Used when displaying the context around the answer. |
batch_size | Integer | | Specifies the number of samples that the model receives in one batch for inference. Memory consumption is lower in inference mode, so we recommend increasing the batch size until all samples fit into a single batch. |
use_gpu | Boolean | True/False | Uses GPU if available. |
no_ans_boost | Float | 0 (default), a negative number, or a positive number | Specifies how much the no_answer logit is increased. If set to 0, it is unchanged. A negative number lowers the chance of no_answer being predicted; a positive number increases it. |
return_no_answer | Boolean | True/False | Includes no_answer predictions in the results. |
top_k | Integer | | Specifies the maximum number of answers to return. |
top_k_per_candidate | Integer | | Specifies the number of answers to extract for each candidate document coming from the retriever. This is not the number of final answers that you receive (see top_k). FARM includes no_answer in the sorted list of predictions. |
top_k_per_sample | Integer | | Specifies the number of answers to extract from each small text passage that the model can process at once. You usually want a small value here, as larger values slow down inference. |
num_processes | Integer | 0 or None | Specifies the number of processes for multiprocessing.Pool. When set to 0, multiprocessing is disabled. When set to None, the inferencer determines the optimum number of processes. To debug the language model, you may need to disable multiprocessing. |
max_seq_len | Integer | | Specifies the maximum sequence length of one input text for the model. |
doc_stride | Integer | | Specifies the length of the striding window for splitting long texts (used if len(text) > max_seq_len). |
progress_bar | Boolean | True/False | Shows a tqdm progress bar. You may want to disable it in production deployments to keep the logs clean. |
duplicate_filtering | Integer | -1, 0, or a positive number | Specifies how to handle duplicates. Answers are filtered based on their position, considering both the start and end positions. The higher the value, the farther apart two answers can be and still be filtered out as duplicates. 0 corresponds to exact duplicates. -1 turns off duplicate removal. |
use_confidence_scores | Boolean | True/False | Sets the type of score that is returned with every predicted answer. If set to True, a scaled confidence score in the range [0, 1] is returned. If set to False, an unscaled raw score in the range [-inf, +inf] is returned, which is the sum of the start and end logits from the model for the predicted span. |
proxies | Dictionary | A dictionary of proxy servers. Example: {'http': 'some.proxy:1234', 'http://hostname': 'my.proxy:3111'} | Specifies a dictionary of proxy servers to use for downloading external models. |
local_files_only | Boolean | True/False | Forces checking for local files only and forbids downloads. |
force_download | Boolean | True/False | Forces a download even if the model exists locally in the cache. |
use_auth_token | Boolean | True/False | Specifies the API token used to download private models from Hugging Face. If set to True, the local token is used. You must create it using transformers-cli login. For more information, see Hugging Face. |
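To tie several of these arguments together, here is a minimal sketch of configuring FARMReader and running it directly on a couple of documents, assuming a Haystack 1.x environment; the documents and argument values are illustrative, not recommendations:

from haystack import Document
from haystack.nodes import FARMReader

reader = FARMReader(
    model_name_or_path="deepset/roberta-base-squad2",
    use_gpu=True,
    top_k=3,
    return_no_answer=True,
    no_ans_boost=0.0,
    use_confidence_scores=True,
)

documents = [
    Document(content="Berlin is the capital of Germany."),
    Document(content="The Rhine is one of the longest rivers in Germany."),
]

# Run the reader directly on the given documents, without a retriever:
prediction = reader.predict(query="What is the capital of Germany?", documents=documents)
for answer in prediction["answers"]:
    # With use_confidence_scores=True, answer.score is a confidence value between 0 and 1:
    print(answer.answer, answer.score)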
TransformersReader
An alternative to the FARMReader that uses the Transformers library directly. It has fewer features than the FARMReader, so we only recommend it if you want to bypass FARM. For a comparison of the two readers, see FARM vs Transformers.
Usage
These are the arguments that you can specify for TransformersReader:
Argument | Type | Possible Values | Description |
---|---|---|---|
model_name_or_path | String | Path to a saved model or the name of a public model. Example: deepset/bert-base-cased-squad2 | Mandatory. Specifies the model that the reader should use. For a list of available models, see Hugging Face Models. |
model_version | String | Tag name, branch name, or commit hash | Specifies the version of the model from the Hugging Face model hub. |
tokenizer | String | Name of the tokenizer | Specifies the tokenizer to use. Usually the same as the model. |
context_window_size | Integer | The number of characters | Specifies the size of the window that defines how many surrounding characters are considered as the context of an answer. Used when displaying the context around the answer. |
batch_size | Integer | | Specifies the number of samples that the model receives in one batch for inference. Memory consumption is lower in inference mode, so we recommend increasing the batch size until all samples fit into a single batch. |
use_gpu | Boolean | True/False | Uses GPU if available. |
return_no_answer | Boolean | True/False | Includes no_answer predictions in the results. |
top_k | Integer | | Specifies the maximum number of answers to return. |
top_k_per_candidate | Integer | | Specifies the number of answers to extract for each candidate document coming from the retriever. This is not the number of final answers that you receive (see top_k). The no_answer prediction is included in the sorted list of predictions. |
max_seq_len | Integer | | Specifies the maximum sequence length of one input text for the model. |
doc_stride | Integer | | Specifies the length of the striding window for splitting long texts (used if len(text) > max_seq_len). |
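As with the other readers, here is a minimal sketch of defining a TransformersReader, assuming a Haystack 1.x environment; the model name reuses the example from the table above:

from haystack.nodes import TransformersReader

# TransformersReader wraps the Transformers QA pipeline directly, bypassing FARM:
reader = TransformersReader(
    model_name_or_path="deepset/bert-base-cased-squad2",
    tokenizer="deepset/bert-base-cased-squad2",
    use_gpu=True,
    top_k=3,
)

You can drop this reader into the retriever-reader pipeline shown in the Usage section above in place of FARMReader.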