ExtractiveReader
ExtractiveReader locates and extracts exact answer spans from a collection of documents in response to a query. Unlike implementations that normalize scores per document, it scores each answer span independently, making scores directly comparable across all documents.
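To illustrate why independent scoring makes scores comparable across documents, here is a minimal sketch in plain Python. It is not the component's implementation: the span logits are made-up numbers, and using a sigmoid for independent scoring is an assumption chosen for illustration.

```python
import math


def softmax(values):
    """Normalize a list of logits so that the results sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(value):
    """Map one logit to (0, 1) without looking at any other logit."""
    return 1.0 / (1.0 + math.exp(-value))


# Hypothetical span logits: doc_a holds a strong candidate, doc_b does not.
span_logits = {
    "doc_a": [2.0, 1.0],
    "doc_b": [-3.0, -4.0],
}

# Per-document normalization: the scores inside each document sum to 1.
per_document = {doc: softmax(logits) for doc, logits in span_logits.items()}

# Independent scoring: every span is scored on its own.
independent = {doc: [sigmoid(x) for x in logits] for doc, logits in span_logits.items()}

best_per_document = {doc: max(scores) for doc, scores in per_document.items()}
best_independent = {doc: max(scores) for doc, scores in independent.items()}
```

With per-document normalization, the best span of each document gets the same score even though doc_b has no convincing candidate. With independent scoring, the best span in doc_a scores far higher than the best span in doc_b, so a single ranking across documents is meaningful.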
Key Features
- Performs extractive question answering using Hugging Face transformer models.
- Scores answer spans independently across all documents for direct comparison.
- Returns a configurable number of top answers with confidence scores.
- Filters answers using a score threshold to return only high-confidence results.
- Handles long documents by splitting them into overlapping sequences and deduplicating answers.
- Optionally returns a "no answer" score to indicate when top answers may be incorrect.
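The splitting of long documents into overlapping sequences can be sketched as a sliding window. This is a simplified illustration, not the component's code: `split_into_windows` is a hypothetical helper, and it ignores the query and special tokens that share the sequence budget in a real model input. It follows the parameter descriptions on this page, where stride is the number of tokens that overlap between consecutive sequences.

```python
def split_into_windows(tokens, max_seq_length, stride):
    """Split tokens into windows of at most max_seq_length tokens.

    Consecutive windows share `stride` tokens. Each window is returned
    together with its start offset so that answers can be mapped back.
    """
    if not 0 <= stride < max_seq_length:
        raise ValueError("stride must be non-negative and smaller than max_seq_length")
    step = max_seq_length - stride
    windows = []
    start = 0
    while True:
        end = min(start + max_seq_length, len(tokens))
        windows.append((start, tokens[start:end]))
        if end == len(tokens):
            break
        start += step
    return windows


tokens = [f"t{i}" for i in range(10)]
windows = split_into_windows(tokens, max_seq_length=4, stride=2)
```

Because windows overlap, the same answer span can be found in more than one window, which is why deduplication is needed afterwards.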
Configuration
- Drag the ExtractiveReader component onto the canvas from the Component Library.
- Click the component to open the configuration panel.
- On the General tab:
  - Select the model: enter a Hugging Face model identifier or a local path. The default is deepset/roberta-base-squad2-distilled.
- Go to the Advanced tab to configure the device, API token, top_k, score threshold, maximum sequence length, stride, batch size, answers per sequence, no-answer scoring, calibration factor, overlap threshold, and model keyword arguments.
Connections
ExtractiveReader accepts a query string and a list of documents as inputs. It outputs a list of answers sorted by descending confidence score.
Typically, you connect a document retriever (such as InMemoryBM25Retriever or OpenSearchBM25Retriever) to the documents input and pass the query from your pipeline's input to the query input.
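In Python, the connection described above might look like the following sketch. It assumes the open source Haystack package is installed and uses an in-memory document store for simplicity. The helper names `build_extractive_qa_pipeline` and `ask`, and the component names "retriever" and "reader", are chosen for illustration.

```python
def build_extractive_qa_pipeline(documents):
    """Build a retriever -> reader pipeline over an in-memory document store."""
    # Imported inside the function so the sketch loads without Haystack installed.
    from haystack import Pipeline
    from haystack.components.readers import ExtractiveReader
    from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
    from haystack.document_stores.in_memory import InMemoryDocumentStore

    store = InMemoryDocumentStore()
    store.write_documents(documents)

    pipeline = Pipeline()
    pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=store))
    pipeline.add_component("reader", ExtractiveReader())
    # The retriever's documents output feeds the reader's documents input.
    pipeline.connect("retriever.documents", "reader.documents")
    return pipeline


def ask(pipeline, query):
    """Send the same query to both components and return the reader's answers."""
    result = pipeline.run({"retriever": {"query": query}, "reader": {"query": query}})
    return result["reader"]["answers"]
```

The query is passed twice because the retriever and the reader each have their own query input.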
Usage Example
We're working on adding pipeline examples and the most common component connections.
components:
  ExtractiveReader:
    type: components.readers.extractive.ExtractiveReader
    init_parameters:
Example usage in Python:
from haystack import Document
from haystack.components.readers import ExtractiveReader

# The reader searches both documents for an answer span.
docs = [
    Document(content="Python is a popular programming language"),
    Document(content="python ist eine beliebte Programmiersprache"),
]

reader = ExtractiveReader()
reader.warm_up()  # Load the model before the first run.

question = "What is a popular programming language?"
result = reader.run(query=question, documents=docs)

# Answers are sorted by descending score, so the best one comes first.
assert "Python" in result["answers"][0].data
Parameters
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | Query string. |
| documents | List[Document] | | List of Documents in which you want to search for an answer to the query. |
| top_k | Optional[int] | None | The maximum number of answers to return. An additional answer is returned if no_answer is set to True (default). |
| score_threshold | Optional[float] | None | Returns only answers with the score above this threshold. |
| max_seq_length | Optional[int] | None | Maximum number of tokens. If a sequence exceeds it, the sequence is split. |
| stride | Optional[int] | None | Number of tokens that overlap when sequence is split because it exceeds max_seq_length. |
| max_batch_size | Optional[int] | None | Maximum number of samples that are fed through the model at the same time. |
| answers_per_seq | Optional[int] | None | Number of answer candidates to consider per sequence. This is relevant when a Document was split into multiple sequences because of max_seq_length. |
| no_answer | Optional[bool] | None | Whether to return a no-answer score. |
| overlap_threshold | Optional[float] | None | If set, removes duplicate answers whose overlap is larger than this threshold. For example, of the answers "in the river in Maine" and "the river", one is removed because the second answer overlaps the first by 100% (1.0). The answers "the river in" and "in Maine" have a maximum overlap of only 25%, so both are kept if the threshold is set to 0.25 or higher. If None is provided, all answers are kept. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| answers | List[ExtractedAnswer] | | List of answers sorted by descending answer score. |
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | Union[Path, str] | deepset/roberta-base-squad2-distilled | A Hugging Face transformers question answering model. Can either be a path to a folder containing the model files or an identifier for the Hugging Face hub. |
| device | Optional[ComponentDevice] | None | The device on which the model is loaded. If None, the default device is automatically selected. |
| token | Optional[Secret] | Secret.from_env_var(['HF_API_TOKEN', 'HF_TOKEN'], strict=False) | The API token used to download private models from Hugging Face. |
| top_k | int | 20 | Number of answers to return per query. It is required even if score_threshold is set. An additional answer with no text is returned if no_answer is set to True (default). |
| score_threshold | Optional[float] | None | Returns only answers with the probability score above this threshold. |
| max_seq_length | int | 384 | Maximum number of tokens. If a sequence exceeds it, the sequence is split. |
| stride | int | 128 | Number of tokens that overlap when sequence is split because it exceeds max_seq_length. |
| max_batch_size | Optional[int] | None | Maximum number of samples that are fed through the model at the same time. |
| answers_per_seq | Optional[int] | None | Number of answer candidates to consider per sequence. This is relevant when a Document was split into multiple sequences because of max_seq_length. |
| no_answer | bool | True | Whether to return an additional no-answer entry with empty text and a score representing the probability that the other top_k answers are incorrect. |
| calibration_factor | float | 0.1 | Factor used for calibrating probabilities. |
| overlap_threshold | Optional[float] | 0.01 | If set, removes duplicate answers whose overlap is larger than this threshold. For example, of the answers "in the river in Maine" and "the river", one is removed because the second answer overlaps the first by 100% (1.0). The answers "the river in" and "in Maine" have a maximum overlap of only 25%, so both are kept if the threshold is set to 0.25 or higher. If None is provided, all answers are kept. |
| model_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments passed to AutoModelForQuestionAnswering.from_pretrained when loading the model specified in model. For details on what kwargs you can pass, see the model's documentation. |
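The interplay of top_k, no_answer, and score_threshold can be sketched in plain Python. This is an illustration under stated assumptions, not the component's code: `select_answers` is a hypothetical helper, answers are simplified to (text, score) tuples, and the no-answer score is computed as the product of (1 - score) over the kept answers, which treats the answers as independent.

```python
import math


def select_answers(scored_answers, top_k, score_threshold=None, no_answer=True):
    """Keep the top_k answers, optionally add a no-answer entry, then filter."""
    ranked = sorted(scored_answers, key=lambda item: item[1], reverse=True)[:top_k]
    if no_answer:
        # Probability that every kept answer is wrong (independence assumed).
        no_answer_score = math.prod(1.0 - score for _, score in ranked)
        ranked.append((None, no_answer_score))
        ranked.sort(key=lambda item: item[1], reverse=True)
    if score_threshold is not None:
        # Keep only answers scoring above the threshold.
        ranked = [item for item in ranked if item[1] > score_threshold]
    return ranked


candidates = [("Python", 0.8), ("Java", 0.5), ("C", 0.1)]
result = select_answers(candidates, top_k=2)
filtered = select_answers(candidates, top_k=2, score_threshold=0.3)
```

Here `result` holds the two best answers plus a no-answer entry with a score of about 0.1, while `filtered` drops the no-answer entry because it falls below the threshold.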
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | Query string. |
| documents | List[Document] | | List of Documents in which you want to search for an answer to the query. |
| top_k | Optional[int] | None | The maximum number of answers to return. An additional answer is returned if no_answer is set to True (default). |
| score_threshold | Optional[float] | None | Returns only answers with the score above this threshold. |
| max_seq_length | Optional[int] | None | Maximum number of tokens. If a sequence exceeds it, the sequence is split. |
| stride | Optional[int] | None | Number of tokens that overlap when sequence is split because it exceeds max_seq_length. |
| max_batch_size | Optional[int] | None | Maximum number of samples that are fed through the model at the same time. |
| answers_per_seq | Optional[int] | None | Number of answer candidates to consider per sequence. This is relevant when a Document was split into multiple sequences because of max_seq_length. |
| no_answer | Optional[bool] | None | Whether to return a no-answer score. |
| overlap_threshold | Optional[float] | None | If set, removes duplicate answers whose overlap is larger than this threshold. For example, of the answers "in the river in Maine" and "the river", one is removed because the second answer overlaps the first by 100% (1.0). The answers "the river in" and "in Maine" have a maximum overlap of only 25%, so both are kept if the threshold is set to 0.25 or higher. If None is provided, all answers are kept. |
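The overlap percentages in the overlap_threshold description can be reproduced with a short sketch. The helpers `span_of`, `overlap_fractions`, and `is_duplicate` are hypothetical, and measuring overlap in characters relative to each answer's own length is an assumption that matches the 100% and 25% figures given in the table.

```python
TEXT = "in the river in Maine"


def span_of(phrase):
    """Return the (start, end) character offsets of phrase in TEXT, end exclusive."""
    start = TEXT.index(phrase)
    return start, start + len(phrase)


def overlap_fractions(span_a, span_b):
    """Return the shared length as a fraction of each span's own length."""
    a_start, a_end = span_a
    b_start, b_end = span_b
    shared = max(0, min(a_end, b_end) - max(a_start, b_start))
    return shared / (a_end - a_start), shared / (b_end - b_start)


def is_duplicate(span_a, span_b, overlap_threshold):
    """Two spans are duplicates when either fraction is larger than the threshold."""
    return max(overlap_fractions(span_a, span_b)) > overlap_threshold


nested = overlap_fractions(span_of("in the river in Maine"), span_of("the river"))
partial = overlap_fractions(span_of("the river in"), span_of("in Maine"))
```

The second value of `nested` is 1.0 because "the river" lies entirely inside the longer answer, and the larger value of `partial` is 0.25 because the two answers share only the word "in". With a threshold of 0.5, for example, the first pair counts as duplicates and the second pair does not.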