AnswerBuilder

Converts a query and a Generator's replies into GeneratedAnswer objects. AnswerBuilder is typically the last component in a query pipeline.

Basic Information

Type: haystack.components.builders.answer_builder.AnswerBuilder
Components it can connect with:
- Generators: AnswerBuilder accepts Generator's replies and converts them into GeneratedAnswer objects.
- Input: AnswerBuilder receives the user query to add it to the GeneratedAnswer.

Inputs

Parameter	Type	Default	Description
query	str		The user query.
replies	Union[List[str], List[ChatMessage]]		The output of the Generator. Can be a list of strings or a list of `ChatMessage` objects.
meta	Optional[List[Dict[str, Any]]]	None	The metadata returned by the Generator. If not specified, the generated answer contains no metadata.
documents	Optional[List[Document]]	None	The documents used as input for the Generator. If specified, they are added to the `GeneratedAnswer` objects. If both `documents` and `reference_pattern` are specified, the documents referenced in the Generator's output are extracted from the input documents and added to the `GeneratedAnswer` objects.
pattern	Optional[str]	None	The regular expression pattern to extract the answer text from the Generator. If not specified, the entire response is used as the answer. The regular expression can have one capture group at most. If present, the capture group text is used as the answer. If no capture group is present, the whole match is used as the answer. Examples: `[^\n]+$` finds "this is an answer" in a string "this is an argument.\nthis is an answer". `Answer: (.*)` finds "this is an answer" in a string "this is an argument. Answer: this is an answer".
reference_pattern	Optional[str]	None	The regular expression pattern used for parsing the document references. If not specified, no parsing is done, and all documents are referenced. References need to be specified as indices of the input documents and start at [1]. Example: `\[(\d+)\]` finds "1" in a string "this is an answer[1]".
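
The capture-group rule for `pattern` can be tried out with Python's standard `re` module. The following is a minimal sketch of the behavior described in the table, not the component's actual code. The helper name `extract_answer` is hypothetical, and returning an empty string when nothing matches is an assumption that the table does not confirm.

```python
import re
from typing import Optional


def extract_answer(reply: str, pattern: Optional[str] = None) -> str:
    """Hypothetical helper that mirrors the documented `pattern` rules."""
    if pattern is None:
        # No pattern: the entire reply is used as the answer.
        return reply
    match = re.search(pattern, reply)
    if match is None:
        # Assumption: no match yields an empty answer.
        return ""
    if match.lastindex:
        # A capture group is present: its text is the answer.
        return match.group(1)
    # No capture group: the whole match is the answer.
    return match.group(0)


# The two examples from the table above.
print(extract_answer("this is an argument.\nthis is an answer", r"[^\n]+$"))
print(extract_answer("this is an argument. Answer: this is an answer", r"Answer: (.*)"))
```

Both calls print `this is an answer`. Testing a pattern this way before adding it to the pipeline YAML helps catch regular expressions that match more or less than intended.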

Outputs

Parameter	Type	Default	Description
answers	List[GeneratedAnswer]		The answers built from the Generator's output. They may include documents.

Overview

Use AnswerBuilder to parse Generator's replies using custom regular expressions. It can also take documents and metadata from the Generator and add them to the GeneratedAnswer objects. AnswerBuilder works with both Generators and Chat Generators.

To include references in answers, use DeepsetAnswerBuilder. For details on which builder to choose, see Enable references for generated answers.

Usage Example

Initializing the Component

components:
  AnswerBuilder:
    type: haystack.components.builders.answer_builder.AnswerBuilder
    init_parameters:

Using the Component in a Pipeline

This is a RAG pipeline with AnswerBuilder. Note that the answers this pipeline generates won't include references.

# If you need help with the YAML format, have a look at https://docs.cloud.deepset.ai/v2.0/docs/create-a-pipeline#create-a-pipeline-using-pipeline-editor.
# This section defines components that you want to use in your pipelines. Each component must have a name and a type. You can also set the component's parameters here.
# The name is up to you, you can give your component a friendly name. You then use components' names when specifying the connections in the pipeline.
# Type is the class path of the component. You can check the type on the component's documentation page.
components:
  retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.open_search_hybrid_retriever.OpenSearchHybridRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: ''
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20 # The number of results to return
      fuzziness: 0
      embedder:
        type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
        init_parameters:
          normalize_embeddings: true
          model: intfloat/e5-base-v2

  ranker:
    type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
    init_parameters:
      model: intfloat/simlm-msmarco-reranker
      top_k: 8

  meta_field_grouping_ranker:
    type: haystack.components.rankers.meta_field_grouping_ranker.MetaFieldGroupingRanker
    init_parameters:
      group_by: file_id
      subgroup_by:
      sort_docs_by: split_id

  prompt_builder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      required_variables: "*"
      template: |-
        You are a technical expert.
        You answer questions truthfully based on provided documents.
        If the answer exists in several documents, summarize them.
        Ignore documents that don't contain the answer to the question.
        Only answer based on the documents provided. Don't make things up.
        If no information related to the question can be found in the document, say so.
        Always use references in the form [NUMBER OF DOCUMENT] when using information from a document, e.g. [3] for Document [3] .
        Never name the documents, only enter a number in square brackets as a reference.
        The reference must only refer to the number that comes in square brackets after the document.
        Otherwise, do not use brackets in your answer and reference ONLY the number of the document without mentioning the word document.

        These are the documents:
        {%- if documents|length > 0 %}
        {%- for document in documents %}
        Document [{{ loop.index }}] :
        Name of Source File: {{ document.meta.file_name }}
        {{ document.content }}
        {% endfor -%}
        {%- else %}
        No relevant documents found.
        Respond with "Sorry, no matching documents were found, please adjust the filters or try a different question."
        {% endif %}

        Question: {{ question }}
        Answer:

  llm:
    type: deepset_cloud_custom_nodes.generators.deepset_amazon_bedrock_generator.DeepsetAmazonBedrockGenerator
    init_parameters:
      model: anthropic.claude-3-5-sonnet-20241022-v2:0
      aws_region_name: us-west-2
      max_length: 650
      temperature: 0

  file_downloader:
    type: deepset_cloud_custom_nodes.augmenters.deepset_file_downloader.DeepsetFileDownloader
    init_parameters:
      file_extensions:
      - .txt
      - .pdf
      - .md
      - .docx
      - .csv
      - .xlsx
      - .html
      - .htm
      - .pptx
      sources_target_type: haystack.dataclasses.ByteStream

  attachments_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate
      weights:
      top_k:
      sort_by_score: true

  multi_file_converter:
    type: haystack.core.super_component.super_component.SuperComponent
    init_parameters:
      input_mapping:
        sources:
        - file_classifier.sources
      is_pipeline_async: false
      output_mapping:
        score_adder.output: documents
      pipeline:
        components:
          file_classifier:
            type: haystack.components.routers.file_type_router.FileTypeRouter
            init_parameters:
              mime_types:
              - text/plain
              - application/pdf
              - text/markdown
              - text/html
              - application/vnd.openxmlformats-officedocument.wordprocessingml.document
              - application/vnd.openxmlformats-officedocument.presentationml.presentation
              - application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
              - text/csv

          text_converter:
            type: haystack.components.converters.txt.TextFileToDocument
            init_parameters:
              encoding: utf-8

          pdf_converter:
            type: haystack.components.converters.pdfminer.PDFMinerToDocument
            init_parameters:
              line_overlap: 0.5
              char_margin: 2
              line_margin: 0.5
              word_margin: 0.1
              boxes_flow: 0.5
              detect_vertical: true
              all_texts: false
              store_full_path: false

          markdown_converter:
            type: haystack.components.converters.txt.TextFileToDocument
            init_parameters:
              encoding: utf-8

          html_converter:
            type: haystack.components.converters.html.HTMLToDocument
            init_parameters:
              # A dictionary of keyword arguments to customize how you want to extract content from your HTML files.
              # For the full list of available arguments, see
              # the [Trafilatura documentation](https://trafilatura.readthedocs.io/en/latest/corefunctions.html#extract).
              extraction_kwargs:
                output_format: markdown # Extract text from HTML. You can also choose "txt"
                target_language:       # You can define a language (using the ISO 639-1 format) to discard documents that don't match that language.
                include_tables: true  # If true, includes tables in the output
                include_links: true  # If true, keeps links along with their targets

          docx_converter:
            type: haystack.components.converters.docx.DOCXToDocument
            init_parameters:
              link_format: markdown

          pptx_converter:
            type: haystack.components.converters.pptx.PPTXToDocument
            init_parameters: {}

          xlsx_converter:
            type: haystack.components.converters.xlsx.XLSXToDocument
            init_parameters: {}

          csv_converter:
            type: haystack.components.converters.csv.CSVToDocument
            init_parameters:
              encoding: utf-8

          splitter:
            type: haystack.components.preprocessors.document_splitter.DocumentSplitter
            init_parameters:
              split_by: word
              split_length: 250
              split_overlap: 30
              respect_sentence_boundary: true
              language: en

          score_adder:
            type: haystack.components.converters.output_adapter.OutputAdapter
            init_parameters:
              template: |
                {%- set scored_documents = [] -%}
                {%- for document in documents -%}
                  {%- set doc_dict = document.to_dict() -%}
                  {%- set _ = doc_dict.update({'score': 100.0}) -%}
                  {%- set scored_doc = document.from_dict(doc_dict) -%}
                  {%- set _ = scored_documents.append(scored_doc) -%}
                {%- endfor -%}
                {{ scored_documents }}
              output_type: List[haystack.Document]
              custom_filters:
              unsafe: true

          text_joiner:
            type: haystack.components.joiners.document_joiner.DocumentJoiner
            init_parameters:
              join_mode: concatenate
              sort_by_score: false

          tabular_joiner:
            type: haystack.components.joiners.document_joiner.DocumentJoiner
            init_parameters:
              join_mode: concatenate
              sort_by_score: false
        connections:
        - sender: file_classifier.text/plain
          receiver: text_converter.sources
        - sender: file_classifier.application/pdf
          receiver: pdf_converter.sources
        - sender: file_classifier.text/markdown
          receiver: markdown_converter.sources
        - sender: file_classifier.text/html
          receiver: html_converter.sources
        - sender: file_classifier.application/vnd.openxmlformats-officedocument.wordprocessingml.document
          receiver: docx_converter.sources
        - sender: file_classifier.application/vnd.openxmlformats-officedocument.presentationml.presentation
          receiver: pptx_converter.sources
        - sender: file_classifier.application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
          receiver: xlsx_converter.sources
        - sender: file_classifier.text/csv
          receiver: csv_converter.sources
        - sender: text_joiner.documents
          receiver: splitter.documents
        - sender: text_converter.documents
          receiver: text_joiner.documents
        - sender: pdf_converter.documents
          receiver: text_joiner.documents
        - sender: markdown_converter.documents
          receiver: text_joiner.documents
        - sender: html_converter.documents
          receiver: text_joiner.documents
        - sender: pptx_converter.documents
          receiver: text_joiner.documents
        - sender: docx_converter.documents
          receiver: text_joiner.documents
        - sender: xlsx_converter.documents
          receiver: tabular_joiner.documents
        - sender: csv_converter.documents
          receiver: tabular_joiner.documents
        - sender: splitter.documents
          receiver: tabular_joiner.documents
        - sender: tabular_joiner.documents
          receiver: score_adder.documents

  AnswerBuilder:
    type: haystack.components.builders.answer_builder.AnswerBuilder
    init_parameters:
      pattern:
      reference_pattern:
      last_message_only: false

connections:  # Defines how the components are connected
- sender: retriever.documents
  receiver: ranker.documents
- sender: ranker.documents
  receiver: meta_field_grouping_ranker.documents
- sender: prompt_builder.prompt
  receiver: llm.prompt
- sender: file_downloader.sources
  receiver: multi_file_converter.sources
- sender: multi_file_converter.documents
  receiver: attachments_joiner.documents

- sender: meta_field_grouping_ranker.documents
  receiver: attachments_joiner.documents
- sender: attachments_joiner.documents
  receiver: prompt_builder.documents
- sender: llm.replies
  receiver: AnswerBuilder.replies

inputs:  # Define the inputs for your pipeline
  query:  # These components will receive the query as input
  - "retriever.query"
  - "ranker.query"
  - "prompt_builder.question"
  - "AnswerBuilder.query"
  filters:  # These components will receive a potential query filter as input
  - "retriever.filters_bm25"
  - "retriever.filters_embedding"

  files:
  - file_downloader.sources

outputs:  # Defines the output of your pipeline
  documents: "attachments_joiner.documents"  # The output of the pipeline is the retrieved documents
  answers: "AnswerBuilder.answers"   # The output of the pipeline is the generated answers

max_runs_per_component: 100

metadata: {}

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

Parameter	Type	Default	Description
pattern	Optional[str]	None	The regular expression pattern to extract the answer text from the Generator. If not specified, the entire response is used as the answer. The regular expression can have one capture group at most. If present, the capture group text is used as the answer. If no capture group is present, the whole match is used as the answer. Examples: `[^\n]+$` finds "this is an answer" in a string "this is an argument.\nthis is an answer". `Answer: (.*)` finds "this is an answer" in a string "this is an argument. Answer: this is an answer".
reference_pattern	Optional[str]	None	The regular expression pattern used for parsing the document references. If not specified, no parsing is done, and all documents are referenced. References need to be specified as indices of the input documents and start at [1]. Example: `\[(\d+)\]` finds "1" in a string "this is an answer[1]".
last_message_only	bool	False	If `False` (default value), all messages are used as the answer. If `True`, only the last message is used as the answer.
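
The effect of `last_message_only` can be pictured as a simple selection over the list of replies. This is a sketch under the description in the table, not the component's implementation. The function name `select_replies` is hypothetical, and plain strings stand in for `ChatMessage` objects.

```python
from typing import List


def select_replies(replies: List[str], last_message_only: bool = False) -> List[str]:
    """Hypothetical helper: choose which replies become answers."""
    if last_message_only and replies:
        # Keep only the final reply, still as a list.
        return replies[-1:]
    # Default: every reply is turned into an answer.
    return replies


print(select_replies(["draft", "final"], last_message_only=True))
print(select_replies(["draft", "final"]))
```

With the flag set to `True`, only the final reply is kept, which is the part that usually matters when a Chat Generator returns several messages.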

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

Parameter	Type	Default	Description
query	str		The user query.
replies	Union[List[str], List[ChatMessage]]		The output of the Generator. Can be a list of strings or a list of `ChatMessage` objects.
meta	Optional[List[Dict[str, Any]]]	None	The metadata returned by the Generator. If not specified, the generated answer will contain no metadata.
documents	Optional[List[Document]]	None	The documents used as the Generator inputs. If specified, they are added to the `GeneratedAnswer` objects. If both `documents` and `reference_pattern` are specified, the documents referenced in the Generator output are extracted from the input documents and added to the `GeneratedAnswer` objects.
pattern	Optional[str]	None	The regular expression pattern to extract the answer text from the Generator. If not specified, the entire response is used as the answer. The regular expression can have one capture group at most. If present, the capture group text is used as the answer. If no capture group is present, the whole match is used as the answer. Examples: `[^\n]+$` finds "this is an answer" in a string "this is an argument.\nthis is an answer". `Answer: (.*)` finds "this is an answer" in a string "this is an argument. Answer: this is an answer".
reference_pattern	Optional[str]	None	The regular expression pattern used for parsing the document references. If not specified, no parsing is done, and all documents are referenced. References need to be specified as indices of the input documents and start at [1]. Example: `\[(\d+)\]` finds "1" in a string "this is an answer[1]".
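
To illustrate how `documents` and `reference_pattern` work together, the sketch below parses 1-based references from a reply and keeps only the documents they point to. It is an approximation built from the descriptions above, not the component's code. The helper `referenced_documents` is hypothetical, plain strings stand in for `Document` objects, and both ignoring out-of-range references and keeping the input order are assumptions.

```python
import re
from typing import List


def referenced_documents(reply: str, documents: List[str], reference_pattern: str) -> List[str]:
    """Hypothetical helper: keep only the documents referenced in the reply."""
    # References start at [1], so shift them to 0-based list positions.
    positions = {int(ref) - 1 for ref in re.findall(reference_pattern, reply)}
    # Assumption: references outside the document list are ignored,
    # and the documents keep their input order.
    return [doc for i, doc in enumerate(documents) if i in positions]


docs = ["doc A", "doc B", "doc C"]
print(referenced_documents("this is an answer[1]", docs, r"\[(\d+)\]"))
print(referenced_documents("see [3] and [1]", docs, r"\[(\d+)\]"))
```

The pattern `\[(\d+)\]` needs its capture group so that `re.findall` returns just the digits rather than the brackets around them.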
