DeepsetAnswerBuilder

Use DeepsetAnswerBuilder after a Generator that is instructed to include references in its replies. DeepsetAnswerBuilder converts these replies into a format that deepset Cloud can display.

Basic Information

  • Pipeline type: Query
  • Type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
  • Components it can connect with:
    • Rankers: It can receive documents from Rankers and add them to the generated answers.
    • Generators: It receives replies from a generator and transforms them into GeneratedAnswer objects.

Inputs

Required Inputs

Name | Type | Description
query | String | The query string. If DeepsetAnswerBuilder doesn't receive the query from a component it's connected to, you must list it in the inputs section of the pipeline YAML under query. You can see an example in the Usage Example section below.
replies | List of strings | A list of replies from a Generator.

Optional Inputs

Name | Type | Default | Description
meta | List of dictionaries of string and any | None | A list of metadata the Generator returns. If not received, the generated answer contains no metadata.
documents | List of Document objects | None | A list of documents to attach to the answers, for example the documents a Ranker outputs. If received, they're added to the GeneratedAnswer objects.
pattern | String | None | The regular expression used to extract the answer text from the Generator's output. If not specified, the whole string is used as the answer. The regular expression can have at most one capture group. If a capture group is present, the text it matches is used as the answer. If no capture group is present, the whole match is used as the answer. Examples: [^\n]+$ finds "this is an answer" in the string "this is an argument.\nthis is an answer". Answer: (.*) finds "this is an answer" in the string "this is an argument. Answer: this is an answer".
reference_pattern | String | None | The regular expression used to parse document references. References are assumed to be indices of the input documents, starting at 1. Example: \[(\d+)\] finds "1" in the string "this is an answer[1]". You can use the following abbreviation: acm, which stands for \[(?:(\d+),?\s*)+\] and finds "1" and "2" in the string "this is an answer[1, 2]". If not specified, no parsing is done, and all documents are referenced.
prompt | String | None | The prompt the Generator uses. If specified, it's added to the metadata of the GeneratedAnswer objects.
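
The capture-group rule for pattern can be tried out with Python's standard re module. The following is a minimal sketch, not the component's actual code: extract_answer is a hypothetical helper, and returning an empty string when nothing matches is an assumption of this sketch.

```python
import re
from typing import Optional


def extract_answer(reply: str, pattern: Optional[str] = None) -> str:
    """Hypothetical helper that mimics the documented `pattern` behavior."""
    if pattern is None:
        # No pattern: the whole reply is used as the answer.
        return reply
    match = re.search(pattern, reply)
    if match is None:
        # Assumption of this sketch: no match yields an empty answer.
        return ""
    # With a capture group, use its text; otherwise use the whole match.
    return match.group(1) if match.re.groups else match.group(0)


# The two examples from the table above.
print(extract_answer("this is an argument.\nthis is an answer", r"[^\n]+$"))
print(extract_answer("this is an argument. Answer: this is an answer", r"Answer: (.*)"))
```

Both calls print "this is an answer": the first pattern has no capture group, so the whole match is returned, while the second returns only the text inside the group.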

Outputs

Name | Type | Description
answers | List of GeneratedAnswer objects | Answers obtained from the output of the Generator.

Overview

DeepsetAnswerBuilder takes a query and the replies from a Generator as input and turns them into GeneratedAnswer objects. Optionally, you can configure it to enhance the generated answer with documents and metadata from the Generator.

DeepsetAnswerBuilder is used in RAG pipelines to enhance generated responses with references. You place it after a Generator that is instructed to produce references. DeepsetAnswerBuilder then takes the replies from that Generator as input and adds the references to the answer's _references metadata field so that they can be displayed in deepset Cloud's user interface.
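
To illustrate how 1-based reference markers in a reply relate to the input documents, here is a minimal sketch in plain Python. It makes no claim about the component's internals or about the structure of the _references field: referenced_documents is a hypothetical helper, documents are stood in for by plain strings, and skipping out-of-range indices is an assumption of this sketch.

```python
import re
from typing import List


def referenced_documents(answer: str, documents: List[str], reference_pattern: str) -> List[str]:
    """Hypothetical helper: map 1-based reference markers to documents."""
    referenced: List[str] = []
    for match in re.finditer(reference_pattern, answer):
        index = int(match.group(1)) - 1  # references start at 1
        # Assumption of this sketch: ignore indices that point outside the
        # document list, and list each document only once.
        if 0 <= index < len(documents) and documents[index] not in referenced:
            referenced.append(documents[index])
    return referenced


docs = ["doc about Berlin", "doc about Paris", "doc about Rome"]
print(referenced_documents("Paris is in France[2].", docs, r"\[(\d+)\]"))
```

Here the marker [2] selects the second document in the list, because indices in the reply start at 1 while Python lists start at 0.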

The difference between ReferencePredictor and DeepsetAnswerBuilder is that ReferencePredictor uses a dedicated model that filters for document IDs to create references, while DeepsetAnswerBuilder works with an LLM (through a Generator) that is instructed to create the references.

Usage Example

In this example, DeepsetAnswerBuilder receives documents from the Ranker so that it can attach them to the generated answers. It also receives replies from the Generator and converts them into GeneratedAnswer objects with references that the deepset Cloud interface can display.

query_yaml: |
  components:
  ...
    ranker:
      type: haystack.components.rankers.transformers_similarity.TransformersSimilarityRanker
      init_parameters:
        model: "svalabs/cross-electra-ms-marco-german-uncased"
        top_k: 8
        device: null
        model_kwargs:
          torch_dtype: "torch.float16"
          
    generator:
      type: haystack.components.generators.openai.OpenAIGenerator
      init_parameters:
        api_key: {"type": "env_var", "env_vars": ["OPENAI_API_KEY"], "strict": False}
        model: "gpt-4-turbo-preview"
        generation_kwargs:
          max_tokens: 400
          temperature: 0.0
          seed: 0
          
    answer_builder:
      type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
      init_parameters: 
        reference_pattern: acm
      ...
      
  connections:  # Defines how the components are connected
  ...
  - sender: ranker.documents
    receiver: answer_builder.documents # DeepsetAnswerBuilder receives documents from ranker
  - sender: generator.replies
    receiver: answer_builder.replies # DeepsetAnswerBuilder receives replies from the generator
    ...
    
  inputs:
    query:
    ...
    - "ranker.query"
    - "answer_builder.query" # We're listing DeepsetAnswerBuilder here because it needs "query" as input and it's not
                             # getting it from any other component it's connected to. This means DeepsetAnswerBuilder
                             # will receive "query" as input from the pipeline.
    ...

  outputs:
    answers: "answer_builder.answers" # This means we want DeepsetAnswerBuilder's answers to be the output of the pipeline
  

Init Parameters

Parameter | Type | Possible Values | Description
pattern | String | Default: None | The regular expression you want to use to extract the answer text from the Generator's output. If not specified, the whole string is used as the answer. The regular expression can have at most one capturing group. If a capturing group is defined, the text that matches it is used as the answer. If there's no capturing group, the whole match is used as the answer. For example: [^\n]+$ finds "this is an answer" in the string "this is an argument.\nthis is an answer". Answer: (.*) finds "this is an answer" in the string "this is an argument. Answer: this is an answer". Optional.
reference_pattern | String | Default: None | The regular expression you want to use to parse document references. It assumes references are specified as indices of the documents and that indices start at 1. For example: \[(\d+)\] finds "1" in the string "this is an answer[1]". You can use the following abbreviation: acm, which stands for \[(?:(\d+),?\s*)+\] and finds "1" and "2" in the string "this is an answer[1, 2]". If not specified, no parsing is done, and all documents are referenced. Optional.
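
If you want to check which indices an acm-style marker such as [1, 2] contains, a two-pass approach works in plain Python. This is a minimal sketch under stated assumptions, not a description of how DeepsetAnswerBuilder processes the pattern: acm_reference_indices is a hypothetical helper. The second pass is needed because a repeated capture group in Python's re module keeps only its last repetition.

```python
import re
from typing import List

# The regular expression the acm abbreviation stands for.
ACM_PATTERN = r"\[(?:(\d+),?\s*)+\]"


def acm_reference_indices(answer: str) -> List[int]:
    """Hypothetical helper: collect every index from [1, 2]-style markers."""
    indices: List[int] = []
    # First pass: locate each bracketed reference group.
    for group in re.finditer(ACM_PATTERN, answer):
        # Second pass: pull every number out of the matched text, because the
        # repeated capture group alone would retain only the last number.
        indices.extend(int(number) for number in re.findall(r"\d+", group.group(0)))
    return indices


print(acm_reference_indices("this is an answer[1, 2]"))  # [1, 2]
```

Text without a bracketed marker yields an empty list, which you can use to spot replies where the Generator did not follow the referencing instructions.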