DeepsetAnswerBuilder

Use DeepsetAnswerBuilder after a Generator that is instructed to include references in its replies. DeepsetAnswerBuilder converts these replies into a format you can visualize in deepset AI Platform.

Basic Information

Type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
Components it can connect with:
- Rankers: It can receive documents from Rankers and add them to the generated answers.
- Generators: It receives replies from a generator and transforms them into GeneratedAnswer objects.

Inputs

Required Inputs

Name	Type	Description
`query`	String	The query string. If DeepsetAnswerBuilder doesn't receive the `query` from a component it's connected to, you must list it in the `inputs` section of the pipeline YAML under `query`. You can see an example in the Usage Example section below.
`replies`	List of strings	A list of replies from a generator.

Optional Inputs

Name	Type	Default	Description
`meta`	List of dictionaries of string and any	`None`	A list of metadata the generator returns. If not received, the generated answer contains no metadata.
`documents`	List of Document objects	`None`	A list of documents used as input to the Generator, for example the documents a Ranker returns. If received, they're added to the `GeneratedAnswer` objects.
`pattern`	String	`None`	The regular expression to extract the answer text from the generator output. If not specified, the whole string is used as the answer. The regular expression can have at most one capture group. If a capture group is present, the text matched by the capture group is used as the answer. If no capture group is present, the whole match is used as the answer. Examples: `[^\n]+$` finds "this is an answer" in the string "this is an argument.\nthis is an answer". `Answer: (.*)` finds "this is an answer" in the string "this is an argument. Answer: this is an answer".
`reference_pattern`	String	`None`	The regular expression to use for parsing the document references. References are assumed to be indices of the input documents, with indices starting at 1. Example: `\[(\d+)\]` finds "1" in the string "this is an answer[1]". If not specified, no parsing is done, and all documents are referenced. You can use the following abbreviation: `acm`: `\[(?:(\d+),?\s*)+\]` finds "1" and "2" in the string "this is an answer[1, 2]".
`prompt`	String	`None`	The prompt the Generator uses. If specified, it's added to the metadata of the `GeneratedAnswer` objects.
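
The `pattern` rules above can be illustrated with Python's standard `re` module. This is a minimal sketch, not the component's actual code: `extract_answer` is a hypothetical helper, and returning an empty string when nothing matches is an assumption that the description above does not state.

```python
import re
from typing import Optional


def extract_answer(reply: str, pattern: Optional[str] = None) -> str:
    """Hypothetical helper showing how `pattern` selects the answer text."""
    if pattern is None:
        # No pattern: the whole reply is used as the answer.
        return reply
    compiled = re.compile(pattern)
    if compiled.groups > 1:
        raise ValueError("pattern can have at most one capture group")
    match = compiled.search(reply)
    if match is None:
        # Assumption: a reply that doesn't match yields an empty answer.
        return ""
    # One capture group: use its text. No capture group: use the whole match.
    return (match.group(1) or "") if compiled.groups == 1 else match.group(0)


reply_with_newline = "this is an argument.\nthis is an answer"
reply_with_label = "this is an argument. Answer: this is an answer"

last_line = extract_answer(reply_with_newline, r"[^\n]+$")
labeled = extract_answer(reply_with_label, r"Answer: (.*)")
print(last_line)
print(labeled)
```

Both calls print `this is an answer`, matching the two examples in the `pattern` description.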

Outputs

Name	Type	Description
`answers`	List of GeneratedAnswer objects	Answers obtained from the output of the generator.

Overview

DeepsetAnswerBuilder takes a query and the replies from a Generator as input and turns them into GeneratedAnswer objects. Optionally, it can add documents and the Generator's metadata to the generated answers.

DeepsetAnswerBuilder is used in RAG pipelines to add references to generated responses. Place it after a Generator that is instructed to produce references. DeepsetAnswerBuilder takes that Generator's replies as input and adds the references to the answer's `_references` metadata field so that deepset's user interface can display them.
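
To make the reference convention concrete, here is a rough sketch of how 1-based indices in a reply can be mapped to positions in the input document list. The helpers `referenced_positions` and `bracketed_positions` are hypothetical and use only Python's `re` module. They are not the component's implementation and don't reproduce the structure of the `_references` field, which this page doesn't describe.

```python
import re
from typing import Iterable, List


def _to_positions(raw_indices: Iterable[str], num_documents: int) -> List[int]:
    """Convert 1-based index strings to unique, in-range 0-based positions."""
    positions: List[int] = []
    for raw in raw_indices:
        position = int(raw) - 1  # references start at 1, Python lists at 0
        if 0 <= position < num_documents and position not in positions:
            positions.append(position)
    return positions


def referenced_positions(reply: str, reference_pattern: str, num_documents: int) -> List[int]:
    """Assumes `reference_pattern` has exactly one capture group holding the index."""
    return _to_positions(re.findall(reference_pattern, reply), num_documents)


def bracketed_positions(reply: str, num_documents: int) -> List[int]:
    """Hypothetical two-step parse for index lists such as "[1, 2]"."""
    raw_indices: List[str] = []
    for group in re.findall(r"\[([\d,\s]+)\]", reply):
        raw_indices.extend(re.findall(r"\d+", group))
    return _to_positions(raw_indices, num_documents)


single = referenced_positions("Paris is the capital[1]. It lies on the Seine[3].", r"\[(\d+)\]", 3)
multiple = bracketed_positions("this is an answer[1, 2]", 3)
print(single)    # [0, 2]
print(multiple)  # [0, 1]
```

The second helper parses in two steps because, in Python's `re`, a capture group inside a repetition keeps only its last match. How the component handles index lists internally isn't stated here.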

ReferencePredictor and DeepsetAnswerBuilder differ in how the references are created: ReferencePredictor uses a dedicated model that filters for document IDs to create references, while DeepsetAnswerBuilder is used with an LLM (through a Generator) that is instructed to create the references.

Usage Example

In this example, DeepsetAnswerBuilder receives documents from the Ranker so that it can attach them to the generated answers. It also receives replies from the Generator and converts them into GeneratedAnswer objects with references that the deepset interface can display.

query_yaml: |
  components:
  ...
    ranker:
      type: haystack.components.rankers.transformers_similarity.TransformersSimilarityRanker
      init_parameters:
        model: "svalabs/cross-electra-ms-marco-german-uncased"
        top_k: 8
        device: null
        model_kwargs:
          torch_dtype: "torch.float16"
          
    generator:
      type: haystack.components.generators.openai.OpenAIGenerator
      init_parameters:
        api_key: {"type": "env_var", "env_vars": ["OPENAI_API_KEY"], "strict": False}
        model: "gpt-4-turbo-preview"
        generation_kwargs:
          max_tokens: 400
          temperature: 0.0
          seed: 0
          
    answer_builder:
      type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
      init_parameters: 
        reference_pattern: acm
      ...
      
  connections:  # Defines how the components are connected
  ...
  - sender: ranker.documents
    receiver: answer_builder.documents # DeepsetAnswerBuilder receives documents from ranker
  - sender: generator.replies
    receiver: answer_builder.replies # DeepsetAnswerBuilder receives replies from the generator
    ...
    
  inputs:
    query:
    ...
    # We're listing AnswerBuilder here because it needs "query" as input and it's not
    # getting it from any other component it's connected to. This means AnswerBuilder
    # will receive "query" as input from the pipeline.
    - "ranker.query"
    - "answer_builder.query"
    ...

  outputs:
    answers: "answer_builder.answers" # This means we want AnswerBuilder's answers to be the output of the pipeline

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

Parameter	Type	Possible Values	Description
`pattern`	String	Default: `None`	The regular expression you want to use to extract the answer text from the Generator's output. If not specified, the whole string is used as the answer. The regular expression can have at most one capture group. If a capture group is defined, the text that matches it is used as the answer. If there's no capture group, the whole match is used as the answer. For example: `[^\n]+$` finds `this is an answer` in the string `this is an argument.\nthis is an answer`. `Answer: (.*)` finds `this is an answer` in the string `this is an argument. Answer: this is an answer`. Optional.
`reference_pattern`	String	Default: `None`	The regular expression you want to use to parse document references. It assumes references are specified as indices of the documents and that indices start at 1. For example: `\[(\d+)\]` finds `1` in the string `this is an answer[1]`. You can use the following abbreviation: `acm`: `\[(?:(\d+),?\s*)+\]` finds `1` and `2` in the string `this is an answer[1, 2]`. If not specified, no parsing is done, and all documents are referenced. Optional.
`extract_xml_tags`	List of strings	Default: `None`	A list of XML tags to extract the content from. Optional.

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

`run()` method parameters take precedence over initialization parameters.

Parameter	Type	Possible values	Description
`query`	String		The user query. Required.
`replies`	List of strings		The Generator's `replies`. Required.
`meta`	List of dictionaries	Default: `None`	The metadata returned by the Generator. If not specified, the generated answer contains no metadata. Optional.
`documents`	List of `Document` objects	Default: `None`	The documents used as input to the Generator. If `documents` are specified, they are added to the `GeneratedAnswer` objects. If both `documents` and `reference_pattern` are specified, the documents referenced in the Generator output are extracted from the input documents and added to the `GeneratedAnswer` objects. Optional.
`pattern`	String	Default: `None`	The regular expression pattern used to extract the answer text from the generator output. If not specified, the whole string is used as the answer. The regular expression can have at most one capture group. If a capture group is present, the text matching the capture group is used as the answer. If no capture group is present, the whole match is used as the answer. Examples: `[^\n]+$` finds "this is an answer" in the string "this is an argument.\nthis is an answer". `Answer: (.*)` finds "this is an answer" in the string "this is an argument. Answer: this is an answer". Optional.
`reference_pattern`	String	Default: `None`	The regular expression pattern used for parsing the document references. References are assumed to be indices of the input documents, with indices starting at 1. Example: `\[(\d+)\]` finds "1" in the string "this is an answer[1]". If not specified, no parsing is done, and all documents are referenced. You can use the following abbreviation: `acm`: `\[(?:(\d+),?\s*)+\]` finds "1" and "2" in the string "this is an answer[1, 2]". Optional.
`prompt`	String	Default: `None`	The prompt used in the Generator. If specified, it is added to the metadata of the `GeneratedAnswer` objects. Optional.
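
The sketch below shows how the run parameters in this table could combine, including the rule that `run()` values take precedence over init values. Everything in it is a hypothetical stand-in written with the standard library: `SketchAnswer` and `SketchAnswerBuilder` are not the real classes, documents are plain strings, the `"prompt"` metadata key is an assumed name, and treating `None` as "not provided" is an assumption about how precedence is resolved.

```python
import re
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SketchAnswer:
    """Hypothetical stand-in for GeneratedAnswer, not the real class."""
    data: str
    query: str
    documents: List[str] = field(default_factory=list)
    meta: Dict[str, Any] = field(default_factory=dict)


class SketchAnswerBuilder:
    """Hypothetical stand-in that mimics the documented parameter behavior."""

    def __init__(self, pattern: Optional[str] = None,
                 reference_pattern: Optional[str] = None) -> None:
        self.pattern = pattern
        self.reference_pattern = reference_pattern

    def run(self, query: str, replies: List[str],
            meta: Optional[List[Dict[str, Any]]] = None,
            documents: Optional[List[str]] = None,
            pattern: Optional[str] = None,
            reference_pattern: Optional[str] = None,
            prompt: Optional[str] = None) -> Dict[str, List[SketchAnswer]]:
        # Assumption: a run() value replaces the init value whenever it is given.
        pattern = pattern if pattern is not None else self.pattern
        if reference_pattern is None:
            reference_pattern = self.reference_pattern
        metas = meta if meta is not None else [{} for _ in replies]

        answers = []
        for reply, reply_meta in zip(replies, metas):
            text = reply
            if pattern is not None:
                match = re.search(pattern, reply)
                # Capture group if present, else whole match; "" is an assumption.
                text = "" if match is None else match.group(1 if match.re.groups else 0)
            selected = list(documents or [])
            if selected and reference_pattern is not None:
                # Keep only the documents cited by 1-based index in the reply.
                cited = {int(i) - 1 for i in re.findall(reference_pattern, reply)}
                selected = [doc for pos, doc in enumerate(selected) if pos in cited]
            answer_meta = dict(reply_meta)
            if prompt is not None:
                answer_meta["prompt"] = prompt  # key name is an assumption
            answers.append(SketchAnswer(text, query, selected, answer_meta))
        return {"answers": answers}


docs = ["doc one", "doc two", "doc three"]
builder = SketchAnswerBuilder(reference_pattern=r"\[(\d+)\]")

cited_only = builder.run("capital?", ["Answer: Berlin[2]"], documents=docs,
                         pattern=r"Answer: (.*)", prompt="Cite sources as [n].")
overridden = builder.run("capital?", ["Berlin{1}[2]"], documents=docs,
                         reference_pattern=r"\{(\d+)\}")
everything = SketchAnswerBuilder().run("capital?", ["Berlin"], documents=docs)
```

In the first call, only `doc two` is attached because the reply cites `[2]`. The second call overrides the init `reference_pattern`, so `doc one` is selected instead. The third call has no `reference_pattern`, so all documents are kept. The sketch leaves the reference markers in the answer text; whether the real component removes them isn't covered on this page.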
