AnswerGenerator

The AnswerGenerator generates new text as an answer to your query, based on the documents you feed to it.

While extractive question answering highlights a span of existing text as the answer, AnswerGenerator composes completely new text. It draws on the knowledge the model gained during pretraining and on the documents it receives from the retriever.

Basic Information

  • Pipeline type: Used in query pipelines
  • Position in a pipeline: After the retriever. You can use it as a substitute for the reader.
  • Input: Query and Documents
  • Output: Answer
  • Available Classes: OpenAIAnswerGenerator

Usage

Initializing the Node

Here's how to initialize OpenAIAnswerGenerator:

# Here's how you configure the node in YAML:

components:
  - name: AnswerGenerator
    type: OpenAIAnswerGenerator
    params:
      api_key: my_api_key

# Here's how you initialize the node in Python:

from haystack.nodes import OpenAIAnswerGenerator

MY_API_KEY = "my_api_key"  # Replace with your OpenAI API key
generator = OpenAIAnswerGenerator(api_key=MY_API_KEY)
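
If you'd rather not hard-code the key, one option is to read it from an environment variable. The following is a minimal sketch using only the Python standard library. The helper name load_api_key and the variable name OPENAI_API_KEY are assumptions made for this example, not something the node requires.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The variable name is an assumption for this sketch; use whatever
    name your environment defines.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


# Hypothetical usage with the generator:
# generator = OpenAIAnswerGenerator(api_key=load_api_key())
```

Failing early with a clear message is usually easier to debug than an authentication error returned by the API later on.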

Here's how to initialize Seq2SeqGenerator:

# Here's how you configure the node in YAML:

components:
  - name: AnswerGenerator
    type: Seq2SeqGenerator
    params:
      model_name_or_path: your_locally_hosted_model

# Here's how you initialize the node in Python:

from haystack.nodes import Seq2SeqGenerator

generator = Seq2SeqGenerator(model_name_or_path="vblagoje/bart_lfqa")

In a Pipeline

This is an example of how you can use OpenAIAnswerGenerator in a pipeline:

# This example just contains the query pipeline part.
# Normally, you'd also need to specify the indexing pipeline and its components here

components:
# In YAML, you must set up the DocumentStore and a Retriever to fetch the documents
  - name: DocumentStore
    type: DeepsetCloudDocumentStore
  - name: Retriever # Selects the most relevant documents from the document store so that the OpenAI model can base its generation on them.
    type: EmbeddingRetriever # Uses a Transformer model to encode the document and the query
    params:
      document_store: DocumentStore
      embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1 # Model optimized for semantic search 
      model_format: sentence_transformers
      top_k: 3 # The number of documents to return
  - name: AnswerGenerator # Generates candidate answers based on the documents it gets from the retriever
    type: OpenAIAnswerGenerator 
    params:
      model: text-davinci-003
      api_key: your_openai_api_key # You can also set the API key in the Connections tab. If you do, you don't need to add it here.
      max_tokens: 50 # The maximum number of tokens allowed for each generated Answer.
      temperature: 0.9 # Determines the randomness of the model. Higher values mean the model will take more risk
      presence_penalty: 0.1 #  Positive values penalize new tokens based on whether they have already appeared in the text.
      top_k: 3 # The number of results to return


# Here you define how the nodes are organized in the pipelines
# For each node, specify its input
pipelines:
  - name: query
    nodes:
      - name: Retriever
        inputs: [Query]
      - name: AnswerGenerator
        inputs: [Retriever]
  - name: indexing
    # Here comes the indexing pipeline
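
Each node name listed under pipelines has to match a name declared under components. As a rough sketch of how you might check this before deploying, the snippet below works on a plain dictionary that stands in for a parsed configuration. The function find_undefined_nodes and the simplified dictionary are assumptions made for this example and are not part of Haystack.

```python
# A simplified stand-in for a parsed pipeline configuration.
config = {
    "components": [
        {"name": "DocumentStore", "type": "DeepsetCloudDocumentStore"},
        {"name": "Retriever", "type": "EmbeddingRetriever"},
        {"name": "AnswerGenerator", "type": "OpenAIAnswerGenerator"},
    ],
    "pipelines": [
        {
            "name": "query",
            "nodes": [
                {"name": "Retriever", "inputs": ["Query"]},
                {"name": "AnswerGenerator", "inputs": ["Retriever"]},
            ],
        }
    ],
}


def find_undefined_nodes(config: dict) -> list:
    """Return (pipeline, node) pairs whose node is not declared under components."""
    declared = {component["name"] for component in config.get("components", [])}
    missing = []
    for pipeline in config.get("pipelines", []):
        # A pipeline entry may have no nodes yet, so fall back to an empty list.
        for node in pipeline.get("nodes") or []:
            if node["name"] not in declared:
                missing.append((pipeline["name"], node["name"]))
    return missing


print(find_undefined_nodes(config))
```

An empty list means every node in every pipeline refers to a declared component.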

# Here's how you use the node in a Python pipeline:

from haystack.pipelines import Pipeline
from haystack.nodes import OpenAIAnswerGenerator
from haystack.schema import Document

# These docs could also come from a retriever
# Here we explicitly specify them to avoid the setup steps for Retriever and DocumentStore
doc_1 = "Contrails are a manmade type of cirrus cloud formed when water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, leaving behind a visible trail. The exhaust can also trigger the formation of cirrus by providing ice nuclei when there is an insufficient naturally-occurring supply in the atmosphere. One of the environmental impacts of aviation is that persistent contrails can form into large mats of cirrus, and increased air traffic has been implicated as one possible cause of the increasing frequency and amount of cirrus in Earth's atmosphere."
doc_2 = "Because the aviation industry is especially sensitive to the weather, accurate weather forecasting is essential. Fog or exceptionally low ceilings can prevent many aircraft from landing and taking off. Turbulence and icing are also significant in-flight hazards. Thunderstorms are a problem for all aircraft because of severe turbulence due to their updrafts and outflow boundaries, icing due to the heavy precipitation, as well as large hail, strong winds, and lightning, all of which can cause severe damage to an aircraft in flight. Volcanic ash is also a significant problem for aviation, as aircraft can lose engine power within ash clouds. On a day-to-day basis airliners are routed to take advantage of the jet stream tailwind to improve fuel efficiency. Aircrews are briefed prior to takeoff on the conditions to expect en route and at their destination. Additionally, airports often change which runway is being used to take advantage of a headwind. This reduces the distance required for takeoff, and eliminates potential crosswinds."

# Let's initialize the OpenAIAnswerGenerator
api_key = "your_openai_api_key"  # Replace with your OpenAI API key
node = OpenAIAnswerGenerator(
    api_key=api_key,
    model="text-davinci-003",
    max_tokens=50,
    presence_penalty=0.1,
    frequency_penalty=0.1,
    top_k=3,
    temperature=0.9
)

# Let's create a pipeline with OpenAIAnswerGenerator
pipe = Pipeline()
pipe.add_node(component=node, name="AnswerGenerator", inputs=["Query"])

output = pipe.run(query="Why do airplanes leave contrails in the sky?", documents=[Document(doc_1), Document(doc_2)])
print(output["answers"])

# Printed results
[<Answer {'answer': ' Contrails are created when water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, leaving behind a visible trail.', 'type': 'generative', 'score': None, 'context': None, 'offsets_in_document': None, 'offsets_in_context': None, 'document_id': None, 'meta': {'doc_ids': ['6a371f0bbb37c291befaaaf4704dc694', '2a2f7c49e1bec7864dd4bb447d5d0bfa'], 'doc_scores': [None, None], 'content': ["Contrails are a manmade type of cirrus cloud formed when water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, leaving behind a visible trail. The exhaust can also trigger the formation of cirrus by providing ice nuclei when there is an insufficient naturally-occurring supply in the atmosphere. One of the environmental impacts of aviation is that persistent contrails can form into large mats of cirrus, and increased air traffic has been implicated as one possible cause of the increasing frequency and amount of cirrus in Earth's atmosphere.", 'Because the aviation industry is especially sensitive to the weather, accurate weather forecasting is essential. Fog or exceptionally low ceilings can prevent many aircraft from landing and taking off. Turbulence and icing are also significant in-flight hazards. Thunderstorms are a problem for all aircraft because of severe turbulence due to their updrafts and outflow boundaries, icing due to the heavy precipitation, as well as large hail, strong winds, and lightning, all of which can cause severe damage to an aircraft in flight. Volcanic ash is also a significant problem for aviation, as aircraft can lose engine power within ash clouds. On a day-to-day basis airliners are routed to take advantage of the jet stream tailwind to improve fuel efficiency. 
Aircrews are briefed prior to takeoff on the conditions to expect en route and at their destination. Additionally, airports often change which runway is being used to take advantage of a headwind. This reduces the distance required for takeoff, and eliminates potential crosswinds.'], 'titles': ['', '']}}>,
 <Answer {'answer': ' Airplanes leave contrails in the sky because water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, leaving behind a visible trail.', 'type': 'generative', 'score': None, 'context': None, 'offsets_in_document': None, 'offsets_in_context': None, 'document_id': None, 'meta': {'doc_ids': ['6a371f0bbb37c291befaaaf4704dc694', '2a2f7c49e1bec7864dd4bb447d5d0bfa'], 'doc_scores': [None, None], 'content': ["Contrails are a manmade type of cirrus cloud formed when water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, leaving behind a visible trail. The exhaust can also trigger the formation of cirrus by providing ice nuclei when there is an insufficient naturally-occurring supply in the atmosphere. One of the environmental impacts of aviation is that persistent contrails can form into large mats of cirrus, and increased air traffic has been implicated as one possible cause of the increasing frequency and amount of cirrus in Earth's atmosphere.", 'Because the aviation industry is especially sensitive to the weather, accurate weather forecasting is essential. Fog or exceptionally low ceilings can prevent many aircraft from landing and taking off. Turbulence and icing are also significant in-flight hazards. Thunderstorms are a problem for all aircraft because of severe turbulence due to their updrafts and outflow boundaries, icing due to the heavy precipitation, as well as large hail, strong winds, and lightning, all of which can cause severe damage to an aircraft in flight. Volcanic ash is also a significant problem for aviation, as aircraft can lose engine power within ash clouds. On a day-to-day basis airliners are routed to take advantage of the jet stream tailwind to improve fuel efficiency. 
Aircrews are briefed prior to takeoff on the conditions to expect en route and at their destination. Additionally, airports often change which runway is being used to take advantage of a headwind. This reduces the distance required for takeoff, and eliminates potential crosswinds.'], 'titles': ['', '']}}>,
 <Answer {'answer': ' Contrails are formed when water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, leaving behind a visible trail.', 'type': 'generative', 'score': None, 'context': None, 'offsets_in_document': None, 'offsets_in_context': None, 'document_id': None, 'meta': {'doc_ids': ['6a371f0bbb37c291befaaaf4704dc694', '2a2f7c49e1bec7864dd4bb447d5d0bfa'], 'doc_scores': [None, None], 'content': ["Contrails are a manmade type of cirrus cloud formed when water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, leaving behind a visible trail. The exhaust can also trigger the formation of cirrus by providing ice nuclei when there is an insufficient naturally-occurring supply in the atmosphere. One of the environmental impacts of aviation is that persistent contrails can form into large mats of cirrus, and increased air traffic has been implicated as one possible cause of the increasing frequency and amount of cirrus in Earth's atmosphere.", 'Because the aviation industry is especially sensitive to the weather, accurate weather forecasting is essential. Fog or exceptionally low ceilings can prevent many aircraft from landing and taking off. Turbulence and icing are also significant in-flight hazards. Thunderstorms are a problem for all aircraft because of severe turbulence due to their updrafts and outflow boundaries, icing due to the heavy precipitation, as well as large hail, strong winds, and lightning, all of which can cause severe damage to an aircraft in flight. Volcanic ash is also a significant problem for aviation, as aircraft can lose engine power within ash clouds. On a day-to-day basis airliners are routed to take advantage of the jet stream tailwind to improve fuel efficiency. 
Aircrews are briefed prior to takeoff on the conditions to expect en route and at their destination. Additionally, airports often change which runway is being used to take advantage of a headwind. This reduces the distance required for takeoff, and eliminates potential crosswinds.'], 'titles': ['', '']}}>]
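
The generated answers in the output above begin with a leading space, and with top_k greater than 1 the model can return the same text more than once. Below is a small sketch of cleaning that up. It uses plain dictionaries with the same answer field as the printed output rather than Haystack Answer objects, and the helper unique_answer_texts and the shortened sample texts are made up for this example.

```python
from typing import Dict, List


def unique_answer_texts(answers: List[Dict]) -> List[str]:
    """Strip surrounding whitespace and drop exact duplicates, keeping order."""
    seen = set()
    texts = []
    for answer in answers:
        text = (answer.get("answer") or "").strip()
        if text and text not in seen:
            seen.add(text)
            texts.append(text)
    return texts


# Shortened, illustrative stand-ins for generated answers.
sample = [
    {"answer": " Contrails form when exhaust water vapor condenses and freezes.", "type": "generative"},
    {"answer": " Contrails form when exhaust water vapor condenses and freezes.", "type": "generative"},
    {"answer": " Airplanes leave contrails because exhaust water vapor freezes.", "type": "generative"},
]

for text in unique_answer_texts(sample):
    print(text)
```

This only removes exact duplicates. Answers that differ by a word or two, like the three in the output above, would all be kept.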

Arguments

OpenAIAnswerGenerator uses the GPT-3 models hosted by OpenAI to generate answers. You need an API key from an active OpenAI account to use these models.

| Argument | Type | Possible Values | Description |
| --- | --- | --- | --- |
| api_key | String | | Your API key from an active OpenAI account. Mandatory. |
| model | String | Model name. Default: text-davinci-003 | The name of the OpenAI model you want to use. Mandatory. |
| max_tokens | Integer | Default: 50 | The maximum number of tokens the generated answer can have. Setting a number higher than the default allows for longer answers without exceeding the maximum prompt length of the OpenAI model. Setting a number lower than the default allows for longer prompts with more documents passed as context, but the generated answer might be cut once it reaches max_tokens. Mandatory. |
| top_k | Integer | Default: 5 | The number of generated answers. Mandatory. |
| temperature | Float | Default: 0.2 | The sampling temperature you want to use. Higher values mean the model will take more risks. Value 0 works better for scenarios with a well-defined answer. Mandatory. |
| presence_penalty | Float | A number between -2.0 and 2.0. Default: 0.1 | Positive values penalize new tokens based on whether they have already appeared in the text. This increases the model's likelihood of talking about new topics. For more information about frequency and presence penalties, see the parameter details in the OpenAI documentation. Mandatory. |
| frequency_penalty | Float | A number between -2.0 and 2.0. Default: 0.1 | Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood of repeating the same line verbatim. For more information about frequency and presence penalties, see the parameter details in the OpenAI documentation. Mandatory. |
| examples_context | String | | A text snippet containing the contextual information used to generate the answers for the examples you provide. If not supplied, the default from the OpenAI API docs is used: "In 2017, U.S. life expectancy was 78.6 years." Optional. |
| examples | A list of strings | | List of (question, answer) pairs that help steer the model towards the tone and answer format you'd like. We recommend adding 2 to 3 examples. If not supplied, the default from the OpenAI API docs is used. Optional. |
| stop_words | A list of strings | | Up to four sequences where the API stops generating further tokens. The returned text does not contain the stop sequence. If you don't provide any stop words, the default value from the OpenAI API docs is used. Optional. |
| progress_bar | Boolean | True/False. Default: True | Shows the progress bar indicating the progress of answer generation. Mandatory. |
| prompt_template | PromptTemplate | | A PromptTemplate that tells the model how to generate answers given a context and query supplied at runtime. The context is automatically constructed at runtime from a list of provided documents. Use examples_context and a list of examples to steer the model towards the tone and answer format you want. If not supplied, the default prompt template is: PromptTemplate(name="question-answering-with-examples", prompt_text="Please answer the question according to the above context." "\n===\nContext: $examples_context\n===\n$examples\n\n" "===\nContext: $context\n===\n$query", prompt_params=["examples_context", "examples", "context", "query"]). Optional. |
| context_join_str | String | | The separation string used when joining the input documents to create the context used by the PromptTemplate. |
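
To make the interplay of prompt_template, examples_context, examples, and context_join_str more concrete, here is a rough sketch of how such a prompt could be assembled. It uses Python's string.Template, which happens to share the $name placeholder syntax, and is not the node's actual implementation. The function build_prompt, the "Q:"/"A:" formatting of examples and query, and the sample question-answer pair are assumptions made for this illustration.

```python
from string import Template
from typing import List, Tuple

# Same text as the default prompt template listed in the table above.
PROMPT_TEXT = (
    "Please answer the question according to the above context."
    "\n===\nContext: $examples_context\n===\n$examples\n\n"
    "===\nContext: $context\n===\n$query"
)


def build_prompt(
    query: str,
    documents: List[str],
    examples_context: str,
    examples: List[Tuple[str, str]],
    context_join_str: str = " ",
) -> str:
    """Fill the template with the joined documents, the examples, and the query."""
    # The documents are joined into one context string using context_join_str.
    context = context_join_str.join(documents)
    # Hypothetical formatting of the (question, answer) pairs.
    examples_text = "\n---\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return Template(PROMPT_TEXT).substitute(
        examples_context=examples_context,
        examples=examples_text,
        context=context,
        query=f"Q: {query}\nA:",  # Hypothetical formatting of the query
    )


prompt = build_prompt(
    query="Why do airplanes leave contrails in the sky?",
    documents=["First document text.", "Second document text."],
    examples_context="In 2017, U.S. life expectancy was 78.6 years.",
    examples=[("What is human life expectancy in the United States?", "78 years.")],
)
print(prompt)
```

The longer the joined context, the less room is left for the answer, which is why max_tokens and the number of documents you pass have to be balanced against each other.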