PromptNode
PromptNode is an easy-to-use, customizable node that brings the power of large language models directly into your pipelines. You can use it for a variety of NLP tasks.
What are large language models?
Large language models are huge models trained on enormous amounts of data. Interacting with such a model resembles talking to another person. These models have general knowledge of the world. You can ask them anything, and they'll be able to answer.
Large language models are trained to perform many NLP tasks with little training data. What's astonishing about them is that a single model can perform various NLP tasks with good accuracy.
Some examples of large language models include flan-t5-base, Flan-PaLM, Chinchilla, and GPT-3 variants, such as text-davinci-003.
Basic Information
PromptNode is a very versatile node. It's used in query pipelines, but its position depends on what you want it to do. You can pass it a prompt template that specifies the NLP task it should perform, as well as the model to use. For more information, see the Usage section.
- Pipeline type: Used in query pipelines.
- Position in a pipeline: The position depends on the NLP task you want it to do. See Usage for examples.
- Input and Output: Depends on the NLP task it performs. Some examples are `query`, `documents`, and the output of the preceding node. You define the input in the PromptTemplate you pass to PromptNode. If you're using one of the ready-made PromptTemplates, here's the input and output they take:
| Prompt Template | Input | Output |
| --- | --- | --- |
| question-answering | `documents` (list or string), `questions` (list or string) | answer |
| question-generation | `documents` (list or string) | question |
| conditioned-question-generation | `documents` (list or string), `answers` (list or string) | question |
| summarization | `documents` (list or string) | summary |
| question-answering-check | `documents` (list or string), `questions` (list or string) | answer |
| sentiment-analysis | `documents` (list or string) | answer |
| multiple-choice-question-answering | `questions` (list or string), `options` (list or list of lists) | answer |
| topic-classification | `options` (list or list of lists), `documents` (list or string) | answer |
| language-detection | `documents` (list or string) | answer |
| translation | `target_language` (list or string), `documents` (list or string) | translation |
The output is usually a string, but in the prompt, you can tell the model to generate a specific output type.
- Available classes: PromptNode
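For example, here's a minimal sketch of calling one of the ready-made PromptTemplates from the table above (the translation template takes `target_language` and `documents` and returns a translation):

```python
from haystack.nodes import PromptNode

prompt_node = PromptNode()

# The translation template takes `target_language` and `documents` as input:
prompt_node.prompt(
    prompt_template="translation",
    target_language="German",
    documents=["Berlin is the capital of Germany."],
)
```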
Usage
You can use PromptNode as a stand-alone node or in a pipeline. If you don't specify the model you want to use for the node, it uses google/flan-t5-base.
Stand Alone
You can run PromptNode on its own. Just initialize the node and ask a question. The model has general knowledge about the world, so you can ask it anything.
```python
from haystack.nodes import PromptNode

# Initialize the node:
prompt_node = PromptNode()

# Run a prompt:
prompt_node("What is the capital of Germany?")

# Here's the output:
['berlin']
```
With a Prompt Template
PromptNode comes with out-of-the-box PromptTemplates. The templates contain instructions for the node to perform some of the most common NLP tasks. For better results, specify the template you want PromptNode to use. You can pass additional variables, like documents or questions, to the node. The template combines all inputs into a single prompt:
```python
from haystack.nodes import PromptNode, PromptTemplate

# Initialize the node:
prompt_node = PromptNode()

# Specify the template using the `prompt` method
# and pass your documents and questions:
prompt_node.prompt(
    prompt_template="question-answering",
    documents=["Berlin is the capital of Germany.", "Paris is the capital of France."],
    questions=["What is the capital of Germany?", "What is the capital of France?"],
)

# Here's the output:
['Berlin', 'Paris']
```
To explore the real power of templates, see the Prompt Templates section.
With a Model Specified
By default, PromptNode uses the google/flan-t5-base model. You can also use other google/flan-t5 models and OpenAI's davinci models.
```python
from haystack.nodes import PromptNode

# Initialize the node, passing the model:
prompt_node = PromptNode(model_name_or_path="google/flan-t5-xl")

# Go ahead and ask a question:
prompt_node("What is the best city in Europe to live in?")
```
In a Pipeline
The real power of PromptNode shows when you use it in a pipeline. Look at the examples below to get an idea of what's possible.
PromptNode and Shaper
When used in a pipeline, PromptNode often requires a Shaper, either to make sure the output of the preceding node is what PromptNode expects as input, or to make sure the output of PromptNode is what the next node expects. Shaper has out-of-the-box functions you can use to modify the input and output of PromptNode's PromptTemplate.
To understand which Shaper function to use, let's look at how PromptNode and PromptTemplates work. When a PromptTemplate takes one parameter, PromptNode simply processes this parameter. If the parameter is a list of strings, PromptNode processes the list entries one by one. Examples of such templates are `summarization` and `language-detection`.
When a PromptTemplate takes more than one parameter, it gets a bit more complicated. If each parameter is a single string, the strings simply get injected into the PromptTemplate and PromptNode executes the prompt. For example, in the `question-answering` template, you may pass one question and one document. PromptNode then takes the question and runs it against the document. But if the parameters are lists, PromptNode processes the first item from the first list against the first item from the second list, then the second item from the first list against the second item from the second list, and so on. It stops when the shorter list runs out, and the remaining items from the longer list stay unprocessed.

For example, in the `question-answering` template, if you pass a question as a string and a list of 5 documents, PromptNode runs the question against the first document and then stops.
To make PromptNode run the question against all the documents, you must use Shaper. You can do it in two ways:
- Using Shaper's `value_to_list` function. This changes the question into a list in which the question is repeated 5 times (because you have 5 documents). PromptNode then takes the first occurrence of the question and runs it against the first document, takes the second occurrence and runs it against the second document, and so on. It's a trick that repeats the question as many times as there are documents to make sure PromptNode runs the question against each document (see the sketch below).
- Using Shaper's `join_documents` function. This function joins all the documents into one large document. PromptNode then runs the question against this single document.
PromptNode always processes parameters this way. So if at least one of your parameters is a list, use an appropriate Shaper function to control how PromptNode processes it.
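Here's a minimal sketch of the `value_to_list` approach, assuming the pipeline's `query` should be mapped to the `questions` variable that the `question-answering` template expects:

```python
from haystack.nodes import Shaper

# Repeat the query once per document so that PromptNode
# runs the question against every document:
shaper = Shaper(
    func="value_to_list",
    inputs={"value": "query", "target_list": "documents"},
    outputs=["questions"],
)
```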
Examples
Long-Form Question Answering
Long-form QA is one use of PromptNode, but certainly not the only one. In this QA type, PromptNode handles complex questions by synthesizing information from multiple documents to generate an answer.
```python
from haystack.pipelines import Pipeline
from haystack.nodes import Shaper, PromptNode, PromptTemplate
from haystack.schema import Document

# Let's create a custom LFQA prompt using PromptTemplate:
lfqa_prompt = PromptTemplate(
    name="lfqa",
    prompt_text="""Synthesize a comprehensive answer from the following topk most relevant paragraphs and the given question.
Provide a clear and concise response that summarizes the key points and information presented in the paragraphs.
Your answer should be in your own words and be no longer than 50 words.
\n\n Paragraphs: $documents \n\n Question: $query \n\n Answer:""",
)

# These docs could also come from a retriever.
# Here we specify them explicitly to avoid the setup steps for Retriever and DocumentStore:
doc_1 = "Contrails are a manmade type of cirrus cloud formed when water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, leaving behind a visible trail. The exhaust can also trigger the formation of cirrus by providing ice nuclei when there is an insufficient naturally-occurring supply in the atmosphere. One of the environmental impacts of aviation is that persistent contrails can form into large mats of cirrus, and increased air traffic has been implicated as one possible cause of the increasing frequency and amount of cirrus in Earth's atmosphere."
doc_2 = "Because the aviation industry is especially sensitive to the weather, accurate weather forecasting is essential. Fog or exceptionally low ceilings can prevent many aircraft from landing and taking off. Turbulence and icing are also significant in-flight hazards. Thunderstorms are a problem for all aircraft because of severe turbulence due to their updrafts and outflow boundaries, icing due to the heavy precipitation, as well as large hail, strong winds, and lightning, all of which can cause severe damage to an aircraft in flight. Volcanic ash is also a significant problem for aviation, as aircraft can lose engine power within ash clouds. On a day-to-day basis airliners are routed to take advantage of the jet stream tailwind to improve fuel efficiency. Aircrews are briefed prior to takeoff on the conditions to expect en route and at their destination. Additionally, airports often change which runway is being used to take advantage of a headwind. This reduces the distance required for takeoff, and eliminates potential crosswinds."

# Shaper concatenates the most relevant docs into one doc used as context for the generated answer:
shaper = Shaper(func="join_documents", inputs={"documents": "documents"}, outputs=["documents"])

# Let's initialize the PromptNode:
api_key = "<your OpenAI API key>"
node = PromptNode("text-davinci-003", default_prompt_template=lfqa_prompt, api_key=api_key)

# Let's create a pipeline with Shaper and PromptNode:
pipe = Pipeline()
pipe.add_node(component=shaper, name="shaper", inputs=["Query"])
pipe.add_node(component=node, name="prompt_node", inputs=["shaper"])

output = pipe.run(query="Why do airplanes leave contrails in the sky?", documents=[Document(doc_1), Document(doc_2)])
output["results"]

# Here's the answer:
["Contrails are manmade clouds formed when water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, creating a visible trail. Increased air traffic has been linked to the greater frequency and amount of these cirrus clouds in Earth's atmosphere."]
```
And here's the same pipeline defined in YAML:

```yaml
version: 1.14.0
name: LongFormQA

components:
  # Note: the Retriever and DocumentStore components referenced in the indexing
  # pipeline below are omitted here; you'd still need to define them.
  - name: Shaper
    type: Shaper
    params:
      func: join_documents
      inputs:
        - documents
      outputs:
        - documents
  - name: PromptNode
    type: PromptNode
    params:
      default_prompt_template: lfqa_prompt
      model_name_or_path: text-davinci-003
      api_key: api_key
  - name: FileTypeClassifier # Routes files based on their extension to the appropriate converters; by default txt, pdf, md, docx, html
    type: FileTypeClassifier
  - name: TextConverter # Converts files into documents
    type: TextConverter
  - name: PDFConverter # Converts PDFs into documents
    type: PDFToTextConverter
  - name: Preprocessor # Splits documents into smaller ones and cleans them up
    type: PreProcessor
    params:
      split_by: word # The unit by which you want to split the documents
      split_length: 250 # The max number of words in a document
      split_overlap: 20 # Enables the sliding window approach
      language: en
      split_respect_sentence_boundary: True

pipelines:
  - name: query
    nodes:
      - name: Shaper
        inputs: [Query]
      - name: PromptNode
        inputs: [Shaper]
  - name: indexing
    nodes:
      - name: FileTypeClassifier
        inputs: [File]
      - name: TextConverter
        inputs: [FileTypeClassifier.output_1] # Ensures that this converter receives txt files
      - name: PDFConverter
        inputs: [FileTypeClassifier.output_2] # Ensures that this converter receives PDFs
      - name: Preprocessor
        inputs: [TextConverter, PDFConverter]
      - name: Retriever
        inputs: [Preprocessor]
      - name: DocumentStore
        inputs: [Retriever]
```
To learn more about the template structure, see the Prompt Templates section below.
Arguments
Use these parameters to configure the PromptNode:
| Parameter | Type | Possible Values | Description |
| --- | --- | --- | --- |
| model_name_or_path | String | Different sizes of the google/flan-t5 model or a model by OpenAI. Default: `google/flan-t5-base` | The name of the model you want to use with the PromptNode. Mandatory. |
| default_prompt_template | String | Any of the out-of-the-box templates or a template you created. The out-of-the-box templates are: question-answering, question-generation, summarization, conditioned-question-generation, question-answering-check, sentiment-analysis, topic-classification, multiple-choice-question-answering, language-detection, translation | The prompt template you want to use with the PromptNode. The template contains instructions for the model. If you don't specify it, the model tries to guess what task you want it to do based on your query. For best results, we recommend specifying the template. Optional. |
| output_variable | String | - | The name of the output variable in which you want to store the inference results. Optional. |
| max_length | Integer | Default: 100 | The maximum length of the text output the PromptNode generates. Optional. |
| api_key | String | - | The API key for the model. Specify it to use a model by OpenAI. Optional. |
| use_auth_token | String | - | The Hugging Face authentication token for your private model. Optional. |
| use_gpu | Boolean | True/False | Specifies whether to use GPU when running PromptNode. Optional. |
| devices | List of strings | - | The list of torch devices to use. Optional. |
| stop_words | List of strings | - | If PromptNode encounters any of the words you specify here, it stops generating text. Optional. |
| top_k | Integer | Default: 1 | The number of answers (generated texts) you want PromptNode to return. Mandatory. |
| model_kwargs | Dictionary | - | Any additional keyword arguments you want to pass to the model. Optional. |
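For example, here's a minimal sketch of initializing PromptNode with several of these arguments (the values are illustrative):

```python
from haystack.nodes import PromptNode

prompt_node = PromptNode(
    model_name_or_path="google/flan-t5-base",
    default_prompt_template="question-answering",
    output_variable="answers",  # store results under this name
    max_length=100,             # cap the generated text length
    top_k=1,                    # return a single answer
    stop_words=["Question:"],   # stop generating if this appears
    use_gpu=True,
)
```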
Prompt Templates
PromptNode comes with out-of-the-box prompt templates ready for you to use. A prompt template corresponds to an NLP task. Each template contains the prompt text, which is the instruction for the model. Prompt text may contain variables that get filled in with actual values at runtime. Here are the templates currently available for the PromptNode:
question-answering

```python
PromptTemplate(
    name="question-answering",
    prompt_text="Given the context please answer the question. Context: $documents; Question: "
    "$questions; Answer:",
)
```

question-generation

```python
PromptTemplate(
    name="question-generation",
    prompt_text="Given the context please generate a question. Context: $documents; Question:",
)
```

summarization

```python
PromptTemplate(name="summarization", prompt_text="Summarize this document: $documents Summary:")
```

conditioned-question-generation

```python
PromptTemplate(
    name="conditioned-question-generation",
    prompt_text="Please come up with a question for the given context and the answer. "
    "Context: $documents; Answer: $answers; Question:",
)
```

question-answering-check

```python
PromptTemplate(
    name="question-answering-check",
    prompt_text="Does the following context contain the answer to the question? "
    "Context: $documents; Question: $questions; Please answer yes or no! Answer:",
)
```

sentiment-analysis

```python
PromptTemplate(
    name="sentiment-analysis",
    prompt_text="Please give a sentiment for this context. Answer with positive, "
    "negative or neutral. Context: $documents; Answer:",
)
```

topic-classification

```python
PromptTemplate(
    name="topic-classification",
    prompt_text="Categories: $options; What category best describes: $documents; Answer:",
)
```

multiple-choice-question-answering

```python
PromptTemplate(
    name="multiple-choice-question-answering",
    prompt_text="Question:$questions ; Choose the most suitable option to answer the above question. "
    "Options: $options; Answer:",
)
```

language-detection

```python
PromptTemplate(
    name="language-detection",
    prompt_text="Detect the language in the following context and answer with the "
    "name of the language. Context: $documents; Answer:",
)
```

translation

```python
PromptTemplate(
    name="translation",
    prompt_text="Translate the following context to $target_language. Context: $documents; Translation:",
)
```
If you don't specify the template, the node tries to guess what task you want it to perform. By indicating the template, you ensure it performs the right task.
Adding a New Template
You can also create your own template. Follow this structure:
```python
from haystack.nodes import PromptTemplate, PromptNode

# In `prompt_text`, tell the model what you want it to do.
PromptNode.add_prompt_template(
    PromptTemplate(
        name="a-meaningful-template-name",
        prompt_text="Instructions for the model. You can add variables here.",
    )
)
```
The `prompt_text` parameter contains the prompt template text for the task you want the model to do. It also specifies the input variables. At runtime, these variables must be present in the execution context of the node.
When specifying parameters for your template, remember how PromptNode processes them: if there's more than one parameter and one of them is a list, PromptNode processes only the first item of the list against the other parameter. You may need Shaper to use PromptNode in a pipeline. For more information, see PromptNode and Shaper.
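For instance, here's a sketch of registering and using a hypothetical custom template (the name, prompt text, and example document are illustrative):

```python
from haystack.nodes import PromptNode, PromptTemplate

# Register a hypothetical custom template:
PromptNode.add_prompt_template(
    PromptTemplate(
        name="give-a-tldr",
        prompt_text="Give a short TL;DR of the following document: $documents; TL;DR:",
    )
)

# Use it like any of the ready-made templates:
prompt_node = PromptNode()
prompt_node.prompt(
    prompt_template="give-a-tldr",
    documents=["Contrails are a manmade type of cirrus cloud formed from jet engine exhaust."],
)
```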
Setting a Default Template
You can set a default template for a PromptNode instance. This way, you can reuse the same PromptNode in your pipeline for different tasks:
```python
from haystack.nodes import PromptTemplate, PromptNode
from haystack.schema import Document

prompt_node = PromptNode()
sa = prompt_node.set_default_prompt_template("sentiment-analysis")
sa(documents=[Document("I am in love and I feel great!")])

# Node output:
['positive']

# You can then switch to another template:
summarizer = sa.set_default_prompt_template("summarization")
```
Models
The default model for PromptModel and PromptNode is `google/flan-t5-base`, but you can use other LLMs. To do this, specify the model's name and the API key.
Using Another Model
You can replace the default model with a google/flan-t5 model of a different size or a model by OpenAI.
This example uses a version of the GPT-3 model:
```python
from haystack.nodes import PromptModel, PromptNode

openai_api_key = "<type your OpenAI API key>"

# Specify the model you want to use:
prompt_open_ai = PromptModel(model_name_or_path="text-davinci-003", api_key=openai_api_key)

# Make PromptNode use the model:
pn_open_ai = PromptNode(prompt_open_ai)
pn_open_ai("What's the coolest city to live in Germany?")
```
And here's the same configuration in YAML:

```yaml
components:
  - name: PromptNode
    type: PromptNode
    params:
      default_prompt_template: question-answering
      model_name_or_path: text-davinci-003
      api_key: my_openai_key
```
Using Different Models in One Pipeline
You can also specify different LLMs for each PromptNode in your pipeline.
```python
from haystack.nodes import PromptTemplate, PromptNode, PromptModel
from haystack.pipelines import Pipeline
from haystack.schema import Document

api_key = "<type your OpenAI API key>"

# Specify the model you want to use:
prompt_open_ai = PromptModel(model_name_or_path="text-davinci-003", api_key=api_key)

# This sets up the default model:
prompt_model = PromptModel()

# Now let's make one PromptNode use the default model and the other one the OpenAI model:
node_default_model = PromptNode(prompt_model, default_prompt_template="question-generation", output_variable="questions")
node_openai = PromptNode(prompt_open_ai, default_prompt_template="question-answering")

pipeline = Pipeline()
pipeline.add_node(component=node_default_model, name="prompt_node1", inputs=["Query"])
pipeline.add_node(component=node_openai, name="prompt_node_2", inputs=["prompt_node1"])

output = pipeline.run(query="not relevant", documents=[Document("Berlin is the capital of Germany")])
output["results"]
```
In YAML, you simply specify two PromptNodes, each with a different name and a different model. Bear in mind that this example is not a complete pipeline; you'd still need to create the indexing pipeline and define its components.
```yaml
components:
  - name: PromptNodeOpenAI
    type: PromptNode
    params:
      default_prompt_template: question-answering
      model_name_or_path: text-davinci-003
      api_key: my_openai_key
  - name: PromptNodeDefault
    type: PromptNode
    params:
      default_prompt_template: question-generation
      model_name_or_path: google/flan-t5-large

# And now you can put the two nodes together in the query pipeline:
pipelines:
  - name: query
    nodes:
      - name: PromptNodeDefault
        inputs: [Query]
      - name: PromptNodeOpenAI
        inputs: [PromptNodeDefault]
```