PromptNode

PromptNode is an easy-to-use, customizable node that brings you the power of large language models. You can use it in your pipelines for various NLP tasks.

PromptNode is a versatile node used in query pipelines; its position depends on what you want it to do. You pass it a prompt that specifies the NLP task PromptNode should perform and the model to use.

What are large language models?

Large language models are huge models trained on enormous amounts of data. Interacting with such a model resembles talking to another person. These models have general knowledge of the world. You can ask them anything, and they'll be able to answer.

Large language models are trained to perform many NLP tasks with little training data. What's astonishing about them is that a single model can perform various NLP tasks with good accuracy.

Some examples of large language models include Claude, Llama 2, GPT-4, and GPT-3 variants, such as gpt-3.5-turbo.

For more information, see also Large Language Models.

Basic Information

  • Pipeline type: Used in query pipelines.
  • Position in a pipeline: The position depends on the NLP task you want it to do.
  • Examples of nodes that can precede it in a pipeline: PromptNode, Retriever (a mandatory node in RAG pipelines), Ranker
  • Examples of nodes that can follow it in a pipeline: Ranker, PromptNode
  • Node input: PromptNode is very flexible and takes the output of the preceding node as its input. In RAG pipelines, this is usually Query or Documents.
  • Node output: String. You can set its key name using the output_variable parameter. See PromptNode Parameters for more details.
    • To get an Answer object as output, use AnswerParser as the output parser (see the sketch after this list). For more information, see PromptTemplate Parameters.
    • You can also modify the output using Shaper.
  • Available classes: PromptNode, CNPromptNodeAsClassifier
  • Supported models:
    • Anthropic's Claude models
    • OpenAI InstructGPT models, including gpt-3.5-turbo, gpt-4, and gpt-4o
    • Cohere's command and generation models
    • Llama 2 and Llama 3, including Llama 3.1
    • Hugging Face transformers (all text2text-generation models)
    • Models you can run remotely: see Remote Models below.
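
For example, here's a minimal sketch of the output settings described above: a PromptNode that returns Answer objects through AnswerParser and stores them under a custom key. The prompt text is a placeholder, and the rest of the pipeline is omitted:

components:
  - name: my_prompt
    type: PromptTemplate
    params:
      prompt: "Answer the question based on these documents: {join(documents)} Question: {query} Answer:"
      output_parser:
        type: AnswerParser # Converts the generated string into an Answer object
  - name: PromptNode
    type: PromptNode
    params:
      default_prompt_template: my_prompt
      output_variable: answers # The key under which PromptNode stores its results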

PromptNodeAsClassifier

This is a subclass of PromptNode that functions as a classifier. It needs a predefined set of labels to categorize the responses it generates. Based on the assigned label, answers are routed to various branches of the pipeline. Such branches are called output edges. You can then configure subsequent nodes in the pipeline to accept input only from a designated output edge.

The answer with the first label from the list goes to output_1, the answer with the second label goes to output_2, and so on. If an answer doesn't match any of the labels, it goes to the output edge numbered (the number of labels in the list) + 1. For example, if there are two labels in the list, ["good", "bad"], then an answer labeled "good" is sent to output_1, an answer labeled "bad" goes to output_2, and any answer that doesn't match these labels goes to output_3.

When using the classifier, make sure you pass it a prompt with instructions on how to classify its input, such as the query. For example:

You are a query classifier.
You determine if a question addresses past or recent events.
You can output one of the following labels: ["past", "recent"].
Query: {query}
Label: 

You could then route the questions about the "past" to a pipeline branch that searches through older documents, route "recent" questions to a branch that searches in just recent documents, and send all other questions to a branch that searches across all documents.
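
Here's a sketch of what that routing could look like in the pipeline YAML (the retriever names are placeholders for whatever branches you define):

pipelines:
  - name: query
    nodes:
      - name: Classifier # A classifier node configured with labels: [past, recent]
        inputs: [Query]
      - name: OlderDocsRetriever
        inputs: [Classifier.output_1] # Questions labeled "past"
      - name: RecentDocsRetriever
        inputs: [Classifier.output_2] # Questions labeled "recent"
      - name: AllDocsRetriever
        inputs: [Classifier.output_3] # Questions that don't match any label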

For more information, see Usage Examples.

Usage Examples

With a Custom Prompt

To use a custom prompt, declare PromptTemplate as a pipeline component and paste the prompt text into its prompt parameter. Then, pass the name of the PromptTemplate in the default_prompt_template parameter of PromptNode:

...
components:
  - name: question-answering # give this prompt any name
    type: PromptTemplate
    params:
      prompt: "You are a technical expert. \
        You answer questions truthfully based on provided documents. \
        For each document check whether it is related to the question. \
        Only use documents that are related to the question to answer it. \
        Ignore documents that are not related to the question. \
        If the answer exists in several documents, summarize them. \
        Only answer based on the documents provided. Don't make things up. \
        Always use references in the form [NUMBER OF DOCUMENT] when using information from a document. e.g. [3], for Document[3]. \
        The reference must only refer to the number that comes in square brackets after passage. \
        Otherwise, do not use brackets in your answer and reference ONLY the number of the passage without mentioning the word passage. \
        If the documents can't answer the question or you are unsure say: 'The answer can't be found in the text'. \
        {new_line}\
        These are the documents:\
        {join(documents, delimiter=new_line, pattern=new_line+'Document[$idx]:'+new_line+'$content')}\
        {new_line}\
        Question: {query}\
        {new_line}\
        Answer:\
        {new_line}"
  - name: PromptNode
    type: PromptNode
    params:
      default_prompt_template: question-answering # This tells PromptNode to use the PromptTemplate you configured above
...
pipelines:
  - name: query
    nodes:
      - name: Retriever
        inputs: [Query]
      - name: PromptNode
        inputs: [Retriever]
    ...
Here's a similar example where the custom prompt turns the classifier node into a query classifier that routes keyword queries and natural language questions to different retrievers:

...
components:
  - name: classification # give this prompt any name
    type: PromptTemplate
    params:
      prompt: "You are a query classifier. \
        You determine if a question is a keyword or a natural language question. \
        You can output one of the following labels: ['keyword', 'natural'] \
        Query: {query} \
        Label:"
  - name: Classifier
    type: CNPromptNodeAsClassifier
    params:
      default_prompt_template: classification # This tells the node to use the PromptTemplate you configured above
      labels:
        - keyword
        - natural
...

pipelines:
  - name: query
    nodes:
      - name: Classifier
        inputs: [Query]
      - name: BM25Retriever
        inputs: [Classifier.output_1] # This means all questions classified as "keyword" go to BM25Retriever
      - name: EmbeddingRetriever
        inputs: [Classifier.output_2] # This means all questions classified as "natural" go to EmbeddingRetriever
    ...

In your pipeline, you can have multiple PromptNodes, each with a different prompt. This way, each of them performs a different task. See the In a Pipeline section.

With a Model Specified

To use a proprietary model, make sure you first Connect to Model Providers. That's safer than adding the API key in the YAML.

You specify the model in the model_name_or_path parameter. Additional model parameters go into model_kwargs, for example:

components:
  - name: PromptNode
    type: PromptNode
    params:
      model_name_or_path: google/flan-t5-xl
      model_kwargs:
        temperature: 0.6

For information on using hosted models, see Using Hosted LLMs in Your Pipelines.

For more information about model parameters, see the PromptModel Parameters section.

In a Pipeline

The real power of PromptNode shows when you use it in a pipeline. Look at the examples below to get an idea of what's possible.

Conversational Summary

Here's an example of how you could use PromptNode to generate a summary of a chat transcript:



components:
  - name: DocumentStore
    type: DeepsetCloudDocumentStore # The only supported document store in deepset Cloud
  - name: Retriever # Selects the most relevant documents from the document store so that the OpenAI model can base its generation on them
    type: EmbeddingRetriever # Uses a Transformer model to encode the document and the query
    params:
      document_store: DocumentStore
      embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1 # Model optimized for semantic search
      model_format: sentence_transformers
      top_k: 1 # The number of documents
  - name: PromptNode
    type: PromptNode
    params:
      default_prompt_template: deepset/conversational-summary # This is a ready-made prompt template for summarizing transcripts. For more templates, see Prompt Studio.
      model_name_or_path: gpt-3.5-turbo
      api_key: <api_key>
  - name: FileTypeClassifier # Routes files based on their extension to appropriate converters, by default txt, pdf, md, docx, html
    type: FileTypeClassifier
  - name: TextConverter # Converts files into documents
    type: TextConverter
  - name: PDFConverter # Converts PDFs into documents
    type: PDFToTextConverter
  - name: Preprocessor # Splits documents into smaller ones and cleans them up
    type: PreProcessor
    params:
      split_by: word # The unit by which you want to split the documents
      split_length: 250 # The max number of words in a document
      split_overlap: 20 # Enables the sliding window approach
      language: en
      split_respect_sentence_boundary: True

pipelines:
  - name: query
    nodes:
      - name: Retriever
        inputs: [Query]
      - name: PromptNode
        inputs: [Retriever]
  - name: indexing
    nodes:
      - name: FileTypeClassifier
        inputs: [File]
      - name: TextConverter
        inputs: [FileTypeClassifier.output_1] # Ensures that this converter receives txt files
      - name: PDFConverter
        inputs: [FileTypeClassifier.output_2] # Ensures that this converter receives PDFs
      - name: Preprocessor
        inputs: [TextConverter, PDFConverter]
      - name: Retriever
        inputs: [Preprocessor]
      - name: DocumentStore
        inputs: [Retriever]

PromptNodeAsClassifier

Here's an example of CNPromptNodeAsClassifier used as a query classifier in a pipeline:


components:
  - name: DocumentStore
    type: DeepsetCloudDocumentStore
    params:
      embedding_dim: 768
      similarity: cosine
  - name: BM25Retriever # The keyword-based retriever
    type: BM25Retriever
    params:
      document_store: DocumentStore
      top_k: 10 # The number of results to return
  - name: EmbeddingRetriever # Selects the most relevant documents from the document store
    type: EmbeddingRetriever # Uses a Transformer model to encode the document and the query
    params:
      document_store: DocumentStore
      embedding_model: intfloat/e5-base-v2 # Model optimized for semantic search. It has been trained on 215M (question, answer) pairs from diverse sources.
      model_format: sentence_transformers
      top_k: 10 # The number of results to return
  - name: query_classification
    type: PromptTemplate
    params:
      output_parser:
        type: AnswerParser
      prompt: >
        You are a query classifier.
        Classify queries into valid queries and invalid queries.
        Assign the label ["valid"] to valid queries and the label ["error"] to invalid queries.
        Query: {query}
        Label:
  - name: LLMRouter
    type: CNPromptNodeAsClassifier
    params:
      default_prompt_template: query_classification # This tells the node to use the query_classification prompt defined above
      max_length: 400 # The maximum number of tokens the generated answer can have
      model_kwargs: # Specifies additional model settings
        temperature: 0 # Lower temperature works best for fact-based qa
      model_name_or_path: gpt-3.5-turbo
      api_key: XX
      labels: 
        - valid
        - error
  - name: JoinResults # Joins the results from both retrievers
    type: JoinDocuments
    params:
      join_mode: concatenate # Combines documents from multiple retrievers
  - name: Reranker # Uses a cross-encoder model to rerank the documents returned by the two retrievers
    type: SentenceTransformersRanker
    params:
      model_name_or_path: intfloat/simlm-msmarco-reranker # Fast model optimized for reranking
      top_k: 4 # The number of results to return
      batch_size: 20  # Try to keep this number equal to or larger than the sum of the top_k of the two retrievers so all docs are processed at once
  - name: qa_template
    type: PromptTemplate
    params:
      output_parser:
        type: AnswerParser
      prompt: >
        You are a technical expert.
        {new_line}You answer questions truthfully based on provided documents.
        {new_line}For each document check whether it is related to the question.
        {new_line}Only use documents that are related to the question to answer it.
        {new_line}Ignore documents that are not related to the question.
        {new_line}If the answer exists in several documents, summarize them.
        {new_line}Only answer based on the documents provided. Don't make things up.
        {new_line}Always use references in the form [NUMBER OF DOCUMENT] when using information from a document. e.g. [3], for Document[3].
        {new_line}The reference must only refer to the number that comes in square brackets after passage.
        {new_line}Otherwise, do not use brackets in your answer and reference ONLY the number of the passage without mentioning the word passage.
        {new_line}If the documents can't answer the question or you are unsure say: 'The answer can't be found in the text'.
        {new_line}These are the documents:
        {join(documents, delimiter=new_line, pattern=new_line+'Document[$idx]:'+new_line+'$content')}
        {new_line}Question: {query}
        {new_line}Answer:
  - name: PromptNode
    type: PromptNode
    params:
      default_prompt_template: qa_template
      max_length: 400 # The maximum number of tokens the generated answer can have
      model_kwargs: # Specifies additional model settings
        temperature: 0 # Lower temperature works best for fact-based qa
      model_name_or_path: gpt-3.5-turbo
  - name: FileTypeClassifier # Routes files based on their extension to appropriate converters, by default txt, pdf, md, docx, html
    type: FileTypeClassifier
  - name: TextConverter # Converts files into documents
    type: TextConverter
  - name: PDFConverter # Converts PDFs into documents
    type: PDFToTextConverter
  - name: Preprocessor # Splits documents into smaller ones and cleans them up
    type: PreProcessor
    params:
      # With a vector-based retriever, it's good to split your documents into smaller ones
      split_by: word # The unit by which you want to split the documents
      split_length: 250 # The max number of words in a document
      split_overlap: 20 # Enables the sliding window approach
      language: en
      split_respect_sentence_boundary: True # Retains complete sentences in split documents
  - name: error2 # This is where questions classified as "error" will be routed
    type: ReturnError
    params:
      error_message: "This is an invalid question."
  - name: error3 # This is where questions not matching any label will be routed
    type: ReturnError
    params:
      error_message: "This is an invalid question."

# Here you define how the nodes are organized in the pipelines
# For each node, specify its input
pipelines:
  - name: query
    nodes:
      - name: LLMRouter
        inputs: [Query]
      - name: BM25Retriever
        inputs: [LLMRouter.output_1] # BM25Retriever takes questions labeled as "valid"
      - name: EmbeddingRetriever
        inputs: [LLMRouter.output_1] # EmbeddingRetriever takes questions labeled as "valid"
      - name: error2
        inputs: [LLMRouter.output_2] # This is where questions labeled as "error" go
      - name: error3
        inputs: [LLMRouter.output_3] # This is where questions not matching any label go
      - name: JoinResults
        inputs: [BM25Retriever, EmbeddingRetriever]
      - name: Reranker
        inputs: [JoinResults]
      - name: PromptNode
        inputs: [Reranker]
  - name: indexing
    nodes:
    # Depending on the file type, we use a Text or PDF converter
      - name: FileTypeClassifier
        inputs: [File]
      - name: TextConverter
        inputs: [FileTypeClassifier.output_1] # Ensures that this converter receives txt files
      - name: PDFConverter
        inputs: [FileTypeClassifier.output_2] # Ensures that this converter receives PDFs
      - name: Preprocessor
        inputs: [TextConverter, PDFConverter]
      - name: EmbeddingRetriever
        inputs: [Preprocessor]
      - name: DocumentStore
        inputs: [EmbeddingRetriever]

Multiple PromptNodes Reusing a Model

You can reuse a single model instance for multiple PromptNodes in your pipeline. This saves resources. To do this, configure PromptModel as a component and then pass its name in the model_name_or_path parameter of each PromptNode instance.

...
components:
  - name: DocumentStore
    type: DeepsetCloudDocumentStore # The only supported document store in deepset Cloud
  - name: Retriever # Selects the most relevant documents from the document store so that the OpenAI model can base its generation on them
    type: EmbeddingRetriever # Uses a Transformer model to encode the document and the query
    params:
      document_store: DocumentStore
      embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1 # Model optimized for semantic search
      model_format: sentence_transformers
      top_k: 1 # The number of documents
  - name: MyPromptModel
    type: PromptModel
    params:
      model_name_or_path: text-davinci-003
      use_gpu: True
  - name: PromptNode1
    type: PromptNode
    params:
      default_prompt_template: deepset/question-generation
      model_name_or_path: MyPromptModel
      api_key: <api_key>
  - name: PromptNode2
    type: PromptNode
    params:
      default_prompt_template: deepset/question-answering-per-document
      model_name_or_path: MyPromptModel
      api_key: <api_key>
  # You would also need to configure the components for the indexing pipeline
  # We're skipping this part in this example

pipelines:
  - name: query
    nodes:
      - name: Retriever
        inputs: [Query]
      - name: PromptNode1
        inputs: [Retriever]
      - name: PromptNode2
        inputs: [PromptNode1]
# Here would come the indexing pipeline, which we're skipping in this example

Using Different Models in One Pipeline

You can also specify a different LLM for each PromptNode in your pipeline. This way, each PromptNode instance uses its own model and can perform a different task.

# In YAML, you simply specify two PromptNodes, each with a different name and a different model
# Bear in mind that this example is not a complete pipeline, you'd still need to create the indexing pipeline
# and define its components

...
components:
  - name: PromptNodeOpenAI
    type: PromptNode
    params:
      default_prompt_template: deepset/question-answering
      model_name_or_path: text-davinci-003
      api_key: <my_openai_key>
  - name: PromptNodeDefault
    type: PromptNode
    params:
      default_prompt_template: deepset/question-generation
      model_name_or_path: google/flan-t5-large
    ...

# And now you could put the two nodes together in the query pipeline:
pipelines:
  - name: query
    nodes:
      - name: PromptNodeDefault
        inputs: [Query]
      - name: PromptNodeOpenAI
        inputs: [PromptNodeDefault]
        ...
      

Chaining PromptNodes

This pipeline has two PromptNodes. The first checks the spelling of the query and corrects it if necessary. Then, it sends the corrected query to the second PromptNode to perform question answering:

...
components:
- name: spell_check
  type: PromptTemplate
  params:
    output_parser:
      type: AnswerParser
    prompt: >
      You are a spellchecker.
      {new_line}You get a question and correct it.
      {new_line}Put out only the corrected question.
      {new_line}Question: {query}
      {new_line}Corrected question:
- name: SpellChecker #This is the PromptNode that checks the spelling of the query
  type: PromptNode
  params:
    default_prompt_template: spell_check #Here you're telling this prompt node to use the spell_check prompt template
    max_length: 650
    model_kwargs:
      temperature: 0
    model_name_or_path: gpt-3.5-turbo
- name: qa_template
  type: PromptTemplate
  params:
    output_parser:
      type: AnswerParser
    prompt: >
      You are a technical expert.
      {new_line}You answer questions truthfully based on provided documents.
      {new_line}For each document check whether it is related to the question.
      {new_line}Only use documents that are related to the question to answer it.
      {new_line}Ignore documents that are not related to the question.
      {new_line}If the answer exists in several documents, summarize them.
      {new_line}Only answer based on the documents provided. Don't make things up.
      {new_line}Always use references in the form [NUMBER OF DOCUMENT] when using information from a document. e.g. [3], for Document[3].
      {new_line}The reference must only refer to the number that comes in square brackets after passage.
      {new_line}Otherwise, do not use brackets in your answer and reference ONLY the number of the passage without mentioning the word passage.
      {new_line}If the documents can't answer the question or you are unsure say: 'The answer can't be found in the text'.
      {new_line}These are the documents:
      {join(documents, delimiter=new_line, pattern=new_line+'Document[$idx]:'+new_line+'$content')}
      {new_line}Question: {query}
      {new_line}Answer:
- name: PromptNode #This is the PromptNode that performs question answering using the corrected query
  type: PromptNode
  params:
    default_prompt_template: qa_template
    max_length: 650
    model_kwargs:
      temperature: 0
    model_name_or_path: gpt-3.5-turbo
- name: EmbeddingRetriever
  type: EmbeddingRetriever
  params:
    document_store: DocumentStore
    embedding_model: intfloat/multilingual-e5-large
    model_format: sentence_transformers
    top_k: 5
    scale_score: false
    ...

pipelines:
  - name: query
    nodes:
      - name: SpellChecker
        inputs: [Query]
      - name: EmbeddingRetriever
        inputs: [SpellChecker]
      - name: PromptNode
        inputs: [EmbeddingRetriever]
        ...

For more information, see also Pipeline Examples.

Models

You can pass any of the supported models in the model_name_or_path parameter. deepset Cloud downloads the model and runs it for you. Additional model parameters, such as temperature, go into model_kwargs, for example:

...
components:
  - name: PromptNode
    type: PromptNode
    params:
      model_name_or_path: google/flan-t5-xl
      model_kwargs:
        temperature: 0.6
...

For a list of parameters you can pass in model_kwargs, see PromptModel Parameters below.

Remote Models

You can use LLMs hosted by model providers in your pipelines. The way you do it depends on the provider you want to use. For details, see Using Hosted LLMs in Your Pipelines.

Prompt Templates

PromptTemplates work a bit like pipeline components. To use a custom prompt, you declare it like a pipeline component and then you pass its name to PromptNode. See the Usage Example: With a Custom Prompt section.

deepset Cloud comes with a library of out-of-the-box prompt templates ready for you to use for multiple NLP tasks. Each template contains the prompt, which is the instruction for the model. You can browse the available templates in Prompt Studio by clicking Templates. You can also save your custom prompts there.

PromptTemplate Structure

You can create your own prompt template and save it in the library of Prompt Studio for future use. If you create a template in Prompt Studio, it guides you through the structure it should have. But you can also create a prompt directly in your pipeline. If you do this, follow this structure:

components:
  - name: my_prompt
    type: PromptTemplate
    params:
      prompt: "Here comes the prompt text"
 
  • prompt contains the prompt for the task you want the model to perform. It also specifies input variables in curly brackets. The variables can be documents, query, or labels.
    At runtime, these variables must be present in the execution context of the node. You can apply functions to those variables. For example, you can combine the list of documents into a string by applying the join function, so that only one prompt is executed instead of one prompt per document.
    You can use \n, \t, and \r characters in the prompt text.
  • output_parser converts the output to an Answer object. If you need PromptNode to output an Answer object, set this parameter to AnswerParser(). AnswerParser adds the document_ids of the documents used to generate the answer and the prompts used to the Answer object. You can pass a reference_pattern to extract the document_ids of the answer from the model output. For details, see PromptTemplate Parameters below.
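
Putting these pieces together, here's a sketch of a template whose prompt uses the documents and query variables and applies the join function so that only one prompt is executed for all documents (the wording of the prompt is just an example):

components:
  - name: my_prompt
    type: PromptTemplate
    params:
      prompt: >
        Answer the question using only these documents:
        {join(documents)}
        {new_line}Question: {query}
        {new_line}Answer: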

Functions in Prompts

You can add the join and to_strings functions to your template to control how the documents, query, or any other variables are rendered in the prompt. Functions are useful if you want to, for example, make the model reference a document ID or metadata in the output.

Functions use the Python f-string format, and you can use any list comprehension inside a function. You can't use \n, \t, or \r inside a function. Also, double quotes (") are automatically replaced with single quotes (') in the function. Here are the variables you can use in functions instead of these special characters:

  • \n: new_line
  • \t: tab
  • \r: carriage_return
  • ": double_quote

For detailed descriptions of the parameters you can use in a function, see the Function Parameters section.

The join() Function

The join function joins all documents into a single string, with the content of each document separated by the delimiter you specify. Here's a simple example:

- name: summary-custom
  type: PromptTemplate
  params:
    prompt: >
      Summarize this document: {join(documents)} \n\n
      If the document is not there, answer with: "I can't answer the question based on the information provided." \n\n
      Summary:

The function here joins all documents into a single string. This way, PromptNode only sends one prompt, which is faster. If you don't join the documents, PromptNode sends one prompt for each document.

This example joins all documents in a single string, but it also formats the resulting string so that the output is the meta name of each document followed by its contents:

Context: {' '.join([d.meta['name']+': '+d.content for d in documents])}

Here's what the resulting string would look like:

DocumentMetaName1: DocumentContent1 DocumentMetaName2: DocumentContent2 # and so on, for each document 

Now, you could tell the model in the prompt to print answers together with the meta name of the document containing the answer.

The to_strings() Function

This function extracts the content field of documents and returns a list of strings. In the example below, it renders each document by its name (document.meta["name"]) followed by a new line and the contents of the document:

{to_strings(documents, pattern='$name'+new_line+'$content', str_replace={new_line: ' ', '[': '(', ']': ')'})}

Parameters

PromptNode Parameters

Use these parameters to configure the PromptNode in the pipeline YAML:

model_name_or_path | String | Default: google/flan-t5-base
The name of the model you want to use with the PromptNode or the path to a locally stored model.
To use an OpenAI model through the Microsoft Azure service, pass the api_version, azure_base_url, and azure_deployment_name parameters in model_kwargs. For more information, see the model_kwargs argument description.
For a list of supported models, see the PromptNode documentation.
Mandatory.

default_prompt_template | String | Name of an out-of-the-box template or the name of your custom template
The prompt template you want to use with the PromptNode. The template contains instructions for the model. If you don't specify it, the model tries to guess what task you want it to do based on your query. For best results, we recommend specifying the template.
Optional.

output_variable | String | Default: results
The name of the output variable where you want to store the inference results. If not specified, PromptNode uses the output_variable from the PromptTemplate. If PromptTemplate's output_variable is not set, the default name is results.
Optional.

max_length | Integer | Default: 100
The maximum number of tokens of the text output the PromptNode generates.
The length of the prompt and the length of the generated output must not be larger than the number of tokens the LLM can process.
Optional.

api_key | String | Default: None
The API key for the model provider, like OpenAI or Cohere.
Optional.

use_auth_token | String | Default: None
The Hugging Face authentication token for your private model.
Optional.

use_gpu | Boolean | True, False; Default: None
Specifies if you want to use GPU when running PromptNode.
Optional.

devices | List of strings | Example: [torch.device('cuda:0'), "mps", "cuda:1"]
The list of torch devices to use.
Optional.

stop_words | List of strings | Default: None
If the PromptNode encounters any of the words you specify here, it stops generating text.
Optional.

top_k | Integer | Default: 1
The number of answers (generated texts) you want PromptNode to return.
Mandatory.

debug | Boolean | True, False (default)
Enables debug output.
Optional.

model_kwargs | Dictionary | Default: None
Any additional keyword arguments you want to pass to the model, for example: model_kwargs: temperature: 0.7.
To use a remote OpenAI model through Microsoft Azure, pass api_version, azure_base_url, and azure_deployment_name in model_kwargs. You can find the azure_base_url parameter in the Keys and Endpoint tab of your Azure account. You choose the azure_deployment_name when you deploy a model through your Azure account in the Model deployments tab. For available models and versions for the service, check the Azure documentation.
Here are the parameters you can specify:
- For OpenAI models: max_tokens, temperature, top_p, n, stop, presence_penalty, frequency_penalty, logit_bias, response_format, seed. See the OpenAI documentation for these parameters.
- For Azure OpenAI models: max_tokens, temperature, top_p, n, stop, presence_penalty, frequency_penalty, logit_bias. See the Azure OpenAI documentation for these parameters.
- For Cohere models: end_sequences, frequency_penalty, k, max_tokens, model, num_generations, p, presence_penalty, return_likelihoods, temperature, and truncate. See the Cohere documentation for these parameters.
- For AWS Bedrock models:
  - Anthropic Claude 2: all parameters listed in the Anthropic documentation except for prompt and stream.
  - Anthropic Claude 3: all parameters listed in the Anthropic documentation except for prompt, stream, and tools.
  - Cohere Command: all parameters listed in the Cohere documentation except for stream and prompt.
  - Meta Llama 2: deepset Cloud supports all parameters documented in the AWS documentation.
  - Amazon Titan: all parameters listed in the Amazon documentation under textGenerationConfig.
Optional.

truncate | Boolean | True, False; Default: True
Truncates the prompt to the maximum token limit before sending it to the model.
Mandatory.

timeout | Float | Default: None
Sets the timeout for PromptNode.
Optional.
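
For example, here's a sketch of a PromptNode set up to call an OpenAI model through Azure. The model name is only an example, and the API version, endpoint URL, and deployment name are placeholders you take from your own Azure account:

components:
  - name: PromptNode
    type: PromptNode
    params:
      model_name_or_path: gpt-3.5-turbo
      api_key: <azure_openai_api_key>
      model_kwargs:
        api_version: <api_version>
        azure_base_url: https://<your-endpoint>.openai.azure.com
        azure_deployment_name: <your_deployment_name>
        temperature: 0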

CNPromptNodeAsClassifier Parameters

In pipeline YAML, CNPromptNodeAsClassifier takes the same parameters as PromptNode and an additional labels parameter:

labels | List of strings | A list of labels
List of labels used for classification. Each label is routed to a different output edge.
Required.

PromptTemplate Parameters

These are the parameters you can use when creating your own PromptTemplate through pipeline YAML:

prompt | String
The instructions for the model.
It can contain variables in the f-string syntax, for example: {documents}. These variables need to be filled in the prompt_text for the model to perform the task. Note that other than strict f-string syntax, you can safely use the following backslash characters in the text parts of the prompt text: \n, \t, \r. If you want to use them in f-string expressions, use new_line, tab, and carriage_return instead. Double quotes (") are automatically replaced with single quotes (') in the prompt text. To use double quotes in the prompt text, use {double_quote} instead.
Mandatory.

output_parser | String | AnswerParser()
Applies a parser that converts the model output to an Answer object. If you want to use it, always set it to AnswerParser().
AnswerParser parses the model output to extract the answer into a proper Answer object using regex patterns. It adds the document_ids of the documents used to generate the answer and the prompts used to the Answer object.
It takes reference_pattern as an argument: the regex pattern used to extract the document_ids of the answer from the model output.
Examples: [^\\n]+$ finds "this is an answer" in the string "this is an argument.\nthis is an answer".
Answer: (.*) finds "this is an answer" in the string "this is an argument. Answer: this is an answer".
You can also use it to add references to the answers in your RAG pipelines. For example, the reference pattern \[(?:(\\d+),?\\s*)+\] matches references in the formats: [1], [2][3], [4,5], [6, 7]. We've added the abbreviation acm for this pattern, so to show references like [1], [2], [3], set reference_pattern: acm.
If not specified, the whole string is used as the answer. If specified, the first group of the regex is used as the answer. If there is no group, the whole match is used as the answer.

Optional.
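
For example, here's a sketch of a template that returns Answer objects and extracts references like [1] or [2, 3] from the generated text. It assumes reference_pattern is passed to the parser through its params, following the same type/params convention as other components, and the prompt itself is omitted:

components:
  - name: qa-with-references
    type: PromptTemplate
    params:
      prompt: "..."
      output_parser:
        type: AnswerParser
        params:
          reference_pattern: acm # Matches references such as [1], [2][3], [4, 5]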

Function Parameters

These are the parameters you can use in prompt template functions:

documents | List
The documents whose rendering you want to format.
Mandatory.

pattern | String
The regex pattern used for parsing. You can use the following placeholders in pattern:
- $content: The content of the document.
- $idx: The index of the document in the list.
- $id: The ID of the document.
- $META_FIELD: The values of the metadata field called META_FIELD. Only fields that match Python's string Template pattern are supported, that is: [_a-z][_a-z0-9]*.
Optional.

delimiter | String | Default: " " (single space)
Specifies the delimiter you want to use to separate documents. Used in the join function.
Mandatory.

str_replace | Dictionary of strings
Specifies the characters you want to replace. Use the format str_replace={"r": "R"}.
Optional.
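
For example, here's a sketch of a join call that combines these parameters; the title meta field is a placeholder for a metadata field that exists in your documents:

{join(documents, delimiter=new_line, pattern='Document[$idx] ($title): $content', str_replace={new_line: ' '})}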

PromptModel Parameters

These are the parameters you can use when customizing the prompt model in pipeline YAML:

model_name_or_path | String | Default: google/flan-t5-base
The name of the model or the path to a locally stored model. You can use the following models:
- Hugging Face transformers (all text2text-generation and text-generation models)
- OpenAI InstructGPT models
- Azure OpenAI InstructGPT models
- Cohere models
You can also use a remote model hosted on AWS SageMaker or an OpenAI model through Microsoft Azure.
To use a model hosted on SageMaker, contact your deepset Cloud representative.
To use a model through Azure, pass api_version, azure_deployment_name, and azure_base_url in model_kwargs. You can find the azure_base_url parameter in the Keys and Endpoint tab of your Azure account. You choose the azure_deployment_name when you deploy a model through your Azure account in the Model deployments tab. For available models and versions for the service, check the Azure documentation.
See also the model_kwargs description below.
To use gpt-4o as a visual QA model, set the detail parameter through model_kwargs. Possible values are auto, high, and low.
Mandatory.

max_length | Integer | Default: 100
The maximum length of the output text the model generates.
Optional.

api_key | String | Default: None
The API key to use for the model.
Optional.

use_auth_token | Union | Default: None
If the model is hosted on Hugging Face, this is the token to use for the model.
Optional.

use_gpu | Boolean | Default: None
Uses GPU if available.
Optional.

devices | String | A list of GPU devices; Example: [torch.device('cuda:0'), "mps", "cuda:1"]
Contains a list of GPU devices to limit inference to certain GPUs and not use all available GPUs. As multi-GPU training is currently not implemented for DPR, training only uses the first device provided in this list.
Optional.

invocation_layer_class | PromptModelInvocationLayer | Default: None
The custom invocation layer class to use. If you created your own invocation layer, you can pass it here.
If you don't specify it, it uses one of the known invocation layers (anthropic_claude, azure_chatgpt, azure_open_ai, chatgpt, cohere, hugging_face, hugging_face_inference, open_ai, sagemaker).
Optional.

model_kwargs | Dictionary | Default: None
Additional parameters passed to the model, such as temperature. Model parameters are specific to the model.
To use a remote OpenAI model through the Microsoft Azure service, pass these two parameters in model_kwargs: azure_base_url (the URL for the Azure OpenAI API endpoint, usually in the form https://<your-endpoint>.openai.azure.com; you can find it in the Keys and Endpoint tab of your Azure account) and azure_deployment_name (the name of the Azure OpenAI API deployment you specify when deploying a model through Azure in the Model deployments tab).
- For OpenAI models, you can specify the following parameters: max_tokens, temperature, top_p, n, stop, presence_penalty, frequency_penalty, logit_bias, response_format, seed. See the OpenAI documentation for these parameters.
- For Azure OpenAI models, you can specify the following parameters: max_tokens, temperature, top_p, n, stop, presence_penalty, frequency_penalty, logit_bias. See the Azure OpenAI documentation for these parameters.
- For Cohere models, you can specify the following parameters at runtime: end_sequences, frequency_penalty, k, max_tokens, model, num_generations, p, presence_penalty, return_likelihoods, temperature, and truncate. See the Cohere documentation for these parameters.
- For AWS Bedrock models:
  - Anthropic Claude 2: you can specify all parameters listed in the Anthropic documentation except for prompt and stream.
  - Anthropic Claude 3: you can specify all parameters listed in the Anthropic documentation except for prompt, stream, and tools.
  - Cohere Command: you can specify all parameters listed in the Cohere documentation except for stream and prompt.
  - Meta Llama 2: deepset Cloud supports all parameters documented in the AWS documentation.
  - Amazon Titan: you can specify all parameters listed in the Amazon documentation under textGenerationConfig.