Shaper
Shaper is most often used with PromptNode to make sure that the input PromptNode receives and the output it returns match the expected format. You can also use it on its own.
Shaper comes with ready-to-use functions that you choose when initializing the node. These functions act on values, renaming them or changing their type, for example from a list to a string. Shaper functions come in handy when you want to use PromptNode in a pipeline. For more information, see PromptNode and Shaper.
Basic Information
- Pipeline Type: Used in query pipelines with a PromptNode
- Position in a pipeline: Before, after, or in-between PromptNodes
- Input: Depends on the function used.
- Output: Depends on the function used.
- Available Classes: Shaper
Usage
To initialize Shaper, specify the function you want it to use. In this example, Shaper takes the value query and creates a list that contains this value as many times as it takes to match the length of the documents list. This list is then passed down the pipeline under the name questions:
components:
  - name: mapper
    type: Shaper
    params:
      func: value_to_list
      inputs:
        value: query
        target_list: documents
      outputs:
        - questions
from haystack.nodes import Shaper

mapper = Shaper(
    func="value_to_list",
    inputs={"value": "query", "target_list": "documents"},
    outputs=["questions"],
)
For more information about functions, see the Functions section.
Arguments
These are the parameters you can specify for Shaper:
Parameter | Type | Possible Values | Description |
---|---|---|---|
func | String | rename, value_to_list, join_strings, join_documents, join_lists, strings_to_answers, answers_to_strings, strings_to_documents, documents_to_strings | The function you want to use with Shaper. For more information, see the Functions section. Mandatory. |
outputs | List of strings | | The keys under which to store the outputs of the Shaper's function. The length of outputs must match the number of outputs produced by the function you specified for the Shaper. Mandatory. |
inputs | Dictionary | | Maps the function's input keyword arguments to the key-value pairs in the invocation context. For example, the value_to_list function expects two inputs: value and target_list, so inputs for this function could be: {"value": "query", "target_list": "documents"}. Optional. |
params | Dictionary | | Maps the function's input keyword arguments to fixed values. For example, the value_to_list function expects value and target_list parameters, so params might be {"value": "A", "target_list": [1, 1, 1, 1]}. The node's output would be: ["A", "A", "A", "A"]. Optional. |
publish_outputs | Union of Boolean and list of strings | Default: True | Publishes Shaper's outputs to the pipeline's output. True publishes all outputs. False doesn't publish any output. Mandatory. |
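
The difference between inputs and params can be illustrated with a small plain-Python sketch. It does not use Haystack: value_to_list here is a stand-in written from the description in the table, resolve_arguments is a hypothetical helper, and letting params override inputs when both name the same argument is an assumption made only for this illustration.

```python
from typing import Any, Dict, List, Optional


def value_to_list(value: Any, target_list: List[Any]) -> List[Any]:
    # Stand-in for the Shaper function of the same name:
    # repeat `value` once per item in `target_list`.
    return [value] * len(target_list)


def resolve_arguments(
    invocation_context: Dict[str, Any],
    inputs: Optional[Dict[str, str]] = None,
    params: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a Shaper-style function (hypothetical helper)."""
    kwargs: Dict[str, Any] = {}
    # `inputs` maps argument names to keys that are looked up in the context.
    for argument, context_key in (inputs or {}).items():
        kwargs[argument] = invocation_context[context_key]
    # `params` maps argument names directly to fixed values
    # (assumption: they win if the same argument appears in both).
    kwargs.update(params or {})
    return kwargs


context = {"query": "What is the capital?", "documents": ["doc 1", "doc 2"]}

from_inputs = value_to_list(
    **resolve_arguments(context, inputs={"value": "query", "target_list": "documents"})
)
from_params = value_to_list(
    **resolve_arguments(context, params={"value": "A", "target_list": [1, 1, 1, 1]})
)
print(from_inputs)
print(from_params)
```

With inputs, the arguments come from whatever the pipeline currently holds under query and documents. With params, they are the literal values written in the configuration.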
Functions
To understand Shaper functions and how to use them, you first need to understand how PromptNode and Shaper work together. See PromptNode documentation.
These are the functions you can specify when initializing Shaper:
rename
Renames a value without changing it.
- Input: Any
- Output: The same as input
- Example: May come in handy if you use PromptNode at the beginning of the pipeline, where its input is the query. deepset Cloud pipelines accept query as the input, while PromptNode needs question.
  - name: shaper
    type: Shaper
    params:
      func: rename
      inputs:
        value: query
      outputs: [question]
value_to_list
Use this function to turn a value into a list. The value is repeated in the list to match the length of a target list. For example, if the target list has five items, the value is repeated in the new list five times.
- Input: Any
- Output: List
- Example: If your PromptTemplate has two parameters, question and documents, and you want the question to be processed against each document, use Shaper with the value_to_list function. It creates a list in which the question is repeated as many times as there are documents. PromptNode then processes the two lists pairwise, one question against one document at a time. See also PromptNode documentation.
  - name: QuestionsShaper
    type: Shaper
    params:
      func: value_to_list
      inputs:
        value: query
        target_list: documents
      outputs:
        - questions
join_strings
Takes a list of strings and changes it into a single string. The string contains all the original strings separated by the specified delimiter.
- Input: List of strings
- Output: String
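As a rough illustration of the behavior described for join_strings, here is a plain-Python stand-in. It is not the Haystack implementation, and the parameter name delimiter and its default value are assumptions.

```python
from typing import List


def join_strings(strings: List[str], delimiter: str = " ") -> str:
    # Illustrative stand-in: concatenate the strings,
    # placing `delimiter` between neighbouring items.
    return delimiter.join(strings)


joined = join_strings(["first", "second", "third"], delimiter=" | ")
print(joined)  # first | second | third
```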
join_documents
Takes a list of documents and changes it into a list containing a single document. The new document contains the content of all the original documents, separated by the specified delimiter.
- Input: List of documents
- Output: List containing a single document
- Example: If you have a pipeline with PromptNode and a PromptTemplate with two parameters, for example question and documents, you can merge the documents into one to make sure PromptNode runs the question against all documents at once:
  - name: joinDocs
    type: Shaper
    params:
      func: join_documents
      inputs:
        documents: documents
      outputs:
        - documents
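The merge that join_documents performs can be sketched in plain Python. The Doc class below is a hypothetical stand-in that models only a document's text, and the delimiter parameter name is an assumption. This is not the library's code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    # Hypothetical stand-in for a document; only the text content is modelled.
    content: str


def join_documents(documents: List[Doc], delimiter: str = " ") -> List[Doc]:
    # Merge the content of all documents into one document and return it
    # wrapped in a list, so downstream nodes still receive a list.
    return [Doc(content=delimiter.join(doc.content for doc in documents))]


docs = [Doc("The capital of France is Paris"), Doc("The capital of Germany is Berlin")]
merged = join_documents(docs, delimiter=" ")
print(len(merged), merged[0].content)
```

Returning a one-item list instead of a bare document keeps the value's shape the same before and after the merge, which is why the example configuration can store the result under the same documents key.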
join_lists
Joins multiple lists into a single list.
- Input: List of lists
- Output: List
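A minimal plain-Python sketch of what joining lists means here, assuming the function flattens exactly one level and keeps the original order (the parameter name lists is an assumption):

```python
from typing import Any, List


def join_lists(lists: List[List[Any]]) -> List[Any]:
    # Flatten one level: items keep the order in which they appear.
    return [item for sublist in lists for item in sublist]


flat = join_lists([[1, 2], [3], [4, 5]])
print(flat)  # [1, 2, 3, 4, 5]
```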
strings_to_answers
Transforms a list of strings into a list of Answers.
- Input: List of strings
- Output: List of answers
- Example: This function may come in handy if PromptNode is the last node in a pipeline. The output of the PromptNode is a string, while deepset Cloud pipelines expect the Answer object. You may then add a Shaper with the strings_to_answers option at the end of the pipeline after PromptNode.
  - name: OutputAnswerShaper
    type: Shaper
    params:
      func: strings_to_answers
      inputs:
        strings: results  # the results PromptNode returns
      outputs:
        - answers
answers_to_strings
Extracts the answer field of each Answer and returns the values as a list of strings.
- Input: List of Answers
- Output: List of strings
- Example:
  - name: AnswerShaper
    type: Shaper
    params:
      func: answers_to_strings
      inputs:
        answers: answers
      outputs:
        - strings
strings_to_documents
Changes a list of strings into a list of documents.
- Input: List of strings
- Output: List of documents
documents_to_strings
Extracts the content field of each document you pass to it and puts it in a list of strings. Each item in this list is the value of the content field of one document.
- Input: String (a single document) or a list of strings (a list of documents)
- Output: List of strings
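The four conversion functions above (strings_to_answers, answers_to_strings, strings_to_documents, documents_to_strings) all wrap strings in objects or unwrap them again. The sketch below shows that idea in plain Python. TextHolder is a hypothetical stand-in for both Answer and Document objects, its text field is a placeholder name, and none of this is the library's implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextHolder:
    # Hypothetical stand-in for an Answer or a Document;
    # `text` is a placeholder for the field that carries the string.
    text: str


def strings_to_objects(strings: List[str]) -> List[TextHolder]:
    # Same idea as strings_to_answers / strings_to_documents:
    # wrap every string in its own object.
    return [TextHolder(text=s) for s in strings]


def objects_to_strings(objects: List[TextHolder]) -> List[str]:
    # Same idea as answers_to_strings / documents_to_strings:
    # pull the string back out of every object.
    return [obj.text for obj in objects]


results = ["Paris", "Berlin"]
wrapped = strings_to_objects(results)
unwrapped = objects_to_strings(wrapped)
print(wrapped)
print(unwrapped)
```

In this simplified model the round trip returns the original strings unchanged. Real Answer and Document objects carry more fields than the text alone.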
After performing a function, Shaper passes the new or modified values further down the pipeline.
Shaper and PromptNode
When used with PromptNode, Shaper acts as a PromptNode helper. Let's recall how PromptNode works:
- PromptNode uses PromptTemplate containing the prompt, or instruction, for the large language model.
- PromptTemplate contains variables that are substituted with real values when PromptNode runs.
When used in a pipeline, PromptNode receives these variables from the preceding node. It may happen that the variable names or shapes the PromptTemplate expects differ from the ones the PromptNode receives. That's when Shaper comes in and resolves this issue.
You can also use Shaper in a reverse situation. If the output of a PromptNode differs from the format the next node in the pipeline expects, Shaper can change it.
See also PromptNode documentation.
Example
Let's look at a typical example: question answering. The PromptTemplate named question-answering expects the input variables $questions and $documents:
Given the context please answer the question. Context: $documents; Question: $questions; Answer:
To pass a question forward, pipelines use the variable query, not questions. To make PromptNode generate one answer for each of the retrieved documents, you need to pass to the PromptTemplate one question and one document at a time.
You want the Shaper to:
- Rename query to questions.
- Expand questions="your query" to questions=["your query", ..., "your query"] (a list of the same length as documents).
This is how you configure the Shaper to do the renaming and expansion:
from haystack.pipelines import Pipeline
from haystack.nodes import PromptNode, Shaper
from haystack.schema import Document

# Shaper expands the `query` variable into a list of identical queries
# (one per document) and stores that list in the `questions` variable,
# which is the variable the question-answering template uses.
shaper = Shaper(
    func="value_to_list",
    inputs={"value": "query", "target_list": "documents"},
    outputs=["questions"],
)
node = PromptNode(default_prompt_template="question-answering")

pipe = Pipeline()
pipe.add_node(component=shaper, name="shaper", inputs=["Query"])
pipe.add_node(component=node, name="prompt_node", inputs=["shaper"])

output = pipe.run(
    query="Which city is the capital city?",
    documents=[
        Document("The capital of France is Paris"),
        Document("The capital of Germany is Berlin"),
    ],
)
print(output["results"])
components:
  - name: shaper
    params:
      func: value_to_list
      inputs:
        target_list: documents
        value: query
      outputs:
        - questions
    type: Shaper
  - name: prompt_node
    params:
      default_prompt_template: question-answering
    type: PromptNode
pipelines:
  - name: query
    nodes:
      - inputs:
          - Query
        name: shaper
      - inputs:
          - shaper
        name: prompt_node
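
To make the one-question-per-document pairing in this example concrete, here is a sketch that runs without Haystack or a language model. It uses Python's string.Template as a stand-in for PromptTemplate (both use $name placeholders), and build_prompts is a hypothetical helper. How PromptNode pairs the lists internally is assumed from the description in this section.

```python
from string import Template
from typing import List

# The question-answering prompt text quoted in this section.
PROMPT = Template(
    "Given the context please answer the question. "
    "Context: $documents; Question: $questions; Answer:"
)


def build_prompts(query: str, documents: List[str]) -> List[str]:
    # Step 1 (what Shaper does): expand the single query so there is
    # one copy per document.
    questions = [query] * len(documents)
    # Step 2 (assumed PromptNode behavior): pair the i-th question with
    # the i-th document and fill the template once per pair.
    return [
        PROMPT.substitute(questions=question, documents=document)
        for question, document in zip(questions, documents)
    ]


prompts = build_prompts(
    "Which city is the capital city?",
    ["The capital of France is Paris", "The capital of Germany is Berlin"],
)
for prompt in prompts:
    print(prompt)
```

Two documents produce two filled-in prompts, so the model would be asked the same question once per document.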