Shaper

Shaper is most often used with PromptNode to make sure the PromptNode's input and output match the expected format. You can also use it on its own.

Shaper comes with ready-to-use functions that you choose when initializing the node. These functions act on values, renaming them or changing their type, for example, from a list to a string. Shaper functions come in handy when you want to use PromptNode in a pipeline. For more information, see the Shaper and PromptNode section.

Basic Information

  • Pipeline Type: Used in query pipelines with a PromptNode
  • Position in a pipeline: Before, after, or in between PromptNodes
  • Input: Depends on the function used.
  • Output: Depends on the function used.
  • Available Classes: Shaper

Usage

To initialize Shaper, specify the function you want it to use. In this example, Shaper takes the value of query and creates a list that contains this value as many times as it takes to match the length of the documents list. This list is then passed down the pipeline under the name questions:

components:
 - name: mapper
   type: Shaper
   params:
     func: value_to_list
     inputs:
       value: query
       target_list: documents
     outputs:
      - questions

from haystack.nodes import Shaper

mapper = Shaper(
    func="value_to_list",
    inputs={"value": "query", "target_list": "documents"},
    outputs=["questions"],
)

For more information about functions, see the Functions section.

Arguments

These are the parameters you can specify for Shaper:

func
  • Type: String
  • Possible values: rename, value_to_list, join_strings, join_documents, join_lists, strings_to_answers, answers_to_strings, strings_to_documents, documents_to_strings
  • Description: The function you want to use with Shaper. For more information, see the Functions section.
  • Mandatory.

outputs
  • Type: List of strings
  • Description: The keys under which to store the outputs of the Shaper's function. The length of outputs must match the number of outputs produced by the function you specified for the Shaper.
  • Mandatory.

inputs
  • Type: Dictionary
  • Description: Maps the function's input keyword arguments to the key-value pairs in the invocation context. For example, the value_to_list function expects two inputs, value and target_list, so inputs for this function could be: {"value": "query", "target_list": "documents"}.
  • Optional.

params
  • Type: Dictionary
  • Description: Maps the function's input keyword arguments to fixed values. For example, the value_to_list function expects the value and target_list parameters, so params might be {"value": "A", "target_list": [1, 1, 1, 1]}. The node's output would be: ["A", "A", "A", "A"].
  • Optional.

publish_outputs
  • Type: Union of Boolean and List of Strings
  • Default: True
  • Description: Publishes Shaper's outputs to the pipeline's output. True publishes all outputs. False doesn't publish any output.
  • Optional.
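The difference between inputs and params may be easier to see in code. The sketch below is plain Python, not Haystack code: apply_shaper and the context dictionary are hypothetical stand-ins. It assumes that each inputs entry is looked up in the invocation context, that each params entry is used as is, and that the result is stored under the name given in outputs.

```python
from typing import Any, Callable, Dict, List, Optional


def value_to_list(value: Any, target_list: List[Any]) -> List[Any]:
    """Illustrative stand-in for the Shaper function of the same name."""
    return [value] * len(target_list)


def apply_shaper(
    func: Callable[..., Any],
    outputs: List[str],
    invocation_context: Dict[str, Any],
    inputs: Optional[Dict[str, str]] = None,
    params: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: resolve the arguments, call func, store the result."""
    kwargs: Dict[str, Any] = {}
    # inputs: argument name -> key to look up in the invocation context
    for arg_name, context_key in (inputs or {}).items():
        kwargs[arg_name] = invocation_context[context_key]
    # params: argument name -> fixed value, used as is
    kwargs.update(params or {})
    result = func(**kwargs)
    # This sketch only handles functions with a single output.
    updated = dict(invocation_context)
    updated[outputs[0]] = result
    return updated


context = {"query": "What is the capital?", "documents": ["doc 1", "doc 2"]}

# Both arguments are read from the invocation context.
from_inputs = apply_shaper(
    value_to_list,
    outputs=["questions"],
    invocation_context=context,
    inputs={"value": "query", "target_list": "documents"},
)
print(from_inputs["questions"])

# Both arguments are fixed values.
from_params = apply_shaper(
    value_to_list,
    outputs=["questions"],
    invocation_context=context,
    params={"value": "A", "target_list": [1, 1, 1, 1]},
)
print(from_params["questions"])
```

With inputs, the result changes from query to query because the values come from the running pipeline. With params, the result is the same on every run.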

Functions

To understand Shaper functions and how to use them, you first need to understand how PromptNode and Shaper work together. See PromptNode documentation.

These are the functions you can specify when initializing Shaper:

rename

Renames a value without changing it.

  • Input: Any
  • Output: The same as input
  • Example: This function may come in handy if you use PromptNode at the beginning of the pipeline, where its input is the query. deepset Cloud pipelines accept query as the input, while PromptNode needs question.
    - name: shaper
      type: Shaper
      params:
        func: rename
        inputs:
          value: query
        outputs: [question]
    
value_to_list

Use this function to turn a value into a list. The value is repeated in the new list as many times as there are items in the target list. For example, if the target list has five items, the value is repeated five times.

  • Input: Any
  • Output: List
  • Example: If your PromptTemplate has two parameters, question and documents, and you want the question to be processed against each document, use Shaper with the value_to_list function. It creates a list in which the question is repeated as many times as there are documents. PromptNode then pairs the items of the two lists one by one: the first question with the first document, the second question with the second document, and so on. See also PromptNode documentation.
    - name: QuestionsShaper
      type: Shaper
      params:
        func: value_to_list
        inputs:
          value: query
          target_list: documents
        outputs:
          - questions
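As a rough mental model, the repetition can be written in one line of Python. The function below is a stand-in written for this page, not the Haystack implementation. It assumes that only the length of target_list matters, not what the list contains.

```python
from typing import Any, List


def value_to_list(value: Any, target_list: List[Any]) -> List[Any]:
    """Repeat value once for every item in target_list (illustrative stand-in)."""
    return [value] * len(target_list)


documents = ["Paris is in France.", "Berlin is in Germany.", "Rome is in Italy."]
questions = value_to_list("Which city is the capital?", documents)

# One copy of the question per document.
print(len(questions))

# Only the length counts: a one-item list gives one copy, whatever the item is.
single = value_to_list("A", [5])
print(single)
```

Under this assumption, a fixed target list must have as many items as you want copies, which is why passing the documents list itself is usually the simpler choice.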
    
join_strings

Takes a list of strings and changes it into a single string. The string contains all the original strings separated by the specified delimiter.

  • Input: List of strings
  • Output: String
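For plain strings, the effect is comparable to Python's str.join. The sketch below is a stand-in for illustration, not the Haystack implementation, and the parameter name delimiter and its default value are assumptions made here.

```python
from typing import List


def join_strings(strings: List[str], delimiter: str = " ") -> str:
    """Concatenate strings with delimiter between them (illustrative stand-in)."""
    return delimiter.join(strings)


parts = ["first", "second", "third"]
joined = join_strings(parts, delimiter=" | ")
print(joined)
```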

join_documents

Takes a list of documents and changes it into a list containing a single document. The single document contains the content of all the original documents separated by the specified delimiter.

  • Input: List of documents
  • Output: List containing a single document
  • Example: Say you have a pipeline with PromptNode and a PromptTemplate with two parameters, for example, question and documents. To make sure PromptNode runs the question against all documents, you can merge the documents into one:
    - name: joinDocs
      type: Shaper
      params:
        func: join_documents
        inputs:
          documents: documents
        outputs:
          - documents
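The merge can be pictured with a few lines of plain Python. Doc is a hypothetical stand-in for a document that models only the content field, and join_documents here is an illustrative function, not the Haystack implementation. The delimiter parameter name is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    """Hypothetical stand-in for a document; only the content field is modeled."""
    content: str


def join_documents(documents: List[Doc], delimiter: str = " ") -> List[Doc]:
    """Merge all contents into one document and wrap it in a list."""
    merged = delimiter.join(doc.content for doc in documents)
    return [Doc(content=merged)]


docs = [
    Doc("Paris is the capital of France."),
    Doc("Berlin is the capital of Germany."),
]
joined = join_documents(docs, delimiter=" ")
print(len(joined))
print(joined[0].content)
```

The output is still a list, so nodes that expect a list of documents keep working. They simply receive one long document instead of many short ones.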
    
join_lists

Joins multiple lists into a single list.

  • Input: List of lists
  • Output: List
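A minimal sketch of the idea, assuming the lists are concatenated in order and only one level of nesting is removed. This is plain Python for illustration, not the Haystack implementation.

```python
from itertools import chain
from typing import Any, List


def join_lists(lists: List[List[Any]]) -> List[Any]:
    """Concatenate the inner lists in order (illustrative stand-in)."""
    return list(chain.from_iterable(lists))


merged = join_lists([[1, 2], [3], [4, 5]])
print(merged)
```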

strings_to_answers

Transforms a list of strings into a list of Answers.

  • Input: List of strings
  • Output: List of answers
  • Example: This function may come in handy if PromptNode is the last node in a pipeline. The output of the PromptNode is a string, while deepset Cloud pipelines expect the Answer object. You may then add a Shaper with the strings_to_answers option at the end of the pipeline after PromptNode.
    - name: OutputAnswerShaper
      type: Shaper
      params:
        func: strings_to_answers
        inputs:
          strings: results # the results PromptNode returns
        outputs:
          - answers
    
answers_to_strings

Extracts the answer field of Answers and returns a list of strings.

  • Input: Answers
  • Output: List of strings
  • Example:
    - name: AnswerShaper
      type: Shaper
      params:
        func: answers_to_strings
        inputs:
          answers: answers
        outputs:
          - strings
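strings_to_answers and answers_to_strings are roughly each other's inverse. The sketch below shows the round trip in plain Python. SimpleAnswer is a hypothetical stand-in that models only the answer text, and both functions are illustrative, not the Haystack implementations.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SimpleAnswer:
    """Hypothetical stand-in for an Answer; only the answer text is modeled."""
    answer: str


def strings_to_answers(strings: List[str]) -> List[SimpleAnswer]:
    """Wrap each string in its own answer object."""
    return [SimpleAnswer(answer=text) for text in strings]


def answers_to_strings(answers: List[SimpleAnswer]) -> List[str]:
    """Pull the answer text out of each answer object."""
    return [item.answer for item in answers]


results = ["Paris", "Berlin"]           # what a PromptNode might return
answers = strings_to_answers(results)   # what the end of the pipeline expects
strings = answers_to_strings(answers)   # back to plain strings
print(answers)
print(strings)
```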
    
strings_to_documents

Changes a list of strings into a list of documents.

  • Input: List of strings
  • Output: List of documents

documents_to_strings

Extracts the content field of each document you pass to it and puts it in a list of strings. Each item in this list is the value of the content field of one document.

  • Input: List of documents
  • Output: List of strings
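The two document conversions, strings_to_documents and documents_to_strings, can be pictured as a round trip. The sketch below is plain Python for illustration. Doc is a hypothetical stand-in that models only the content field, and neither function is the Haystack implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    """Hypothetical stand-in for a document; only the content field is modeled."""
    content: str


def strings_to_documents(strings: List[str]) -> List[Doc]:
    """Wrap each string in its own document."""
    return [Doc(content=text) for text in strings]


def documents_to_strings(documents: List[Doc]) -> List[str]:
    """Pull the content field out of each document."""
    return [doc.content for doc in documents]


texts = ["First passage.", "Second passage."]
docs = strings_to_documents(texts)
round_trip = documents_to_strings(docs)
print(docs)
print(round_trip)
```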

After performing a function, Shaper passes the new or modified values further down the pipeline.

Shaper and PromptNode

When used with PromptNode, Shaper acts as a PromptNode helper. Let's recall how PromptNode works:

  • PromptNode uses PromptTemplate containing the prompt, or instruction, for the large language model.
  • PromptTemplate contains variables that are substituted with real values when PromptNode runs.

When used in a pipeline, PromptNode receives these variables from the preceding node. It may happen that the variable names or shapes the PromptTemplate expects differ from the ones the PromptNode receives. That's when Shaper comes in and resolves this issue.

You can also use Shaper in a reverse situation. If the output of a PromptNode differs from the format the next node in the pipeline expects, Shaper can change it.

See also PromptNode documentation.

Example

Let's look at a typical example: question answering. The PromptTemplate named question-answering expects the input variables $questions and $documents:

Given the context please answer the question. Context: $documents; 
Question: $questions; Answer:

To pass a question forward, pipelines use the variable query, not questions. To make PromptNode generate one answer for each of the retrieved documents, you need to pass one question and one document at a time to the PromptTemplate.

You want the Shaper to:

  1. Rename query to questions.
  2. Expand questions="your query" to questions=["your query", ..., "your query"] (a list of the same length as documents).
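To see why the expanded list is needed, it can help to fill in the prompt by hand. The sketch below uses Python's string.Template as a stand-in for PromptTemplate, because it happens to use the same $name placeholders. It assumes that items from the two lists are paired by position. No model is called here; the code only builds the prompt texts.

```python
from string import Template

# The question-answering prompt text from the example above.
prompt = Template(
    "Given the context please answer the question. "
    "Context: $documents; Question: $questions; Answer:"
)

query = "Which city is the capital city?"
documents = [
    "The capital of France is Paris",
    "The capital of Germany is Berlin",
]

# Steps 1 and 2: store the query under `questions`, repeated once per document.
questions = [query] * len(documents)

# Pair the two lists item by item and fill the template once per pair.
filled_prompts = [
    prompt.substitute(questions=question, documents=document)
    for question, document in zip(questions, documents)
]

for filled in filled_prompts:
    print(filled)
```

Each filled prompt contains the same question and a different document, so the model would be asked once per document.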

This is how you configure the Shaper to do the renaming and expansion:

from haystack.pipelines import Pipeline
from haystack.nodes import PromptNode, Shaper
from haystack.schema import Document

# Shaper helps expand the `query` variable into a list of identical queries (length of documents)
# and store the list of queries in the `questions` variable 
# (the variable used in the question answering template)
shaper = Shaper(func="value_to_list", inputs={"value": "query", "target_list":"documents"}, outputs=["questions"])

node = PromptNode(default_prompt_template="question-answering")
pipe = Pipeline()
pipe.add_node(component=shaper, name="shaper", inputs=["Query"])
pipe.add_node(component=node, name="prompt_node", inputs=["shaper"])

output = pipe.run(query="Which city is the capital city?", 
                  documents=[Document("The capital of France is Paris"), 
                             Document("The capital of Germany is Berlin")])

print(output["results"])

components:
- name: shaper
  params:
    func: value_to_list
    inputs:
      target_list: documents
      value: query
    outputs:
    - questions
  type: Shaper
- name: prompt_node
  params:
    default_prompt_template: question-answering
  type: PromptNode
pipelines:
- name: query
  nodes:
  - inputs:
    - Query
    name: shaper
  - inputs:
    - shaper
    name: prompt_node
