Simplify Your Pipelines
Use smart connections to remove unnecessary components from your pipelines. This guide shows you how.
About This Task
Smart connections let the pipeline automatically merge lists and convert types between components. This means many "glue" components you used before are no longer needed. Your pipelines become shorter, easier to read, and simpler to debug.
| Component Previously Needed | Smart Connection Instead |
|---|---|
| `DocumentJoiner` | Connect all document outputs from sender components directly to one `list[Document]` input. Example receiving components are Ranker, PromptBuilder, DocumentSplitter, DocumentWriter, Embedder, and AnswerBuilder, typically through their `documents` input. |
| `ListJoiner` | Connect all sender components' outputs directly to one `list[ChatMessage]` input. An example receiver is the Agent's `messages` input. |
| `OutputAdapter` | Connect the LLM's `replies` output directly to downstream components, such as a Retriever or Ranker. |
For a full list of simplified components, see Legacy Components.
For background on how smart connections work, see Smart Connections.
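As a quick illustration, a smart connection lets several senders feed the same list input directly. The component names in this fragment are only placeholders, not components from a specific pipeline:

```yaml
connections:
  - sender: RetrieverA.documents    # list[Document]
    receiver: Ranker.documents
  - sender: RetrieverB.documents    # list[Document]
    receiver: Ranker.documents      # both lists are merged into one before Ranker runs
```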
Remove DocumentJoiner
Components now accept multiple lists of documents, which eliminates the need for a DocumentJoiner in most cases. Below, you can find common use cases for DocumentJoiner and how to simplify them.
Hybrid Retrieval Pipelines
If your pipeline uses multiple retrievers (for example, a BM25 retriever and an embedding retriever), you probably have a DocumentJoiner sitting between the retrievers and the next component. With smart connections, you can remove it and connect the retrievers directly to the downstream component.
The pipeline automatically merges the document lists into one before passing them along.
To simplify this pipeline:
- Remove the `DocumentJoiner` component.
- Reconnect `OpenSearchBM25Retriever`'s `documents` output to `TransformersSimilarityRanker`'s `documents` input.
- Reconnect `OpenSearchEmbeddingRetriever`'s `documents` output to `TransformersSimilarityRanker`'s `documents` input.
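The change boils down to rewiring three connections. Excerpted from the full configurations below:

```yaml
# Before: both retrievers feed DocumentJoiner, which feeds the ranker
- sender: OpenSearchBM25Retriever.documents
  receiver: DocumentJoiner.documents
- sender: OpenSearchEmbeddingRetriever.documents
  receiver: DocumentJoiner.documents
- sender: DocumentJoiner.documents
  receiver: TransformersSimilarityRanker.documents

# After: both retrievers connect directly to the ranker
- sender: OpenSearchBM25Retriever.documents
  receiver: TransformersSimilarityRanker.documents
- sender: OpenSearchEmbeddingRetriever.documents
  receiver: TransformersSimilarityRanker.documents
```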
Before: with DocumentJoiner
components:
OpenSearchBM25Retriever:
type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
init_parameters:
document_store:
type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
init_parameters:
hosts:
- ${OPENSEARCH_HOST}
index: ''
embedding_dim: 768
http_auth:
- ${OPENSEARCH_USER}
- ${OPENSEARCH_PASSWORD}
use_ssl: true
verify_certs: false
top_k: 20
SentenceTransformersTextEmbedder:
type: haystack.components.embedders.sentence_transformers_text_embedder.SentenceTransformersTextEmbedder
init_parameters:
model: intfloat/e5-base-v2
OpenSearchEmbeddingRetriever:
type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
init_parameters:
document_store:
type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
init_parameters:
hosts:
- ${OPENSEARCH_HOST}
index: ''
embedding_dim: 768
http_auth:
- ${OPENSEARCH_USER}
- ${OPENSEARCH_PASSWORD}
use_ssl: true
verify_certs: false
top_k: 20
DocumentJoiner:
type: haystack.components.joiners.document_joiner.DocumentJoiner
init_parameters:
join_mode: concatenate
TransformersSimilarityRanker:
type: haystack.components.rankers.transformers_similarity.TransformersSimilarityRanker
init_parameters:
model: intfloat/simlm-msmarco-reranker
top_k: 8
PromptBuilder:
type: haystack.components.builders.prompt_builder.PromptBuilder
init_parameters:
template: |-
Answer the question based on the provided documents.
Documents:
{% for document in documents %}
{{ document.content }}
{% endfor %}
Question: {{ question }}
OpenAIGenerator:
type: haystack.components.generators.openai.OpenAIGenerator
init_parameters:
model: gpt-4o
AnswerBuilder:
type: haystack.components.builders.answer_builder.AnswerBuilder
connections:
- sender: OpenSearchBM25Retriever.documents
receiver: DocumentJoiner.documents
- sender: SentenceTransformersTextEmbedder.embedding
receiver: OpenSearchEmbeddingRetriever.query_embedding
- sender: OpenSearchEmbeddingRetriever.documents
receiver: DocumentJoiner.documents
- sender: DocumentJoiner.documents
receiver: TransformersSimilarityRanker.documents
- sender: TransformersSimilarityRanker.documents
receiver: PromptBuilder.documents
- sender: PromptBuilder.prompt
receiver: OpenAIGenerator.prompt
- sender: OpenAIGenerator.replies
receiver: AnswerBuilder.replies
inputs:
query:
- OpenSearchBM25Retriever.query
- SentenceTransformersTextEmbedder.text
- TransformersSimilarityRanker.query
- PromptBuilder.question
- AnswerBuilder.query
outputs:
answers: AnswerBuilder.answers
After: without DocumentJoiner
components:
OpenSearchBM25Retriever:
type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
init_parameters:
document_store:
type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
init_parameters:
hosts:
- ${OPENSEARCH_HOST}
index: ''
embedding_dim: 768
http_auth:
- ${OPENSEARCH_USER}
- ${OPENSEARCH_PASSWORD}
use_ssl: true
verify_certs: false
top_k: 20
SentenceTransformersTextEmbedder:
type: haystack.components.embedders.sentence_transformers_text_embedder.SentenceTransformersTextEmbedder
init_parameters:
model: intfloat/e5-base-v2
OpenSearchEmbeddingRetriever:
type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
init_parameters:
document_store:
type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
init_parameters:
hosts:
- ${OPENSEARCH_HOST}
index: ''
embedding_dim: 768
http_auth:
- ${OPENSEARCH_USER}
- ${OPENSEARCH_PASSWORD}
use_ssl: true
verify_certs: false
top_k: 20
TransformersSimilarityRanker:
type: haystack.components.rankers.transformers_similarity.TransformersSimilarityRanker
init_parameters:
model: intfloat/simlm-msmarco-reranker
top_k: 8
PromptBuilder:
type: haystack.components.builders.prompt_builder.PromptBuilder
init_parameters:
template: |-
Answer the question based on the provided documents.
Documents:
{% for document in documents %}
{{ document.content }}
{% endfor %}
Question: {{ question }}
OpenAIGenerator:
type: haystack.components.generators.openai.OpenAIGenerator
init_parameters:
model: gpt-4o
AnswerBuilder:
type: haystack.components.builders.answer_builder.AnswerBuilder
connections:
- sender: OpenSearchBM25Retriever.documents
receiver: TransformersSimilarityRanker.documents
- sender: SentenceTransformersTextEmbedder.embedding
receiver: OpenSearchEmbeddingRetriever.query_embedding
- sender: OpenSearchEmbeddingRetriever.documents
receiver: TransformersSimilarityRanker.documents
- sender: TransformersSimilarityRanker.documents
receiver: PromptBuilder.documents
- sender: PromptBuilder.prompt
receiver: OpenAIGenerator.prompt
- sender: OpenAIGenerator.replies
receiver: AnswerBuilder.replies
inputs:
query:
- OpenSearchBM25Retriever.query
- SentenceTransformersTextEmbedder.text
- TransformersSimilarityRanker.query
- PromptBuilder.question
- AnswerBuilder.query
outputs:
answers: AnswerBuilder.answers
Indexes with Multiple File Converters
A common use case is to connect multiple file converters to a DocumentWriter in an index. To simplify such an index, do these steps:
- Remove the `DocumentJoiner` component that collects documents from the converters and sends them to `DocumentSplitter`.
- Reconnect `TextFileToDocument`'s `documents` output (the converter for text files) to `DocumentSplitter`'s `documents` input.
- Reconnect `PPTXToDocument`'s `documents` output to `DocumentSplitter`'s `documents` input.
- Reconnect `PDFMinerToDocument`'s `documents` output to `DocumentSplitter`'s `documents` input.
- Reconnect the other `TextFileToDocument`'s `documents` output (the converter for Markdown files) to `DocumentSplitter`'s `documents` input.
- Reconnect `HTMLToDocument`'s `documents` output to `DocumentSplitter`'s `documents` input.
- Reconnect `DOCXToDocument`'s `documents` output to `DocumentSplitter`'s `documents` input.
- Remove the second `DocumentJoiner` component that collects documents from `CSVToDocument`, `XLSXToDocument`, and `DocumentSplitter`, and sends them to `DeepsetNvidiaDocumentEmbedder`.
- Reconnect `DocumentSplitter`'s `documents` output to `DeepsetNvidiaDocumentEmbedder`'s `documents` input.
- Reconnect `CSVToDocument`'s `documents` output to `DeepsetNvidiaDocumentEmbedder`'s `documents` input.
- Reconnect `XLSXToDocument`'s `documents` output to `DeepsetNvidiaDocumentEmbedder`'s `documents` input.
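For example, the first DocumentJoiner disappears like this (excerpted from the full configurations below; the remaining converters are rewired the same way):

```yaml
# Before: converters feed DocumentJoiner, which feeds DocumentSplitter
- sender: TextFileToDocument.documents
  receiver: DocumentJoiner.documents
- sender: PDFMinerToDocument.documents
  receiver: DocumentJoiner.documents
- sender: DocumentJoiner.documents
  receiver: DocumentSplitter.documents

# After: converters connect straight to DocumentSplitter
- sender: TextFileToDocument.documents
  receiver: DocumentSplitter.documents
- sender: PDFMinerToDocument.documents
  receiver: DocumentSplitter.documents
```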
Before: with DocumentJoiner
components:
FileTypeRouter:
type: haystack.components.routers.file_type_router.FileTypeRouter
init_parameters:
mime_types:
- text/plain
- application/pdf
- text/markdown
- text/html
- application/vnd.openxmlformats-officedocument.wordprocessingml.document
- application/vnd.openxmlformats-officedocument.presentationml.presentation
- application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
- text/csv
TextFileToDocument:
type: haystack.components.converters.txt.TextFileToDocument
init_parameters:
encoding: utf-8
PDFMinerToDocument:
type: haystack.components.converters.pdfminer.PDFMinerToDocument
init_parameters:
line_overlap: 0.5
char_margin: 2
line_margin: 0.5
word_margin: 0.1
boxes_flow: 0.5
detect_vertical: true
all_texts: false
store_full_path: false
TextFileToDocument:
type: haystack.components.converters.txt.TextFileToDocument
init_parameters:
encoding: utf-8
HTMLToDocument:
type: haystack.components.converters.html.HTMLToDocument
init_parameters:
extraction_kwargs:
output_format: markdown
target_language:
include_tables: true
include_links: true
DOCXToDocument:
type: haystack.components.converters.docx.DOCXToDocument
init_parameters:
link_format: markdown
PPTXToDocument:
type: haystack.components.converters.pptx.PPTXToDocument
init_parameters: {}
XLSXToDocument:
type: haystack.components.converters.xlsx.XLSXToDocument
init_parameters: {}
CSVToDocument:
type: haystack.components.converters.csv.CSVToDocument
init_parameters:
encoding: utf-8
DocumentJoiner:
type: haystack.components.joiners.document_joiner.DocumentJoiner
init_parameters:
join_mode: concatenate
sort_by_score: false
DocumentJoiner:
type: haystack.components.joiners.document_joiner.DocumentJoiner
init_parameters:
join_mode: concatenate
sort_by_score: false
DocumentSplitter:
type: haystack.components.preprocessors.document_splitter.DocumentSplitter
init_parameters:
split_by: word
split_length: 250
split_overlap: 30
respect_sentence_boundary: true
language: en
DeepsetNvidiaDocumentEmbedder:
type: deepset_cloud_custom_nodes.embedders.nvidia.document_embedder.DeepsetNvidiaDocumentEmbedder
init_parameters:
normalize_embeddings: true
model: intfloat/e5-base-v2
DocumentWriter:
type: haystack.components.writers.document_writer.DocumentWriter
init_parameters:
document_store:
type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
init_parameters:
embedding_dim: 768
hosts:
index: ""
max_chunk_bytes: 104857600
return_embedding: false
method:
mappings:
settings:
create_index: true
http_auth:
use_ssl:
verify_certs:
timeout:
policy: OVERWRITE
connections:
- sender: FileTypeRouter.text/plain
receiver: TextFileToDocument.sources
- sender: FileTypeRouter.application/pdf
receiver: PDFMinerToDocument.sources
- sender: FileTypeRouter.text/markdown
receiver: TextFileToDocument.sources
- sender: FileTypeRouter.text/html
receiver: HTMLToDocument.sources
- sender: FileTypeRouter.application/vnd.openxmlformats-officedocument.wordprocessingml.document
receiver: DOCXToDocument.sources
- sender: FileTypeRouter.application/vnd.openxmlformats-officedocument.presentationml.presentation
receiver: PPTXToDocument.sources
- sender: FileTypeRouter.application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
receiver: XLSXToDocument.sources
- sender: FileTypeRouter.text/csv
receiver: CSVToDocument.sources
- sender: TextFileToDocument.documents
receiver: DocumentJoiner.documents
- sender: PDFMinerToDocument.documents
receiver: DocumentJoiner.documents
- sender: TextFileToDocument.documents
receiver: DocumentJoiner.documents
- sender: HTMLToDocument.documents
receiver: DocumentJoiner.documents
- sender: DOCXToDocument.documents
receiver: DocumentJoiner.documents
- sender: PPTXToDocument.documents
receiver: DocumentJoiner.documents
- sender: XLSXToDocument.documents
receiver: DocumentJoiner.documents
- sender: CSVToDocument.documents
receiver: DocumentJoiner.documents
- sender: DocumentJoiner.documents
receiver: DocumentSplitter.documents
- sender: DocumentSplitter.documents
receiver: DocumentJoiner.documents
- sender: XLSXToDocument.documents
receiver: DocumentJoiner.documents
- sender: CSVToDocument.documents
receiver: DocumentJoiner.documents
- sender: DocumentJoiner.documents
receiver: DeepsetNvidiaDocumentEmbedder.documents
- sender: DeepsetNvidiaDocumentEmbedder.documents
receiver: DocumentWriter.documents
inputs:
files:
- FileTypeRouter.sources
max_runs_per_component: 100
metadata: {}
After: without DocumentJoiner
components:
FileTypeRouter:
type: haystack.components.routers.file_type_router.FileTypeRouter
init_parameters:
mime_types:
- text/plain
- application/pdf
- text/markdown
- text/html
- application/vnd.openxmlformats-officedocument.wordprocessingml.document
- application/vnd.openxmlformats-officedocument.presentationml.presentation
- application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
- text/csv
TextFileToDocument:
type: haystack.components.converters.txt.TextFileToDocument
init_parameters:
encoding: utf-8
PDFMinerToDocument:
type: haystack.components.converters.pdfminer.PDFMinerToDocument
init_parameters:
line_overlap: 0.5
char_margin: 2
line_margin: 0.5
word_margin: 0.1
boxes_flow: 0.5
detect_vertical: true
all_texts: false
store_full_path: false
TextFileToDocument:
type: haystack.components.converters.txt.TextFileToDocument
init_parameters:
encoding: utf-8
HTMLToDocument:
type: haystack.components.converters.html.HTMLToDocument
init_parameters:
extraction_kwargs:
output_format: markdown
target_language:
include_tables: true
include_links: true
DOCXToDocument:
type: haystack.components.converters.docx.DOCXToDocument
init_parameters:
link_format: markdown
PPTXToDocument:
type: haystack.components.converters.pptx.PPTXToDocument
init_parameters: {}
XLSXToDocument:
type: haystack.components.converters.xlsx.XLSXToDocument
init_parameters: {}
CSVToDocument:
type: haystack.components.converters.csv.CSVToDocument
init_parameters:
encoding: utf-8
DocumentSplitter:
type: haystack.components.preprocessors.document_splitter.DocumentSplitter
init_parameters:
split_by: word
split_length: 250
split_overlap: 30
respect_sentence_boundary: true
language: en
DeepsetNvidiaDocumentEmbedder:
type: deepset_cloud_custom_nodes.embedders.nvidia.document_embedder.DeepsetNvidiaDocumentEmbedder
init_parameters:
normalize_embeddings: true
model: intfloat/e5-base-v2
DocumentWriter:
type: haystack.components.writers.document_writer.DocumentWriter
init_parameters:
document_store:
type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
init_parameters:
embedding_dim: 768
hosts:
index: ""
max_chunk_bytes: 104857600
return_embedding: false
method:
mappings:
settings:
create_index: true
http_auth:
use_ssl:
verify_certs:
timeout:
policy: OVERWRITE
connections:
- sender: FileTypeRouter.text/plain
receiver: TextFileToDocument.sources
- sender: FileTypeRouter.application/pdf
receiver: PDFMinerToDocument.sources
- sender: FileTypeRouter.text/markdown
receiver: TextFileToDocument.sources
- sender: FileTypeRouter.text/html
receiver: HTMLToDocument.sources
- sender: FileTypeRouter.application/vnd.openxmlformats-officedocument.wordprocessingml.document
receiver: DOCXToDocument.sources
- sender: FileTypeRouter.application/vnd.openxmlformats-officedocument.presentationml.presentation
receiver: PPTXToDocument.sources
- sender: FileTypeRouter.application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
receiver: XLSXToDocument.sources
- sender: FileTypeRouter.text/csv
receiver: CSVToDocument.sources
- sender: DeepsetNvidiaDocumentEmbedder.documents
receiver: DocumentWriter.documents
- sender: PPTXToDocument.documents
receiver: DocumentSplitter.documents
- sender: TextFileToDocument.documents
receiver: DocumentSplitter.documents
- sender: PDFMinerToDocument.documents
receiver: DocumentSplitter.documents
- sender: TextFileToDocument.documents
receiver: DocumentSplitter.documents
- sender: HTMLToDocument.documents
receiver: DocumentSplitter.documents
- sender: DOCXToDocument.documents
receiver: DocumentSplitter.documents
- sender: DocumentSplitter.documents
receiver: DeepsetNvidiaDocumentEmbedder.documents
- sender: CSVToDocument.documents
receiver: DeepsetNvidiaDocumentEmbedder.documents
- sender: XLSXToDocument.documents
receiver: DeepsetNvidiaDocumentEmbedder.documents
inputs:
files:
- FileTypeRouter.sources
max_runs_per_component: 100
metadata: {}
Remove ListJoiner
Components now accept multiple lists of the same type, so you can remove ListJoiner in most cases. Below, you can find common use cases for ListJoiner and how to simplify them.
Joining ChatMessages
If you're joining multiple `list[ChatMessage]` lists with a `ListJoiner`, you can remove it; the pipeline now merges the lists automatically. A common use case is joining the messages from a `DeepsetChatHistoryParser` with the current user's message before sending them to the Agent, as in the RAG Research Agent template. You can now skip the `ListJoiner` and connect the components directly. To do so, follow these steps:
- Remove `ListJoiner`.
- Connect `DeepsetChatHistoryParser`'s `messages` output directly to the Agent's `messages` input.
- Connect `ChatPromptBuilder`'s `prompt` output directly to the Agent's `messages` input.
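Excerpted from the full configuration below, the rewiring looks like this:

```yaml
# Before: both message lists pass through ListJoiner
- sender: ChatPromptBuilder.prompt
  receiver: ListJoiner.values
- sender: history_parser.messages
  receiver: ListJoiner.values
- sender: ListJoiner.values
  receiver: Agent.messages

# After: both senders connect directly to the Agent
- sender: ChatPromptBuilder.prompt
  receiver: Agent.messages
- sender: history_parser.messages
  receiver: Agent.messages
```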
Before: Joining ChatMessages with ListJoiner
components:
adapter:
init_parameters:
custom_filters: {}
output_type: list[str]
template: '{{ [(messages|last).text] }}'
unsafe: false
type: haystack.components.converters.output_adapter.OutputAdapter
history_parser:
init_parameters: {}
type: deepset_cloud_custom_nodes.parsers.chat_history_parser.DeepsetChatHistoryParser
MultiFileConverter:
type: haystack.core.super_component.super_component.SuperComponent
init_parameters:
input_mapping:
sources:
- file_classifier.sources
is_pipeline_async: false
output_mapping:
score_adder.output: documents
pipeline:
components:
file_classifier:
type: haystack.components.routers.file_type_router.FileTypeRouter
init_parameters:
mime_types:
- text/plain
- application/pdf
- text/markdown
- text/html
- application/vnd.openxmlformats-officedocument.wordprocessingml.document
- application/vnd.openxmlformats-officedocument.presentationml.presentation
- application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
- text/csv
text_converter:
type: haystack.components.converters.txt.TextFileToDocument
init_parameters:
encoding: utf-8
pdf_converter:
type: haystack.components.converters.pdfminer.PDFMinerToDocument
init_parameters:
line_overlap: 0.5
char_margin: 2
line_margin: 0.5
word_margin: 0.1
boxes_flow: 0.5
detect_vertical: true
all_texts: false
store_full_path: false
markdown_converter:
type: haystack.components.converters.txt.TextFileToDocument
init_parameters:
encoding: utf-8
html_converter:
type: haystack.components.converters.html.HTMLToDocument
init_parameters:
extraction_kwargs:
output_format: markdown
target_language:
include_tables: true
include_links: true
docx_converter:
type: haystack.components.converters.docx.DOCXToDocument
init_parameters:
link_format: markdown
pptx_converter:
type: haystack.components.converters.pptx.PPTXToDocument
init_parameters: {}
xlsx_converter:
type: haystack.components.converters.xlsx.XLSXToDocument
init_parameters: {}
csv_converter:
type: haystack.components.converters.csv.CSVToDocument
init_parameters:
encoding: utf-8
splitter:
type: haystack.components.preprocessors.document_splitter.DocumentSplitter
init_parameters:
split_by: word
split_length: 250
split_overlap: 30
respect_sentence_boundary: true
language: en
score_adder:
type: haystack.components.converters.output_adapter.OutputAdapter
init_parameters:
template: |
{%- set scored_documents = [] -%}
{%- for document in documents -%}
{%- set doc_dict = document.to_dict() -%}
{%- set _ = doc_dict.update({'score': 100.0}) -%}
{%- set scored_doc = document.from_dict(doc_dict) -%}
{%- set _ = scored_documents.append(scored_doc) -%}
{%- endfor -%}
{{ scored_documents }}
output_type: list[haystack.Document]
custom_filters:
unsafe: true
text_joiner:
type: haystack.components.joiners.document_joiner.DocumentJoiner
init_parameters:
join_mode: concatenate
sort_by_score: false
tabular_joiner:
type: haystack.components.joiners.document_joiner.DocumentJoiner
init_parameters:
join_mode: concatenate
sort_by_score: false
connections:
- sender: file_classifier.text/plain
receiver: text_converter.sources
- sender: file_classifier.application/pdf
receiver: pdf_converter.sources
- sender: file_classifier.text/markdown
receiver: markdown_converter.sources
- sender: file_classifier.text/html
receiver: html_converter.sources
- sender: file_classifier.application/vnd.openxmlformats-officedocument.wordprocessingml.document
receiver: docx_converter.sources
- sender: file_classifier.application/vnd.openxmlformats-officedocument.presentationml.presentation
receiver: pptx_converter.sources
- sender: file_classifier.application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
receiver: xlsx_converter.sources
- sender: file_classifier.text/csv
receiver: csv_converter.sources
- sender: text_joiner.documents
receiver: splitter.documents
- sender: text_converter.documents
receiver: text_joiner.documents
- sender: pdf_converter.documents
receiver: text_joiner.documents
- sender: markdown_converter.documents
receiver: text_joiner.documents
- sender: html_converter.documents
receiver: text_joiner.documents
- sender: pptx_converter.documents
receiver: text_joiner.documents
- sender: docx_converter.documents
receiver: text_joiner.documents
- sender: xlsx_converter.documents
receiver: tabular_joiner.documents
- sender: csv_converter.documents
receiver: tabular_joiner.documents
- sender: splitter.documents
receiver: tabular_joiner.documents
- sender: tabular_joiner.documents
receiver: score_adder.documents
ChatPromptBuilder:
type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
init_parameters:
template: |-
{% message role="user" %}
{%- if documents|length > 0 -%}
Here are documents provided by the user:
{% for document in documents -%}
Document [{{ loop.index }}] :
Name of Source File: {{ document.meta.file_name }}
{{ document.content }}
{%- endfor -%}
{%- endif -%}
{% endmessage %}
ListJoiner:
type: haystack.components.joiners.list_joiner.ListJoiner
init_parameters:
list_type_: list[haystack.dataclasses.chat_message.ChatMessage]
Agent:
type: haystack.components.agents.agent.Agent
init_parameters:
chat_generator:
init_parameters:
model: gpt-5.2
generation_kwargs:
reasoning:
effort: low
verbosity: low
type: haystack.components.generators.chat.openai_responses.OpenAIResponsesChatGenerator
exit_conditions:
- text
max_agent_steps: 100
raise_on_tool_invocation_failure: false
state_schema:
documents:
type: list[haystack.Document]
streaming_callback: deepset_cloud_custom_nodes.callbacks.streaming.streaming_callback
system_prompt: >-
You are a deep research assistant.
You create comprehensive research reports to answer the user's
questions.
You have one tool to gather data: 'local_search'.
The local_search tool supports hybrid retrieval using keywords, semantic
embeddings, and reranking.
Formulate natural language search queries that describe the full intent
of the question.
Use multiple varied searches to fully cover the topic.
Only information retrieved from local_search may be used to answer the
question.
If the question cannot be answered using this knowledge source,
explicitly state that it is not answerable and briefly explain why.
Do not use external knowledge, assumptions, or speculation.
When you use information from the local search, cite the source with the
documents reference number in square brackets where you use the
information (e.g. [5]).
This is IMPORTANT:
- Only use numbered citations for the local search results.
- Do NOT add a References section, cite directly in the text where you
use the information.
- For internal knowledge "some information" [3] as (taken from <document
reference="3">).
- Format responses using markdown.
tools:
- type: haystack.tools.pipeline_tool.PipelineTool
data:
name: local_search
description: >-
Search the company's internal knowledge repository using hybrid
retrieval.
The search supports natural language queries, keyword matching,
semantic embeddings, and cross-encoder reranking.
Use descriptive, question-like queries to capture intent and
retrieve the most relevant documents.
input_mapping:
query:
- retriever.query
- ranker.query
documents:
- builder.existing_documents
output_mapping:
builder.prompt: formatted_docs
meta_field_grouping_ranker.documents: documents
inputs_from_state:
documents: documents
outputs_to_state:
documents:
source: documents
outputs_to_string:
source: formatted_docs
parameters:
is_pipeline_async: false
pipeline:
components:
builder:
init_parameters:
required_variables:
- existing_documents
- docs
template: |-
{%- if existing_documents is not none -%}
{%- set existing_doc_len = existing_documents|length -%}
{%- else -%}
{%- set existing_doc_len = 0 -%}
{%- endif -%}
{%- for doc in docs %}
<document reference="{{loop.index + existing_doc_len}}">
{{ doc.content }}
</document>
{% endfor -%}
variables:
type: haystack.components.builders.prompt_builder.PromptBuilder
retriever:
type: haystack_integrations.components.retrievers.opensearch.open_search_hybrid_retriever.OpenSearchHybridRetriever
init_parameters:
document_store:
type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
init_parameters:
embedding_dim: 768
top_k: 20
fuzziness: 0
embedder:
type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
init_parameters:
normalize_embeddings: true
model: intfloat/e5-base-v2
ranker:
type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
init_parameters:
model: intfloat/simlm-msmarco-reranker
top_k: 8
meta_field_grouping_ranker:
type: haystack.components.rankers.meta_field_grouping_ranker.MetaFieldGroupingRanker
init_parameters:
group_by: file_id
subgroup_by:
sort_docs_by: split_id
connections:
- sender: retriever.documents
receiver: ranker.documents
- sender: ranker.documents
receiver: meta_field_grouping_ranker.documents
- sender: meta_field_grouping_ranker.documents
receiver: builder.docs
max_runs_per_component: 100
metadata: {}
DeepsetAnswerBuilder:
type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
init_parameters:
reference_pattern: acm
OutputAdapter:
type: haystack.components.converters.output_adapter.OutputAdapter
init_parameters:
template: "{{ replies[0] }}"
output_type: str
custom_filters:
unsafe: false
connections:
- sender: MultiFileConverter.documents
receiver: ChatPromptBuilder.documents
- sender: ChatPromptBuilder.prompt
receiver: ListJoiner.values
- sender: history_parser.messages
receiver: ListJoiner.values
- sender: ListJoiner.values
receiver: Agent.messages
- sender: Agent.documents
receiver: DeepsetAnswerBuilder.documents
- sender: Agent.messages
receiver: OutputAdapter.replies
- sender: OutputAdapter.output
receiver: DeepsetAnswerBuilder.replies
inputs:
query:
- DeepsetAnswerBuilder.query
- history_parser.history_and_query
files:
- MultiFileConverter.sources
max_runs_per_component: 100
metadata: {}
outputs:
answers: DeepsetAnswerBuilder.answers
documents: Agent.documents
pipeline_output_type: chat
After: Joining ChatMessages without ListJoiner
components:
adapter:
init_parameters:
custom_filters: {}
output_type: list[str]
template: '{{ [(messages|last).text] }}'
unsafe: false
type: haystack.components.converters.output_adapter.OutputAdapter
history_parser:
init_parameters: {}
type: deepset_cloud_custom_nodes.parsers.chat_history_parser.DeepsetChatHistoryParser
MultiFileConverter:
type: haystack.core.super_component.super_component.SuperComponent
init_parameters:
input_mapping:
sources:
- file_classifier.sources
is_pipeline_async: false
output_mapping:
score_adder.output: documents
pipeline:
components:
file_classifier:
type: haystack.components.routers.file_type_router.FileTypeRouter
init_parameters:
mime_types:
- text/plain
- application/pdf
- text/markdown
- text/html
- application/vnd.openxmlformats-officedocument.wordprocessingml.document
- application/vnd.openxmlformats-officedocument.presentationml.presentation
- application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
- text/csv
text_converter:
type: haystack.components.converters.txt.TextFileToDocument
init_parameters:
encoding: utf-8
pdf_converter:
type: haystack.components.converters.pdfminer.PDFMinerToDocument
init_parameters:
line_overlap: 0.5
char_margin: 2
line_margin: 0.5
word_margin: 0.1
boxes_flow: 0.5
detect_vertical: true
all_texts: false
store_full_path: false
markdown_converter:
type: haystack.components.converters.txt.TextFileToDocument
init_parameters:
encoding: utf-8
html_converter:
type: haystack.components.converters.html.HTMLToDocument
init_parameters:
extraction_kwargs:
output_format: markdown
target_language:
include_tables: true
include_links: true
docx_converter:
type: haystack.components.converters.docx.DOCXToDocument
init_parameters:
link_format: markdown
pptx_converter:
type: haystack.components.converters.pptx.PPTXToDocument
init_parameters: {}
xlsx_converter:
type: haystack.components.converters.xlsx.XLSXToDocument
init_parameters: {}
csv_converter:
type: haystack.components.converters.csv.CSVToDocument
init_parameters:
encoding: utf-8
splitter:
type: haystack.components.preprocessors.document_splitter.DocumentSplitter
init_parameters:
split_by: word
split_length: 250
split_overlap: 30
respect_sentence_boundary: true
language: en
score_adder:
type: haystack.components.converters.output_adapter.OutputAdapter
init_parameters:
template: |
{%- set scored_documents = [] -%}
{%- for document in documents -%}
{%- set doc_dict = document.to_dict() -%}
{%- set _ = doc_dict.update({'score': 100.0}) -%}
{%- set scored_doc = document.from_dict(doc_dict) -%}
{%- set _ = scored_documents.append(scored_doc) -%}
{%- endfor -%}
{{ scored_documents }}
output_type: list[haystack.Document]
custom_filters:
unsafe: true
text_joiner:
type: haystack.components.joiners.document_joiner.DocumentJoiner
init_parameters:
join_mode: concatenate
sort_by_score: false
tabular_joiner:
type: haystack.components.joiners.document_joiner.DocumentJoiner
init_parameters:
join_mode: concatenate
sort_by_score: false
connections:
- sender: file_classifier.text/plain
receiver: text_converter.sources
- sender: file_classifier.application/pdf
receiver: pdf_converter.sources
- sender: file_classifier.text/markdown
receiver: markdown_converter.sources
- sender: file_classifier.text/html
receiver: html_converter.sources
- sender: file_classifier.application/vnd.openxmlformats-officedocument.wordprocessingml.document
receiver: docx_converter.sources
- sender: file_classifier.application/vnd.openxmlformats-officedocument.presentationml.presentation
receiver: pptx_converter.sources
- sender: file_classifier.application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
receiver: xlsx_converter.sources
- sender: file_classifier.text/csv
receiver: csv_converter.sources
- sender: text_joiner.documents
receiver: splitter.documents
- sender: text_converter.documents
receiver: text_joiner.documents
- sender: pdf_converter.documents
receiver: text_joiner.documents
- sender: markdown_converter.documents
receiver: text_joiner.documents
- sender: html_converter.documents
receiver: text_joiner.documents
- sender: pptx_converter.documents
receiver: text_joiner.documents
- sender: docx_converter.documents
receiver: text_joiner.documents
- sender: xlsx_converter.documents
receiver: tabular_joiner.documents
- sender: csv_converter.documents
receiver: tabular_joiner.documents
- sender: splitter.documents
receiver: tabular_joiner.documents
- sender: tabular_joiner.documents
receiver: score_adder.documents
ChatPromptBuilder:
type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
init_parameters:
template: |-
{% message role="user" %}
{%- if documents|length > 0 -%}
Here are documents provided by the user:
{% for document in documents -%}
Document [{{ loop.index }}] :
Name of Source File: {{ document.meta.file_name }}
{{ document.content }}
{%- endfor -%}
{%- endif -%}
{% endmessage %}
Agent:
type: haystack.components.agents.agent.Agent
init_parameters:
chat_generator:
init_parameters:
model: gpt-5.2
generation_kwargs:
reasoning:
effort: low
verbosity: low
type: haystack.components.generators.chat.openai_responses.OpenAIResponsesChatGenerator
exit_conditions:
- text
max_agent_steps: 100
raise_on_tool_invocation_failure: false
state_schema:
documents:
type: list[haystack.Document]
streaming_callback: deepset_cloud_custom_nodes.callbacks.streaming.streaming_callback
system_prompt: >-
You are a deep research assistant.
You create comprehensive research reports to answer the user's
questions.
You have one tool to gather data: 'local_search'.
The local_search tool supports hybrid retrieval using keywords, semantic
embeddings, and reranking.
Formulate natural language search queries that describe the full intent
of the question.
Use multiple varied searches to fully cover the topic.
Only information retrieved from local_search may be used to answer the
question.
If the question cannot be answered using this knowledge source,
explicitly state that it is not answerable and briefly explain why.
Do not use external knowledge, assumptions, or speculation.
When you use information from the local search, cite the source with the
documents reference number in square brackets where you use the
information (e.g. [5]).
This is IMPORTANT:
- Only use numbered citations for the local search results.
- Do NOT add a References section, cite directly in the text where you
use the information.
- Example citation: "some information" [3], where the content is taken from <document reference="3">.
- Format responses using markdown.
tools:
- type: haystack.tools.pipeline_tool.PipelineTool
data:
name: local_search
description: >-
Search the company's internal knowledge repository using hybrid
retrieval.
The search supports natural language queries, keyword matching,
semantic embeddings, and cross-encoder reranking.
Use descriptive, question-like queries to capture intent and
retrieve the most relevant documents.
input_mapping:
query:
- retriever.query
- ranker.query
documents:
- builder.existing_documents
output_mapping:
builder.prompt: formatted_docs
meta_field_grouping_ranker.documents: documents
inputs_from_state:
documents: documents
outputs_to_state:
documents:
source: documents
outputs_to_string:
source: formatted_docs
parameters:
is_pipeline_async: false
pipeline:
components:
builder:
init_parameters:
required_variables:
- existing_documents
- docs
template: |-
{%- if existing_documents is not none -%}
{%- set existing_doc_len = existing_documents|length -%}
{%- else -%}
{%- set existing_doc_len = 0 -%}
{%- endif -%}
{%- for doc in docs %}
<document reference="{{loop.index + existing_doc_len}}">
{{ doc.content }}
</document>
{% endfor -%}
variables:
type: haystack.components.builders.prompt_builder.PromptBuilder
retriever:
type: haystack_integrations.components.retrievers.opensearch.open_search_hybrid_retriever.OpenSearchHybridRetriever
init_parameters:
document_store:
type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
init_parameters:
embedding_dim: 768
top_k: 20
fuzziness: 0
embedder:
type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
init_parameters:
normalize_embeddings: true
model: intfloat/e5-base-v2
ranker:
type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
init_parameters:
model: intfloat/simlm-msmarco-reranker
top_k: 8
meta_field_grouping_ranker:
type: haystack.components.rankers.meta_field_grouping_ranker.MetaFieldGroupingRanker
init_parameters:
group_by: file_id
subgroup_by:
sort_docs_by: split_id
connections:
- sender: retriever.documents
receiver: ranker.documents
- sender: ranker.documents
receiver: meta_field_grouping_ranker.documents
- sender: meta_field_grouping_ranker.documents
receiver: builder.docs
max_runs_per_component: 100
metadata: {}
DeepsetAnswerBuilder:
type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
init_parameters:
reference_pattern: acm
OutputAdapter:
type: haystack.components.converters.output_adapter.OutputAdapter
init_parameters:
template: "{{ replies[0] }}"
output_type: str
custom_filters:
unsafe: false
connections:
- sender: MultiFileConverter.documents
receiver: ChatPromptBuilder.documents
- sender: Agent.documents
receiver: DeepsetAnswerBuilder.documents
- sender: Agent.messages
receiver: OutputAdapter.replies
- sender: OutputAdapter.output
receiver: DeepsetAnswerBuilder.replies
- sender: history_parser.messages
receiver: Agent.messages
- sender: ChatPromptBuilder.prompt
receiver: Agent.messages
inputs:
query:
- DeepsetAnswerBuilder.query
- history_parser.history_and_query
files:
- MultiFileConverter.sources
max_runs_per_component: 100
metadata: {}
outputs:
answers: DeepsetAnswerBuilder.answers
documents: Agent.documents
pipeline_output_type: chat
Remove OutputAdapter for Type Conversions
RAG Chat with LLM sending input to a Retriever and a Ranker
In chat pipelines, where the first LLM reformulates the query to be used for retrieval augmented generation, you can remove the OutputAdapter and connect the ChatGenerator directly to the Retriever or Ranker.
This is an example of a RAG chat pipeline with an OutputAdapter and a DocumentJoiner that you can simplify. Follow these steps:
- Remove `OutputAdapter`.
- Connect the `replies` output of the first `OpenAIGenerator` to the following components' inputs:
  - `OpenSearchHybridRetriever`'s `query` input.
  - `DeepsetNvidiaRanker`'s `query` input.
  - `PromptBuilder`'s `question` input.
  - `DeepsetAnswerBuilder`'s `query` input.
- Remove `DocumentJoiner`.
- Connect `MultiFileConverter`'s `documents` output to the following components' inputs:
  - The second `PromptBuilder`'s `documents` input.
  - `DeepsetAnswerBuilder`'s `documents` input.
  - `Output`'s `documents` input.
- Connect `DeepsetNvidiaRanker`'s `documents` output to the following components' inputs:
  - The second `PromptBuilder`'s `documents` input.
  - `DeepsetAnswerBuilder`'s `documents` input.
  - `Output`'s `documents` input.
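Condensed into connections only, the rewiring replaces each adapter or joiner hop with a direct connection. This is a sketch showing a representative subset, not a complete pipeline; the component names match the full examples in this section:

```yaml
connections:
  - sender: OpenAIGenerator.replies        # was: OutputAdapter.output
    receiver: OpenSearchHybridRetriever.query
  - sender: OpenAIGenerator.replies
    receiver: DeepsetNvidiaRanker.query
  - sender: MultiFileConverter.documents   # was: DocumentJoiner.documents
    receiver: PromptBuilder.documents
  - sender: DeepsetNvidiaRanker.documents
    receiver: PromptBuilder.documents
```

The pipeline merges the two document lists arriving at `PromptBuilder.documents` into one, and converts the LLM's `replies` into the string inputs the retriever and ranker expect.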
Before: LLM connected to a Retriever through an OutputAdapter
components:
PromptBuilder:
type: haystack.components.builders.prompt_builder.PromptBuilder
init_parameters:
required_variables: "*"
template: |-
You are part of a chatbot.
You receive a question (Current Question) and a chat history.
Use the context from the chat history and reformulate the question so that it is suitable for retrieval augmented generation.
If X is followed by Y, only ask for Y and do not repeat X again.
If the question does not require any context from the chat history, output it unedited.
Don't make questions too long, but short and precise.
Stay as close as possible to the current question.
Only output the new question, nothing else!
{{ question }}
New question:
OpenAIGenerator:
type: haystack.components.generators.openai.OpenAIGenerator
init_parameters:
api_key:
"type": "env_var"
"env_vars":
- "OPENAI_API_KEY"
"strict": false
model: "gpt-5.2"
generation_kwargs:
reasoning_effort: low
verbosity: low
OutputAdapter:
type: haystack.components.converters.output_adapter.OutputAdapter
init_parameters:
template: "{{ replies[0] }}"
output_type: str
OpenSearchHybridRetriever:
type: haystack_integrations.components.retrievers.opensearch.open_search_hybrid_retriever.OpenSearchHybridRetriever
init_parameters:
document_store:
type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
init_parameters:
embedding_dim: 768
hosts:
index: ""
max_chunk_bytes: 104857600
return_embedding: false
method:
mappings:
settings:
create_index: true
http_auth:
use_ssl:
verify_certs:
timeout:
top_k: 20 # The number of results to return
embedder:
type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
init_parameters:
normalize_embeddings: true
model: intfloat/e5-base-v2
DeepsetNvidiaRanker:
type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
init_parameters:
model: intfloat/simlm-msmarco-reranker
top_k: 8
PromptBuilder:
type: haystack.components.builders.prompt_builder.PromptBuilder
init_parameters:
required_variables: '*'
template: |-
You are a technical expert.
You answer questions truthfully based on provided documents.
Ignore typing errors in the question.
For each document check whether it is related to the question.
Only use documents that are related to the question to answer it.
Ignore documents that are not related to the question.
If the answer exists in several documents, summarize them.
Only answer based on the documents provided. Don't make things up.
Just output the structured, informative and precise answer and nothing else.
If the documents can't answer the question, say so.
Always use references in the form [NUMBER OF DOCUMENT] when using information from a document, e.g. [3] for Document [3] .
Never name the documents, only enter a number in square brackets as a reference.
The reference must only refer to the number that comes in square brackets after the document.
Otherwise, do not use brackets in your answer and reference ONLY the number of the document without mentioning the word document.
These are the documents:
{%- if documents|length > 0 %}
{%- for document in documents %}
Document [{{ loop.index }}] :
Name of Source File: {{ document.meta.file_name }}
{{ document.content }}
{% endfor -%}
{%- else %}
No relevant documents found.
Respond with "Sorry, no matching documents were found, please adjust the filters or try a different question."
{% endif %}
Question: {{ question }}
Answer:
OpenAIGenerator:
type: haystack.components.generators.openai.OpenAIGenerator
init_parameters:
api_key:
"type": "env_var"
"env_vars":
- "OPENAI_API_KEY"
"strict": false
model: "gpt-5.2"
generation_kwargs:
reasoning_effort: low
verbosity: low
DeepsetAnswerBuilder:
type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
init_parameters:
reference_pattern: acm
DocumentJoiner:
type: haystack.components.joiners.document_joiner.DocumentJoiner
init_parameters:
join_mode: concatenate
weights:
top_k:
sort_by_score: true
MultiFileConverter:
type: haystack.core.super_component.super_component.SuperComponent
init_parameters:
input_mapping:
sources:
- file_classifier.sources
is_pipeline_async: false
output_mapping:
score_adder.output: documents
pipeline:
components:
file_classifier:
type: haystack.components.routers.file_type_router.FileTypeRouter
init_parameters:
mime_types:
- text/plain
- application/pdf
- text/markdown
- text/html
- application/vnd.openxmlformats-officedocument.wordprocessingml.document
- application/vnd.openxmlformats-officedocument.presentationml.presentation
- application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
- text/csv
text_converter:
type: haystack.components.converters.txt.TextFileToDocument
init_parameters:
encoding: utf-8
pdf_converter:
type: haystack.components.converters.pdfminer.PDFMinerToDocument
init_parameters:
line_overlap: 0.5
char_margin: 2
line_margin: 0.5
word_margin: 0.1
boxes_flow: 0.5
detect_vertical: true
all_texts: false
store_full_path: false
markdown_converter:
type: haystack.components.converters.txt.TextFileToDocument
init_parameters:
encoding: utf-8
html_converter:
type: haystack.components.converters.html.HTMLToDocument
init_parameters:
extraction_kwargs:
output_format: markdown
target_language:
include_tables: true
include_links: true
docx_converter:
type: haystack.components.converters.docx.DOCXToDocument
init_parameters:
link_format: markdown
pptx_converter:
type: haystack.components.converters.pptx.PPTXToDocument
init_parameters: {}
xlsx_converter:
type: haystack.components.converters.xlsx.XLSXToDocument
init_parameters: {}
csv_converter:
type: haystack.components.converters.csv.CSVToDocument
init_parameters:
encoding: utf-8
splitter:
type: haystack.components.preprocessors.document_splitter.DocumentSplitter
init_parameters:
split_by: word
split_length: 250
split_overlap: 30
respect_sentence_boundary: true
language: en
score_adder:
type: haystack.components.converters.output_adapter.OutputAdapter
init_parameters:
template: |
{%- set scored_documents = [] -%}
{%- for document in documents -%}
{%- set doc_dict = document.to_dict() -%}
{%- set _ = doc_dict.update({'score': 100.0}) -%}
{%- set scored_doc = document.from_dict(doc_dict) -%}
{%- set _ = scored_documents.append(scored_doc) -%}
{%- endfor -%}
{{ scored_documents }}
output_type: list[haystack.Document]
custom_filters:
unsafe: true
text_joiner:
type: haystack.components.joiners.document_joiner.DocumentJoiner
init_parameters:
join_mode: concatenate
sort_by_score: false
tabular_joiner:
type: haystack.components.joiners.document_joiner.DocumentJoiner
init_parameters:
join_mode: concatenate
sort_by_score: false
connections:
- sender: file_classifier.text/plain
receiver: text_converter.sources
- sender: file_classifier.application/pdf
receiver: pdf_converter.sources
- sender: file_classifier.text/markdown
receiver: markdown_converter.sources
- sender: file_classifier.text/html
receiver: html_converter.sources
- sender: file_classifier.application/vnd.openxmlformats-officedocument.wordprocessingml.document
receiver: docx_converter.sources
- sender: file_classifier.application/vnd.openxmlformats-officedocument.presentationml.presentation
receiver: pptx_converter.sources
- sender: file_classifier.application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
receiver: xlsx_converter.sources
- sender: file_classifier.text/csv
receiver: csv_converter.sources
- sender: text_joiner.documents
receiver: splitter.documents
- sender: text_converter.documents
receiver: text_joiner.documents
- sender: pdf_converter.documents
receiver: text_joiner.documents
- sender: markdown_converter.documents
receiver: text_joiner.documents
- sender: html_converter.documents
receiver: text_joiner.documents
- sender: pptx_converter.documents
receiver: text_joiner.documents
- sender: docx_converter.documents
receiver: text_joiner.documents
- sender: xlsx_converter.documents
receiver: tabular_joiner.documents
- sender: csv_converter.documents
receiver: tabular_joiner.documents
- sender: splitter.documents
receiver: tabular_joiner.documents
- sender: tabular_joiner.documents
receiver: score_adder.documents
connections:
- sender: PromptBuilder.prompt
receiver: OpenAIGenerator.prompt
- sender: OpenAIGenerator.replies
receiver: OutputAdapter.replies
- sender: OutputAdapter.output
receiver: OpenSearchHybridRetriever.query
- sender: OutputAdapter.output
receiver: DeepsetNvidiaRanker.query
- sender: OutputAdapter.output
receiver: PromptBuilder.question
- sender: OutputAdapter.output
receiver: DeepsetAnswerBuilder.query
- sender: OpenSearchHybridRetriever.documents
receiver: DeepsetNvidiaRanker.documents
- sender: PromptBuilder.prompt
receiver: OpenAIGenerator.prompt
- sender: PromptBuilder.prompt
receiver: DeepsetAnswerBuilder.prompt
- sender: OpenAIGenerator.replies
receiver: DeepsetAnswerBuilder.replies
- sender: MultiFileConverter.documents
receiver: DocumentJoiner.documents
- sender: DeepsetNvidiaRanker.documents
receiver: DocumentJoiner.documents
- sender: DocumentJoiner.documents
receiver: DeepsetAnswerBuilder.documents
- sender: DocumentJoiner.documents
receiver: PromptBuilder.documents
inputs:
query:
- PromptBuilder.question
filters:
- OpenSearchHybridRetriever.filters_bm25
- OpenSearchHybridRetriever.filters_embedding
files:
- MultiFileConverter.sources
outputs:
documents: DocumentJoiner.documents
answers: DeepsetAnswerBuilder.answers
max_runs_per_component: 100
metadata: {}
After: LLM connected directly to a Retriever and a Ranker
This is a simplified version of the pipeline above. You can remove the DocumentJoiner and OutputAdapter and connect the components directly.
components:
PromptBuilder:
type: haystack.components.builders.prompt_builder.PromptBuilder
init_parameters:
required_variables: "*"
template: |-
You are part of a chatbot.
You receive a question (Current Question) and a chat history.
Use the context from the chat history and reformulate the question so that it is suitable for retrieval augmented generation.
If X is followed by Y, only ask for Y and do not repeat X again.
If the question does not require any context from the chat history, output it unedited.
Don't make questions too long, but short and precise.
Stay as close as possible to the current question.
Only output the new question, nothing else!
{{ question }}
New question:
OpenAIGenerator:
type: haystack.components.generators.openai.OpenAIGenerator
init_parameters:
api_key:
"type": "env_var"
"env_vars":
- "OPENAI_API_KEY"
"strict": false
model: "gpt-5.2"
generation_kwargs:
reasoning_effort: low
verbosity: low
OpenSearchHybridRetriever:
type: haystack_integrations.components.retrievers.opensearch.open_search_hybrid_retriever.OpenSearchHybridRetriever
init_parameters:
document_store:
type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
init_parameters:
embedding_dim: 768
hosts:
index: ""
max_chunk_bytes: 104857600
return_embedding: false
method:
mappings:
settings:
create_index: true
http_auth:
use_ssl:
verify_certs:
timeout:
top_k: 20
embedder:
type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
init_parameters:
normalize_embeddings: true
model: intfloat/e5-base-v2
DeepsetNvidiaRanker:
type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
init_parameters:
model: intfloat/simlm-msmarco-reranker
top_k: 8
PromptBuilder:
type: haystack.components.builders.prompt_builder.PromptBuilder
init_parameters:
required_variables: '*'
template: |-
You are a technical expert.
You answer questions truthfully based on provided documents.
Ignore typing errors in the question.
For each document check whether it is related to the question.
Only use documents that are related to the question to answer it.
Ignore documents that are not related to the question.
If the answer exists in several documents, summarize them.
Only answer based on the documents provided. Don't make things up.
Just output the structured, informative and precise answer and nothing else.
If the documents can't answer the question, say so.
Always use references in the form [NUMBER OF DOCUMENT] when using information from a document, e.g. [3] for Document [3] .
Never name the documents, only enter a number in square brackets as a reference.
The reference must only refer to the number that comes in square brackets after the document.
Otherwise, do not use brackets in your answer and reference ONLY the number of the document without mentioning the word document.
These are the documents:
{%- if documents|length > 0 %}
{%- for document in documents %}
Document [{{ loop.index }}] :
Name of Source File: {{ document.meta.file_name }}
{{ document.content }}
{% endfor -%}
{%- else %}
No relevant documents found.
Respond with "Sorry, no matching documents were found, please adjust the filters or try a different question."
{% endif %}
Question: {{ question }}
Answer:
OpenAIGenerator:
type: haystack.components.generators.openai.OpenAIGenerator
init_parameters:
api_key:
"type": "env_var"
"env_vars":
- "OPENAI_API_KEY"
"strict": false
model: "gpt-5.2"
generation_kwargs:
reasoning_effort: low
verbosity: low
DeepsetAnswerBuilder:
type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
init_parameters:
reference_pattern: acm
MultiFileConverter:
type: haystack.core.super_component.super_component.SuperComponent
init_parameters:
input_mapping:
sources:
- file_classifier.sources
is_pipeline_async: false
output_mapping:
score_adder.output: documents
pipeline:
components:
file_classifier:
type: haystack.components.routers.file_type_router.FileTypeRouter
init_parameters:
mime_types:
- text/plain
- application/pdf
- text/markdown
- text/html
- application/vnd.openxmlformats-officedocument.wordprocessingml.document
- application/vnd.openxmlformats-officedocument.presentationml.presentation
- application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
- text/csv
text_converter:
type: haystack.components.converters.txt.TextFileToDocument
init_parameters:
encoding: utf-8
pdf_converter:
type: haystack.components.converters.pdfminer.PDFMinerToDocument
init_parameters:
line_overlap: 0.5
char_margin: 2
line_margin: 0.5
word_margin: 0.1
boxes_flow: 0.5
detect_vertical: true
all_texts: false
store_full_path: false
markdown_converter:
type: haystack.components.converters.txt.TextFileToDocument
init_parameters:
encoding: utf-8
html_converter:
type: haystack.components.converters.html.HTMLToDocument
init_parameters:
extraction_kwargs:
output_format: markdown
target_language:
include_tables: true
include_links: true
docx_converter:
type: haystack.components.converters.docx.DOCXToDocument
init_parameters:
link_format: markdown
pptx_converter:
type: haystack.components.converters.pptx.PPTXToDocument
init_parameters: {}
xlsx_converter:
type: haystack.components.converters.xlsx.XLSXToDocument
init_parameters: {}
csv_converter:
type: haystack.components.converters.csv.CSVToDocument
init_parameters:
encoding: utf-8
splitter:
type: haystack.components.preprocessors.document_splitter.DocumentSplitter
init_parameters:
split_by: word
split_length: 250
split_overlap: 30
respect_sentence_boundary: true
language: en
score_adder:
type: haystack.components.converters.output_adapter.OutputAdapter
init_parameters:
template: |
{%- set scored_documents = [] -%}
{%- for document in documents -%}
{%- set doc_dict = document.to_dict() -%}
{%- set _ = doc_dict.update({'score': 100.0}) -%}
{%- set scored_doc = document.from_dict(doc_dict) -%}
{%- set _ = scored_documents.append(scored_doc) -%}
{%- endfor -%}
{{ scored_documents }}
output_type: list[haystack.Document]
custom_filters:
unsafe: true
text_joiner:
type: haystack.components.joiners.document_joiner.DocumentJoiner
init_parameters:
join_mode: concatenate
sort_by_score: false
tabular_joiner:
type: haystack.components.joiners.document_joiner.DocumentJoiner
init_parameters:
join_mode: concatenate
sort_by_score: false
connections:
- sender: file_classifier.text/plain
receiver: text_converter.sources
- sender: file_classifier.application/pdf
receiver: pdf_converter.sources
- sender: file_classifier.text/markdown
receiver: markdown_converter.sources
- sender: file_classifier.text/html
receiver: html_converter.sources
- sender: file_classifier.application/vnd.openxmlformats-officedocument.wordprocessingml.document
receiver: docx_converter.sources
- sender: file_classifier.application/vnd.openxmlformats-officedocument.presentationml.presentation
receiver: pptx_converter.sources
- sender: file_classifier.application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
receiver: xlsx_converter.sources
- sender: file_classifier.text/csv
receiver: csv_converter.sources
- sender: text_joiner.documents
receiver: splitter.documents
- sender: text_converter.documents
receiver: text_joiner.documents
- sender: pdf_converter.documents
receiver: text_joiner.documents
- sender: markdown_converter.documents
receiver: text_joiner.documents
- sender: html_converter.documents
receiver: text_joiner.documents
- sender: pptx_converter.documents
receiver: text_joiner.documents
- sender: docx_converter.documents
receiver: text_joiner.documents
- sender: xlsx_converter.documents
receiver: tabular_joiner.documents
- sender: csv_converter.documents
receiver: tabular_joiner.documents
- sender: splitter.documents
receiver: tabular_joiner.documents
- sender: tabular_joiner.documents
receiver: score_adder.documents
connections:
- sender: PromptBuilder.prompt
receiver: OpenAIGenerator.prompt
- sender: OpenSearchHybridRetriever.documents
receiver: DeepsetNvidiaRanker.documents
- sender: PromptBuilder.prompt
receiver: DeepsetAnswerBuilder.prompt
- sender: OpenAIGenerator.replies
receiver: DeepsetAnswerBuilder.replies
- sender: DeepsetNvidiaRanker.documents
receiver: PromptBuilder.documents
- sender: DeepsetNvidiaRanker.documents
receiver: DeepsetAnswerBuilder.documents
- sender: OpenAIGenerator.replies
receiver: OpenSearchHybridRetriever.query
- sender: OpenAIGenerator.replies
receiver: DeepsetNvidiaRanker.query
- sender: OpenAIGenerator.replies
receiver: PromptBuilder.question
- sender: OpenAIGenerator.replies
receiver: DeepsetAnswerBuilder.query
- sender: MultiFileConverter.documents
receiver: DeepsetAnswerBuilder.documents
- sender: MultiFileConverter.documents
receiver: PromptBuilder.documents
- sender: MultiFileConverter.documents
receiver: Output.documents
inputs:
query:
- PromptBuilder.question
filters:
- OpenSearchHybridRetriever.filters_bm25
- OpenSearchHybridRetriever.filters_embedding
files:
- MultiFileConverter.sources
outputs:
answers: DeepsetAnswerBuilder.answers
max_runs_per_component: 100
metadata: {}
When You Still Need OutputAdapter
You still need OutputAdapter when:
- You're converting between types that smart connections don't support (anything other than `string` and `ChatMessage`).
- You need explicit control over formatting, ordering, or extracting specific fields from the output.
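As an illustration, here is a hypothetical OutputAdapter that pulls a single metadata field out of the top-ranked document, a transformation a smart connection can't infer on its own. The component name and template are examples of ours, not taken from the pipelines above:

```yaml
FileNameExtractor:
  type: haystack.components.converters.output_adapter.OutputAdapter
  init_parameters:
    # Extract one specific field instead of passing whole documents downstream.
    template: "{{ documents[0].meta.file_name }}"
    output_type: str
```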