DocumentJoiner

Merge multiple lists of documents from different pipeline branches into a single list.

While most pipelines can use smart connections to automatically merge document lists, DocumentJoiner provides explicit control over the joining process.

Key Features

  • Merges multiple document lists into a single output.
  • Maintains document order from input lists.
  • Supports any number of input connections.
  • Useful when smart connections don't provide enough control over the merge logic.
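The merge behavior listed above can be pictured with a minimal sketch. It is a simplified model, not the component's implementation: `Doc` and `join_documents` are hypothetical names introduced here, and the real component may apply further logic (for example, deduplication or score-based sorting) that this sketch leaves out.

```python
from dataclasses import dataclass
from itertools import chain
from typing import List


@dataclass
class Doc:
    """Hypothetical stand-in for a document; not the Haystack Document class."""
    id: str
    content: str


def join_documents(*document_lists: List[Doc]) -> List[Doc]:
    """Concatenate any number of lists, keeping each list's internal order."""
    return list(chain.from_iterable(document_lists))


# Two upstream branches, each producing its own list of documents.
bm25_hits = [Doc("a", "keyword match 1"), Doc("b", "keyword match 2")]
embedding_hits = [Doc("c", "semantic match 1")]

merged = join_documents(bm25_hits, embedding_hits)
print([doc.id for doc in merged])
```

The function takes a variable number of lists, which mirrors the "any number of input connections" property: adding a third branch only means passing one more argument.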

When To Use DocumentJoiner

Use DocumentJoiner when:

  • You need explicit control over how document lists are merged.
  • Your pipeline has complex branching that requires specific joining logic.

In most cases, you can simplify your pipeline by using smart connections instead. Components automatically accept multiple lists of the same type and merge them. For more information, see Smart Connections.

Configuration

  1. Drag the DocumentJoiner component onto the canvas from the Component Library.
  2. Click on the component to open the configuration panel.
  3. The component requires no configuration. Connect multiple components that output document lists to its inputs.

Connections

DocumentJoiner accepts List[Document] from any number of upstream components, such as retrievers, document processors, or other DocumentJoiner components. Its output is a merged List[Document] that you can connect to rankers, generators, or document writers.

Source Code

To check this component's source code, open document_joiner.py in the Haystack repository.

Usage Examples

Basic Configuration

  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner

The following pipeline joins the results of two retrievers and passes the merged list to a ranker:

  components:
    bm25_retriever:
      type: haystack.components.retrievers.in_memory.InMemoryBM25Retriever
      params:
        document_store:
          type: haystack.document_stores.in_memory.InMemoryDocumentStore

    embedding_retriever:
      type: haystack.components.retrievers.in_memory.InMemoryEmbeddingRetriever
      params:
        document_store:
          type: haystack.document_stores.in_memory.InMemoryDocumentStore

    document_joiner:
      type: haystack.components.joiners.document_joiner.DocumentJoiner

    ranker:
      type: haystack.components.rankers.transformers_similarity.TransformersSimilarityRanker

  connections:
    - sender: bm25_retriever.documents
      receiver: document_joiner.documents
    - sender: embedding_retriever.documents
      receiver: document_joiner.documents
    - sender: document_joiner.documents
      receiver: ranker.documents
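To make the fan-in in this example explicit, here is a small sketch that groups the connections by receiver. It uses plain Python dictionaries that mirror the `connections` entries above; `senders_by_receiver` is a hypothetical helper written for illustration, not part of Haystack or the platform.

```python
from collections import defaultdict
from typing import Dict, List

# The same connections as in the YAML example, written as plain dictionaries.
connections = [
    {"sender": "bm25_retriever.documents", "receiver": "document_joiner.documents"},
    {"sender": "embedding_retriever.documents", "receiver": "document_joiner.documents"},
    {"sender": "document_joiner.documents", "receiver": "ranker.documents"},
]


def senders_by_receiver(conns: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Group sender sockets by the receiver socket they connect to."""
    fan_in: Dict[str, List[str]] = defaultdict(list)
    for conn in conns:
        fan_in[conn["receiver"]].append(conn["sender"])
    return dict(fan_in)


fan_in = senders_by_receiver(connections)
print(fan_in["document_joiner.documents"])
print(fan_in["ranker.documents"])
```

Both retrievers send to the same `document_joiner.documents` input, while the ranker receives a single list from the joiner.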

Parameters

Inputs

Parameter | Type           | Description
----------|----------------|------------
documents | List[Document] | Lists of documents to join together. Accepts multiple connections.

Outputs

Parameter | Type           | Description
----------|----------------|------------
documents | List[Document] | The combined list of all input documents.

Init Parameters

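DocumentJoiner requires no initialization parameters, and the examples on this page run it with its defaults. The underlying Haystack component also accepts optional settings (join_mode, weights, top_k, and sort_by_score); check the source code for their exact behavior in your version.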

Run Method Parameters

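DocumentJoiner requires no run-time parameters beyond the documents input. The underlying Haystack component also accepts an optional top_k value at run time to limit the number of returned documents.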