DocumentJoiner
Merge multiple lists of documents from different pipeline branches into a single list.
While most pipelines can use smart connections to automatically merge document lists, DocumentJoiner provides explicit control over the joining process.
Key Features
- Merges multiple document lists into a single output.
- Maintains document order from input lists.
- Supports any number of input connections.
- Useful when smart connections don't provide enough control over the merge logic.
When To Use DocumentJoiner
Use DocumentJoiner when:
- You need explicit control over how document lists are merged.
- Your pipeline has complex branching that requires specific joining logic.
In most cases, you can simplify your pipeline by using smart connections instead. Components automatically accept multiple lists of the same type and merge them. For more information, see Smart Connections.
Configuration
- Drag the DocumentJoiner component onto the canvas from the Component Library.
- Click the component to open the configuration panel.
- The component requires no configuration. Connect multiple components that output document lists to its inputs.
Connections
DocumentJoiner accepts List[Document] from any number of upstream components, such as retrievers, document processors, or other DocumentJoiner components. Its output is a merged List[Document] that you can connect to rankers, generators, or document writers.
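To make the fan-in behavior described above concrete, here is a minimal plain-Python sketch of joining several document lists into one while keeping input order. It is an illustration only, not the Haystack implementation: `Doc` and `join_documents` are hypothetical names invented for this example, and the sketch assumes simple concatenation with no scoring or deduplication.

```python
from dataclasses import dataclass
from itertools import chain
from typing import List


@dataclass
class Doc:
    """Hypothetical stand-in for a pipeline document (not Haystack's Document class)."""
    id: str
    content: str


def join_documents(*document_lists: List[Doc]) -> List[Doc]:
    """Flatten any number of document lists into one list, keeping input order."""
    return list(chain.from_iterable(document_lists))


# Two upstream branches, for example a keyword retriever and an embedding retriever.
bm25_hits = [Doc("1", "keyword match A"), Doc("2", "keyword match B")]
embedding_hits = [Doc("3", "semantic match C")]

# The joined list is what a downstream ranker or generator would receive.
merged = join_documents(bm25_hits, embedding_hits)
print([doc.id for doc in merged])
```

The function takes a variable number of lists to mirror the "any number of upstream components" contract: each connected sender contributes one list, and the receiver sees a single flat list.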
Source Code
To check this component's source code, open document_joiner.py in the Haystack repository.
Usage Examples
Basic Configuration
document_joiner:
  type: haystack.components.joiners.document_joiner.DocumentJoiner

components:
  bm25_retriever:
    type: haystack.components.retrievers.in_memory.InMemoryBM25Retriever
    params:
      document_store:
        type: haystack.document_stores.in_memory.InMemoryDocumentStore
  embedding_retriever:
    type: haystack.components.retrievers.in_memory.InMemoryEmbeddingRetriever
    params:
      document_store:
        type: haystack.document_stores.in_memory.InMemoryDocumentStore
  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
  ranker:
    type: haystack.components.rankers.transformers_similarity.TransformersSimilarityRanker
connections:
  - sender: bm25_retriever.documents
    receiver: document_joiner.documents
  - sender: embedding_retriever.documents
    receiver: document_joiner.documents
  - sender: document_joiner.documents
    receiver: ranker.documents
Parameters
Inputs
| Parameter | Type | Description |
|---|---|---|
| documents | List[Document] | Lists of documents to join together. Accepts multiple connections. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| documents | List[Document] | The combined list of all input documents. |
Init Parameters
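DocumentJoiner requires no initialization parameters. Its optional arguments in the Haystack source, such as `join_mode` and `top_k`, all have defaults.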
Run Method Parameters
DocumentJoiner requires no run-time parameters beyond the connected `documents` inputs.