DeepsetMetadataGrouper
Reorder documents by grouping them on metadata fields. Use this component to keep related document chunks together before sending them to an LLM.
This component is deprecated. Use MetaFieldGroupingRanker from Haystack instead. Existing pipelines that use this component continue to work for now.
Key Features
- Groups documents by a metadata field such as file_id or dokid.
- Creates subgroups within each group using a second metadata field.
- Sorts documents within groups by a numeric metadata value.
- Helps LLMs process related chunks in a logical order.
Configuration
- Drag the DeepsetMetadataGrouper component onto the canvas from the Component Library.
- Click the component to open the configuration panel.
- Set group_by to the primary metadata field for grouping.
- Optionally set subgroup_by and sort_docs_by.
Connections
DeepsetMetadataGrouper accepts a list of documents as input and outputs the same documents, reordered by group and subgroup.
Connect a Ranker or Retriever to the input. Connect the output to a PromptBuilder or to the pipeline output.
Usage Example
This example groups ranked documents by dokid and sorts them by tokennr:
components:
  ranker:
    type: haystack.components.rankers.transformers_similarity.TransformersSimilarityRanker
    init_parameters:
      model: cross-encoder/ms-marco-MiniLM-L-6-v2
      top_k: 15
  metadata_grouper:
    type: deepset_cloud_custom_nodes.rankers.deepset_metadata_grouper.DeepsetMetadataGrouper
    init_parameters:
      group_by: dokid
      subgroup_by:
      sort_docs_by: tokennr
connections:
  - sender: ranker.documents
    receiver: metadata_grouper.documents
outputs:
  documents: metadata_grouper.documents
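To make the reordering concrete, here is a minimal pure-Python sketch of grouping by one metadata key and sorting within each group. It is an illustration, not the component's actual implementation: the Doc class and the group_documents function are hypothetical stand-ins, and the sketch assumes that groups keep the order in which they first appear in the ranked input, that documents missing the group_by key go to the end, and that the sort key holds numeric values.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a Haystack Document (content plus meta)."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def group_documents(
    documents: List[Doc],
    group_by: str,
    subgroup_by: Optional[str] = None,
    sort_docs_by: Optional[str] = None,
) -> List[Doc]:
    """Reorder documents so that members of the same group sit together."""
    # group key -> subgroup key -> documents, in first-appearance order
    groups: Dict[Any, Dict[Any, List[Doc]]] = {}
    ungrouped: List[Doc] = []  # assumption: docs without the group key go last

    for doc in documents:
        key = doc.meta.get(group_by)
        if key is None:
            ungrouped.append(doc)
            continue
        subkey = doc.meta.get(subgroup_by) if subgroup_by else None
        groups.setdefault(key, {}).setdefault(subkey, []).append(doc)

    ordered: List[Doc] = []
    for subgroups in groups.values():
        for docs in subgroups.values():
            if sort_docs_by:
                # Docs without the sort key get +inf, so they land at the end.
                docs = sorted(
                    docs, key=lambda d: d.meta.get(sort_docs_by, float("inf"))
                )
            ordered.extend(docs)
    return ordered + ungrouped


# Ranked input: chunks from two source documents, interleaved by relevance.
ranked = [
    Doc("b2", {"dokid": "B", "tokennr": 2}),
    Doc("a3", {"dokid": "A", "tokennr": 3}),
    Doc("b1", {"dokid": "B", "tokennr": 1}),
    Doc("a1", {"dokid": "A", "tokennr": 1}),
    Doc("a?", {"dokid": "A"}),  # no tokennr, so it sorts last within group A
]

result = group_documents(ranked, group_by="dokid", sort_docs_by="tokennr")
print([d.content for d in result])  # ['b1', 'b2', 'a1', 'a3', 'a?']
```

In this sketch, group B comes first because its best-ranked chunk appeared first in the input, which keeps some of the ranker's relevance signal while restoring reading order inside each group.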
Parameters
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | Documents to group and reorder. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | Documents reordered by group and subgroup. |
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| group_by | str | | Metadata key used to group documents. |
| subgroup_by | Optional[str] | None | Metadata key used to create subgroups within each group. |
| sort_docs_by | Optional[str] | None | Metadata key used to sort documents within each group or subgroup. Documents without this key are placed at the end. |
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | Documents to group and reorder. |