MetaFieldGroupingRanker
Reorder documents by grouping them based on their metadata fields. The component groups documents by a primary metadata key and optionally subgroups them with a secondary key, then outputs a flat ordered list. This helps improve the quality of context provided to LLMs by organizing documents in a structured way.
Key Features
- Groups documents by a primary metadata key (group_by).
- Optionally subgroups documents within each group using a secondary key (subgroup_by).
- Sorts documents within groups by a configurable metadata key (sort_docs_by).
- Outputs a flat list ordered by group and subgroup values.
- Places documents that have no value for the group_by metadata key at the end of the list.
- Gives the LLM a well-organized document context, which can improve the quality of its responses.
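The grouping behavior listed above can be illustrated with a minimal, standalone Python sketch. This is not the Haystack implementation: it uses plain dictionaries as stand-ins for documents, and the function name group_documents is hypothetical. It also assumes that groups and subgroups appear in the order in which they are first seen in the input, that a document counts as ungrouped when the group_by key is missing from its metadata, and that documents missing the sort key go last within their subgroup.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional

# Hypothetical stand-in for a document: a dict with "content" and "meta" keys.
Doc = Dict[str, Any]


def group_documents(
    documents: List[Doc],
    group_by: str,
    subgroup_by: Optional[str] = None,
    sort_docs_by: Optional[str] = None,
) -> List[Doc]:
    """Return the documents as a flat list ordered by group, then subgroup."""
    # Dicts preserve insertion order, so groups and subgroups come out in
    # the order in which they were first encountered (an assumption here).
    groups: Dict[Any, Dict[Any, List[Doc]]] = defaultdict(lambda: defaultdict(list))
    ungrouped: List[Doc] = []

    for doc in documents:
        meta = doc.get("meta", {})
        if group_by not in meta:
            # Documents without the grouping key are collected separately.
            ungrouped.append(doc)
            continue
        subgroup = meta.get(subgroup_by) if subgroup_by else None
        groups[meta[group_by]][subgroup].append(doc)

    ordered: List[Doc] = []
    for subgroups in groups.values():
        for docs in subgroups.values():
            if sort_docs_by:
                # Assumption: documents missing the sort key go last.
                docs = sorted(
                    docs,
                    key=lambda d: (
                        sort_docs_by not in d["meta"],
                        d["meta"].get(sort_docs_by, 0),
                    ),
                )
            ordered.extend(docs)

    # Ungrouped documents are appended at the end of the list.
    return ordered + ungrouped


docs = [
    {"content": "a", "meta": {"group": "42", "subgroup": "x", "split_id": 2}},
    {"content": "b", "meta": {"group": "7", "subgroup": "y", "split_id": 1}},
    {"content": "c", "meta": {"group": "42", "subgroup": "x", "split_id": 1}},
    {"content": "d", "meta": {}},
    {"content": "e", "meta": {"group": "42", "subgroup": "z", "split_id": 3}},
]

ordered = group_documents(docs, "group", "subgroup", "split_id")
print([d["content"] for d in ordered])  # ['c', 'a', 'e', 'b', 'd']
```

In this sample, documents from group 42 stay together even though document b from group 7 sits between them in the input, and document d, which has no group, moves to the end.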
Configuration
- Drag the MetaFieldGroupingRanker component onto the canvas from the Component Library.
- Click on the component to open the configuration panel.
- On the General tab:
  - Set group_by to the metadata key you want to group documents by (required).
  - Optionally set subgroup_by to add a secondary grouping level.
- Go to the Advanced tab to set sort_docs_by if you want to sort documents within each group by a specific metadata key.
Connections
MetaFieldGroupingRanker accepts a list of documents through its documents input. It's typically placed after a retriever or another ranker. It outputs a reordered documents list, which you can send to an LLM component or another processing component.
Source Code
To check this component's source code, open meta_field_grouping_ranker.py in the Haystack repository.
Usage Examples
Basic Configuration
components:
  MetaFieldGroupingRanker:
    type: haystack.components.rankers.meta_field_grouping_ranker.MetaFieldGroupingRanker
    init_parameters:
      group_by: "group"
      subgroup_by: "subgroup"
      sort_docs_by: "split_id"
This groups documents by the group metadata key, subgroups them by the subgroup key, and sorts them by the split_id key.
Parameters
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | The list of documents to group. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | The list of documents ordered by the group_by and subgroup_by metadata values. |
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| group_by | str | | The metadata key to aggregate the documents by. |
| subgroup_by | Optional[str] | None | The metadata key to aggregate the documents within a group created by the group_by key. |
| sort_docs_by | Optional[str] | None | Determines which metadata key is used to sort the documents. If not provided, the documents within the groups or subgroups are not sorted and are kept in the same order as they were inserted in the subgroups. |
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | The list of documents to group. |