MetaFieldGroupingRanker

Reorders the documents by grouping them based on metadata keys.

Basic Information

  • Type: haystack.components.rankers.meta_field_grouping_ranker.MetaFieldGroupingRanker

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| documents | List[Document] | | The list of documents to group. |

Outputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| documents | List[Document] | | The list of documents ordered by the group_by and subgroup_by metadata values. The component returns it in a dictionary under the documents key. |

Overview

Work in Progress

Bear with us while we add pipeline examples and the most common component connections.

The MetaFieldGroupingRanker can group documents by a primary metadata key group_by, and subgroup them with an optional secondary key, subgroup_by. Within each group or subgroup, it can also sort documents by a metadata key sort_docs_by.

The output is a flat list of documents ordered by group_by and subgroup_by values. Any documents without a group are placed at the end of the list.

Organizing documents this way can help improve the efficiency and performance of subsequent processing by an LLM.
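The grouping behavior described above can be sketched in plain Python. This is a minimal illustration, not the component's actual implementation: the function name group_documents, the dict-based documents, and the metadata keys in the sample data are all hypothetical. It also assumes that groups and subgroups keep first-seen order, that documents without the subgroup key share one subgroup, and that documents missing the sort key go last within their subgroup.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional

Doc = Dict[str, Any]  # hypothetical stand-in for a Document: {"content": ..., "meta": {...}}


def group_documents(
    documents: List[Doc],
    group_by: str,
    subgroup_by: Optional[str] = None,
    sort_docs_by: Optional[str] = None,
) -> List[Doc]:
    """Return a flat list ordered by group, then subgroup (illustrative sketch)."""
    groups: Dict[Any, Dict[Any, List[Doc]]] = defaultdict(lambda: defaultdict(list))
    ungrouped: List[Doc] = []

    for doc in documents:
        meta = doc.get("meta", {})
        if group_by not in meta:
            # Documents without a group are collected and appended at the end.
            ungrouped.append(doc)
            continue
        # Assumption: documents lacking the subgroup key share the None subgroup.
        subgroup = meta.get(subgroup_by) if subgroup_by else None
        groups[meta[group_by]][subgroup].append(doc)

    ordered: List[Doc] = []
    for subgroups in groups.values():  # dicts preserve first-seen order
        for docs in subgroups.values():
            if sort_docs_by:
                # Assumption: documents missing the sort key go last in the subgroup.
                docs = sorted(
                    docs,
                    key=lambda d: (
                        sort_docs_by not in d.get("meta", {}),
                        d.get("meta", {}).get(sort_docs_by, 0),
                    ),
                )
            ordered.extend(docs)
    ordered.extend(ungrouped)
    return ordered


docs = [
    {"content": "a", "meta": {"group": "42", "subgroup": "sub1", "split_id": 10}},
    {"content": "b", "meta": {"group": "314", "subgroup": "subA", "split_id": 2}},
    {"content": "c", "meta": {"group": "42", "subgroup": "sub1", "split_id": 2}},
    {"content": "d", "meta": {}},
    {"content": "e", "meta": {"group": "42", "subgroup": "sub2", "split_id": 3}},
]

ordered = group_documents(docs, group_by="group", subgroup_by="subgroup", sort_docs_by="split_id")
print([d["content"] for d in ordered])  # prints ['c', 'a', 'e', 'b', 'd']
```

In the sample data, the two "sub1" documents of group "42" are sorted by split_id, the "sub2" document follows, then group "314", and the document with no group metadata comes last.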

Usage Example

components:
  MetaFieldGroupingRanker:
    type: haystack.components.rankers.meta_field_grouping_ranker.MetaFieldGroupingRanker
    init_parameters:
      group_by: category  # example metadata key; replace with a key from your documents

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| group_by | str | | The metadata key to aggregate the documents by. |
| subgroup_by | Optional[str] | None | The metadata key to aggregate the documents within a group that was created by the group_by key. |
| sort_docs_by | Optional[str] | None | Determines which metadata key is used to sort the documents. If not provided, the documents within the groups or subgroups are not sorted and are kept in the same order as they were inserted in the subgroups. |

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| documents | List[Document] | | The list of documents to group. |