MetadataRouter
Route documents or byte streams to different pipeline branches based on their metadata fields.
Key Features
- Routes documents or byte streams based on metadata filtering rules.
- Supports complex conditions using Haystack's metadata filtering syntax (AND, OR, and comparison operators).
- Routes unmatched items to a dedicated unmatched output.
- Can handle both Document and ByteStream objects in the same configuration.
Configuration
- Drag the MetadataRouter component onto the canvas from the Component Library.
- Click on the component to open the configuration panel.
- On the General tab:
  - Set the Rules dictionary. Keys are output connection names and values are metadata filtering expressions. Items matching a rule go to that named output. Items not matching any rule go to unmatched.
  - Set the Output Type to documents or byte_streams depending on what you're routing (default is documents).
Connections
MetadataRouter receives either a List[Document] or a List[ByteStream] depending on the configured output type. For each rule you define, it creates a named output connection. Items that don't match any rule go to the unmatched output. Connect each named output to the appropriate downstream component.
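The routing behavior described above can be sketched in plain Python. This is a minimal illustration, not the component's implementation: the `route_by_rules` helper, the dict-based items, and the use of predicate functions in place of filter expressions are all hypothetical, and the sketch assumes an item is sent to every rule it matches.

```python
from typing import Any, Callable, Dict, List

# Hypothetical stand-in for a Document or ByteStream: a dict with a "meta" dict.
Item = Dict[str, Any]


def route_by_rules(
    items: List[Item], rules: Dict[str, Callable[[Item], bool]]
) -> Dict[str, List[Item]]:
    """Group items into one list per rule name, plus an "unmatched" list."""
    outputs: Dict[str, List[Item]] = {name: [] for name in rules}
    outputs["unmatched"] = []
    for item in items:
        matched = False
        for name, predicate in rules.items():
            # Assumption: an item goes to every rule it satisfies.
            if predicate(item):
                outputs[name].append(item)
                matched = True
        if not matched:
            outputs["unmatched"].append(item)
    return outputs


items = [
    {"content": "report", "meta": {"lang": "en"}},
    {"content": "bericht", "meta": {"lang": "de"}},
    {"content": "untagged", "meta": {}},
]
rules = {
    "english": lambda item: item["meta"].get("lang") == "en",
    "german": lambda item: item["meta"].get("lang") == "de",
}
routed = route_by_rules(items, rules)
```

Here `routed` has one key per rule name and an extra `unmatched` key, mirroring the named output connections you wire to downstream components.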
Source Code
To check this component's source code, open metadata_router.py in the Haystack repository.
Usage Examples
Basic Configuration
components:
  MetadataRouter:
    type: components.routers.metadata_router.MetadataRouter
    init_parameters:
      output_type: documents # or byte_streams
      rules:
        edge_1:
          operator: AND
          conditions:
            - field: meta.created_at
              operator: ">="
              value: "2023-01-01"
            - field: meta.created_at
              operator: "<"
              value: "2023-04-01"
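To make the edge_1 rule above more concrete, here is a minimal Python sketch of how such a nested filter expression could be evaluated against an item's metadata. The `matches` and `get_field` helpers are hypothetical and are not Haystack's filter implementation. The sketch assumes dates are stored as same-format ISO 8601 strings, so plain string comparison orders them correctly, and it treats a missing field as a non-match.

```python
import operator
from typing import Any, Dict

# Comparison operators supported by this sketch (a subset chosen for illustration).
COMPARATORS = {
    "==": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def get_field(item: Dict[str, Any], dotted_path: str) -> Any:
    """Resolve a dotted path such as "meta.created_at"; return None if absent."""
    value: Any = item
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def matches(rule: Dict[str, Any], item: Dict[str, Any]) -> bool:
    """Evaluate a logical (AND/OR) or comparison rule against one item."""
    op = rule["operator"]
    if op == "AND":
        return all(matches(cond, item) for cond in rule["conditions"])
    if op == "OR":
        return any(matches(cond, item) for cond in rule["conditions"])
    actual = get_field(item, rule["field"])
    if actual is None:
        # Assumption of this sketch: a missing field never matches.
        return False
    return COMPARATORS[op](actual, rule["value"])


# The same rule as edge_1 in the YAML above: created in Q1 2023.
rule = {
    "operator": "AND",
    "conditions": [
        {"field": "meta.created_at", "operator": ">=", "value": "2023-01-01"},
        {"field": "meta.created_at", "operator": "<", "value": "2023-04-01"},
    ],
}

in_range = matches(rule, {"meta": {"created_at": "2023-02-15"}})
too_late = matches(rule, {"meta": {"created_at": "2023-04-01"}})
no_date = matches(rule, {"meta": {}})
```

In this sketch, an item dated 2023-02-15 satisfies both conditions, while one dated exactly 2023-04-01 fails the strict `<` bound and would therefore go to unmatched.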
Parameters
Inputs
| Parameter | Type | Description |
|---|---|---|
| documents | List[Document] | A list of documents to route. |
| byte_streams | List[ByteStream] | A list of byte streams to route. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| (per rule name) | List[Document] or List[ByteStream] | Items matching the corresponding rule. |
| unmatched | List[Document] or List[ByteStream] | Items that don't match any rule. |
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| rules | Dict[str, Dict] | | A dictionary defining routing rules. Keys are output connection names and values are metadata filtering expressions. For example: {"edge_1": {"operator": "AND", "conditions": [{"field": "meta.created_at", "operator": ">=", "value": "2023-01-01"}]}}. Items not matching any rule go to unmatched. |
| output_type | Optional[str] | documents | The type of objects to route. Can be documents or byte_streams. |
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | A list of documents to route. |
| byte_streams | List[ByteStream] | | A list of byte streams to route. |