FileTypeRouter

Categorize files or byte streams by their MIME types and route them to different pipeline branches.

Key Features

  • Routes file paths and ByteStream objects to outputs based on MIME type.
  • Supports exact MIME type matching and regex patterns (for example, audio/.* or text/.*).
  • Infers MIME types from file extensions for file paths and from metadata for byte streams.
  • Supports custom MIME type mappings for unsupported or proprietary file types.
  • Optionally raises an error for non-existent files.
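
The routing idea described above can be sketched with the Python standard library alone. This is an illustration, not the component's implementation: the function name route_by_mime_type and the "unclassified" catch-all bucket are assumptions of this sketch, and patterns are applied with re.fullmatch, so a wildcard subtype is written as image/.* here.

```python
import mimetypes
import re
from collections import defaultdict


def route_by_mime_type(file_names, mime_types):
    """Group file names under the first pattern that fully matches their guessed MIME type."""
    patterns = [(mime_type, re.compile(mime_type)) for mime_type in mime_types]
    routed = defaultdict(list)
    for file_name in file_names:
        # guess_type inspects only the extension; the file is never opened.
        guessed, _encoding = mimetypes.guess_type(file_name)
        for output_name, pattern in patterns:
            if guessed is not None and pattern.fullmatch(guessed):
                routed[output_name].append(file_name)
                break
        else:
            # Assumption of this sketch: unmatched inputs go to a catch-all bucket.
            routed["unclassified"].append(file_name)
    return dict(routed)


routed = route_by_mime_type(
    ["notes.txt", "photo.jpg", "blob.unknownext"],
    ["text/plain", r"image/.*"],
)
```

Each key in the returned dictionary plays the role of one named output; an exact type such as text/plain is simply a regex that matches only itself.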

Configuration

  1. Drag the FileTypeRouter component onto the canvas from the Component Library.
  2. Click on the component to open the configuration panel.
  3. On the General tab:
    • Set MIME Types to the list of MIME types or regex patterns to classify the input files or byte streams (for example, ["text/plain", "audio/x-wav", "image/jpeg"]).
  4. Go to the Advanced tab to configure optional settings:
    • Set Additional MIME Types to add custom MIME type-to-extension mappings for file types not supported by the standard mimetypes module (for example, {"application/vnd.openxmlformats-officedocument.wordprocessingml.document": ".docx"}).
    • Enable Raise on Failure to always raise a FileNotFoundError for non-existent files. When disabled (default), this error is only raised when the meta parameter is provided at runtime.
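
To see what an Additional MIME Types entry does, the following minimal sketch registers a mapping with Python's mimetypes module, which is the module the setting extends. The MIME type application/x-acme-report and the extension .acmerpt are hypothetical and chosen only so that the before and after lookups differ.

```python
import mimetypes

# Hypothetical proprietary type, used only for illustration.
additional_mimetypes = {"application/x-acme-report": ".acmerpt"}

# Before registration the extension is unknown, so the guess is None.
before, _ = mimetypes.guess_type("q3.acmerpt")

for mime_type, extension in additional_mimetypes.items():
    # add_type updates the process-wide registry used by guess_type.
    mimetypes.add_type(mime_type, extension)

after, _ = mimetypes.guess_type("q3.acmerpt")
```

Because the registry is process-wide, a mapping added this way stays in effect for later lookups in the same process.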

Connections

FileTypeRouter receives a list of file paths or ByteStream objects. For each MIME type you specify, it creates a named output connection. Connect each output to the appropriate file converter (for example, connect the text/plain output to TextFileToDocument and the application/pdf output to a PDF converter).

Source Code

To check this component's source code, open file_type_router.py in the Haystack repository.

Usage Examples

Basic Configuration

components:
  FileTypeRouter:
    type: components.routers.file_type_router.FileTypeRouter
    init_parameters:
      mime_types:
        - text/plain
        - audio/x-wav
        - image/jpeg
      raise_on_failure: false

Parameters

Inputs

| Parameter | Type | Description |
| --- | --- | --- |
| sources | List[Union[str, Path, ByteStream]] | A list of file paths or byte streams to categorize. |
| meta | Optional[Union[Dict[str, Any], List[Dict[str, Any]]]] | Optional metadata to attach to the sources. A single dictionary is applied to all sources; a list must match the number of sources. |
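
The rule for meta (one dictionary shared by all sources, or a list with one entry per source) can be sketched as a small normalization step. The helper name normalize_meta and the error message are assumptions made for this example, not the library's API.

```python
from typing import Any, Dict, List, Optional, Union

MetaInput = Optional[Union[Dict[str, Any], List[Dict[str, Any]]]]


def normalize_meta(meta: MetaInput, sources_count: int) -> List[Dict[str, Any]]:
    """Return one metadata dictionary per source."""
    if meta is None:
        return [{} for _ in range(sources_count)]
    if isinstance(meta, dict):
        # A single dictionary is copied so each source gets its own instance.
        return [dict(meta) for _ in range(sources_count)]
    if len(meta) != sources_count:
        raise ValueError(
            f"Expected {sources_count} metadata entries, got {len(meta)}."
        )
    return [dict(item) for item in meta]


shared = normalize_meta({"project": "demo"}, 2)
per_source = normalize_meta([{"page": 1}, {"page": 2}], 2)
```

Copying the shared dictionary keeps a later change to one source's metadata from leaking into the others.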

Outputs

| Parameter | Type | Description |
| --- | --- | --- |
| (per MIME type) | List[ByteStream] | Files or streams matching each specified MIME type, routed to the corresponding named output. |

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| mime_types | List[str] | | A list of MIME types or regex patterns to classify the input files or byte streams (for example, ["text/plain", "audio/x-wav", "image/jpeg"]). |
| additional_mimetypes | Optional[Dict[str, str]] | None | A dictionary of MIME type-to-extension mappings to add to the mimetypes module. Useful for unsupported or non-native file types (for example, {"application/vnd.openxmlformats-officedocument.wordprocessingml.document": ".docx"}). |
| raise_on_failure | bool | False | When True, FileNotFoundError is always raised for non-existent files. When False, this error is only raised when the meta parameter is provided to run(). |
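
One way to picture the raise_on_failure behavior is the sketch below. It assumes that attaching metadata requires reading the file's bytes, which would explain why a missing file still fails when meta is supplied; that mechanism, the load_source helper, and the placeholder file name are assumptions of this example rather than a description of the component's code.

```python
from pathlib import Path


def load_source(path, raise_on_failure=False, meta=None):
    """Return the path untouched, or its bytes plus metadata when meta is given."""
    path = Path(path)
    if raise_on_failure and not path.exists():
        raise FileNotFoundError(f"File not found: {path}")
    if meta:
        # Assumed mechanism: reading the bytes fails for a missing file.
        return {"data": path.read_bytes(), "meta": dict(meta)}
    return path


def raises_file_not_found(path, **kwargs):
    try:
        load_source(path, **kwargs)
    except FileNotFoundError:
        return True
    return False


missing = "no_such_file_for_this_sketch.txt"  # assumed not to exist
outcomes = {
    "strict": raises_file_not_found(missing, raise_on_failure=True),
    "lenient_with_meta": raises_file_not_found(missing, meta={"k": "v"}),
    "lenient_no_meta": raises_file_not_found(missing),
}
```

In this sketch only the lenient call without metadata lets a non-existent path through, since nothing ever touches the file.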

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| sources | List[Union[str, Path, ByteStream]] | | A list of file paths or byte streams to categorize. |
| meta | Optional[Union[Dict[str, Any], List[Dict[str, Any]]]] | None | Optional metadata to attach to the sources. A single dictionary is applied to all sources; a list must match the number of sources. |