ImageFileToImageContent

Convert image files to ImageContent objects for multimodal AI processing, including tasks like image captioning or visual question answering.

ImageFileToImageContent reads image files from various sources and creates ImageContent objects containing base64-encoded image data and associated metadata.
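To make "base64-encoded image data and associated metadata" concrete, here is a minimal standard-library sketch of that kind of conversion. The `SketchImageContent` dataclass and the `encode_image` helper are hypothetical stand-ins written for this illustration. They are not the Haystack classes, and the field names are assumptions.

```python
import base64
import mimetypes
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class SketchImageContent:
    """Hypothetical stand-in for an ImageContent object (not the Haystack class)."""

    base64_image: str
    mime_type: Optional[str] = None
    detail: Optional[str] = None
    meta: Dict[str, Any] = field(default_factory=dict)


def encode_image(data: bytes, file_name: str, detail: Optional[str] = None) -> SketchImageContent:
    """Base64-encode raw image bytes and guess the MIME type from the file name."""
    mime_type, _ = mimetypes.guess_type(file_name)
    encoded = base64.b64encode(data).decode("utf-8")
    return SketchImageContent(encoded, mime_type, detail, {"file_path": file_name})


# The eight-byte PNG signature serves as tiny sample data instead of a real file.
png_signature = b"\x89PNG\r\n\x1a\n"
content = encode_image(png_signature, "example.png", detail="auto")
print(content.mime_type)
print(content.base64_image)
```

Decoding `content.base64_image` with `base64.b64decode` returns the original bytes, which is what lets a model provider reconstruct the image from a text payload.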

Key Features

  • Converts image files to base64-encoded ImageContent objects ready for multimodal AI models.
  • Supports various image formats.
  • Optional image resizing to reduce file size and processing time while maintaining aspect ratio.
  • Configurable detail level for optimization with OpenAI vision models.

Configuration

  1. Drag the ImageFileToImageContent component onto the canvas from the Component Library.
  2. Click on the component to open the configuration panel.
  3. Configure the component settings:
    • Set the Detail level for images (auto, high, or low). This is passed to the created ImageContent objects and is only supported by OpenAI.
    • Set the Size to resize images so they fit within the specified dimensions (width, height) while maintaining aspect ratio. This reduces file size, memory usage, and processing time.
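The effect of Size on image dimensions can be pictured with a little arithmetic. The sketch below computes the largest dimensions that fit inside a target box while keeping the aspect ratio. `fit_within` is a hypothetical helper, not the component's code, and it assumes that images smaller than the target are left at their original size.

```python
from typing import Tuple


def fit_within(original: Tuple[int, int], target: Tuple[int, int]) -> Tuple[int, int]:
    """Return the largest (width, height) that fits inside target and keeps the aspect ratio."""
    width, height = original
    max_width, max_height = target
    # Use the tighter of the two constraints. Assumption: never scale up past the original.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within((4000, 3000), (768, 768)))  # landscape image: width is the limiting side
print(fit_within((600, 1200), (768, 768)))   # portrait image: height is the limiting side
```

Under these assumptions, a 4000×3000 image with a Size of (768, 768) becomes 768×576, so only one side reaches the configured limit.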

Connections

ImageFileToImageContent accepts a list of file paths or ByteStream objects through its sources input. It outputs a list of ImageContent objects.

It typically connects with:

  • FilesInput: receives image file paths.
  • ChatPromptBuilder: sends ImageContent objects to include in multimodal prompts for vision-capable models.

Source Code

To check this component's source code, open file_to_image.py in the Haystack repository.

Usage Examples

Basic Configuration

ImageFileToImageContent:
  type: haystack.components.converters.image.ImageFileToImageContent
  init_parameters:
    detail: auto

Using the Component in a Pipeline

This is an example query pipeline that uses ImageFileToImageContent to convert uploaded images to ImageContent objects for multimodal processing. The images are sent to a ChatPromptBuilder along with a user question, and then processed by a vision-enabled chat generator:

# haystack-pipeline
components:
  ImageFileToImageContent:
    type: haystack.components.converters.image.ImageFileToImageContent
    init_parameters:
      detail: auto
      size:

  ChatPromptBuilder:
    type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
    init_parameters:
      template:
        - _content:
            - text: "You are an AI assistant that can analyze images and answer questions about them."
          _role: system
        - _content:
            - text: "{% for image in images %}{{ image }}{% endfor %}\n\nQuestion: {{ query }}"
          _role: user
      required_variables:
      variables:

  OutputAdapter:
    type: haystack.components.converters.output_adapter.OutputAdapter
    init_parameters:
      template: '{{ replies[0] }}'
      output_type: List[str]
      custom_filters:
      unsafe: false

  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      reference_pattern: acm

  OpenAIChatGenerator:
    type: haystack.components.generators.chat.openai.OpenAIChatGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - OPENAI_API_KEY
        strict: false
      model: gpt-4o
      generation_kwargs:
      streaming_callback:
      tools:

connections:
  - sender: ImageFileToImageContent.image_contents
    receiver: ChatPromptBuilder.images
  - sender: ChatPromptBuilder.prompt
    receiver: OpenAIChatGenerator.messages
  - sender: OpenAIChatGenerator.replies
    receiver: OutputAdapter.replies
  - sender: OutputAdapter.output
    receiver: answer_builder.replies

inputs:
  query:
    - ChatPromptBuilder.query
    - answer_builder.query
  files:
    - ImageFileToImageContent.sources

outputs:
  answers: answer_builder.answers

max_runs_per_component: 100

metadata: {}

Parameters

Inputs

Parameter | Type | Default | Description
--- | --- | --- | ---
sources | List[Union[str, Path, ByteStream]] |  | List of image file paths or ByteStream objects to convert.
meta | Optional[Union[Dict[str, Any], List[Dict[str, Any]]]] | None | Optional metadata to attach to the ImageContent objects. This value can be a list of dictionaries or a single dictionary. If it's a single dictionary, its content is added to the metadata of all produced ImageContent objects. If it's a list, its length must match the number of sources as they're zipped together. For ByteStream objects, their meta is added to the output ImageContent objects.
detail | Optional[Literal["auto", "high", "low"]] | None | Optional detail level of the image (only supported by OpenAI). This is passed to the created ImageContent objects. If not provided, the detail level is the one set in the constructor.
size | Optional[Tuple[int, int]] | None | If provided, resizes the image to fit within the specified dimensions (width, height) while maintaining aspect ratio. If not provided, the size value is the one set in the constructor.
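The two accepted shapes of meta (a single dictionary or a list with one dictionary per source) can be illustrated with a short sketch. `expand_meta` is a hypothetical helper written for this example, not the library's implementation. It only mirrors the rule described for the meta parameter.

```python
from typing import Any, Dict, List, Optional, Union

Meta = Optional[Union[Dict[str, Any], List[Dict[str, Any]]]]


def expand_meta(meta: Meta, sources_count: int) -> List[Dict[str, Any]]:
    """Return one metadata dictionary per source (hypothetical helper)."""
    if meta is None:
        return [{} for _ in range(sources_count)]
    if isinstance(meta, dict):
        # A single dictionary is copied to every source.
        return [dict(meta) for _ in range(sources_count)]
    if len(meta) != sources_count:
        raise ValueError("The length of the meta list must match the number of sources.")
    return [dict(item) for item in meta]


sources = ["cat.png", "dog.jpg"]
shared = expand_meta({"batch": "demo"}, len(sources))
per_file = expand_meta([{"label": "cat"}, {"label": "dog"}], len(sources))

# Each source is paired with its own metadata dictionary.
for source, meta in zip(sources, per_file):
    print(source, meta)
```

In this sketch, passing a list whose length differs from the number of sources raises a `ValueError`, since the two lists could not be zipped together one to one.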

Outputs

Parameter | Type | Description
--- | --- | ---
image_contents | List[ImageContent] | A list of ImageContent objects created from the input image files.

Init Parameters

These are the parameters you can configure in Pipeline Builder:

Parameter | Type | Default | Description
--- | --- | --- | ---
detail | Optional[Literal["auto", "high", "low"]] | None | Optional detail level of the image (only supported by OpenAI). Possible values: "auto", "high", or "low". This is passed to the created ImageContent objects.
size | Optional[Tuple[int, int]] | None | If provided, resizes the image to fit within the specified dimensions (width, height) while maintaining aspect ratio. This reduces file size, memory usage, and processing time.

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

Parameter | Type | Default | Description
--- | --- | --- | ---
sources | List[Union[str, Path, ByteStream]] |  | List of image file paths or ByteStream objects to convert.
meta | Optional[Union[Dict[str, Any], List[Dict[str, Any]]]] | None | Optional metadata to attach to the ImageContent objects. This value can be a list of dictionaries or a single dictionary. If it's a single dictionary, its content is added to the metadata of all produced ImageContent objects. If it's a list, its length must match the number of sources as they're zipped together. For ByteStream objects, their meta is added to the output ImageContent objects.
detail | Optional[Literal["auto", "high", "low"]] | None | Optional detail level of the image (only supported by OpenAI). This is passed to the created ImageContent objects. If not provided, the detail level is the one set in the constructor.
size | Optional[Tuple[int, int]] | None | If provided, resizes the image to fit within the specified dimensions (width, height) while maintaining aspect ratio. If not provided, the size value is the one set in the constructor.
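The fallback rule for detail and size (a value passed to run() takes precedence, otherwise the constructor value applies) can be sketched as follows. `SketchConverter` and its `resolve` method are hypothetical names used only for this illustration. They are not part of the component's API.

```python
from typing import Optional, Tuple


class SketchConverter:
    """Hypothetical stand-in showing how run-time values fall back to init values."""

    def __init__(self, detail: Optional[str] = None, size: Optional[Tuple[int, int]] = None):
        self.detail = detail
        self.size = size

    def resolve(
        self, detail: Optional[str] = None, size: Optional[Tuple[int, int]] = None
    ) -> Tuple[Optional[str], Optional[Tuple[int, int]]]:
        # A value passed at run time wins; otherwise the constructor value is used.
        resolved_detail = detail if detail is not None else self.detail
        resolved_size = size if size is not None else self.size
        return resolved_detail, resolved_size


converter = SketchConverter(detail="auto", size=(768, 768))
print(converter.resolve())              # nothing passed: both init values apply
print(converter.resolve(detail="low"))  # detail overridden, size falls back
```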