AmazonBedrockDocumentImageEmbedder

Compute document embeddings from images using Amazon Bedrock models. Use this component in indexes to create embeddings from images referenced in documents, enabling multimodal semantic search.

Key Features

  • Embeds images referenced in document metadata using Amazon Bedrock models.
  • Supports Amazon Titan and Cohere multimodal embedding models.
  • Stores the computed embedding in the embedding field of each document.
  • Supports optional image resizing while maintaining aspect ratio.
  • Useful for building multimodal search applications where you want to find documents based on image similarity.

Configuration

Authentication

To use this component, you need AWS credentials. Connect Haystack Platform to your AWS account by adding secrets with the following keys: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_DEFAULT_REGION.

For details on how to create secrets, see Add Secrets.

For instructions on using Bedrock models, see Use Amazon Bedrock and SageMaker Models.
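
Before running the index, it can help to confirm that the three secret keys listed above are visible to the process. The following is a minimal sketch using only the Python standard library; `missing_aws_secrets` is a hypothetical helper for illustration and is not part of Haystack or boto3.

```python
import os
from typing import List, Mapping, Optional

# Secret keys named in the Authentication section.
REQUIRED_KEYS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")


def missing_aws_secrets(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required keys that are unset or empty in `env`.

    Hypothetical helper for illustration; defaults to the process environment.
    """
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    missing = missing_aws_secrets()
    print("Missing secrets:", ", ".join(missing) if missing else "none")
```

The helper accepts any mapping, so it can be checked against a plain dictionary without touching real credentials.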

To configure the component in Pipeline Builder:

  1. Drag the AmazonBedrockDocumentImageEmbedder component onto the canvas from the Component Library.
  2. Click the component to open the configuration panel.
  3. On the General tab, select the embedding model from the list.
  4. On the Advanced tab, configure the AWS credentials, file path metadata field, root path, image size, progress bar, and boto3 client settings.

Connections

AmazonBedrockDocumentImageEmbedder accepts a list of documents with image file paths in their metadata as input. It outputs a list of documents with embeddings stored in the embedding field.

Connect a converter like ImageFileToDocument to the documents input to provide image documents for embedding. Connect the documents output to DocumentWriter to store the embedded documents in the document store.
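
To make the expected input concrete, the sketch below models a document as a plain dictionary and shows how an image path could be read from its metadata and joined with an optional root path. It is an illustration only: `resolve_image_path` is a hypothetical helper, the dictionary stands in for a Haystack Document, POSIX-style paths are assumed, and the component's actual resolution logic may differ.

```python
from pathlib import PurePosixPath
from typing import Any, Dict, Optional


def resolve_image_path(
    meta: Dict[str, Any],
    file_path_meta_field: str = "file_path",
    root_path: Optional[str] = None,
) -> str:
    """Look up the image path in document metadata and apply an optional root.

    Hypothetical helper; POSIX-style paths are assumed for simplicity.
    """
    if file_path_meta_field not in meta:
        raise ValueError(f"Document metadata has no '{file_path_meta_field}' field")
    path = PurePosixPath(meta[file_path_meta_field])
    if root_path is not None:
        # Joining with an absolute `path` leaves `path` unchanged.
        path = PurePosixPath(root_path) / path
    return str(path)


# A stand-in for a document produced by a converter such as ImageFileToDocument.
doc = {"content": None, "meta": {"file_path": "products/chair.jpg"}, "embedding": None}
print(resolve_image_path(doc["meta"], root_path="/data/images"))
```

A document whose metadata lacks the configured field raises an error in this sketch, which mirrors the requirement that every input document carries an image file path.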

Usage Example

This is an example indexing pipeline with AmazonBedrockDocumentImageEmbedder for image-based document embedding:

components:
  converter:
    type: haystack.components.converters.image.file_to_document.ImageFileToDocument
    init_parameters: {}

  image_embedder:
    type: haystack_integrations.components.embedders.amazon_bedrock.document_image_embedder.AmazonBedrockDocumentImageEmbedder
    init_parameters:
      model: amazon.titan-embed-image-v1
      aws_access_key_id:
        type: env_var
        env_vars:
          - AWS_ACCESS_KEY_ID
        strict: false
      aws_secret_access_key:
        type: env_var
        env_vars:
          - AWS_SECRET_ACCESS_KEY
        strict: false
      aws_session_token:
        type: env_var
        env_vars:
          - AWS_SESSION_TOKEN
        strict: false
      aws_region_name:
        type: env_var
        env_vars:
          - AWS_DEFAULT_REGION
        strict: false
      aws_profile_name:
        type: env_var
        env_vars:
          - AWS_PROFILE
        strict: false
      file_path_meta_field: file_path
      root_path:
      image_size:
      progress_bar: true
      boto3_config:

  writer:
    type: haystack.components.writers.document_writer.DocumentWriter
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'images'
          max_chunk_bytes: 104857600
          embedding_dim: 1024
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      policy: OVERWRITE

connections:
  - sender: converter.documents
    receiver: image_embedder.documents
  - sender: image_embedder.documents
    receiver: writer.documents

max_runs_per_component: 100

metadata: {}

inputs:
  files:
    - converter.sources
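
When adapting this pipeline, a common mistake is a connection that points at a component name that no longer exists. The sketch below shows one way such a definition could be sanity-checked before upload. It is an illustration only: the dictionary mirrors the component names and connections from the example, and `unknown_endpoints` is a hypothetical helper, not a Haystack API.

```python
from typing import Dict, List

# Component names and connections mirrored from the example pipeline.
pipeline = {
    "components": ["converter", "image_embedder", "writer"],
    "connections": [
        {"sender": "converter.documents", "receiver": "image_embedder.documents"},
        {"sender": "image_embedder.documents", "receiver": "writer.documents"},
    ],
}


def unknown_endpoints(definition: Dict) -> List[str]:
    """Return connection endpoints whose component name is not defined."""
    known = set(definition["components"])
    problems = []
    for connection in definition["connections"]:
        for role in ("sender", "receiver"):
            # Endpoints have the form "<component>.<socket>".
            component_name = connection[role].split(".", 1)[0]
            if component_name not in known:
                problems.append(connection[role])
    return problems


print(unknown_endpoints(pipeline))
```

An empty list means every sender and receiver refers to a declared component.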

Parameters

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| documents | List[Document] | | A list of documents with image file paths in their metadata. |

Outputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| documents | List[Document] | | Documents with embeddings stored in the embedding field. |

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | Literal['amazon.titan-embed-image-v1', 'cohere.embed-english-v3', 'cohere.embed-multilingual-v3'] | | The Bedrock model to use for calculating embeddings. |
| aws_access_key_id | Optional[Secret] | Secret.from_env_var('AWS_ACCESS_KEY_ID') | AWS access key ID. |
| aws_secret_access_key | Optional[Secret] | Secret.from_env_var('AWS_SECRET_ACCESS_KEY') | AWS secret access key. |
| aws_session_token | Optional[Secret] | Secret.from_env_var('AWS_SESSION_TOKEN') | AWS session token for temporary credentials. |
| aws_region_name | Optional[Secret] | Secret.from_env_var('AWS_DEFAULT_REGION') | AWS region name. |
| aws_profile_name | Optional[Secret] | Secret.from_env_var('AWS_PROFILE') | AWS profile name. |
| file_path_meta_field | str | "file_path" | The metadata field in the Document that contains the file path to the image. |
| root_path | Optional[str] | None | The root directory path where document files are located. If provided, file paths in document metadata are resolved relative to this path. |
| image_size | Optional[Tuple[int, int]] | None | If provided, resizes the image to fit within the specified dimensions (width, height) while maintaining aspect ratio. |
| progress_bar | bool | True | If True, shows a progress bar when embedding documents. |
| boto3_config | Optional[Dict[str, Any]] | None | Configuration for the boto3 client. |
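
To illustrate what "fit within the specified dimensions while maintaining aspect ratio" means for image_size, here is a minimal sketch of the arithmetic. It is not the component's implementation: `fit_within` is a hypothetical helper, and the choice to leave smaller images unchanged (no upscaling) is an assumption.

```python
from typing import Tuple


def fit_within(original: Tuple[int, int], box: Tuple[int, int]) -> Tuple[int, int]:
    """Scale (width, height) to fit inside `box` while keeping the aspect ratio.

    Assumption: images smaller than the box are left unchanged (no upscaling).
    """
    width, height = original
    max_width, max_height = box
    if min(width, height, max_width, max_height) <= 0:
        raise ValueError("All dimensions must be positive")
    # The tighter of the two constraints decides the scale factor.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within((4000, 3000), (1024, 1024)))
```

With these inputs, a 4000 x 3000 image limited to 1024 x 1024 comes out as 1024 x 768: the width constraint is the tighter one, and the height is scaled by the same factor.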

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| documents | List[Document] | | A list of documents with image file paths in their metadata. |