AmazonBedrockTextEmbedder

Calculate text embeddings using models available through the Amazon Bedrock API. Use this component to embed strings, such as user queries, for semantic search.

Amazon Bedrock is a fully managed service that makes state-of-the-art language models available for use through a unified API. To learn more, see Amazon Bedrock documentation.

You can use this component with the following models:

  • amazon.titan-embed-text-v1
  • cohere.embed-english-v3
  • cohere.embed-multilingual-v3
  • amazon.titan-embed-text-v2:0

Embedding Models in Query Pipelines and Indexes

The embedding model you use to embed documents in your indexing pipeline must be the same as the embedding model you use to embed the query in your query pipeline.

For example, if you use CohereDocumentEmbedder to embed your documents, use CohereTextEmbedder with the same model to embed your queries.
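One way to guard against a mismatch is to compare the model names configured in the two pipelines before deploying them. The sketch below is illustrative only: `models_match` is a hypothetical helper, not part of any library, and the dictionaries merely mimic the `init_parameters` layout of a component in a pipeline definition.

```python
# Hypothetical helper: checks that the indexing and query embedders
# are configured with the same embedding model.

def models_match(index_embedder: dict, query_embedder: dict) -> bool:
    """Return True when both embedder configs name the same model."""
    index_model = index_embedder["init_parameters"]["model"]
    query_model = query_embedder["init_parameters"]["model"]
    return index_model == query_model


# Toy configs shaped like the init_parameters of a pipeline component.
document_embedder = {"init_parameters": {"model": "cohere.embed-english-v3"}}
text_embedder = {"init_parameters": {"model": "cohere.embed-english-v3"}}
mismatched_embedder = {"init_parameters": {"model": "amazon.titan-embed-text-v1"}}

print(models_match(document_embedder, text_embedder))        # True
print(models_match(document_embedder, mismatched_embedder))  # False
```

A check like this only compares names. It does not verify that an existing index was actually built with the model named in the configuration.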

Key Features

  • Embeds a single string using Amazon Bedrock embedding models.
  • Supports Amazon Titan and Cohere embedding models.
  • Pairs with AmazonBedrockDocumentEmbedder, which embeds documents in indexes.
  • Outputs a list of floats representing the embedding of the input text.

Configuration

Authentication

To use this component, connect Haystack Platform with Amazon Bedrock first. You'll need the region name, access key ID, and secret access key.

For a detailed explanation, see Use Amazon Bedrock and SageMaker Models.

  1. Drag the AmazonBedrockTextEmbedder component onto the canvas from the Component Library.
  2. Click the component to open the configuration panel.
  3. On the General tab:
    1. Select the embedding model from the list. Make sure to use the same model that was used to embed documents in the index.
  4. Go to the Advanced tab to configure the AWS credentials and boto3 client settings.

Connections

AmazonBedrockTextEmbedder accepts a text string as input and outputs a list of floats representing the text embedding.

Connect the pipeline's query input to the text input. Connect the embedding output to an embedding retriever's query_embedding input to find matching documents.
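To make the role of the `query_embedding` input more concrete, here is a minimal sketch of what an embedding retriever conceptually does with it: score each stored document embedding against the query embedding and return the closest ones. This is a toy, in-memory illustration using cosine similarity. It is not how OpenSearch or any particular retriever is implemented, and the function names and three-dimensional vectors are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("Embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_embedding, documents, top_k=10):
    """Return the names of the top_k documents most similar to the query."""
    scored = [
        (cosine_similarity(query_embedding, embedding), name)
        for name, embedding in documents.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


# Toy data: real embeddings have hundreds or thousands of dimensions.
query_embedding = [0.9, 0.1, 0.0]
documents = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.7, 0.7, 0.0],
}

ranked = rank_documents(query_embedding, documents, top_k=2)
print(ranked)  # ['doc_a', 'doc_c']
```

The dimension check also shows why the query and document embedders have to use the same model: vectors of different lengths cannot be compared at all.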

Usage Example

Using the Component in a Pipeline

This is an example of a basic document search pipeline where AmazonBedrockTextEmbedder embeds the query with a Cohere model and sends the resulting embedding to OpenSearchEmbeddingRetriever:

components:
  OpenSearchEmbeddingRetriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 1024
          similarity: cosine
          index: Standard-Index-English
          max_chunk_bytes: 104857600
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          timeout:
      filters:
      top_k: 10
      filter_policy: replace
      custom_query:
      raise_on_failure: true
      efficient_filtering: false

  AmazonBedrockTextEmbedder:
    type: haystack_integrations.components.embedders.amazon_bedrock.text_embedder.AmazonBedrockTextEmbedder
    init_parameters:
      model: cohere.embed-english-v3
      aws_access_key_id:
        type: env_var
        env_vars:
          - AWS_ACCESS_KEY_ID
        strict: false
      aws_secret_access_key:
        type: env_var
        env_vars:
          - AWS_SECRET_ACCESS_KEY
        strict: false
      aws_session_token:
        type: env_var
        env_vars:
          - AWS_SESSION_TOKEN
        strict: false
      aws_region_name:
        type: env_var
        env_vars:
          - AWS_DEFAULT_REGION
        strict: false
      aws_profile_name:
        type: env_var
        env_vars:
          - AWS_PROFILE
        strict: false
      boto3_config:

connections:
  - sender: AmazonBedrockTextEmbedder.embedding
    receiver: OpenSearchEmbeddingRetriever.query_embedding

max_runs_per_component: 100

metadata: {}

inputs:
  query:
    - AmazonBedrockTextEmbedder.text

outputs:
  documents: OpenSearchEmbeddingRetriever.documents

Parameters

Inputs

| Parameter | Type | Default | Description |
|---|---|---|---|
| text | str | | The text to embed. |

Outputs

| Parameter | Type | Default | Description |
|---|---|---|---|
| embedding | List[float] | | The embedding of the input text. |

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
|---|---|---|---|
| model | Literal['amazon.titan-embed-text-v1', 'cohere.embed-english-v3', 'cohere.embed-multilingual-v3', 'amazon.titan-embed-text-v2:0'] | | The embedding model to use. The model has to be specified in the format outlined in the Amazon Bedrock documentation. Make sure the embedding model used for embedding the query is the same as the embedding model used to embed documents in the index. |
| aws_access_key_id | Optional[Secret] | Secret.from_env_var('AWS_ACCESS_KEY_ID', strict=False) | AWS access key ID. |
| aws_secret_access_key | Optional[Secret] | Secret.from_env_var('AWS_SECRET_ACCESS_KEY', strict=False) | AWS secret access key. |
| aws_session_token | Optional[Secret] | Secret.from_env_var('AWS_SESSION_TOKEN', strict=False) | AWS session token. |
| aws_region_name | Optional[Secret] | Secret.from_env_var('AWS_DEFAULT_REGION', strict=False) | AWS region name. |
| aws_profile_name | Optional[Secret] | Secret.from_env_var('AWS_PROFILE', strict=False) | AWS profile name. |
| boto3_config | Optional[Dict[str, Any]] | None | The configuration for the boto3 client. |
| kwargs | Any | | Additional parameters to pass for model inference. For example, input_type and truncate for Cohere models. |
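The credential defaults read from environment variables with `strict=False`, which, as far as the defaults suggest, means a variable that is not set resolves to nothing instead of raising an error. The sketch below approximates that behavior with the standard library only, so you can see which of the variables named in the table are missing in a given environment. `resolve_optional` and `missing_variables` are hypothetical helpers, not part of Haystack. Which variables you actually need depends on how you authenticate with AWS.

```python
import os

# Environment variable names taken from the init parameter defaults.
AWS_ENV_VARS = [
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
    "AWS_DEFAULT_REGION",
    "AWS_PROFILE",
]


def resolve_optional(name, env=None):
    """Non-strict lookup: return the value, or None when the variable is unset."""
    env = os.environ if env is None else env
    return env.get(name)


def missing_variables(env=None):
    """List the AWS variables that are not set in the given environment."""
    return [name for name in AWS_ENV_VARS if resolve_optional(name, env) is None]


# Example with a made-up environment; pass nothing to inspect os.environ instead.
example_env = {
    "AWS_ACCESS_KEY_ID": "example-key-id",
    "AWS_DEFAULT_REGION": "us-east-1",
}
print(missing_variables(example_env))
# ['AWS_SECRET_ACCESS_KEY', 'AWS_SESSION_TOKEN', 'AWS_PROFILE']
```

The helper reports only variable names and never prints their values, which keeps secrets out of logs.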

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
|---|---|---|---|
| text | str | | The input text to embed. |