
AmazonBedrockTextEmbedder

Calculate text embeddings using models available through the Amazon Bedrock API.

Basic Information

  • Type: haystack_integrations.components.embedders.amazon_bedrock.text_embedder.AmazonBedrockTextEmbedder
  • Components it can connect with:
    • Query: AmazonBedrockTextEmbedder receives the query to embed from Query.
    • Embedding Retrievers: AmazonBedrockTextEmbedder can send the embedded query to an embedding retriever that uses it to find matching documents.

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | str | | The text to embed. |

Outputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| embedding | List[float] | | The embedding of the input text. |

Overview

Amazon Bedrock is a fully managed service that makes state-of-the-art language models available through a unified API. To learn more, see the Amazon Bedrock documentation.

You can use this component with the following models:

  • amazon.titan-embed-text-v1
  • cohere.embed-english-v3
  • cohere.embed-multilingual-v3
  • amazon.titan-embed-text-v2:0

Embedding Models in Query Pipelines and Indexes

The embedding model you use to embed documents in your indexing pipeline must be the same as the embedding model you use to embed the query in your query pipeline.

This means the embedders for your indexing and query pipelines must match. For example, if you use CohereDocumentEmbedder to embed your documents, you should use CohereTextEmbedder with the same model to embed your queries.

Use AmazonBedrockTextEmbedder to embed strings, such as a query. Use AmazonBedrockDocumentEmbedder to embed documents.
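The matching rule can be checked mechanically before deploying a pair of pipelines. The following is a minimal sketch, assuming each pipeline's `components` section has already been loaded into a Python dictionary. The helper names `embedder_models` and `embedders_match` are hypothetical and not part of any library, and the type path of the document embedder is assumed by analogy with the text embedder's path.

```python
from typing import Dict, Set


def embedder_models(components: Dict[str, dict]) -> Set[str]:
    """Collect the `model` init parameter of every embedder component."""
    models = set()
    for spec in components.values():
        # Heuristic (an assumption): embedder components have ".embedders." in their type path.
        if ".embedders." in spec.get("type", ""):
            model = (spec.get("init_parameters") or {}).get("model")
            if model:
                models.add(model)
    return models


def embedders_match(indexing_components: Dict[str, dict], query_components: Dict[str, dict]) -> bool:
    """Return True only if both pipelines declare the same, non-empty set of embedding models."""
    indexing_models = embedder_models(indexing_components)
    query_models = embedder_models(query_components)
    return bool(indexing_models) and indexing_models == query_models


# Toy component sections; the document embedder type path is an assumption.
indexing = {
    "AmazonBedrockDocumentEmbedder": {
        "type": "haystack_integrations.components.embedders.amazon_bedrock.document_embedder.AmazonBedrockDocumentEmbedder",
        "init_parameters": {"model": "cohere.embed-english-v3"},
    }
}
query = {
    "AmazonBedrockTextEmbedder": {
        "type": "haystack_integrations.components.embedders.amazon_bedrock.text_embedder.AmazonBedrockTextEmbedder",
        "init_parameters": {"model": "cohere.embed-english-v3"},
    }
}
mismatched_query = {
    "AmazonBedrockTextEmbedder": {
        "type": "haystack_integrations.components.embedders.amazon_bedrock.text_embedder.AmazonBedrockTextEmbedder",
        "init_parameters": {"model": "amazon.titan-embed-text-v1"},
    }
}

print(embedders_match(indexing, query))             # True
print(embedders_match(indexing, mismatched_query))  # False
```

The check compares model names only. It does not verify that the index was actually built with the declared model.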

Authentication

To use this component, connect deepset with Amazon Bedrock first. You'll need:

  • The region name
  • Access key ID
  • Secret access key

Connection Instructions

  1. Click your profile icon in the top right corner and choose Integrations.
  2. Click Connect next to the provider.
  3. Enter the region name, access key ID, and secret access key, then submit them.

For a detailed explanation, see Use Amazon Bedrock and SageMaker Models.

Usage Example

Initializing the Component

components:
  AmazonBedrockTextEmbedder:
    type: haystack_integrations.components.embedders.amazon_bedrock.text_embedder.AmazonBedrockTextEmbedder
    init_parameters:
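For comparison, the same component can be instantiated directly in Python. This is a minimal sketch, assuming the `amazon-bedrock-haystack` package is installed and AWS credentials are available through the environment variables the component reads by default. `SUPPORTED_MODELS` and `embed_query` are hypothetical helper names introduced here for illustration, not part of the integration.

```python
from typing import List

# Model names listed on this page.
SUPPORTED_MODELS = (
    "amazon.titan-embed-text-v1",
    "cohere.embed-english-v3",
    "cohere.embed-multilingual-v3",
    "amazon.titan-embed-text-v2:0",
)


def embed_query(text: str, model: str = "cohere.embed-english-v3") -> List[float]:
    """Embed a single string with AmazonBedrockTextEmbedder (hypothetical helper)."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"Unsupported model: {model!r}")

    # Imported lazily so the validation above works even without the package installed.
    from haystack_integrations.components.embedders.amazon_bedrock.text_embedder import (
        AmazonBedrockTextEmbedder,
    )

    # Credentials and region fall back to the component's environment variable defaults.
    embedder = AmazonBedrockTextEmbedder(model=model)
    # run() takes the text and returns a dictionary with an "embedding" key.
    return embedder.run(text=text)["embedding"]
```

Calling `embed_query("What is Amazon Bedrock?")` sends a request to Amazon Bedrock, so it requires valid credentials and access to the chosen model.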

Using the Component in a Pipeline

This is an example of a basic document search pipeline where AmazonBedrockTextEmbedder uses a Cohere model to embed the query and send the resulting embedding to an Embedding Retriever:

components:
  OpenSearchEmbeddingRetriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 1024
          similarity: cosine
          index: Standard-Index-English
          max_chunk_bytes: 104857600
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          timeout:
      filters:
      top_k: 10
      filter_policy: replace
      custom_query:
      raise_on_failure: true
      efficient_filtering: false

  AmazonBedrockTextEmbedder:
    type: haystack_integrations.components.embedders.amazon_bedrock.text_embedder.AmazonBedrockTextEmbedder
    init_parameters:
      model: cohere.embed-english-v3
      aws_access_key_id:
        type: env_var
        env_vars:
          - AWS_ACCESS_KEY_ID
        strict: false
      aws_secret_access_key:
        type: env_var
        env_vars:
          - AWS_SECRET_ACCESS_KEY
        strict: false
      aws_session_token:
        type: env_var
        env_vars:
          - AWS_SESSION_TOKEN
        strict: false
      aws_region_name:
        type: env_var
        env_vars:
          - AWS_DEFAULT_REGION
        strict: false
      aws_profile_name:
        type: env_var
        env_vars:
          - AWS_PROFILE
        strict: false
      boto3_config:

connections:
  - sender: AmazonBedrockTextEmbedder.embedding
    receiver: OpenSearchEmbeddingRetriever.query_embedding

max_runs_per_component: 100

metadata: {}

inputs:
  query:
    - AmazonBedrockTextEmbedder.text

outputs:
  documents: OpenSearchEmbeddingRetriever.documents
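To illustrate what the retriever does with the query embedding it receives, here is a brute-force sketch of cosine-similarity ranking with a `top_k` cutoff. It is a conceptual illustration only, not how OpenSearch implements vector search. The function names and the toy two-dimensional vectors are assumptions made for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("Embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_documents(
    query_embedding: Sequence[float],
    doc_embeddings: Dict[str, Sequence[float]],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Score every document against the query and keep the top_k best matches."""
    scored = [
        (doc_id, cosine_similarity(query_embedding, embedding))
        for doc_id, embedding in doc_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings; real ones have many more dimensions (embedding_dim is 1024 in the pipeline above).
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
results = top_k_documents([1.0, 0.0], docs, top_k=2)
print([doc_id for doc_id, _ in results])  # ['a', 'c']
```

The dimension check also shows why the query and the documents must be embedded with the same model: vectors of different sizes cannot be compared at all.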

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | Literal['amazon.titan-embed-text-v1', 'cohere.embed-english-v3', 'cohere.embed-multilingual-v3', 'amazon.titan-embed-text-v2:0'] | | The embedding model to use. The model has to be specified in the format outlined in the Amazon Bedrock documentation. Make sure the embedding model used for embedding the query is the same as the embedding model used to embed documents in the index. |
| aws_access_key_id | Optional[Secret] | Secret.from_env_var('AWS_ACCESS_KEY_ID', strict=False) | AWS access key ID. |
| aws_secret_access_key | Optional[Secret] | Secret.from_env_var('AWS_SECRET_ACCESS_KEY', strict=False) | AWS secret access key. |
| aws_session_token | Optional[Secret] | Secret.from_env_var('AWS_SESSION_TOKEN', strict=False) | AWS session token. |
| aws_region_name | Optional[Secret] | Secret.from_env_var('AWS_DEFAULT_REGION', strict=False) | AWS region name. |
| aws_profile_name | Optional[Secret] | Secret.from_env_var('AWS_PROFILE', strict=False) | AWS profile name. |
| boto3_config | Optional[Dict[str, Any]] | None | The configuration for the boto3 client. |
| kwargs | Any | | Additional parameters to pass for model inference. For example, input_type and truncate for Cohere models. |
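The credential defaults all read environment variables with `strict=False`, which means a missing variable resolves to nothing instead of raising an error. The sketch below mimics that behavior in plain Python to help check a local environment. It is a stand-in written for illustration and does not use Haystack's `Secret` class. `resolve_env_secret` and `unset_aws_variables` are hypothetical names.

```python
import os
from typing import List, Optional

# Environment variable names taken from the defaults listed above.
AWS_ENV_VARS = [
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
    "AWS_DEFAULT_REGION",
    "AWS_PROFILE",
]


def resolve_env_secret(name: str, strict: bool = False) -> Optional[str]:
    """Return the variable's value; if unset, return None, or raise when strict is True."""
    value = os.environ.get(name)
    if value is None and strict:
        raise ValueError(f"Environment variable {name} is not set")
    return value


def unset_aws_variables() -> List[str]:
    """List the AWS-related variables that are currently not set."""
    return [name for name in AWS_ENV_VARS if resolve_env_secret(name) is None]


if __name__ == "__main__":
    print("Unset variables:", unset_aws_variables())
```

Not every variable has to be set. For example, a session token is only needed with temporary credentials, which is why a non-strict lookup is a reasonable default.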

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | str | | The input text to embed. |