Use Amazon Bedrock and SageMaker Models
You can use models hosted in your own Bedrock or SageMaker account or in deepset's account.
You can use embedding models and LLMs hosted on Amazon Bedrock through Bedrock's API. For a full list of supported models, see the Amazon Bedrock documentation.
Prerequisites
To use models through your own Amazon Bedrock or SageMaker account, you must have valid AWS credentials:
- Access key ID
- Secret access key
For details, see Amazon Bedrock model access and the Amazon Bedrock documentation.
Using Bedrock Models
Using Bedrock Models Through deepset's Account
If you don't have a Bedrock account, you can use LLMs hosted there through DeepsetAmazonBedrockGenerator. It uses deepset's Bedrock account, so you don't need to create your own. If you use DeepsetAmazonBedrockGenerator, you don't need to connect deepset Cloud to Bedrock.
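As a rough sketch, a generator entry that uses deepset's account might look like the fragment below. The type path is an assumption and is not confirmed on this page, so check it against the DeepsetAmazonBedrockGenerator component reference before using it. The model ID is borrowed from the usage example further down, and the component may need additional parameters.

```yaml
components:
  generator:
    # ASSUMPTION: this type path is a guess at where the component lives; verify it in the component reference.
    type: deepset_cloud_custom_nodes.generators.deepset_amazon_bedrock_generator.DeepsetAmazonBedrockGenerator
    init_parameters:
      model: "anthropic.claude-v2" # illustrative model ID, taken from the usage example on this page
```

No entry on the Connections page is needed for this variant, because the component runs against deepset's Bedrock account.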
Using Bedrock Models Through Your Own Account
First, connect deepset Cloud to Amazon Bedrock by passing your Bedrock API key on the Connections page:
- Click your initials in the top right corner and select Connections.
- Click Connect next to the provider.
- Enter your user access token and submit it.
Then, add a component that uses the Bedrock model to your pipeline. Here is a list of components by the model type they use:
- Embedding models (used to calculate embeddings for text):
  - AmazonBedrockTextEmbedder: Calculates embeddings for text, such as a query. Often used in query pipelines to embed the query and then pass it to an embedding retriever.
  - AmazonBedrockDocumentEmbedder: Calculates embeddings for documents. Often used in indexing pipelines to embed documents and pass them to DocumentWriter.
Embedding Models in Query and Indexing Pipelines
The embedding model you use to embed documents in your indexing pipeline must be the same as the embedding model you use to embed the query in your query pipeline.
- LLMs:
  - AmazonBedrockGenerator: Generates text, often used in RAG pipelines.
Usage Examples
This is an example of how to use embedding models and an LLM hosted on Bedrock in indexing and query pipelines. The indexing pipeline comes first, followed by the query pipeline:
# Indexing pipeline
components:
  ...
  splitter:
    type: haystack.components.preprocessors.document_splitter.DocumentSplitter
    init_parameters:
      split_by: word
      split_length: 250
      split_overlap: 30
  document_embedder:
    type: haystack_integrations.components.embedders.amazon_bedrock.document_embedder.AmazonBedrockDocumentEmbedder
    init_parameters:
      model: "cohere.embed-english-v3"
  writer:
    type: haystack.components.writers.document_writer.DocumentWriter
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          embedding_dim: 1024 # Must match the embedder's output size; cohere.embed-english-v3 returns 1024-dimensional vectors
          similarity: cosine
      policy: OVERWRITE
connections: # Defines how the components are connected
  ...
  - sender: splitter.documents
    receiver: document_embedder.documents
  - sender: document_embedder.documents
    receiver: writer.documents
# Query pipeline
components:
  ...
  query_embedder:
    type: haystack_integrations.components.embedders.amazon_bedrock.text_embedder.AmazonBedrockTextEmbedder
    init_parameters:
      model: "cohere.embed-english-v3"
  retriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        init_parameters:
          use_ssl: True
          verify_certs: False
          http_auth:
            - "${OPENSEARCH_USER}"
            - "${OPENSEARCH_PASSWORD}"
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
      top_k: 20
  prompt_builder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: |-
        You are a technical expert.
        You answer questions truthfully based on provided documents.
        For each document check whether it is related to the question.
        Only use documents that are related to the question to answer it.
        Ignore documents that are not related to the question.
        If the answer exists in several documents, summarize them.
        Only answer based on the documents provided. Don't make things up.
        If the documents can't answer the question or you are unsure say: 'The answer can't be found in the text'.
        These are the documents:
        {% for document in documents %}
        Document[{{ loop.index }}]:
        {{ document.content }}
        {% endfor %}
        Question: {{question}}
        Answer:
  generator:
    type: haystack_integrations.components.generators.amazon_bedrock.generator.AmazonBedrockGenerator
    init_parameters:
      model: "anthropic.claude-v2"
      kwargs:
        temperature: 0.0
  answer_builder:
    init_parameters: {}
    type: haystack.components.builders.answer_builder.AnswerBuilder
connections: # Defines how the components are connected
  ...
  - sender: query_embedder.embedding # AmazonBedrockTextEmbedder sends the embedded query to the retriever
    receiver: retriever.query_embedding
  - sender: retriever.documents
    receiver: prompt_builder.documents
  - sender: prompt_builder.prompt
    receiver: generator.prompt
  - sender: generator.replies
    receiver: answer_builder.replies
  ...
inputs:
  query:
    ...
    # AmazonBedrockTextEmbedder needs the query as input. It's not getting it from any
    # component it's connected to, so it needs to receive it from the pipeline.
    - "query_embedder.text"
    - "retriever.query"
    - "prompt_builder.question"
    - "answer_builder.query"
...
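The rule that both pipelines must use the same embedding model also applies when you switch models. The fragment below is a minimal sketch of such a switch, reusing the component names from the examples above. The choice of amazon.titan-embed-text-v1 is only illustrative and does not come from this page; that model returns 1536-dimensional vectors, so the document store's embedding_dim changes along with both embedders.

```yaml
# Indexing pipeline (fragment)
components:
  document_embedder:
    type: haystack_integrations.components.embedders.amazon_bedrock.document_embedder.AmazonBedrockDocumentEmbedder
    init_parameters:
      model: "amazon.titan-embed-text-v1" # illustrative model choice, not taken from this page
  writer:
    type: haystack.components.writers.document_writer.DocumentWriter
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          embedding_dim: 1536 # output size of amazon.titan-embed-text-v1
          similarity: cosine
      policy: OVERWRITE
---
# Query pipeline (fragment)
components:
  query_embedder:
    type: haystack_integrations.components.embedders.amazon_bedrock.text_embedder.AmazonBedrockTextEmbedder
    init_parameters:
      model: "amazon.titan-embed-text-v1" # same model as document_embedder
```

Embeddings produced by different models are not comparable, so documents indexed with the previous model would need to be indexed again after a switch like this.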
Using SageMaker Models
You can use LLMs hosted on SageMaker through the SagemakerGenerator component. Pass the model endpoint in the model parameter. If your model requires custom attributes, you can specify them in the aws_custom_attributes parameter. The Llama 2 family of models requires the custom attribute accept_eula: True to work. See the usage example below for more details.
Usage Example
To use an LLM hosted on SageMaker in your query pipeline, pass the model in the generator's model parameter:
components:
  ...
  prompt_builder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: |-
        You are a technical expert.
        You answer questions truthfully based on provided documents.
        For each document check whether it is related to the question.
        Only use documents that are related to the question to answer it.
        Ignore documents that are not related to the question.
        If the answer exists in several documents, summarize them.
        Only answer based on the documents provided. Don't make things up.
        If the documents can't answer the question or you are unsure say: 'The answer can't be found in the text'.
        These are the documents:
        {% for document in documents %}
        Document[{{ loop.index }}]:
        {{ document.content }}
        {% endfor %}
        Question: {{question}}
        Answer:
  generator:
    type: haystack_integrations.components.generators.amazon_sagemaker.sagemaker.SagemakerGenerator
    init_parameters:
      model: jumpstart-dft-hf-llm-llama2-7b-instruct-bf16 # The SageMaker model endpoint
      generation_kwargs:
        temperature: 0.0
      aws_custom_attributes:
        accept_eula: True # Required by the Llama 2 family of models
  ...
connections:
  ...
  - sender: prompt_builder.prompt
    receiver: generator.prompt
  ...
inputs:
  ...
  query:
    - "prompt_builder.question"
  ...
outputs:
  ...
  answers: "generator.replies"