AnthropicVertexChatGenerator
Generate text using Claude models through the Anthropic Vertex AI API.
Key Features
- Chat completion using Claude models (including Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku) through Vertex AI
- Streaming support for real-time token-by-token responses
- Tool/function calling support
- Configurable generation parameters (temperature, top_p, max_tokens, and more)
- Requires a GCP project with Vertex AI enabled and the desired Claude model activated in the Vertex AI Model Garden
Configuration
- Drag the AnthropicVertexChatGenerator component onto the canvas from the Component Library.
- Click the component to open the configuration panel.
- On the General tab:
  - Enter your GCP region (defaults to "us-central1") and project ID. Create secrets called REGION and PROJECT_ID for these values. For details, see Add a secret.
  - Select a model. Make sure the model is activated in the Vertex AI Model Garden. For details, see Use Anthropic Models.
- Go to the Advanced tab to configure generation parameters, timeout, max retries, tools, and streaming.
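As a rough sketch, the General tab settings could map to YAML like the fragment below. The project ID `my-gcp-project` is a hypothetical placeholder, and setting `region` and `project_id` inline is only one option. The pipeline example on this page leaves both empty, which presumably relies on the REGION and PROJECT_ID secrets instead.

```yaml
AnthropicVertexChatGenerator:
  type: haystack_integrations.components.generators.anthropic.chat.vertex_chat_generator.AnthropicVertexChatGenerator
  init_parameters:
    region: us-central1                 # GCP region; the documented default
    project_id: my-gcp-project          # hypothetical placeholder, replace with your own project ID
    model: claude-3-5-sonnet@20240620   # must be activated in the Vertex AI Model Garden
```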
Connections
AnthropicVertexChatGenerator accepts a list of ChatMessage objects through its messages input and outputs generated responses as replies (a list of ChatMessage instances).
Connect ChatPromptBuilder's prompt output to this component's messages input. Connect the replies output to DeepsetAnswerBuilder through OutputAdapter.
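The wiring described above can be sketched in YAML as follows. The component names (`ChatPromptBuilder`, `OutputAdapter`, `answer_builder`) are assumptions taken from the pipeline example on this page; substitute the names used in your own pipeline.

```yaml
connections:
  # The prompt builder produces the list of ChatMessage objects
  - sender: ChatPromptBuilder.prompt
    receiver: AnthropicVertexChatGenerator.messages
  # The generator's replies pass through OutputAdapter...
  - sender: AnthropicVertexChatGenerator.replies
    receiver: OutputAdapter.replies
  # ...before reaching the answer builder
  - sender: OutputAdapter.output
    receiver: answer_builder.replies
```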
Source Code
To check this component's source code, open vertex_chat_generator.py in the Haystack Core Integrations repository.
Usage Examples
Basic Configuration
AnthropicVertexChatGenerator:
  type: haystack_integrations.components.generators.anthropic.chat.vertex_chat_generator.AnthropicVertexChatGenerator
  init_parameters:
    model: claude-3-5-sonnet@20240620
    ignore_tools_thinking_messages: true
Using the Component in a Pipeline
This is an example of a RAG pipeline with AnthropicVertexChatGenerator:
components:
  bm25_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'Standard-Index-English'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20 # The number of results to return
      fuzziness: 0
  query_embedder:
    type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
    init_parameters:
      normalize_embeddings: true
      model: intfloat/e5-base-v2
  embedding_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'Standard-Index-English'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20 # The number of results to return
  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate
  ranker:
    type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
    init_parameters:
      model: intfloat/simlm-msmarco-reranker
      top_k: 8
  meta_field_grouping_ranker:
    type: haystack.components.rankers.meta_field_grouping_ranker.MetaFieldGroupingRanker
    init_parameters:
      group_by: file_id
      subgroup_by:
      sort_docs_by: split_id
  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      reference_pattern: acm
  ChatPromptBuilder:
    type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
    init_parameters:
      template:
        - _content:
            - text: "You are a helpful assistant answering the user's questions based on the provided documents.\nIf the answer is not in the documents, rely on the web_search tool to find information.\nDo not use your own knowledge.\n"
          _role: system
        - _content:
            - text: "Provided documents:\n{% for document in documents %}\nDocument [{{ loop.index }}] :\n{{ document.content }}\n{% endfor %}\n\nQuestion: {{ query }}\n"
          _role: user
      required_variables:
      variables:
  OutputAdapter:
    type: haystack.components.converters.output_adapter.OutputAdapter
    init_parameters:
      template: '{{ replies[0] }}'
      output_type: List[str]
      custom_filters:
      unsafe: false
  AnthropicVertexChatGenerator:
    type: haystack_integrations.components.generators.anthropic.chat.vertex_chat_generator.AnthropicVertexChatGenerator
    init_parameters:
      region:
      project_id:
      model: claude-3-5-sonnet@20240620
      streaming_callback:
      generation_kwargs:
      ignore_tools_thinking_messages: true
      tools:

connections: # Defines how the components are connected
  - sender: bm25_retriever.documents
    receiver: document_joiner.documents
  - sender: query_embedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: embedding_retriever.documents
    receiver: document_joiner.documents
  - sender: document_joiner.documents
    receiver: ranker.documents
  - sender: ranker.documents
    receiver: meta_field_grouping_ranker.documents
  - sender: meta_field_grouping_ranker.documents
    receiver: answer_builder.documents
  - sender: meta_field_grouping_ranker.documents
    receiver: ChatPromptBuilder.documents
  - sender: OutputAdapter.output
    receiver: answer_builder.replies
  - sender: AnthropicVertexChatGenerator.replies
    receiver: OutputAdapter.replies
  - sender: ChatPromptBuilder.prompt
    receiver: AnthropicVertexChatGenerator.messages

inputs: # Define the inputs for your pipeline
  query: # These components will receive the query as input
    - "bm25_retriever.query"
    - "query_embedder.text"
    - "ranker.query"
    - "answer_builder.query"
    - "ChatPromptBuilder.query"
  filters: # These components will receive a potential query filter as input
    - "bm25_retriever.filters"
    - "embedding_retriever.filters"

outputs: # Defines the output of your pipeline
  documents: "meta_field_grouping_ranker.documents" # The output of the pipeline is the retrieved documents
  answers: "answer_builder.answers" # The output of the pipeline is the generated answers

max_runs_per_component: 100

metadata: {}
Parameters
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| messages | List[ChatMessage] | | A list of ChatMessage instances representing the input messages. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function that is called when a new token is received from the stream. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Optional arguments to pass to the Anthropic generation endpoint. |
| tools | Optional[Union[List[Tool], Toolset]] | None | A list of Tool objects or a Toolset that the model can use. If set, it overrides the tools parameter set during initialization. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| replies | List[ChatMessage] | The responses from the model. |
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| region | str | | The region where the Anthropic model is deployed. Defaults to "us-central1". |
| project_id | str | | The GCP project ID where the Anthropic model is deployed. |
| model | str | claude-sonnet-4@20250514 | The name of the model to use. |
| streaming_callback | Optional[Callable[[StreamingChunk], None]] | None | A callback function that is called when a new token is received from the stream. The callback function accepts StreamingChunk as an argument. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Other parameters to use for the model. These parameters are all sent directly to the AnthropicVertex endpoint. See Anthropic documentation for more details. Supported generation_kwargs parameters are: system (the system message), max_tokens (the maximum number of tokens to generate), metadata (a dictionary of metadata), stop_sequences (a list of strings that the model should stop generating at), temperature (the temperature to use for sampling), top_p, top_k, extra_headers (a dictionary of extra headers for beta features). |
| ignore_tools_thinking_messages | bool | True | If True, drops "chain of thought" thinking messages when tool use is detected. See the Anthropic tools for more details. |
| tools | Optional[List[Tool]] | None | A list of Tool objects that the model can use. Each tool should have a unique name. |
| timeout | Optional[float] | None | Timeout for Anthropic client calls. If not set, defaults to the Anthropic client's default. |
| max_retries | Optional[int] | None | Maximum number of retries to attempt for failed requests. If not set, defaults to the Anthropic client's default. |
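To illustrate how the parameters in this table fit together, here is a minimal sketch of a component definition that sets `generation_kwargs`, `timeout`, and `max_retries`. All values are illustrative assumptions rather than recommended settings, and the stop sequence `END_OF_ANSWER` is a hypothetical string. The timeout is assumed to be in seconds, following the Anthropic client's convention.

```yaml
AnthropicVertexChatGenerator:
  type: haystack_integrations.components.generators.anthropic.chat.vertex_chat_generator.AnthropicVertexChatGenerator
  init_parameters:
    model: claude-3-5-sonnet@20240620
    generation_kwargs:
      max_tokens: 1024          # illustrative cap on the number of generated tokens
      temperature: 0.2          # illustrative; lower values make sampling less random
      stop_sequences:
        - "END_OF_ANSWER"       # hypothetical string at which generation stops
    ignore_tools_thinking_messages: true
    timeout: 60                 # illustrative; assumed to be seconds
    max_retries: 3              # illustrative retry count for failed requests
```

Parameters left out of `init_parameters` keep the defaults listed in the table.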
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|