AnthropicVertexChatGenerator

Generate text using Claude models through the Anthropic Vertex AI API.

Key Features

  • Chat completion using Claude models (including Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku) through Vertex AI
  • Streaming support for real-time token-by-token responses
  • Tool/function calling support
  • Configurable generation parameters (temperature, top_p, max_tokens, and more)
  • Requires a GCP project with Vertex AI enabled and the desired Claude model activated in the Vertex AI Model Garden

Configuration

  1. Drag the AnthropicVertexChatGenerator component onto the canvas from the Component Library.
  2. Click on the component to open the configuration panel.
  3. On the General tab:
    1. Enter your GCP region (defaults to "us-central1") and project ID. Create secrets called REGION and PROJECT_ID for these values. For details, see Add a secret.
    2. Select a model. Make sure the model is activated in the Vertex AI Model Garden. For details, see Use Anthropic Models.
  4. Go to the Advanced tab to configure generation parameters, timeout, max retries, tools, and streaming.

Connections

AnthropicVertexChatGenerator accepts a list of ChatMessage objects through its messages input and outputs generated responses as replies (a list of ChatMessage instances).

Connect ChatPromptBuilder's prompt output to this component's messages input. Connect the replies output to DeepsetAnswerBuilder through OutputAdapter.
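
As a sketch of the wiring described above, the relevant part of a pipeline's connections section could look like this. The component names (ChatPromptBuilder, OutputAdapter, answer_builder) are the ones used in the example pipeline on this page; your own names may differ.

```yaml
connections:
  # The prompt builder supplies the list of ChatMessage objects
  - sender: ChatPromptBuilder.prompt
    receiver: AnthropicVertexChatGenerator.messages
  # The generator's replies pass through OutputAdapter on the way to the answer builder
  - sender: AnthropicVertexChatGenerator.replies
    receiver: OutputAdapter.replies
  - sender: OutputAdapter.output
    receiver: answer_builder.replies
```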

Source Code

To check this component's source code, open vertex_chat_generator.py in the Haystack Core Integrations repository.

Usage Examples

Basic Configuration

  AnthropicVertexChatGenerator:
    type: haystack_integrations.components.generators.anthropic.chat.vertex_chat_generator.AnthropicVertexChatGenerator
    init_parameters:
      model: claude-3-5-sonnet@20240620
      ignore_tools_thinking_messages: true

Using the Component in a Pipeline

This is an example of a RAG pipeline with AnthropicVertexChatGenerator:

components:
  bm25_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'Standard-Index-English'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20 # The number of results to return
      fuzziness: 0

  query_embedder:
    type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
    init_parameters:
      normalize_embeddings: true
      model: intfloat/e5-base-v2

  embedding_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'Standard-Index-English'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20 # The number of results to return

  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate

  ranker:
    type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
    init_parameters:
      model: intfloat/simlm-msmarco-reranker
      top_k: 8

  meta_field_grouping_ranker:
    type: haystack.components.rankers.meta_field_grouping_ranker.MetaFieldGroupingRanker
    init_parameters:
      group_by: file_id
      subgroup_by:
      sort_docs_by: split_id

  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      reference_pattern: acm

  ChatPromptBuilder:
    type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
    init_parameters:
      template:
        - _content:
            - text: "You are a helpful assistant answering the user's questions based on the provided documents.\nIf the answer is not in the documents, rely on the web_search tool to find information.\nDo not use your own knowledge.\n"
          _role: system
        - _content:
            - text: "Provided documents:\n{% for document in documents %}\nDocument [{{ loop.index }}] :\n{{ document.content }}\n{% endfor %}\n\nQuestion: {{ query }}\n"
          _role: user
      required_variables:
      variables:

  OutputAdapter:
    type: haystack.components.converters.output_adapter.OutputAdapter
    init_parameters:
      template: '{{ replies[0] }}'
      output_type: List[str]
      custom_filters:
      unsafe: false

  AnthropicVertexChatGenerator:
    type: haystack_integrations.components.generators.anthropic.chat.vertex_chat_generator.AnthropicVertexChatGenerator
    init_parameters:
      region:
      project_id:
      model: claude-3-5-sonnet@20240620
      streaming_callback:
      generation_kwargs:
      ignore_tools_thinking_messages: true
      tools:

connections: # Defines how the components are connected
  - sender: bm25_retriever.documents
    receiver: document_joiner.documents
  - sender: query_embedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: embedding_retriever.documents
    receiver: document_joiner.documents
  - sender: document_joiner.documents
    receiver: ranker.documents
  - sender: ranker.documents
    receiver: meta_field_grouping_ranker.documents
  - sender: meta_field_grouping_ranker.documents
    receiver: answer_builder.documents
  - sender: meta_field_grouping_ranker.documents
    receiver: ChatPromptBuilder.documents
  - sender: OutputAdapter.output
    receiver: answer_builder.replies
  - sender: AnthropicVertexChatGenerator.replies
    receiver: OutputAdapter.replies
  - sender: ChatPromptBuilder.prompt
    receiver: AnthropicVertexChatGenerator.messages

inputs: # Define the inputs for your pipeline
  query: # These components will receive the query as input
    - "bm25_retriever.query"
    - "query_embedder.text"
    - "ranker.query"
    - "answer_builder.query"
    - "ChatPromptBuilder.query"
  filters: # These components will receive a potential query filter as input
    - "bm25_retriever.filters"
    - "embedding_retriever.filters"

outputs: # Defines the output of your pipeline
  documents: "meta_field_grouping_ranker.documents" # The output of the pipeline is the retrieved documents
  answers: "answer_builder.answers" # The output of the pipeline is the generated answers

max_runs_per_component: 100

metadata: {}

Parameters

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| messages | List[ChatMessage] | | A list of ChatMessage instances representing the input messages. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function that is called when a new token is received from the stream. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Optional arguments to pass to the Anthropic generation endpoint. |
| tools | Optional[Union[List[Tool], Toolset]] | None | A list of Tool objects or a Toolset that the model can use. If set, it overrides the tools parameter set during initialization. |

Outputs

| Parameter | Type | Description |
| --- | --- | --- |
| replies | List[ChatMessage] | The responses from the model. |

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| region | str | | The region where the Anthropic model is deployed. Defaults to "us-central1". |
| project_id | str | | The GCP project ID where the Anthropic model is deployed. |
| model | str | claude-sonnet-4@20250514 | The name of the model to use. |
| streaming_callback | Optional[Callable[[StreamingChunk], None]] | None | A callback function that is called when a new token is received from the stream. The callback function accepts StreamingChunk as an argument. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Other parameters to use for the model. These parameters are all sent directly to the AnthropicVertex endpoint. See the Anthropic documentation for more details. Supported generation_kwargs parameters are: system (the system message), max_tokens (the maximum number of tokens to generate), metadata (a dictionary of metadata), stop_sequences (a list of strings at which the model should stop generating), temperature (the temperature to use for sampling), top_p, top_k, extra_headers (a dictionary of extra headers for beta features). |
| ignore_tools_thinking_messages | bool | True | If True, drops "chain of thought" thinking messages when tool use is detected. See the Anthropic tools documentation for more details. |
| tools | Optional[List[Tool]] | None | A list of Tool objects that the model can use. Each tool should have a unique name. |
| timeout | Optional[float] | None | Timeout for Anthropic client calls. If not set, defaults to the Anthropic client's default. |
| max_retries | Optional[int] | None | Maximum number of retries to attempt for failed requests. If not set, defaults to the Anthropic client's default. |
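
To illustrate how the init parameters listed above fit together, here is a minimal sketch of a component definition that sets generation parameters, a timeout, and a retry limit. All values are illustrative assumptions rather than recommendations, the stop string is hypothetical, and leaving region and project_id empty assumes they are supplied through the REGION and PROJECT_ID secrets.

```yaml
AnthropicVertexChatGenerator:
  type: haystack_integrations.components.generators.anthropic.chat.vertex_chat_generator.AnthropicVertexChatGenerator
  init_parameters:
    region:       # left empty; assumed to come from the REGION secret
    project_id:   # left empty; assumed to come from the PROJECT_ID secret
    model: claude-3-5-sonnet@20240620
    generation_kwargs:
      max_tokens: 1024          # illustrative cap on generated tokens
      temperature: 0.2          # illustrative sampling temperature
      stop_sequences:
        - "END_OF_ANSWER"       # hypothetical stop string
    ignore_tools_thinking_messages: true
    timeout: 60                 # illustrative; assumed to be in seconds
    max_retries: 3              # illustrative retry limit
```

Parameters you omit, such as timeout and max_retries, fall back to the defaults described in the list above.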

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |