AnthropicVertexChatGenerator
Generate text using Claude models through the Anthropic Vertex AI API.
Key Features
- Supports Claude models (for example, Claude 3.5 Sonnet, Claude 3 Opus) through the Vertex AI API endpoint.
- Returns responses in the ChatMessage format for chat-based pipelines.
- Supports tool and function calling with configurable tool behavior.
- Configurable generation parameters including temperature, top_p, max tokens, and stop sequences.
- Supports streaming for real-time token delivery.
Configuration
Connect Haystack Platform to Anthropic on the Integrations page first. For details, see Use Anthropic Models.
You also need a GCP project with Vertex AI enabled. Create secrets for REGION and PROJECT_ID. For details on secrets, see Add a secret.
- Drag the AnthropicVertexChatGenerator component onto the canvas from the Component Library.
- Click the component to open the configuration panel.
- On the General tab, enter the model name (for example, claude-sonnet-4@20250514).
- Go to the Advanced tab to configure the region, project ID, generation kwargs, streaming callback, tools, timeout, and max retries.
Connections
AnthropicVertexChatGenerator accepts a list of ChatMessage instances as input (messages). It outputs a list of ChatMessage replies (replies).
Typically, you connect ChatPromptBuilder to the messages input to build the prompt. Connect the replies output to OutputAdapter and then to DeepsetAnswerBuilder to format the final answer.
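Outside Pipeline Builder, these sockets map directly onto the component's Python API: run() takes messages and returns a dictionary with a replies list. The following is a minimal standalone sketch, assuming GCP application default credentials for a project with Vertex AI enabled; the region, project ID, and question are placeholder values.

```python
# Minimal standalone sketch of the messages -> replies contract.
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.anthropic import AnthropicVertexChatGenerator

generator = AnthropicVertexChatGenerator(
    region="us-central1",          # placeholder region
    project_id="my-gcp-project",   # placeholder project ID
    model="claude-sonnet-4@20250514",
)

result = generator.run(messages=[ChatMessage.from_user("What is Vertex AI?")])
print(result["replies"][0].text)  # each reply is a ChatMessage
```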
Usage Example
Using the Component in a Pipeline
This is an example of a RAG pipeline with AnthropicVertexChatGenerator:
```yaml
components:
  bm25_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'Standard-Index-English'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20 # The number of results to return
      fuzziness: 0
  query_embedder:
    type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
    init_parameters:
      normalize_embeddings: true
      model: intfloat/e5-base-v2
  embedding_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'Standard-Index-English'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20 # The number of results to return
  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate
  ranker:
    type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
    init_parameters:
      model: intfloat/simlm-msmarco-reranker
      top_k: 8
  meta_field_grouping_ranker:
    type: haystack.components.rankers.meta_field_grouping_ranker.MetaFieldGroupingRanker
    init_parameters:
      group_by: file_id
      subgroup_by:
      sort_docs_by: split_id
  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      reference_pattern: acm
  ChatPromptBuilder:
    type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
    init_parameters:
      template:
        - _content:
            - text: "You are a helpful assistant answering the user's questions based on the provided documents.\nIf the answer is not in the documents, rely on the web_search tool to find information.\nDo not use your own knowledge.\n"
          _role: system
        - _content:
            - text: "Provided documents:\n{% for document in documents %}\nDocument [{{ loop.index }}] :\n{{ document.content }}\n{% endfor %}\n\nQuestion: {{ query }}\n"
          _role: user
      required_variables:
      variables:
  OutputAdapter:
    type: haystack.components.converters.output_adapter.OutputAdapter
    init_parameters:
      template: '{{ replies[0] }}'
      output_type: List[str]
      custom_filters:
      unsafe: false
  AnthropicVertexChatGenerator:
    type: haystack_integrations.components.generators.anthropic.chat.vertex_chat_generator.AnthropicVertexChatGenerator
    init_parameters:
      region:
      project_id:
      model: claude-3-5-sonnet@20240620
      streaming_callback:
      generation_kwargs:
      ignore_tools_thinking_messages: true
      tools:
connections: # Defines how the components are connected
  - sender: bm25_retriever.documents
    receiver: document_joiner.documents
  - sender: query_embedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: embedding_retriever.documents
    receiver: document_joiner.documents
  - sender: document_joiner.documents
    receiver: ranker.documents
  - sender: ranker.documents
    receiver: meta_field_grouping_ranker.documents
  - sender: meta_field_grouping_ranker.documents
    receiver: answer_builder.documents
  - sender: meta_field_grouping_ranker.documents
    receiver: ChatPromptBuilder.documents
  - sender: OutputAdapter.output
    receiver: answer_builder.replies
  - sender: AnthropicVertexChatGenerator.replies
    receiver: OutputAdapter.replies
  - sender: ChatPromptBuilder.prompt
    receiver: AnthropicVertexChatGenerator.messages
inputs: # Define the inputs for your pipeline
  query: # These components will receive the query as input
    - "bm25_retriever.query"
    - "query_embedder.text"
    - "ranker.query"
    - "answer_builder.query"
    - "ChatPromptBuilder.query"
  filters: # These components will receive a potential query filter as input
    - "bm25_retriever.filters"
    - "embedding_retriever.filters"
outputs: # Defines the output of your pipeline
  documents: "meta_field_grouping_ranker.documents" # The output of the pipeline is the retrieved documents
  answers: "answer_builder.answers" # The output of the pipeline is the generated answers
max_runs_per_component: 100
metadata: {}
```
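The system prompt in this pipeline refers to a web_search tool, but the generator's tools: key is left empty. Below is a hedged sketch of how such a tool could be defined and passed in Python; the web_search function body is a hypothetical stub, not part of the integration, and the region and project ID are placeholders.

```python
# Hedged sketch of supplying the web_search tool the system prompt refers to.
from haystack.dataclasses import ChatMessage
from haystack.tools import Tool
from haystack_integrations.components.generators.anthropic import AnthropicVertexChatGenerator

def web_search(query: str) -> str:
    """Hypothetical stub; replace with a real search implementation."""
    return f"No results found for {query!r}"

web_search_tool = Tool(
    name="web_search",
    description="Search the web for information missing from the documents.",
    parameters={
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
    function=web_search,
)

generator = AnthropicVertexChatGenerator(
    region="us-central1",          # placeholder region
    project_id="my-gcp-project",   # placeholder project ID
    tools=[web_search_tool],
)

reply = generator.run(messages=[ChatMessage.from_user("What is new in Vertex AI?")])["replies"][0]
print(reply.tool_calls)  # tool calls requested by the model, if any
```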
Parameters
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| messages | List[ChatMessage] | | A list of ChatMessage instances representing the input messages. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function that is called when a new token is received from the stream. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Optional arguments to pass to the Anthropic generation endpoint. |
| tools | Optional[Union[List[Tool], Toolset]] | None | A list of Tool objects or a Toolset that the model can use. Each tool should have a unique name. If set, it will override the tools parameter set during component initialization. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| replies | List[ChatMessage] | The responses from the model. |
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| region | str | "us-central1" | The region where the Anthropic model is deployed. |
| project_id | str | | The GCP project ID where the Anthropic model is deployed. |
| model | str | claude-sonnet-4@20250514 | The name of the model to use. |
| streaming_callback | Optional[Callable[[StreamingChunk], None]] | None | A callback function that is called when a new token is received from the stream. The callback function accepts StreamingChunk as an argument. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Other parameters to use for the model. These parameters are all sent directly to the AnthropicVertex endpoint. See the Anthropic documentation for more details. Supported generation_kwargs parameters are: - system: The system message to be passed to the model. - max_tokens: The maximum number of tokens to generate. - metadata: A dictionary of metadata to be passed to the model. - stop_sequences: A list of strings at which the model stops generating. - temperature: The temperature to use for sampling. - top_p: The top_p value to use for nucleus sampling. - top_k: The top_k value to use for top-k sampling. - extra_headers: A dictionary of extra headers to be passed to the model (that is, for beta features). |
| ignore_tools_thinking_messages | bool | True | Anthropic's approach to tool (function calling) resolution involves "chain of thought" messages before the actual function names and parameters are returned. If ignore_tools_thinking_messages is True, the generator drops these thinking messages when tool use is detected. See the Anthropic tools documentation for more details. |
| tools | Optional[List[Tool]] | None | A list of Tool objects that the model can use. Each tool should have a unique name. |
| timeout | Optional[float] | None | Timeout for Anthropic client calls. If not set, it defaults to the default set by the Anthropic client. |
| max_retries | Optional[int] | None | Maximum number of retries to attempt for failed requests. If not set, it defaults to the default set by the Anthropic client. |
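The streaming and sampling options above can also be combined in code. Below is a hedged sketch using Haystack's built-in print_streaming_chunk callback; the region, project ID, and generation values are placeholder assumptions.

```python
# Minimal streaming sketch. print_streaming_chunk prints tokens as they arrive.
from haystack.components.generators.utils import print_streaming_chunk
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.anthropic import AnthropicVertexChatGenerator

generator = AnthropicVertexChatGenerator(
    region="us-central1",          # placeholder region
    project_id="my-gcp-project",   # placeholder project ID
    streaming_callback=print_streaming_chunk,
    generation_kwargs={"max_tokens": 512, "temperature": 0.2},  # placeholder values
)

# Tokens are printed incrementally; the full reply is still returned at the end.
result = generator.run(messages=[ChatMessage.from_user("Summarize Vertex AI in one sentence.")])
```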
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| messages | List[ChatMessage] | | A list of ChatMessage instances representing the input messages. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function that is called when a new token is received from the stream. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Optional arguments to pass to the Anthropic generation endpoint. |
| tools | Optional[Union[List[Tool], Toolset]] | None | A list of Tool objects or a Toolset that the model can use. Each tool should have a unique name. If set, it overrides the tools parameter set during component initialization. |