
AnthropicGenerator

Generate text using large language models (LLMs) by Anthropic.

Basic Information

  • Type: haystack_integrations.components.generators.anthropic.generator.AnthropicGenerator
  • Components it can connect with:
    • PromptBuilder: AnthropicGenerator can receive instructions from PromptBuilder.
    • DeepsetAnswerBuilder: AnthropicGenerator can send generated replies to DeepsetAnswerBuilder, which uses them to return answers with references.

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | str |  | The instructions for the model. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for text generation. |
| streaming_callback | Optional[Callable[[StreamingChunk], None]] | None | An optional callback function to handle streaming chunks. |

Outputs

| Parameter | Type | Description |
| --- | --- | --- |
| replies | List[str] | A list of generated replies. |
| meta | List[Dict[str, Any]] | A list of metadata dictionaries, one for each reply. |

Overview

For a complete list of models that work with this generator, see the Anthropic documentation. Although Anthropic natively supports a much richer messaging API, this component intentionally simplifies it so that the main input and output interface is string-based. For more complete support, consider using AnthropicChatGenerator.

Authentication

To use this component, connect deepset with Anthropic first. You'll need an Anthropic API key to do this.

Connection Instructions

  1. Click your profile icon in the top right corner and choose Integrations.
  2. Click Connect next to the provider.
  3. Enter your API key and submit it.

Usage Example

Initializing the Component

```yaml
components:
  AnthropicGenerator:
    type: haystack_integrations.components.generators.anthropic.generator.AnthropicGenerator
    init_parameters:
```
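In practice you typically set at least the model and, optionally, generation parameters. The snippet below is an illustrative sketch, not component defaults: `max_tokens` and `temperature` are standard Anthropic sampling parameters passed through `generation_kwargs`, and the values shown are examples only.

```yaml
components:
  AnthropicGenerator:
    type: haystack_integrations.components.generators.anthropic.generator.AnthropicGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - ANTHROPIC_API_KEY
        strict: false
      model: claude-sonnet-4-20250514
      generation_kwargs:
        max_tokens: 1024   # upper bound on the length of the generated reply
        temperature: 0.2   # lower values make replies more deterministic
```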

Using the Component in a Pipeline

This is a RAG pipeline that uses Claude Sonnet 4:

```yaml
# If you need help with the YAML format, have a look at https://docs.cloud.deepset.ai/v2.0/docs/create-a-pipeline#create-a-pipeline-using-pipeline-editor.
# This section defines the components you want to use in your pipeline. Each component must have a name and a type. You can also set the component's parameters here.
# The name is up to you; give your component a friendly name. You then use the components' names when specifying the connections in the pipeline.
# Type is the class path of the component. You can check the type on the component's documentation page.
components:
  retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.open_search_hybrid_retriever.OpenSearchHybridRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: ''
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20 # The number of results to return
      fuzziness: 0

  embedder:
    type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
    init_parameters:
      normalize_embeddings: true
      model: intfloat/multilingual-e5-base

  ranker:
    type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
    init_parameters:
      model: svalabs/cross-electra-ms-marco-german-uncased
      top_k: 8

  meta_field_grouping_ranker:
    type: haystack.components.rankers.meta_field_grouping_ranker.MetaFieldGroupingRanker
    init_parameters:
      group_by: file_id
      subgroup_by:
      sort_docs_by: split_id

  prompt_builder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      required_variables: "*"
      template: |-
        You are a technical expert.
        You answer questions truthfully based on the provided documents.
        If the answer is contained in several documents, summarize them.
        Ignore documents that don't contain an answer to the question.
        Answer only based on the provided documents. Don't make up facts.
        If the documents contain no information that answers the question, say so.
        Always use references in the form [NUMBER OF THE DOCUMENT] when you use information from a document, for example [3] for document [3].
        Never name the documents; give only a number in square brackets as a reference.
        The reference must only refer to the number in square brackets after the passage.
        Otherwise, use no brackets in your answer and give ONLY the number of the document, without mentioning the word document.
        Give a precise, exact, and structured answer without repeating the question.

        Here are the documents:
        {%- if documents|length > 0 %}
        {%- for document in documents %}
        Document [{{ loop.index }}]:
        Name of the source file: {{ document.meta.file_name }}
        {{ document.content }}
        {% endfor -%}
        {%- else %}
        No documents found.
        Say "Unfortunately, no matching documents were found. Please adjust the filters or try a rephrased question."
        {% endif %}

        Question: {{ question }}
        Answer:

  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      reference_pattern: acm
      # extract_xml_tags: # uncomment to move the thinking part into the answer's meta
      # - thinking

  file_downloader:
    type: deepset_cloud_custom_nodes.augmenters.deepset_file_downloader.DeepsetFileDownloader
    init_parameters:
      file_extensions:
        - .txt
        - .pdf
        - .md
        - .docx
        - .csv
        - .xlsx
        - .html
        - .htm
        - .pptx
      sources_target_type: haystack.dataclasses.ByteStream

  attachments_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate
      weights:
      top_k:
      sort_by_score: true

  multi_file_converter:
    type: haystack.core.super_component.super_component.SuperComponent
    init_parameters:
      input_mapping:
        sources:
          - file_classifier.sources
      is_pipeline_async: false
      output_mapping:
        score_adder.output: documents
      pipeline:
        components:
          file_classifier:
            type: haystack.components.routers.file_type_router.FileTypeRouter
            init_parameters:
              mime_types:
                - text/plain
                - application/pdf
                - text/markdown
                - text/html
                - application/vnd.openxmlformats-officedocument.wordprocessingml.document
                - application/vnd.openxmlformats-officedocument.presentationml.presentation
                - application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
                - text/csv

          text_converter:
            type: haystack.components.converters.txt.TextFileToDocument
            init_parameters:
              encoding: utf-8

          pdf_converter:
            type: haystack.components.converters.pdfminer.PDFMinerToDocument
            init_parameters:
              line_overlap: 0.5
              char_margin: 2
              line_margin: 0.5
              word_margin: 0.1
              boxes_flow: 0.5
              detect_vertical: true
              all_texts: false
              store_full_path: false

          markdown_converter:
            type: haystack.components.converters.txt.TextFileToDocument
            init_parameters:
              encoding: utf-8

          html_converter:
            type: haystack.components.converters.html.HTMLToDocument
            init_parameters:
              # A dictionary of keyword arguments to customize how you want to extract content from your HTML files.
              # For the full list of available arguments, see
              # the [Trafilatura documentation](https://trafilatura.readthedocs.io/en/latest/corefunctions.html#extract).
              extraction_kwargs:
                output_format: markdown # Extract text from HTML. You can also choose "txt"
                target_language: # You can define a language (using the ISO 639-1 format) to discard documents that don't match that language.
                include_tables: true # If true, includes tables in the output
                include_links: true # If true, keeps links along with their targets

          docx_converter:
            type: haystack.components.converters.docx.DOCXToDocument
            init_parameters:
              link_format: markdown

          pptx_converter:
            type: haystack.components.converters.pptx.PPTXToDocument
            init_parameters: {}

          xlsx_converter:
            type: haystack.components.converters.xlsx.XLSXToDocument
            init_parameters: {}

          csv_converter:
            type: haystack.components.converters.csv.CSVToDocument
            init_parameters:
              encoding: utf-8

          splitter:
            type: haystack.components.preprocessors.document_splitter.DocumentSplitter
            init_parameters:
              split_by: word
              split_length: 250
              split_overlap: 30
              respect_sentence_boundary: true
              language: en

          score_adder:
            type: haystack.components.converters.output_adapter.OutputAdapter
            init_parameters:
              template: |
                {%- set scored_documents = [] -%}
                {%- for document in documents -%}
                {%- set doc_dict = document.to_dict() -%}
                {%- set _ = doc_dict.update({'score': 100.0}) -%}
                {%- set scored_doc = document.from_dict(doc_dict) -%}
                {%- set _ = scored_documents.append(scored_doc) -%}
                {%- endfor -%}
                {{ scored_documents }}
              output_type: List[haystack.Document]
              custom_filters:
              unsafe: true

          text_joiner:
            type: haystack.components.joiners.document_joiner.DocumentJoiner
            init_parameters:
              join_mode: concatenate
              sort_by_score: false

          tabular_joiner:
            type: haystack.components.joiners.document_joiner.DocumentJoiner
            init_parameters:
              join_mode: concatenate
              sort_by_score: false
        connections:
          - sender: file_classifier.text/plain
            receiver: text_converter.sources
          - sender: file_classifier.application/pdf
            receiver: pdf_converter.sources
          - sender: file_classifier.text/markdown
            receiver: markdown_converter.sources
          - sender: file_classifier.text/html
            receiver: html_converter.sources
          - sender: file_classifier.application/vnd.openxmlformats-officedocument.wordprocessingml.document
            receiver: docx_converter.sources
          - sender: file_classifier.application/vnd.openxmlformats-officedocument.presentationml.presentation
            receiver: pptx_converter.sources
          - sender: file_classifier.application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
            receiver: xlsx_converter.sources
          - sender: file_classifier.text/csv
            receiver: csv_converter.sources
          - sender: text_joiner.documents
            receiver: splitter.documents
          - sender: text_converter.documents
            receiver: text_joiner.documents
          - sender: pdf_converter.documents
            receiver: text_joiner.documents
          - sender: markdown_converter.documents
            receiver: text_joiner.documents
          - sender: html_converter.documents
            receiver: text_joiner.documents
          - sender: pptx_converter.documents
            receiver: text_joiner.documents
          - sender: docx_converter.documents
            receiver: text_joiner.documents
          - sender: xlsx_converter.documents
            receiver: tabular_joiner.documents
          - sender: csv_converter.documents
            receiver: tabular_joiner.documents
          - sender: splitter.documents
            receiver: tabular_joiner.documents
          - sender: tabular_joiner.documents
            receiver: score_adder.documents

  AnthropicGenerator:
    type: haystack_integrations.components.generators.anthropic.generator.AnthropicGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - ANTHROPIC_API_KEY
        strict: false
      model: claude-sonnet-4-20250514
      streaming_callback:
      system_prompt:
      generation_kwargs:

connections: # Defines how the components are connected
  - sender: retriever.documents
    receiver: ranker.documents
  - sender: ranker.documents
    receiver: meta_field_grouping_ranker.documents
  - sender: prompt_builder.prompt
    receiver: answer_builder.prompt
  - sender: file_downloader.sources
    receiver: multi_file_converter.sources
  - sender: multi_file_converter.documents
    receiver: attachments_joiner.documents
  - sender: meta_field_grouping_ranker.documents
    receiver: attachments_joiner.documents
  - sender: attachments_joiner.documents
    receiver: answer_builder.documents
  - sender: attachments_joiner.documents
    receiver: prompt_builder.documents
  - sender: prompt_builder.prompt
    receiver: AnthropicGenerator.prompt
  - sender: AnthropicGenerator.replies
    receiver: answer_builder.replies

inputs: # Define the inputs for your pipeline
  query: # These components will receive the query as input
    - "retriever.query"
    - "ranker.query"
    - "prompt_builder.question"
    - "answer_builder.query"
  filters: # These components will receive a potential query filter as input
    - "retriever.filters_bm25"
    - "retriever.filters_embedding"
  files:
    - file_downloader.sources

outputs: # Defines the output of your pipeline
  documents: "attachments_joiner.documents" # The output of the pipeline is the retrieved documents
  answers: "answer_builder.answers" # The output of the pipeline is the generated answers

max_runs_per_component: 100

metadata: {}
```

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | Secret | Secret.from_env_var('ANTHROPIC_API_KEY') | The Anthropic API key. |
| model | str | claude-sonnet-4-20250514 | The name of the Anthropic model to use. |
| streaming_callback | Optional[Callable[[StreamingChunk], None]] | None | An optional callback function to handle streaming chunks. |
| system_prompt | Optional[str] | None | An optional system prompt to use for generation. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for generation. |
| timeout | Optional[float] | None | The timeout for requests. |
| max_retries | Optional[int] | None | The maximum number of retries if a request fails. |
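For example, to make the component more resilient to slow responses and transient API errors, you can set timeout and max_retries alongside the model. The values below are illustrative, not recommendations:

```yaml
components:
  AnthropicGenerator:
    type: haystack_integrations.components.generators.anthropic.generator.AnthropicGenerator
    init_parameters:
      model: claude-sonnet-4-20250514
      timeout: 30.0    # seconds to wait for a response before failing
      max_retries: 3   # retry a failed request up to three times
```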

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | str |  | The prompt with instructions for the model. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for generation. For a complete list, see the Anthropic API documentation. |
| streaming_callback | Optional[Callable[[StreamingChunk], None]] | None | An optional callback function to handle streaming chunks. |