
LLM

Generate text using a large language model (LLM). This component provides a single interface for single-turn text generation without using tools.

Simplify Your Pipelines

LLM replaces the ChatPromptBuilder and ChatGenerator components. If your pipelines use these two components, you can simplify them by switching to LLM. For details, see Simplify Pipelines.

Unlike the Agent component, LLM only generates text; it doesn't call tools. You send messages to the component, it forwards them to the configured LLM provider, and it returns a single response.

LLM is model-agnostic, which means you can use it with any LLM provider, such as OpenAI, Azure, or any other compatible one.

Key Features

  • Flexible prompting: Supports system prompts, user prompts, and Jinja2 template variables in prompts.
  • Model agnostic: Works with any LLM.
  • Streaming support: Streams responses token by token.
  • Simpler pipelines: Replaces the ChatPromptBuilder and ChatGenerator components.

When to Use LLM versus Agent

If all you need is straightforward text generation without tools, choose LLM. For tool calling, multi-step reasoning, or exit conditions, use the Agent component instead.

Configuration

User Prompt

You must configure a user prompt with at least one variable. Without a user prompt, the LLM produces no output.

  1. Drag the LLM component from the Component Library onto the canvas.
  2. Click Model on the component card to open the configuration panel.
  3. On the General tab:
    1. Choose the model from the list. Make sure Haystack Platform is connected to the model provider. For help, see Add Integrations.
    2. Optionally, enter a system prompt to configure the model's behavior. System prompt is in a string format.
    3. Enter a user prompt with at least one variable. User prompt supports Jinja2 syntax, so you can use variables and functions in the prompt. For examples, see the Examples section below. For instructions on writing prompts, see Writing Prompts in Haystack Enterprise Platform.
Jinja in the LLM component

When you add or edit a prompt for the LLM component, you write regular text. To insert a variable, type @ and a list of available variables appears so you can pick one. To insert a function, type / and you see all functions you can add to the template. To see the raw Jinja2 syntax, enable the Jinja toggle. It's disabled by default.

  4. Optionally, open the Advanced tab and set model parameters to fine-tune the model's behavior. The available parameters are specific to the model you chose.
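Behind the @ and / shortcuts, the prompt is a standard Jinja2 template. A minimal sketch of the variable substitution, using the jinja2 library directly (the platform's {% message %} wrapper is omitted here):

```python
from jinja2 import Template

# A user prompt with one template variable, as you might type it in Builder.
prompt = Template("Translate the following document: {{ document }}")

# At runtime, the variable's value is inserted into the prompt.
rendered = prompt.render(document="Hallo Welt")
print(rendered)
# -> Translate the following document: Hallo Welt
```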

Connections

The LLM component's input connections depend on the configured user prompt. By default, it accepts a list of ChatMessage objects through its messages input. It outputs a single ChatMessage object through its last_message output and a list of ChatMessage objects through its messages output. The messages output contains, in order: the system prompt, the messages provided as input, the rendered user prompt, and the LLM's answer appended at the end.
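The ordering of the messages output can be sketched as follows. This is illustrative only: plain dicts stand in for ChatMessage objects, and assemble_messages is a hypothetical helper, not the component's actual implementation.

```python
def assemble_messages(system_prompt, input_messages, user_prompt, answer):
    """Sketch of the `messages` output order: system prompt first, then the
    input messages, then the rendered user prompt, then the model's answer."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.extend(input_messages)
    messages.append({"role": "user", "content": user_prompt})
    messages.append({"role": "assistant", "content": answer})
    return messages

out = assemble_messages(
    "You are a helpful assistant.",
    [{"role": "user", "content": "Earlier question"}],
    "Answer the question: What is RAG?",
    "RAG stands for retrieval-augmented generation.",
)
# out[-1] is the answer -- the same message exposed as `last_message`.
```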

Last Message vs Messages

If LLM is the component that produces the final answer, the last_message output is the LLM's answer. That's the output you want to use as the final output of the pipeline. You can connect last_message to the Output component's messages input.

You can connect the LLM's messages input directly to the query output of the Input component or to any component that produces ChatMessage or string output.

You can send its messages output to Input's messages input or to another LLM.

Usage Examples

Basic LLM Configuration

This is a basic LLM configuration with a model, a system prompt, and a user prompt. The user prompt includes the variable document, whose value is inserted into the prompt when the LLM runs.

  LLM:
    type: haystack.components.generators.chat.llm.LLM
    init_parameters:
      chat_generator:
        type: haystack.components.generators.chat.openai_responses.OpenAIResponsesChatGenerator
        init_parameters:
          model: gpt-5.4
      system_prompt: |-
        {% message role="system" %}
        You are a helpful assistant that translates documents.
        {% endmessage %}
      user_prompt: |-
        {% message role="user" %}
        Translate the following document: {{ document }}
        {% endmessage %}
      required_variables:
        - document
      streaming_callback:
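Because the component definition is plain YAML, you can inspect or validate it programmatically. A minimal sketch using PyYAML, with the configuration above embedded inline (the {% message %} wrappers are omitted from the prompts for brevity):

```python
import yaml  # PyYAML

config = yaml.safe_load("""
LLM:
  type: haystack.components.generators.chat.llm.LLM
  init_parameters:
    chat_generator:
      type: haystack.components.generators.chat.openai_responses.OpenAIResponsesChatGenerator
      init_parameters:
        model: gpt-5.4
    system_prompt: "You are a helpful assistant that translates documents."
    user_prompt: "Translate the following document: {{ document }}"
    required_variables:
      - document
""")

# The chat generator and its model are nested under init_parameters.
params = config["LLM"]["init_parameters"]
print(params["chat_generator"]["init_parameters"]["model"])  # gpt-5.4
print(params["required_variables"])  # ['document']
```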

The Simplest Working Pipeline

To build the simplest pipeline with the LLM component:

  1. Drag the Input component onto the canvas.
  2. Drag the LLM component onto the canvas and configure its user prompt with at least one variable. A basic prompt could be Answer the question: {{ query }}.
  3. Connect Input's query output to the LLM's query input.
  4. Connect the LLM's last_message output to the Output's messages input.

You can now run the pipeline to test it. It uses the LLM's knowledge to answer questions. Pipeline YAML:

components:
  LLM:
    type: haystack.components.generators.chat.llm.LLM
    init_parameters:
      chat_generator:
        type: haystack.components.generators.chat.openai_responses.OpenAIResponsesChatGenerator
        init_parameters:
          model: gpt-5.4
      system_prompt: ''
      user_prompt: |-
        {% message role="user" %}
        Answer the user query: {{ query }}
        {% endmessage %}
      required_variables: '*'
      streaming_callback:

connections: []

max_runs_per_component: 100

metadata: {}

inputs:
  query:
    - LLM.query

outputs:
  messages: LLM.last_message
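The wiring of this pipeline is easy to verify from the YAML: there are no component-to-component connections, the pipeline's query input feeds the LLM directly, and last_message becomes the pipeline output. A sketch with PyYAML, reduced to the routing-relevant keys:

```python
import yaml  # PyYAML

pipeline = yaml.safe_load("""
components:
  LLM:
    type: haystack.components.generators.chat.llm.LLM
connections: []
inputs:
  query:
    - LLM.query
outputs:
  messages: LLM.last_message
""")

# No component-to-component connections: the Input feeds the LLM directly,
# and the LLM's last_message is exposed as the pipeline output.
print(pipeline["connections"])            # []
print(pipeline["inputs"]["query"])        # ['LLM.query']
print(pipeline["outputs"]["messages"])    # LLM.last_message
```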

Summarization Pipeline

Here's a pipeline that uses LLM to summarize retrieved documents:

components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
            - ${OPENSEARCH_HOST}
          index: standard-index
          embedding_dim: 768
          return_embedding: false
          create_index: true
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          use_ssl: true
          verify_certs: false
          max_chunk_bytes: 104857600
          method:
          mappings:
          settings:
          timeout:
      top_k: 10

  llm:
    type: haystack.components.generators.chat.llm.LLM
    init_parameters:
      chat_generator:
        type: haystack.components.generators.chat.openai.OpenAIChatGenerator
        init_parameters:
          api_key:
            type: env_var
            env_vars:
              - OPENAI_API_KEY
            strict: false
          model: gpt-5-mini
      system_prompt: "You are a helpful assistant. Answer the question based on the provided documents. If the documents don't contain the answer, say so."
      user_prompt: |-
        {% message role="user" %}
        Documents:
        {% for document in documents %}
        {{ document.content }}
        {% endfor %}

        Question: {{ question }}
        {% endmessage %}
      required_variables:
        - documents
        - question

connections:
  - sender: bm25_retriever.documents
    receiver: llm.documents

max_runs_per_component: 100

inputs:
  query:
    - bm25_retriever.query
  question:
    - llm.question

metadata: {}

outputs:
  documents: bm25_retriever.documents
  messages: llm.last_message
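The user prompt in this pipeline loops over the retrieved documents with standard Jinja2. As an illustration, here is the prompt body rendered with the jinja2 library directly; the {% message %} wrapper is omitted, and the Doc class and sample contents are hypothetical stand-ins for retrieved documents:

```python
from jinja2 import Template

# Body of the user prompt above, without the {% message %} wrapper.
template = Template(
    "Documents:\n"
    "{% for document in documents %}"
    "{{ document.content }}\n"
    "{% endfor %}"
    "\nQuestion: {{ question }}"
)

class Doc:
    """Stand-in for a retrieved document with a .content attribute."""
    def __init__(self, content):
        self.content = content

rendered = template.render(
    documents=[Doc("Paris is the capital of France."), Doc("France is in Europe.")],
    question="Where is Paris?",
)
print(rendered)
```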


Parameters

Inputs

  • messages (Optional[List[ChatMessage]]): A list of ChatMessage objects to process.
  • streaming_callback (Optional[StreamingCallbackT]): A callback function called when a new token is received from the stream.
  • generation_kwargs (Optional[Dict[str, Any]]): Additional keyword arguments for the LLM. These override the parameters passed during initialization.
  • system_prompt (Optional[str]): The system prompt for the LLM. If provided, it overrides the system prompt set during initialization.
  • user_prompt (Optional[str]): The user prompt for the LLM. If provided, it overrides the default user prompt and is appended to the messages provided at runtime.
  • **kwargs (Any): Additional keyword arguments used to fill template variables in the user_prompt. The keys must match the template variable names.

Outputs

  • messages (List[ChatMessage]): A list of all messages exchanged during the LLM's run.
  • last_message (ChatMessage): The last message exchanged during the LLM's run.

Init parameters

You configure these parameters in Builder:

  • chat_generator (ChatGenerator): The chat generator the LLM uses. This can be any Haystack chat generator, such as OpenAIChatGenerator or AzureOpenAIChatGenerator. Each chat generator exposes a SUPPORTED_MODELS class variable that lists the models it supports.
  • system_prompt (Optional[str], default: None): A system prompt that defines how the model should behave.
  • user_prompt (Optional[str], default: None): A user prompt appended to the messages at runtime. Supports Jinja2 template variables that become additional inputs for the run() method.
  • required_variables (Optional[List[str]] or "*", default: None): Variables that must be provided as input to the user_prompt. If any are missing, an exception is raised. Set to "*" to require all variables found in the prompt.
  • streaming_callback (Optional[StreamingCallbackT], default: None): A callback function that runs when a new token is received from the stream.
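Setting required_variables to "*" requires every variable that appears in the prompt. Jinja2 itself can enumerate those variables; as a sketch of what "*" plausibly expands to, using jinja2's meta API:

```python
from jinja2 import Environment, meta

env = Environment()
prompt = (
    "Documents: {% for d in documents %}{{ d.content }}{% endfor %} "
    "Question: {{ question }}"
)

# Collect every undeclared variable in the template -- roughly the set that
# required_variables: "*" requires callers to supply. Loop-local names like
# `d` are declared by the for tag, so they are excluded.
variables = meta.find_undeclared_variables(env.parse(prompt))
print(sorted(variables))  # ['documents', 'question']
```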

Run method parameters

You can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

  • messages (Optional[List[ChatMessage]], default: None): A list of ChatMessage objects to process.
  • streaming_callback (Optional[StreamingCallbackT], default: None): A callback function called when a new token is received from the stream.
  • generation_kwargs (Optional[Dict[str, Any]], default: None): Additional keyword arguments for the underlying chat generator. These override the parameters set during initialization.
  • system_prompt (Optional[str], default: None): The system prompt for the LLM. If provided, it overrides the system prompt set during initialization.
  • user_prompt (Optional[str], default: None): The user prompt for the LLM. If provided, it overrides the default user prompt and is appended to the messages at runtime.
  • **kwargs (Any): Additional keyword arguments used to fill template variables in the user_prompt. The keys must match the template variable names.