LLM

Generate text using a large language model (LLM). This component provides a single interface for single-turn text generation without using tools. You can use it instead of the ChatPromptBuilder and ChatGenerator components to simplify your pipelines.

Unlike the Agent component, LLM only generates text; it doesn't call tools. You send messages to the component, it forwards them to the configured LLM provider, and it returns a single response.

LLM is model-agnostic, which means you can use it with any LLM provider, such as OpenAI, Azure OpenAI, or another compatible provider.

Key Features

  • Flexible prompting: Supports system prompts, user prompts, and Jinja2 template variables in prompts.
  • Model agnostic: Works with any LLM.
  • Streaming support: Streams responses token by token.
  • Simplified pipelines: Replaces the ChatPromptBuilder and ChatGenerator components.

When to Use LLM versus Agent

If all you need is straightforward text generation without tools, choose LLM. For tool calling, multi-step reasoning, or exit conditions, use the Agent component instead.

Configuration

  1. Drag the LLM component from the Component Library onto the canvas.
  2. Click Model on the component card to open the configuration panel.
  3. On the General tab:
    1. Choose the model from the list. Make sure Haystack Platform is connected to the model provider. For help, see Add Integrations.
    2. Optionally, enter a system prompt to define the model's behavior. The system prompt is a plain string.
    3. Enter a user prompt. It supports Jinja2 syntax, so you can use variables and functions in the prompt. For examples, see the Examples section below. For instructions on writing prompts, see Writing Prompts in Haystack Enterprise Platform.
Jinja in the LLM component

When you add or edit a prompt for the LLM component, you write regular text. To insert a variable, type @ and a list of available variables appears so you can pick one. To insert a function, type / and you see all functions you can add to the template. To see the raw Jinja2 syntax, enable the Jinja toggle. It's disabled by default.
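With the Jinja toggle enabled, you see the prompt in raw Jinja2 syntax. For example, the user prompt from the summarization pipeline later on this page loops over retrieved documents like this:

```jinja
{% message role="user" %}
Documents:
{% for document in documents %}
{{ document.content }}
{% endfor %}

Question: {{ question }}
{% endmessage %}
```

Here, documents and question are template variables that become inputs of the component.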

  4. Optionally, open the Advanced tab and set model parameters. The available parameters depend on the model you chose.

Connections

The LLM's input connections depend on the configured user prompt. By default, the component accepts a list of ChatMessage objects through its messages input. It outputs a single ChatMessage object through its last_message output and a list of ChatMessage objects through its messages output. The messages output contains, in order: the system prompt, the messages provided as input, the user prompt (if provided), and the LLM's answer appended at the end.
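As an illustration of this ordering, the sketch below uses plain role/content dicts in place of Haystack's ChatMessage objects:

```python
# Illustration only: the order of messages in the LLM's `messages` output,
# using plain role/content dicts in place of Haystack ChatMessage objects.
system = {"role": "system", "content": "You are a helpful assistant."}
incoming = [{"role": "user", "content": "Hi there"}]            # the messages input
user_prompt = {"role": "user", "content": "Answer the question: What is RAG?"}
answer = {"role": "assistant", "content": "..."}                # the LLM's reply

messages_output = [system, *incoming, user_prompt, answer]
last_message = messages_output[-1]  # corresponds to the `last_message` output
```

The last_message output is always the final entry of the messages output, which is why it's the one to forward as the pipeline's answer.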

Last Message vs Messages

If LLM is the component that produces the final answer, the last_message output is the LLM's answer. That's the output you want to use as the final output of the pipeline.

You can connect the LLM's messages input directly to the query output of the Input component or to any component that produces ChatMessage or string output.

You can send its messages output to AnswerBuilder or to another LLM.

Usage Example

The Simplest Working Pipeline

To build the simplest pipeline with the LLM component:

  1. Drag the LLM component onto the canvas and configure it.
  2. Connect Input's query output to the LLM's messages input.
  3. Drag DeepsetAnswerBuilder onto the canvas.
  4. Connect the LLM's last_message output to the replies input of DeepsetAnswerBuilder.
  5. Connect Input's query output to the query input of DeepsetAnswerBuilder.
  6. Connect DeepsetAnswerBuilder's answers output to the Output's answers input.

You can now run the pipeline to test it. It uses the LLM's knowledge to answer questions.

The simplest LLM pipeline
Pipeline YAML
components:
  LLM:
    type: haystack.components.generators.chat.llm.LLM
    init_parameters:
      chat_generator:
        init_parameters:
          model: gpt-5.2
        type: haystack.components.generators.chat.openai_responses.OpenAIResponsesChatGenerator
      system_prompt: You are a helpful assistant that answers user's questions.
      user_prompt: "Answer the question: {{ query }}"
      required_variables:
      streaming_callback:

  DeepsetAnswerBuilder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      pattern:
      reference_pattern:
      extract_xml_tags:

connections:
- sender: LLM.last_message
  receiver: DeepsetAnswerBuilder.replies

max_runs_per_component: 100

metadata: {}

inputs:
  query:
  - LLM.messages
  - DeepsetAnswerBuilder.query

outputs:
  answers: DeepsetAnswerBuilder.answers


Summarization Pipeline

Here's a pipeline that uses LLM to summarize retrieved documents:

components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          - ${OPENSEARCH_HOST}
          index: standard-index
          embedding_dim: 768
          return_embedding: false
          create_index: true
          http_auth:
          - ${OPENSEARCH_USER}
          - ${OPENSEARCH_PASSWORD}
          use_ssl: true
          verify_certs: false
          max_chunk_bytes: 104857600
          method:
          mappings:
          settings:
          timeout:
      top_k: 10

  llm:
    type: haystack.components.generators.chat.llm.LLM
    init_parameters:
      chat_generator:
        type: haystack.components.generators.chat.openai.OpenAIChatGenerator
        init_parameters:
          api_key:
            type: env_var
            env_vars:
            - OPENAI_API_KEY
            strict: false
          model: gpt-5-mini
      system_prompt: "You are a helpful assistant. Answer the question based on
        the provided documents. If the documents don't contain the answer, say so."
      user_prompt: |-
        {% message role="user" %}
        Documents:
        {% for document in documents %}
        {{ document.content }}
        {% endfor %}

        Question: {{ question }}
        {% endmessage %}
      required_variables:
      - documents
      - question

  AnswerBuilder:
    type: haystack.components.builders.answer_builder.AnswerBuilder
    init_parameters:
      pattern:
      reference_pattern:
      last_message_only: false
      return_only_referenced_documents: true

connections:
- sender: bm25_retriever.documents
  receiver: llm.documents
- sender: llm.messages
  receiver: AnswerBuilder.replies

max_runs_per_component: 100

inputs:
  query:
  - bm25_retriever.query
  - AnswerBuilder.query
  question:
  - llm.question

metadata: {}

outputs:
  answers: AnswerBuilder.answers
  documents: bm25_retriever.documents

Parameters

Inputs

| Parameter | Type | Description |
| --- | --- | --- |
| messages | Optional[List[ChatMessage]] | A list of ChatMessage objects to process. |
| streaming_callback | Optional[StreamingCallbackT] | A callback function called when a new token is received from the stream. |
| generation_kwargs | Optional[Dict[str, Any]] | Additional keyword arguments for the LLM. These override the parameters passed during initialization. |
| system_prompt | Optional[str] | The system prompt for the LLM. If provided, it overrides the system prompt set during initialization. |
| user_prompt | Optional[str] | The user prompt for the LLM. If provided, it overrides the default user prompt and is appended to the messages provided at runtime. |
| **kwargs | Any | Additional keyword arguments used to fill template variables in the user_prompt. The keys must match the template variable names. |
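To make the **kwargs behavior concrete, here is a minimal sketch (not Haystack's actual implementation, which uses a full Jinja2 engine) of how keyword arguments fill {{ variable }} placeholders in the user_prompt:

```python
import re

# Minimal illustration (not Haystack's implementation) of how run() kwargs
# fill Jinja2-style {{ variable }} placeholders in the user_prompt.
user_prompt = "Answer the question: {{ query }}"

def fill_template(template: str, **kwargs) -> str:
    # Replace each {{ name }} placeholder with the matching keyword argument.
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: str(kwargs[m.group(1)]), template)

filled = fill_template(user_prompt, query="What is Haystack?")
# filled == "Answer the question: What is Haystack?"
```

Because the placeholder is named query, the matching run() keyword argument must also be named query; a missing key raises an error, which mirrors the required_variables check.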

Outputs

| Parameter | Type | Description |
| --- | --- | --- |
| messages | List[ChatMessage] | A list of all messages exchanged during the LLM's run. |
| last_message | ChatMessage | The last message exchanged during the LLM's run. |

Init parameters

You configure these parameters in Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| chat_generator | ChatGenerator | | The chat generator the LLM uses. This can be any Haystack chat generator, such as OpenAIChatGenerator or AzureOpenAIChatGenerator. |
| system_prompt | Optional[str] | None | A system prompt that defines how the model should behave. |
| user_prompt | Optional[str] | None | A user prompt appended to the messages at runtime. Supports Jinja2 template variables that become additional inputs for the run() method. |
| required_variables | Optional[List[str] or "*"] | None | Variables that must be provided as input to the user_prompt. If any are missing, an exception is raised. Set to "*" to require all variables found in the prompt. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function that runs when a new token is received from the stream. |
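As a sketch of the streaming_callback idea: the callback is invoked once per streamed chunk as tokens arrive. The example below uses plain strings to stand in for Haystack's streaming chunk objects:

```python
# Illustration only: a streaming callback invoked once per streamed chunk.
# Plain strings stand in for Haystack's streaming chunk objects here.
collected = []

def streaming_callback(chunk: str) -> None:
    collected.append(chunk)            # accumulate the full reply
    print(chunk, end="", flush=True)   # render tokens as they arrive

for token in ["Hello", ", ", "world", "!"]:  # stands in for the LLM's token stream
    streaming_callback(token)
```

A callback like this lets a UI display partial output immediately instead of waiting for the complete response.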

Run method parameters

You can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| messages | Optional[List[ChatMessage]] | None | A list of ChatMessage objects to process. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function called when a new token is received from the stream. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for the underlying chat generator. These override the parameters set during initialization. |
| system_prompt | Optional[str] | None | The system prompt for the LLM. If provided, it overrides the system prompt set during initialization. |
| user_prompt | Optional[str] | None | The user prompt for the LLM. If provided, it overrides the default user prompt and is appended to the messages at runtime. |
| **kwargs | Any | | Additional keyword arguments used to fill template variables in the user_prompt. The keys must match the template variable names. |