FallbackChatGenerator

Use FallbackChatGenerator to try multiple chat generators sequentially. If one generator fails, it falls back to the next one in the list. If all generators fail, it raises an error with details of every attempt.

Basic Information

  • Type: haystack.components.generators.chat.fallback.FallbackChatGenerator
  • Components it can connect with:
    • Any component that produces messages, such as ChatPromptBuilder.
    • Any component that consumes replies, such as AnswerBuilder.

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| messages | List[ChatMessage] | | The conversation history as a list of ChatMessage objects. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Optional parameters for the chat generator (for example, temperature, max_tokens). |
| tools | Optional[ToolsType] | None | A list of Tool or Toolset objects the generator can use. |
| streaming_callback | Optional[StreamingCallbackT] | None | Optional callable for handling streaming responses. |

Outputs

| Parameter | Type | Description |
| --- | --- | --- |
| replies | List[ChatMessage] | Generated ChatMessage objects from the first successful generator. |
| meta | Dict[str, Any] | Execution metadata including successful_chat_generator_index, successful_chat_generator_class, total_attempts, failed_chat_generators, and any metadata from the successful generator. |
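
A minimal sketch of calling the component directly, assuming an OpenAI API key is available in the environment; the meta keys are the ones listed above.

```python
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.generators.chat.fallback import FallbackChatGenerator
from haystack.dataclasses import ChatMessage

# Wrap two generators; the second is only used if the first raises an exception.
generator = FallbackChatGenerator(
    chat_generators=[
        OpenAIChatGenerator(model="gpt-4o"),
        OpenAIChatGenerator(model="gpt-4o-mini"),
    ]
)

result = generator.run(
    messages=[ChatMessage.from_user("What is the capital of France?")],
    generation_kwargs={"temperature": 0.7},
)

print(result["replies"][0].text)                          # reply from the first successful generator
print(result["meta"]["successful_chat_generator_class"])  # for example, "OpenAIChatGenerator"
print(result["meta"]["total_attempts"])                   # how many generators were tried
```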

Overview

The FallbackChatGenerator is a chat generator wrapper that tries multiple chat generators sequentially. It forwards all parameters transparently to the underlying chat generators and returns the first successful result. If a chat generator fails, it falls back to the next one in the list. If all chat generators fail, it raises a RuntimeError with details.

Failover is automatically triggered when a generator raises any exception, including:

  • Timeout errors
  • Rate limit errors (429)
  • Authentication errors (401)
  • Context length errors (400)
  • Server errors (500+)
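
Conceptually, the behavior is a simple try-next loop. The sketch below is illustrative only (it is not the library's actual implementation) and uses the meta keys documented above:

```python
# Illustrative sketch of the fallback loop described above -- not the library's actual code.
def run_with_fallback(chat_generators, messages, **kwargs):
    failed = []
    for index, generator in enumerate(chat_generators):
        try:
            result = generator.run(messages=messages, **kwargs)
            # Annotate the result with the bookkeeping fields listed in the Outputs table.
            result["meta"] = {
                **result.get("meta", {}),
                "successful_chat_generator_index": index,
                "successful_chat_generator_class": type(generator).__name__,
                "total_attempts": index + 1,
                "failed_chat_generators": failed,
            }
            return result
        except Exception as error:  # any exception (timeouts, 429, 401, 400, 5xx, ...) triggers failover
            failed.append({"class": type(generator).__name__, "error": str(error)})
    raise RuntimeError(f"All chat generators failed. Attempts: {failed}")
```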

To control how long each generator can take before the component falls back to the next one, set the timeout parameter in that generator's init parameters. For example:

```yaml
components:
  OpenAIChatGenerator:
    type: haystack.components.generators.chat.openai.OpenAIChatGenerator
    init_parameters:
      timeout: 30
```
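
In Python, the same setting goes on each wrapped generator. A minimal sketch; timeout is the underlying OpenAIChatGenerator's own init parameter, in seconds, not a FallbackChatGenerator parameter:

```python
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.generators.chat.fallback import FallbackChatGenerator

# Give each wrapped generator its own timeout so a slow provider fails quickly
# and the next generator in the list gets a chance.
generator = FallbackChatGenerator(
    chat_generators=[
        OpenAIChatGenerator(model="gpt-4o", timeout=30),
        OpenAIChatGenerator(model="gpt-4o-mini", timeout=30),
    ]
)
```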

Usage Example

This query pipeline uses FallbackChatGenerator to try GPT-4o first and fall back to GPT-4o-mini if it fails:

```yaml
components:
  ChatPromptBuilder:
    type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
    init_parameters:
      template:
        - _content:
            - text: "You are a helpful assistant that answers questions concisely."
          _role: system
        - _content:
            - text: "Question: {{ question }}"
          _role: user
      required_variables:
      variables:

  FallbackChatGenerator:
    type: haystack.components.generators.chat.fallback.FallbackChatGenerator
    init_parameters:
      chat_generators:
        - type: haystack.components.generators.chat.openai.OpenAIChatGenerator
          init_parameters:
            model: gpt-4o
            generation_kwargs:
              temperature: 0.7
              max_tokens: 500
        - type: haystack.components.generators.chat.openai.OpenAIChatGenerator
          init_parameters:
            model: gpt-4o-mini
            generation_kwargs:
              temperature: 0.7
              max_tokens: 500

  AnswerBuilder:
    type: haystack.components.builders.answer_builder.AnswerBuilder
    init_parameters:
      pattern:
      reference_pattern:

connections:
  - sender: ChatPromptBuilder.prompt
    receiver: FallbackChatGenerator.messages
  - sender: FallbackChatGenerator.replies
    receiver: AnswerBuilder.replies

inputs:
  query:
    - ChatPromptBuilder.question
    - AnswerBuilder.query

outputs:
  answers: AnswerBuilder.answers
```
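
The same pipeline can be built programmatically. The sketch below assumes an OpenAI API key is available in the environment; the question string is just an example:

```python
from haystack import Pipeline
from haystack.components.builders import AnswerBuilder, ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.generators.chat.fallback import FallbackChatGenerator
from haystack.dataclasses import ChatMessage

# The same prompt template as in the YAML above, expressed as ChatMessage objects.
template = [
    ChatMessage.from_system("You are a helpful assistant that answers questions concisely."),
    ChatMessage.from_user("Question: {{ question }}"),
]

pipeline = Pipeline()
pipeline.add_component("prompt_builder", ChatPromptBuilder(template=template))
pipeline.add_component(
    "generator",
    FallbackChatGenerator(
        chat_generators=[
            OpenAIChatGenerator(model="gpt-4o", generation_kwargs={"temperature": 0.7, "max_tokens": 500}),
            OpenAIChatGenerator(model="gpt-4o-mini", generation_kwargs={"temperature": 0.7, "max_tokens": 500}),
        ]
    ),
)
pipeline.add_component("answer_builder", AnswerBuilder())

pipeline.connect("prompt_builder.prompt", "generator.messages")
pipeline.connect("generator.replies", "answer_builder.replies")

question = "What is the capital of France?"
result = pipeline.run(
    data={
        "prompt_builder": {"question": question},
        "answer_builder": {"query": question},
    }
)
print(result["answer_builder"]["answers"][0].data)
```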

Parameters

Init Parameters

These are the parameters you can configure in Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| chat_generators | List[ChatGenerator] | | A list of chat generator components to try in order. |

Run Method Parameters

These are the parameters you can configure for the component's run() method.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| messages | List[ChatMessage] | | The conversation history as a list of ChatMessage objects. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Optional parameters for the chat generator. |
| tools | Optional[ToolsType] | None | A list of Tool and/or Toolset objects for the generators to use. |
| streaming_callback | Optional[StreamingCallbackT] | None | Optional callable for handling streaming responses. |
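
As a sketch of the run-time parameters, the example below passes Haystack's built-in print_streaming_chunk helper as streaming_callback, so tokens from whichever generator succeeds are printed as they arrive; the prompt text is just an example:

```python
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.generators.chat.fallback import FallbackChatGenerator
from haystack.components.generators.utils import print_streaming_chunk
from haystack.dataclasses import ChatMessage

generator = FallbackChatGenerator(
    chat_generators=[
        OpenAIChatGenerator(model="gpt-4o"),
        OpenAIChatGenerator(model="gpt-4o-mini"),
    ]
)

# The callback is forwarded to whichever generator ends up handling the request.
result = generator.run(
    messages=[ChatMessage.from_user("Summarize the benefits of provider fallback in one sentence.")],
    streaming_callback=print_streaming_chunk,
)
```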