QueryExpander

Generate multiple semantically similar query variations to improve retrieval recall. QueryExpander uses an LLM to create alternative phrasings of a query, which helps find relevant documents that might be missed with a single query formulation.

Key Features

  • Generates multiple semantically similar query variations using an LLM.
  • Improves retrieval recall by covering different phrasings of the same question.
  • Works with any chat generator.
  • Configurable number of query expansions.
  • Optionally includes the original query in the output.
  • Parses the LLM's JSON response into a list of queries compatible with multi-query retrievers.

Configuration

  1. Drag the QueryExpander component onto the canvas from the Component Library.
  2. Click on the component to open the configuration panel.
  3. On the General tab:
    • Set n_expansions to control how many alternative queries to generate (default: 4).
    • Toggle include_original_query to include or exclude the original query in the output.
  4. Go to the Advanced tab to configure a custom chat_generator or a custom prompt_template.

Connections

QueryExpander receives a query string from the Input component. It outputs a queries list. Connect the queries output to MultiQueryTextRetriever or MultiQueryEmbeddingRetriever to retrieve documents for each generated query.
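
Conceptually, a multi-query retriever runs one retrieval per query in the list and merges the results. The sketch below illustrates that fan-out in plain Python with a toy keyword matcher. `CORPUS`, `toy_retrieve`, and `retrieve_for_all` are hypothetical names used only for illustration, not Haystack APIs, and the real retrievers may merge or rank results differently.

```python
from typing import Dict, List

# Hypothetical in-memory corpus; stands in for a document store.
CORPUS: Dict[str, str] = {
    "doc1": "Solar panels convert sunlight into electricity.",
    "doc2": "Photovoltaic cells generate power from the sun.",
    "doc3": "Wind turbines produce energy from moving air.",
}


def toy_retrieve(query: str) -> List[str]:
    """Return IDs of documents sharing at least one word with the query."""
    terms = set(query.lower().split())
    hits = []
    for doc_id, text in CORPUS.items():
        words = set(text.lower().rstrip(".").split())
        if terms & words:
            hits.append(doc_id)
    return hits


def retrieve_for_all(queries: List[str]) -> List[str]:
    """Run one retrieval per query and merge results, keeping first-seen order."""
    seen = set()
    merged = []
    for query in queries:
        for doc_id in toy_retrieve(query):
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged


# A single phrasing misses doc2; adding an alternative phrasing finds it.
single = retrieve_for_all(["solar panels"])
expanded = retrieve_for_all(["solar panels", "photovoltaic cells"])
```

In this toy setup, the second phrasing surfaces a document that shares no words with the original query, which is the recall gain that query expansion aims for.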

Source Code

To check this component's source code, open query_expander.py in the Haystack repository.

Usage Examples

Basic Configuration

  query_expander:
    type: haystack.components.query.query_expander.QueryExpander
    init_parameters:
      n_expansions: 3
      include_original_query: true
      chat_generator:
        type: haystack_integrations.components.generators.anthropic.chat.chat_generator.AnthropicChatGenerator

This example shows how to perform retrieval with QueryExpander and MultiQueryTextRetriever. You can then send the retrieved documents to a Ranker or DocumentJoiner component to combine the results:

  components:
    query_expander:
      type: haystack.components.query.query_expander.QueryExpander
      init_parameters:
        n_expansions: 3
        include_original_query: true
        chat_generator:
          type: haystack_integrations.components.generators.anthropic.chat.chat_generator.AnthropicChatGenerator
          init_parameters: {}
    multi_query_retriever:
      type: haystack.components.retrievers.multi_query_text_retriever.MultiQueryTextRetriever
      init_parameters:
        retriever:
          type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
          init_parameters:
            document_store:
              type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
            top_k: 5

  connections:
    - sender: query_expander.queries
      receiver: multi_query_retriever.queries

  max_runs_per_component: 100

  metadata: {}

  inputs:
    query:
      - query_expander.query

Parameters

Inputs

| Parameter | Type | Description |
| --- | --- | --- |
| query | str | The original query to expand. |
| n_expansions | Optional[int] | Number of additional queries to generate. |

Outputs

| Parameter | Type | Description |
| --- | --- | --- |
| queries | List[str] | A list of semantically similar queries generated for the original query. If include_original_query is True, it includes the original query and the expanded alternatives; otherwise, only the expanded queries. |
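
To make the effect of include_original_query concrete, here is a minimal sketch of how the output list could be assembled. `build_output` is a hypothetical helper, not part of Haystack, and placing the original query first is an assumption; the component may order the list differently.

```python
from typing import List


def build_output(original: str, expansions: List[str], include_original_query: bool) -> List[str]:
    """Assemble the final query list.

    Assumption: the original query goes first when it is included.
    """
    if include_original_query:
        return [original] + list(expansions)
    return list(expansions)


expansions = ["how do solar panels work", "solar panel operating principle"]
with_original = build_output("solar panels", expansions, include_original_query=True)
without_original = build_output("solar panels", expansions, include_original_query=False)
```

With three expansions requested and include_original_query enabled, a downstream retriever would therefore receive up to four queries.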

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| chat_generator | Optional[ChatGenerator] | None | The chat generator to use for query expansion. If None, a default OpenAIChatGenerator with gpt-4.1-mini is used. |
| prompt_template | Optional[str] | None | Custom PromptBuilder template for query expansion. The template should instruct the LLM to return a JSON response with the structure: {"queries": ["query1", "query2", "query3"]}. The template should include query and n_expansions variables. |
| n_expansions | int | 4 | Number of alternative queries to generate. |
| include_original_query | bool | True | Whether to include the original query in the output. |
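
When you write a custom prompt_template, it helps to check that the model's reply really follows the {"queries": [...]} structure. The following is a rough sketch of such a check using only the standard library. `parse_queries` is a hypothetical helper, and returning an empty list on malformed input is an assumption made for this example, not documented component behavior.

```python
import json
from typing import List


def parse_queries(llm_reply: str) -> List[str]:
    """Extract the "queries" list from a JSON reply; return [] if malformed."""
    try:
        payload = json.loads(llm_reply)
    except json.JSONDecodeError:
        return []
    if not isinstance(payload, dict):
        return []
    queries = payload.get("queries")
    if not isinstance(queries, list):
        return []
    # Keep only non-empty strings, stripped of surrounding whitespace.
    return [q.strip() for q in queries if isinstance(q, str) and q.strip()]


good = parse_queries('{"queries": ["query1", " query2 ", ""]}')
bad = parse_queries("Sure! Here are some queries: ...")
```

Running a few sample replies through a check like this while iterating on the template can reveal whether the LLM wraps the JSON in extra prose.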

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

Run method parameters take precedence over initialization parameters.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | | The original query to expand. |
| n_expansions | Optional[int] | None | Number of additional queries to generate (not including the original). If None, uses the value from initialization. Must be a positive integer. |
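
The precedence rule for n_expansions can be sketched as a small resolution step. `resolve_n_expansions` is a hypothetical helper written for illustration, and rejecting non-positive values with a ValueError is an assumption about how the check behaves.

```python
from typing import Optional


def resolve_n_expansions(init_value: int, run_value: Optional[int] = None) -> int:
    """Use the run-time value when given, otherwise the init value."""
    effective = run_value if run_value is not None else init_value
    # Assumption: zero and negative values are rejected.
    if effective <= 0:
        raise ValueError("n_expansions must be positive")
    return effective


from_init = resolve_n_expansions(init_value=4)
from_run = resolve_n_expansions(init_value=4, run_value=2)
```

Here the run-time value of 2 wins over the initialization value of 4, while omitting it falls back to 4.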