QueryExpander
Generate multiple semantically similar query variations to improve retrieval recall. QueryExpander uses an LLM to create alternative phrasings of a query, which helps find relevant documents that might be missed with a single query formulation.
Key Features
- Generates multiple semantically similar query variations using an LLM.
- Improves retrieval recall by covering different phrasings of the same question.
- Works with any chat generator.
- Configurable number of query expansions.
- Optionally includes the original query in the output.
- Returns results in a structured JSON format compatible with multi-query retrievers.
Configuration
- Drag the QueryExpander component onto the canvas from the Component Library.
- Click on the component to open the configuration panel.
- On the General tab:
  - Set n_expansions to control how many alternative queries to generate (default: 4).
  - Toggle include_original_query to include or exclude the original query in the output.
- Go to the Advanced tab to configure a custom chat_generator or a custom prompt_template.
Connections
QueryExpander receives a query string from the Input component. It outputs a queries list. Connect the queries output to MultiQueryTextRetriever or MultiQueryEmbeddingRetriever to retrieve documents for each generated query.
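To illustrate why the queries list is passed to a multi-query retriever, here is a minimal sketch of the fan-out-and-merge idea in plain Python. The corpus, `toy_retrieve`, and `retrieve_for_all` are hypothetical stand-ins and not the MultiQueryTextRetriever implementation: each query is run separately, and the results are merged with duplicates dropped.

```python
from typing import Callable, Dict, List, Set

# Hypothetical in-memory corpus: document ID -> text.
CORPUS: Dict[str, str] = {
    "d1": "renewable energy sources such as solar and wind",
    "d2": "green power and sustainable electricity",
    "d3": "history of the steam engine",
}


def toy_retrieve(query: str) -> List[str]:
    """Return IDs of documents that share at least one word with the query."""
    words = set(query.lower().split())
    return [doc_id for doc_id, text in CORPUS.items() if words & set(text.split())]


def retrieve_for_all(queries: List[str], retrieve: Callable[[str], List[str]]) -> List[str]:
    """Run the retriever once per query and merge results in first-seen order."""
    seen: Set[str] = set()
    merged: List[str] = []
    for q in queries:
        for doc_id in retrieve(q):
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged


# A single phrasing misses d2; the expanded phrasings recover it.
single = retrieve_for_all(["renewable energy"], toy_retrieve)
expanded = retrieve_for_all(
    ["renewable energy", "green power", "sustainable electricity"], toy_retrieve
)
```

In this toy setup the single query only matches `d1`, while the expanded list also reaches `d2`, which is the recall gain that query expansion aims for.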
Source Code
To check this component's source code, open query_expander.py in the Haystack repository.
Usage Examples
Basic Configuration
query_expander:
  type: haystack.components.query.query_expander.QueryExpander
  init_parameters:
    n_expansions: 3
    include_original_query: true
    chat_generator:
      type: haystack_integrations.components.generators.anthropic.chat.chat_generator.AnthropicChatGenerator
This example shows how to perform retrieval with QueryExpander and MultiQueryTextRetriever. You can then send the retrieved documents to a Ranker or DocumentJoiner component to combine the results:
components:
  query_expander:
    type: haystack.components.query.query_expander.QueryExpander
    init_parameters:
      n_expansions: 3
      include_original_query: true
      chat_generator:
        type: haystack_integrations.components.generators.anthropic.chat.chat_generator.AnthropicChatGenerator
        init_parameters: {}
  multi_query_retriever:
    type: haystack.components.retrievers.multi_query_text_retriever.MultiQueryTextRetriever
    init_parameters:
      retriever:
        type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
        init_parameters:
          document_store:
            type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
          top_k: 5
connections:
  - sender: query_expander.queries
    receiver: multi_query_retriever.queries
max_runs_per_component: 100
metadata: {}
inputs:
  query:
    - query_expander.query
Parameters
Inputs
| Parameter | Type | Description |
|---|---|---|
| query | str | The original query to expand. |
| n_expansions | Optional[int] | Number of additional queries to generate. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| queries | List[str] | A list of semantically similar queries generated for the original query. If include_original_query is True, it includes the original query and the expanded alternatives; otherwise, only the expanded queries. |
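The shape of the queries output can be sketched in plain Python. This is an illustration under stated assumptions, not the component's actual code: `build_queries` is a hypothetical helper, the LLM reply is assumed to be a JSON string of the form {"queries": [...]}, and placing the original query first is an arbitrary choice made for this sketch.

```python
import json
from typing import List


def build_queries(llm_reply: str, original_query: str, include_original_query: bool = True) -> List[str]:
    """Parse a {"queries": [...]} reply and optionally add the original query."""
    try:
        payload = json.loads(llm_reply)
    except json.JSONDecodeError:
        payload = {}  # Treat an unparseable reply as "no expansions".

    raw = payload.get("queries", []) if isinstance(payload, dict) else []
    if not isinstance(raw, list):
        raw = []

    # Keep only non-empty strings.
    expanded = [q.strip() for q in raw if isinstance(q, str) and q.strip()]

    # Assumption for this sketch: the original query goes first.
    return ([original_query] if include_original_query else []) + expanded


reply = '{"queries": ["green energy sources", "alternative energy"]}'
with_original = build_queries(reply, "renewable energy")
without_original = build_queries(reply, "renewable energy", include_original_query=False)
```

With include_original_query enabled, the list holds the original query plus the alternatives; with it disabled, only the alternatives remain.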
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| chat_generator | Optional[ChatGenerator] | None | The chat generator to use for query expansion. If None, a default OpenAIChatGenerator with gpt-4.1-mini is used. |
| prompt_template | Optional[str] | None | Custom PromptBuilder template for query expansion. The template should instruct the LLM to return a JSON response with the structure: {"queries": ["query1", "query2", "query3"]}. The template should include query and n_expansions variables. |
| n_expansions | int | 4 | Number of alternative queries to generate. |
| include_original_query | bool | True | Whether to include the original query in the output. |
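As a rough illustration of the prompt_template requirements, the sketch below defines a hypothetical custom template and checks that it references both required variables. It assumes PromptBuilder's Jinja-style `{{ variable }}` placeholders; the template wording, `REQUIRED_VARIABLES`, and `missing_variables` are made up for this example and are not part of the component.

```python
import re
from typing import List

# Hypothetical custom template. The wording is illustrative only.
CUSTOM_TEMPLATE = """
Generate {{ n_expansions }} alternative phrasings of the query below.
Reply with JSON only, in the form {"queries": ["query1", "query2", "query3"]}.

Query: {{ query }}
"""

REQUIRED_VARIABLES = ("query", "n_expansions")


def missing_variables(template: str) -> List[str]:
    """Return the required variables that the template never references."""
    found = set(re.findall(r"\{\{\s*(\w+)\s*\}\}", template))
    return [name for name in REQUIRED_VARIABLES if name not in found]


ok = missing_variables(CUSTOM_TEMPLATE)
incomplete = missing_variables("Rephrase this query: {{ query }}")
```

A check like this can catch a template that omits n_expansions before the pipeline is deployed.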
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
Run method parameters take precedence over initialization parameters.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | The original query to expand. |
| n_expansions | Optional[int] | None | Number of additional queries to generate (not including the original). If None, uses the value from initialization. Must be a positive integer. |
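The precedence rule for n_expansions can be sketched as follows. `resolve_n_expansions` is a hypothetical helper written for illustration, and the sketch assumes that non-positive values are rejected, as the "must be positive" note suggests; the component's real validation may differ.

```python
from typing import Optional


def resolve_n_expansions(init_value: int, run_value: Optional[int] = None) -> int:
    """Pick the run-time value when given, otherwise fall back to the init value."""
    count = init_value if run_value is None else run_value
    # Assumption for this sketch: zero and negative counts are rejected.
    if count <= 0:
        raise ValueError("n_expansions must be positive")
    return count


default_count = resolve_n_expansions(4)        # no run-time override
overridden_count = resolve_n_expansions(4, 2)  # run-time value takes precedence
```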