MultiQueryTextRetriever

Retrieve documents using multiple text queries in parallel with a text-based retriever.

Basic Information

Type: haystack.components.retrievers.multi_query_text_retriever.MultiQueryTextRetriever
Components it can connect with:
- QueryExpander: MultiQueryTextRetriever receives the expanded queries from the QueryExpander component.
- Ranker, DocumentJoiner: MultiQueryTextRetriever sends the retrieved documents to the Ranker and DocumentJoiner components.
- Any component that accepts a list of documents as input

Inputs

Parameter	Type	Description
`queries`	List[str]	List of text queries to process.
`retriever_kwargs`	Optional[Dict[str, Any]]	Optional dictionary of arguments to pass to the retriever's run method.

Outputs

Parameter	Type	Description
`documents`	List[Document]	List of retrieved documents sorted by relevance score, deduplicated by content.

Overview

MultiQueryTextRetriever processes multiple text queries in parallel using a text-based retriever (such as a BM25 retriever). It runs the queries concurrently on a thread pool whose size is set by `max_workers`. The results from all queries are combined, deduplicated by document content, and sorted by relevance score.
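
The flow described above can be sketched with the Python standard library alone. This is an illustrative sketch, not the component's actual implementation: `Doc`, `CORPUS`, `toy_retrieve`, and `multi_query_retrieve` are hypothetical names, the word-overlap scoring is a stand-in for a real text retriever, and keeping the highest-scoring copy of a duplicate and sorting highest-first are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a Haystack Document."""
    content: str
    score: float


# Hypothetical toy corpus; not part of Haystack.
CORPUS = [
    "The car needs a new battery",
    "Electric automobile batteries degrade over time",
    "How to bake sourdough bread",
]


def toy_retrieve(query: str) -> list[Doc]:
    """Score each text by the fraction of query words it contains."""
    words = set(query.lower().split())
    hits = []
    for text in CORPUS:
        overlap = len(words & set(text.lower().split()))
        if overlap:
            hits.append(Doc(content=text, score=overlap / len(words)))
    return hits


def multi_query_retrieve(queries: list[str], max_workers: int = 3) -> list[Doc]:
    # Run one retrieval per query on a thread pool.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_query = list(pool.map(toy_retrieve, queries))

    # Deduplicate by content, keeping the best-scoring copy (an assumption).
    best: dict[str, Doc] = {}
    for docs in per_query:
        for doc in docs:
            if doc.content not in best or doc.score > best[doc.content].score:
                best[doc.content] = doc

    # Sort by relevance score, highest first (an assumption).
    return sorted(best.values(), key=lambda d: d.score, reverse=True)


results = multi_query_retrieve(["car battery", "automobile battery"])
for doc in results:
    print(f"{doc.score:.2f}  {doc.content}")
```

In this toy run, the first query matches only the "car" text, while the second query also surfaces the "automobile" text. The "car" text is returned by both queries but appears once in the merged list.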

MultiQueryTextRetriever is designed to work with QueryExpander to enhance the retrieval process. By retrieving documents for multiple semantically similar queries, it improves recall and helps find relevant documents that might be missed with a single query.

Use this Retriever if your documents use different words than your users' queries, or when you want to use query expansion with keyword-based search (BM25).

Usage Example

This example shows how to perform retrieval with QueryExpander and MultiQueryTextRetriever. You can then send the retrieved documents to a Ranker or DocumentJoiner component to combine the results:

components:
  query_expander:
    type: haystack.components.query.query_expander.QueryExpander
    init_parameters:
      n_expansions: 3
      include_original_query: true

      chat_generator:
        type: haystack_integrations.components.generators.anthropic.chat.chat_generator.AnthropicChatGenerator
        init_parameters: {}
  multi_query_retriever:
    type: haystack.components.retrievers.multi_query_text_retriever.MultiQueryTextRetriever
    init_parameters:
      retriever:
        type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
        init_parameters:
          document_store:
            type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
          top_k: 5
      max_workers: 3

connections:
- sender: query_expander.queries
  receiver: multi_query_retriever.queries

max_runs_per_component: 100

metadata: {}

inputs:
  query:
  - query_expander.query

Parameters

Init parameters

These are the parameters you can configure in Pipeline Builder:

Parameter	Type	Default	Description
`retriever`	TextRetriever		The text-based retriever to use for document retrieval. Must implement the TextRetriever protocol.
`max_workers`	int	3	Maximum number of worker threads for parallel processing.

Run method parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

Parameter	Type	Default	Description
`queries`	List[str]		List of text queries to process.
`retriever_kwargs`	Optional[Dict[str, Any]]	None	Optional dictionary of arguments to pass to the retriever's run method (for example, `filters`, `top_k`).
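
As a rough illustration of how `retriever_kwargs` might be forwarded, the sketch below applies the same keyword arguments to the wrapped retriever's run method for every query. `ToyRetriever` and `run_all` are hypothetical names used only for this example, and the loop is sequential for brevity rather than threaded.

```python
from typing import Any, Optional


class ToyRetriever:
    """Hypothetical retriever that records the arguments of each run() call."""

    def __init__(self) -> None:
        self.calls: list[dict[str, Any]] = []

    def run(self, query: str, top_k: int = 10, filters: Optional[dict] = None) -> dict:
        self.calls.append({"query": query, "top_k": top_k, "filters": filters})
        return {"documents": []}


def run_all(
    retriever: ToyRetriever,
    queries: list[str],
    retriever_kwargs: Optional[dict[str, Any]] = None,
) -> None:
    # The same keyword arguments are applied to every query.
    kwargs = retriever_kwargs or {}
    for query in queries:
        retriever.run(query=query, **kwargs)


retriever = ToyRetriever()
run_all(retriever, ["car battery", "automobile battery"], retriever_kwargs={"top_k": 2})
print(retriever.calls)
```

Under this reading, a value such as `top_k` limits the results of each individual query, so the merged list can contain more than `top_k` documents.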
