AzureAISearchHybridRetriever

Retrieve documents from an Azure AI Search document store using hybrid (vector + BM25) retrieval.

Basic Information

  • Type: haystack_integrations.components.retrievers.azure_ai_search.hybrid_retriever.AzureAISearchHybridRetriever
  • Components it can connect with:
    • TextEmbedder: AzureAISearchHybridRetriever receives query embeddings from a text embedder.
    • Ranker: AzureAISearchHybridRetriever sends retrieved documents to a Ranker for reranking.

Inputs

Parameter | Type | Default | Description
query | str | | Text of the query.
query_embedding | List[float] | | A list of floats representing the query embedding.
filters | Optional[Dict[str, Any]] | None | Filters applied to the retrieved documents. The way runtime filters are applied depends on the filter_policy chosen at retriever initialization.
top_k | Optional[int] | None | The maximum number of documents to retrieve.

Outputs

Parameter | Type | Default | Description
documents | List[Document] | | A list of documents retrieved from the AzureAISearchDocumentStore.

Overview

Use AzureAISearchHybridRetriever to retrieve documents from an Azure AI Search document store using a combination of vector similarity and BM25 keyword search. This retriever must be connected to an AzureAISearchDocumentStore to run.

Hybrid retrieval combines the benefits of both semantic (vector) search and keyword (BM25) search, often providing better results than either method alone.
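To make the idea of combining two rankings concrete, here is a minimal sketch of Reciprocal Rank Fusion (RRF), a common way to merge a vector ranking with a keyword ranking. It is an illustration only and not the retriever's code: with this component, the fusion happens inside the Azure AI Search service. The function name, the document IDs, and the constant k = 60 are assumptions chosen for the example.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into a single ranked list."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top hit contributes 1 / (k + 1)
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID to keep the order stable
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two search modes
vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Documents that appear near the top of both lists (here doc_a and doc_c) rise above documents that only one method found, which is the intuition behind hybrid retrieval.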

Azure AI Search Setup

To use this retriever, you need an Azure AI Search service with a properly configured index that supports both vector and keyword search. For more information, see the Azure AI Search documentation.

Usage Example

Initializing the Component

components:
  AzureAISearchHybridRetriever:
    type: haystack_integrations.components.retrievers.azure_ai_search.hybrid_retriever.AzureAISearchHybridRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.azure_ai_search.document_store.AzureAISearchDocumentStore
        init_parameters:
          api_key:
            type: env_var
            env_vars:
              - AZURE_SEARCH_API_KEY
            strict: false
          azure_endpoint:
            type: env_var
            env_vars:
              - AZURE_SEARCH_ENDPOINT
            strict: false
          index_name: my-index
      top_k: 10

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

Parameter | Type | Default | Description
document_store | AzureAISearchDocumentStore | | An instance of AzureAISearchDocumentStore to use with the retriever.
filters | Optional[Dict[str, Any]] | None | Filters applied when fetching documents from the document store. Filters are applied during the hybrid search to ensure the retriever returns top_k matching documents.
top_k | int | 10 | Maximum number of documents to return.
filter_policy | Union[str, FilterPolicy] | FilterPolicy.REPLACE | Policy that determines how filters are applied. Can be REPLACE (runtime filters replace initialization filters) or MERGE (runtime filters are merged with initialization filters).
query_type | Optional[str] | None | A string indicating the type of query to perform. Possible values are simple, full, and semantic.
semantic_configuration_name | Optional[str] | None | The name of the semantic configuration to use when processing semantic queries.
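The difference between the two filter_policy values can be sketched in plain Python. This is a simplified illustration, not the library's implementation: resolve_filters is a hypothetical helper, the metadata fields are made up, and two behaviors are assumptions (MERGE joins the two filters with a logical AND, and REPLACE falls back to the initialization filters when no runtime filters are passed).

```python
from typing import Any, Dict, Optional

Filters = Dict[str, Any]


def resolve_filters(
    policy: str,
    init_filters: Optional[Filters],
    runtime_filters: Optional[Filters],
) -> Optional[Filters]:
    """Hypothetical helper that mimics the REPLACE and MERGE policies."""
    if policy == "replace":
        # Runtime filters win when given; otherwise keep the init filters (assumption)
        return runtime_filters if runtime_filters is not None else init_filters
    if policy == "merge":
        if init_filters and runtime_filters:
            # Assumption: both filters must match, so combine them with AND
            return {"operator": "AND", "conditions": [init_filters, runtime_filters]}
        return runtime_filters or init_filters
    raise ValueError(f"Unknown filter policy: {policy}")


# Made-up metadata fields, written as field/operator/value conditions
init_filters = {"field": "meta.language", "operator": "==", "value": "en"}
runtime_filters = {"field": "meta.year", "operator": ">=", "value": 2023}

replaced = resolve_filters("replace", init_filters, runtime_filters)
merged = resolve_filters("merge", init_filters, runtime_filters)
print(replaced)
print(merged)
```

With REPLACE, only the year condition reaches the search. With MERGE, a document has to satisfy both the language and the year condition.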

Run Method Parameters

These are the parameters you can configure for the component's run() method. You can pass these parameters at query time through the API, in Playground, or when running a job.

Parameter | Type | Default | Description
query | str | | Text of the query.
query_embedding | List[float] | | A list of floats representing the query embedding.
filters | Optional[Dict[str, Any]] | None | Filters applied to the retrieved documents. The way runtime filters are applied depends on the filter_policy chosen at retriever initialization.
top_k | Optional[int] | None | The maximum number of documents to retrieve.
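Because top_k defaults to None at run time but to 10 at initialization, a reasonable reading is that a query without its own top_k uses the initialization value. The sketch below illustrates that reading only; effective_top_k is a hypothetical helper, and the fallback rule is an assumption rather than confirmed component behavior.

```python
from typing import Optional


def effective_top_k(init_top_k: int, runtime_top_k: Optional[int] = None) -> int:
    """Hypothetical helper: choose the top_k that applies to a single query."""
    # Assumption: None at run time means "use the value set at initialization"
    return init_top_k if runtime_top_k is None else runtime_top_k


default_limit = effective_top_k(10)      # no run-time override
query_limit = effective_top_k(10, 3)     # run-time value takes precedence
print(default_limit, query_limit)
```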