
WeaviateEmbeddingRetriever

Retrieve documents from a Weaviate document store using vector search: the retriever returns the documents whose embeddings are most similar to the query embedding.
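To illustrate what embedding-based retrieval means, the following minimal sketch ranks a few in-memory vectors by cosine distance to a query embedding and keeps the top_k closest ones. It is plain Python for illustration only, not how this component or Weaviate is implemented: Weaviate typically answers such queries with a vector index, and the distance metric depends on how the collection is configured. The function names cosine_distance and retrieve and the sample vectors are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine_distance(a: List[float], b: List[float]) -> float:
    """Cosine distance in [0, 2]: 0 means same direction, 2 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def retrieve(
    query_embedding: List[float],
    doc_embeddings: Dict[str, List[float]],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Return up to top_k (doc_id, distance) pairs, closest first."""
    scored = [
        (doc_id, cosine_distance(query_embedding, embedding))
        for doc_id, embedding in doc_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1])  # smaller distance = more similar
    return scored[:top_k]


# Toy 2-dimensional embeddings (made up for this example).
docs = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.0, 1.0],
    "doc_c": [1.0, 1.0],
}
results = retrieve([1.0, 0.0], docs, top_k=2)
print([doc_id for doc_id, _ in results])  # ['doc_a', 'doc_c']
```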

Key Features

  • Embedding-based vector retrieval from a Weaviate vector database.
  • Configurable number of results with top_k.
  • Supports metadata filtering to narrow down the search space.
  • Supports distance and certainty thresholds for controlling result quality.
  • Configurable filter policy (replace or merge) for runtime filters.

Configuration

  1. Drag the WeaviateEmbeddingRetriever component onto the canvas from the Component Library.
  2. Click on the component to open the configuration panel.
  3. On the General tab:
    • Configure the WeaviateDocumentStore with your Weaviate instance URL.
    • Set top_k to control the maximum number of documents to retrieve.
  4. Go to the Advanced tab to configure filter_policy, distance, certainty, and default filters.

Connections

WeaviateEmbeddingRetriever receives query embeddings from a text embedder. It sends retrieved documents to downstream components such as PromptBuilder or a ranker.
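In open source Haystack, the same connection can be sketched in Python as shown here. This is a minimal example under stated assumptions, not a reference setup: it assumes the weaviate-haystack and sentence-transformers packages are installed, that a Weaviate instance is reachable at the placeholder URL, and that the documents were indexed with the same embedding model the text embedder uses. The function name build_query_pipeline and the component names text_embedder and retriever are hypothetical.

```python
def build_query_pipeline(weaviate_url: str = "http://localhost:8080", top_k: int = 5):
    """Build a text embedder -> WeaviateEmbeddingRetriever query pipeline."""
    # Imports are kept inside the function so that defining it does not
    # require the optional packages to be installed.
    from haystack import Pipeline
    from haystack.components.embedders import SentenceTransformersTextEmbedder
    from haystack_integrations.components.retrievers.weaviate import (
        WeaviateEmbeddingRetriever,
    )
    from haystack_integrations.document_stores.weaviate import WeaviateDocumentStore

    # Placeholder URL: point this at your own Weaviate instance.
    document_store = WeaviateDocumentStore(url=weaviate_url)

    pipeline = Pipeline()
    # Uses the embedder's default model; it must match the model used at indexing time.
    pipeline.add_component("text_embedder", SentenceTransformersTextEmbedder())
    pipeline.add_component(
        "retriever",
        WeaviateEmbeddingRetriever(document_store=document_store, top_k=top_k),
    )
    # The embedder's "embedding" output feeds the retriever's "query_embedding" input.
    pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
    return pipeline
```

Running the pipeline with pipeline.run({"text_embedder": {"text": "your query"}}) returns the retrieved documents under result["retriever"]["documents"], which a downstream component such as PromptBuilder can then consume.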

Source Code

To check this component's source code, open embedding_retriever.py in the Haystack Core Integrations repository.

Usage Examples

Basic Configuration

  WeaviateEmbeddingRetriever:
    type: weaviate.src.haystack_integrations.components.retrievers.weaviate.embedding_retriever.WeaviateEmbeddingRetriever
    init_parameters: {}

Within a pipeline definition, the same entry goes under the components key:

  components:
    WeaviateEmbeddingRetriever:
      type: weaviate.src.haystack_integrations.components.retrievers.weaviate.embedding_retriever.WeaviateEmbeddingRetriever
      init_parameters:

Parameters

Inputs

| Parameter | Type | Description |
| --- | --- | --- |
| query_embedding | List[float] | Embedding of the query. |
| filters | Optional[Dict[str, Any]] | Filters applied to the retrieved Documents. The way runtime filters are applied depends on the filter_policy chosen at retriever initialization. |
| top_k | Optional[int] | The maximum number of documents to return. |
| distance | Optional[float] | The maximum allowed distance between Documents' embeddings. |
| certainty | Optional[float] | Normalized distance between the result item and the search vector. |
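distance and certainty express a similarity threshold on two different scales. The sketch below assumes the Weaviate collection uses cosine distance (range 0 to 2), for which the Weaviate documentation gives the relation certainty = 1 - distance / 2; for other metrics this conversion does not apply. The helper names are hypothetical and only illustrate the arithmetic. In practice, set only one of the two thresholds per query.

```python
from typing import Optional


def certainty_from_cosine_distance(distance: float) -> float:
    """Map cosine distance (0..2) to certainty (1..0). Assumes cosine metric."""
    return 1.0 - distance / 2.0


def cosine_distance_from_certainty(certainty: float) -> float:
    """Inverse mapping: certainty (0..1) to cosine distance (2..0)."""
    return 2.0 * (1.0 - certainty)


def passes_threshold(
    distance: float,
    max_distance: Optional[float] = None,
    min_certainty: Optional[float] = None,
) -> bool:
    """Check one result against a distance ceiling or a certainty floor."""
    if max_distance is not None and distance > max_distance:
        return False
    if (
        min_certainty is not None
        and certainty_from_cosine_distance(distance) < min_certainty
    ):
        return False
    return True


# A result at cosine distance 0.5 has certainty 0.75.
print(certainty_from_cosine_distance(0.5))          # 0.75
print(passes_threshold(0.5, max_distance=0.6))      # True
print(passes_threshold(0.5, min_certainty=0.8))     # False
```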

Outputs

| Parameter | Type | Description |
| --- | --- | --- |
| documents | List[Document] | Retrieved documents. |

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| document_store | WeaviateDocumentStore |  | Instance of WeaviateDocumentStore used by this retriever. |
| filters | Optional[Dict[str, Any]] | None | Custom filters applied when running the retriever. |
| top_k | int | 10 | Maximum number of documents to return. |
| distance | Optional[float] | None | The maximum allowed distance between Documents' embeddings. |
| certainty | Optional[float] | None | Normalized distance between the result item and the search vector. |
| filter_policy | Union[str, FilterPolicy] | FilterPolicy.REPLACE | Policy to determine how filters are applied. |
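The difference between the replace and merge filter policies can be sketched as follows. This is a simplified illustration, not the library's code: it assumes Haystack-style filter dictionaries, and it models merge as a plain AND of the init-time and runtime filters, whereas the real merge logic handles more cases (for example, conflicting conditions on the same field). FilterPolicySketch and resolve_filters are hypothetical names.

```python
from enum import Enum
from typing import Any, Dict, Optional

Filters = Dict[str, Any]


class FilterPolicySketch(Enum):
    """Stand-in for the two policies named in the table above."""
    REPLACE = "replace"
    MERGE = "merge"


def resolve_filters(
    policy: FilterPolicySketch,
    init_filters: Optional[Filters],
    runtime_filters: Optional[Filters],
) -> Optional[Filters]:
    """Decide which filters a query would use under the given policy."""
    if runtime_filters is None:
        return init_filters  # nothing passed at run time: keep init filters
    if policy is FilterPolicySketch.REPLACE or init_filters is None:
        return runtime_filters  # runtime filters win outright
    # MERGE (simplified): a document must satisfy both filters.
    return {"operator": "AND", "conditions": [init_filters, runtime_filters]}


# Example filters (field names are made up for illustration).
init_filters = {"field": "meta.language", "operator": "==", "value": "en"}
runtime_filters = {"field": "meta.year", "operator": ">=", "value": 2023}

replaced = resolve_filters(FilterPolicySketch.REPLACE, init_filters, runtime_filters)
merged = resolve_filters(FilterPolicySketch.MERGE, init_filters, runtime_filters)
print(replaced)
print(merged)
```

With replace, only the runtime year filter applies; with merge, both the language and year conditions must hold.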

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query_embedding | List[float] |  | Embedding of the query. |
| filters | Optional[Dict[str, Any]] | None | Filters applied to the retrieved Documents. The way runtime filters are applied depends on the filter_policy chosen at retriever initialization. See the filter_policy init parameter for details. |
| top_k | Optional[int] | None | The maximum number of documents to return. |
| distance | Optional[float] | None | The maximum allowed distance between Documents' embeddings. |
| certainty | Optional[float] | None | Normalized distance between the result item and the search vector. |
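Because the run() parameters default to None while the init parameters carry concrete defaults, a reasonable reading is that a value passed at query time overrides the init value and None falls back to it. The sketch below illustrates that assumed behavior in plain Python; RetrieverSettings and effective_settings are hypothetical names, not part of the component.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RetrieverSettings:
    """Init-time values, mirroring the defaults in the Init Parameters table."""
    top_k: int = 10
    distance: Optional[float] = None
    certainty: Optional[float] = None


def effective_settings(
    init: RetrieverSettings,
    top_k: Optional[int] = None,
    distance: Optional[float] = None,
    certainty: Optional[float] = None,
) -> RetrieverSettings:
    """Assumed rule: a run-time value of None means 'use the init value'."""
    return RetrieverSettings(
        top_k=init.top_k if top_k is None else top_k,
        distance=init.distance if distance is None else distance,
        certainty=init.certainty if certainty is None else certainty,
    )


configured = RetrieverSettings(top_k=10, distance=0.3)
# Only top_k is overridden at query time; distance keeps its init value.
per_query = effective_settings(configured, top_k=3)
print(per_query)
```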