
AstraEmbeddingRetriever

Retrieve documents from an Astra DB document store using vector similarity.

Basic Information

  • Type: haystack_integrations.components.retrievers.astra.retriever.AstraEmbeddingRetriever
  • Components it can connect with:
    • TextEmbedder: AstraEmbeddingRetriever receives query embeddings from a text embedder.
    • Ranker: AstraEmbeddingRetriever sends retrieved documents to a Ranker for reranking.

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query_embedding | List[float] | | Floats representing the query embedding. |
| filters | Optional[Dict[str, Any]] | None | Filters applied to the retrieved documents. The way runtime filters are applied depends on the filter_policy chosen at retriever initialization. |
| top_k | Optional[int] | None | The maximum number of documents to retrieve. |

Outputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| documents | List[Document] | | A list of documents retrieved from the AstraDocumentStore. |

Overview

Use AstraEmbeddingRetriever to retrieve documents from an Astra DB document store using vector similarity search. The retriever cannot run on its own: it needs an AstraDocumentStore, which you pass through the document_store init parameter.

Astra DB is a cloud-native, multi-cloud database service built on Apache Cassandra, optimized for modern, data-intensive applications.
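To make "vector similarity search" concrete, here is a minimal pure-Python sketch of ranking stored embeddings against a query embedding and keeping the top_k best matches. It assumes cosine similarity as the metric and uses a small in-memory dictionary; the function names and sample vectors are hypothetical, and this is not how Astra DB or the retriever implements the search internally.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve(
    query_embedding: List[float],
    stored: Dict[str, List[float]],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Score every stored embedding and return the top_k (id, score) pairs."""
    scored = [
        (doc_id, cosine_similarity(query_embedding, embedding))
        for doc_id, embedding in stored.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # best match first
    return scored[:top_k]


# Hypothetical three-dimensional embeddings, for illustration only.
stored_embeddings = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.7, 0.7, 0.0],
    "doc-c": [0.0, 0.0, 1.0],
}
results = retrieve([1.0, 0.1, 0.0], stored_embeddings, top_k=2)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

In a real pipeline the query embedding comes from a text embedder, and it must have the same dimensionality as the embeddings stored in the collection.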

Astra DB Setup

To use this retriever, you need an Astra DB account and a properly configured collection. For more information, see the Astra DB documentation.

Usage Example

Initializing the Component

components:
  AstraEmbeddingRetriever:
    type: haystack_integrations.components.retrievers.astra.retriever.AstraEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.astra.document_store.AstraDocumentStore
        init_parameters:
          api_endpoint:
            type: env_var
            env_vars:
              - ASTRA_API_ENDPOINT
            strict: false
          token:
            type: env_var
            env_vars:
              - ASTRA_TOKEN
            strict: false
          collection_name: my-collection
          embedding_dimension: 768
      top_k: 10
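The api_endpoint and token entries in the configuration above are not literal values; they describe where to look the value up at runtime. The following is a rough sketch of that lookup, assuming the usual semantics of an env_var secret: the first environment variable that is set wins, and strict: false means a missing variable yields no value instead of an error. The function resolve_env_secret and the example endpoint are hypothetical and only mirror the idea, not the platform's actual code.

```python
import os
from typing import Any, Dict, Optional


def resolve_env_secret(spec: Dict[str, Any]) -> Optional[str]:
    """Resolve an env_var-style secret spec to its value, if any."""
    if spec.get("type") != "env_var":
        raise ValueError(f"Unsupported secret type: {spec.get('type')!r}")
    # Return the first listed environment variable that is set.
    for name in spec["env_vars"]:
        value = os.environ.get(name)
        if value is not None:
            return value
    # Nothing was set: strict specs fail loudly, non-strict ones return None.
    if spec.get("strict", True):
        raise ValueError(f"None of the variables {spec['env_vars']} are set")
    return None


# Simulate an environment where only the endpoint is configured.
os.environ["ASTRA_API_ENDPOINT"] = "https://example.invalid"  # placeholder value
os.environ.pop("ASTRA_TOKEN", None)

endpoint = resolve_env_secret(
    {"type": "env_var", "env_vars": ["ASTRA_API_ENDPOINT"], "strict": False}
)
token = resolve_env_secret(
    {"type": "env_var", "env_vars": ["ASTRA_TOKEN"], "strict": False}
)
print(endpoint, token)
```

With strict: false, a missing token does not stop the configuration from loading, so a connection failure may only surface later when the document store first contacts Astra DB.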

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| document_store | AstraDocumentStore | | An instance of AstraDocumentStore to use with the retriever. |
| filters | Optional[Dict[str, Any]] | None | A dictionary with filters to narrow down the search space. |
| top_k | int | 10 | The maximum number of documents to retrieve. |
| filter_policy | Union[str, FilterPolicy] | FilterPolicy.REPLACE | Policy to determine how filters are applied. Can be REPLACE (runtime filters replace initialization filters) or MERGE (runtime filters are merged with initialization filters). |
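The difference between the two filter_policy values is easiest to see with a small example. The sketch below is a simplified, self-contained model: FilterPolicy here is a local stand-in enum, effective_filters is a hypothetical helper, and MERGE is modeled as a plain logical AND of both filters. The merge logic in Haystack itself handles more cases (for example, two filters on the same field), so treat this as an illustration of the idea rather than the library's behavior.

```python
from enum import Enum
from typing import Any, Dict, Optional

Filters = Dict[str, Any]


class FilterPolicy(Enum):
    """Local stand-in for the two policies described in the table above."""
    REPLACE = "replace"
    MERGE = "merge"


def effective_filters(
    policy: FilterPolicy,
    init_filters: Optional[Filters],
    runtime_filters: Optional[Filters],
) -> Optional[Filters]:
    """Decide which filters a query would run with (simplified model)."""
    if not runtime_filters:
        return init_filters  # nothing passed at query time
    if policy is FilterPolicy.REPLACE or not init_filters:
        return runtime_filters  # runtime filters take over completely
    # MERGE: assumed here to require both filters to match.
    return {"operator": "AND", "conditions": [init_filters, runtime_filters]}


# Hypothetical metadata filters, for illustration only.
init = {"field": "meta.language", "operator": "==", "value": "en"}
runtime = {"field": "meta.year", "operator": ">=", "value": 2023}

replaced = effective_filters(FilterPolicy.REPLACE, init, runtime)
merged = effective_filters(FilterPolicy.MERGE, init, runtime)
print(replaced)
print(merged)
```

With REPLACE, the language restriction set at initialization is dropped as soon as a query supplies its own filters; with MERGE, it keeps applying alongside them.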

Run Method Parameters

These are the parameters you can configure for the component's run() method. You can pass these parameters at query time through the API, in Playground, or when running a job.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query_embedding | List[float] | | Floats representing the query embedding. |
| filters | Optional[Dict[str, Any]] | None | Filters applied to the retrieved documents. The way runtime filters are applied depends on the filter_policy chosen at retriever initialization. |
| top_k | Optional[int] | None | The maximum number of documents to retrieve. |