AstraEmbeddingRetriever
Retrieve documents from an Astra DB document store using vector similarity.
Basic Information
- Type: haystack_integrations.components.retrievers.astra.retriever.AstraEmbeddingRetriever
- Components it can connect with:
  - TextEmbedder: AstraEmbeddingRetriever receives query embeddings from a text embedder.
  - Ranker: AstraEmbeddingRetriever sends retrieved documents to a Ranker for reranking.
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| query_embedding | List[float] | | Floats representing the query embedding. |
| filters | Optional[Dict[str, Any]] | None | Filters applied to the retrieved documents. The way runtime filters are applied depends on the filter_policy chosen at retriever initialization. |
| top_k | Optional[int] | None | The maximum number of documents to retrieve. If not set, the top_k value from initialization is used. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | A list of documents retrieved from the AstraDocumentStore. |
Overview
Use AstraEmbeddingRetriever to retrieve documents from an Astra DB document store using vector similarity search. This retriever must be connected to an AstraDocumentStore to run.
Astra DB is a cloud-native, multi-cloud database service built on Apache Cassandra, optimized for modern, data-intensive applications.
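To make "vector similarity search" concrete, here is a minimal, library-free sketch of the idea: score every stored embedding against the query embedding and keep the top_k best matches. It assumes cosine similarity and an in-memory dictionary of embeddings. The names `cosine_similarity`, `retrieve`, and `stored` are hypothetical and chosen for illustration. This is not how Astra DB or AstraEmbeddingRetriever is implemented, because the actual search runs inside the database.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve(
    query_embedding: List[float],
    stored: Dict[str, List[float]],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Score every stored embedding and return the top_k (id, score) pairs."""
    scored = [
        (doc_id, cosine_similarity(query_embedding, embedding))
        for doc_id, embedding in stored.items()
    ]
    # Highest similarity first, then cut the list to top_k entries.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical 3-dimensional embeddings; real ones match embedding_dim (for example, 768).
stored = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
results = retrieve([1.0, 0.0, 0.0], stored, top_k=2)
```

With these toy vectors, `results` holds the two documents whose embeddings point most nearly in the same direction as the query. The query embedding and the stored embeddings must have the same dimension for the comparison to be meaningful.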
Astra DB Setup
To use this retriever, you need an Astra DB account and a properly configured collection. For more information, see the Astra DB documentation.
Usage Example
Initializing the Component
components:
  AstraEmbeddingRetriever:
    type: haystack_integrations.components.retrievers.astra.retriever.AstraEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.astra.document_store.AstraDocumentStore
        init_parameters:
          api_endpoint:
            type: env_var
            env_vars:
              - ASTRA_API_ENDPOINT
            strict: false
          token:
            type: env_var
            env_vars:
              - ASTRA_TOKEN
            strict: false
          collection_name: my-collection
          embedding_dim: 768
      top_k: 10
Parameters
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| document_store | AstraDocumentStore | | An instance of AstraDocumentStore to use with the retriever. |
| filters | Optional[Dict[str, Any]] | None | A dictionary with filters to narrow down the search space. |
| top_k | int | 10 | The maximum number of documents to retrieve. |
| filter_policy | Union[str, FilterPolicy] | FilterPolicy.REPLACE | Policy to determine how filters are applied. Can be REPLACE (runtime filters replace initialization filters) or MERGE (runtime filters are merged with initialization filters). |
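The difference between the two filter_policy values can be sketched in plain Python. The helper below is hypothetical (`resolve_filters` is not a library function), and it makes two assumptions: filters use a Haystack-style dictionary syntax with `field`, `operator`, and `value` keys, and MERGE combines both filter sets with a logical AND. The library's actual merge logic may handle overlapping fields or nested conditions differently.

```python
from typing import Any, Dict, Optional

Filters = Optional[Dict[str, Any]]


def resolve_filters(policy: str, init_filters: Filters, runtime_filters: Filters) -> Filters:
    """Illustrative only: pick the filters that a query would effectively use."""
    if policy == "replace":
        # Runtime filters win outright; fall back to init filters if none are passed.
        return runtime_filters if runtime_filters is not None else init_filters
    if policy == "merge":
        if init_filters is None:
            return runtime_filters
        if runtime_filters is None:
            return init_filters
        # Assumption: merging means both filter sets must match (logical AND).
        return {"operator": "AND", "conditions": [init_filters, runtime_filters]}
    raise ValueError(f"Unknown filter policy: {policy!r}")


# Hypothetical metadata fields, used only for this example.
init_filters = {"field": "meta.language", "operator": "==", "value": "en"}
runtime_filters = {"field": "meta.year", "operator": ">=", "value": 2023}

replaced = resolve_filters("replace", init_filters, runtime_filters)
merged = resolve_filters("merge", init_filters, runtime_filters)
```

Under these assumptions, `replaced` contains only the year condition, while `merged` requires both the language and the year conditions to hold. REPLACE suits cases where each query should fully control filtering. MERGE suits cases where the initialization filters act as a fixed baseline.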
Run Method Parameters
These are the parameters you can configure for the component's run() method. You can pass these parameters at query time through the API, in Playground, or when running a job.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query_embedding | List[float] | | Floats representing the query embedding. |
| filters | Optional[Dict[str, Any]] | None | Filters applied to the retrieved documents. The way runtime filters are applied depends on the filter_policy chosen at retriever initialization. |
| top_k | Optional[int] | None | The maximum number of documents to retrieve. If not set, the top_k value from initialization is used. |