MongoDBAtlasEmbeddingRetriever

Retrieves documents from the MongoDBAtlasDocumentStore by embedding similarity.

Basic Information

Type: haystack_integrations.mongodb_atlas.src.haystack_integrations.components.retrievers.mongodb_atlas.embedding_retriever.MongoDBAtlasEmbeddingRetriever

Inputs

Parameter	Type	Default	Description
query_embedding	List[float]		Embedding of the query.
filters	Optional[Dict[str, Any]]	None	Filters applied to the retrieved Documents. The way runtime filters are applied depends on the `filter_policy` chosen at retriever initialization. See init method docstring for more details.
top_k	Optional[int]	None	Maximum number of Documents to return. Overrides the value specified at initialization.

Outputs

Parameter	Type	Default	Description
documents	List[Document]		List of Documents most similar to the given `query_embedding`, returned under the `documents` key of the output dictionary.

Overview

Work in Progress

Bear with us while we add pipeline examples and the most common component connections.

Retrieves documents from the MongoDBAtlasDocumentStore by embedding similarity.

How similarity is computed depends on the vector_search_index configured in the MongoDBAtlasDocumentStore and on the metric chosen when that index was created (cosine, dot product, or euclidean). See MongoDBAtlasDocumentStore for more information.
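As a rough illustration of the three metrics named above, the following sketch computes each one for a pair of small vectors in plain Python. The helper names are made up for this example and are not part of the integration. MongoDB Atlas computes the metric inside the vector search index and may report scores on a different scale, so treat this only as the underlying math.

```python
import math
from typing import List


def dot_product(a: List[float], b: List[float]) -> float:
    # Sum of element-wise products.
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: List[float], b: List[float]) -> float:
    # Dot product divided by the product of the vector lengths.
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a: List[float], b: List[float]) -> float:
    # Straight-line distance between the two points.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [1.0, 0.0]
doc = [2.0, 0.0]  # same direction as the query, twice the length

print(cosine_similarity(query, doc))
print(dot_product(query, doc))
print(euclidean_distance(query, doc))
```

The two vectors point in the same direction, so cosine similarity ignores the difference in length, while dot product and euclidean distance are both affected by it.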

Usage example:

import numpy as np
from haystack_integrations.document_stores.mongodb_atlas import MongoDBAtlasDocumentStore
from haystack_integrations.components.retrievers.mongodb_atlas import MongoDBAtlasEmbeddingRetriever

# The document store reads the connection string from the
# MONGO_CONNECTION_STRING environment variable by default.
store = MongoDBAtlasDocumentStore(database_name="haystack_integration_test",
                                  collection_name="test_embeddings_collection",
                                  vector_search_index="cosine_index",
                                  full_text_search_index="full_text_index")
retriever = MongoDBAtlasEmbeddingRetriever(document_store=store)

# Query with a random 768-dimensional embedding (placeholder for a real query embedding).
results = retriever.run(query_embedding=np.random.random(768).tolist())
print(results["documents"])

The example above retrieves the 10 documents most similar to a random query embedding from the MongoDBAtlasDocumentStore (10 is the default `top_k`). The dimension of the query_embedding must match the dimension of the embeddings stored in the MongoDBAtlasDocumentStore.
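One way to catch a dimension mismatch before calling the retriever is a small guard like the sketch below. `check_embedding_dim` is a hypothetical helper written for this example, not part of the integration, and the value 768 is simply the dimension used in the example above.

```python
from typing import List


def check_embedding_dim(query_embedding: List[float], expected_dim: int) -> List[float]:
    # Hypothetical guard: fail early if the query embedding does not have
    # the same number of dimensions as the stored embeddings.
    if len(query_embedding) != expected_dim:
        raise ValueError(
            f"query_embedding has {len(query_embedding)} dimensions, "
            f"expected {expected_dim}"
        )
    return query_embedding


# 768 matches the example above; use the dimension of your own embedding model.
embedding = check_embedding_dim([0.0] * 768, expected_dim=768)
print(len(embedding))
```

The checked list can then be passed as `query_embedding` to the retriever's `run()` method.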

Usage Example

components:
  MongoDBAtlasEmbeddingRetriever:
    type: mongodb_atlas.src.haystack_integrations.components.retrievers.mongodb_atlas.embedding_retriever.MongoDBAtlasEmbeddingRetriever
    init_parameters:

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

Parameter	Type	Default	Description
document_store	MongoDBAtlasDocumentStore		An instance of MongoDBAtlasDocumentStore.
filters	Optional[Dict[str, Any]]	None	Filters applied to the retrieved Documents. Make sure that the fields used in the filters are included in the configuration of the `vector_search_index`. The configuration must be done manually in the Web UI of MongoDB Atlas.
top_k	int	10	Maximum number of Documents to return.
filter_policy	Union[str, FilterPolicy]	FilterPolicy.REPLACE	Policy to determine how filters are applied.
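To make the idea of a filter policy more concrete, here is a simplified sketch of how initialization filters and runtime filters could be combined under a replace policy and a merge policy. `SketchFilterPolicy` and `resolve_filters` are hypothetical names for this illustration. The actual logic lives in Haystack's own `FilterPolicy` handling and may differ in detail, so check the init method docstring for the authoritative behavior.

```python
from enum import Enum
from typing import Any, Dict, Optional


class SketchFilterPolicy(Enum):
    # Hypothetical stand-in for the real FilterPolicy enum.
    REPLACE = "replace"
    MERGE = "merge"


def resolve_filters(
    policy: SketchFilterPolicy,
    init_filters: Optional[Dict[str, Any]],
    runtime_filters: Optional[Dict[str, Any]],
) -> Optional[Dict[str, Any]]:
    # No runtime filters: fall back to the filters given at initialization.
    if runtime_filters is None:
        return init_filters
    # Merge (simplified assumption): require both filter sets to match.
    if policy is SketchFilterPolicy.MERGE and init_filters:
        return {"operator": "AND", "conditions": [init_filters, runtime_filters]}
    # Replace: runtime filters take the place of the initialization filters.
    return runtime_filters


init_f = {"field": "meta.lang", "operator": "==", "value": "en"}
run_f = {"field": "meta.year", "operator": ">=", "value": 2023}

print(resolve_filters(SketchFilterPolicy.REPLACE, init_f, run_f))
print(resolve_filters(SketchFilterPolicy.MERGE, init_f, run_f))
```

Under the replace policy only the runtime filter survives, while the merge sketch keeps both conditions. In either case, the filtered fields still need to be included in the `vector_search_index` configuration.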

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

Parameter	Type	Default	Description
query_embedding	List[float]		Embedding of the query.
filters	Optional[Dict[str, Any]]	None	Filters applied to the retrieved Documents. The way runtime filters are applied depends on the `filter_policy` chosen at retriever initialization. See init method docstring for more details.
top_k	Optional[int]	None	Maximum number of Documents to return. Overrides the value specified at initialization.
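The override behavior of `top_k` can be pictured with the short sketch below. `resolve_top_k` is a hypothetical helper that only mirrors the rule stated in the table: a value passed at run time wins, and otherwise the initialization value applies. It is not the component's actual code.

```python
from typing import Optional


def resolve_top_k(init_top_k: int, runtime_top_k: Optional[int] = None) -> int:
    # A runtime value, when given, overrides the value set at initialization.
    if runtime_top_k is not None:
        return runtime_top_k
    return init_top_k


print(resolve_top_k(10))     # no runtime value: the init value applies
print(resolve_top_k(10, 3))  # runtime value overrides the init value
```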
