MongoDBAtlasDocumentStore

Use a MongoDB Atlas database as the document store that holds the data your pipelines query.

Basic Information

Overview

For details, see the MongoDB documentation and the Haystack documentation.

Authorization

To connect to the MongoDB database, you must provide a connection string in the format "mongodb+srv://{mongo_atlas_username}:{mongo_atlas_password}@{mongo_atlas_host}/?{mongo_atlas_params_string}". For detailed instructions on how to obtain it, see Create a Connection String in MongoDB documentation.
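
Usernames and passwords that contain reserved characters such as @ or : must be percent-encoded before they go into the connection string. The following is a minimal Python sketch of assembling a string in the format above. The function name, the sample credentials, the host, and the parameters are hypothetical and only illustrate the encoding step:

```python
from typing import Optional
from urllib.parse import quote_plus, urlencode


def build_connection_string(
    username: str, password: str, host: str, params: Optional[dict] = None
) -> str:
    """Assemble a mongodb+srv connection string in the format shown above."""
    # Percent-encode the credentials so characters like "@" or ":" do not break the URI.
    user = quote_plus(username)
    secret = quote_plus(password)
    # Encode the optional parameters as key=value pairs joined by "&".
    query = urlencode(params or {})
    return f"mongodb+srv://{user}:{secret}@{host}/?{query}"


connection_string = build_connection_string(
    "my_user",                       # hypothetical username
    "p@ss:word",                     # hypothetical password with reserved characters
    "cluster0.example.mongodb.net",  # hypothetical host
    {"retryWrites": "true", "w": "majority"},
)
print(connection_string)
```

In practice, copying the string from the Atlas interface is usually simpler. The sketch mainly shows why a password with special characters can fail when pasted in unencoded.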

Once you have your connection string, connect MongoDB to deepset Cloud on the Connections page:

  1. Log in to deepset Cloud.

  2. Click your initials in the top right corner and choose Connections.

  3. Scroll down the page to find MongoDB and click Connect next to it.

  4. Paste your MongoDB connection string and click Connect.

Usage

To configure MongoDB as the document store, you need:

  • The name of the database to use.
  • The name of the collection to use. This collection must have a vector search index set up on the embedding field.
  • The name of the vector search index to query. You can create the index on your collection in the Atlas web user interface.
    Important: Your MongoDB vector search index must define the embedding field as a field of type vector. (The knnVector type belongs to the older Atlas Search index format and isn't accepted in an index of type vectorSearch.) For example:
    const index = {
      name: "vector_index",
      type: "vectorSearch",
      definition: {
        "fields": [
          {
            "type": "vector",
            "path": "embedding",
            "similarity": "cosine",
            "numDimensions": 768
          }
        ]
      }
    };
    
    This allows you to use embedding retrieval with the MongoDB document store.
    For details, see MongoDB documentation.
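
A common source of empty or failing retrieval is an index whose numDimensions doesn't match the length of the vectors your embedding model produces. The following Python sketch checks an index definition shaped like the example above before you create it. The function name is hypothetical, the sample dictionary lists only the keys the check reads, and 768 is simply the dimension used in the example:

```python
# Similarity functions that Atlas vector search indexes support.
VALID_SIMILARITIES = {"euclidean", "cosine", "dotProduct"}


def check_index_definition(index: dict, embedding_dim: int, path: str = "embedding") -> list:
    """Return a list of problems found in a vector search index definition."""
    fields = index.get("definition", {}).get("fields", [])
    # Find the entry that indexes the embedding field.
    field = next((f for f in fields if f.get("path") == path), None)
    if field is None:
        return [f"no field indexed at path '{path}'"]
    problems = []
    # The index dimension must equal the length of the vectors the embedder produces.
    if field.get("numDimensions") != embedding_dim:
        problems.append(
            f"numDimensions is {field.get('numDimensions')}, "
            f"but the embedding model produces {embedding_dim}"
        )
    if field.get("similarity") not in VALID_SIMILARITIES:
        problems.append(f"unsupported similarity: {field.get('similarity')!r}")
    return problems


index = {
    "name": "vector_index",
    "type": "vectorSearch",
    "definition": {
        "fields": [
            {"path": "embedding", "similarity": "cosine", "numDimensions": 768}
        ]
    },
}

print(check_index_definition(index, embedding_dim=768))  # no problems expected
print(check_index_definition(index, embedding_dim=384))  # reports the dimension mismatch
```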

Writing Data to MongoDB

To write the preprocessed files into the MongoDB document store:

  1. Add DocumentWriter to your indexing pipeline.
  2. On the component card, click Configure under the document_store parameter. This opens a YAML editor where you can enter the document store initialization parameters. You must provide:
    • type: This is haystack_core_integrations.integrations.mongodb_atlas.MongoDBAtlasDocumentStore.
    • Then, in the init_parameters section, specify:
      • database_name: The name of your MongoDB database.
      • collection_name: The name of your collection.
      • vector_search_index: The name of the vector search index created on your collection. Make sure the index defines the embedding field as described in Usage.

Retrieving Files From MongoDB

To retrieve files from the MongoDB document store and use them for search:

  1. Add the MongoDBAtlasEmbeddingRetriever to your query pipeline.
  2. On the retriever's card, click Configure under the document_store parameter. This opens a YAML editor where you can enter the document store initialization parameters. You must provide:
    • type: This is haystack_core_integrations.integrations.mongodb_atlas.MongoDBAtlasDocumentStore.
    • Then, in the init_parameters section, specify:
      • database_name: The name of your MongoDB database.
      • collection_name: The name of your collection.
      • vector_search_index: The name of the vector search index created on your collection. Make sure the index defines the embedding field as described in Usage.

Examples

This is where you can access the configuration:

The Configure button under the document_store parameter on a component card

This is how you configure MongoDB document store init parameters:

The document store configuration window with the YAML editor open and the init parameters filled in.

Here is a YAML example you can copy:

type: haystack_core_integrations.integrations.mongodb_atlas.MongoDBAtlasDocumentStore
init_parameters:
  database_name: my_database
  collection_name: my_collection
  vector_search_index: my_vector_search_index
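
If you generate this configuration programmatically, a quick check for the three required init parameters can catch omissions before you paste the YAML into the editor. The following Python sketch works on the configuration as a plain dictionary. The helper name is hypothetical, and the type string is copied from the example above:

```python
# The init parameters this page lists as required.
REQUIRED_INIT_PARAMETERS = ("database_name", "collection_name", "vector_search_index")


def missing_init_parameters(config: dict) -> list:
    """Return the required init parameters that are absent or empty in the config."""
    init_parameters = config.get("init_parameters") or {}
    return [name for name in REQUIRED_INIT_PARAMETERS if not init_parameters.get(name)]


config = {
    "type": "haystack_core_integrations.integrations.mongodb_atlas.MongoDBAtlasDocumentStore",
    "init_parameters": {
        "database_name": "my_database",
        "collection_name": "my_collection",
        "vector_search_index": "my_vector_search_index",
    },
}

print(missing_init_parameters(config))  # an empty list means nothing is missing
```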

Init Parameters

For the full list of initialization parameters that MongoDBAtlasDocumentStore accepts, see the MongoDB Atlas API reference in the Haystack documentation.