Connect to an External Document Store

Run your query pipelines on data stored in an external database, such as Pinecone, Weaviate, or Qdrant.

About this Task

Currently, deepset Cloud supports the following document stores:

  • OpenSearch (the core document store. Unless you run your own OpenSearch cluster, you don't need to provide any credentials and can use it out of the box.)
  • Elasticsearch
  • Pinecone
  • Qdrant
  • Snowflake (for details on connecting to Snowflake, see Connect to Your Snowflake Database)
  • Weaviate

These databases act as document stores in deepset Cloud. For more information, see Document Stores.

You can also add an integration with any other database through a custom component. For details, see Custom Components.

Prerequisites

  • An active API key for the database you want to use.
  • Basic knowledge of document stores in deepset Cloud. For details, see Document Stores.
  • Understanding of secrets in deepset Cloud. For more information, see Add Secrets to Connect to Third Party Providers.
  • Check the parameters you can configure for your document store, especially the name of the parameter for passing the API key. See the documentation for your document store in the Document Stores section.

Run Queries on Data in Your Document Store

This task involves three steps:

  1. Create a secret to securely store your API key to the document store.
  2. Write your documents to the document store.
  3. Retrieve documents from the document store.

Create a Secret for Your Document Store

This step lets deepset Cloud connect to the database you want to use as the document store without exposing the API key in the pipeline configuration.

  1. In deepset Cloud, click your initials in the top right corner and choose Secrets > Add New Secret.

  2. Give your secret the same name as the environment variable you want to store the API key in, for example QDRANT_KEY.

  3. Paste the API key into the Secret field and save it.

You'll then use the secret name as the API key for components that need to connect to the document store.

Write Documents into the Document Store

DocumentWriter is the component that writes preprocessed documents into the document store. You must add it at the end of your indexing pipeline.

  1. Build your indexing pipeline and add DocumentWriter as its last component.

ℹ️

If you're using a pipeline template, DocumentWriter is already there.

  2. On the DocumentWriter component card, click Configure under the document_store parameter.

    The configure button highlighted on the component card

This opens the YAML editor for configuring the document store.

  3. Add all the required parameters and make sure you pass the secret name as the API key. For example, if the name of the secret is QDRANT_KEY, specify: api_key: {"type": "env_var", "env_vars": ["QDRANT_KEY"], "strict": False}.
  4. Save your configuration.
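
As an illustration, a document store configuration for Qdrant might look roughly like the sketch below. It assumes the secret is named QDRANT_KEY and that the store is the QdrantDocumentStore from the Haystack Qdrant integration. The URL, index name, and embedding dimension are placeholders, not values from this guide, and the exact structure the editor expects may differ, so check the documentation for your document store before copying it.

```yaml
# Sketch only: the type path and parameter names assume the Haystack Qdrant integration.
type: haystack_integrations.document_stores.qdrant.document_store.QdrantDocumentStore
init_parameters:
  url: https://your-qdrant-host.example.com:6333  # placeholder: your Qdrant endpoint
  index: my_documents                             # placeholder: collection to write to
  embedding_dim: 768                              # placeholder: must match your embedding model
  # The secret name goes into env_vars; the key itself never appears in the configuration.
  api_key: {"type": "env_var", "env_vars": ["QDRANT_KEY"], "strict": False}
```

The only part taken directly from the steps above is the api_key entry. Everything else depends on the document store you chose.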

Retrieve Documents From the Document Store

Each document store has dedicated retrievers. Add them to your query pipeline and configure the document store they should connect to in the same way you configured DocumentWriter. Pass the secret's name as the API key for the document store.
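
For example, a retriever entry in a query pipeline could be sketched as follows. This assumes a Qdrant document store, a secret named QDRANT_KEY, and the QdrantEmbeddingRetriever from the Haystack Qdrant integration. The component name, URL, index, and top_k value are hypothetical placeholders. Use the same document store settings you used for DocumentWriter so that the retriever reads from the index the documents were written to.

```yaml
# Sketch only: component name, URL, and index are placeholders for illustration.
components:
  retriever:
    type: haystack_integrations.components.retrievers.qdrant.retriever.QdrantEmbeddingRetriever
    init_parameters:
      top_k: 10  # placeholder: number of documents to return
      document_store:
        type: haystack_integrations.document_stores.qdrant.document_store.QdrantDocumentStore
        init_parameters:
          url: https://your-qdrant-host.example.com:6333  # placeholder: same endpoint as in indexing
          index: my_documents                             # placeholder: same index as in indexing
          # Same secret reference as in the DocumentWriter configuration.
          api_key: {"type": "env_var", "env_vars": ["QDRANT_KEY"], "strict": False}
```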