OpenSearchDocumentStore

The core document store of deepset AI Platform.

Basic Information

Overview

For details on how this document store works, see OpenSearchDocumentStore in the Haystack documentation.

Authorization

Because this is the core document store of deepset AI Platform, we handle authorization for you.

Example Configuration

In an Indexing Pipeline

To write the preprocessed files into the document store:

  1. Add DocumentWriter to your pipeline.
  2. Add OpenSearchDocumentStore and configure it on the component card.
  3. Connect OpenSearchDocumentStore to DocumentWriter.
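
In YAML, the steps above might look roughly like the sketch below, where the document store becomes an argument of the writer. This is a minimal illustration, not a definitive configuration: the component name `writer` and the `policy` value are assumptions chosen for the example, and the document store settings are copied from the query pipeline example on this page.

```yaml
  writer: # hypothetical component name; writes the preprocessed documents into the document store
    type: haystack.components.writers.document_writer.DocumentWriter
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore # document store configuration
        init_parameters:
          index: default
          max_chunk_bytes: 104857600
          embedding_dim: 768 # assumed to match the dimension of your embedding model
          return_embedding: false
          create_index: true
      policy: OVERWRITE # assumption: replace documents that already exist; adjust to your needs
```

Keeping `index` and `embedding_dim` the same in the indexing and query pipelines helps ensure that the retrievers read from the index the writer populated.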

In a Query Pipeline

To retrieve files from the document store:

  1. Add an OpenSearch retriever to your pipeline.
  2. Add OpenSearchDocumentStore and configure it on the component card.
  3. Connect OpenSearchDocumentStore to the retriever.

Example

This is an example of the document store connected to a keyword and a vector retriever:

[Image: document store connected to retrievers]

When you switch to YAML, you'll see that the document store is an argument of each retriever:

  bm25_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore # document store configuration
        init_parameters:
          index: default
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          create_index: true
      top_k: 20 # The number of results to return
  embedding_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore # document store configuration
        init_parameters:
          index: default
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          create_index: true
      top_k: 20 # The number of results to return

Init Parameters

For a list of parameters you can configure, see the OpenSearchDocumentStore API reference in the Haystack documentation.