DocumentWriter

Write documents to a DocumentStore.

DocumentWriter is used in indexing pipelines to persist processed documents into a DocumentStore so they can be retrieved later by query pipelines.

Key Features

  • Writes documents to any compatible DocumentStore.
  • Configurable duplicate handling: NONE, SKIP, OVERWRITE, or FAIL.
  • Returns the number of documents successfully written.

Configuration

  1. Drag the DocumentWriter component onto the canvas from the Component Library.
  2. Click on the component to open the configuration panel.
  3. On the General tab:
    1. Configure the document_store to specify where the documents should be written.
    2. Set the policy to control how duplicate documents are handled.
  4. Go to the Advanced tab if you need to override the policy at run time.

Connections

DocumentWriter receives a list of Document objects through its documents input, typically from a preprocessor, splitter, or embedder. It returns the number of documents it wrote through its documents_written output. Because nothing downstream needs the documents once they are stored, DocumentWriter is usually the last component in an indexing pipeline.

Source Code

To check this component's source code, open document_writer.py in the Haystack repository.

Usage Examples

Basic Configuration

  DocumentWriter:
    type: components.writers.document_writer.DocumentWriter
    init_parameters: {}

  components:
    DocumentWriter:
      type: components.writers.document_writer.DocumentWriter
      init_parameters:

Parameters

Inputs

Parameter | Type | Description
documents | List[Document] | A list of documents to write to the document store.
policy | Optional[DuplicatePolicy] | The policy to use when encountering duplicate documents.

Outputs

Parameter | Type | Description
documents_written | int | Number of documents written to the document store.

Init Parameters

These are the parameters you can configure in Pipeline Builder:

Parameter | Type | Default | Description
document_store | DocumentStore | | The instance of the document store where you want to store your documents.
policy | DuplicatePolicy | DuplicatePolicy.NONE | The policy to apply when a Document with the same ID already exists in the DocumentStore. DuplicatePolicy.NONE: Default policy, relies on the DocumentStore settings. DuplicatePolicy.SKIP: Skips documents with the same ID. DuplicatePolicy.OVERWRITE: Overwrites documents with the same ID. DuplicatePolicy.FAIL: Raises an error if a Document with the same ID is already in the DocumentStore.
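
The four policies described above can be illustrated with a small standard-library model. This is only a sketch of the semantics, not the Haystack implementation: the Policy enum, the write_documents function, and the dictionary-backed store are hypothetical stand-ins, and treating NONE like FAIL is an assumption made for this example (a real DocumentStore applies its own default).

```python
from enum import Enum


class Policy(Enum):
    """Hypothetical stand-in for DuplicatePolicy (not the Haystack class)."""

    NONE = "none"
    SKIP = "skip"
    OVERWRITE = "overwrite"
    FAIL = "fail"


def write_documents(store: dict, docs: list[dict], policy: Policy) -> int:
    """Write docs (dicts with 'id' and 'content') into store and return the count written.

    Assumption for this sketch: NONE behaves like FAIL. A real store decides
    what its own default is.
    """
    if policy is Policy.NONE:
        policy = Policy.FAIL
    written = 0
    for doc in docs:
        if doc["id"] in store:
            if policy is Policy.FAIL:
                raise ValueError(f"Document {doc['id']!r} already exists")
            if policy is Policy.SKIP:
                continue  # keep the stored version, do not count this document
            # OVERWRITE falls through and replaces the stored version
        store[doc["id"]] = doc["content"]
        written += 1
    return written


store: dict = {}
first = write_documents(store, [{"id": "a", "content": "v1"}], Policy.NONE)
skipped = write_documents(store, [{"id": "a", "content": "v2"}], Policy.SKIP)
content_after_skip = store["a"]
overwritten = write_documents(store, [{"id": "a", "content": "v3"}], Policy.OVERWRITE)
print(first, skipped, content_after_skip, overwritten, store["a"])
```

In this model, a skipped document leaves the stored content untouched and is not counted, while an overwritten document replaces the stored content and is counted as written.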

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

Parameter | Type | Default | Description
documents | List[Document] | | A list of documents to write to the document store.
policy | Optional[DuplicatePolicy] | None | The policy to use when encountering duplicate documents.