DocumentWriter
Write documents to a document store.
Key Features
- Writes a list of Document objects to any compatible DocumentStore.
- Configures duplicate handling policies: skip, fail on, or overwrite duplicates, or defer to the document store's default.
- Returns the count of documents written to the store.
- Used at the end of indexing pipelines to persist processed documents.
Configuration
- Drag the DocumentWriter component onto the canvas from the Component Library.
- Click the component to open the configuration panel.
- On the General tab:
  - Set document_store to the target document store instance.
- Go to the Advanced tab to configure the duplicate handling policy (NONE, SKIP, OVERWRITE, or FAIL).
Connections
DocumentWriter accepts a list of Document objects (documents) and an optional policy as input. It outputs documents_written, an integer representing the number of documents written.
Typically, DocumentWriter is the last component in an indexing pipeline. Connect a preprocessor or document joiner to the documents input to provide the final processed documents.
Usage Example
components:
  DocumentWriter:
    type: components.writers.document_writer.DocumentWriter
    init_parameters:
Parameters
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | A list of documents to write to the document store. |
| policy | Optional[DuplicatePolicy] | None | The policy to use when encountering duplicate documents. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents_written | int | | Number of documents written to the document store. |
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| document_store | DocumentStore | | The instance of the document store where you want to store your documents. |
| policy | DuplicatePolicy | DuplicatePolicy.NONE | The policy to apply when a Document with the same ID already exists in the DocumentStore. - DuplicatePolicy.NONE: Default policy, relies on the DocumentStore settings. - DuplicatePolicy.SKIP: Skips documents with the same ID and doesn't write them to the DocumentStore. - DuplicatePolicy.OVERWRITE: Overwrites documents with the same ID. - DuplicatePolicy.FAIL: Raises an error if a Document with the same ID is already in the DocumentStore. |
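
To make the four policies concrete, here is a minimal sketch of how they might behave against a toy, dict-backed store. It does not use the real DocumentWriter or DocumentStore classes: the `DuplicatePolicy` enum is a local stand-in that only mirrors the names in the table, and `write_documents` and `DuplicateDocumentError` are hypothetical helpers. The sketch also assumes that this toy store's own default (what NONE falls back to) is to fail on duplicates; a real document store may choose differently.

```python
from enum import Enum


class DuplicatePolicy(Enum):
    # Local stand-in mirroring the four policy names from the table above.
    NONE = "none"
    SKIP = "skip"
    OVERWRITE = "overwrite"
    FAIL = "fail"


class DuplicateDocumentError(Exception):
    """Hypothetical error raised when FAIL meets an existing ID."""


def write_documents(store, documents, policy=DuplicatePolicy.NONE):
    """Write (doc_id, content) pairs into a dict-backed toy store.

    Returns the number of documents actually written.
    """
    if policy is DuplicatePolicy.NONE:
        # Assumption: this toy store's own default is to fail on duplicates.
        policy = DuplicatePolicy.FAIL

    written = 0
    for doc_id, content in documents:
        if doc_id in store:
            if policy is DuplicatePolicy.FAIL:
                raise DuplicateDocumentError(f"ID {doc_id!r} already exists")
            if policy is DuplicatePolicy.SKIP:
                continue  # keep the stored version and do not count it
            # OVERWRITE falls through and replaces the stored version.
        store[doc_id] = content
        written += 1
    return written


store = {"a": "old text"}
batch = [("a", "new text"), ("b", "fresh text")]

# Each call works on a copy so the two policies start from the same state.
skipped = write_documents(dict(store), batch, DuplicatePolicy.SKIP)
overwritten = write_documents(dict(store), batch, DuplicatePolicy.OVERWRITE)
```

In this sketch, SKIP writes only the new document `b`, while OVERWRITE writes both, so the returned counts differ for the same batch.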
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | A list of documents to write to the document store. |
| policy | Optional[DuplicatePolicy] | None | The policy to use when encountering duplicate documents. |
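
The relationship between the init-time policy and the run-time policy can be sketched as follows. `SketchWriter` is a hypothetical stand-in, not the real component, and it uses plain strings instead of DuplicatePolicy values. It assumes that a run-time policy of None means "fall back to the policy set at init", and it simplifies duplicate handling to two cases: overwrite, or skip.

```python
from typing import Optional


class SketchWriter:
    """Hypothetical stand-in for a writer component; not the real class."""

    def __init__(self, store: dict, policy: str = "none"):
        self.store = store
        self.policy = policy  # configured once, at init time

    def run(self, documents: list, policy: Optional[str] = None) -> dict:
        # Assumption: a run-time policy of None means "use the init policy".
        effective = self.policy if policy is None else policy
        written = 0
        for doc_id, content in documents:
            if doc_id in self.store and effective != "overwrite":
                continue  # simplification: every other policy acts as skip
            self.store[doc_id] = content
            written += 1
        return {"documents_written": written}


writer = SketchWriter(store={"a": "old text"}, policy="skip")

# No run-time policy: the init policy ("skip") applies.
default_run = writer.run([("a", "new text")])

# Run-time policy supplied: it takes precedence for this call only.
override_run = writer.run([("a", "new text")], policy="overwrite")
```

The output dictionary uses the same `documents_written` key as the component's output, so the per-call override is visible directly in the count.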