TextConverter
deepset Cloud pipelines search through Documents stored in the DocumentStore. Documents are passages of plain text. Use TextConverter to convert your files to Document objects that pipelines can use for search.
TextConverter preprocesses files and returns Documents.
A typical use of TextConverter is in an indexing pipeline, where it converts your files to plain-text Document objects. If you add a converter to your indexing pipeline, the conversion happens only once, when you deploy the pipeline. Your files are not converted every time you run a search.
After the files are converted, they're stored in the DocumentStore.
Basic Information
- Pipeline type: Used in indexing pipelines.
- Position in a pipeline: Either at the very beginning or after a FileTypeClassifier.
- Nodes that can precede it in a pipeline: None when used as the first node (it takes [File] as input), or FileTypeClassifier
- Nodes that can follow it in a pipeline: PreProcessor
- Node input: File
- Node output: Documents
- Available node classes: TextConverter
Usage Examples
You can use TextConverter as the first node in your indexing pipeline:
...
components:
  - name: TextFileConverter
    type: TextConverter
...
pipelines:
  - name: indexing
    nodes:
      - name: TextFileConverter
        inputs: [File]
      - name: Preprocessor
        inputs: [TextFileConverter]
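The other supported position, after a FileTypeClassifier, might look roughly like the sketch below. It is a fragment under stated assumptions rather than a verified configuration: it assumes the classifier's default routing sends plain-text files to its first output edge (`output_1`), and it leaves out the DocumentStore and any converters for other file types. Check the FileTypeClassifier documentation for the actual edge numbering before relying on it.

```yaml
components:
  - name: FileTypeClassifier
    type: FileTypeClassifier
  - name: TextFileConverter
    type: TextConverter
  - name: Preprocessor
    type: PreProcessor

pipelines:
  - name: indexing
    nodes:
      - name: FileTypeClassifier
        inputs: [File]
      - name: TextFileConverter
        # Assumption: output_1 is the edge that carries .txt files
        inputs: [FileTypeClassifier.output_1]
      - name: Preprocessor
        inputs: [TextFileConverter]
```

With this layout, only files the classifier identifies as plain text reach TextConverter, and other file types can be routed to their own converters on the remaining output edges.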