Basic Concepts

This page explains the terms and concepts used in deepset Cloud.

Document

Refers to an individual piece of text stored in the document store. Multiple documents may originally come from one file.

File

Refers to the raw file that you upload to deepset Cloud (for example, a PDF). When an indexing pipeline runs, files get converted, cleaned, and split into documents, which contain the actual text and are then used for finding the best answer to a query.
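To illustrate how one file can become several documents, here is a minimal Python sketch. It is not deepset Cloud's indexing code: the `Document` class, the `split_into_documents` function, and the `words_per_doc` parameter are hypothetical names chosen for this example, and the word-count split is only one simple way to divide text.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a document: a piece of text plus metadata."""
    content: str
    meta: dict = field(default_factory=dict)


def split_into_documents(file_name: str, raw_text: str, words_per_doc: int = 5) -> list:
    """Clean the raw text and split it into documents of at most words_per_doc words."""
    # Splitting on whitespace also collapses repeated spaces and newlines (the "clean" step).
    words = raw_text.split()
    documents = []
    for start in range(0, len(words), words_per_doc):
        chunk = " ".join(words[start:start + words_per_doc])
        # Each document remembers which file it came from.
        documents.append(Document(content=chunk, meta={"file_name": file_name}))
    return documents


# One file with seven words yields two documents.
docs = split_into_documents("report.pdf", "one  two three\nfour five six seven")
for doc in docs:
    print(doc.meta["file_name"], "->", doc.content)
```

Keeping the file name in each document's metadata is what makes it possible to trace an answer back to the file it originally came from.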

Document Store

A component that stores the text documents, their metadata, and (optionally) embeddings.

Node

A pipeline component. Nodes are the processing steps in a pipeline. They act like building blocks that you can mix and match or replace.

Pipeline

Pipelines define the processing steps for executing a query and indexing your files. These steps are pipeline nodes. Nodes in pipelines are connected in series so that the output of one node is used by the next node. You can mix and match the nodes in a pipeline.
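The idea of nodes connected in series can be sketched in a few lines of Python. This is a conceptual illustration only, not the way pipelines are defined in deepset Cloud: the node functions and `run_pipeline` are hypothetical names, and each node here is simply a function that takes a list of strings and returns a list of strings.

```python
from typing import Callable, List

# In this sketch, a node is any function from a list of texts to a list of texts.
Node = Callable[[List[str]], List[str]]


def strip_node(texts: List[str]) -> List[str]:
    """Remove leading and trailing whitespace from each text."""
    return [t.strip() for t in texts]


def lowercase_node(texts: List[str]) -> List[str]:
    """Convert each text to lowercase."""
    return [t.lower() for t in texts]


def drop_empty_node(texts: List[str]) -> List[str]:
    """Discard texts that are empty."""
    return [t for t in texts if t]


def run_pipeline(nodes: List[Node], data: List[str]) -> List[str]:
    """Run the nodes in order, feeding each node's output to the next node."""
    for node in nodes:
        data = node(data)
    return data


result = run_pipeline(
    [strip_node, lowercase_node, drop_empty_node],
    ["  Hello ", "   ", "World"],
)
print(result)  # ['hello', 'world']
```

Because every node has the same input and output shape, you can reorder, remove, or replace nodes in the list without changing `run_pipeline`, which is the "mix and match" property described above.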

Evaluation Dataset

Also referred to as an "eval set". An annotated set of data held back from your model. It contains the gold answers against which deepset Cloud evaluates the actual answers that your pipeline returns during an experiment run. For more information, see Evaluation Datasets.

Experiment Run

A single run of an experiment to evaluate your pipeline. When you create and start an experiment, it runs all the questions from the evaluation dataset through the pipeline and compares the answers the pipeline returns to the gold answers. After it finishes, it calculates the metrics that you can use to tweak your pipeline. To learn more, see About Pipeline Evaluation.
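The comparison step can be pictured with the following Python sketch. It is an assumption-laden illustration, not deepset Cloud's evaluation code: `run_experiment`, `normalize`, and `toy_pipeline` are hypothetical names, and the exact-match score is used here only as one simple example of a metric.

```python
def normalize(answer: str) -> str:
    """Lowercase the answer and collapse whitespace so trivial differences are ignored."""
    return " ".join(answer.lower().split())


def run_experiment(pipeline, eval_set) -> float:
    """Run every question through the pipeline and return the share of exact matches."""
    if not eval_set:
        raise ValueError("The evaluation dataset is empty.")
    matches = 0
    for item in eval_set:
        predicted = pipeline(item["question"])
        if normalize(predicted) == normalize(item["gold_answer"]):
            matches += 1
    return matches / len(eval_set)


def toy_pipeline(question: str) -> str:
    """Hypothetical pipeline that returns canned answers (one of them wrong)."""
    canned = {
        "What is a file?": "the raw upload",
        "What is a node?": "a database",
    }
    return canned.get(question, "")


# Hypothetical evaluation dataset: questions paired with gold answers.
eval_set = [
    {"question": "What is a file?", "gold_answer": "The raw upload"},
    {"question": "What is a node?", "gold_answer": "A pipeline component"},
]

score = run_experiment(toy_pipeline, eval_set)
print(score)  # 0.5
```

One of the two answers matches its gold answer after normalization, so the score is 0.5. A low score like this is the kind of signal you would use to decide which part of the pipeline to adjust before the next run.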

