Tutorial: Creating an Experiment

This tutorial teaches you how to use experiments by trying them out. It guides you through the steps to create and run your first experiment, and it contains all the data you need for it.

  • Level: Intermediate
  • Time to complete: 15 minutes
  • Prerequisites:
    • This tutorial assumes a basic knowledge of NLP and the concept of pipeline evaluation. If you need more information, have a look at About Pipeline Evaluation.
    • You must be an Admin to complete this tutorial.
  • Goal: After completing this tutorial, you will have created and run an experiment to evaluate a pipeline.

Upload Files

These are the files your pipeline will run the search on.

  1. Download the .zip file with sample files and unpack it on your computer.

  2. Log in to deepset Cloud, make sure you're in the right workspace, and go to Data>Files.

  3. Click Upload Files.

  4. Select all the files you extracted and drop them on the Upload Your Files page. There should be 344 files in total.

  5. Click Upload and wait until the files are uploaded.

Result: Your files are in your workspace and you can see them on the Files page.
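If you prefer scripting over the UI, deepset Cloud also exposes a REST API for uploading files. The endpoint path and form-field name below are assumptions based on the common pattern, so verify them against the deepset Cloud API reference before use; the multipart helper itself is plain standard library:

```python
import io
import os
import urllib.request
import uuid


def build_multipart(field_name: str, filename: str, payload: bytes):
    """Build a multipart/form-data body for a single file (stdlib only)."""
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n\r\n'.encode()
    )
    body.write(payload)
    body.write(f"\r\n--{boundary}--\r\n".encode())
    return body.getvalue(), f"multipart/form-data; boundary={boundary}"


def upload_file(api_key: str, workspace: str, path: str):
    # Hypothetical endpoint shape -- check the deepset Cloud API docs
    # for the exact URL and field name in your environment.
    url = f"https://api.cloud.deepset.ai/api/v1/workspaces/{workspace}/files"
    with open(path, "rb") as fh:
        data, content_type = build_multipart("file", os.path.basename(path), fh.read())
    req = urllib.request.Request(url, data=data, method="POST")
    req.add_header("Authorization", f"Bearer {api_key}")
    req.add_header("Content-Type", content_type)
    return urllib.request.urlopen(req)  # raises on HTTP errors
```

You would call `upload_file` once per extracted file, passing an API key generated in deepset Cloud.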

A screenshot of the Files page with the files uploaded and listed there

Upload an Evaluation Set

You need a set of annotated data your pipeline will be evaluated against.

  1. Download the CSV file and save it on your computer.
  2. In deepset Cloud, go to Data>Evaluation Sets and click Import Eval Sets.
  3. Drop the CSV file you downloaded on the Evaluation Set Import page and click Upload File. When you see the confirmation that the upload succeeded, click Go to Evaluation Sets.

Result: The evaluation dataset is uploaded and you can see it on the Evaluation Sets page.

A screenshot of the Evaluation Sets page with the annotations_jazz eval set uploaded and displayed
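An annotated evaluation set for question answering pairs each question with its gold answer and the file that contains it. The column names and rows below are purely illustrative assumptions, not necessarily the exact schema deepset Cloud expects; the snippet just shows how such a CSV can be inspected:

```python
import csv
import io

# Hypothetical two-row eval set; the real column names may differ.
SAMPLE = """\
question,answer,file_name
Who played tenor saxophone in the quartet?,John Coltrane,coltrane.txt
When was the album recorded?,1964,love_supreme.txt
"""


def load_eval_set(text: str):
    """Parse an eval-set CSV into a list of question/answer dicts."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_eval_set(SAMPLE)
print(len(rows))          # 2
print(rows[0]["answer"])  # John Coltrane
```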

Create a Pipeline to Run the Experiment On

  1. In deepset Cloud, go to Pipelines>New Pipeline.

  2. In YAML Editor, click Create Pipeline and select From Template.

  3. Choose the English Question Answering template.

  4. In the YAML editor, in line 8, find the name parameter and change its value to 'Test_experiment'.

  5. In line 17, change the retriever type to ElasticsearchRetriever and delete lines 20 and 21 (embedding_model and model_format).

  6. Change the top_k parameter value in line 22 to 10.
    This is what your pipeline should look like now:

A screenshot of the pipeline YAML with all the updated parameters underlined.
  7. Save your pipeline.
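After these edits, the relevant parts of the YAML look roughly like the excerpt below. This is an illustrative sketch of the usual Haystack-style pipeline schema, not a verbatim copy of the template, so your file may differ in ordering, component names, and surrounding lines:

```yaml
# Illustrative excerpt -- your template may differ.
name: 'Test_experiment'

components:
  - name: Retriever
    type: ElasticsearchRetriever   # changed from the embedding-based retriever
    params:
      document_store: DocumentStore
      top_k: 10                    # changed from the template default
```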

Result: You have created a question answering pipeline that you're going to evaluate next. Your pipeline is displayed on the Pipelines page.

A screenshot of the Pipelines page with the Test_experiment pipeline displayed under In Development.

Create an Experiment

Now it's time to evaluate your pipeline.

  1. Go to Experiments>New Experiment.

  2. Choose Test_experiment as the pipeline.

  3. Choose annotations_jazz as the evaluation set.

  4. Type jazz as the experiment name and add test as a tag.

  5. Click Start Experiment. You can see that jazz is running. Wait until it completes. It may take a couple of minutes.

  6. When the experiment status changes to Completed, click its name to view its details, such as the data and the pipeline used, the metrics, and predictions.

Result: Congratulations! You just created an experiment and ran it to evaluate your pipeline.

A screenshot of the experiments page with the jazz experiment displayed as completed

You can now review the results on the experiment details page.
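Among the metrics on this page are retriever measures such as recall. As a rough illustration of what recall@k means (this is not deepset Cloud's implementation, and the document IDs are made up), for each question it checks whether a relevant document appears in the top k retrieved results:

```python
def recall_at_k(retrieved: list[list[str]], relevant: list[set[str]], k: int) -> float:
    """Fraction of queries with at least one relevant doc in the top k results."""
    hits = sum(
        1
        for docs, gold in zip(retrieved, relevant)
        if any(doc_id in gold for doc_id in docs[:k])
    )
    return hits / len(retrieved)


# Hypothetical toy data: two queries, document IDs as strings.
retrieved = [["d1", "d2", "d3"], ["d9", "d4", "d7"]]
relevant = [{"d2"}, {"d5"}]
print(recall_at_k(retrieved, relevant, k=2))  # 0.5: only the first query hits
```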

A screenshot of the Experiment Details page showing all the experiment data such as evaluation set used, date started, debug information, and metrics.


Related Links