- Level: Beginner
- Time to complete: 10 minutes
- This tutorial assumes a basic knowledge of NLP.
- You must be an Admin to complete this tutorial.
- Goal: After completing this tutorial, you will have built, from scratch, a complete document retrieval system that searches English-language NHS documents.
First, let's get the files the search will run on into deepset Cloud.
- Download the .zip file from Google Drive and unzip it to a location on your computer.
- Log in to deepset Cloud and go to Data>Files.
- Click Upload Files.
- Click Browse and select the files you unpacked in step 1.
Note: It can take a few seconds for the selected files to appear, so don't worry if you can't see them right away.
- Wait until the files show up on the page, and when they do, scroll down to the bottom and click Upload.
- Wait until you get redirected to the Files page, where you can see all your files. You should have around 900 files. You can check the number of files on the Dashboard.
Result: Your files have been uploaded and are shown on the Files page.
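If you want to confirm locally that the archive unpacked completely before comparing against the count on the Dashboard, a short script can count the files for you. This is a minimal sketch, not part of deepset Cloud: the function name `count_files` is made up for illustration, and the script simply counts every regular file under the folder you pass to it.

```python
import sys
from pathlib import Path


def count_files(folder):
    """Count regular files under `folder`, including files in subfolders."""
    return sum(1 for path in Path(folder).rglob("*") if path.is_file())


if __name__ == "__main__":
    # Assumption: the folder with the unzipped files is passed as the first
    # argument; if no argument is given, the current directory is used.
    folder = sys.argv[1] if len(sys.argv) > 1 else "."
    print(f"{count_files(folder)} files found in {folder}")
```

The number printed should be close to the roughly 900 files you expect to see after the upload.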
The next step is to define the components of your search app. We'll use a document retrieval template with a dense retriever to create the pipeline.
- Go to Pipelines>New Pipeline.
- Under YAML Editor, click Create Pipeline and select From Template.
- When the templates show up, find the Dense Document Search template and click Use Template.
- When the Pipeline Designer opens, change the pipeline name in line 8 to NHS_doc_retrieval and save the pipeline.
- Click Deploy to start indexing and make your pipeline ready for running a search.
- Return to the Pipelines page and wait until the status of your pipeline changes to Indexed. This can take a couple of minutes.
Tip: When you hover your mouse over the status, you can see how many files have already been indexed.
Result: You created and deployed a pipeline, which means your documents have been indexed and you can now run a search. Your pipeline shows on the Pipelines page with the status Indexed.
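To build some intuition for what a dense retriever does, here is a conceptual sketch in plain Python. It is not the code the template runs. A real dense retriever uses a neural model to turn text into vectors, while this example uses tiny hand-written vectors and hypothetical file names so that it stays self-contained. It shows only the ranking idea: the query and each document are represented as vectors, and documents are sorted by how similar their vector is to the query vector.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vector, document_vectors, top_k=3):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in document_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Assumption: toy three-dimensional vectors and made-up file names.
# In a real system, an embedding model produces these vectors.
document_vectors = {
    "atopic_eczema.txt": [0.9, 0.1, 0.0],
    "hay_fever.txt": [0.2, 0.8, 0.1],
    "broken_ankle.txt": [0.0, 0.1, 0.9],
}
query_vector = [0.8, 0.2, 0.1]  # stands in for an embedded query

results = rank_documents(query_vector, document_vectors, top_k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

Because the query vector points in nearly the same direction as the vector for the eczema document, that document ranks first.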
Let's see what the pipeline can do.
- Go to Search.
- Choose NHS_doc_retrieval as the pipeline.
- Type "How do I treat atopic skin?" and run the search. You should get a list of documents sorted by relevance, with the most relevant ones first.
Result: Congratulations! You have built a search system that can retrieve documents related to health. You can now ask it health-related queries and it will find documents that are relevant.