Tutorial: Building a Robust RAG System

Build a retrieval augmented generation (RAG) system that runs on your own data and generates answers in a friendly, conversational tone. Learn how to test different prompts and save them for future use.

  • Level: Intermediate
  • Time to complete: 15 minutes
  • Prerequisites:
    • You must be an Admin to complete this tutorial.
    • You must have an API key from an active OpenAI account because this pipeline uses OpenAI's gpt-3.5-turbo model.
  • Goal: After completing this tutorial, you will have built a RAG system that can answer questions about treating various diseases based on the documents from Mayo Clinic. This system will run on the data you provide to it to minimize the possibility of hallucinations.
  • Keywords: PromptNode, large language models, retrieval augmented generation, RAG, gpt-3.5-turbo, Prompt Studio

Create a Workspace

We need a deepset Cloud workspace to store our files and the generative pipeline.

  1. Log in to deepset Cloud.

  2. In the upper left corner, click the name of the workspace, type RAG as the workspace name, and click Create.

    The workspace creation window expanded

Result: You have created a workspace called RAG, where you'll upload the Mayo Clinic files.

Upload Files to Your Workspace

  1. First, download the mayoclinic.zip file and unpack it on your computer. (You can also use your own files.)
  2. In deepset Cloud, make sure you're in the RAG workspace, and go to Files in the navigation.
  3. Click Upload Files.
  4. Drop the files you unpacked in step 1 into the Upload Files window and click Upload.
  5. Wait until the upload finishes. It may take a while until the files are processed and visible in deepset Cloud.
    You should have 1096 files in your workspace.

Result: Your files are in the RAG workspace, and you can see them on the Files page.

The Files page with the uploaded files showing in a list
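Before uploading, it can help to confirm that the archive contains the number of files you expect, so a mismatch on the Files page is easier to diagnose. The sketch below counts the regular files in a zip archive using only the Python standard library. The helper names and the throwaway demo archive are illustrative assumptions; the expected count of 1096 is the figure quoted in step 5.

```python
import tempfile
import zipfile
from pathlib import Path

EXPECTED_FILE_COUNT = 1096  # figure quoted in this tutorial for mayoclinic.zip


def count_archive_files(zip_path):
    """Return the number of regular files (not directories) in a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sum(1 for info in archive.infolist() if not info.is_dir())


def check_archive(zip_path, expected=EXPECTED_FILE_COUNT):
    """Return (matches_expected, files_found) for the given archive."""
    found = count_archive_files(zip_path)
    return found == expected, found


# Demo on a small throwaway archive so the sketch runs without the real download.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "demo.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("docs/", "")  # directory entry, not counted
        archive.writestr("docs/a.txt", "first document")
        archive.writestr("docs/b.txt", "second document")
    demo_ok, demo_found = check_archive(demo_zip, expected=2)
    print(demo_ok, demo_found)
```

To check the real download, you could call `check_archive("mayoclinic.zip")`, assuming the file sits in your current directory.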

Connect Your OpenAI Account

You'll be able to use OpenAI models without having to pass the API keys in the pipeline itself.

  1. Click your initials in the top right corner and choose Connections.
    The personal menu expanded with the Connections option underlined.
  2. Next to OpenAI, click Connect, paste your OpenAI API key, and click Submit.

Result: You're connected to your OpenAI account and can use OpenAI models in your pipelines.

The integrations section with the OpenAI option showing as connected.
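If the connection fails, one possible cause is an inactive or mistyped key. As an optional check outside deepset Cloud, the sketch below asks OpenAI's model-listing endpoint whether it accepts a key. The function names are hypothetical, and the sketch assumes that a 401 response means the key was rejected; it defines the functions without calling the network.

```python
import urllib.error
import urllib.request

MODELS_URL = "https://api.openai.com/v1/models"


def build_models_request(api_key):
    """Build a GET request for OpenAI's model list, authenticated with the key."""
    return urllib.request.Request(
        MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def key_is_accepted(api_key, timeout=10):
    """Return True if OpenAI accepts the key, False if it answers 401."""
    try:
        request = build_models_request(api_key)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 401:
            return False
        raise  # other errors (rate limits, outages) say nothing about the key
```

You might call `key_is_accepted(os.environ["OPENAI_API_KEY"])` after `import os`, assuming the key is stored in that environment variable rather than in your code.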

Create a Draft Pipeline

Let's create a pipeline that will be a starting point for the generative question answering app:

  1. In the navigation, go to Pipeline Templates.

  2. Choose Basic QA, find Retrieval Augmented Generation Question Answering GPT-3.5, and click Use Template.

    The Basic QA category of the pipeline templates page, showing the Extractive Question Answering, Extractive Question Answering (German), and Generative Question Answering GPT-3.5 templates, each with View Details and Use Template options.
  3. Type RAG as the pipeline name and click Create Pipeline. You're redirected to the Pipelines page. You can find your pipeline in the All tab.
    Info: Newly created undeployed pipelines are automatically classified as drafts, so you can also find your pipeline in the Drafts tab. Once you start deploying it, it becomes a Development pipeline and moves from the Drafts tab to the Development tab.

  4. Click Deploy next to your pipeline and wait until the pipeline is deployed and indexed.

Result: You now have an indexed RAG pipeline that generates answers based on your data. Your pipeline status is Indexed, and it's ready for use. Your pipeline is at the development service level. We recommend you test it before setting it to the production service level.

The pipelines page with the generative QA pipeline showing as indexed

Test Your Prompt

The default prompt makes the model act as a matter-of-fact technical expert, while we want our system to be friendly and empathetic. Let's experiment with different prompts to achieve this effect.

  1. In the navigation, click Prompt Studio.

  2. Choose the RAG pipeline. Your current prompt is showing in the Prompt Editor pane.

    Prompt Studio with the RAG pipeline selected and the prompt text showing in the Prompt Editor.
  3. In the Type your query here field, try asking some questions related to treating medical conditions, for example: "I had my wisdom tooth removed but my gum hurts and is swollen. What should I do?"

The prompt explorer window with the Generative QA pipeline selected and marked with a red number 1. Below the pipeline selection, there's a welcome page. At the bottom of the page, the prompt editor shows the prompt text, and below it is the question about the wisdom tooth, marked as step 2.

The model generates an answer and provides its sources, which are the documents it's based on.

  4. Now, let's try a different prompt. In Prompt Editor, click the Menu button and choose deepset. You can see all prompts curated by deepset.

    Prompt Editor with the templates button highlighted
    Prompt Studio with the deepset tab open.
  5. Scroll down through the templates, choose deepset/question-answering, and click Use Prompt. The prompt is now showing in Prompt Editor.

    Prompt Studio with the question answering prompt selected
  6. Submit the query from step 3. You can now compare the two answers to check which prompt performs better.

  7. To return to the original prompt, reload the whole page and choose the RAG pipeline again.

  8. In Prompt Editor, change the prompt to adjust the tone of the answer. Replace "You are a technical expert." with "You are a friendly nurse." and add "Your answers are friendly, clear, and conversational.", as in the prompt below:

You are a friendly nurse.\
You answer questions truthfully based on provided documents. \
Your answers are friendly, clear, and conversational. \
For each document check whether it is related to the question. \
Only use documents that are related to the question to answer it. \
Ignore documents that are not related to the question. \
If the answer exists in several documents, summarize them. \
Only answer based on the documents provided. Don't make things up. \
Always use references in the form [NUMBER OF DOCUMENT] when using information from a document. e.g. [3], for Document[3]. \
The reference must only refer to the number that comes in square brackets after passage. \
Otherwise, do not use brackets in your answer and reference ONLY the number of the passage without mentioning the word passage. \
If the documents can't answer the question or you are unsure say: 'I'm sorry I don't know that'. \
These are the documents:\
{join(documents, delimiter=new_line, pattern=new_line+'Document[$idx]:'+new_line+'$content')}\
Question: {query}\
  9. Try the same query or experiment with other queries related to treating medical conditions. The answers should now be in a more empathetic and friendly tone. Here are some example questions you can ask:
    "I have been diagnosed with a wheat allergy, what do I do now?"
    "How do you treat swollen wrists?"
    "What is meningitis?"

  10. Update the RAG pipeline with the new prompt. Click Update in Prompt Editor. This replaces the current prompt.

  11. Save the prompt as a template:

    1. In Prompt Editor, click Prompt Templates. You land on the Custom tab of the Prompt Templates window.

      The prompt templates button visible next to the Update button in prompt editor.
    2. Click Create Custom Prompt.

    3. Paste the prompt you copied from Prompt Editor in the text field, type friendly_tone as the prompt name, and save your prompt. You'll be able to reuse it in the future.

Prompt templates window with the custom prompt template filled in with the text copied from prompt editor.

Result: You have tweaked your prompt to generate more friendly and conversational answers. You updated your pipeline with this prompt. You then saved this prompt as a template and can reuse it in other pipelines.
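The `{join(documents, ...)}` expression near the end of the prompt is what turns the retrieved documents into the numbered `Document[1]`, `Document[2]` blocks the model is asked to cite. The sketch below emulates that expansion in plain Python to show what the model receives. It is an illustration, not the library's implementation: `join_documents` is a hypothetical helper, numbering from 1 is an assumption, and the two sample documents are invented placeholder text rather than Mayo Clinic content.

```python
from string import Template

NEW_LINE = "\n"  # stands in for the prompt's new_line variable


def join_documents(documents, delimiter, pattern):
    """Render each document with `pattern`, numbering from 1, and join the results."""
    template = Template(pattern)
    rendered = [
        template.substitute(idx=idx, content=content)
        for idx, content in enumerate(documents, start=1)
    ]
    return delimiter.join(rendered)


# Invented placeholder documents, for illustration only.
sample_documents = [
    "Rinse gently with warm salt water.",
    "Apply a cold pack to reduce swelling.",
]
query = "My gum is swollen. What should I do?"

documents_block = join_documents(
    sample_documents,
    delimiter=NEW_LINE,
    pattern=NEW_LINE + "Document[$idx]:" + NEW_LINE + "$content",
)
prompt_tail = f"These are the documents:{documents_block}{NEW_LINE}Question: {query}"
print(prompt_tail)
```

Under these assumptions, each document appears under its own `Document[N]:` label, which is why the prompt can ask for references such as `[1]`.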

Test the Pipeline

Time to see your pipeline in action!

  1. In the navigation, click Playground and make sure the RAG pipeline is selected.
  2. Try asking something like "my eyes hurt, what should I do?".
  3. Once the answer is generated, check the sources to see if the answers are actually in the documents.
The answer to the query "my eyes hurt, what should I do?" with each sentence underlined in either green or red.

You can also check the prompt by clicking the More Actions button next to the search result.
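Because the prompt asks the model to cite documents as `[NUMBER]`, part of the source check in step 3 can be done mechanically if you copy an answer out of Playground. The sketch below extracts the cited numbers and flags any that do not match a provided document. The function names and the sample answer are hypothetical, and the check only covers citation numbers, not whether the cited text supports the claim.

```python
import re

CITATION = re.compile(r"\[(\d+)\]")


def cited_documents(answer):
    """Return the document numbers cited in the answer, in order, without repeats."""
    seen = []
    for match in CITATION.finditer(answer):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen


def invalid_citations(answer, document_count):
    """Return cited numbers that do not correspond to any provided document."""
    return [n for n in cited_documents(answer) if not 1 <= n <= document_count]


# Invented sample answer, for illustration only.
sample_answer = (
    "Rest your eyes and use artificial tears [1]. "
    "See a doctor if the pain persists [3][1]."
)
print(cited_documents(sample_answer))       # [1, 3]
print(invalid_citations(sample_answer, 2))  # [3]
```

Here the answer cites document 3 although only two documents were supplied, which would be a reason to inspect that sentence more closely.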

Congratulations! You have built a generative question answering system that can answer questions about treating various diseases in a friendly and conversational tone. Your system also shows you which parts of the answer are hallucinations.

What To Do Next

Once you have a RAG pipeline, you can monitor its groundedness score to see how reliable it is and if it sticks to the documents.
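To build intuition for what a groundedness score expresses, the sketch below uses a deliberately naive stand-in: the share of answer sentences whose words mostly appear in at least one source document. This is not how deepset Cloud computes its score. The word-overlap heuristic, the 0.6 threshold, the function names, and the sample texts are all assumptions made for illustration.

```python
import re


def tokens(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def sentence_support(sentence, documents):
    """Highest fraction of the sentence's words found in any single document."""
    words = tokens(sentence)
    if not words:
        return 0.0
    return max(
        (len(words & tokens(doc)) / len(words) for doc in documents),
        default=0.0,
    )


def toy_groundedness(answer, documents, threshold=0.6):
    """Share of answer sentences whose best word overlap meets the threshold."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", answer.strip()) if s]
    if not sentences:
        return 0.0
    supported = sum(sentence_support(s, documents) >= threshold for s in sentences)
    return supported / len(sentences)


# Invented sample texts, for illustration only.
sample_docs = ["Apply a cold compress to the eyes for relief."]
sample_text = "Apply a cold compress to the eyes. Drink herbal tea every hour."
print(toy_groundedness(sample_text, sample_docs))  # 0.5
```

The first sentence is fully covered by the document and the second is not, so half of the answer counts as grounded under this heuristic.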

Your pipeline is now a development pipeline. Once it's ready for production, change its service level to Production. You can do this on the Pipeline Details page shown after clicking a pipeline name. To learn more, see Pipeline Service Levels.