Tutorial: Building a Summarization System with a Large Language Model
This tutorial teaches you how to build a summarization system that generates summaries of your documents. It uses PromptNode with a large language model.
- Level: Beginner
- Time to complete: 15 minutes
- Prerequisites:
- This tutorial assumes a basic knowledge of NLP, large language models and retrieval-augmented generation. If you need more information, have a look at Language Models.
- You must be an Admin to complete this tutorial.
- This tutorial uses the gpt-3.5-turbo model, so you need an API key from an active OpenAI account.
If you don't have an account with OpenAI, you can replace this model with an open source one, like google/flan-t5-large, but bear in mind it has its limitations, and its performance may not be sufficient.
- Goal: After completing this tutorial, you will have created a system that can generate summaries of reports on child obesity and food advertising regulations. You will have learned how to use PromptNode with a large language model and a custom prompt.
- Keywords: PromptNode, summarization, large language models, prompts
Connect Your OpenAI Account
Perform this step if you want to use the gpt-3.5-turbo model by OpenAI. If you're planning to use an open source model, you can skip this step.
You'll be able to use OpenAI models without having to pass the API keys in the pipeline YAML.
- Click your name in the top right corner and choose Connections.

- Next to OpenAI, click Connect, paste your OpenAI API key, and click Submit.
Result: You're connected to your OpenAI account and can use OpenAI models in your pipelines.
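For comparison, here is a rough sketch of what passing the key directly in the pipeline YAML could look like if you did not connect your account. It assumes PromptNode's api_key parameter, and the key value is a placeholder, not a real key. With the connection in place you can leave this parameter out entirely.

```yaml
# Fragment of the components section only, not a complete pipeline.
- name: PromptNode
  type: PromptNode
  params:
    model_name_or_path: gpt-3.5-turbo
    api_key: <YOUR_OPENAI_API_KEY> # Placeholder; not needed once the OpenAI connection is set up
```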

Upload Files
First, let's upload the files we want our search system to run on. The files here are a set of reports on the impact of food marketing on child obesity. You can replace this dataset with any other dataset.
- Download the .zip file with sample files and unpack it on your computer.
- Log in to deepset Cloud, make sure you're in the workspace you want to use for this task, and go to Data>Files.
- Click Upload Files.
- Select all the files you extracted and drop them into the Upload Files window. There should be four files in total.
- Click Upload and wait until the files are uploaded.
Result: Your files are in your workspace, and you can see them on the Files page.

Create the Pipeline
We'll use an out-of-the-box template as a baseline for our pipeline and we'll adjust it a bit:
- In deepset Cloud, go to Pipelines>New Pipeline.
- In YAML Editor, click Create Pipeline and select From Template.

- Find the Generative Question Answering GPT-3 template and click Use Template.
  The template YAML opens in the Pipeline Designer.
- Update the template:
  - In YAML editor, find line 7, and change the pipeline name to "summarization".
  - In line 35, change the top_k value to 1.
  - Delete the code that defines PromptTemplate, so lines 37 to 60.
  - In line 41, change the default_prompt_template to deepset/summarization.
  - Line 44 is where you can change the model.
  - In line 45, add the top_k parameter and set it to 1.
- Save your pipeline. This is what your pipeline YAML should look like:
version: '1.21.0'
name: 'summarization'

# This section defines nodes that you want to use in your pipelines. Each node must have a name and a type. You can also set the node's parameters here.
# The name is up to you, you can give your component a friendly name. You then use components' names when specifying their order in the pipeline.
# Type is the class name of the component.
components:
  - name: DocumentStore
    type: DeepsetCloudDocumentStore
  - name: BM25Retriever # The keyword-based retriever
    type: BM25Retriever
    params:
      document_store: DocumentStore
      top_k: 10 # The number of results to return
  - name: EmbeddingRetriever # Selects the most relevant documents from the document store
    type: EmbeddingRetriever # Uses a Transformer model to encode the document and the query
    params:
      document_store: DocumentStore
      embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1 # Model optimized for semantic search. It has been trained on 215M (question, answer) pairs from diverse sources.
      model_format: sentence_transformers
      top_k: 10 # The number of results to return
  - name: JoinResults # Joins the results from both retrievers
    type: JoinDocuments
    params:
      join_mode: concatenate # Combines documents from multiple retrievers
  - name: Reranker # Uses a cross-encoder model to rerank the documents returned by the two retrievers
    type: SentenceTransformersRanker
    params:
      model_name_or_path: cross-encoder/ms-marco-MiniLM-L-6-v2 # Fast model optimized for reranking
      top_k: 1 # The number of results to return
      batch_size: 20 # Try to keep this number equal or larger to the sum of the top_k of the two retrievers so all docs are processed at once
  - name: PromptNode
    type: PromptNode
    params:
      default_prompt_template: deepset/summarization
      max_length: 400 # The maximum number of tokens the generated answer can have
      model_kwargs: # Specifies additional model settings
        temperature: 0 # Lower temperature works best for fact-based qa
      model_name_or_path: gpt-3.5-turbo
      top_k: 1
  - name: FileTypeClassifier # Routes files based on their extension to appropriate converters, by default txt, pdf, md, docx, html
    type: FileTypeClassifier
  - name: TextConverter # Converts files into documents
    type: TextConverter
  - name: PDFConverter # Converts PDFs into documents
    type: PDFToTextConverter
  - name: Preprocessor # Splits documents into smaller ones and cleans them up
    type: PreProcessor
    params:
      # With a vector-based retriever, it's good to split your documents into smaller ones
      split_by: word # The unit by which you want to split the documents
      split_length: 250 # The max number of words in a document
      split_overlap: 20 # Enables the sliding window approach
      language: en
      split_respect_sentence_boundary: True # Retains complete sentences in split documents

# Here you define how the nodes are organized in the pipelines
# For each node, specify its input
pipelines:
  - name: query
    nodes:
      - name: BM25Retriever
        inputs: [Query]
      - name: EmbeddingRetriever
        inputs: [Query]
      - name: JoinResults
        inputs: [BM25Retriever, EmbeddingRetriever]
      - name: Reranker
        inputs: [JoinResults]
      - name: PromptNode
        inputs: [Reranker]
  - name: indexing
    nodes:
      # Depending on the file type, we use a Text or PDF converter
      - name: FileTypeClassifier
        inputs: [File]
      - name: TextConverter
        inputs: [FileTypeClassifier.output_1] # Ensures that this converter receives txt files
      - name: PDFConverter
        inputs: [FileTypeClassifier.output_2] # Ensures that this converter receives PDFs
      - name: Preprocessor
        inputs: [TextConverter, PDFConverter]
      - name: EmbeddingRetriever
        inputs: [Preprocessor]
      - name: DocumentStore
        inputs: [EmbeddingRetriever]
- At the top of the Pipeline Designer, click Deploy and wait until your pipeline is deployed and indexed. Indexing may take a couple of minutes.
Result: You have created a pipeline that summarizes documents using a large language model. Your pipeline is displayed on the Pipelines page in the Deployed section, and you can now try it out.
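If you're using an open source model instead of gpt-3.5-turbo, the PromptNode definition might look something like the sketch below. It only swaps model_name_or_path for google/flan-t5-large, the model named in the prerequisites. Dropping model_kwargs is an assumption made here because the temperature setting was written with the OpenAI model in mind; the rest of the pipeline stays unchanged.

```yaml
# Fragment of the components section only, not a complete pipeline.
- name: PromptNode
  type: PromptNode
  params:
    default_prompt_template: deepset/summarization
    max_length: 400 # The maximum number of tokens the generated summary can have
    model_name_or_path: google/flan-t5-large # Open source alternative; summaries may be weaker than with gpt-3.5-turbo
    top_k: 1
```

With this variant, you don't need the OpenAI connection.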
Test the Pipeline
Now it's time to see how your pipeline is doing. Let's run a search with it.
- In the navigation, click Search.
- Make sure the summarization pipeline is selected.
- Type the query: summarize the report on advertising food to children.
Here's what the pipeline returns:
Result: Congratulations! You just created a summarization pipeline that uses a large language model to generate summaries of documents.