Generative Question Answering Pipelines
These pipelines use a large language model to generate answers based on the model's general knowledge and the documents you feed to it.
Retrieval Augmented Generation (RAG) Pipelines
These pipelines generate answers based on your documents. The Retriever fetches the documents from the Document Store and passes them to the model in the prompt. Essentially, RAG pipelines begin with a document retrieval step. Check Document Retrieval Pipelines for more examples. You can combine the document retrieval step with the PromptNode in your RAG pipelines.
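To make the Retriever-plus-PromptNode pattern concrete, here is a minimal sketch in open-source Haystack, the framework deepset Cloud pipelines are built on. It's for local experimentation only: InMemoryDocumentStore, the sample document, the model names, and the query are illustrative stand-ins; in deepset Cloud you define the same structure in YAML, as in the templates below.

from haystack import Document
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import EmbeddingRetriever, PromptNode
from haystack.pipelines import Pipeline

# Illustrative store and content; deepset Cloud pipelines use DeepsetCloudDocumentStore.
document_store = InMemoryDocumentStore(embedding_dim=768)
document_store.write_documents([Document(content="deepset Cloud supports TXT and PDF files.")])

retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1",
    top_k=3,
)
document_store.update_embeddings(retriever)  # compute document embeddings

prompt_node = PromptNode(
    model_name_or_path="google/flan-t5-large",
    default_prompt_template="deepset/question-answering",  # ready-made RAG prompt
)

# The RAG shape: retrieval first, then generation grounded in the retrieved documents.
pipeline = Pipeline()
pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
pipeline.add_node(component=prompt_node, name="PromptNode", inputs=["Retriever"])

result = pipeline.run(query="Which file types does deepset Cloud support?")
print(result["answers"][0].answer)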
RAG QA with GPT-3.5, Hybrid Retrieval, and a Custom Prompt
This pipeline uses OpenAI's gpt-3.5-turbo model and a combination of vector-based and keyword-based retrievers. You need an API key from an active OpenAI account to use this model.
It's a retrieval augmented generation (RAG) pipeline, which means it uses the files you uploaded to deepset Cloud (or the files from your VPC connected to deepset Cloud) rather than the model's knowledge of the world to generate the answers. It passes the documents to the model in the prompt through the documents variable. The pipeline combines a vector-based and a keyword-based retriever with a JoinDocuments node that merges the results retrieved by both and passes them on to the model in the prompt.
This pipeline is available as a ready-made template in deepset Cloud. You can choose it when creating a pipeline with the YAML editor and selecting From template.
It uses a custom prompt passed in the prompt parameter of the PromptTemplate component. You can modify the prompt directly in this parameter.
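For reference, this is roughly how such a custom template maps onto PromptTemplate and AnswerParser in open-source Haystack. It's a sketch, not the deployed configuration: the prompt text is abbreviated here (use the full instruction text from the YAML below in practice), and the API key placeholder is an assumption you replace with your own key.

from haystack.nodes import AnswerParser, PromptNode, PromptTemplate

# Abbreviated prompt; see the qa_template in the YAML below for the full text.
qa_template = PromptTemplate(
    prompt="You are a technical expert. Answer truthfully, based only on the documents. "
           "Reference documents as [NUMBER OF DOCUMENT].\n"
           "These are the documents:"
           "{join(documents, delimiter=new_line, pattern=new_line+'Document[$idx]:'+new_line+'$content')}\n"
           "Question: {query}\nAnswer:\n",
    output_parser=AnswerParser(),  # wraps the raw completion in Answer objects
)

prompt_node = PromptNode(
    model_name_or_path="gpt-3.5-turbo",
    api_key="YOUR_OPENAI_API_KEY",  # placeholder: your OpenAI API key
    default_prompt_template=qa_template,
    max_length=400,
    model_kwargs={"temperature": 0},  # low temperature suits fact-based QA
)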
# If you need help with the YAML format, have a look at https://docs.cloud.deepset.ai/docs/create-a-pipeline#create-a-pipeline-using-yaml.
# This is a friendly editor that helps you create your pipelines with autosuggestions. To use them, press Control + Space on your keyboard.
# Whenever you need to specify a model, this editor helps you out as well. Just type your Hugging Face organization and a forward slash (/) to see available models.
# This is a Generative Question Answering pipeline for English with a good vector-based Retriever and OpenAI's GPT-3.5 model as a PromptNode
version: '1.21.0'
name: 'GenerativeQuestionAnswering_GPT-3.5'
# This section defines nodes that you want to use in your pipelines. Each node must have a name and a type. You can also set the node's parameters here.
# The name is up to you; you can give your component a friendly name. You then use the components' names when specifying their order in the pipeline.
# Type is the class name of the component.
components:
- name: DocumentStore
type: DeepsetCloudDocumentStore
- name: BM25Retriever # The keyword-based retriever
type: BM25Retriever
params:
document_store: DocumentStore
top_k: 10 # The number of results to return
- name: EmbeddingRetriever # Selects the most relevant documents from the document store
type: EmbeddingRetriever # Uses a Transformer model to encode the document and the query
params:
document_store: DocumentStore
embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1 # Model optimized for semantic search. It has been trained on 215M (question, answer) pairs from diverse sources.
model_format: sentence_transformers
top_k: 10 # The number of results to return
- name: JoinResults # Joins the results from both retrievers
type: JoinDocuments
params:
join_mode: concatenate # Combines documents from multiple retrievers
- name: Reranker # Uses a cross-encoder model to rerank the documents returned by the two retrievers
type: SentenceTransformersRanker
params:
model_name_or_path: cross-encoder/ms-marco-MiniLM-L-6-v2 # Fast model optimized for reranking
top_k: 4 # The number of results to return
batch_size: 20 # Keep this number equal to or larger than the sum of the two retrievers' top_k values so all documents are processed at once
- name: qa_template
type: PromptTemplate
params:
output_parser:
type: AnswerParser
prompt: "You are a technical expert. \
You answer questions truthfully based on provided documents. \
For each document check whether it is related to the question. \
Only use documents that are related to the question to answer it. \
Ignore documents that are not related to the question. \
If the answer exists in several documents, summarize them. \
Only answer based on the documents provided. Don't make things up. \
Always use references in the form [NUMBER OF DOCUMENT] when using information from a document, e.g. [3] for Document[3]. \
The reference must only refer to the number that comes in square brackets after the word Document. \
Otherwise, do not use brackets in your answer and reference ONLY the number of the document without mentioning the word document. \
If the documents can't answer the question or you are unsure say: 'The answer can't be found in the text'. \
{new_line}\
These are the documents:\
{join(documents, delimiter=new_line, pattern=new_line+'Document[$idx]:'+new_line+'$content')}\
{new_line}\
Question: {query}\
{new_line}\
Answer:\
{new_line}"
- name: PromptNode
type: PromptNode
params:
default_prompt_template: qa_template
max_length: 400 # The maximum number of tokens the generated answer can have
model_kwargs: # Specifies additional model settings
temperature: 0 # Lower temperature works best for fact-based qa
model_name_or_path: gpt-3.5-turbo
- name: FileTypeClassifier # Routes files based on their extension to appropriate converters, by default txt, pdf, md, docx, html
type: FileTypeClassifier
- name: TextConverter # Converts files into documents
type: TextConverter
- name: PDFConverter # Converts PDFs into documents
type: PDFToTextConverter
- name: Preprocessor # Splits documents into smaller ones and cleans them up
type: PreProcessor
params:
# With a vector-based retriever, it's good to split your documents into smaller ones
split_by: word # The unit by which you want to split the documents
split_length: 250 # The max number of words in a document
split_overlap: 20 # Enables the sliding window approach
language: en
split_respect_sentence_boundary: True # Retains complete sentences in split documents
# Here you define how the nodes are organized in the pipelines
# For each node, specify its input
pipelines:
- name: query
nodes:
- name: BM25Retriever
inputs: [Query]
- name: EmbeddingRetriever
inputs: [Query]
- name: JoinResults
inputs: [BM25Retriever, EmbeddingRetriever]
- name: Reranker
inputs: [JoinResults]
- name: PromptNode
inputs: [Reranker]
- name: indexing
nodes:
# Depending on the file type, we use a Text or PDF converter
- name: FileTypeClassifier
inputs: [File]
- name: TextConverter
inputs: [FileTypeClassifier.output_1] # Ensures that this converter receives txt files
- name: PDFConverter
inputs: [FileTypeClassifier.output_2] # Ensures that this converter receives PDFs
- name: Preprocessor
inputs: [TextConverter, PDFConverter]
- name: EmbeddingRetriever
inputs: [Preprocessor]
- name: DocumentStore
inputs: [EmbeddingRetriever]
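If you want to prototype the same hybrid retrieval locally before deploying the YAML, a rough open-source Haystack equivalent looks like this. It's a sketch under two assumptions: InMemoryDocumentStore stands in for DeepsetCloudDocumentStore, and prompt_node is the node defined in the PromptTemplate sketch above.

from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import (BM25Retriever, EmbeddingRetriever, JoinDocuments,
                            SentenceTransformersRanker)
from haystack.pipelines import Pipeline

# use_bm25=True enables keyword search in the in-memory store.
document_store = InMemoryDocumentStore(use_bm25=True, embedding_dim=768)

bm25_retriever = BM25Retriever(document_store=document_store, top_k=10)
embedding_retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1",
    top_k=10,
)
join_results = JoinDocuments(join_mode="concatenate")
reranker = SentenceTransformersRanker(
    model_name_or_path="cross-encoder/ms-marco-MiniLM-L-6-v2",
    top_k=4,
    batch_size=20,  # >= sum of the retrievers' top_k so all docs are reranked in one batch
)

pipe = Pipeline()
pipe.add_node(component=bm25_retriever, name="BM25Retriever", inputs=["Query"])
pipe.add_node(component=embedding_retriever, name="EmbeddingRetriever", inputs=["Query"])
pipe.add_node(component=join_results, name="JoinResults",
              inputs=["BM25Retriever", "EmbeddingRetriever"])
pipe.add_node(component=reranker, name="Reranker", inputs=["JoinResults"])
pipe.add_node(component=prompt_node, name="PromptNode", inputs=["Reranker"])  # from the sketch above

Concatenating both retrievers' results and then reranking with a cross-encoder keeps the recall benefits of keyword search while letting the semantically strongest documents reach the prompt.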
Baseline RAG QA with Default Prompt and an Open Source Model
This pipeline uses the files you uploaded to deepset Cloud (or the files from your VPC connected to deepset Cloud) to generate answers. This is done by adding the Retriever node, which fetches the documents from the DocumentStore. The documents are then passed on to the model in the prompt.
# If you need help with the YAML format, have a look at https://docs.cloud.deepset.ai/docs/create-a-pipeline#create-a-pipeline-using-yaml.
# This is a friendly editor that helps you create your pipelines with autosuggestions. To use them, press Control + Space on your keyboard.
# Whenever you need to specify a model, this editor helps you out as well. Just type your Hugging Face organization and a forward slash (/) to see available models.
# This is a Generative Question Answering pipeline for English with a good vector-based Retriever and Google's open-source FLAN-T5 model. Recommended for advanced users who want more control over models and prompts.
version: '1.21.0'
name: 'GenerativeQuestionAnswering_FLAN-T5'
# This section defines the nodes you want to use in your pipelines. Each node must have a name and a type. You can also set the node's parameters here.
# The name is up to you; you can give your component a friendly name. You then use the components' names when specifying their order in the pipeline.
# Type is the class name of the component.
components:
- name: DocumentStore
type: DeepsetCloudDocumentStore # The only supported document store in deepset Cloud
- name: Retriever # Selects the most relevant documents from the document store so that the LLM can base its generation on them
type: EmbeddingRetriever # Uses a Transformer model to encode the document and the query
params:
document_store: DocumentStore
embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1 # Model optimized for semantic search
model_format: sentence_transformers
top_k: 1 # The number of documents to return
- name: PromptNode # The component that generates the answer based on the documents it gets from the retriever
type: PromptNode
params:
default_prompt_template: deepset/question-answering # A ready-made prompt that passes documents to the model as context
model_name_or_path: google/flan-t5-large # A free large language model for PromptNode. For production scenarios, we recommend a paid model.
top_k: 3 # The number of answers to generate; you can change this value
- name: FileTypeClassifier # Routes files based on their extension to appropriate converters, by default txt, pdf, md, docx, html
type: FileTypeClassifier
- name: TextConverter # Converts files into documents
type: TextConverter
- name: PDFConverter # Converts PDFs into documents
type: PDFToTextConverter
- name: Preprocessor # Splits documents into smaller ones and cleans them up
type: PreProcessor
params:
# With a vector-based retriever, it's good to split your documents into smaller ones
split_by: word # The unit by which you want to split the documents
split_length: 250 # The max number of words in a document
split_overlap: 30 # Enables the sliding window approach
split_respect_sentence_boundary: True # Retains complete sentences in split documents
language: en # Used by NLTK to best detect the sentence boundaries for that language
# Here you define how the nodes are organized in the pipelines
# For each node, specify its input
pipelines:
- name: query
nodes:
- name: Retriever
inputs: [Query]
- name: PromptNode
inputs: [Retriever]
- name: indexing
nodes:
# Depending on the file type, we use a Text or PDF converter
- name: FileTypeClassifier
inputs: [File]
- name: TextConverter
inputs: [FileTypeClassifier.output_1] # Ensures this converter receives TXT files
- name: PDFConverter
inputs: [FileTypeClassifier.output_2] # Ensures this converter receives PDFs
- name: Preprocessor
inputs: [TextConverter, PDFConverter]
- name: Retriever
inputs: [Preprocessor]
- name: DocumentStore
inputs: [Retriever]
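The indexing pipeline above can likewise be reproduced in open-source Haystack for local testing. Below is a minimal sketch, assuming illustrative local files sample.txt and sample.pdf and the PDF-to-text system dependency that PDFToTextConverter requires.

from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import (EmbeddingRetriever, FileTypeClassifier, PDFToTextConverter,
                            PreProcessor, TextConverter)
from haystack.pipelines import Pipeline

document_store = InMemoryDocumentStore(embedding_dim=768)
retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1",
)
preprocessor = PreProcessor(
    split_by="word",
    split_length=250,
    split_overlap=30,
    split_respect_sentence_boundary=True,
    language="en",
)

indexing = Pipeline()
indexing.add_node(component=FileTypeClassifier(), name="FileTypeClassifier", inputs=["File"])
indexing.add_node(component=TextConverter(), name="TextConverter",
                  inputs=["FileTypeClassifier.output_1"])  # TXT files
indexing.add_node(component=PDFToTextConverter(), name="PDFConverter",
                  inputs=["FileTypeClassifier.output_2"])  # PDF files
indexing.add_node(component=preprocessor, name="Preprocessor",
                  inputs=["TextConverter", "PDFConverter"])
indexing.add_node(component=retriever, name="Retriever", inputs=["Preprocessor"])
indexing.add_node(component=document_store, name="DocumentStore", inputs=["Retriever"])

# Hypothetical file paths for illustration.
indexing.run(file_paths=["sample.txt", "sample.pdf"])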
The GenerativeQuestionAnswering_FLAN-T5 template is a good starting point. For production systems, we recommend replacing the free FLAN-T5 model with a better-performing model, such as OpenAI's gpt-3.5-turbo.
You can modify the PromptNode to use a custom prompt or another ready-made prompt template. For more information, see PromptNode. You may also have a look at Prompt Engineering Guidelines for guidance on how to create prompts and then check Experimenting with Prompts to learn how to use Prompt Explorer to work on your prompts.
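As a local complement to Prompt Explorer, you can also try prompt variants directly on a PromptNode in open-source Haystack before editing your pipeline YAML. A small sketch, where the sample document and query are made up for illustration:

from haystack import Document
from haystack.nodes import PromptNode, PromptTemplate

prompt_node = PromptNode(model_name_or_path="google/flan-t5-large")
docs = [Document(content="deepset Cloud converts TXT and PDF files into documents.")]

# Try a ready-made template...
answers = prompt_node.prompt("deepset/question-answering",
                             documents=docs, query="Which file types are converted?")
print(answers)

# ...or an ad-hoc custom prompt.
custom = PromptTemplate("Summarize in one sentence: {join(documents)}")
print(prompt_node.prompt(custom, documents=docs))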
Pipeline That Detects Hallucinations
This pipeline uses the HallucinationDetector node to check whether the generated answer is grounded in the documents the pipeline retrieved:
version: '1.21.0'
name: 'Generative_QA'
components:
- name: HallucinationDetector
type: TransformersHallucinationDetector
- name: DocumentStore
type: DeepsetCloudDocumentStore
- name: EmbeddingRetriever # Selects the most relevant documents from the document store
type: EmbeddingRetriever # Uses a Transformer model to encode the document and the query
params:
document_store: DocumentStore
embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1 # Model optimized for semantic search. It has been trained on 215M (question, answer) pairs from diverse sources.
model_format: sentence_transformers
top_k: 10 # The number of results to return
- name: PromptNode
type: PromptNode
params:
default_prompt_template: deepset/question-answering
max_length: 400 # The maximum number of tokens the generated answer can have
model_kwargs: # Specifies additional model settings
temperature: 0 # Lower temperature works best for fact-based qa
model_name_or_path: gpt-3.5-turbo
- name: FileTypeClassifier # Routes files based on their extension to appropriate converters, by default txt, pdf, md, docx, html
type: FileTypeClassifier
- name: TextConverter # Converts files into documents
type: TextConverter
- name: PDFConverter # Converts PDFs into documents
type: PDFToTextConverter
- name: Preprocessor # Splits documents into smaller ones and cleans them up
type: PreProcessor
params:
# With a vector-based retriever, it's good to split your documents into smaller ones
split_by: word # The unit by which you want to split the documents
split_length: 250 # The max number of words in a document
split_overlap: 20 # Enables the sliding window approach
language: en
split_respect_sentence_boundary: True # Retains complete sentences in split documents
# Here you define how the nodes are organized in the pipelines
# For each node, specify its input
pipelines:
- name: query
nodes:
- name: EmbeddingRetriever
inputs: [Query]
- name: PromptNode
inputs: [EmbeddingRetriever]
- name: HallucinationDetector
inputs: [PromptNode]
- name: indexing
nodes:
# Depending on the file type, we use a Text or PDF converter
- name: FileTypeClassifier
inputs: [File]
- name: TextConverter
inputs: [FileTypeClassifier.output_1] # Ensures that this converter receives txt files
- name: PDFConverter
inputs: [FileTypeClassifier.output_2] # Ensures that this converter receives PDFs
- name: Preprocessor
inputs: [TextConverter, PDFConverter]
- name: EmbeddingRetriever
inputs: [Preprocessor]
- name: DocumentStore
inputs: [EmbeddingRetriever]
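Because TransformersHallucinationDetector is a deepset Cloud node, the usual way to see its output is to deploy this pipeline and query it through the deepset Cloud Search API. The sketch below is a hypothetical client: the workspace and pipeline names are placeholders, the endpoint path and response shape should be verified against the current deepset Cloud API reference, and the exact location of the hallucination verdict in the answer payload is an assumption to check rather than a documented schema.

import os
import requests

# Placeholders: your workspace and the name of your deployed pipeline.
WORKSPACE = "default"
PIPELINE = "Generative_QA"
API_KEY = os.environ["DEEPSET_CLOUD_API_KEY"]  # assumed to hold your deepset Cloud API key

url = (f"https://api.cloud.deepset.ai/api/v1/workspaces/"
       f"{WORKSPACE}/pipelines/{PIPELINE}/search")
response = requests.post(
    url,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"queries": ["What does the warranty cover?"]},
)
response.raise_for_status()

for result in response.json()["results"]:
    for answer in result["answers"]:
        print(answer["answer"])
        # The detector's groundedness information is expected alongside the answer;
        # inspect the payload rather than hard-coding field names.
        print(answer.get("meta"))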