Generative Question Answering Pipelines

These pipelines use a large language model to generate answers based on the model's general knowledge and the documents you feed to it.

Retrieval Augmented Generation (RAG) Pipelines

These pipelines generate answers based on your documents. The Retriever fetches the documents from the Document Store and passes them to the model in the prompt, so every RAG pipeline begins with a document retrieval step. For more retrieval examples, see Document Retrieval Pipelines. You can combine any of those retrieval steps with a PromptNode to build a RAG pipeline.

RAG QA with GPT-3.5, Hybrid Retrieval, and a Custom Prompt

This pipeline uses OpenAI's gpt-3.5-turbo model and a combination of vector-based and keyword-based retrievers. You need an API key from an active OpenAI account to use this model.

It's a retrieval augmented generation (RAG) pipeline, which means it generates answers from the files you uploaded to deepset Cloud (or the files from your VPC connected to deepset Cloud) rather than from the model's knowledge of the world. It passes the retrieved documents to the model through the documents variable in the prompt. The pipeline has both a vector-based and a keyword-based retriever. A JoinDocuments node joins the results of the two retrievers, and a Ranker then reranks them and passes the top four on to the model in the prompt.

This pipeline is available as a ready-made template in deepset Cloud. You can choose it when creating a pipeline with the YAML editor and selecting From template.

It uses a custom prompt set in the prompt parameter of the PromptTemplate component (qa_template in the YAML below). You can modify the prompt directly in that parameter.

# If you need help with the YAML format, have a look at https://docs.cloud.deepset.ai/docs/create-a-pipeline#create-a-pipeline-using-yaml.
# This is a friendly editor that helps you create your pipelines with autosuggestions. To use them, press Control + Space on your keyboard.
# Whenever you need to specify a model, this editor helps you out as well. Just type your Hugging Face organization and a forward slash (/) to see available models.

# This is a Generative Question Answering pipeline for English with hybrid retrieval (a keyword-based and a vector-based Retriever), a Ranker, and OpenAI's GPT-3.5 model in a PromptNode
version: '1.21.0'
name: 'GenerativeQuestionAnswering_GPT-3.5'

# This section defines nodes that you want to use in your pipelines. Each node must have a name and a type. You can also set the node's parameters here.
# The name is up to you. You can give your component a friendly name. You then use the components' names when specifying their order in the pipeline.
# Type is the class name of the component. 
components:
  - name: DocumentStore
    type: DeepsetCloudDocumentStore
  - name: BM25Retriever # The keyword-based retriever
    type: BM25Retriever
    params:
      document_store: DocumentStore
      top_k: 10 # The number of results to return
  - name: EmbeddingRetriever # Selects the most relevant documents from the document store
    type: EmbeddingRetriever # Uses a Transformer model to encode the document and the query
    params:
      document_store: DocumentStore
      embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1 # Model optimized for semantic search. It has been trained on 215M (question, answer) pairs from diverse sources.
      model_format: sentence_transformers
      top_k: 10 # The number of results to return
  - name: JoinResults # Joins the results from both retrievers
    type: JoinDocuments
    params:
      join_mode: concatenate # Combines documents from multiple retrievers
  - name: Reranker # Uses a cross-encoder model to rerank the documents returned by the two retrievers
    type: SentenceTransformersRanker
    params:
      model_name_or_path: cross-encoder/ms-marco-MiniLM-L-6-v2 # Fast model optimized for reranking
      top_k: 4 # The number of results to return
      batch_size: 20  # Try to keep this number equal to or larger than the sum of the top_k of the two retrievers so all docs are processed at once
  - name: qa_template
    type: PromptTemplate
    params:
      output_parser:
        type: AnswerParser
      prompt: "You are a technical expert. \
        You answer questions truthfully based on provided documents. \
        For each document check whether it is related to the question. \
        Only use documents that are related to the question to answer it. \
        Ignore documents that are not related to the question. \
        If the answer exists in several documents, summarize them. \
        Only answer based on the documents provided. Don't make things up. \
        Always use references in the form [NUMBER OF DOCUMENT] when using information from a document. e.g. [3], for Document[3]. \
        The reference must only refer to the number that comes in square brackets after passage. \
        Otherwise, do not use brackets in your answer and reference ONLY the number of the passage without mentioning the word passage. \
        If the documents can't answer the question or you are unsure say: 'The answer can't be found in the text'. \
        {new_line}\
        These are the documents:\
        {join(documents, delimiter=new_line, pattern=new_line+'Document[$idx]:'+new_line+'$content')}\
        {new_line}\
        Question: {query}\
        {new_line}\
        Answer:\
        {new_line}"
  - name: PromptNode
    type: PromptNode
    params:
      default_prompt_template: qa_template
      max_length: 400 # The maximum number of tokens the generated answer can have
      model_kwargs: # Specifies additional model settings
        temperature: 0 # Lower temperature works best for fact-based qa
      model_name_or_path: gpt-3.5-turbo
  - name: FileTypeClassifier # Routes files based on their extension to appropriate converters, by default txt, pdf, md, docx, html
    type: FileTypeClassifier
  - name: TextConverter # Converts files into documents
    type: TextConverter
  - name: PDFConverter # Converts PDFs into documents
    type: PDFToTextConverter
  - name: Preprocessor # Splits documents into smaller ones and cleans them up
    type: PreProcessor
    params:
      # With a vector-based retriever, it's good to split your documents into smaller ones
      split_by: word # The unit by which you want to split the documents
      split_length: 250 # The max number of words in a document
      split_overlap: 20 # Enables the sliding window approach
      language: en
      split_respect_sentence_boundary: True # Retains complete sentences in split documents

# Here you define how the nodes are organized in the pipelines
# For each node, specify its input
pipelines:
  - name: query
    nodes:
      - name: BM25Retriever
        inputs: [Query]
      - name: EmbeddingRetriever
        inputs: [Query]
      - name: JoinResults
        inputs: [BM25Retriever, EmbeddingRetriever]
      - name: Reranker
        inputs: [JoinResults]
      - name: PromptNode
        inputs: [Reranker]
  - name: indexing
    nodes:
    # Depending on the file type, we use a Text or PDF converter
      - name: FileTypeClassifier
        inputs: [File]
      - name: TextConverter
        inputs: [FileTypeClassifier.output_1] # Ensures that this converter receives txt files
      - name: PDFConverter
        inputs: [FileTypeClassifier.output_2] # Ensures that this converter receives PDFs
      - name: Preprocessor
        inputs: [TextConverter, PDFConverter]
      - name: EmbeddingRetriever
        inputs: [Preprocessor]
      - name: DocumentStore
        inputs: [EmbeddingRetriever]
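
As a sketch of the kind of change you can make in the prompt parameter, the fragment below is a shortened variant of the qa_template component above. It keeps the placeholders and functions the template already uses ({new_line}, {query}, and the join() call over documents) and changes only the instructions. The wording of the instructions, including the three-sentence limit, is an assumption made for illustration and is not part of the template.

```yaml
components:
  # Only the changed component is shown. All other components stay as they are.
  - name: qa_template
    type: PromptTemplate
    params:
      output_parser:
        type: AnswerParser
      # Hypothetical wording: only the instructions differ from the template above
      prompt: "You answer questions truthfully based on the provided documents. \
        Answer in at most three sentences. \
        Always use references in the form [NUMBER OF DOCUMENT] when using information from a document. \
        If the documents can't answer the question or you are unsure say: 'The answer can't be found in the text'. \
        {new_line}\
        These are the documents:\
        {join(documents, delimiter=new_line, pattern=new_line+'Document[$idx]:'+new_line+'$content')}\
        {new_line}\
        Question: {query}\
        {new_line}\
        Answer:\
        {new_line}"
```

Because the PromptNode refers to the template by name (default_prompt_template: qa_template), nothing else in the pipeline should need to change.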

Baseline RAG QA with Default Prompt and an Open Source Model

This pipeline uses the files you uploaded to deepset Cloud (or the files from your VPC connected to deepset Cloud) to generate answers. The Retriever node fetches the documents from the DocumentStore, and the pipeline then passes them on to the model in the prompt.

# If you need help with the YAML format, have a look at https://docs.cloud.deepset.ai/docs/create-a-pipeline#create-a-pipeline-using-yaml.
# This is a friendly editor that helps you create your pipelines with autosuggestions. To use them, press Control + Space on your keyboard.
# Whenever you need to specify a model, this editor helps you out as well. Just type your Hugging Face organization and a forward slash (/) to see available models.

# This is a Generative Question Answering pipeline for English with a good vector-based Retriever and Google's open-source FLAN-T5 model. Recommended for advanced users who want more control over models and prompts.
version: '1.21.0'
name: 'GenerativeQuestionAnswering_FLAN-T5'

# This section defines the nodes you want to use in your pipelines. Each node must have a name and a type. You can also set the node's parameters here.
# The name is up to you. You can give your component a friendly name. You then use the components' names when specifying their order in the pipeline.
# Type is the class name of the component. 
components:
  - name: DocumentStore
    type: DeepsetCloudDocumentStore # The only supported document store in deepset Cloud
  - name: Retriever # Selects the most relevant documents from the document store so that the LLM can base its generation on them.
    type: EmbeddingRetriever # Uses a Transformer model to encode the document and the query
    params:
      document_store: DocumentStore
      embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1 # Model optimized for semantic search 
      model_format: sentence_transformers
      top_k: 1 # The number of documents to return
  - name: PromptNode # The component that generates the answer based on the documents it gets from the retriever 
    type: PromptNode
    params:
      default_prompt_template: deepset/question-answering # A ready-made prompt that passes documents to the model as context 
      model_name_or_path: google/flan-t5-large # A free large language model for PromptNode. For production scenarios, we recommend a paid model.
      top_k: 3 # The number of answers to generate. You can change this value.
  - name: FileTypeClassifier # Routes files based on their extension to appropriate converters, by default txt, pdf, md, docx, html
    type: FileTypeClassifier
  - name: TextConverter # Converts files into documents
    type: TextConverter
  - name: PDFConverter # Converts PDFs into documents
    type: PDFToTextConverter
  - name: Preprocessor # Splits documents into smaller ones and cleans them up
    type: PreProcessor
    params:
      # With a vector-based retriever, it's good to split your documents into smaller ones
      split_by: word # The unit by which you want to split the documents
      split_length: 250 # The max number of words in a document
      split_overlap: 30 # Enables the sliding window approach
      split_respect_sentence_boundary: True # Retains complete sentences in split documents
      language: en # Used by NLTK to best detect the sentence boundaries for that language

# Here you define how the nodes are organized in the pipelines
# For each node, specify its input
pipelines:
  - name: query
    nodes:
      - name: Retriever
        inputs: [Query]
      - name: PromptNode
        inputs: [Retriever]
  - name: indexing
    nodes:
    # Depending on the file type, we use a Text or PDF converter
      - name: FileTypeClassifier
        inputs: [File]
      - name: TextConverter
        inputs: [FileTypeClassifier.output_1] # Ensures this converter receives TXT files
      - name: PDFConverter
        inputs: [FileTypeClassifier.output_2] # Ensures this converter receives PDFs
      - name: Preprocessor
        inputs: [TextConverter, PDFConverter]
      - name: Retriever
        inputs: [Preprocessor]
      - name: DocumentStore
        inputs: [Retriever]

This pipeline template is a good starting point. For production systems, we recommend replacing the free FLAN-T5 model with a better-performing model, such as OpenAI's gpt-3.5-turbo.
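
A minimal sketch of such a model swap, showing only the PromptNode component of the baseline pipeline. The settings mirror the gpt-3.5-turbo PromptNode used in the hybrid retrieval pipeline on this page. Treat the values as a starting point rather than a recommendation, and note that the fragment assumes an OpenAI API key is already available to your workspace.

```yaml
components:
  # Only the changed component is shown. All other components stay as they are.
  - name: PromptNode
    type: PromptNode
    params:
      default_prompt_template: deepset/question-answering # Unchanged ready-made prompt
      model_name_or_path: gpt-3.5-turbo # Replaces google/flan-t5-large
      max_length: 400 # The maximum number of tokens the generated answer can have
      model_kwargs: # Specifies additional model settings
        temperature: 0 # Lower temperature works best for fact-based QA
```

With a larger model, you may also want to raise the Retriever's top_k above 1 so that the model sees more than one document. The other pipelines on this page use 10.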

You can modify the PromptNode to use a custom prompt or another ready-made prompt template. For more information, see PromptNode. For guidance on writing prompts, see Prompt Engineering Guidelines. To learn how to use Prompt Explorer to work on your prompts, see Experimenting with Prompts.

Pipeline That Detects Hallucinations

This pipeline uses the HallucinationDetector node to show whether the generated answer is grounded in the documents the pipeline runs on:

version: '1.21.0'
name: 'Generative_QA'

components:
  - name: HallucinationDetector
    type: TransformersHallucinationDetector
  - name: DocumentStore
    type: DeepsetCloudDocumentStore
  - name: EmbeddingRetriever # Selects the most relevant documents from the document store
    type: EmbeddingRetriever # Uses a Transformer model to encode the document and the query
    params:
      document_store: DocumentStore
      embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1 # Model optimized for semantic search. It has been trained on 215M (question, answer) pairs from diverse sources.
      model_format: sentence_transformers
      top_k: 10 # The number of results to return
  - name: PromptNode
    type: PromptNode
    params:
      default_prompt_template: deepset/question-answering
      max_length: 400 # The maximum number of tokens the generated answer can have
      model_kwargs: # Specifies additional model settings
        temperature: 0 # Lower temperature works best for fact-based qa
      model_name_or_path: gpt-3.5-turbo
  - name: FileTypeClassifier # Routes files based on their extension to appropriate converters, by default txt, pdf, md, docx, html
    type: FileTypeClassifier
  - name: TextConverter # Converts files into documents
    type: TextConverter
  - name: PDFConverter # Converts PDFs into documents
    type: PDFToTextConverter
  - name: Preprocessor # Splits documents into smaller ones and cleans them up
    type: PreProcessor
    params:
      # With a vector-based retriever, it's good to split your documents into smaller ones
      split_by: word # The unit by which you want to split the documents
      split_length: 250 # The max number of words in a document
      split_overlap: 20 # Enables the sliding window approach
      language: en
      split_respect_sentence_boundary: True # Retains complete sentences in split documents

# Here you define how the nodes are organized in the pipelines
# For each node, specify its input
pipelines:
  - name: query
    nodes:
      - name: EmbeddingRetriever
        inputs: [Query]
      - name: PromptNode
        inputs: [EmbeddingRetriever]
      - name: HallucinationDetector
        inputs: [PromptNode]
  - name: indexing
    nodes:
    # Depending on the file type, we use a Text or PDF converter
      - name: FileTypeClassifier
        inputs: [File]
      - name: TextConverter
        inputs: [FileTypeClassifier.output_1] # Ensures that this converter receives txt files
      - name: PDFConverter
        inputs: [FileTypeClassifier.output_2] # Ensures that this converter receives PDFs
      - name: Preprocessor
        inputs: [TextConverter, PDFConverter]
      - name: EmbeddingRetriever
        inputs: [Preprocessor]
      - name: DocumentStore
        inputs: [EmbeddingRetriever]
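
The same node could, in principle, be appended to the hybrid retrieval pipeline from the first example on this page. The sketch below shows only the added component and the resulting query pipeline. This combination is not one of the templates shown here and has not been tested, so treat it as an assumption to verify in your own workspace.

```yaml
components:
  # ... all components of the hybrid retrieval pipeline, plus:
  - name: HallucinationDetector
    type: TransformersHallucinationDetector

pipelines:
  - name: query
    nodes:
      - name: BM25Retriever
        inputs: [Query]
      - name: EmbeddingRetriever
        inputs: [Query]
      - name: JoinResults
        inputs: [BM25Retriever, EmbeddingRetriever]
      - name: Reranker
        inputs: [JoinResults]
      - name: PromptNode
        inputs: [Reranker]
      - name: HallucinationDetector # Added as the last node, as in the pipeline above
        inputs: [PromptNode]
```

The indexing pipeline stays the same, because the HallucinationDetector only takes part in the query pipeline.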