Use Case: Generative AI Systems

Find out when it's best to use a generative question answering system, what type of data it needs, and who it's best for.

Description

Generative AI search systems are capable of creating new, original text based on the documents you feed to them. Unlike extractive question answering systems that highlight the exact answer in the text, generative systems generate coherent and meaningful answers from scratch.

Generative models can perform a variety of NLP tasks, depending on what you tell them to do in the prompt. Some examples include summarization, translation, or question answering. See the sections below for concrete use cases.

Additional Considerations

Before you decide to use a generative model, there are a couple of things you should take into account:

  • Choose a model trained on diverse and unbiased data. This helps to avoid inaccurate or biased responses.
  • Make sure you design and monitor your generative AI system to prevent it from generating inappropriate or harmful content. Use models from reliable vendors or fine-tune your own model. Formulate the prompt text so that it minimizes the risk of prompt injection, for example, by instructing the model to answer with "I don't know" when the answer is not in the documents.
  • Be transparent and make it clear how your system arrives at the answers. In deepset Cloud, you can show the sources for each answer. This helps build user trust in the system.

Generative QA Systems

A generative QA system is best for:

  • Tasks that require the generation of natural language responses to questions.
  • Chatbots, virtual assistants, and knowledge management systems where human-like interaction is important.

Data

You can use any text data. Some examples are:

  • Company policies, benefits, training opportunities, and other HR-related material.
  • Technical documentation and product FAQ

Users

  • Data scientists: Design the QA system, create the pipelines, and supervise domain experts.
  • End users: Use the system, evaluate its usefulness for business, and provide feedback in the deepset Cloud UI.

Pipeline

Here is an example of a pipeline definition file for this use case. It contains both the indexing and the query pipeline.

# If you need help with the YAML format, have a look at https://docs.cloud.deepset.ai/docs/create-a-pipeline#create-a-pipeline-using-yaml.
# This is a friendly editor that helps you create your pipelines with autosuggestions. To use them, press Control + Space on your keyboard.
# Whenever you need to specify a model, this editor helps you out as well. Just type your Hugging Face organization and a forward slash (/) to see available models.

# This is a Generative Question Answering pipeline for English with a good vector-based Retriever and Google's open-source FLAN-T5 model. Recommended for advanced users who want more control over models and prompts.
version: '1.16.0'
name: 'GenerativeQuestionAnswering_FLAN-T5'

# This section defines the nodes you want to use in your pipelines. Each node must have a name and a type. You can also set the node's parameters here.
# The name is up to you, you can give your component a friendly name. You then use components' names when specifying their order in the pipeline.
# Type is the class name of the component. 
components:
  - name: DocumentStore
    type: DeepsetCloudDocumentStore # The only supported document store in deepset Cloud
  - name: Retriever # Selects the most relevant documents from the document store so that the LLM can base its generation on it. 
    type: EmbeddingRetriever # Uses a Transformer model to encode the document and the query
    params:
      document_store: DocumentStore
      embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1 # Model optimized for semantic search 
      model_format: sentence_transformers
      top_k: 1 # The number of documents to return
  - name: PromptNode # The component that generates the answer based on the documents it gets from the retriever 
    type: PromptNode
    params:
      default_prompt_template: question-answering # A default prompt for question answering. 
      model_name_or_path: google/flan-t5-large # A free large language model for PromptNode. For production scenarios, we recommend a paid model.
      top_k: 3 # The number of answers to generate, you can change this value.
  - name: FileTypeClassifier # Routes files based on their extension to appropriate converters, by default txt, pdf, md, docx, html
    type: FileTypeClassifier
  - name: TextConverter # Converts files into documents
    type: TextConverter
  - name: PDFConverter # Converts PDFs into documents
    type: PDFToTextConverter
  - name: Preprocessor # Splits documents into smaller ones and cleans them up
    type: PreProcessor
    params:
      # With a vector-based retriever, it's good to split your documents into smaller ones
      split_by: word # The unit by which you want to split the documents
      split_length: 250 # The max number of words in a document
      split_overlap: 30 # Enables the sliding window approach
      split_respect_sentence_boundary: True # Retains complete sentences in split documents
      language: en # Used by NLTK to best detect the sentence boundaries for that language

# Here you define how the nodes are organized in the pipelines
# For each node, specify its input
pipelines:
  - name: query
    nodes:
      - name: Retriever
        inputs: [Query]
      - name: PromptNode
        inputs: [Retriever]
  - name: indexing
    nodes:
    # Depending on the file type, we use a Text or PDF converter
      - name: FileTypeClassifier
        inputs: [File]
      - name: TextConverter
        inputs: [FileTypeClassifier.output_1] # Ensures this converter receives TXT files
      - name: PDFConverter
        inputs: [FileTypeClassifier.output_2] # Ensures this converter receives PDFs
      - name: Preprocessor
        inputs: [TextConverter, PDFConverter]
      - name: Retriever
        inputs: [Preprocessor]
      - name: DocumentStore
        inputs: [Retriever]

This pipeline template is a good starting point. For production systems, we recommend changing the free FLAN T5 model to a better-performing model, such as OpenAI's gpt-3.5 turbo.

You can modify the PromptNode to use a custom prompt or another ready-made prompt template. For more information, see PromptNode. You may also have a look at Prompt Engineering Guidelines for guidance on how to create prompts.

What To Do Next?

You can now demo your search system to the users. Share your pipeline prototype and have them test your pipelines. Have a look at the Guidelines for Onboarding Your Users to ensure your demo is successful.