Building with Large Language Models (LLMs)

LLMs show remarkable capabilities in understanding and generating human-like text. Learn how you can use them in your deepset Cloud pipelines.

LLMs in Your Pipelines

You can easily integrate LLMs in your deepset Cloud pipelines using the versatile Generators coupled with a PromptBuilder. Generators work well for retrieval augmented generation (RAG) question answering and other tasks such as text classification, summarization, and more. A Generator performs the specific task you define in the prompt you pass to it.
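Conceptually, the PromptBuilder fills a prompt template with your variables (such as the query and any retrieved documents) and hands the finished prompt to a Generator, which calls the LLM. Here is a minimal, stdlib-only sketch of that flow; the function names and template text are illustrative stand-ins, not the actual Haystack API:

```python
from string import Template

def build_prompt(template: str, **variables) -> str:
    # Illustrative stand-in for a PromptBuilder: fills a template with variables.
    return Template(template).substitute(**variables)

def generate(prompt: str) -> str:
    # Illustrative stand-in for a Generator: a real Generator would send the
    # prompt to an LLM; here we just echo it back.
    return f"[LLM response to: {prompt!r}]"

template = (
    "Answer the question using the documents.\n"
    "Documents: $documents\n"
    "Question: $query"
)
prompt = build_prompt(
    template,
    documents="Paris is the capital of France.",
    query="What is the capital of France?",
)
answer = generate(prompt)
```

The key design point is the separation of concerns: the prompt (the task definition) lives in the PromptBuilder, while the model choice lives in the Generator, so you can change either independently.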

Ready-Made Templates for LLM Apps

The Pipeline Templates page in deepset Cloud offers a variety of ready-made templates you can use with their default settings. Templates with RAG in the title use an LLM and have streaming enabled by default.

In the Conversational category, you'll find RAG Chat templates designed specifically for chat scenarios, such as customer assistants. These chat pipelines pass the chat history to the LLM, ensuring it considers previous turns to create a human-like conversation experience. All chat pipelines have streaming enabled by default.
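Including the history is what lets the LLM resolve references to earlier turns. A hedged sketch of how a chat prompt might be assembled from prior messages (the role-prefixed message format is illustrative, not deepset Cloud's internal format):

```python
def build_chat_prompt(history: list[tuple[str, str]], new_question: str) -> str:
    # Each history entry is a (role, text) pair from an earlier turn.
    lines = [f"{role}: {text}" for role, text in history]
    lines.append(f"user: {new_question}")
    lines.append("assistant:")
    return "\n".join(lines)

history = [
    ("user", "What plans do you offer?"),
    ("assistant", "We offer Basic and Pro plans."),
]
prompt = build_chat_prompt(history, "How much is the second one?")
# With the history in the prompt, the LLM can resolve "the second one"
# to the Pro plan; without it, the question would be ambiguous.
```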

Streaming

Streaming refers to the process of returning responses in real time as the model generates them. Instead of waiting for the entire response to be generated before displaying it, the model sends the answer token by token, making the communication feel more fluid and immediate.

Streaming is particularly useful in applications where timely feedback is crucial, such as in live chat interfaces or conversational agents.

All RAG pipelines in deepset Cloud have streaming enabled by default.
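The difference is easy to see with a toy token stream. In the sketch below, the callback is a stand-in for whatever your UI does with each chunk as it arrives; this is a pure-Python illustration of the idea, not the deepset Cloud streaming API:

```python
received = []

def on_chunk(token: str) -> None:
    # In a real app, this would append the token to the chat UI immediately,
    # so the user sees the answer grow as it is generated.
    received.append(token)

def stream_answer(tokens, callback) -> str:
    # Deliver tokens one at a time via the callback instead of returning
    # the full answer only after everything has been generated.
    for token in tokens:
        callback(token)
    return "".join(tokens)

answer = stream_answer(["The ", "capital ", "is ", "Paris."], on_chunk)
```

With streaming, the first words reach the user while the rest of the answer is still being generated, which is why it matters for live chat interfaces.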

Creating an LLM App

You can start with an out-of-the-box template and then experiment with models and prompts:

  1. Create a pipeline from a template. Choose any RAG template. A good starting point is RAG Question Answering GPT-3.5, which uses OpenAIGenerator with the GPT-3.5 model.
    By changing the prompt and the Generator, you can adapt it to NLP tasks beyond generative question answering.
  2. Experiment with your pipeline settings:
    1. Try changing the model. You use models through Generator components. There are Generators for various LLMs, including models from OpenAI, Cohere, Anthropic, and Meta. For a full list, see Generators.
    2. Experiment with different prompts. PromptBuilder is the component that holds the prompt. Use Prompt Studio, a sandbox environment in deepset Cloud, to refine and test your prompts.
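Both experiments boil down to changing two settings: the model name on the Generator and the template on the PromptBuilder. The sketch below illustrates that idea with hypothetical configuration dictionaries; the keys, model names, and `render` helper are illustrative, not the deepset Cloud configuration schema:

```python
# Two hypothetical configurations of the same pipeline, differing only in
# the generator's model and the prompt builder's template.
qa_config = {
    "generator": {"model": "gpt-3.5-turbo"},
    "prompt_builder": {"template": "Answer the question: {query}"},
}
summarization_config = {
    "generator": {"model": "gpt-4"},
    "prompt_builder": {"template": "Summarize the following text: {query}"},
}

def render(config: dict, query: str) -> str:
    # Fill the configured prompt template, as a PromptBuilder would.
    return config["prompt_builder"]["template"].format(query=query)

prompt = render(summarization_config, "LLMs generate text token by token.")
```

Because the task lives entirely in the prompt template, swapping the template turns the same pipeline from question answering into summarization without touching anything else.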

Learn More

Explore these resources to learn more about LLMs and generative AI in deepset Cloud.

About LLMs

Generative AI in Practice

Prompt Engineering