5-Step Guide to Building a Successful Prototype

Creating a successful AI-based application requires careful planning and execution. This guide outlines five essential steps to help you navigate the process, from identifying your use case to testing your prototype.

Step 1: Identify Your Use Case

Brainstorm and Prioritize

Gather your business and technical teams to brainstorm potential use cases for your LLM app. It’s crucial that the group includes both: business teams understand the business problems, while technical experts can spot what’s doable.

Make sure everyone understands the business value of each use case. For each idea, ask:

  1. How much value would solving this problem bring?
  2. How much effort would it take to solve this problem?

Choose use cases that offer high impact without draining your resources. Aim for a working system that delivers value and functionality, even if it’s not polished yet.
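One way to make the value and effort questions concrete is to score each idea and rank by the ratio of the two. The sketch below is only an illustration: the 1-to-5 scale, the example use cases, and the `UseCase` and `prioritize` names are assumptions, not part of any tool mentioned in this guide.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    name: str
    value: int   # estimated business value, 1 (low) to 5 (high); assumed scale
    effort: int  # estimated effort to build, 1 (low) to 5 (high); assumed scale


def prioritize(use_cases):
    """Return use cases sorted by value-to-effort ratio, highest first."""
    return sorted(use_cases, key=lambda u: u.value / u.effort, reverse=True)


# Hypothetical ideas collected in a brainstorming session.
candidates = [
    UseCase("Contract search", value=4, effort=2),
    UseCase("Full support automation", value=5, effort=5),
    UseCase("Meeting summaries", value=3, effort=1),
]

ranked = prioritize(candidates)
for use_case in ranked:
    print(f"{use_case.name}: {use_case.value / use_case.effort:.1f}")
```

A ratio like this is a conversation starter rather than a decision rule. The team still needs to agree on what the scores mean.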

Start with use cases that don’t require spending time on data preparation. Make sure you can easily access the data, it’s in a format you can consume, and you have permission to use it.

When estimating the effort needed to build your solution, remember that it’s often the best choice to start small. AI doesn’t need to replace your whole workflow. Improving one small part of it can already be a big win.

Step 2: Assemble Your Team

To build a successful prototype, you need these key players:

Must-Have Team Members

You need all of these roles. Without any one of them, you’ll miss essential expertise.

  • Product Leader: Sets the vision, works with stakeholders, and keeps delivery on track.
  • AI Engineer: Knows AI, vector databases, and prompting. Has a product-focused mindset and understands the current LLM and AI landscape. Choose someone who has shipped great products before.
  • Domain Expert: Gives advice and tests the app. This will be the subject matter expert who looks at the app from the content perspective.

Additional Team Members

Depending on your app, you might also want:

  • Backend Engineer: Handles the server side of your app and ensures that it is scalable, performant, and secure, among other things.
  • Frontend Engineer: Creates a user-friendly interface and ensures your app is responsive, accessible, and performant.
  • UI/UX Designer: Ensures your app provides a positive and meaningful experience for its users. Conducts user research, designs how users interact with your app, and creates designs for the frontend engineer.
  • DevOps Engineer: Manages and provisions infrastructure, implements systems for monitoring and logging, sets up CI/CD pipelines, ensures efficient allocation of computing resources, and more.
  • Data Engineer: Ensures data quality and consistency.

Ensure that your team has a balanced mix of skills to handle all aspects of development.

💡Team structure when using deepset Cloud

deepset Cloud handles infrastructure, replacing the need for DevOps and Backend Engineers. It offers a customizable testing interface, so you might not need a Frontend Engineer or UX Designer right away. deepset Cloud also handles most of the data lifecycle, potentially eliminating the need for a dedicated Data Engineer.


Step 3: Match Your Use Case to the Technology

At this stage, you plan your app’s architecture, including its components, inputs, and outputs and how the app arrives at the desired output.

Define Inputs and Outputs

Clearly outline what the user inputs into your system and the expected outputs. This ensures the app meets user needs and fits existing workflows. Think about:

  • What’s the input? Text? Images? Audio?
  • How will users phrase their queries? Natural language or keywords?
  • What output do users expect? This could be generated answers, extracted information, a list of documents, generated code, and so on.

Examples of inputs and outputs for common AI-based systems:

  • Recommendation system
    • Input: User’s browsing history and preferences
    • Output: List of recommended products or articles
  • Chatbot
    • Input: User’s text or voice query
    • Output: Relevant information, suggested actions, answers to questions
  • Sentiment analysis tool
    • Input: Customer reviews or social media posts
    • Output: Sentiment scores (positive, negative, neutral)
  • Image recognition tool
    • Input: Uploaded images or photos
    • Output: Object labels, scene description, detected faces
  • Question answering system
    • Input: User’s question in natural language
    • Output: Extracted or generated answer
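Writing the inputs and outputs down as types can help the team agree on them early. The following is a minimal sketch for the question answering example, assuming a text question as input and an answer with its source documents as output. The class names, field names, and the placeholder `answer_question` function are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QuestionInput:
    question: str  # the user's question in natural language


@dataclass
class AnswerOutput:
    answer: str  # extracted or generated answer
    source_documents: List[str] = field(default_factory=list)  # assumed extra field


def answer_question(request: QuestionInput) -> AnswerOutput:
    """Placeholder for the real system; it only fixes the input/output contract."""
    return AnswerOutput(answer="Not implemented yet", source_documents=[])


result = answer_question(QuestionInput(question="What is our refund policy?"))
print(result)
```

Once the contract is fixed, the placeholder can be swapped for a real implementation without changing the code that calls it.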

Sketch System Architecture

Create a rough sketch of your system architecture. Identify the components that will handle the inputs and outputs. Take into account security:

  • Is it an internal application? You might need fewer security controls.
  • Is it high-risk? Plan for strict security controls. Include safeguards against prompt injections and hallucinations.

💡Building with deepset Cloud

deepset Cloud offers templates with built-in safeguards against prompt injections. It also includes a groundedness observability dashboard that shows whether the generated answers stick to your documents.

For an outline of the elements of an LLM system, see Components of an AI-Based App.

Step 4: Build Your Prototype

Get into building mode. Your goal is to build a testable prototype as fast as possible. Follow these guidelines:

  • Optimize for speed. Prioritize quick delivery over scalability. Don’t get lost in the details.
  • Make it realistic. Your prototype must operate on the data you’ll use in production. Make sure it captures all inputs and outputs.
  • Document everything. Keep track of all data and processes involved.
  • Don’t optimize for scale yet. You first need to verify that what you’re building brings value.
  • Make your prototype interactive. Users must be able to ask a question and get an answer. They must be able to see the value of your app already at this stage.

Timelines

Aim to build your prototype in 1 to 4 weeks. Don’t spend longer than that, because you still need to validate your ideas.

Step 5: Test Your Prototype

Time to ship to actual users. Start small:

  1. Release to 5 to 10 users, gather feedback, and fix major issues. Include subject matter experts or people who will actually use your system, such as those paying for your services. Aim to gather around 100 queries or more. This should be enough to reveal the weak and strong points of your app.
  2. Tweak and release to a few hundred users. For internal apps, let all your users test it. At this stage, you want to collect as many queries as possible.

When releasing for tests, give users a specific timeline, like two weeks or a month. The timeline depends on your project and its constraints. Some projects take a week, while others can take a couple of months.

Make sure users understand how the app works, what data it uses, and what outputs to expect. You may need to educate them on what questions or inputs work and don't work. Ask them to give feedback on each answer. If security matters, encourage users to try breaking it. You can send all these as guidelines to your users before they start testing.

Check feedback frequently to see whether it's mostly positive or negative. If negative ratings prevail, act and improve the pipeline. While tests are ongoing, don't analyze feedback in depth; just check the general metrics.

You may also use additional ways to gain insights from the users, such as a survey to gauge how they liked the app and whether it was useful.

When testing:

  • Record all inputs and outputs.
  • Gather feedback.
  • Refine the prototype to address any issues.
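Recording inputs and outputs can be as simple as appending one JSON object per interaction to a log. The sketch below assumes a JSON Lines format and a hypothetical `record_interaction` helper. It writes to an in-memory buffer so it runs anywhere, but a real prototype would write to a file or database.

```python
import io
import json
from datetime import datetime, timezone


def record_interaction(log_file, query, answer, feedback=None):
    """Append one query/answer pair, with optional user feedback, as a JSON line."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "query": query,
        "answer": answer,
        "feedback": feedback,  # for example "positive" or "negative"; assumed labels
    }
    log_file.write(json.dumps(entry) + "\n")
    return entry


# In-memory stand-in for a log file opened in append mode.
log = io.StringIO()
record_interaction(
    log,
    query="What is our refund policy?",
    answer="Refunds are available within 30 days.",
    feedback="positive",
)
print(log.getvalue())
```

Keeping the query, the answer, and the feedback in one record makes the later analysis easier, because each rating stays attached to the output it refers to.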

Analyzing Feedback

When the testing phase is over, take a closer look at the feedback you got. You’ll end up with a list of issues to fix before going live. Building an AI-based app is a cycle of test-improve-test, so keep monitoring your app and gathering feedback.

Feedback tends to skew negative because people are more likely to give negative feedback than positive feedback. With this in mind, decide on the minimum performance your app needs. A 60% share of positive feedback may already be sufficient.
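To compare collected ratings against a minimum bar such as the 60% mentioned above, you can compute the share of positive ratings. This is a rough sketch that assumes each answer is rated "positive" or "negative". The `positive_share` function, the sample ratings, and the threshold constant are illustrative.

```python
def positive_share(ratings):
    """Return the fraction of positive ratings, or None if nothing was rated."""
    rated = [r for r in ratings if r in ("positive", "negative")]
    if not rated:
        return None
    return sum(r == "positive" for r in rated) / len(rated)


# Hypothetical ratings collected during testing; None means the user gave no rating.
ratings = [
    "positive", "positive", "negative", "positive", None,
    "positive", "negative", "positive", "positive", "negative", "positive",
]

MIN_POSITIVE_SHARE = 0.6  # assumed minimum bar; adjust to your project

share = positive_share(ratings)
meets_bar = share is not None and share >= MIN_POSITIVE_SHARE
print(f"Positive share: {share:.0%}, meets bar: {meets_bar}")
```

Unrated answers are left out of the calculation here. Whether to count them differently is a choice you need to make for your own project.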

When analyzing feedback, look for signs of value:

  • How often was the app used?
  • Do users come back after their first try?
  • Can you spot any usage patterns?
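If you recorded who used the app and when, a few lines of code can answer the first two questions. The sketch below assumes a list of (user, day) pairs taken from your logs. The data and the `usage_signals` function are hypothetical.

```python
def usage_signals(events):
    """Summarize usage from (user_id, day) pairs, one pair per query."""
    days_per_user = {}
    for user_id, day in events:
        days_per_user.setdefault(user_id, set()).add(day)
    # A returning user is one who queried the app on more than one day.
    returning = sum(1 for days in days_per_user.values() if len(days) > 1)
    return {
        "total_queries": len(events),
        "users": len(days_per_user),
        "returning_users": returning,
    }


# Hypothetical query log: one entry per query.
events = [
    ("alice", "2024-05-01"),
    ("alice", "2024-05-01"),
    ("alice", "2024-05-03"),
    ("bob", "2024-05-01"),
    ("carol", "2024-05-02"),
    ("carol", "2024-05-04"),
]

signals = usage_signals(events)
print(signals)
```

Counting distinct days is only one possible definition of a returning user. Usage patterns, such as recurring query topics, usually need a manual look at the queries themselves.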

Consider doing user interviews. Ask questions like:

  • Can you show me how you used the app?
  • Was anything unclear?
  • Would you use it if it were a product?
  • Would you pay for it?

Moving to Production

Before going live, check:

  • Performance: Does your app meet the expected response time? Is it fast enough?
  • Scalability: Can it handle the expected number of requests per minute or per day?
  • Security: Do you have all the security controls you need?
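For the performance check, a common approach is to look at a high percentile of response times rather than the average, since a few slow answers can hide behind a good mean. The sketch below uses the nearest-rank method on hypothetical latencies. The 2-second target and the `percentile` helper are assumptions for illustration.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = max(math.ceil(pct / 100 * len(ordered)), 1)
    return ordered[rank - 1]


# Hypothetical response times in seconds, taken from the prototype's logs.
latencies = [0.8, 1.2, 0.9, 3.5, 1.1, 1.0, 0.7, 1.3, 0.9, 1.4]

TARGET_P95_SECONDS = 2.0  # assumed target; set it from your own requirements

p95 = percentile(latencies, 95)
meets_target = p95 <= TARGET_P95_SECONDS
print(f"p95 latency: {p95}s, meets target: {meets_target}")
```

In this sample, a single slow response is enough to miss the target, which an average of the same numbers would not reveal.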

By following these steps, you can build a robust LLM application that meets your business needs and delivers measurable value.