Create a Custom Component

Create components tailored specifically to your use case and use them in your pipelines.

About This Task

A component is a Python code snippet that follows our template and performs a specific task on your data. When you create and upload a custom component to deepset Cloud, it becomes accessible to your entire organization. Any member can then use the component in their pipelines.

Custom components are based on Haystack components. Haystack is deepset's open source AI framework, which also powers deepset Cloud. To learn more, visit the Haystack website.

We provide a template for creating your custom components, which is available as a GitHub repository. This template serves as a custom components library for your organization. Components created in the ./dc-custom-component-template/src/dc_custom_component/example_components/ folder and imported into deepset Cloud are the components you can use in your pipelines. Only the components present in the most recently uploaded template are available for use.

For example, if someone in your organization creates a component called WelcomeTextGenerator and uploads it to deepset Cloud, everyone in the organization can use it. However, if later someone adds two new components, GoodbyeTextGenerator and CharacterSplitter, and deletes WelcomeTextGenerator, only the new components will be available to use in your pipelines. WelcomeTextGenerator will no longer be accessible.

Currently, you can't delete custom components.

Prerequisites

Create a Component

Prepare the Template

  1. Fork the dc-custom-component-template GitHub repository and clone your fork to your machine. Forking lets you version control your changes.
  2. Navigate to the directory where you cloned the repository and open the ./dc-custom-component-template/src/dc_custom_component/example_components/ directory. The preprocessors and rankers folders are examples you can modify or remove as needed.
  3. Create a new folder or rename one of the example folders to match your custom component's name and open it. There's a .py file inside. This is where you'll write your component code. You can rename this file as well.
    Example:
    To create a custom component called WelcomeTextGenerator:
    1. Rename ./dc-custom-component-template/src/dc_custom_component/example_components/preprocessors to ./dc-custom-component-template/src/dc_custom_component/components/generators.
    2. Open the generators folder and rename the example file character_splitter.py to welcome_text_generator.py.
    3. Delete the rankers folder if it's not needed. If you're creating multiple components, use the folder structure to keep them organized.
      Note: You can create all your components in one .py file or you can have a separate folder with a .py file in it for each custom component. That's up to you.

      Component Folder as Group Name

      The folder name containing your component code becomes the component group name in Pipeline Builder. For example, if you place your component in ./dc-custom-component-template/src/dc_custom_component/components/generators, it will appear in the Generators group in the Pipeline Builder component library.
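
To illustrate the folder-to-group mapping, here is a minimal sketch. The group_name_for helper is hypothetical and not part of deepset Cloud. It assumes the group label is simply the capitalized name of the folder that contains the component file, as in the generators example above.

```python
from pathlib import PurePosixPath


def group_name_for(component_file: str) -> str:
    """Return the assumed Pipeline Builder group for a component file.

    Assumption: the group label is the capitalized name of the parent folder.
    """
    return PurePosixPath(component_file).parent.name.capitalize()


path = "src/dc_custom_component/components/generators/welcome_text_generator.py"
print(group_name_for(path))  # prints "Generators"
```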

Set Up a Virtual Environment

Creating a virtual environment isolates your project's dependencies from other Python projects and your system Python installation.

The template uses Hatch, a Python project manager, to set up virtual environments. For details, see the Hatch website.

  1. Install Hatch by running: pip install hatch. This installs all the necessary packages, including pytest.
  2. Create a virtual environment by running: hatch shell.
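
To confirm that the Python interpreter you're using belongs to a virtual environment, you can run a quick check like the sketch below. It relies only on the standard library. The in_virtual_env helper name is ours and is not part of Hatch.

```python
import sys


def in_virtual_env() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Virtual environment active:", in_virtual_env())
```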

Implement the Component

  1. Write the component code in the .py file. You can use the example file included in the template as a starting point.
  2. If your component has dependencies, add them in the ./dc-custom-component-template/pyproject.toml file in the dependencies section:
    dependencies = [
      "haystack-ai>=2.0.0"
    ]

    Note: Do not modify versions of dependencies already listed in this file.

  3. From the project root directory, run the hatch run code-quality:all command to format your code.
  4. Update the component version in the ./dc-custom-component-template/src/dc_custom_component/__about__.py file. You can specify version numbers in any way you like, but we suggest you adopt the major.minor.micro format, for example, 1.1.0. Have a look at Hatch versioning for guidelines.
    The version number applies to all components. Even if you have multiple components, you only need to specify one version number.

Components Connecting to Third-Party Providers

If your component connects to a third-party provider that requires authentication, store the API key in an environment variable and then retrieve it in the component's run() method.

The following component shows how to do this. First, you add the env_var_name init parameter and then you retrieve this value in the run() method using os.getenv(self.env_var_name). The retrieved value is stored in loaded_api_key.

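import os
from typing import Dict, List

from haystack import component
from haystack.dataclasses import GeneratedAnswer


@component
class CustomComponent:
    def __init__(self, env_var_name: str = "API_KEY"):
        # Store only the name of the environment variable, not the key itself.
        self.env_var_name = env_var_name

    @component.output_types(answers=List[GeneratedAnswer])
    def run(self, query: str) -> Dict[str, List[GeneratedAnswer]]:
        # Read the key from the environment at run time.
        loaded_api_key = os.getenv(self.env_var_name)
        # Use loaded_api_key to authenticate with the provider here.
        pass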
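
Because os.getenv returns None when the variable isn't set, it can help to fail early with a clear message. The sketch below shows one way to do that. The load_api_key helper and the MY_PROVIDER_API_KEY variable name are hypothetical and are not part of the template.

```python
import os


def load_api_key(env_var_name: str = "API_KEY") -> str:
    """Read an API key from the environment, failing early if it is missing."""
    api_key = os.getenv(env_var_name)
    if not api_key:
        # os.getenv returns None for unset variables; empty strings are rejected too.
        raise ValueError(f"Environment variable {env_var_name!r} is not set.")
    return api_key


# Hypothetical variable name, set here only so the sketch runs on its own.
os.environ["MY_PROVIDER_API_KEY"] = "dummy-key"
print(load_api_key("MY_PROVIDER_API_KEY"))
```

You could call a helper like this at the start of run() in place of the bare os.getenv call.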

Test Your Component

We recommend that you test your component before uploading.

  1. Add unit and integration tests in the ./dc-custom-component-template/tests folder to verify that your component behaves as expected.
  2. Run your tests using: hatch run tests. If the tests pass, your component is ready.
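
A unit test for a component can be as small as the sketch below. To stay self-contained, it defines a stand-in class rather than importing a real component. In your tests folder, you would import your component from the dc_custom_component package instead. The class, its output key, and the test names are illustrative assumptions.

```python
from typing import Dict


class WelcomeTextGenerator:
    """Stand-in for a real component so this sketch runs on its own."""

    def run(self, name: str) -> Dict[str, str]:
        return {"welcome_text": f"Hello {name}!"}


def test_run_returns_welcome_text() -> None:
    # The output dictionary should contain exactly the expected key and text.
    result = WelcomeTextGenerator().run(name="Ada")
    assert result == {"welcome_text": "Hello Ada!"}


def test_run_uses_the_given_name() -> None:
    # The generated text should mention whichever name was passed in.
    result = WelcomeTextGenerator().run(name="Grace")
    assert "Grace" in result["welcome_text"]
```

By default, pytest collects functions whose names start with test_ from files named test_*.py, so a file such as test_welcome_text_generator.py would be a natural home for these tests.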

What to Do Next

Upload your component to deepset Cloud to use it in your pipelines.