Create a Custom Component

Create components tailored specifically to your use case and use them in your pipelines.

About This Task

A component is a Python code snippet that follows our template and performs a specific task on your data. When you create and upload a custom component to deepset Cloud, it becomes accessible to your entire organization. Any member can then use the component in their pipelines.

Custom components are based on Haystack components. Haystack is deepset's open source AI framework, which also powers deepset Cloud. To learn more, visit the Haystack website.

We provide a template for creating your custom components, which is available as a GitHub repository. This template serves as a custom components library for your organization. Components created in the ./dc-custom-component-template/src/dc_custom_component/example_components/ folder and imported into deepset Cloud are the components you can use in your pipelines. Only the components present in the most recently uploaded template are available for use.

For example, if someone in your organization creates a component called WelcomeTextGenerator and uploads it to deepset Cloud, everyone in the organization can use it. However, if later someone adds two new components, GoodbyeTextGenerator and CharacterSplitter, and deletes WelcomeTextGenerator, only the new components will be available to use in your pipelines. WelcomeTextGenerator will no longer be accessible.

Currently, you can't delete custom components.

Prerequisites

Create a Component

Prepare the Template

  1. Fork the dc-custom-component-template GitHub repository and clone your fork to your machine. Working in a fork lets you version control your changes.
  2. Navigate to the directory where you cloned the repository and open the ./dc-custom-component-template/src/dc_custom_component/example_components/ directory. The preprocessors and rankers folders are examples you can modify or remove as needed.
  3. Create a new folder or rename one of the example folders to match your custom component's name and open it. There's a .py file inside. This is where you'll write your component code. You can rename this file as well.
    Example:
    To create a custom component called WelcomeTextGenerator:
    1. Rename ./dc-custom-component-template/src/dc_custom_component/example_components/preprocessors to ./dc-custom-component-template/src/dc_custom_component/components/generators.
    2. Open the generators folder and rename the example file character_splitter.py to welcome_text_generator.py.
    3. Delete the rankers folder if it's not needed. If you're creating multiple components, use the folder structure to keep them organized.
      Note: You can create all your components in one .py file or you can have a separate folder with a .py file in it for each custom component. That's up to you.

      📘

      Component Folder as Group Name

      The folder name containing your component code becomes the component group name in Studio. For example, if you place your component in ./dc-custom-component-template/src/dc_custom_component/components/generators, it will appear in the Generators group in the Studio component library.

Set Up a Virtual Environment

Creating a virtual environment isolates your project's dependencies from other Python projects and your system Python installation.

The template uses Hatch, a Python project manager, to set up virtual environments. For details, see the Hatch website.

  1. Install Hatch by running: pip install hatch. This installs all the necessary packages, including pytest.
  2. Create a virtual environment by running: hatch shell.

Implement the Component

  1. Write the component code in the .py file. Use the recipe below as a starting point:
  2. If your component has dependencies, add them in the ./dc-custom-component-template/pyproject.toml file in the dependencies section:
dependencies = [
  "haystack-ai>=2.0.0"
]

Note: Do not modify versions of dependencies already listed in this file.

  3. From the project root directory, run the hatch run code-quality:all command to format your code.
  4. Update the component version in the ./dc-custom-component-template/src/dc_custom_component/__about__.py file. You can specify version numbers in any way you like, but we suggest you adopt the major.minor.micro format, for example, 1.1.0. Have a look at Hatch versioning for guidelines.
    The version number applies to all components. Even if you have multiple components, you only need to specify one version number.
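
For orientation, here is a minimal sketch of what a file such as welcome_text_generator.py might contain. It follows the usual Haystack convention: a class decorated with @component whose run() method returns a dictionary matching the declared output types. The class body and the output name welcome_text are illustrative assumptions, not the template's actual recipe. The fallback shim is there only so that the sketch runs without Haystack installed; leave it out of a real component.

```python
try:
    from haystack import component
except ImportError:
    # Assumption: stand-in used only when Haystack is not installed,
    # so that this sketch stays runnable. Not needed inside the template.
    class _ComponentShim:
        def __call__(self, cls):
            return cls

        def output_types(self, **_types):
            return lambda func: func

    component = _ComponentShim()


@component
class WelcomeTextGenerator:
    """Hypothetical component that builds a welcome message for a name."""

    @component.output_types(welcome_text=str)
    def run(self, name: str) -> dict:
        # The returned keys must match the declared output types.
        return {"welcome_text": f"Hello {name}, welcome!"}


if __name__ == "__main__":
    print(WelcomeTextGenerator().run(name="Ada")["welcome_text"])
```

The name passed to run() becomes an input of the component, and each key of the returned dictionary becomes an output you can connect to other components.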

Components Connecting to Third-Party Providers

If your component connects to a third-party provider that requires authentication, we recommend adding the API key as a secret to keep it hidden from the code. You can do this using deepset Cloud's Secrets feature.

First, add the secret on the Secrets page:

  1. In deepset Cloud, click your initials in the top right corner and choose Secrets > Add New Secret.

  2. Give your secret the same name as the environment variable where you want to store it.

  3. Paste the API key into the Secret field and save it.

Then, add the secret to your component's __init__() method by specifying the environment variable name, which must match the secret's name:

from haystack import component
from haystack.utils import Secret

@component
class MyComponent:
    def __init__(self, api_key: Secret = Secret.from_env_var("<ENV_VAR_NAME>")):
        # The name of the environment variable must be the same as the name of the secret.
        ...

If you have multiple secrets for one provider, you can easily switch between them in your pipeline YAML by updating the secret's name:

llm:
  type: dc_custom_component.components.my_components.component1.MyComponent # the path to your component
  init_parameters:
     api_key: {"type": "env_var", "env_vars": ["ENV_VAR_NAME"], "strict": False} # uses the `ENV_VAR_NAME` secret
     # to use another secret, update its name here
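
To make the env_vars and strict fields above more concrete, the sketch below mimics in plain Python how an environment-variable-backed secret is typically resolved: the first variable that is set wins, and strict decides whether a missing value raises an error. It illustrates the idea only. resolve_env_secret is a hypothetical helper, not Haystack's implementation.

```python
import os
from typing import Optional


def resolve_env_secret(spec: dict) -> Optional[str]:
    """Return the value of the first set environment variable named in spec."""
    for name in spec["env_vars"]:
        value = os.environ.get(name)
        if value is not None:
            return value
    # Assumption: a strict secret fails loudly, a non-strict one yields None.
    if spec.get("strict", True):
        raise ValueError(f"None of these environment variables are set: {spec['env_vars']}")
    return None


if __name__ == "__main__":
    os.environ["ENV_VAR_NAME"] = "api_XXX"  # placeholder value for the demo
    spec = {"type": "env_var", "env_vars": ["ENV_VAR_NAME"], "strict": False}
    print(resolve_env_secret(spec))
```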
    

Test Your Component

When you upload your component to deepset Cloud, we verify the structure and version of the uploaded .zip file. We recommend that you test your component before uploading.

  1. Add unit and integration tests in the ./dc-custom-component-template/tests folder to verify that your component behaves as expected.
  2. Run your tests using: hatch run tests. If the tests pass, your component is ready.
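
A unit test for a hypothetical WelcomeTextGenerator component could look like the sketch below. The stand-in class keeps the example self-contained; in the template, you would import your component from the dc_custom_component package instead. The test names are assumptions, chosen so that pytest collects them by its default test_ prefix.

```python
class WelcomeTextGenerator:
    """Stand-in for the real component; import yours from the package instead."""

    def run(self, name: str) -> dict:
        return {"welcome_text": f"Hello {name}, welcome!"}


def test_run_returns_welcome_text():
    # The component should greet the given name.
    result = WelcomeTextGenerator().run(name="Ada")
    assert result == {"welcome_text": "Hello Ada, welcome!"}


def test_run_output_is_a_string():
    # Even an empty name should produce a string output.
    result = WelcomeTextGenerator().run(name="")
    assert isinstance(result["welcome_text"], str)
```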

Import the Component to deepset Cloud

There are two ways to import your component:

  • Through REST API (supported for all systems)
  • Using commands (currently supported for Linux and macOS; support for Windows is coming soon)

Import with REST API

  1. Zip the repository from the template folder. The zipped repository should contain the same files in the same hierarchy as the dc-custom-component-template repository.
    On Linux and macOS:
zip -r ../custom_component.zip ./*
    On Windows (PowerShell):
Compress-Archive -Path .\* -DestinationPath ..\custom_component.zip -Force

Either command creates a zip file called custom_component.zip in the parent directory.

  2. Upload the .zip file to deepset Cloud using the Import Custom Components endpoint. Here's sample code you can use as a starting point for your request:
curl --request POST \
--request POST \
--url https://api.cloud.deepset.ai/api/v2/custom_components \
--header 'accept: application/json' \
--header 'Authorization: Bearer api_XXX' \
--form 'file=@"/path/to/custom/component/custom_component.zip";type=application/zip'
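
If you prefer a single script that works on every operating system, the zip step can also be sketched with Python's standard zipfile module. The zip_repository helper below is hypothetical. Like the ./* shell glob, it skips hidden top-level entries such as .git, which is an assumption about what the archive should contain.

```python
import zipfile
from pathlib import Path


def zip_repository(repo_dir: str, zip_path: str) -> list:
    """Zip repo_dir into zip_path, keeping paths relative to the repository root."""
    root = Path(repo_dir).resolve()
    target = Path(zip_path).resolve()
    written = []
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            relative = path.relative_to(root)
            # Skip directories, hidden top-level entries such as .git,
            # and the archive itself if it lives inside the repository.
            if not path.is_file() or relative.parts[0].startswith(".") or path == target:
                continue
            archive.write(path, relative.as_posix())
            written.append(relative.as_posix())
    return written


if __name__ == "__main__":
    # Run from the template folder; writes the archive to the parent directory.
    print(zip_repository(".", "../custom_component.zip"))
```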

Import with Commands

This method works only on Linux and macOS systems.

  1. Set your deepset Cloud API key:
    export API_KEY=<YOUR_API_KEY>
    
  2. From the project root directory, run the following command to upload your custom component:
hatch run dc:build-and-push

This command creates a .zip file called custom_component.zip in the dist directory and uploads it to deepset Cloud.

Verify the Import

Check the component status using the Get Custom Components endpoint. If the status is finished, you can use the component in your pipelines. Here is sample code you can use:

curl --request GET \
     --url 'https://api.cloud.deepset.ai/api/v2/custom_components?limit=10&page_number=1&field=created_at&order=DESC' \
     --header 'accept: application/json' \
     --header 'Authorization: Bearer api_XXX'
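
The same request can be sketched in Python using only the standard library. The URL, query parameters, and headers mirror the curl call above; the function names are hypothetical. Because the response schema is not described here, the sketch returns the decoded JSON as is, so you can inspect the status and version values yourself.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://api.cloud.deepset.ai/api/v2/custom_components"


def build_status_request(api_key: str, limit: int = 10) -> urllib.request.Request:
    """Build the GET request for the Get Custom Components endpoint."""
    query = urllib.parse.urlencode(
        {"limit": limit, "page_number": 1, "field": "created_at", "order": "DESC"}
    )
    return urllib.request.Request(
        f"{API_URL}?{query}",
        headers={"accept": "application/json", "Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_custom_components(api_key: str):
    """Send the request and return the decoded JSON body (makes a network call)."""
    with urllib.request.urlopen(build_status_request(api_key)) as response:
        return json.load(response)
```

Call fetch_custom_components() with your deepset Cloud API key, for example one read from the API_KEY environment variable.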

Add the Component to Your Pipeline

  1. Open the pipeline where you want to add the component in Studio. (On the Pipelines page, click More actions next to the pipeline and choose Studio.)

  2. In the components library, expand the component group that contains your component.
    Tip: The group's name is the same as the name of the template folder where you saved your component. For example, if your component code is in ./dc-custom-component-template/src/dc_custom_component/rankers/my_ranker.py, you'll find it in the Rankers group.

    The rankers group in component studio with a custom component called RegexBooster highlighted
  3. Drag your component onto the canvas and configure its parameters and connections with other components.

Update a Component

To update custom components, upload their new version to deepset Cloud:

  1. Pull the latest version of the dc-custom-component-template repository.
  2. Update the component code in the ./dc-custom-component-template/src/dc_custom_component/components/<your_components_folder>/<custom_component>.py file.
  3. Update the component version in the ./dc-custom-component-template/src/dc_custom_component/__about__.py file.

📘

How do I check the current version?

If you don't know what version is currently uploaded to deepset Cloud, use the Get Custom Components endpoint. The version is listed in the version parameter of a successful response.

  4. If the component has any dependencies, add them in the dependencies section of the ./dc-custom-component-template/pyproject.toml file. Do not modify versions of dependencies already listed in this file.
  5. Upload the updated component with one of these methods:
    1. With REST API: Zip the repository and upload it using the Import Custom Components endpoint.
    2. With a command (currently only supported for Linux and macOS):
      1. Set your deepset Cloud API key: export API_KEY=<YOUR_API_KEY>
      2. Run hatch run dc:build-and-push. This creates a custom_component.zip file and uploads it to deepset Cloud.
  6. Check the upload status using the Get Custom Components endpoint.

All new pipelines automatically use the latest version of your custom components. However, running pipelines continue to use the version that was current when they were deployed. To update the component version in running pipelines, undeploy and redeploy them.
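
Since every upload needs a new version number, the version update step can be scripted. The sketch below assumes that __about__.py holds an assignment of the form __version__ = "1.1.0", which is the usual Hatch convention but is not confirmed here. bump_version is a hypothetical helper that works on the file's text.

```python
import re

# Matches an assignment such as: __version__ = "1.1.0"
VERSION_PATTERN = re.compile(r'(__version__\s*=\s*["\'])(\d+)\.(\d+)\.(\d+)(["\'])')


def bump_version(about_source: str, part: str = "micro") -> str:
    """Return about_source with its major.minor.micro version incremented."""
    match = VERSION_PATTERN.search(about_source)
    if match is None:
        raise ValueError("no major.minor.micro __version__ assignment found")
    major, minor, micro = (int(match.group(i)) for i in (2, 3, 4))
    if part == "major":
        major, minor, micro = major + 1, 0, 0
    elif part == "minor":
        minor, micro = minor + 1, 0
    elif part == "micro":
        micro += 1
    else:
        raise ValueError(f"unknown version part: {part}")
    new_version = f"{major}.{minor}.{micro}"
    # Rebuild the text, keeping the original quotes and everything around them.
    return (
        about_source[: match.start()]
        + match.group(1) + new_version + match.group(5)
        + about_source[match.end():]
    )
```

To apply it, read the file's contents, pass them to bump_version(), and write the result back before you zip and upload the repository.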

Compare Different Versions

To evaluate which version of your component performs better in a pipeline, you can upload two versions of the component simultaneously, each with a unique name.

  1. Pull the latest version of the dc-custom-component-template repository.
  2. Add two versions of the component to the ./dc-custom-component-template/src/dc_custom_component/components/<your_component_folder>/<your_component_name>.py file, treating them as separate components and giving each version a distinct name.
  3. Update the component version in the ./dc-custom-component-template/src/dc_custom_component/__about__.py file.
  4. Upload the repository with one of these methods:
    1. With REST API: Zip the repository and upload it using the Import Custom Components endpoint.
    2. With a command (currently only supported for Linux and macOS):
      1. Set your deepset Cloud API key: export API_KEY=<YOUR_API_KEY>
      2. Run hatch run dc:build-and-push. This creates a custom_component.zip file and uploads it to deepset Cloud.
  5. Check the upload status using the Get Custom Components endpoint.

Now, you can create two pipelines—one using the first version and another using the second—to compare their performance.

Troubleshooting Custom Components

If you're having issues with your component, check the pipeline logs for details. To access pipeline logs:

  1. Go to Pipelines and click the pipeline you want to troubleshoot.

  2. Click Logs on the Pipeline Details page to view all information messages, warnings, and errors your pipeline produced.

    The pipeline details page for a RAG pipeline with the logs tab opened and information messages displayed.

Related Links