Create a Custom Component
Create components tailored specifically to your use case and use them in your pipelines.
About This Task
A component is a Python code snippet that follows our template and performs a specific task on your data. When you create and upload a custom component to deepset Cloud, it becomes accessible to your entire organization. Any member can then use the component in their pipelines.
Custom components are based on Haystack components. Haystack is deepset's open source AI framework, which also powers deepset Cloud. To learn more, visit the Haystack website.
We provide a template for creating your custom components, which is available as a GitHub repository. This template serves as a custom components library for your organization. Components created in the ./dc-custom-component-template/src/dc_custom_component/example_components/ folder and imported into deepset Cloud are the components you can use in your pipelines. Only the components present in the most recently uploaded template are available for use.
For example, if someone in your organization creates a component called WelcomeTextGenerator and uploads it to deepset Cloud, everyone in the organization can use it. However, if later someone adds two new components, GoodbyeTextGenerator and CharacterSplitter, and deletes WelcomeTextGenerator, only the new components will be available to use in your pipelines. WelcomeTextGenerator will no longer be accessible.
Currently, you can't delete custom components.
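The replacement behavior described above can be pictured with a small model. This is only an illustrative sketch: the upload_template function and the sets it works with are hypothetical and are not part of the deepset Cloud API.

```python
def upload_template(previously_available: set[str], uploaded: set[str]) -> set[str]:
    """Hypothetical model: the newest upload replaces the library, it is not merged."""
    # previously_available is intentionally ignored: only the latest template counts.
    return set(uploaded)


available = upload_template(set(), {"WelcomeTextGenerator"})
available = upload_template(available, {"GoodbyeTextGenerator", "CharacterSplitter"})
print(sorted(available))  # ['CharacterSplitter', 'GoodbyeTextGenerator']
```

In this model, keeping WelcomeTextGenerator available would mean leaving it in the template when you add the two new components.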
Prerequisites
- Read the following resources before completing this task:
- You should have a basic understanding of working with GitHub repositories.
- Generate a deepset Cloud API key to upload the created component to your workspace. For instructions, see Generate an API Key.
Create a Component
Prepare the Template
- Fork the dc-custom-component-template GitHub repository. This will let you version control your changes.
- Navigate to the directory where you cloned the repository and open the ./dc-custom-component-template/src/dc_custom_component/example_components/ directory. The preprocessors and rankers folders are examples you can modify or remove as needed.
- Create a new folder or rename one of the example folders to match your custom component's name and open it. There's a .py file inside. This is where you'll write your component code. You can rename this file as well.
  Example:
  To create a custom component called WelcomeTextGenerator:
  - Rename ./dc-custom-component-template/src/dc_custom_component/example_components/preprocessors to ./dc-custom-component-template/src/dc_custom_component/components/generators.
  - Open the generators folder and rename the example file character_splitter.py to welcome_text_generator.py.
  - Delete the rankers folder if it's not needed. If you're creating multiple components, use the folder structure to keep them organized.
  Note: You can create all your components in one .py file or you can have a separate folder with a .py file in it for each custom component. That's up to you.
Component Folder as Group Name
The folder name containing your component code becomes the component group name in Pipeline Builder. For example, if you place your component in ./dc-custom-component-template/src/dc_custom_component/components/generators, it will appear in the Generators group in the Pipeline Builder component library.
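If you prefer to script the renaming from the example above, it could be sketched as follows. The prepare_generators_folder helper is hypothetical and not part of the template. It assumes that components_root points at the folder in your local checkout that holds the example preprocessors and rankers folders.

```python
import shutil
from pathlib import Path


def prepare_generators_folder(components_root: Path) -> Path:
    """Rename the example folders for a WelcomeTextGenerator component.

    components_root is assumed to contain the example "preprocessors"
    and "rankers" folders from the template.
    """
    generators = components_root / "generators"
    # Rename the example folder; its name becomes the component group name.
    (components_root / "preprocessors").rename(generators)
    # Rename the example file to match the new component.
    (generators / "character_splitter.py").rename(generators / "welcome_text_generator.py")
    # Remove the second example folder if it exists and is not needed.
    shutil.rmtree(components_root / "rankers", ignore_errors=True)
    return generators / "welcome_text_generator.py"
```

The function returns the path of the renamed .py file, which is where the component code goes. Doing the same steps by hand in a file manager or with git mv works equally well.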
Set Up a Virtual Environment
Creating a virtual environment isolates your project's dependencies from other Python projects and your system Python installation.
The template uses Hatch, a Python project manager, to set up virtual environments. For details, see the Hatch website.
- Install Hatch by running: pip install hatch. This installs all the necessary packages, including pytest.
- Create a virtual environment by running: hatch shell.
Implement the Component
- Write the component code in the .py file. Use the recipe below as a starting point:
- If your component has dependencies, add them in the ./dc-custom-component-template/pyproject.toml file in the dependencies section:
dependencies = [
"haystack-ai>=2.0.0"
]
Note: Do not modify versions of dependencies already listed in this file.
- From the project root directory, run the hatch run code-quality:all command to format your code.
- Update the component version in the ./dc-custom-component-template/src/dc_custom_component/__about__.py file. You can specify version numbers in any way you like, but we suggest you adopt the major.minor.micro format, for example, 1.1.0. Have a look at Hatch versioning for guidelines.
The version number applies to all components. Even if you have multiple components, you only need to specify one version number.
Components Connecting to Third-Party Providers
If your component connects to a third-party provider that requires authentication, store the API key in an environment variable and then retrieve it in the component's run() method.
The following component shows how to do this. First, you add the env_var_name init parameter and then you retrieve this value in the run() method using os.getenv(self.env_var_name). The retrieved value is stored in loaded_api_key.
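import os
from typing import Dict, List

from haystack import component
from haystack.dataclasses import GeneratedAnswer

@component
class CustomComponent:
    def __init__(self, env_var_name: str = "API_KEY"):
        self.env_var_name = env_var_name

    @component.output_types(answers=List[GeneratedAnswer])
    def run(self, query: str) -> Dict[str, List[GeneratedAnswer]]:
        # Read the API key from the environment at run time
        loaded_api_key = os.getenv(self.env_var_name)
        # use the key
        pass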
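Note that os.getenv returns None when the variable is not set, so a component may want to fail early instead of calling the provider with a missing key. The following is a minimal sketch of that check. The load_api_key helper is hypothetical and is not part of the template or of Haystack.

```python
import os


def load_api_key(env_var_name: str = "API_KEY") -> str:
    """Return the API key stored in the given environment variable."""
    api_key = os.getenv(env_var_name)
    if not api_key:
        # Fail early with a clear message instead of sending an empty key.
        raise ValueError(f"Environment variable {env_var_name} is not set.")
    return api_key
```

A component could call load_api_key(self.env_var_name) at the start of run() in place of the bare os.getenv call.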
Test Your Component
We recommend that you test your component before uploading.
- Add unit and integration tests in the ./dc-custom-component-template/tests folder to ensure everything works fine.
- Run your tests using: hatch run tests. If the tests pass, your component is ready.
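A unit test for a component can be as small as the sketch below. The file name and the stand-in WelcomeTextGenerator class are assumptions made to keep the example self-contained. In the template, you would import your own component from the dc_custom_component package instead of defining it in the test file.

```python
# Hypothetical test file, for example tests/test_welcome_text_generator.py.
# The class below is a stand-in; replace it with an import of your component.
class WelcomeTextGenerator:
    def run(self, name: str) -> dict:
        return {"welcome_text": f"Hello {name}!".upper()}


def test_run_returns_upper_case_text():
    result = WelcomeTextGenerator().run(name="Ada")
    assert result == {"welcome_text": "HELLO ADA!"}


def test_run_output_has_expected_keys():
    result = WelcomeTextGenerator().run(name="Ada")
    assert set(result) == {"welcome_text"}
```

pytest collects functions whose names start with test_ from files whose names start with test_, so no extra registration is needed.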
What to Do Next
Upload your component to deepset Cloud to use it in your pipelines.