STACKITDocumentEmbedder
Computes document embeddings using STACKIT as the model provider and stores them in each document's embedding field.
Key Features
- Computes embeddings using STACKIT's OpenAI-compatible embedding API.
- Processes documents in batches with an optional progress bar.
- Embeds metadata fields alongside document content.
- Adds optional prefix and suffix strings to each document before embedding.
- Supports configurable timeouts, retries, and custom HTTP client settings.
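The prefix, suffix, metadata, and batching features can be pictured with a small standalone sketch. It does not call STACKIT. The helper names `build_text_to_embed` and `batched` and the dict-based documents are hypothetical. The concatenation order (prefix, meta values, content, suffix) is an assumption, not confirmed behavior of this component. The parameter names follow the Init Parameters table.

```python
from typing import Any, Dict, Iterator, List, Optional


def build_text_to_embed(
    doc: Dict[str, Any],
    prefix: str = "",
    suffix: str = "",
    meta_fields_to_embed: Optional[List[str]] = None,
    embedding_separator: str = "\n",
) -> str:
    """Join selected meta values and the content, then wrap with prefix and suffix.

    Assumption: meta values come before the content; the real component may differ.
    """
    meta = doc.get("meta", {})
    meta_values = [
        str(meta[key])
        for key in (meta_fields_to_embed or [])
        if meta.get(key) is not None
    ]
    body = embedding_separator.join(meta_values + [doc.get("content") or ""])
    return prefix + body + suffix


def batched(items: List[str], batch_size: int = 32) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start : start + batch_size]


# Hypothetical documents represented as plain dicts.
docs = [
    {"content": "STACKIT hosts embedding models.", "meta": {"title": "Overview"}},
    {"content": "Embeddings power semantic search.", "meta": {}},
]
texts = [
    build_text_to_embed(d, prefix="passage: ", meta_fields_to_embed=["title"])
    for d in docs
]
# With batch_size=1, each text would go to the API in its own request.
batches = list(batched(texts, batch_size=1))
```

A larger `batch_size` means fewer requests per indexing run, at the cost of larger payloads per request.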
Configuration
You need a STACKIT API key to use this component. Create a secret called STACKIT_API_KEY in your workspace. For more information, see Add Secrets.
- Drag the STACKITDocumentEmbedder component onto the canvas from the Component Library.
- Click the component to open the configuration panel.
- On the General tab:
  - Enter the name of the STACKIT embedding model to use.
- Go to the Advanced tab to configure the API key, API base URL, timeout, maximum retries, and HTTP client settings.
Connections
STACKITDocumentEmbedder accepts a list of documents as input. It outputs the same documents with embeddings stored in the embedding field.
Use this component in indexing pipelines. Connect a preprocessor like DocumentSplitter to its documents input, and connect its documents output to DocumentWriter.
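The input and output contract described above can be sketched without any pipeline framework. Everything here is a stand-in: `Doc`, `fake_embed`, and `run_embedder` are hypothetical names, and the placeholder vector replaces the real STACKIT API call. The point is only that the same documents come back with their embedding field filled in.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a pipeline document."""

    content: str
    meta: Dict[str, Any] = field(default_factory=dict)
    embedding: Optional[List[float]] = None


def fake_embed(text: str) -> List[float]:
    """Placeholder vector; a real embedder would call the STACKIT API here."""
    return [float(len(text)), float(text.count(" "))]


def run_embedder(documents: List[Doc]) -> Dict[str, List[Doc]]:
    """Return the same documents, each with its embedding field filled in."""
    for doc in documents:
        doc.embedding = fake_embed(doc.content)
    return {"documents": documents}


# Chunks as they might arrive from a splitter step.
chunks = [Doc("first chunk"), Doc("second chunk of text")]
result = run_embedder(chunks)
# result["documents"] is what a writer step would receive next.
```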
Usage Example
components:
  STACKITDocumentEmbedder:
    type: stackit.src.haystack_integrations.components.embedders.stackit.document_embedder.STACKITDocumentEmbedder
    init_parameters:
Parameters
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | The documents to embed. |
Outputs
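| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | The input documents with embeddings stored in the embedding field. |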
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | Secret | Secret.from_env_var('STACKIT_API_KEY') | The STACKIT API key. |
| model | str | | The name of the model to use. |
| api_base_url | Optional[str] | https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1 | The STACKIT API base URL. For more details, see the STACKIT docs. |
| prefix | str | | A string to add to the beginning of each text. |
| suffix | str | | A string to add to the end of each text. |
| batch_size | int | 32 | Number of Documents to encode at once. |
| progress_bar | bool | True | Whether to show a progress bar. Disabling it in production deployments helps keep the logs clean. |
| meta_fields_to_embed | Optional[List[str]] | None | List of meta fields that should be embedded along with the Document text. |
| embedding_separator | str | \n | Separator used to concatenate the meta fields to the Document text. |
| timeout | Optional[float] | None | Timeout for STACKIT client calls. If not set, it defaults to the OPENAI_TIMEOUT environment variable or, if that is not set either, to 30 seconds. |
| max_retries | Optional[int] | None | Maximum number of retries to contact STACKIT after an internal error. If not set, it defaults to the OPENAI_MAX_RETRIES environment variable or, if that is not set either, to 5. |
| http_client_kwargs | Optional[Dict[str, Any]] | None | A dictionary of keyword arguments to configure a custom httpx.Client or httpx.AsyncClient. For more information, see the HTTPX documentation. |
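The fallback order described for timeout and max_retries can be sketched as follows. The helper names `resolve_timeout` and `resolve_max_retries` are hypothetical and are not part of the component. The sketch only illustrates the documented precedence: an explicit value first, then the environment variable, then the built-in default.

```python
import os
from typing import Optional


def resolve_timeout(timeout: Optional[float] = None) -> float:
    """Explicit value first, then OPENAI_TIMEOUT, then 30 seconds."""
    if timeout is not None:
        return timeout
    return float(os.environ.get("OPENAI_TIMEOUT", "30.0"))


def resolve_max_retries(max_retries: Optional[int] = None) -> int:
    """Explicit value first, then OPENAI_MAX_RETRIES, then 5."""
    if max_retries is not None:
        return max_retries
    return int(os.environ.get("OPENAI_MAX_RETRIES", "5"))


# An explicit value always wins over the environment and the default.
explicit_timeout = resolve_timeout(10.0)
explicit_retries = resolve_max_retries(2)
```

Leaving both parameters at None in the Advanced tab therefore lets an environment variable control them per deployment.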
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | The documents to embed. |