STACKITDocumentEmbedder
Compute document embeddings using STACKIT as the model provider. The embedding of each document is stored in the embedding field of the Document object.
Key Features
- Computes dense vector embeddings for documents using STACKIT embedding models.
- Embeds documents in configurable batch sizes for efficient processing.
- Optionally embeds metadata fields along with document content.
- Configurable prefix and suffix for text preprocessing.
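
To make the metadata and prefix/suffix features concrete, here is a minimal sketch of how the text sent to the embedding model could be composed. The helper `build_text_to_embed` is hypothetical and not part of the integration. The ordering (metadata values first, then content) is an assumption for illustration. The parameter names mirror the init parameters documented on this page.

```python
from typing import Any, Dict, List, Optional


def build_text_to_embed(
    content: str,
    meta: Dict[str, Any],
    meta_fields_to_embed: Optional[List[str]] = None,
    embedding_separator: str = "\n",
    prefix: str = "",
    suffix: str = "",
) -> str:
    """Hypothetical helper: compose the string that gets embedded for one document."""
    # Keep only the requested metadata fields that are present and not None.
    meta_values = [
        str(meta[key])
        for key in (meta_fields_to_embed or [])
        if meta.get(key) is not None
    ]
    # Assumption: metadata values come first, then the content, joined by the separator.
    body = embedding_separator.join(meta_values + [content])
    # The prefix and suffix wrap the combined text.
    return prefix + body + suffix


text = build_text_to_embed(
    content="STACKIT hosts embedding models.",
    meta={"title": "Model serving", "page": 3},
    meta_fields_to_embed=["title"],
    prefix="passage: ",
)
print(text)
```

In this sketch only `title` is embedded with the content; `page` is ignored because it is not listed in `meta_fields_to_embed`.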
Configuration
- Drag the STACKITDocumentEmbedder component onto the canvas from the Component Library.
- Click the component to open the configuration panel.
- On the General tab:
  - Connect Haystack Platform to your STACKIT account by creating a secret called STACKIT_API_KEY. For more information about secrets, see Secrets.
  - Select the embedding model to use.
- Go to the Advanced tab to configure timeout, max_retries, http_client_kwargs, prefix, suffix, and metadata embedding options.
Connections
STACKITDocumentEmbedder receives documents to embed from PreProcessors like DocumentSplitter. It sends the embedded documents to DocumentWriter, which writes them into a document store.
Source Code
To check this component's source code, open document_embedder.py in the Haystack Core Integrations repository.
Usage Examples
Basic Configuration
STACKITDocumentEmbedder:
  type: stackit.src.haystack_integrations.components.embedders.stackit.document_embedder.STACKITDocumentEmbedder
  init_parameters: {}
Use this component in indexing pipelines. Connect a preprocessor like DocumentSplitter to its documents input, and connect its documents output to DocumentWriter.
components:
  STACKITDocumentEmbedder:
    type: stackit.src.haystack_integrations.components.embedders.stackit.document_embedder.STACKITDocumentEmbedder
    init_parameters:
Parameters
Inputs
| Parameter | Type | Description |
|---|---|---|
| documents | List[Document] | A list of documents to embed. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| documents | List[Document] | Documents with embeddings stored in the embedding field. |
| meta | Dict[str, Any] | Information about the embedding operation. |
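
The relationship between the input list, batching, and the two outputs can be sketched as follows. This is an illustration under stated assumptions, not the component's implementation: `embed_in_batches` and `fake_embed_batch` are hypothetical names, documents are represented as plain dictionaries, and the `num_requests` key in `meta` is invented for the example (this page does not specify what `meta` contains).

```python
from typing import Any, Callable, Dict, List


def embed_in_batches(
    texts: List[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 32,
) -> Dict[str, Any]:
    """Hypothetical sketch: embed texts batch by batch and attach one vector per text."""
    embeddings: List[List[float]] = []
    num_requests = 0
    # Walk through the input in slices of at most batch_size items.
    for start in range(0, len(texts), batch_size):
        batch = texts[start : start + batch_size]
        embeddings.extend(embed_batch(batch))
        num_requests += 1
    # Pair every text with its embedding, preserving the input order.
    documents = [{"content": t, "embedding": e} for t, e in zip(texts, embeddings)]
    # Assumption: the meta key below is made up for illustration only.
    return {"documents": documents, "meta": {"num_requests": num_requests}}


def fake_embed_batch(batch: List[str]) -> List[List[float]]:
    # Stand-in for the model call: returns one 2-dimensional vector per text.
    return [[float(len(text)), 1.0] for text in batch]


result = embed_in_batches(["a", "bb", "ccc"], fake_embed_batch, batch_size=2)
print(result["meta"]["num_requests"])  # 2: three texts split into batches of 2 and 1
```

A smaller `batch_size` means more requests with less data each; a larger one means fewer, bigger requests.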
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | Secret | Secret.from_env_var('STACKIT_API_KEY') | The STACKIT API key. |
| model | str | | The name of the model to use. |
| api_base_url | Optional[str] | https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1 | The STACKIT API base URL. For more details, see STACKIT docs. |
| prefix | str | | A string to add to the beginning of each text. |
| suffix | str | | A string to add to the end of each text. |
| batch_size | int | 32 | Number of Documents to encode at once. |
| progress_bar | bool | True | Whether to show a progress bar. Disabling it can help keep logs clean in production deployments. |
| meta_fields_to_embed | Optional[List[str]] | None | List of meta fields that should be embedded along with the Document text. |
| embedding_separator | str | \n | Separator used to concatenate the meta fields to the Document text. |
| timeout | Optional[float] | None | Timeout for STACKIT client calls. If not set, it defaults to either the OPENAI_TIMEOUT environment variable, or 30 seconds. |
| max_retries | Optional[int] | None | Maximum number of retries to contact STACKIT after an internal error. If not set, it defaults to either the OPENAI_MAX_RETRIES environment variable, or 5. |
| http_client_kwargs | Optional[Dict[str, Any]] | None | A dictionary of keyword arguments to configure a custom httpx.Client or httpx.AsyncClient. For more information, see the HTTPX documentation. |
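
The fallback order described for timeout and max_retries (explicit value, then environment variable, then built-in default) can be sketched like this. The functions `resolve_timeout` and `resolve_max_retries` are hypothetical helpers written for illustration; only the environment variable names and the defaults of 30 seconds and 5 retries come from the table above.

```python
import os
from typing import Optional


def resolve_timeout(timeout: Optional[float] = None) -> float:
    """Hypothetical helper: pick the effective timeout in seconds."""
    # An explicitly configured value always wins.
    if timeout is not None:
        return timeout
    # Otherwise use OPENAI_TIMEOUT if it is set, else fall back to 30 seconds.
    return float(os.environ.get("OPENAI_TIMEOUT", "30.0"))


def resolve_max_retries(max_retries: Optional[int] = None) -> int:
    """Hypothetical helper: pick the effective number of retries."""
    if max_retries is not None:
        return max_retries
    # Otherwise use OPENAI_MAX_RETRIES if it is set, else fall back to 5.
    return int(os.environ.get("OPENAI_MAX_RETRIES", "5"))


# An explicit value is returned unchanged, regardless of the environment.
print(resolve_timeout(10.0))
print(resolve_max_retries(2))
```

Leaving both parameters unset lets you control the values per deployment through environment variables without changing the pipeline.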
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
This component has no run() method parameters.