STACKITDocumentEmbedder

Compute document embeddings using STACKIT as the model provider. The embedding of each document is stored in the embedding field of the Document object.

Key Features

  • Computes dense vector embeddings for documents using STACKIT embedding models.
  • Embeds documents in configurable batch sizes for efficient processing.
  • Optionally embeds metadata fields along with document content.
  • Configurable prefix and suffix for text preprocessing.
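
To illustrate the batching feature, here is a minimal sketch in plain Python of how a list of texts could be split into batches before being sent to an embedding endpoint. The helper name `iter_batches` is hypothetical and not part of the component's API. The default of 32 mirrors the documented `batch_size` default, but the component's actual implementation may differ.

```python
from typing import Iterator, List, Sequence


def iter_batches(texts: Sequence[str], batch_size: int = 32) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `batch_size` texts (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(texts), batch_size):
        yield list(texts[start:start + batch_size])


# 70 texts with batch_size=32 give two full batches and one partial batch.
batches = list(iter_batches([f"doc {i}" for i in range(70)], batch_size=32))
batch_lengths = [len(batch) for batch in batches]
```

A larger batch size means fewer requests per indexing run, at the cost of larger request payloads.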

Configuration

  1. Drag the STACKITDocumentEmbedder component onto the canvas from the Component Library.
  2. Click on the component to open the configuration panel.
  3. On the General tab:
    • Connect Haystack Platform to your STACKIT account by creating a secret called STACKIT_API_KEY. For more information about secrets, see Secrets.
    • Select the embedding model to use.
  4. Go to the Advanced tab to configure timeout, max_retries, http_client_kwargs, prefix, suffix, and metadata embedding options.
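
As a rough illustration of how unset `timeout` and `max_retries` values might be resolved, the sketch below applies the fallback order given in the Init Parameters descriptions: explicit value first, then the environment variable, then the built-in default. The function names are hypothetical, and the component's internal logic is assumed rather than copied from its source.

```python
import os
from typing import Optional


def resolve_timeout(timeout: Optional[float] = None) -> float:
    """Explicit value, else OPENAI_TIMEOUT, else 30 seconds (assumed fallback order)."""
    if timeout is not None:
        return timeout
    return float(os.environ.get("OPENAI_TIMEOUT", "30.0"))


def resolve_max_retries(max_retries: Optional[int] = None) -> int:
    """Explicit value, else OPENAI_MAX_RETRIES, else 5 (assumed fallback order)."""
    if max_retries is not None:
        return max_retries
    return int(os.environ.get("OPENAI_MAX_RETRIES", "5"))
```

Note that an explicit `0` for `max_retries` is kept as is in this sketch, because only `None` triggers the fallback.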

Connections

STACKITDocumentEmbedder receives documents to embed from PreProcessors like DocumentSplitter. It sends the embedded documents to DocumentWriter, which writes them into a document store.

Source Code

To check this component's source code, open document_embedder.py in the Haystack Core Integrations repository.

Usage Examples

Basic Configuration

  STACKITDocumentEmbedder:
    type: stackit.src.haystack_integrations.components.embedders.stackit.document_embedder.STACKITDocumentEmbedder
    init_parameters: {}

Use this component in indexing pipelines. Connect a preprocessor like DocumentSplitter to its documents input, and connect its documents output to DocumentWriter.

  components:
    STACKITDocumentEmbedder:
      type: stackit.src.haystack_integrations.components.embedders.stackit.document_embedder.STACKITDocumentEmbedder
      init_parameters:

Parameters

Inputs

| Parameter | Type | Description |
| --- | --- | --- |
| documents | List[Document] | A list of documents to embed. |

Outputs

| Parameter | Type | Description |
| --- | --- | --- |
| documents | List[Document] | Documents with embeddings stored in the embedding field. |
| meta | Dict[str, Any] | Information about the embedding operation. |

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | Secret | Secret.from_env_var('STACKIT_API_KEY') | The STACKIT API key. |
| model | str | | The name of the model to use. |
| api_base_url | Optional[str] | https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1 | The STACKIT API base URL. For more details, see STACKIT docs. |
| prefix | str | | A string to add to the beginning of each text. |
| suffix | str | | A string to add to the end of each text. |
| batch_size | int | 32 | Number of documents to encode at once. |
| progress_bar | bool | True | Whether to show a progress bar. Disabling it can help keep logs clean in production deployments. |
| meta_fields_to_embed | Optional[List[str]] | None | List of meta fields to embed along with the document text. |
| embedding_separator | str | \n | Separator used to concatenate the meta fields to the document text. |
| timeout | Optional[float] | None | Timeout for STACKIT client calls. If not set, it defaults to the OPENAI_TIMEOUT environment variable, or to 30 seconds if that variable is not set. |
| max_retries | Optional[int] | None | Maximum number of retries to contact STACKIT after an internal error. If not set, it defaults to the OPENAI_MAX_RETRIES environment variable, or to 5 if that variable is not set. |
| http_client_kwargs | Optional[Dict[str, Any]] | None | A dictionary of keyword arguments to configure a custom httpx.Client or httpx.AsyncClient. For more information, see the HTTPX documentation. |
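
The sketch below shows one way `prefix`, `suffix`, `meta_fields_to_embed`, and `embedding_separator` could combine into the final text that gets embedded. It assumes the selected meta values come before the document content and that missing or `None` meta values are skipped. That ordering is an assumption based on common Haystack embedder behavior, not something confirmed on this page. The function name `build_text_to_embed` is hypothetical.

```python
from typing import Any, Dict, List, Optional


def build_text_to_embed(
    content: str,
    meta: Dict[str, Any],
    meta_fields_to_embed: Optional[List[str]] = None,
    embedding_separator: str = "\n",
    prefix: str = "",
    suffix: str = "",
) -> str:
    """Join selected meta values and the content, then wrap with prefix and suffix.

    Assumption: meta values precede the content; absent or None values are skipped.
    """
    fields = meta_fields_to_embed or []
    meta_values = [str(meta[key]) for key in fields if meta.get(key) is not None]
    return prefix + embedding_separator.join(meta_values + [content]) + suffix


text = build_text_to_embed(
    content="STACKIT hosts embedding models.",
    meta={"title": "Model Serving", "page": 3},
    meta_fields_to_embed=["title", "author"],  # "author" is absent and is skipped
    prefix="passage: ",
)
```

With these inputs, `text` is the prefix, then the title, then a newline, then the content. The `page` field is ignored because it is not listed in `meta_fields_to_embed`.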

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

This component has no run() method parameters.