NvidiaDocumentEmbedder

A component for embedding documents using embedding models provided by NVIDIA NIMs.

Basic Information

  • Type: haystack_integrations.nvidia.src.haystack_integrations.components.embedders.nvidia.document_embedder.NvidiaDocumentEmbedder

Inputs

| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | A list of Documents to embed. |

Outputs

The component returns a dictionary with the following keys and values:

| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | List of processed Documents with embeddings. |
| meta | Dict[str, Any] | | Metadata on usage statistics, etc. |

Overview

Work in Progress

Bear with us while we add pipeline examples and the most common component connections.

A component for embedding documents using embedding models provided by NVIDIA NIMs.

Usage Example

components:
  NvidiaDocumentEmbedder:
    type: nvidia.src.haystack_integrations.components.embedders.nvidia.document_embedder.NvidiaDocumentEmbedder
    init_parameters:

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
|---|---|---|---|
| model | Optional[str] | None | Embedding model to use. If no model is specified and a locally hosted API URL is provided, the component defaults to the available model found through the /models API. |
| api_key | Optional[Secret] | Secret.from_env_var('NVIDIA_API_KEY') | API key for the NVIDIA NIM. |
| api_url | str | os.getenv('NVIDIA_API_URL', DEFAULT_API_URL) | Custom API URL for the NVIDIA NIM. The format for the API URL is http://host:port |
| prefix | str | | A string to add to the beginning of each text. |
| suffix | str | | A string to add to the end of each text. |
| batch_size | int | 32 | Number of Documents to encode at once. Cannot be greater than 50. |
| progress_bar | bool | True | Whether to show a progress bar or not. |
| meta_fields_to_embed | Optional[List[str]] | None | List of meta fields that should be embedded along with the Document text. |
| embedding_separator | str | \n | Separator used to concatenate the meta fields to the Document text. |
| truncate | Optional[Union[EmbeddingTruncateMode, str]] | None | Specifies how inputs longer than the maximum token length should be truncated. If None, the behavior is model-dependent; see the official documentation for more information. |
| timeout | Optional[float] | None | Timeout for request calls. If not set, it is inferred from the NVIDIA_TIMEOUT environment variable or set to 60 by default. |
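
To make the interplay of prefix, suffix, meta_fields_to_embed, and embedding_separator more concrete, here is a minimal sketch of how the text sent to the embedding model could be assembled for one Document. It is an illustration of the descriptions above, not the component's actual code. The Doc class and the text_to_embed function are hypothetical stand-ins, and two details are assumptions: that prefix and suffix default to an empty string, and that meta values are placed before the Document text.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a Haystack Document (content plus meta)."""
    content: Optional[str] = None
    meta: Dict[str, Any] = field(default_factory=dict)


def text_to_embed(
    doc: Doc,
    prefix: str = "",  # assumed default: empty string
    suffix: str = "",  # assumed default: empty string
    meta_fields_to_embed: Optional[List[str]] = None,
    embedding_separator: str = "\n",
) -> str:
    """Build the string that would be embedded for a single document."""
    # Keep only the requested meta fields that are present and not None.
    meta_values = [
        str(doc.meta[key])
        for key in (meta_fields_to_embed or [])
        if doc.meta.get(key) is not None
    ]
    # Assumed order: meta values first, then the document text,
    # all joined by the separator.
    body = embedding_separator.join([*meta_values, doc.content or ""])
    return prefix + body + suffix


doc = Doc(content="Some document text.", meta={"title": "Overview", "page": 3})
print(repr(text_to_embed(doc, prefix="passage: ", meta_fields_to_embed=["title"])))
# 'passage: Overview\nSome document text.'
```

In this sketch, meta fields that are not listed in meta_fields_to_embed (here, page) do not affect the embedded text, and a listed field that is missing from a Document is skipped.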

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | A list of Documents to embed. |
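
Since batch_size limits how many Documents are encoded at once, a list passed to run() is presumably split into consecutive batches before being sent to the model. The sketch below shows one way such batching could look. The iter_batches helper and the MAX_BATCH_SIZE constant are hypothetical names, and raising an error for an out-of-range batch_size is a choice made for this sketch; the component itself may handle that case differently.

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 50  # upper bound stated for batch_size in the init parameters


def iter_batches(items: Sequence[str], batch_size: int = 32) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError(
            f"batch_size must be between 1 and {MAX_BATCH_SIZE}, got {batch_size}"
        )
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


texts = [f"text {i}" for i in range(70)]
batch_lengths = [len(batch) for batch in iter_batches(texts)]
print(batch_lengths)  # [32, 32, 6]
```

With the default batch_size of 32, 70 texts end up in three batches, the last one partially filled.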