NvidiaDocumentEmbedder
A component for embedding documents using embedding models provided by NVIDIA NIMs.
Basic Information
- Type: haystack_integrations.nvidia.src.haystack_integrations.components.embedders.nvidia.document_embedder.NvidiaDocumentEmbedder
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | A list of Documents to embed. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | List of processed Documents with embeddings. |
| meta | Dict[str, Any] | | Metadata on usage statistics, etc. |
Overview
Work in Progress
Bear with us while we work on adding pipeline examples and the most common component connections.
A component for embedding documents using embedding models provided by NVIDIA NIMs.
Usage Example
components:
  NvidiaDocumentEmbedder:
    type: nvidia.src.haystack_integrations.components.embedders.nvidia.document_embedder.NvidiaDocumentEmbedder
    init_parameters:
Parameters
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | Optional[str] | None | Embedding model to use. If you provide a locally hosted API URL without a specific model, the component defaults to the available model found through the /models API. |
| api_key | Optional[Secret] | Secret.from_env_var('NVIDIA_API_KEY') | API key for the NVIDIA NIM. |
| api_url | str | os.getenv('NVIDIA_API_URL', DEFAULT_API_URL) | Custom API URL for the NVIDIA NIM. Format for API URL is http://host:port |
| prefix | str | | A string to add to the beginning of each text. |
| suffix | str | | A string to add to the end of each text. |
| batch_size | int | 32 | Number of Documents to encode at once. Cannot be greater than 50. |
| progress_bar | bool | True | Whether to show a progress bar or not. |
| meta_fields_to_embed | Optional[List[str]] | None | List of meta fields that should be embedded along with the Document text. |
| embedding_separator | str | \n | Separator used to concatenate the meta fields to the Document text. |
| truncate | Optional[Union[EmbeddingTruncateMode, str]] | None | Specifies how inputs longer than the maximum token length should be truncated. If None, the behavior is model-dependent. See the official documentation for more information. |
| timeout | Optional[float] | None | Timeout for request calls. If not set, it is inferred from the NVIDIA_TIMEOUT environment variable, or set to 60 by default. |
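To make the interplay of these parameters concrete, here is a minimal sketch of how `prefix`, `suffix`, `meta_fields_to_embed`, `embedding_separator`, `batch_size`, and `timeout` could behave as described in the table above. It is an illustration in plain Python, not the component's actual implementation: the `Doc` class and the helper names `build_text`, `batched`, and `resolve_timeout` are hypothetical, and the choice to raise `ValueError` for an invalid batch size is an assumption.

```python
import os
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a Haystack Document (content plus meta)."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def build_text(
    doc: Doc,
    prefix: str = "",
    suffix: str = "",
    meta_fields_to_embed: Optional[List[str]] = None,
    embedding_separator: str = "\n",
) -> str:
    """Join the selected meta values and the content, then wrap with prefix/suffix."""
    fields = meta_fields_to_embed or []
    # Skip meta fields that are missing or set to None.
    meta_values = [str(doc.meta[k]) for k in fields if doc.meta.get(k) is not None]
    return prefix + embedding_separator.join([*meta_values, doc.content]) + suffix


def batched(texts: List[str], batch_size: int = 32) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size texts."""
    # Assumption: an out-of-range batch size is rejected with ValueError.
    if batch_size < 1 or batch_size > 50:
        raise ValueError("batch_size must be between 1 and 50")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def resolve_timeout(timeout: Optional[float] = None) -> float:
    """Use the explicit value, else NVIDIA_TIMEOUT, else 60."""
    if timeout is not None:
        return timeout
    return float(os.environ.get("NVIDIA_TIMEOUT", "60"))
```

With these helpers, `build_text(Doc("Body text.", {"title": "Intro"}), prefix="passage: ", meta_fields_to_embed=["title"])` produces the title and the content joined by a newline, with `passage: ` prepended. Embedding 70 such texts with the default `batch_size` would take three requests of 32, 32, and 6 texts.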
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | A list of Documents to embed. |