
NvidiaTextEmbedder

Embeds a query string using NVIDIA models and returns a vector for use in semantic search.

Embedding Models in Query Pipelines and Indexes

The embedding model you use to embed documents in your indexing pipeline must be the same as the embedding model you use to embed the query in your query pipeline.

This means the embedders for your indexing and query pipelines must match. For example, if you use CohereDocumentEmbedder to embed your documents, you should use CohereTextEmbedder with the same model to embed your queries.
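The reason the models must match: embeddings from different models live in different vector spaces, often with different dimensions, so similarity scores across models are meaningless. A minimal illustration in plain Python (mock vectors only, no NVIDIA API calls):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors from different embedding models may not even share a dimension")
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Query and document embedded with the *same* (mock) model: comparable.
query_vec = [0.1, 0.9, 0.2]
doc_vec = [0.15, 0.85, 0.25]
print(round(cosine_similarity(query_vec, doc_vec), 3))

# A vector from a different model (here, a different dimension) cannot be compared.
other_model_vec = [0.3, 0.3, 0.3, 0.1]
try:
    cosine_similarity(query_vec, other_model_vec)
except ValueError as err:
    print("error:", err)
```

Even when two models happen to share a dimension, their vectors are not comparable; matching the model name in both embedders is the only safe rule.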

Key Features

  • Embeds query strings using NVIDIA embedding models.
  • Supports both self-hosted NVIDIA NIM models and NVIDIA API Catalog models.
  • Returns the query as a vector for use with embedding retrievers.
  • Configurable prefix and suffix text for instruction-following models.
  • Configurable truncation mode for inputs that exceed the maximum token length.
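The prefix, suffix, and truncation features can be sketched in plain Python. This mock mirrors the general behavior (the "query: " instruction prefix is an illustrative convention used by some instruction-following models, not an NvidiaTextEmbedder default; its prefix and suffix default to empty strings, and real truncation operates on model tokens, not whitespace words):

```python
def prepare_text(text: str, prefix: str = "", suffix: str = "") -> str:
    """Compose the final string sent to the embedding model."""
    return f"{prefix}{text}{suffix}"

def truncate_text(text: str, max_tokens: int, mode: str = "END") -> str:
    """Illustrative whitespace 'tokenizer'; mode='END' drops tokens from the
    end of the input, mode='START' drops them from the beginning."""
    tokens = text.split()
    if len(tokens) <= max_tokens:
        return text
    kept = tokens[:max_tokens] if mode == "END" else tokens[-max_tokens:]
    return " ".join(kept)

# Hypothetical instruction prefix for an instruction-following model.
print(prepare_text("What is semantic search?", prefix="query: "))
print(truncate_text("a very long query that exceeds the limit", max_tokens=4))
```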

Configuration

Authentication

Connect Haystack Platform to NVIDIA on the Integrations page. For detailed instructions, see Use NVIDIA Models.

  1. Drag the NvidiaTextEmbedder component onto the canvas from the Component Library.
  2. Click the component to open the configuration panel.
  3. On the General tab:
    1. Enter the name of the NVIDIA embedding model to use, such as nvidia/nv-embedqa-e5-v5.
  4. Go to the Advanced tab to configure the API key, API URL, prefix and suffix text, and truncation mode.

Connections

NvidiaTextEmbedder accepts a text string as input and outputs an embedding (list of floats) and a meta dictionary with usage statistics.

Connect the Input component to its text input. Connect its embedding output to an embedding retriever, such as OpenSearchEmbeddingRetriever, to find semantically similar documents.

For embedding documents in indexes, use NvidiaDocumentEmbedder instead. Make sure to use the same model in both components.
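The data flow between the two components can be sketched with mocks (no NVIDIA or OpenSearch calls; the vowel-counting "embedder" and the usage metadata shape are stand-ins for illustration only):

```python
import math

def mock_embed(text):
    # Stand-in for NvidiaTextEmbedder: a real model returns a high-dimensional
    # vector (e.g. 1024 floats for nvidia/nv-embedqa-e5-v5) plus usage metadata.
    vec = [float(text.count(c)) for c in "aeiou"]
    return {"embedding": vec, "meta": {"usage": {"total_tokens": len(text.split())}}}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

def retrieve(query_embedding, documents, top_k=10):
    # Stand-in for an embedding retriever: rank stored documents by similarity.
    ranked = sorted(documents, key=lambda d: cosine(query_embedding, d["embedding"]), reverse=True)
    return ranked[:top_k]

# Documents were embedded with the *same* mock model at indexing time.
docs = [
    {"content": "semantic search with embeddings",
     "embedding": mock_embed("semantic search with embeddings")["embedding"]},
    {"content": "keyword matching",
     "embedding": mock_embed("keyword matching")["embedding"]},
]
result = mock_embed("embedding based search")      # query side
top = retrieve(result["embedding"], docs, top_k=1)  # retriever side
print(top[0]["content"])
```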

Usage Example

This is an example of a query pipeline with NvidiaTextEmbedder that receives a query, embeds it, and sends it to OpenSearchEmbeddingRetriever to find matching documents.

```yaml
components:
  NvidiaTextEmbedder:
    type: haystack_integrations.components.embedders.nvidia.text_embedder.NvidiaTextEmbedder
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - NVIDIA_API_KEY
        strict: true
      model: nvidia/nv-embedqa-e5-v5
      api_url: https://integrate.api.nvidia.com/v1
      prefix: ''
      suffix: ''
      truncate:
      timeout:
  OpenSearchEmbeddingRetriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      filters:
      top_k: 10
      filter_policy: replace
      custom_query:
      raise_on_failure: true
      efficient_filtering: true
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: nvidia-embeddings-index
          max_chunk_bytes: 104857600
          embedding_dim: 1024
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
          similarity: cosine

connections:
  - sender: NvidiaTextEmbedder.embedding
    receiver: OpenSearchEmbeddingRetriever.query_embedding

max_runs_per_component: 100

metadata: {}

inputs:
  query:
    - NvidiaTextEmbedder.text

outputs:
  documents: OpenSearchEmbeddingRetriever.documents
```

Parameters

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | str | | The text to embed. |

Outputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| embedding | List[float] | | The embedding of the text. |
| meta | Dict[str, Any] | | Metadata about the request, including usage statistics. |

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | Optional[str] | None | Embedding model to use. If no model is specified and a locally hosted API URL is provided, the component defaults to the model found through the /models API. |
| api_key | Optional[Secret] | Secret.from_env_var('NVIDIA_API_KEY') | API key for the NVIDIA NIM. |
| api_url | str | os.getenv('NVIDIA_API_URL', DEFAULT_API_URL) | Custom API URL for the NVIDIA NIM, in the format http://host:port. |
| prefix | str | "" | A string to add to the beginning of each text. |
| suffix | str | "" | A string to add to the end of each text. |
| truncate | Optional[Union[EmbeddingTruncateMode, str]] | None | Specifies how inputs longer than the maximum token length are truncated. If None, the behavior is model-dependent; see the official model documentation for details. |
| timeout | Optional[float] | None | Timeout for request calls. If not set, it is read from the NVIDIA_TIMEOUT environment variable and otherwise defaults to 60 seconds. |
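The api_key default, Secret.from_env_var('NVIDIA_API_KEY'), resolves the key from the environment at runtime rather than storing it in the pipeline definition. A rough sketch of that behavior (a mock, not Haystack's actual Secret class):

```python
import os

class EnvVarSecret:
    """Mock of an env-var-backed secret: resolves lazily from the first
    environment variable that is set; strict=True raises if none are set."""

    def __init__(self, *env_vars, strict=True):
        self.env_vars = env_vars
        self.strict = strict

    def resolve_value(self):
        for name in self.env_vars:
            value = os.environ.get(name)
            if value is not None:
                return value
        if self.strict:
            raise ValueError(f"none of the environment variables {self.env_vars} are set")
        return None

os.environ["NVIDIA_API_KEY"] = "nvapi-example"  # illustrative value only
secret = EnvVarSecret("NVIDIA_API_KEY", strict=True)
print(secret.resolve_value())
```

This is why the YAML example above carries only `type: env_var` and the variable name, never the key itself.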

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | str | | The text to embed. |
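The run() contract can be mocked to show the expected input and output shapes (illustrative only; the dimension, metadata keys, and error message are assumptions, and the real component calls the NVIDIA API):

```python
def run(text):
    # Text embedders take a single string; lists of documents belong to
    # NvidiaDocumentEmbedder in the indexing pipeline.
    if not isinstance(text, str):
        raise TypeError("expected a single string; use the document embedder for documents")
    embedding = [0.0] * 1024  # e.g. a 1024-dim vector for nvidia/nv-embedqa-e5-v5
    n_tokens = len(text.split())
    meta = {"usage": {"prompt_tokens": n_tokens, "total_tokens": n_tokens}}
    return {"embedding": embedding, "meta": meta}

result = run("What is semantic search?")
print(len(result["embedding"]), result["meta"]["usage"]["total_tokens"])
```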