NvidiaGenerator

Generate text using NVIDIA's models through the NVIDIA NIM API.

Basic Information

Type: haystack_integrations.components.generators.nvidia.generator.NvidiaGenerator
Components it can connect with:
- PromptBuilder: NvidiaGenerator receives a prompt from PromptBuilder.
- DeepsetAnswerBuilder: NvidiaGenerator sends the generated replies to DeepsetAnswerBuilder.

Inputs

Parameter	Type	Default	Description
prompt	str		Text to be sent to the generative model.

Outputs

Parameter	Type	Default	Description
replies	List[str]		A list of replies generated by the model.
meta	List[Dict[str, Any]]		Information about the request, such as token count and model details.

Overview

NvidiaGenerator provides an interface for generating text using LLMs self-hosted with NVIDIA NIM or models hosted on the NVIDIA API Catalog.

You can configure how the model generates text by passing additional arguments to the model through model_arguments. For example, you can set temperature, top_p, and max_tokens.

Authorization

You need an NVIDIA API key to use this component. Connect Haystack Enterprise Platform to NVIDIA on the Integrations page. For detailed instructions, see Use NVIDIA Models.

Usage Example

This pipeline uses NvidiaGenerator to generate replies to a question. It uses DeepsetAnswerBuilder to build the answers with references.

components:
  retriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: ''
          max_chunk_bytes: 104857600
          embedding_dim: 1024
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
          similarity: cosine
      top_k: 10
  NvidiaTextEmbedder:
    type: haystack_integrations.components.embedders.nvidia.text_embedder.NvidiaTextEmbedder
    init_parameters:
      api_key:
        type: env_var
        env_vars:
        - NVIDIA_API_KEY
        strict: true
      model: nvidia/nv-embedqa-e5-v5
      api_url: https://integrate.api.nvidia.com/v1
      prefix: ''
      suffix: ''
      truncate:
      timeout:
  prompt_builder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      required_variables: "*"
      template: |-
        You are a technical expert.
        You answer questions truthfully based on provided documents.
        If the answer exists in several documents, summarize them.
        Ignore documents that don't contain the answer to the question.
        Only answer based on the documents provided. Don't make things up.
        If no information related to the question can be found in the document, say so.
        Always use references in the form [NUMBER OF DOCUMENT] when using information from a document, for example [3] for Document [3].
        Never name the documents, only enter a number in square brackets as a reference.

        These are the documents:
        {%- if documents|length > 0 %}
        {%- for document in documents %}
        Document [{{ loop.index }}]:
        {{ document.content }}
        {% endfor -%}
        {%- else %}
        No relevant documents found.
        {% endif %}

        Question: {{ question }}
        Answer:
  NvidiaGenerator:
    type: haystack_integrations.components.generators.nvidia.generator.NvidiaGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
        - NVIDIA_API_KEY
        strict: true
      model: meta/llama3-70b-instruct
      api_url: https://integrate.api.nvidia.com/v1
      model_arguments:
        temperature: 0.2
        top_p: 0.7
        max_tokens: 1024
      timeout:
  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      reference_pattern: acm

connections:
- sender: NvidiaTextEmbedder.embedding
  receiver: retriever.query_embedding
- sender: retriever.documents
  receiver: prompt_builder.documents
- sender: prompt_builder.prompt
  receiver: NvidiaGenerator.prompt
- sender: NvidiaGenerator.replies
  receiver: answer_builder.replies
- sender: retriever.documents
  receiver: answer_builder.documents
- sender: prompt_builder.prompt
  receiver: answer_builder.prompt

inputs:
  query:
  - NvidiaTextEmbedder.text
  - prompt_builder.question
  - answer_builder.query
  filters:
  - retriever.filters

outputs:
  documents: retriever.documents
  answers: answer_builder.answers

max_runs_per_component: 100

metadata: {}

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

Parameter	Type	Default	Description
model	Optional[str]	None	Name of the model to use for text generation. See the NVIDIA NIMs for more information on the supported models. `Note`: If no specific model along with locally hosted API URL is provided, the system defaults to the available model found using /models API. Check supported models at NVIDIA NIM.
api_key	Optional[Secret]	Secret.from_env_var('NVIDIA_API_KEY')	API key for the NVIDIA NIM. Set it as the `NVIDIA_API_KEY` environment variable or pass it here.
api_url	str	os.getenv('NVIDIA_API_URL', DEFAULT_API_URL)	Custom API URL for the NVIDIA NIM.
model_arguments	Optional[Dict[str, Any]]	None	Additional arguments to pass to the model provider. These arguments are specific to a model. Search your model in the NVIDIA NIM to find the arguments it accepts.
timeout	Optional[float]	None	Timeout for request calls, if not set it is inferred from the `NVIDIA_TIMEOUT` environment variable or set to 60 by default.

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

Parameter	Type	Default	Description
prompt	str		Text to be sent to the generative model.

Was this page helpful?

Basic Information​

Inputs​

Outputs​

Overview​

Authorization​

Usage Example​

Parameters​

Init Parameters​

Run Method Parameters​