NvidiaGenerator
Generate text using NVIDIA's models through the NVIDIA NIM API.
Key Features
- Connects to models self-hosted with NVIDIA NIM or hosted on the NVIDIA API Catalog.
- Accepts string prompts and returns string replies.
- Supports configurable generation parameters such as temperature, top_p, and max_tokens via model_arguments.
- Requires an NVIDIA API key set via the NVIDIA_API_KEY environment variable.
Configuration
- Drag the NvidiaGenerator component onto the canvas from the Component Library.
- Click the component to open the configuration panel.
- On the General tab:
  - Set the model name. See NVIDIA NIMs for supported models.
  - Set the NVIDIA API key. Connect the platform to NVIDIA first. For instructions, see Use NVIDIA Models.
- Go to the Advanced tab to configure api_url, model_arguments, and timeout.
Connections
NvidiaGenerator accepts a prompt string as input. Connect its prompt input to the prompt output of PromptBuilder.
It outputs replies as a list of strings and meta as a list of metadata dictionaries. Connect its replies output to DeepsetAnswerBuilder.
Source Code
To check this component's source code, open generator.py in the Haystack Core Integrations repository.
Usage Examples
Basic Configuration
NvidiaGenerator:
  type: haystack_integrations.components.generators.nvidia.generator.NvidiaGenerator
  init_parameters:
    api_key:
      type: env_var
      env_vars:
        - NVIDIA_API_KEY
      strict: true
    model: meta/llama3-70b-instruct
    api_url: https://integrate.api.nvidia.com/v1
    model_arguments:
      temperature: 0.2
      top_p: 0.7
      max_tokens: 1024
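For readers who work in Python rather than Pipeline Builder, the configuration above corresponds roughly to the sketch below. It assumes the nvidia-haystack package is installed and that NVIDIA_API_KEY is set in the environment. The helper names build_generator and ask are illustrative and not part of the integration.

```python
# Sketch only: assumes `pip install nvidia-haystack` and NVIDIA_API_KEY in the environment.
MODEL = "meta/llama3-70b-instruct"
API_URL = "https://integrate.api.nvidia.com/v1"
MODEL_ARGUMENTS = {"temperature": 0.2, "top_p": 0.7, "max_tokens": 1024}


def build_generator():
    """Create an NvidiaGenerator with the same settings as the YAML (hypothetical helper)."""
    # Imported lazily so this module loads even when the integration is not installed.
    from haystack.utils import Secret
    from haystack_integrations.components.generators.nvidia import NvidiaGenerator

    return NvidiaGenerator(
        model=MODEL,
        api_url=API_URL,
        api_key=Secret.from_env_var("NVIDIA_API_KEY"),
        model_arguments=MODEL_ARGUMENTS,
    )


def ask(prompt: str) -> str:
    """Send one prompt and return the first reply (hypothetical helper)."""
    generator = build_generator()
    generator.warm_up()  # prepares the backend before the first request
    result = generator.run(prompt=prompt)
    # run() returns a dictionary with "replies" (list of strings) and "meta" (list of dicts).
    return result["replies"][0]
```

Calling ask("What is NVIDIA NIM?") sends a real request, so it needs network access and a valid key.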
This pipeline uses NvidiaGenerator to generate replies to a question. It uses DeepsetAnswerBuilder to build the answers with references.
components:
  retriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: ''
          max_chunk_bytes: 104857600
          embedding_dim: 1024
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
          similarity: cosine
      top_k: 10
  NvidiaTextEmbedder:
    type: haystack_integrations.components.embedders.nvidia.text_embedder.NvidiaTextEmbedder
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - NVIDIA_API_KEY
        strict: true
      model: nvidia/nv-embedqa-e5-v5
      api_url: https://integrate.api.nvidia.com/v1
      prefix: ''
      suffix: ''
      truncate:
      timeout:
  prompt_builder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      required_variables: "*"
      template: |-
        You are a technical expert.
        You answer questions truthfully based on provided documents.
        If the answer exists in several documents, summarize them.
        Ignore documents that don't contain the answer to the question.
        Only answer based on the documents provided. Don't make things up.
        If no information related to the question can be found in the document, say so.
        Always use references in the form [NUMBER OF DOCUMENT] when using information from a document, for example [3] for Document [3].
        Never name the documents, only enter a number in square brackets as a reference.
        These are the documents:
        {%- if documents|length > 0 %}
        {%- for document in documents %}
        Document [{{ loop.index }}]:
        {{ document.content }}
        {% endfor -%}
        {%- else %}
        No relevant documents found.
        {% endif %}
        Question: {{ question }}
        Answer:
  NvidiaGenerator:
    type: haystack_integrations.components.generators.nvidia.generator.NvidiaGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - NVIDIA_API_KEY
        strict: true
      model: meta/llama3-70b-instruct
      api_url: https://integrate.api.nvidia.com/v1
      model_arguments:
        temperature: 0.2
        top_p: 0.7
        max_tokens: 1024
      timeout:
  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      reference_pattern: acm
connections:
  - sender: NvidiaTextEmbedder.embedding
    receiver: retriever.query_embedding
  - sender: retriever.documents
    receiver: prompt_builder.documents
  - sender: prompt_builder.prompt
    receiver: NvidiaGenerator.prompt
  - sender: NvidiaGenerator.replies
    receiver: answer_builder.replies
  - sender: retriever.documents
    receiver: answer_builder.documents
  - sender: prompt_builder.prompt
    receiver: answer_builder.prompt
inputs:
  query:
    - NvidiaTextEmbedder.text
    - prompt_builder.question
    - answer_builder.query
  filters:
    - retriever.filters
outputs:
  documents: retriever.documents
  answers: answer_builder.answers
max_runs_per_component: 100
metadata: {}
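When editing a pipeline like this by hand, a misspelled component name in the connections list is an easy mistake to make. The following is a minimal sketch of a sanity check, using only the component names and connections from the example above. The function undeclared_components is a hypothetical helper and not part of Haystack or the platform.

```python
# Component names declared under `components:` in the example pipeline.
COMPONENTS = {
    "retriever",
    "NvidiaTextEmbedder",
    "prompt_builder",
    "NvidiaGenerator",
    "answer_builder",
}

# (sender, receiver) pairs copied from the `connections:` section.
CONNECTIONS = [
    ("NvidiaTextEmbedder.embedding", "retriever.query_embedding"),
    ("retriever.documents", "prompt_builder.documents"),
    ("prompt_builder.prompt", "NvidiaGenerator.prompt"),
    ("NvidiaGenerator.replies", "answer_builder.replies"),
    ("retriever.documents", "answer_builder.documents"),
    ("prompt_builder.prompt", "answer_builder.prompt"),
]


def undeclared_components(components, connections):
    """Return the sorted names used in connections that are not declared as components."""
    missing = set()
    for sender, receiver in connections:
        for endpoint in (sender, receiver):
            # An endpoint has the form "<component>.<socket>".
            name = endpoint.split(".", 1)[0]
            if name not in components:
                missing.add(name)
    return sorted(missing)


# Every endpoint in the example refers to a declared component.
assert undeclared_components(COMPONENTS, CONNECTIONS) == []
```

The check only compares names. It does not verify that the sockets exist or that their types match, which the platform validates when the pipeline is loaded.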
Parameters
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| prompt | str | | Text to be sent to the generative model. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| replies | List[str] | | A list of replies generated by the model. |
| meta | List[Dict[str, Any]] | | Information about the request, such as token count and model details. |
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | Optional[str] | None | Name of the model to use for text generation. See NVIDIA NIMs for the supported models. |
| api_key | Optional[Secret] | Secret.from_env_var('NVIDIA_API_KEY') | API key for the NVIDIA NIM. Set it as the NVIDIA_API_KEY environment variable or pass it here. |
| api_url | str | os.getenv('NVIDIA_API_URL', DEFAULT_API_URL) | Custom API URL for the NVIDIA NIM. |
| model_arguments | Optional[Dict[str, Any]] | None | Additional arguments to pass to the model provider. These arguments are model-specific. Look up your model in NVIDIA NIM to find the arguments it accepts. |
| timeout | Optional[float] | None | Timeout for request calls. If not set, it is read from the NVIDIA_TIMEOUT environment variable, or defaults to 60 seconds. |
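The fallback order described for api_url and timeout can be sketched as follows. This is an illustration of the behavior stated in the table, not the integration's actual code. The functions resolve_api_url and resolve_timeout are hypothetical, and the value of ASSUMED_DEFAULT_API_URL is an assumption taken from the URL used in the examples on this page.

```python
import os
from typing import Mapping, Optional

# Assumption: the default URL matches the one used in the YAML examples.
ASSUMED_DEFAULT_API_URL = "https://integrate.api.nvidia.com/v1"
DEFAULT_TIMEOUT = 60.0  # default stated in the parameter table


def resolve_api_url(api_url: Optional[str] = None,
                    env: Mapping[str, str] = os.environ) -> str:
    """Explicit value first, then NVIDIA_API_URL, then the assumed default."""
    if api_url is not None:
        return api_url
    return env.get("NVIDIA_API_URL", ASSUMED_DEFAULT_API_URL)


def resolve_timeout(timeout: Optional[float] = None,
                    env: Mapping[str, str] = os.environ) -> float:
    """Explicit value first, then NVIDIA_TIMEOUT, then 60."""
    if timeout is not None:
        return float(timeout)
    # Environment variables are strings, so convert before returning.
    return float(env.get("NVIDIA_TIMEOUT", DEFAULT_TIMEOUT))
```

Passing a plain dictionary as env, for example resolve_timeout(env={"NVIDIA_TIMEOUT": "30"}), makes the fallback easy to try without changing the real environment.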
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| prompt | str | | Text to be sent to the generative model. |