
VertexAIGeminiGenerator

Generate text using Google Gemini models through Vertex AI.

Deprecation Notice

This integration will be deprecated soon. We recommend using GoogleGenAIChatGenerator instead, which provides unified access to both Gemini Developer API and Vertex AI.

Key Features

  • Generates text responses using Google Gemini models through Vertex AI.
  • Supports multimodal inputs, including text and images.
  • Authenticates using Google Cloud Application Default Credentials.
  • Configurable safety settings, generation config, and system instruction.
  • Supports streaming for real-time token delivery.

Configuration

Authentication

This component authenticates using Google Cloud Application Default Credentials (ADC). For more information, see the official Google documentation.
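In a local development environment, ADC can typically be set up with the Google Cloud CLI, as sketched below. This is one common way to provide credentials; on GCP infrastructure, an attached service account usually supplies them instead. The project ID shown is a placeholder.

```shell
# Authenticate your user account and store Application Default Credentials locally
gcloud auth application-default login

# Optionally set the default project for subsequent API calls
# (replace my-gcp-project-id with your actual project ID)
gcloud config set project my-gcp-project-id
```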

Create secrets for GCP_PROJECT_ID and optionally GCP_DEFAULT_REGION. For detailed instructions on creating secrets, see Create Secrets.

  1. Drag the VertexAIGeminiGenerator component onto the canvas from the Component Library.
  2. Click the component to open the configuration panel.
  3. On the General tab:
    1. Enter the model name (for example, gemini-2.0-flash). For available models, see Vertex AI models.
  4. Go to the Advanced tab to configure the project ID, location, generation config, safety settings, system instruction, and streaming callback.

Connections

VertexAIGeminiGenerator accepts a variadic input (parts) of strings, ByteStream objects, or Part objects. It outputs a list of generated text strings (replies).

Typically, you connect PromptBuilder to the parts input and AnswerBuilder to the replies output. This component is designed for text generation, not chat. For chat capabilities, use GoogleGenAIChatGenerator instead.
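The data flow described above can be sketched with plain Python values. The prompt and reply strings below are illustrative only, not real model output; they show the shapes the component consumes and produces.

```python
# Input side: PromptBuilder renders its template into a single prompt string,
# which is passed to the generator's variadic `parts` input.
parts = ["Given the following information, answer the question. ..."]

# Output side: the generator returns a dict whose `replies` key holds
# a list of generated text strings.
result = {"replies": ["Example generated answer."]}

# Downstream components such as AnswerBuilder consume this list;
# the first reply is typically the one used.
first_reply = result["replies"][0]
print(first_reply)
```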

Usage Example

This query pipeline uses VertexAIGeminiGenerator to generate text responses:

```yaml
components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'default'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 10
      fuzziness: 0

  PromptBuilder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: |
        Given the following information, answer the question.

        Context:
        {% for document in documents %}
        {{ document.content }}
        {% endfor %}

        Question: {{ query }}
      required_variables:
      variables:

  VertexAIGeminiGenerator:
    type: haystack_integrations.components.generators.google_vertex.gemini.VertexAIGeminiGenerator
    init_parameters:
      project_id:
      model: gemini-2.0-flash
      location:
      generation_config:
      safety_settings:
      system_instruction:
      streaming_callback:

  AnswerBuilder:
    type: haystack.components.builders.answer_builder.AnswerBuilder
    init_parameters:
      pattern:
      reference_pattern:

connections:
  - sender: bm25_retriever.documents
    receiver: PromptBuilder.documents
  - sender: PromptBuilder.prompt
    receiver: VertexAIGeminiGenerator.parts
  - sender: VertexAIGeminiGenerator.replies
    receiver: AnswerBuilder.replies
  - sender: bm25_retriever.documents
    receiver: AnswerBuilder.documents

inputs:
  query:
    - bm25_retriever.query
    - PromptBuilder.query
    - AnswerBuilder.query

outputs:
  answers: AnswerBuilder.answers

max_runs_per_component: 100

metadata: {}
```

Parameters

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `parts` | `Variadic[Union[str, ByteStream, Part]]` | | Prompt for the model. |
| `streaming_callback` | `Optional[Callable[[StreamingChunk], None]]` | `None` | A callback function that is called when a new token is received from the stream. |

Outputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `replies` | `List[str]` | | A list of generated content. |

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `project_id` | `Optional[str]` | `None` | ID of the GCP project to use. By default, it is set during Google Cloud authentication. |
| `model` | `str` | `gemini-2.0-flash` | Name of the model to use. For available models, see Vertex AI models. |
| `location` | `Optional[str]` | `None` | The default location to use when making API calls. If not set, uses `us-central1`. |
| `generation_config` | `Optional[Union[GenerationConfig, Dict[str, Any]]]` | `None` | The generation config to use. Accepted fields: `temperature`, `top_p`, `top_k`, `candidate_count`, `max_output_tokens`, `stop_sequences`. |
| `safety_settings` | `Optional[Dict[HarmCategory, HarmBlockThreshold]]` | `None` | The safety settings to use. |
| `system_instruction` | `Optional[Union[str, ByteStream, Part]]` | `None` | Default system instruction to use for generating content. |
| `streaming_callback` | `Optional[Callable[[StreamingChunk], None]]` | `None` | A callback function that is called when a new token is received from the stream. |
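For example, `generation_config` can be supplied as a plain dictionary using the accepted fields listed above. The values here are illustrative, not recommended defaults:

```python
# Illustrative generation_config using only the accepted fields.
generation_config = {
    "temperature": 0.2,          # lower values -> more deterministic output
    "top_p": 0.95,               # nucleus-sampling probability threshold
    "top_k": 40,                 # sample from the 40 most likely tokens
    "candidate_count": 1,        # number of response candidates to generate
    "max_output_tokens": 512,    # cap on the length of the generated reply
    "stop_sequences": ["\n\n"],  # stop generating at a blank line
}

# Sanity check: every key is one of the fields the component accepts.
accepted = {"temperature", "top_p", "top_k", "candidate_count",
            "max_output_tokens", "stop_sequences"}
assert set(generation_config) <= accepted
```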

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `parts` | `Variadic[Union[str, ByteStream, Part]]` | | Prompt for the model. |
| `streaming_callback` | `Optional[Callable[[StreamingChunk], None]]` | `None` | A callback function that is called when a new token is received from the stream. |