GoogleGenAIDocumentEmbedder
Computes document embeddings using Google AI models.
Basic Information
- Type: haystack_integrations.components.embedders.google_genai.document_embedder.GoogleGenAIDocumentEmbedder
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | A list of documents to embed. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | A list of documents with embeddings. |
| meta | Dict[str, Any] | | Information about the usage of the model. |
Overview
Work in Progress
Bear with us while we add pipeline examples and the most common component connections.
Computes document embeddings using Google AI models.
Authentication examples
**1. Gemini Developer API (API Key Authentication)**
```python
from haystack_integrations.components.embedders.google_genai import GoogleGenAIDocumentEmbedder

# export the environment variable (GOOGLE_API_KEY or GEMINI_API_KEY)
document_embedder = GoogleGenAIDocumentEmbedder(model="text-embedding-004")
```
**2. Vertex AI (Application Default Credentials)**
```python
from haystack_integrations.components.embedders.google_genai import GoogleGenAIDocumentEmbedder

# Using Application Default Credentials (requires gcloud auth setup)
document_embedder = GoogleGenAIDocumentEmbedder(
    api="vertex",
    vertex_ai_project="my-project",
    vertex_ai_location="us-central1",
    model="text-embedding-004",
)
```
**3. Vertex AI (API Key Authentication)**
```python
from haystack_integrations.components.embedders.google_genai import GoogleGenAIDocumentEmbedder

# export the environment variable (GOOGLE_API_KEY or GEMINI_API_KEY)
document_embedder = GoogleGenAIDocumentEmbedder(
    api="vertex",
    model="text-embedding-004",
)
```
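As a rough illustration of how an application might choose between the API key and Application Default Credentials modes shown above, the sketch below builds the init parameters from the environment. The helper `embedder_init_kwargs` and the variable names `MY_VERTEX_PROJECT` and `MY_VERTEX_LOCATION` are hypothetical and not part of the integration; only the `api`, `vertex_ai_project`, `vertex_ai_location`, and `model` parameter names come from this page.

```python
import os
from typing import Any, Dict, Mapping


def embedder_init_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Return init parameters for GoogleGenAIDocumentEmbedder (hypothetical helper)."""
    has_api_key = bool(env.get("GOOGLE_API_KEY") or env.get("GEMINI_API_KEY"))
    # Hypothetical variable names, chosen only for this sketch.
    project = env.get("MY_VERTEX_PROJECT")
    location = env.get("MY_VERTEX_LOCATION", "us-central1")

    kwargs: Dict[str, Any] = {"model": "text-embedding-004"}
    if project:
        # Vertex AI with Application Default Credentials.
        kwargs.update(
            api="vertex",
            vertex_ai_project=project,
            vertex_ai_location=location,
        )
    elif not has_api_key:
        raise RuntimeError(
            "Set GOOGLE_API_KEY or GEMINI_API_KEY, or provide a Vertex AI project."
        )
    # Otherwise: Gemini Developer API. `api` keeps its default ("gemini") and the
    # component reads the key from the environment on its own.
    return kwargs
```

The resulting dictionary could then be unpacked into the constructor, for example `GoogleGenAIDocumentEmbedder(**embedder_init_kwargs())`.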
Usage Example
```yaml
components:
  GoogleGenAIDocumentEmbedder:
    type: haystack_integrations.components.embedders.google_genai.document_embedder.GoogleGenAIDocumentEmbedder
    init_parameters:
```
Parameters
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | Secret | Secret.from_env_var(['GOOGLE_API_KEY', 'GEMINI_API_KEY'], strict=False) | Google API key, defaults to the GOOGLE_API_KEY and GEMINI_API_KEY environment variables. Not needed if using Vertex AI with Application Default Credentials. Go to https://aistudio.google.com/app/apikey for a Gemini API key. Go to https://cloud.google.com/vertex-ai/generative-ai/docs/start/api-keys for a Vertex AI API key. |
| api | Literal['gemini', 'vertex'] | gemini | Which API to use. Either "gemini" for the Gemini Developer API or "vertex" for Vertex AI. |
| vertex_ai_project | Optional[str] | None | Google Cloud project ID for Vertex AI. Required when using Vertex AI with Application Default Credentials. |
| vertex_ai_location | Optional[str] | None | Google Cloud location for Vertex AI (e.g., "us-central1", "europe-west1"). Required when using Vertex AI with Application Default Credentials. |
| model | str | text-embedding-004 | The name of the model to use for calculating embeddings. The default model is text-embedding-004. |
| prefix | str | | A string to add at the beginning of each text. |
| suffix | str | | A string to add at the end of each text. |
| batch_size | int | 32 | Number of documents to embed at once. |
| progress_bar | bool | True | If True, shows a progress bar when running. |
| meta_fields_to_embed | Optional[List[str]] | None | List of metadata fields to embed along with the document text. |
| embedding_separator | str | \n | Separator used to concatenate the metadata fields to the document text. |
| config | Optional[Dict[str, Any]] | None | A dictionary of keyword arguments for the embedding content configuration (types.EmbedContentConfig). If not specified, it defaults to {"task_type": "SEMANTIC_SIMILARITY"}. For more information, see the Google AI Task types. |
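To make the text-related parameters more concrete, here is a minimal sketch of how `prefix`, `suffix`, `meta_fields_to_embed`, `embedding_separator`, and `batch_size` could interact. It assumes the component follows the pattern common to Haystack document embedders (metadata values joined with the document text, then wrapped in the prefix and suffix); it is not the component's actual code. Plain dictionaries stand in for `Document` objects, and `prepare_texts` and `batches` are hypothetical helper names.

```python
from typing import Any, Dict, List, Optional


def prepare_texts(
    docs: List[Dict[str, Any]],
    meta_fields_to_embed: Optional[List[str]] = None,
    embedding_separator: str = "\n",
    prefix: str = "",
    suffix: str = "",
) -> List[str]:
    """Build the string that would be embedded for each document (illustrative)."""
    fields = meta_fields_to_embed or []
    texts = []
    for doc in docs:
        meta = doc.get("meta", {})
        # Keep only the requested metadata fields that are present and not None.
        meta_values = [str(meta[f]) for f in fields if meta.get(f) is not None]
        body = embedding_separator.join(meta_values + [doc.get("content") or ""])
        texts.append(prefix + body + suffix)
    return texts


def batches(items: List[Any], batch_size: int = 32) -> List[List[Any]]:
    """Split items into consecutive chunks of at most batch_size elements."""
    return [items[i : i + batch_size] for i in range(0, len(items), batch_size)]


docs = [
    {"content": "Berlin is the capital of Germany.", "meta": {"title": "Berlin"}},
    {"content": "Paris is the capital of France.", "meta": {}},
]
texts = prepare_texts(docs, meta_fields_to_embed=["title"])
# texts[0] == "Berlin\nBerlin is the capital of Germany."
# texts[1] == "Paris is the capital of France."  (no title to prepend)
```

Under these assumptions, 70 documents with the default `batch_size` of 32 would be sent in three requests of 32, 32, and 6 texts.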
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | A list of documents to embed. |
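Outside Pipeline Builder, calling `run()` directly might look like the sketch below. It assumes the integration package is installed and that `GOOGLE_API_KEY` or `GEMINI_API_KEY` is set. The wrapper `embed_documents` is a hypothetical helper, and the `RETRIEVAL_DOCUMENT` task type is an illustrative choice rather than the component's default.

```python
from typing import Any, Dict, List, Tuple


def embed_documents(texts: List[str]) -> Tuple[List[Any], Dict[str, Any]]:
    """Embed plain strings and return (documents, meta). Hypothetical wrapper."""
    # Imported inside the function so this module loads even where the
    # integration is not installed.
    from haystack import Document
    from haystack_integrations.components.embedders.google_genai import (
        GoogleGenAIDocumentEmbedder,
    )

    embedder = GoogleGenAIDocumentEmbedder(
        model="text-embedding-004",
        # Illustrative override of the default SEMANTIC_SIMILARITY task type.
        config={"task_type": "RETRIEVAL_DOCUMENT"},
    )
    documents = [Document(content=text) for text in texts]
    result = embedder.run(documents=documents)
    # "documents" holds the input documents with their embedding set;
    # "meta" holds information about the usage of the model.
    return result["documents"], result["meta"]


# Example call (requires network access and a valid API key):
# embedded, meta = embed_documents(["I love pizza!"])
# print(len(embedded[0].embedding))
```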