VertexAITextEmbedder

Embed text using VertexAI Text Embeddings API.

Basic Information

  • Type: haystack_integrations.components.embedders.google_vertex.text_embedder.VertexAITextEmbedder

Inputs

Parameter | Type | Default | Description
text | Union[List[Document], List[str], str] | | The text to embed.

Outputs

Parameter | Type | Default | Description
embedding | List[float] | | A dictionary with the following key: embedding, the embedding of the input text.

Overview

Work in Progress

We are still adding pipeline examples and the most common component connections.

Embed text using VertexAI Text Embeddings API.

See available models in the official Google documentation.

Usage example:

from haystack_integrations.components.embedders.google_vertex import VertexAITextEmbedder

text_to_embed = "I love pizza!"

text_embedder = VertexAITextEmbedder(model="text-embedding-005")

print(text_embedder.run(text_to_embed))
# {'embedding': [-0.08127457648515701, 0.03399784862995148, -0.05116401985287666, ...]}
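
The run() output is a dictionary whose embedding key holds a list of floats. As a minimal sketch of what you might do with it, the snippet below compares two such outputs by cosine similarity. The vectors are short made-up stand-ins rather than real API results, and cosine_similarity is a hypothetical helper, not part of the integration.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up stand-ins shaped like the dictionaries returned by run().
result_a = {"embedding": [0.1, 0.3, -0.2]}
result_b = {"embedding": [0.2, 0.6, -0.4]}  # same direction as result_a
result_c = {"embedding": [0.3, -0.1, 0.0]}  # orthogonal to result_a

similarity_ab = cosine_similarity(result_a["embedding"], result_b["embedding"])
similarity_ac = cosine_similarity(result_a["embedding"], result_c["embedding"])
```

Vectors pointing in the same direction score close to 1, and orthogonal vectors score close to 0. With real embeddings, the two inputs should come from the same model so that their dimensions match.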

Usage Example

components:
  VertexAITextEmbedder:
    type: haystack_integrations.components.embedders.google_vertex.text_embedder.VertexAITextEmbedder
    init_parameters:

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

Parameter | Type | Default | Description
model | Literal['text-embedding-004', 'text-embedding-005', 'textembedding-gecko-multilingual@001', 'text-multilingual-embedding-002', 'text-embedding-large-exp-03-07'] | | Name of the model to use.
task_type | Literal['RETRIEVAL_DOCUMENT', 'RETRIEVAL_QUERY', 'SEMANTIC_SIMILARITY', 'CLASSIFICATION', 'CLUSTERING', 'QUESTION_ANSWERING', 'FACT_VERIFICATION', 'CODE_RETRIEVAL_QUERY'] | RETRIEVAL_QUERY | The type of task for which the embeddings are generated. For more information, see the official Google documentation.
gcp_region_name | Optional[Secret] | Secret.from_env_var('GCP_DEFAULT_REGION', strict=False) | The default location to use when making API calls. If not set, us-central1 is used.
gcp_project_id | Optional[Secret] | Secret.from_env_var('GCP_PROJECT_ID', strict=False) | ID of the GCP project to use. By default, it is set during Google Cloud authentication.
progress_bar | bool | True | Whether to display a progress bar during processing.
truncate_dim | Optional[int] | None | The dimension to truncate the embeddings to, if specified.
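
As a rough illustration of truncate_dim, the sketch below keeps only the first truncate_dim values of an embedding. It reflects an assumption about the general idea of dimension truncation, not the component's actual implementation: truncate_embedding is a hypothetical helper, the vector is made up, and whether the component re-normalizes the shortened vector is not stated here.

```python
from typing import List, Optional


def truncate_embedding(embedding: List[float], truncate_dim: Optional[int] = None) -> List[float]:
    """Return the first truncate_dim values, or a copy of the full vector when None."""
    if truncate_dim is None:
        return list(embedding)
    if truncate_dim <= 0:
        raise ValueError("truncate_dim must be a positive integer")
    return list(embedding[:truncate_dim])


full = [0.5, -0.25, 0.125, 0.75]  # made-up vector, not a real API result
shortened = truncate_embedding(full, truncate_dim=2)
unchanged = truncate_embedding(full)  # None means no truncation
```

Shorter vectors take less storage and are faster to compare, usually at some cost in retrieval quality, so the same truncate_dim should be used for both document and query embeddings.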

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

Parameter | Type | Default | Description
text | Union[List[Document], List[str], str] | | The text to embed.