Use Voyage AI Models

Create embeddings using top-performing embedding models by Voyage AI.


About This Task

Third-Party Integration

Voyage AI is a third-party integration developed by an external provider and is not maintained by deepset. While we encourage you to explore it, we recommend reviewing it carefully to ensure it meets your needs.

Use Voyage AI models to calculate embeddings for documents and queries in your pipelines. For available models, see the Voyage AI documentation.

Prerequisites

You need an API key from Voyage AI. For details, see the Voyage website.

Use Voyage Models

First, connect deepset AI Platform to Voyage AI through the Integrations page. You can set up the connection for a single workspace or for the whole organization:

Add Workspace-Level Integration

  1. Click your profile icon and choose Settings.
  2. Go to Workspace > Integrations.
  3. Find the provider you want to connect and click Connect next to it.
  4. Enter the API key and any other required details.
  5. Click Connect. You can use this integration in pipelines and indexes in the current workspace.

Add Organization-Level Integration

  1. Click your profile icon and choose Settings.
  2. Go to Organization > Integrations.
  3. Find the provider you want to connect and click Connect next to it.
  4. Enter the API key and any other required details.
  5. Click Connect. You can use this integration in pipelines and indexes in all workspaces in the current organization.

Then, add a component that uses a Voyage model. There are two components available:

  • VoyageTextEmbedder: Embeds text strings. You can use it in a query pipeline to embed the query and pass it to an embedding retriever.
  • VoyageDocumentEmbedder: Embeds documents. You can use it in indexes to calculate embeddings for documents.

    Embedding Models in Query Pipelines and Indexes

    The embedding model you use to embed documents in your indexing pipeline must be the same as the embedding model you use to embed the query in your query pipeline.

    This means the embedders for your indexing and query pipelines must match. For example, if you use CohereDocumentEmbedder to embed your documents, you should use CohereTextEmbedder with the same model to embed your queries.
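Applied to Voyage models, the rule above means that the `model` value is identical in both embedders. The fragment below is a minimal sketch of the two component definitions side by side. The component name `query_embedder` is a hypothetical name chosen for illustration, and the module path of VoyageTextEmbedder is assumed to mirror the path of VoyageDocumentEmbedder.

```yaml
# In the index: embeds documents
document_embedder:
  type: haystack_integrations.components.embedders.voyage_embedders.voyage_document_embedder.VoyageDocumentEmbedder
  init_parameters:
    model: "voyage-2"

# In the query pipeline: embeds the query (hypothetical component name, assumed module path)
query_embedder:
  type: haystack_integrations.components.embedders.voyage_embedders.voyage_text_embedder.VoyageTextEmbedder
  init_parameters:
    model: "voyage-2" # must be the same model as in the index
```

If you switch to a different model later, change it in both places and re-index your files so that the stored document embeddings and the query embeddings come from the same model.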

Usage Examples

A query pipeline that searches an index embeds the query with VoyageTextEmbedder and the same model the index uses. The following index uses VoyageDocumentEmbedder to embed documents:

components:
  ...
  splitter:
    type: haystack.components.preprocessors.document_splitter.DocumentSplitter
    init_parameters:
      split_by: word
      split_length: 250
      split_overlap: 30

  document_embedder:
    type: haystack_integrations.components.embedders.voyage_embedders.voyage_document_embedder.VoyageDocumentEmbedder
    init_parameters:
      model: "voyage-2" # the model to use

  writer:
    type: haystack.components.writers.document_writer.DocumentWriter
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          embedding_dim: 1024 # must match the output dimension of the embedding model (1024 for voyage-2)
          similarity: cosine
      policy: OVERWRITE

connections: # Defines how the components are connected
  ...
  - sender: splitter.documents
    receiver: document_embedder.documents
  - sender: document_embedder.documents
    receiver: writer.documents

You can connect the same components in Pipeline Builder. In the index, VoyageDocumentEmbedder embeds the documents it receives from DocumentSplitter and sends them to DocumentWriter, which writes them into the document store, where the query pipeline can access them.

In the query pipeline, VoyageTextEmbedder embeds the query using the same model as VoyageDocumentEmbedder in the index. It then sends the embedded query to the Retriever.
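The query pipeline described above could look roughly like the following sketch. It is not taken from the original example: the component names `query_embedder` and `retriever`, the `top_k` value, the `embedding_dim` value, and the layout of the `inputs` and `outputs` sections are assumptions for illustration. The retriever shown is the OpenSearch embedding retriever, chosen because the index writes to an OpenSearchDocumentStore.

```yaml
# Sketch of a query pipeline; component names and values marked "assumed" are illustrative.
components:
  query_embedder:
    type: haystack_integrations.components.embedders.voyage_embedders.voyage_text_embedder.VoyageTextEmbedder
    init_parameters:
      model: "voyage-2" # same model as VoyageDocumentEmbedder in the index

  retriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          embedding_dim: 1024 # assumed: must match the index and the model's output dimension
          similarity: cosine
      top_k: 10 # assumed value

connections:
  # The embedded query goes from the text embedder to the retriever
  - sender: query_embedder.embedding
    receiver: retriever.query_embedding

inputs: # assumed layout: components that receive the user query
  query:
    - query_embedder.text

outputs: # assumed layout: what the pipeline returns
  documents: retriever.documents
```

A real query pipeline usually continues after the retriever, for example with a ranker or a generator. This sketch stops at retrieval to keep the embedding step in focus.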