ℹ️
Third-Party Integration
Voyage AI is a third-party integration developed by an external provider and is not maintained by deepset. While we encourage you to explore it, we recommend reviewing it carefully to ensure it meets your needs.

Use Voyage AI models to calculate embeddings for documents and queries in your pipelines. For available models, see the Voyage AI documentation.

Prerequisites

You need an API key from Voyage AI. For details, see the Voyage website.

Use Voyage Models

First, connect deepset AI Platform to Voyage AI through the Integrations page. You can do so for a single workspace or for the whole organization:

Add Workspace-Level Integration

Click your profile icon and choose Settings.
Go to Workspace > Integrations.
Find the provider you want to connect and click Connect next to it.
Enter the API key and any other required details.
Click Connect. You can use this integration in pipelines and indexes in the current workspace.

Add Organization-Level Integration

Click your profile icon and choose Settings.
Go to Organization > Integrations.
Find the provider you want to connect and click Connect next to it.
Enter the API key and any other required details.
Click Connect. You can use this integration in pipelines and indexes in all workspaces in the current organization.

For details, see Add Integrations.

Then, add a component that uses a Voyage model. There are two components available:

VoyageTextEmbedder: Embeds text strings. You can use it in a query pipeline to embed the query and pass it to an embedding retriever.
VoyageDocumentEmbedder: Embeds documents. You can use it in indexes to calculate embeddings for documents.

ℹ️
Embedding Models in Query Pipelines and Indexes
The embedding model you use to embed documents in your index must be the same as the embedding model you use to embed the query in your pipeline.
This means the embedders for your indexes and pipelines must match. For example, if you use VoyageDocumentEmbedder to embed your documents, use VoyageTextEmbedder with the same model to embed your queries.
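The matching rule above can be checked mechanically before you deploy. The following is a minimal Python sketch, not a feature of deepset AI Platform. It assumes the `components` sections of the index and the query pipeline have already been loaded into dictionaries, and the helper names `embedder_identity` and `embedders_match` are hypothetical.

```python
# Hypothetical helpers: compare the embedder provider and model used in an
# index with the ones used in a query pipeline.

def embedder_identity(components: dict, suffix: str) -> tuple:
    """Return (provider prefix, model) of the first component whose class name ends with `suffix`."""
    for spec in components.values():
        class_name = spec.get("type", "").rsplit(".", 1)[-1]
        if class_name.endswith(suffix):
            provider = class_name[: -len(suffix)]  # e.g. "Voyage"
            model = spec.get("init_parameters", {}).get("model")
            return provider, model
    raise ValueError(f"No component ending in {suffix!r} found")


def embedders_match(index_components: dict, query_components: dict) -> bool:
    """True if the index and the query pipeline use the same provider and model."""
    return embedder_identity(index_components, "DocumentEmbedder") == embedder_identity(
        query_components, "TextEmbedder"
    )


# Dictionaries mirroring the YAML structure of an index and a query pipeline.
index_components = {
    "document_embedder": {
        "type": "haystack_integrations.components.embedders.voyage_embedders.voyage_document_embedder.VoyageDocumentEmbedder",
        "init_parameters": {"model": "voyage-2"},
    }
}
query_components = {
    "query_embedder": {
        "type": "haystack_integrations.components.embedders.voyage_embedders.voyage_text_embedder.VoyageTextEmbedder",
        "init_parameters": {"model": "voyage-2"},
    }
}

print(embedders_match(index_components, query_components))  # prints True
```

The sketch compares only the class-name prefix and the `model` parameter. It does not verify that the document store's embedding dimension fits the model.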

Usage Examples

This is an example of an index and a query pipeline that use Voyage models to embed documents (index) and text (query pipeline). The index comes first, followed by the query pipeline:

components:
  ...
    splitter:
      type: haystack.components.preprocessors.document_splitter.DocumentSplitter
      init_parameters:
        split_by: word
        split_length: 250
        split_overlap: 30

    document_embedder:
      type: haystack_integrations.components.embedders.voyage_embedders.voyage_document_embedder.VoyageDocumentEmbedder
      init_parameters:
        model: "voyage-2" # the model to use

    writer:
      type: haystack.components.writers.document_writer.DocumentWriter
      init_parameters:
        document_store:
          type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
          init_parameters:
            embedding_dim: 1024 # must match the model's output dimension (1024 for voyage-2)
            similarity: cosine
        policy: OVERWRITE
        
connections:  # Defines how the components are connected
  ...
  - sender: splitter.documents
    receiver: document_embedder.documents
  - sender: document_embedder.documents
    receiver: writer.documents

components:
  ...
    query_embedder:
      type: haystack_integrations.components.embedders.voyage_embedders.voyage_text_embedder.VoyageTextEmbedder
      init_parameters:
        model: "voyage-2" # the model to use
        
    retriever:
      type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
      init_parameters: 
        document_store:
          init_parameters:
            use_ssl: True
            verify_certs: False
            http_auth:
              - "${OPENSEARCH_USER}"
              - "${OPENSEARCH_PASSWORD}"
          type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        top_k: 20 
        
    prompt_builder:
      type: haystack.components.builders.prompt_builder.PromptBuilder
      init_parameters:
        template: |-
          You are a technical expert.
          You answer questions truthfully based on provided documents.
          For each document check whether it is related to the question.
          Only use documents that are related to the question to answer it.
          Ignore documents that are not related to the question.
          If the answer exists in several documents, summarize them.
          Only answer based on the documents provided. Don't make things up.
          If the documents can't answer the question or you are unsure say: 'The answer can't be found in the text'.
          These are the documents:
          {% for document in documents %}
          Document[{{ loop.index }}]:
          {{ document.content }}
          {% endfor %}
          Question: {{question}}
          Answer:

    generator:
      type: haystack.components.generators.openai.OpenAIGenerator
      init_parameters:
        model: "gpt-3.5-turbo" # the model to use
        generation_kwargs: # additional parameters for the model
          max_tokens: 400
          temperature: 0.0
          seed: 0
       
    answer_builder:
      init_parameters: {}
      type: haystack.components.builders.answer_builder.AnswerBuilder
   
    ...

connections:  # Defines how the components are connected
  ...
  - sender: query_embedder.embedding # VoyageTextEmbedder sends the embedded query to the retriever
    receiver: retriever.query_embedding
  - sender: retriever.documents
    receiver: prompt_builder.documents
  - sender: prompt_builder.prompt
    receiver: generator.prompt
  - sender: generator.replies
    receiver: answer_builder.replies
  ...

inputs:
  query:
    ...
    - "query_embedder.text" # VoyageTextEmbedder doesn't get the query from any connected component,
                            # so it must receive it from the pipeline.
    - "prompt_builder.question"
    - "answer_builder.query"
  ...

This is how to connect the components in Pipeline Builder. In the index, VoyageDocumentEmbedder embeds the documents it receives from DocumentSplitter and sends them to DocumentWriter, which writes them into the document store where the query pipeline can access them.

In the query pipeline, VoyageTextEmbedder embeds the query using the same model as VoyageDocumentEmbedder in the index. It then sends the embedded query to the retriever.
