Use Together AI Models

Use models, such as DeepSeek-R1, hosted on Together AI.

You can use models hosted on Together AI in your deepset pipelines. For a complete list of available models, see the Inference section of the Together AI documentation.

Prerequisites

You need an API key from an active Together AI account. For details on obtaining it, see Together AI Quickstart.

Use Models Hosted on Together AI

First, connect deepset Cloud to Together AI through the Connections page:

  1. Click your initials in the top right corner and select Connections.

  2. Click Connect next to Together AI.

  3. Enter your API key and submit it.

Then, add the DeepsetTogetherAIGenerator component to your pipeline and set its model parameter to the model you want to use.
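For example, a minimal configuration of the generator component could look like this (a sketch only; the component name is up to you, and the model shown here is just an example you can replace with any model hosted on Together AI):

components:
  llm:
    type: deepset_cloud_custom_nodes.generators.togetherai.DeepsetTogetherAIGenerator
    init_parameters:
      api_key: {"type": "env_var", "env_vars": ["TOGETHERAI_API_KEY"], "strict": false}
      model: deepseek-ai/DeepSeek-R1

The full pipeline example below shows this component in context.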

Usage Examples

This is an example of a RAG pipeline that uses the DeepSeek-R1 model hosted on Together AI:

# If you need help with the YAML format, have a look at https://docs.cloud.deepset.ai/v2.0/docs/create-a-pipeline#create-a-pipeline-using-pipeline-editor.
# This section defines components that you want to use in your pipelines. Each component must have a name and a type. You can also set the component's parameters here.
# The name is up to you, you can give your component a friendly name. You then use components' names when specifying the connections in the pipeline.
# Type is the class path of the component. You can check the type on the component's documentation page.
components:
  bm25_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          embedding_dim: 768
      top_k: 20 # The number of results to return
      fuzziness: 0

  query_embedder:
    type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
    init_parameters:
      normalize_embeddings: true
      model: intfloat/e5-base-v2

  embedding_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          embedding_dim: 768
      top_k: 20 # The number of results to return

  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate

  ranker:
    type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
    init_parameters:
      model: intfloat/simlm-msmarco-reranker
      top_k: 8

  meta_field_grouping_ranker:
    type: haystack.components.rankers.meta_field_grouping_ranker.MetaFieldGroupingRanker
    init_parameters:
      group_by: file_id
      subgroup_by:
      sort_docs_by: split_id

  prompt_builder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: |-
        You are a technical expert.
        You answer questions truthfully based on provided documents.
        Ignore typing errors in the question.
        For each document check whether it is related to the question.
        Only use documents that are related to the question to answer it.
        Ignore documents that are not related to the question.
        If the answer exists in several documents, summarize them.
        Only answer based on the documents provided. Don't make things up.
        Just output the structured, informative and precise answer and nothing else.
        If the documents can't answer the question, say so.
        Always use references in the form [NUMBER OF DOCUMENT] when using information from a document, e.g. [3] for Document [3].
        Never name the documents, only enter a number in square brackets as a reference.
        The reference must only refer to the number that comes in square brackets after the document.
        Otherwise, do not use brackets in your answer and reference ONLY the number of the document without mentioning the word document.

        These are the documents:
        {%- if documents|length > 0 %}
        {%- for document in documents %}
        Document [{{ loop.index }}] :
        Name of Source File: {{ document.meta.file_name }}
        {{ document.content }}
        {% endfor -%}
        {%- else %}
        No relevant documents found.
        Respond with "Sorry, no matching documents were found, please adjust the filters or try a different question."
        {% endif %}

        Question: {{ question }}
        Answer:

      required_variables: "*"

  llm:
    type: deepset_cloud_custom_nodes.generators.togetherai.DeepsetTogetherAIGenerator
    init_parameters:
      api_key: {"type": "env_var", "env_vars": ["TOGETHERAI_API_KEY"], "strict": false}
      model: deepseek-ai/DeepSeek-R1
      generation_kwargs:
        max_tokens: 650
        temperature: 0
        seed: 0

  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      reference_pattern: acm

connections:  # Defines how the components are connected
- sender: bm25_retriever.documents
  receiver: document_joiner.documents
- sender: query_embedder.embedding
  receiver: embedding_retriever.query_embedding
- sender: embedding_retriever.documents
  receiver: document_joiner.documents
- sender: document_joiner.documents
  receiver: ranker.documents
- sender: ranker.documents
  receiver: meta_field_grouping_ranker.documents
- sender: meta_field_grouping_ranker.documents
  receiver: prompt_builder.documents
- sender: meta_field_grouping_ranker.documents
  receiver: answer_builder.documents
- sender: prompt_builder.prompt
  receiver: llm.prompt
- sender: prompt_builder.prompt
  receiver: answer_builder.prompt
- sender: llm.replies
  receiver: answer_builder.replies

inputs:  # Define the inputs for your pipeline
  query:  # These components will receive the query as input
  - "bm25_retriever.query"
  - "query_embedder.text"
  - "ranker.query"
  - "prompt_builder.question"
  - "answer_builder.query"
  filters:  # These components will receive a potential query filter as input
  - "bm25_retriever.filters"
  - "embedding_retriever.filters"

outputs:  # Defines the output of your pipeline
  documents: "meta_field_grouping_ranker.documents"  # The output of the pipeline is the retrieved documents
  answers: "answer_builder.answers"  # The output of the pipeline is the generated answers

max_runs_per_component: 100

metadata: {}

Here is how to connect the components in Pipeline Builder. PromptBuilder is connected to DeepsetTogetherAIGenerator, which then sends the generated replies to DeepsetAnswerBuilder.

In a query pipeline, the generator receives the prompt from the prompt builder and sends the generated replies to an answer builder, which adds references to the generated answers.
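In YAML, these connections correspond to the following entries from the example above:

connections:
- sender: prompt_builder.prompt
  receiver: llm.prompt
- sender: prompt_builder.prompt
  receiver: answer_builder.prompt
- sender: llm.replies
  receiver: answer_builder.replies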