TogetherAIGenerator

Generate text using large language models running on Together AI. This component provides a simple interface for text generation without the chat message format.

Key Features

  • Supports all text completion models available through Together AI.
  • Accepts all parameters supported by the Together AI API via generation_kwargs.
  • Supports optional system prompts for context or instructions.
  • Supports streaming responses token by token.
  • Returns generated text as strings along with metadata.
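To make the streaming feature concrete, here is a minimal sketch of a callback that collects chunks as they arrive. It assumes each chunk exposes its text in a `content` attribute, as Haystack's `StreamingChunk` does. The `FakeChunk` class and the `simulate_stream` helper are hypothetical stand-ins so the example runs without an API key.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class FakeChunk:
    """Hypothetical stand-in for Haystack's StreamingChunk (text lives in `content`)."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


class ChunkCollector:
    """Callable that stores every chunk's text so the full reply can be rebuilt."""

    def __init__(self) -> None:
        self.parts: List[str] = []

    def __call__(self, chunk: FakeChunk) -> None:
        # A real callback might print or forward the text instead of storing it.
        self.parts.append(chunk.content)

    @property
    def text(self) -> str:
        return "".join(self.parts)


def simulate_stream(tokens: List[str], callback: Callable[[FakeChunk], None]) -> None:
    """Hypothetical helper: calls the callback once per token, as a streaming client would."""
    for token in tokens:
        callback(FakeChunk(content=token))


collector = ChunkCollector()
simulate_stream(["Hello", ", ", "world", "!"], collector)
print(collector.text)
```

In a real pipeline, a callable with the same shape would be passed as `streaming_callback`, and the component would invoke it instead of `simulate_stream`.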

Configuration

  1. Drag the TogetherAIGenerator component onto the canvas from the Component Library.
  2. Click the component to open the configuration panel.
  3. On the General tab:
    • Connect Haystack Platform to your Together AI account on the Integrations page. For detailed instructions, see Use Together AI Models.
    • Select the model to use.
    • Optionally, set a system prompt.
  4. Go to the Advanced tab to configure generation_kwargs, timeout, and max_retries.

Connections

TogetherAIGenerator receives a rendered prompt string from PromptBuilder. It sends generated replies to AnswerBuilder or DeepsetAnswerBuilder.

Source Code

To check this component's source code, open generator.py in the Haystack Core Integrations repository.

Usage Examples

Basic Configuration

TogetherAIGenerator:
  type: haystack_integrations.components.generators.togetherai.generator.TogetherAIGenerator
  init_parameters:
    api_key:
      type: env_var
      env_vars:
        - TOGETHER_API_KEY
      strict: false
    model: meta-llama/Llama-3.3-70B-Instruct-Turbo
    api_base_url: https://api.together.xyz/v1
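
The `api_key` block tells the component to read the key from an environment variable at run time. The sketch below mimics that lookup, assuming that `strict: false` means a missing variable yields no value instead of raising an error. `resolve_env_secret` is a hypothetical helper written for illustration, not part of Haystack, and the variable names are made up so the example never touches a real key.

```python
import os
from typing import List, Optional


def resolve_env_secret(env_vars: List[str], strict: bool = False) -> Optional[str]:
    """Hypothetical helper: return the value of the first variable in env_vars that is set."""
    for name in env_vars:
        value = os.environ.get(name)
        if value is not None:
            return value
    if strict:
        # With strict lookup, a missing key is treated as a configuration error.
        raise ValueError(f"None of these environment variables are set: {env_vars}")
    return None


# Demo-only variable names (assumptions for this example).
os.environ["DEMO_TOGETHER_API_KEY"] = "demo-key"
os.environ.pop("DEMO_MISSING_KEY", None)

found = resolve_env_secret(["DEMO_TOGETHER_API_KEY"], strict=False)
missing = resolve_env_secret(["DEMO_MISSING_KEY"], strict=False)
print(found, missing)
```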

This is an example RAG pipeline with TogetherAIGenerator and DeepsetAnswerBuilder:

components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'Standard-Index-English'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20
      fuzziness: 0

  query_embedder:
    type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
    init_parameters:
      normalize_embeddings: true
      model: intfloat/e5-base-v2

  embedding_retriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'Standard-Index-English'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20

  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate

  ranker:
    type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
    init_parameters:
      model: intfloat/simlm-msmarco-reranker
      top_k: 8

  meta_field_grouping_ranker:
    type: haystack.components.rankers.meta_field_grouping_ranker.MetaFieldGroupingRanker
    init_parameters:
      group_by: file_id
      subgroup_by:
      sort_docs_by: split_id

  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      reference_pattern: acm

  PromptBuilder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: "You are a helpful assistant answering the user's questions based on the provided documents.\nDo not use your own knowledge.\n\nProvided documents:\n{% for document in documents %}\nDocument [{{ loop.index }}]:\n{{ document.content }}\n{% endfor %}\n\nQuestion: {{ query }}\nAnswer:"

  TogetherAIGenerator:
    type: haystack_integrations.components.generators.togetherai.generator.TogetherAIGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - TOGETHER_API_KEY
        strict: false
      model: meta-llama/Llama-3.3-70B-Instruct-Turbo
      streaming_callback:
      api_base_url: https://api.together.xyz/v1
      system_prompt:
      generation_kwargs:
      timeout:
      max_retries:

connections:
  - sender: bm25_retriever.documents
    receiver: document_joiner.documents
  - sender: query_embedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: embedding_retriever.documents
    receiver: document_joiner.documents
  - sender: document_joiner.documents
    receiver: ranker.documents
  - sender: ranker.documents
    receiver: meta_field_grouping_ranker.documents
  - sender: meta_field_grouping_ranker.documents
    receiver: answer_builder.documents
  - sender: meta_field_grouping_ranker.documents
    receiver: PromptBuilder.documents
  - sender: PromptBuilder.prompt
    receiver: TogetherAIGenerator.prompt
  - sender: TogetherAIGenerator.replies
    receiver: answer_builder.replies

inputs:
  query:
    - "bm25_retriever.query"
    - "query_embedder.text"
    - "ranker.query"
    - "answer_builder.query"
    - "PromptBuilder.query"
  filters:
    - "bm25_retriever.filters"
    - "embedding_retriever.filters"

outputs:
  documents: "meta_field_grouping_ranker.documents"
  answers: "answer_builder.answers"

max_runs_per_component: 100

metadata: {}

Parameters

Inputs

| Parameter | Type | Description |
| --- | --- | --- |
| prompt | str | The input prompt string for text generation. |
| system_prompt | Optional[str] | An optional system prompt to provide context or instructions for the generation. |
| streaming_callback | Optional[StreamingCallbackT] | A callback function called when a new token is received from the stream. |
| generation_kwargs | Optional[Dict[str, Any]] | Additional keyword arguments for text generation. These parameters override the parameters set in the pipeline configuration. |

Outputs

| Parameter | Type | Description |
| --- | --- | --- |
| replies | List[str] | A list of generated text completions as strings. |
| meta | List[Dict[str, Any]] | A list of metadata dictionaries containing information about each generation, including model name, finish reason, and token usage statistics. |
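
As an illustration of the output shape, the sketch below pairs each reply with its metadata entry. The `result` dictionary is hand-written sample data, and the metadata keys (`model`, `finish_reason`, `usage`) are assumptions based on the description above, so check a real response for the exact keys.

```python
from typing import Any, Dict, List, Tuple

# Hand-written sample mimicking the component's output; values are illustrative only.
result: Dict[str, List[Any]] = {
    "replies": ["Paris is the capital of France."],
    "meta": [
        {
            "model": "meta-llama/Llama-3.3-70B-Instruct-Turbo",
            "finish_reason": "stop",
            "usage": {"prompt_tokens": 12, "completion_tokens": 8, "total_tokens": 20},
        }
    ],
}


def summarize(output: Dict[str, List[Any]]) -> List[Tuple[str, str, int]]:
    """Pair each reply with its finish reason and total token count."""
    rows = []
    for reply, meta in zip(output["replies"], output["meta"]):
        usage = meta.get("usage", {})
        rows.append(
            (reply, meta.get("finish_reason", "unknown"), usage.get("total_tokens", 0))
        )
    return rows


summary = summarize(result)
for reply, finish_reason, total_tokens in summary:
    print(f"{finish_reason} ({total_tokens} tokens): {reply}")
```

Using `.get()` with defaults keeps the loop from failing if a given model omits one of the assumed keys.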

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | Secret | Secret.from_env_var('TOGETHER_API_KEY') | The Together AI API key. |
| model | str | meta-llama/Llama-3.3-70B-Instruct-Turbo | The name of the model to use. |
| api_base_url | Optional[str] | https://api.together.xyz/v1 | The base URL of the Together AI API. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function called when a new token is received from the stream. The callback function accepts StreamingChunk as an argument. |
| system_prompt | Optional[str] | None | The system prompt to use for text generation. If not provided, the system prompt is omitted and the default system prompt of the model is used. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Other parameters to use for the model. These parameters are sent directly to the Together AI endpoint. See Together AI documentation for more details. Supported parameters include: max_tokens (maximum number of tokens the output text can have), temperature (sampling temperature for creativity control), top_p (nucleus sampling probability mass), n (number of completions to generate for each prompt), stop (sequences after which the LLM should stop generating tokens), presence_penalty (penalty for tokens already present), frequency_penalty (penalty for frequently generated tokens), logit_bias (add logit bias to specific tokens). |
| timeout | Optional[float] | None | Timeout for Together AI client calls. If not set, it is inferred from the OPENAI_TIMEOUT environment variable or set to 30 seconds. |
| max_retries | Optional[int] | None | Maximum number of retries to contact Together AI if it returns an internal error. If not set, it is inferred from the OPENAI_MAX_RETRIES environment variable or set to 5. |
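
The fallback order described for `timeout` and `max_retries` can be sketched as follows. This illustrates the documented behavior (explicit value first, then the environment variable, then the default) and is not the component's actual code; `resolve_timeout` and `resolve_max_retries` are hypothetical names.

```python
import os
from typing import Optional


def resolve_timeout(timeout: Optional[float] = None) -> float:
    """An explicit value wins; otherwise OPENAI_TIMEOUT; otherwise 30."""
    if timeout is not None:
        return timeout
    return float(os.environ.get("OPENAI_TIMEOUT", "30.0"))


def resolve_max_retries(max_retries: Optional[int] = None) -> int:
    """An explicit value wins; otherwise OPENAI_MAX_RETRIES; otherwise 5."""
    if max_retries is not None:
        return max_retries
    return int(os.environ.get("OPENAI_MAX_RETRIES", "5"))


# Start from a clean environment so the defaults are visible.
os.environ.pop("OPENAI_TIMEOUT", None)
os.environ.pop("OPENAI_MAX_RETRIES", None)
default_timeout = resolve_timeout()
default_retries = resolve_max_retries()

# An environment variable is used only when no explicit value is given.
os.environ["OPENAI_TIMEOUT"] = "12.5"
env_timeout = resolve_timeout()
explicit_timeout = resolve_timeout(3.0)
os.environ.pop("OPENAI_TIMEOUT", None)

print(default_timeout, default_retries, env_timeout, explicit_timeout)
```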

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | str | | The input prompt string for text generation. |
| system_prompt | Optional[str] | None | An optional system prompt to provide context or instructions for the generation. If not provided, the system prompt set during initialization is used. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function called when a new token is received from the stream. If provided, this overrides the streaming_callback set during initialization. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for text generation. These parameters override the parameters passed during initialization. Supported parameters include temperature, max_tokens, top_p, and others. |
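
The override rule for `generation_kwargs` can be pictured as a dictionary merge in which run-time values take precedence. The snippet below is a sketch of that rule under the assumption that keys not supplied at run time keep their initialization values; `merge_generation_kwargs` is a hypothetical helper, and the parameter values are arbitrary.

```python
from typing import Any, Dict, Optional


def merge_generation_kwargs(
    init_kwargs: Optional[Dict[str, Any]],
    run_kwargs: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Combine both sets of kwargs; on a key clash the run-time value wins."""
    return {**(init_kwargs or {}), **(run_kwargs or {})}


# Values set when the component is configured in the pipeline.
init_kwargs = {"temperature": 0.7, "max_tokens": 256}
# Values passed at query time.
run_kwargs = {"temperature": 0.1, "top_p": 0.9}

merged = merge_generation_kwargs(init_kwargs, run_kwargs)
print(merged)
```

Here `temperature` takes the query-time value, `max_tokens` keeps its initialization value, and `top_p` is added for this run only.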