TogetherAIChatGenerator
Generates text using large language models hosted on Together AI.
Key Features
- Access to a wide range of open-source LLMs through Together AI's platform.
- Customizable generation with generation_kwargs for temperature, token limits, and more.
- Tool calling support for agentic workflows.
- Streaming support for real-time token delivery.
- Compatible with the Together AI chat completion API.
Configuration
To use this component, first connect Haystack Platform to Together AI. For detailed instructions, see Use Together AI Models.
- Drag the TogetherAIChatGenerator component onto the canvas from the Component Library.
- Click the component to open the configuration panel.
- On the General tab:
  - Enter the name of the Together AI model to use (for example, meta-llama/Llama-3.3-70B-Instruct-Turbo). For supported models, see Together AI documentation.
- Go to the Advanced tab to configure the API key, API base URL, generation parameters, and streaming callback.
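The steps above produce a component definition like the following. This is a minimal sketch drawn from the usage example below; the model name is an example, and the API key is read from the TOGETHER_API_KEY environment variable:

```yaml
TogetherAIChatGenerator:
  type: haystack_integrations.components.generators.togetherai.chat.chat_generator.TogetherAIChatGenerator
  init_parameters:
    model: meta-llama/Llama-3.3-70B-Instruct-Turbo
    api_key:
      type: env_var
      env_vars:
        - TOGETHER_API_KEY
      strict: false
    api_base_url: https://api.together.xyz/v1
```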
Connections
TogetherAIChatGenerator accepts a list of ChatMessage objects as input. It outputs a list of ChatMessage objects containing the generated responses.
Connect a ChatPromptBuilder to its messages input. Connect its replies output through an OutputAdapter to a DeepsetAnswerBuilder.
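In pipeline YAML, this wiring looks like the fragment below (component names match the usage example that follows):

```yaml
connections:
  - sender: ChatPromptBuilder.prompt
    receiver: TogetherAIChatGenerator.messages
  - sender: TogetherAIChatGenerator.replies
    receiver: OutputAdapter.replies
  - sender: OutputAdapter.output
    receiver: answer_builder.replies
```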
Usage Example
This is an example RAG pipeline with TogetherAIChatGenerator and DeepsetAnswerBuilder connected through OutputAdapter:
components:
bm25_retriever:
type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
init_parameters:
document_store:
type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
init_parameters:
hosts:
index: 'Standard-Index-English'
max_chunk_bytes: 104857600
embedding_dim: 768
return_embedding: false
method:
mappings:
settings:
create_index: true
http_auth:
use_ssl:
verify_certs:
timeout:
top_k: 20
fuzziness: 0
query_embedder:
type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
init_parameters:
normalize_embeddings: true
model: intfloat/e5-base-v2
embedding_retriever:
type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
init_parameters:
document_store:
type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
init_parameters:
hosts:
index: 'Standard-Index-English'
max_chunk_bytes: 104857600
embedding_dim: 768
return_embedding: false
method:
mappings:
settings:
create_index: true
http_auth:
use_ssl:
verify_certs:
timeout:
top_k: 20
document_joiner:
type: haystack.components.joiners.document_joiner.DocumentJoiner
init_parameters:
join_mode: concatenate
ranker:
type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
init_parameters:
model: intfloat/simlm-msmarco-reranker
top_k: 8
meta_field_grouping_ranker:
type: haystack.components.rankers.meta_field_grouping_ranker.MetaFieldGroupingRanker
init_parameters:
group_by: file_id
subgroup_by:
sort_docs_by: split_id
answer_builder:
type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
init_parameters:
reference_pattern: acm
ChatPromptBuilder:
type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
init_parameters:
template:
- _content:
- text: "You are a helpful assistant answering the user's questions based on the provided documents.\nDo not use your own knowledge.\n"
_role: system
- _content:
- text: "Provided documents:\n{% for document in documents %}\nDocument [{{ loop.index }}] :\n{{ document.content }}\n{% endfor %}\n\nQuestion: {{ query }}\n"
_role: user
required_variables:
variables:
OutputAdapter:
type: haystack.components.converters.output_adapter.OutputAdapter
init_parameters:
template: '{{ replies[0] }}'
output_type: List[str]
custom_filters:
unsafe: false
TogetherAIChatGenerator:
type: haystack_integrations.components.generators.togetherai.chat.chat_generator.TogetherAIChatGenerator
init_parameters:
api_key:
type: env_var
env_vars:
- TOGETHER_API_KEY
strict: false
model: meta-llama/Llama-3.3-70B-Instruct-Turbo
streaming_callback:
api_base_url: https://api.together.xyz/v1
generation_kwargs:
tools:
timeout:
max_retries:
http_client_kwargs:
connections:
- sender: bm25_retriever.documents
receiver: document_joiner.documents
- sender: query_embedder.embedding
receiver: embedding_retriever.query_embedding
- sender: embedding_retriever.documents
receiver: document_joiner.documents
- sender: document_joiner.documents
receiver: ranker.documents
- sender: ranker.documents
receiver: meta_field_grouping_ranker.documents
- sender: meta_field_grouping_ranker.documents
receiver: answer_builder.documents
- sender: meta_field_grouping_ranker.documents
receiver: ChatPromptBuilder.documents
- sender: OutputAdapter.output
receiver: answer_builder.replies
- sender: ChatPromptBuilder.prompt
receiver: TogetherAIChatGenerator.messages
- sender: TogetherAIChatGenerator.replies
receiver: OutputAdapter.replies
inputs:
query:
- "bm25_retriever.query"
- "query_embedder.text"
- "ranker.query"
- "answer_builder.query"
- "ChatPromptBuilder.query"
filters:
- "bm25_retriever.filters"
- "embedding_retriever.filters"
outputs:
documents: "meta_field_grouping_ranker.documents"
answers: "answer_builder.answers"
max_runs_per_component: 100
metadata: {}
Parameters
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| messages | List[ChatMessage] | | A list of ChatMessage instances representing the input messages. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function called when the LLM receives a new token from the stream. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for text generation. These parameters override the parameters in pipeline configuration. |
| tools | Optional[Union[List[Tool], Toolset]] | None | A list of tools or a Toolset for which the model can prepare calls. If set, it overrides the tools parameter set during component initialization. |
| tools_strict | Optional[bool] | None | Whether to enable strict schema adherence for tool calls. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| replies | List[ChatMessage] | A list containing the generated responses as ChatMessage instances. |
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | Secret | Secret.from_env_var('TOGETHER_API_KEY') | The Together AI API key. |
| model | str | meta-llama/Llama-3.3-70B-Instruct-Turbo | The name of the Together AI chat completion model to use. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function called when a new token is received from the stream. The callback function accepts StreamingChunk as an argument. |
| api_base_url | Optional[str] | https://api.together.xyz/v1 | The Together AI API base URL. For more details, see Together AI documentation. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Other parameters to use for the model. These parameters are sent directly to the Together AI endpoint. See Together AI API documentation for more details. Supported parameters include: max_tokens (maximum number of tokens the output text can have), temperature (sampling temperature for creativity control), top_p (nucleus sampling probability mass), stream (whether to stream back partial progress), safe_prompt (whether to inject a safety prompt before all conversations), random_seed (the seed to use for random sampling), response_format (a JSON schema or Pydantic model that enforces the structure of the model's response). |
| tools | Optional[Union[List[Tool], Toolset]] | None | A list of tools or a Toolset for which the model can prepare calls. Each tool should have a unique name. |
| timeout | Optional[float] | None | The timeout for the Together AI API call. |
| max_retries | Optional[int] | None | Maximum number of retries to contact Together AI after an internal error. If not set, it defaults to either the OPENAI_MAX_RETRIES environment variable or five. |
| http_client_kwargs | Optional[Dict[str, Any]] | None | A dictionary of keyword arguments to configure a custom httpx.Client or httpx.AsyncClient. For more information, see the HTTPX documentation. |
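As an illustration of generation_kwargs, the fragment below sets a few commonly used parameters from the list above. The values are examples only; adjust them for your model and use case:

```yaml
TogetherAIChatGenerator:
  init_parameters:
    # Passed through to the Together AI endpoint as-is
    generation_kwargs:
      max_tokens: 512
      temperature: 0.2
      top_p: 0.9
```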
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| messages | List[ChatMessage] | | A list of ChatMessage instances representing the input messages. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function called when a new token is received from the stream. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for text generation. These parameters override the parameters in pipeline configuration. |
| tools | Optional[Union[List[Tool], Toolset]] | None | A list of tools or a Toolset for which the model can prepare calls. If set, it overrides the tools parameter set during component initialization. |
| tools_strict | Optional[bool] | None | Whether to enable strict schema adherence for tool calls. |
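For example, a query-time override of generation_kwargs could look like the sketch below. This assumes the request-body shape described in Modify Pipeline Parameters at Query Time (a params object keyed by component name); the query text and values are illustrative:

```json
{
  "queries": ["What is our refund policy?"],
  "params": {
    "TogetherAIChatGenerator": {
      "generation_kwargs": {
        "temperature": 0.1,
        "max_tokens": 256
      }
    }
  }
}
```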