DeepsetDeepLDocumentTranslator
Translate the content of your documents using the DeepL Python SDK.
Basic Information
- Type: deepset_cloud_custom_nodes.converters.deepl_document_translator.DeepsetDeepLDocumentTranslator
- Components it can connect with:
  - Converters: You can use DeepsetDeepLDocumentTranslator after converters to translate the documents that converters return.
  - Retrievers: You can use this component to translate documents fetched by a retriever.
  - PromptBuilder: DeepsetDeepLDocumentTranslator can send the translated documents to a PromptBuilder, which then includes them in the prompt for the LLM.
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | List of Haystack documents to be translated. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | A list of translated documents. |
Overview
DeepsetDeepLDocumentTranslator uses the DeepL Python library to translate documents into the languages you specify. For a list of supported languages, see the DeepL documentation. You can translate one set of documents into multiple languages at once by passing the language codes in the target_languages parameter.
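The multi-language behavior described above can be sketched in plain Python. This is a minimal illustration, not the component's implementation: `Doc`, `translate_documents`, and `fake_translate` are hypothetical names, and the stub stands in for a real DeepL call (with the DeepL Python SDK, that would typically be `deepl.Translator(auth_key).translate_text(text, target_lang=code)`). Storing the language code in the document metadata is also an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union


@dataclass
class Doc:
    """Hypothetical stand-in for a Haystack Document."""
    content: str
    meta: Dict[str, str] = field(default_factory=dict)


def translate_documents(
    documents: List[Doc],
    target_languages: Union[str, List[str]],
    translate: Callable[[str, str], str],
) -> List[Doc]:
    """Return one translated copy of every document per target language."""
    # Accept a single code ("DE") or a list of codes (["DE", "FR"]).
    codes = [target_languages] if isinstance(target_languages, str) else list(target_languages)
    translated = []
    for doc in documents:
        for code in codes:
            # Assumption: the target language is recorded in the metadata.
            meta = {**doc.meta, "language": code}
            translated.append(Doc(content=translate(doc.content, code), meta=meta))
    return translated


def fake_translate(text: str, target_lang: str) -> str:
    """Stub that tags the text instead of calling DeepL."""
    return f"[{target_lang}] {text}"


docs = [Doc("Hello"), Doc("Goodbye")]
result = translate_documents(docs, ["DE", "FR"], fake_translate)
print(len(result))        # 2 documents x 2 languages = 4
print(result[0].content)  # [DE] Hello
```

Passing the translation function in as an argument keeps the sketch runnable without a DeepL account. To try it against the real service, swap the stub for a function that calls the SDK.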
Authorization
You must have an active DeepL account and a DeepL API key to use this component. Connect DeepL to deepset on the Integrations page:
Connection Instructions
- Click your profile icon in the top right corner and choose Integrations.

- Click Connect next to the provider.
- Enter your API key and submit it.
Once deepset is connected, you can use DeepsetDeepLDocumentTranslator without passing the API key in the pipeline YAML.
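When the key is instead supplied through an environment variable (the api_key parameter defaults to reading DEEPL_API_KEY), the lookup behaves roughly as sketched below. `resolve_env_secret` is a hypothetical helper written for illustration; it mirrors the shape of an env-var secret entry in pipeline YAML and is not Haystack's actual implementation.

```python
import os
from typing import Any, Dict, Optional


def resolve_env_secret(secret: Dict[str, Any]) -> Optional[str]:
    """Resolve a serialized env-var secret to its value (illustrative only)."""
    if secret.get("type") != "env_var":
        raise ValueError("Only env_var secrets are handled in this sketch")
    # Return the first environment variable that is set and non-empty.
    for name in secret["env_vars"]:
        value = os.environ.get(name)
        if value:
            return value
    # strict secrets fail loudly; non-strict ones tolerate a missing key.
    if secret.get("strict", True):
        raise ValueError(f"None of {secret['env_vars']} is set")
    return None


api_key_config = {"type": "env_var", "env_vars": ["DEEPL_API_KEY"], "strict": False}

os.environ.pop("DEEPL_API_KEY", None)
print(resolve_env_secret(api_key_config))  # None

os.environ["DEEPL_API_KEY"] = "example-key"  # placeholder, not a real key
print(resolve_env_secret(api_key_config))  # example-key
```

With "strict": false, a missing variable does not raise an error, which suits workspaces where the Integrations connection supplies the key.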
Usage Example
Initializing the Component
components:
  DeepsetDeepLDocumentTranslator:
    type: converters.deepl_translator.DeepsetDeepLDocumentTranslator
    init_parameters:
Using the Component in a Pipeline
This is an example of a query pipeline where DeepsetDeepLDocumentTranslator receives documents from a Ranker and translates them into German. The output of the pipeline is the translated documents.

To specify the languages you want DeepL to translate into, list their codes in the target_languages parameter, for example ["DE"] for German.

Here's the pipeline YAML:
components:
  query_embedder:
    type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
    init_parameters:
      normalize_embeddings: true
      model: intfloat/multilingual-e5-base
  embedding_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        init_parameters:
          embedding_dim: 768
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
      top_k: 20 # The number of results to return
  ranker:
    type: haystack.components.rankers.transformers_similarity.TransformersSimilarityRanker
    init_parameters:
      model: "jeffwan/mmarco-mMiniLMv2-L12-H384-v1"
      top_k: 20
      model_kwargs:
        torch_dtype: "torch.float16"
  deepl_translator:
    type: deepset_cloud_custom_nodes.converters.deepl_translator.DeepsetDeepLDocumentTranslator
    # For more information about DeepL supported languages, see https://developers.deepl.com/docs/resources/supported-languages
    init_parameters:
      api_key: {"type": "env_var", "env_vars": ["DEEPL_API_KEY"], "strict": false}
      target_languages: ["DE"] # Translate documents into German
      source_language: # Auto-detects the source language when set to "null"
      preserve_formatting: true # Prevent automatic correction of formatting
      include_score: true # Display relevance score
connections: # Defines how the components are connected
  - sender: query_embedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: embedding_retriever.documents
    receiver: ranker.documents
  - sender: ranker.documents
    receiver: deepl_translator.documents
inputs: # Define the inputs for your pipeline
  query: # These components will receive the query as input
    - "query_embedder.text"
    - "ranker.query"
  filters: # These components will receive a potential query filter as input
    - "embedding_retriever.filters"
outputs: # Defines the output of your pipeline
  documents: "deepl_translator.documents" # The output of the pipeline is the retrieved documents translated into German.
max_runs_per_component: 100
metadata: {}
Parameters
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| target_languages | Union[List[str], str] | | The target language code or a list of target language codes. For a list of target languages, refer to target languages. If multiple languages are specified, a translated document is returned for each language. |
| source_language | Optional[str] | None | The source language code. If set to None, the source language is auto-detected. For a list of source languages, refer to source languages. |
| api_key | Secret | Secret.from_env_var('DEEPL_API_KEY') | DeepL API key. |
| preserve_formatting | Optional[bool] | None | Controls automatic formatting correction. If set to None, it acts as True to prevent automatic correction of formatting. |
| split_sentences | Literal[0, 1, 'nonewlines', None] | None | Controls whether and how the translation engine splits the input into sentences before translation. Splitting is enabled by default. Possible values are: - 0: OFF. No splitting at all; the whole input is treated as one sentence. Use this option if the input text is already split into sentences, to prevent the engine from splitting sentences unintentionally. - 1: ALL (default). Splits on punctuation and on newlines. - 'nonewlines': Splits on punctuation only, ignoring newlines. |
| context | Optional[str] | None | Makes it possible to include additional context that can influence a translation without being translated itself. Providing additional context can potentially improve translation quality, especially for short, low-context source texts such as product names on an e-commerce website, article headlines on a news website, or UI elements. For more information and examples, refer to the API documentation. |
| formality | Literal[None, 'less', 'default', 'more', 'prefer_more', 'prefer_less'] | None | Controls whether translations should lean toward informal or formal language. This feature currently works only for the following languages: DE (German), FR (French), IT (Italian), ES (Spanish), NL (Dutch), PL (Polish), PT-BR and PT-PT (Portuguese), JA (Japanese), and RU (Russian). The available options are: - 'less': Translate using informal language. - 'default': Translate using the default formality. - 'more': Translate using formal language. - 'prefer_more': Translate using formal language if the target language supports formality, otherwise use default formality. - 'prefer_less': Translate using informal language if the target language supports formality, otherwise use default formality. |
| max_retries | Optional[int] | 5 | Maximum number of network retries after a failed HTTP request. |
| glossary | Union[str, None] | None | (Optional) ID of the glossary to use for translation. The glossary must match the specified source and target languages. |
| tag_handling | Literal[None, 'xml', 'html'] | None | (Optional) Type of tags to parse before translation. Only "xml" and "html" are currently available. |
| outline_detection | Optional[bool] | None | (Optional) Set to False to disable automatic tag detection. |
| non_splitting_tags | Union[str, List[str], None] | None | (Optional) XML tags that should not split a sentence. |
| splitting_tags | Union[str, List[str], None] | None | (Optional) XML tags that should split a sentence. |
| ignore_tags | Union[str, List[str], None] | None | (Optional) XML tags containing text that should not be translated. |
| include_score | bool | True | Whether to include the original document score in the translated document. |
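
The restricted values listed in the table above can be checked before a pipeline is deployed. The following is a minimal sketch, assuming you hold the init parameters as a Python dictionary. `check_init_parameters` and `ALLOWED` are hypothetical names, and treating target_languages as required is an assumption based on the table listing no default for it.

```python
from typing import Any, Dict, List

# Allowed values copied from the Init Parameters table.
ALLOWED = {
    "split_sentences": (0, 1, "nonewlines", None),
    "formality": (None, "less", "default", "more", "prefer_more", "prefer_less"),
    "tag_handling": (None, "xml", "html"),
}


def check_init_parameters(params: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the values look valid."""
    problems = []
    targets = params.get("target_languages")
    if not targets:
        # Assumption: no default is documented, so treat it as required.
        problems.append("target_languages is required")
    elif not isinstance(targets, (str, list)):
        problems.append("target_languages must be a string or a list of strings")
    for name, allowed in ALLOWED.items():
        value = params.get(name)  # a missing key is treated as None
        if value not in allowed:
            problems.append(f"{name}={value!r} is not one of the documented values")
    return problems


valid = check_init_parameters({"target_languages": ["DE"], "formality": "more"})
print(valid)  # []

invalid = check_init_parameters({"target_languages": "DE", "formality": "formal"})
print(invalid)  # one problem, reported for formality
```

A check like this only catches values outside the documented literals. Whether a given language code or glossary ID is accepted is still decided by DeepL at translation time.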
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | List of Haystack documents to be translated. |