Use DeepL Translation Services
Translate your documents using DeepL services.
Prerequisites
You need an active DeepL account and a DeepL API key. For guidance on how to obtain it, see API Key for DeepL's API.
Use DeepL
First, connect deepset Cloud to DeepL through the Connections page:
- Click your initials in the top right corner and select Connections.
- Click Connect next to DeepL.
- Enter your DeepL API key and submit it.
Then, add the DeepsetDeepLDocumentTranslator component to your query pipeline.
Usage Examples
This example shows how to use DeepL in your pipeline. Connect the DeepL translator's input to a component that outputs documents, such as a Ranker. Then, connect the translated documents to the Output component so that the pipeline returns the translated documents as the answer.
Configure the target languages as a list of DeepL language codes, for example ["DE"] for German.
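Because target_languages is a list, it can presumably hold more than one language code. The fragment below is a minimal sketch of the translator's init_parameters, assuming the component accepts several codes at once. "FR" (French) is added here purely for illustration and does not appear in the original example. Check the DeepL supported-languages page for valid target codes.

```yaml
deepl_translator:
  type: deepset_cloud_custom_nodes.converters.deepl_document_translator.DeepsetDeepLDocumentTranslator
  init_parameters:
    api_key: {"type": "env_var", "env_vars": ["DEEPL_API_KEY"], "strict": False}
    # Assumption: more than one target language can be listed at once
    target_languages: ["DE", "FR"] # German and French
    source_language: null # Auto-detects the source language
    preserve_formatting: true
```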
This is the YAML configuration:
components:
  query_embedder:
    type: haystack.components.embedders.sentence_transformers_text_embedder.SentenceTransformersTextEmbedder
    init_parameters:
      model: "intfloat/multilingual-e5-base"
      device: null

  embedding_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        init_parameters:
          use_ssl: True
          verify_certs: False
          http_auth:
            - "${OPENSEARCH_USER}"
            - "${OPENSEARCH_PASSWORD}"
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
      top_k: 20 # The number of results to return

  ranker:
    type: haystack.components.rankers.transformers_similarity.TransformersSimilarityRanker
    init_parameters:
      model: "jeffwan/mmarco-mMiniLMv2-L12-H384-v1"
      top_k: 20
      device: null
      model_kwargs:
        torch_dtype: "torch.float16"

  deepl_translator:
    type: deepset_cloud_custom_nodes.converters.deepl_document_translator.DeepsetDeepLDocumentTranslator
    # For more information about DeepL supported languages, see https://developers.deepl.com/docs/resources/supported-languages
    init_parameters:
      api_key: {"type": "env_var", "env_vars": ["DEEPL_API_KEY"], "strict": False}
      target_languages: ["DE"] # Translate documents into German
      source_language: null # Auto-detects the source language when set to "null"
      preserve_formatting: true # Prevent automatic correction of formatting

connections: # Defines how the components are connected
  - sender: query_embedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: embedding_retriever.documents
    receiver: ranker.documents
  - sender: ranker.documents
    receiver: deepl_translator.documents

max_loops_allowed: 100

inputs: # Define the inputs for your pipeline
  query: # These components will receive the query as input
    - "query_embedder.text"
    - "ranker.query"
  filters: # These components will receive a potential query filter as input
    - "embedding_retriever.filters"

outputs: # Defines the output of your pipeline
  documents: "deepl_translator.translated_documents" # The output of the pipeline is the retrieved documents translated into German.
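The configuration above leaves source_language set to null, so the source language is detected automatically. If all your documents share one known language, you may prefer to set it explicitly. The fragment below is a sketch of that variation, assuming source_language accepts a DeepL source language code as a string. The value "EN" is an illustrative choice and is not part of the original example.

```yaml
deepl_translator:
  type: deepset_cloud_custom_nodes.converters.deepl_document_translator.DeepsetDeepLDocumentTranslator
  init_parameters:
    api_key: {"type": "env_var", "env_vars": ["DEEPL_API_KEY"], "strict": False}
    target_languages: ["DE"] # Translate documents into German
    # Assumption: a fixed source language code turns off auto-detection
    source_language: "EN"
    preserve_formatting: true
```

Only the translator's init_parameters change. The connections, inputs, and outputs stay the same as in the full pipeline.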