Trace with Langfuse
Monitor your deepset pipelines with Langfuse.
About This Task
Langfuse is a powerful tool for observability and tracing in complex AI workflows. It captures detailed traces of your pipeline operations, making it easier to debug, monitor, and improve your applications.
The deepset AI Platform integrates with Langfuse through the LangfuseConnector component. To get started, connect the platform to Langfuse using your Langfuse API keys. Then add LangfuseConnector to any pipeline you want to trace; it collects traces and sends them to Langfuse.
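As a preview, this is a minimal sketch of what the connector definition looks like in a pipeline's YAML configuration. The component name tracer and the pipeline name my-rag-pipeline are placeholders; a complete pipeline appears in the example at the end of this page.

components:
  tracer: # Placeholder component name; any name works
    type: haystack_integrations.components.connectors.langfuse.langfuse_connector.LangfuseConnector
    init_parameters:
      name: my-rag-pipeline # The label your traces appear under in Langfuse
# Note: the connector has no entries under connections. It traces the
# pipeline without being wired to any other component.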
Prerequisites
You need the public and secret API keys for your Langfuse project. You can find them in your Langfuse project settings. If you no longer have the secret key, create a new key pair: the secret key can be viewed and copied only once, when it's created.
Use Langfuse
1. Connect deepset AI Platform to Langfuse by creating two Langfuse secrets (the connector reads them as environment variables; see the excerpt after these steps):
   1. In deepset AI Platform, click your initials in the top right corner and choose Secrets.
   2. Click Add New Secret.
   3. Copy the secret key from your Langfuse project and paste it into the Secret field.
   4. Type LANGFUSE_SECRET_KEY as the secret name and save the secret.
   5. Click Add New Secret.
   6. Copy the public key from your Langfuse project and paste it into the Secret field.
   7. Type LANGFUSE_PUBLIC_KEY as the secret name and save the secret.
2. Add the LangfuseConnector component to the pipeline you want to trace, but don't connect it to any other component.
3. Set the name parameter in LangfuseConnector to your pipeline name and save the pipeline.

When you run a query with this pipeline, its traces appear in your Langfuse project under Traces.
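The two secrets you created are exposed to the pipeline as environment variables of the same name. This excerpt from the example below shows how LangfuseConnector references them in its init_parameters, using the env_var secret format:

init_parameters:
  public_key:
    type: env_var
    env_vars:
      - LANGFUSE_PUBLIC_KEY # Must match the secret name you saved
    strict: false
  secret_key:
    type: env_var
    env_vars:
      - LANGFUSE_SECRET_KEY # Must match the secret name you saved
    strict: false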

Example
This is an example of a RAG pipeline with Langfuse tracing enabled. LangfuseConnector is part of the pipeline but isn't connected to any other component:

YAML configuration
components:
  bm25_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'Standard-Index-English'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20 # The number of results to return
      fuzziness: 0
  query_embedder:
    type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
    init_parameters:
      normalize_embeddings: true
      model: intfloat/e5-base-v2
  embedding_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'Standard-Index-English'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20 # The number of results to return
  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate
  ranker:
    type: deepset_cloud_custom_nodes.rankers.nvidia.ranker.DeepsetNvidiaRanker
    init_parameters:
      model: intfloat/simlm-msmarco-reranker
      top_k: 8
  meta_field_grouping_ranker:
    type: haystack.components.rankers.meta_field_grouping_ranker.MetaFieldGroupingRanker
    init_parameters:
      group_by: file_id
      subgroup_by:
      sort_docs_by: split_id
  prompt_builder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: |-
        You are a technical expert.
        You answer questions truthfully based on provided documents.
        If the answer exists in several documents, summarize them.
        Ignore documents that don't contain the answer to the question.
        Only answer based on the documents provided. Don't make things up.
        If no information related to the question can be found in the document, say so.
        Always use references in the form [NUMBER OF DOCUMENT] when using information from a document, e.g. [3] for Document [3].
        Never name the documents, only enter a number in square brackets as a reference.
        The reference must only refer to the number that comes in square brackets after the document.
        Otherwise, do not use brackets in your answer and reference ONLY the number of the document without mentioning the word document.
        These are the documents:
        {%- if documents|length > 0 %}
        {%- for document in documents %}
        Document [{{ loop.index }}]:
        Name of Source File: {{ document.meta.file_name }}
        {{ document.content }}
        {% endfor -%}
        {%- else %}
        No relevant documents found.
        Respond with "Sorry, no matching documents were found, please adjust the filters or try a different question."
        {% endif %}
        Question: {{ question }}
        Answer:
      required_variables: "*"
  llm:
    type: deepset_cloud_custom_nodes.generators.deepset_amazon_bedrock_generator.DeepsetAmazonBedrockGenerator
    init_parameters:
      model: anthropic.claude-3-5-sonnet-20241022-v2:0
      aws_region_name: us-west-2
      max_length: 650
      temperature: 0
  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      reference_pattern: acm
  LangfuseConnector:
    type: haystack_integrations.components.connectors.langfuse.langfuse_connector.LangfuseConnector
    init_parameters:
      name: RAG-QA-Claude-3.5-Sonnet-en
      public: false
      public_key:
        type: env_var
        env_vars:
          - LANGFUSE_PUBLIC_KEY
        strict: false
      secret_key:
        type: env_var
        env_vars:
          - LANGFUSE_SECRET_KEY
        strict: false
      httpx_client:
      span_handler:
connections: # Defines how the components are connected
  - sender: bm25_retriever.documents
    receiver: document_joiner.documents
  - sender: query_embedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: embedding_retriever.documents
    receiver: document_joiner.documents
  - sender: document_joiner.documents
    receiver: ranker.documents
  - sender: ranker.documents
    receiver: meta_field_grouping_ranker.documents
  - sender: meta_field_grouping_ranker.documents
    receiver: prompt_builder.documents
  - sender: meta_field_grouping_ranker.documents
    receiver: answer_builder.documents
  - sender: prompt_builder.prompt
    receiver: llm.prompt
  - sender: prompt_builder.prompt
    receiver: answer_builder.prompt
  - sender: llm.replies
    receiver: answer_builder.replies
inputs: # Define the inputs for your pipeline
  query: # These components will receive the query as input
    - "bm25_retriever.query"
    - "query_embedder.text"
    - "ranker.query"
    - "prompt_builder.question"
    - "answer_builder.query"
  filters: # These components will receive a potential query filter as input
    - "bm25_retriever.filters"
    - "embedding_retriever.filters"
outputs: # Defines the output of your pipeline
  documents: "meta_field_grouping_ranker.documents" # The output of the pipeline is the retrieved documents
  answers: "answer_builder.answers" # The output of the pipeline is the generated answers
max_runs_per_component: 100
metadata: {}