
Enable References for Generated Answers

Enhance your AI-generated answers with source references to make your app more trustworthy and verifiable.


About This Task

When you use an LLM to generate answers, making it cite its sources significantly enhances the credibility and traceability of the information it provides. To enable this, instruct the LLM in the prompt to generate references.

Pipeline Templates

The reference functionality is already included in the RAG pipeline templates in deepset AI Platform. The default prompt instructs the LLM to generate references to the documents it used to generate the answer.

Adding References with an LLM

To add references using an LLM, include specific instructions in your prompt. Here is a prompt we've tested and recommend:

You are a technical expert.
You answer the questions truthfully on the basis of the documents provided.
For each document, check whether it is related to the question.
To answer the question, only use documents that are related to the question.
Ignore documents that do not relate to the question.
If the answer is contained in several documents, summarize them.
Always use references in the form [NUMBER OF DOCUMENT] if you use information from a document, e.g. [3] for document [3].
Never name the documents, only enter a number in square brackets as a reference.
The reference may only refer to the number in square brackets after the passage.
Otherwise, do not use brackets in your answer and give ONLY the number of the document without mentioning the word document.
Give a precise, accurate and structured answer without repeating the question.
Answer only on the basis of the documents provided. Do not make up facts.
If the documents cannot answer the question or you are not sure, say so.
These are the documents:
{% for document in documents %}
Document[{{ loop.index }}]:
{{ document.content }}
{% endfor %}
Question: {{question}}
Answer:
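
To see exactly what this template sends to the LLM, you can render the document loop yourself. Here's a minimal sketch, assuming the jinja2 package and stand-in document objects; deepset AI Platform performs this rendering with the retrieved documents at query time:

from jinja2 import Template
from types import SimpleNamespace

# The document loop from the prompt above: Jinja's loop.index is 1-based,
# so the documents are labeled Document[1], Document[2], and so on.
template = Template(
    "{% for document in documents %}"
    "Document[{{ loop.index }}]:\n"
    "{{ document.content }}\n"
    "{% endfor %}"
)

docs = [
    SimpleNamespace(content="First retrieved passage."),
    SimpleNamespace(content="Second retrieved passage."),
]

print(template.render(documents=docs))
# Document[1]:
# First retrieved passage.
# Document[2]:
# Second retrieved passage.

These bracketed labels are what the LLM's answer points back to when it cites [1] or [2].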

The Generator that uses this prompt must be connected to DeepsetAnswerBuilder, and DeepsetAnswerBuilder's reference_pattern parameter must be set to acm (ACM-style references, that is, numbers in square brackets such as [3]) for the references to display properly.
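
For illustration, here's a minimal sketch of what the acm pattern implies: bracketed numbers in the reply are matched and mapped back, 1-based, to the documents the prompt numbered. This is an assumption-level illustration, not DeepsetAnswerBuilder's actual implementation; the component handles this mapping for you.

import re

# A reply generated with the prompt above; [1] and [3] refer to
# Document[1] and Document[3] from the prompt's numbering.
reply = "Hybrid retrieval combines keyword and embedding search [1], then reranks the results [3]."
documents = [
    "passage about hybrid retrieval",
    "passage about embedding models",
    "passage about rankers",
]

for match in re.finditer(r"\[(\d+)\]", reply):
    position = int(match.group(1))  # 1-based index from the prompt's loop.index
    print(f"[{position}] -> {documents[position - 1]}")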

Here are the detailed steps to add references with an LLM:

  1. Add the prompt with instructions to generate references to PromptBuilder:

    components:
      ...
      prompt_builder:
        type: haystack.components.builders.prompt_builder.PromptBuilder
        init_parameters:
          template: |-
            <s>[INST] You are a technical expert.
            You answer the questions truthfully on the basis of the documents provided.
            For each document, check whether it is related to the question.
            To answer the question, only use documents that are related to the question.
            Ignore documents that do not relate to the question.
            If the answer is contained in several documents, summarize them.
            Always use references in the form [NUMBER OF DOCUMENT] if you use information from a document, e.g. [3] for document [3].
            Never name the documents, only enter a number in square brackets as a reference.
            The reference may only refer to the number in square brackets after the passage.
            Otherwise, do not use brackets in your answer and give ONLY the number of the document without mentioning the word document.
            Give a precise, accurate and structured answer without repeating the question.
            Answer only on the basis of the documents provided. Do not make up facts.
            If the documents cannot answer the question or you are not sure, say so.
            These are the documents:
            {% for document in documents %}
            Document[{{ loop.index }}]:
            {{ document.content }}
            {% endfor %}
            Question: {{question}}
            Answer:
            [/INST]
      ...
      # Here you would also configure other components
  2. Add DeepsetAnswerBuilder and configure its reference_pattern parameter:

    components:
      ...
      prompt_builder:
        type: haystack.components.builders.prompt_builder.PromptBuilder
        init_parameters:
          template: |-
            <s>[INST] You are a technical expert.
            ... # The same prompt as in step 1
            [/INST]

      answer_builder:
        type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
        init_parameters:
          reference_pattern: acm
      ...
      # Here you would also configure other components, like the Generator
  3. Connect the components:
    1. Connect PromptBuilder's prompt output to the Generator's prompt input.
    2. Connect the Generator's replies output to DeepsetAnswerBuilder's replies input.

    Here's what it should look like:

    connections:
      ...
      - sender: prompt_builder.prompt
        receiver: generator.prompt
      - sender: generator.replies
        receiver: answer_builder.replies
  4. Indicate DeepsetAnswerBuilder's output as the pipeline output:

    outputs:
      answers: answer_builder.answers

Here's an example of a query pipeline that uses this approach:

components:
  bm25_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: True
          verify_certs: False
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - "${OPENSEARCH_USER}"
            - "${OPENSEARCH_PASSWORD}"
      top_k: 20 # The number of results to return

  query_embedder:
    type: haystack.components.embedders.sentence_transformers_text_embedder.SentenceTransformersTextEmbedder
    init_parameters:
      model: "intfloat/e5-base-v2"
      device: null

  embedding_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: True
          verify_certs: False
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - "${OPENSEARCH_USER}"
            - "${OPENSEARCH_PASSWORD}"
      top_k: 20 # The number of results to return

  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate

  ranker:
    type: haystack.components.rankers.transformers_similarity.TransformersSimilarityRanker
    init_parameters:
      model: "intfloat/simlm-msmarco-reranker"
      top_k: 8
      device: null
      model_kwargs:
        torch_dtype: "torch.float16"

  chat_prompt_builder:
    type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
    init_parameters:
      template: |-
        <s>[INST] You are a technical expert.
        You answer the questions truthfully on the basis of the documents provided.
        For each document, check whether it is related to the question.
        To answer the question, only use documents that are related to the question.
        Ignore documents that do not relate to the question.
        If the answer is contained in several documents, summarize them.
        Always use references in the form [NUMBER OF DOCUMENT] if you use information from a document, e.g. [3] for document [3].
        Never name the documents, only enter a number in square brackets as a reference.
        The reference may only refer to the number in square brackets after the passage.
        Otherwise, do not use brackets in your answer and give ONLY the number of the document without mentioning the word document.
        Give a precise, accurate and structured answer without repeating the question.
        Answer only on the basis of the documents provided. Do not make up facts.
        If the documents cannot answer the question or you are not sure, say so.
        These are the documents:
        {% for document in documents %}
        Document[{{ loop.index }}]:
        {{ document.content }}
        {% endfor %}
        Question: {{question}}
        Answer:
        [/INST]

  llm:
    type: haystack_integrations.components.generators.amazon_bedrock.chat.chat_generator.AmazonBedrockChatGenerator
    init_parameters:
      model: "mistral.mistral-large-2402-v1:0"
      aws_region_name: us-east-1
      max_length: 400 # The maximum number of tokens the generated answer can have
      model_max_length: 32000 # The maximum number of tokens the prompt and the generated answer can use
      temperature: 0 # Lower temperature works best for fact-based QA

  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters:
      reference_pattern: acm

connections: # Defines how the components are connected
  - sender: bm25_retriever.documents
    receiver: document_joiner.documents
  - sender: query_embedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: embedding_retriever.documents
    receiver: document_joiner.documents
  - sender: document_joiner.documents
    receiver: ranker.documents
  - sender: ranker.documents
    receiver: chat_prompt_builder.documents
  - sender: ranker.documents
    receiver: answer_builder.documents
  - sender: chat_prompt_builder.prompt
    receiver: llm.messages
  - sender: llm.replies
    receiver: answer_builder.replies

max_loops_allowed: 100

inputs: # Define the inputs for your pipeline
  query: # These components will receive the query as input
    - "bm25_retriever.query"
    - "query_embedder.text"
    - "ranker.query"
    - "chat_prompt_builder.question"
    - "answer_builder.query"

  filters: # These components will receive a potential query filter as input
    - "bm25_retriever.filters"
    - "embedding_retriever.filters"

outputs: # Defines the output of your pipeline
  documents: "ranker.documents" # The retrieved documents
  answers: "answer_builder.answers" # The generated answers with references