DeepsetMetadataRetriever

Search metadata stored in deepset AI Platform and return the top matches.

Basic Information

  • Type: deepset_cloud_custom_nodes.retrievers.deepset_metadata_retriever.DeepsetMetadataRetriever
  • Components it can connect with:
    • Input or any component that emits a query string.
    • PromptBuilder: Use the CSV metadata string inside a prompt for downstream LLMs.
    • AnswerBuilder or any logger component that needs the CSV output.

Inputs

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| query | str | | Search phrase or a comma-separated list of field values to match. |

Outputs

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| retrieved_metadata | str | | CSV string containing the best matching metadata rows for the selected fields. |
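
Because retrieved_metadata is a plain CSV string, downstream code can read it with a standard CSV parser. The following is a minimal sketch, assuming the first row of the string is a header that names the configured fields. That layout is an assumption, not documented behavior. The helper `parse_retrieved_metadata` and the sample string are hypothetical and exist only for illustration.

```python
import csv
import io


def parse_retrieved_metadata(csv_text: str) -> list[dict[str, str]]:
    """Parse a CSV string into a list of row dictionaries.

    Assumes the first line is a header row. This is an assumption,
    not documented behavior of DeepsetMetadataRetriever.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Hypothetical sample output for fields = ["summary", "owner"].
sample = 'summary,owner\n"Quarterly report, Q1",alice\nOnboarding guide,bob\n'
rows = parse_retrieved_metadata(sample)

for row in rows:
    print(row["owner"], "->", row["summary"])
```

Using a CSV parser rather than splitting on commas keeps quoted values that contain commas intact.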

Overview

DeepsetMetadataRetriever calls the deepset Search API and filters the metadata fields you configure. It boosts exact matches and sorts the combined list by score before returning up to top_k results. The component flattens the metadata into CSV so you can quickly inject the rows into prompts, logs, or dashboards.
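
To make the boost, sort, and flatten steps described above concrete, here is an illustrative sketch. It is not the component's actual implementation. The function `rank_and_flatten`, the `score` key on each row, and the additive, case-insensitive boost rule are all assumptions chosen for the example; only the parameter names `fields`, `top_k`, and `exact_match_weight` and their defaults come from this page.

```python
import csv
import io


def rank_and_flatten(
    rows: list[dict],
    query: str,
    fields: list[str],
    top_k: int = 20,
    exact_match_weight: float = 0.6,
) -> str:
    """Boost exact matches, sort by score, keep top_k rows, return CSV."""
    needle = query.strip().lower()
    scored = []
    for row in rows:
        score = float(row.get("score", 0.0))
        # Assumed boost rule: add the weight once if any configured field
        # equals the query exactly (case-insensitive comparison).
        if any(str(row.get(f, "")).strip().lower() == needle for f in fields):
            score += exact_match_weight
        scored.append((score, row))

    # Highest score first; the sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)

    # Flatten only the configured fields into CSV with a header row.
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for _, row in scored[:top_k]:
        writer.writerow({f: row.get(f, "") for f in fields})
    return buffer.getvalue()


# Hypothetical candidate rows with base scores.
candidates = [
    {"summary": "Onboarding guide", "owner": "bob", "score": 0.5},
    {"summary": "Quarterly report", "owner": "alice", "score": 0.3},
    {"summary": "Notes about alice", "owner": "carol", "score": 0.7},
]
result = rank_and_flatten(candidates, "alice", ["summary", "owner"], top_k=2)
print(result)
```

In this sketch, the row whose owner field equals the query exactly moves ahead of a row with a higher base score, and top_k then trims the list before it is written as CSV.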

Usage Example

Using the Component in a Pipeline

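In this example, DeepsetMetadataRetriever receives the query from the Input component and sends the retrieved metadata to PromptBuilder. The resulting prompt goes to OpenAIGenerator, and DeepsetAnswerBuilder turns the generator's replies into answers.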

components:
  metadata_retriever:
    type: deepset_cloud_custom_nodes.retrievers.deepset_metadata_retriever.DeepsetMetadataRetriever
    init_parameters:
      workspace: ${DEEPSET_WORKSPACE_ID}
      fields:
        - summary
        - owner
  prompt_builder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: |
        Use the following metadata rows when you craft the answer.
        {{retrieved_metadata}}

        Question: {{query}}
  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters: {}
  OpenAIGenerator:
    type: haystack.components.generators.openai.OpenAIGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - OPENAI_API_KEY
        strict: false
      model: gpt-4o-mini
      streaming_callback:
      api_base_url:
      organization:
      system_prompt:
      generation_kwargs:
      timeout:
      max_retries:
      http_client_kwargs:

connections:
  - sender: metadata_retriever.retrieved_metadata
    receiver: prompt_builder.retrieved_metadata
  - sender: prompt_builder.prompt
    receiver: OpenAIGenerator.prompt
  - sender: OpenAIGenerator.replies
    receiver: answer_builder.replies

inputs:
  query:
    - metadata_retriever.query
    - answer_builder.query
    - prompt_builder.query

outputs:
  answers: answer_builder.answers

max_runs_per_component: 100

metadata: {}

Parameters

Init Parameters

These are the parameters you can configure in Builder:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| fields | List[str] | | Metadata fields to inspect and include in the CSV output. |
| workspace | str | | Workspace ID that scopes the metadata search. |
| deepset_api_key | Secret | env var DEEPSET_API_KEY | API key used when calling the workspace search endpoint. |
| top_k | int | 20 | Maximum number of rows returned after sorting by score. |
| exact_match_weight | float | 0.6 | Score boost applied when the query exactly matches a field value. |
| timeout | float | 80.0 | Total timeout in seconds for the HTTP call. |

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| query | str | | Metadata search string supplied at query time. |