DeepsetMetadataRetriever
Search metadata stored in deepset AI Platform and return the top matches.
Basic Information
- Type: deepset_cloud_custom_nodes.retrievers.deepset_metadata_retriever.DeepsetMetadataRetriever
- Components it can connect with:
  - Input or any component that emits a query string.
  - PromptBuilder: Use the CSV metadata string inside a prompt for downstream LLMs.
  - AnswerBuilder or any logger component that needs the CSV output.
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | Search phrase or a comma-separated list of field values to match. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| retrieved_metadata | str | | CSV string containing the best matching metadata rows for the selected fields. |
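If a downstream component needs structured rows rather than the raw string, the CSV output can be parsed with Python's standard `csv` module. The following is a minimal sketch, not part of the component itself. The helper name `parse_retrieved_metadata` and the sample data are hypothetical, and whether the real output starts with a header row is an assumption, so the helper supports both cases.

```python
import csv
import io
from typing import Optional


def parse_retrieved_metadata(
    csv_text: str, fieldnames: Optional[list[str]] = None
) -> list[dict[str, str]]:
    """Turn a CSV string into a list of row dictionaries.

    Hypothetical helper. If `fieldnames` is None, the first CSV row is
    treated as the header (an assumption about the output format).
    Otherwise every row is treated as data and mapped to `fieldnames`.
    """
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=fieldnames)
    return [dict(row) for row in reader]


# Made-up sample that mimics a CSV string with a header row.
sample = 'summary,owner\n"Q3 report, final",alice\nOnboarding guide,bob\n'
rows = parse_retrieved_metadata(sample)
for row in rows:
    print(row["owner"], "->", row["summary"])
```

Using `csv.DictReader` rather than splitting on commas keeps quoted values that contain commas intact.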
Overview
DeepsetMetadataRetriever calls the deepset Search API and filters the metadata fields you configure. It boosts exact matches and sorts the combined list by score before returning up to top_k results. The component flattens the metadata into CSV so you can quickly inject the rows into prompts, logs, or dashboards.
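The boost, sort, truncate, and flatten steps described above can be illustrated with a short sketch. This is not the component's actual implementation: the function name `rank_and_flatten`, the shape of the input rows, the additive boost, the case-insensitive comparison, and the header row in the output are all assumptions made for illustration.

```python
import csv
import io


def rank_and_flatten(rows, query, fields, top_k=20, exact_match_weight=0.6):
    """Boost exact matches, sort by score, keep top_k rows, return CSV.

    `rows` is a list of (metadata_dict, base_score) pairs. This input
    shape is hypothetical and stands in for the search results.
    """
    needle = query.strip().lower()
    scored = []
    for meta, score in rows:
        # Assumption: a row is an exact match when any configured field
        # equals the query (case-insensitive), and the boost is additive.
        is_exact = any(
            str(meta.get(field, "")).strip().lower() == needle for field in fields
        )
        scored.append((score + exact_match_weight if is_exact else score, meta))

    # Highest score first, then truncate to top_k rows.
    scored.sort(key=lambda pair: pair[0], reverse=True)

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(fields)  # Assumption: the CSV starts with a header row.
    for _, meta in scored[:top_k]:
        writer.writerow([meta.get(field, "") for field in fields])
    return buffer.getvalue()


# Made-up search hits: the second one matches the query exactly on "owner".
hits = [
    ({"summary": "Budget plan", "owner": "bob"}, 0.9),
    ({"summary": "Roadmap", "owner": "alice"}, 0.5),
]
print(rank_and_flatten(hits, "alice", ["summary", "owner"], top_k=1))
```

In this sketch the exact match on `owner` lifts the second hit from 0.5 to 1.1, so it outranks the first hit and is the only row kept when `top_k` is 1.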
Usage Example
Using the Component in a Pipeline
In this example, DeepsetMetadataRetriever receives query from the Input component, then sends the retrieved metadata to PromptBuilder.
components:
  metadata_retriever:
    type: deepset_cloud_custom_nodes.retrievers.deepset_metadata_retriever.DeepsetMetadataRetriever
    init_parameters:
      workspace: ${DEEPSET_WORKSPACE_ID}
      fields:
        - summary
        - owner
  prompt_builder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: |
        Use the following metadata rows when you craft the answer.
        {{retrieved_metadata}}
        Question: {{query}}
  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters: {}
  OpenAIGenerator:
    type: haystack.components.generators.openai.OpenAIGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - OPENAI_API_KEY
        strict: false
      model: gpt-4o-mini
      streaming_callback:
      api_base_url:
      organization:
      system_prompt:
      generation_kwargs:
      timeout:
      max_retries:
      http_client_kwargs:
connections:
  - sender: metadata_retriever.retrieved_metadata
    receiver: prompt_builder.retrieved_metadata
  - sender: prompt_builder.prompt
    receiver: OpenAIGenerator.prompt
  - sender: OpenAIGenerator.replies
    receiver: answer_builder.replies
inputs:
  query:
    - metadata_retriever.query
    - answer_builder.query
    - prompt_builder.query
outputs:
  answers: answer_builder.answers
max_runs_per_component: 100
metadata: {}
Parameters
Init Parameters
These are the parameters you can configure in Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| fields | List[str] | | Metadata fields to inspect and include in the CSV output. |
| workspace | str | | Workspace ID that scopes the metadata search. |
| deepset_api_key | Secret | env var DEEPSET_API_KEY | API key used when calling the workspace search endpoint. |
| top_k | int | 20 | Maximum number of rows returned after sorting by score. |
| exact_match_weight | float | 0.6 | Score boost applied when the query exactly matches a field value. |
| timeout | float | 80.0 | Total timeout in seconds for the HTTP call. |
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | Metadata search string supplied at query time. |