DeepsetMetadataRetriever

Search metadata stored in Haystack Enterprise Platform and return the top matches.

Basic Information

Type: deepset_cloud_custom_nodes.retrievers.deepset_metadata_retriever.DeepsetMetadataRetriever
Components it can connect with:
- Input or any component that emits a query string.
- PromptBuilder: Use the CSV metadata string inside a prompt for downstream LLMs.
- AnswerBuilder or any logger component that needs the CSV output.

Inputs

Parameter	Type	Default	Description
query	str		Search phrase or a comma-separated list of field values to match.

Outputs

Parameter	Type	Default	Description
retrieved_metadata	str		CSV string containing the best-matching metadata rows for the selected fields.
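
Because retrieved_metadata is a plain CSV string, a downstream step can turn it back into rows with a standard CSV reader. The following is a minimal sketch, not part of the component: it assumes the first CSV row is a header naming the configured fields, which this page does not confirm, and the function name and sample data are hypothetical.

```python
import csv
import io


def parse_retrieved_metadata(csv_text: str) -> list[dict[str, str]]:
    """Parse a CSV string into a list of row dicts keyed by the header row.

    Assumption: the first row of the CSV is a header with the field names.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Hypothetical sample for fields ["summary", "owner"]; not real component output.
sample = 'summary,owner\n"Q3 report, draft",alice\nOnboarding guide,bob\n'
rows = parse_retrieved_metadata(sample)
for row in rows:
    print(row["owner"], "->", row["summary"])
```

Using a CSV reader rather than splitting on commas keeps quoted values that contain commas intact.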

Overview

DeepsetMetadataRetriever calls the Haystack Platform Search API and restricts the search to the metadata fields you configure in fields. It boosts exact matches by exact_match_weight, sorts the combined list by score, and returns up to top_k results. The component flattens the metadata into a CSV string so you can inject the rows into prompts, logs, or dashboards.
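
To make the ranking steps concrete, here is a rough sketch of boosting, sorting, truncating, and flattening. It is an illustration only and not the component's implementation: the function name, the shape of the input rows (a score plus a meta dict), the additive boost, the case-insensitive comparison, and the header row are all assumptions.

```python
import csv
import io


def rank_and_flatten(rows, query, fields, top_k=20, exact_match_weight=0.6):
    """Boost exact matches, sort by score, keep top_k rows, and emit CSV.

    rows: list of dicts shaped like {"score": float, "meta": dict} (assumed shape).
    """
    needle = query.strip().lower()
    scored = []
    for row in rows:
        meta = row["meta"]
        # A row counts as an exact match if any configured field equals the query.
        exact = any(str(meta.get(f, "")).strip().lower() == needle for f in fields)
        # Assumption: the boost is added to the base score.
        score = row["score"] + (exact_match_weight if exact else 0.0)
        scored.append((score, meta))

    # Highest score first; sort on the score only so metadata dicts are never compared.
    scored.sort(key=lambda item: item[0], reverse=True)

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(fields)  # Assumption: a header row precedes the data rows.
    for _, meta in scored[:top_k]:
        writer.writerow([meta.get(f, "") for f in fields])
    return buffer.getvalue()


# Hypothetical search hits, for illustration only.
hits = [
    {"score": 0.9, "meta": {"summary": "Budget plan", "owner": "bob"}},
    {"score": 0.5, "meta": {"summary": "Roadmap", "owner": "alice"}},
    {"score": 0.7, "meta": {"summary": "Alice notes", "owner": "carol"}},
]
result = rank_and_flatten(hits, "alice", ["summary", "owner"], top_k=2)
print(result)
```

In this sketch, the row whose owner equals the query moves ahead of a row with a higher base score, which is the effect an exact-match boost is meant to have.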

Usage Example

Using the Component in a Pipeline

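In this example, DeepsetMetadataRetriever receives the query from the Input component and sends the retrieved metadata to PromptBuilder. PromptBuilder passes the rendered prompt to OpenAIGenerator, and DeepsetAnswerBuilder turns the generator's replies into answers.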

components:
  metadata_retriever:
    type: deepset_cloud_custom_nodes.retrievers.deepset_metadata_retriever.DeepsetMetadataRetriever
    init_parameters:
      workspace: ${DEEPSET_WORKSPACE_ID}
      fields:
      - summary
      - owner
  prompt_builder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: |
        Use the following metadata rows when you craft the answer.
        {{retrieved_metadata}}

        Question: {{query}}
  answer_builder:
    type: deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
    init_parameters: {}
  OpenAIGenerator:
    type: haystack.components.generators.openai.OpenAIGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
        - OPENAI_API_KEY
        strict: false
      model: gpt-4o-mini
      streaming_callback:
      api_base_url:
      organization:
      system_prompt:
      generation_kwargs:
      timeout:
      max_retries:
      http_client_kwargs:

connections:
- sender: metadata_retriever.retrieved_metadata
  receiver: prompt_builder.retrieved_metadata
- sender: prompt_builder.prompt
  receiver: OpenAIGenerator.prompt
- sender: OpenAIGenerator.replies
  receiver: answer_builder.replies

inputs:
  query:
  - metadata_retriever.query
  - answer_builder.query
  - prompt_builder.query

outputs:
  answers: answer_builder.answers

max_runs_per_component: 100

metadata: {}

Parameters

Init Parameters

These are the parameters you can configure in Builder:

Parameter	Type	Default	Description
fields	List[str]		Metadata fields to inspect and include in the CSV output.
workspace	str		Workspace ID that scopes the metadata search.
deepset_api_key	Secret	env var `DEEPSET_API_KEY`	API key used when calling the workspace search endpoint.
top_k	int	20	Maximum number of rows returned after sorting by score.
exact_match_weight	float	0.6	Score boost applied when the query exactly matches a field value.
timeout	float	80.0	Total timeout in seconds for the HTTP call.

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

Parameter	Type	Default	Description
query	str		Metadata search string supplied at query time.
