FilterRetriever
Retrieve documents that match the provided filters.
Basic Information
- Type: haystack.components.retrievers.filter_retriever.FilterRetriever
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| filters | Optional[Dict[str, Any]] | None | A dictionary with filters to narrow down the search space. If not specified, the FilterRetriever uses the values provided at initialization. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| documents | List[Document] | A list of retrieved documents. |
Overview
FilterRetriever retrieves documents that match the provided filters. It's useful when you want to narrow down search results based on document metadata without performing keyword or semantic search.
FilterRetriever works with any Document Store. Be careful when using it on a Document Store that contains many Documents, because FilterRetriever returns all Documents that match the filters. Running it with no filters returns every Document in the store, which can easily overwhelm downstream components in the Pipeline (for example, Generators).
FilterRetriever does not score your Documents or rank them in any way. If you need to rank the Documents by similarity to a query, consider using Ranker components.
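Because every matching Document is returned, it can help to cap the list before it reaches a Generator. The following is a minimal sketch in plain Python, not part of Haystack: `cap_documents`, `max_docs`, and `max_chars` are hypothetical names, and Documents are represented as plain dictionaries for illustration.

```python
from typing import Any, Dict, List


def cap_documents(
    documents: List[Dict[str, Any]],
    max_docs: int = 10,
    max_chars: int = 4000,
) -> List[Dict[str, Any]]:
    """Keep documents in their original order until a count or size limit is hit.

    Hypothetical helper: no ranking is applied, matching the unranked
    output of a filter-only retriever.
    """
    kept: List[Dict[str, Any]] = []
    used_chars = 0
    for doc in documents:
        length = len(doc["content"])
        # Stop as soon as adding this document would exceed either limit.
        if len(kept) >= max_docs or used_chars + length > max_chars:
            break
        kept.append(doc)
        used_chars += length
    return kept


# 50 stand-in documents of 1,000 characters each.
documents = [{"content": "x" * 1000} for _ in range(50)]
capped = cap_documents(documents, max_docs=10, max_chars=4000)
```

With these assumed limits, the character budget is reached before the document count limit, so only the first few documents are passed on.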
Usage Example
This example shows how to use FilterRetriever to retrieve documents based on metadata filters:
```yaml
components:
  filter_retriever:
    type: haystack.components.retrievers.filter_retriever.FilterRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
            - ${OPENSEARCH_HOST}
          index: ''
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          use_ssl: true
          verify_certs: false
          timeout:
  prompt_builder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: |-
        Given these documents, answer the question.
        Documents:
        {% for doc in documents %}
        {{ doc.content }}
        {% endfor %}
        Question: {{question}}
        Answer:
  llm:
    type: haystack.components.generators.openai.OpenAIGenerator
    init_parameters:
      model: gpt-5-mini
      generation_kwargs:
        temperature: 0.7
  answer_builder:
    type: haystack.components.builders.answer_builder.AnswerBuilder
    init_parameters: {}

connections:
  - sender: filter_retriever.documents
    receiver: prompt_builder.documents
  - sender: prompt_builder.prompt
    receiver: llm.prompt
  - sender: llm.replies
    receiver: answer_builder.replies

max_runs_per_component: 100

inputs:
  query:
    - prompt_builder.question
    - answer_builder.query

outputs:
  answers: answer_builder.answers

metadata: {}
```
In this example, you can pass filters at query time to narrow down the documents. For instance, to retrieve only documents from a specific year, you would pass:
```json
{
  "filters": {
    "field": "year",
    "operator": "==",
    "value": 2021
  }
}
```
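To make the filter structure more concrete, here is a simplified sketch in plain Python of how a filter like the one above could be evaluated against document metadata. It is an illustration only, not Haystack's or any Document Store's implementation: `matches` and `filter_documents` are hypothetical helpers, Documents are plain dictionaries, and the `AND`/`OR` form with a `conditions` list follows the Haystack filter syntax as far as this sketch assumes it.

```python
import operator
from typing import Any, Dict, List

# Comparison operators supported by this simplified sketch.
COMPARATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "in": lambda value, options: value in options,
    "not in": lambda value, options: value not in options,
}


def matches(meta: Dict[str, Any], flt: Dict[str, Any]) -> bool:
    """Return True if the metadata satisfies the filter (hypothetical helper)."""
    op = flt["operator"]
    # Logical filters combine nested conditions.
    if op == "AND":
        return all(matches(meta, cond) for cond in flt["conditions"])
    if op == "OR":
        return any(matches(meta, cond) for cond in flt["conditions"])
    # Simplification: a document without the field never matches.
    if flt["field"] not in meta:
        return False
    return COMPARATORS[op](meta[flt["field"]], flt["value"])


def filter_documents(docs: List[Dict[str, Any]], flt: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return every matching document, in store order and without scores."""
    return [doc for doc in docs if matches(doc["meta"], flt)]


docs = [
    {"content": "Annual report", "meta": {"year": 2021, "type": "report"}},
    {"content": "Press release", "meta": {"year": 2021, "type": "news"}},
    {"content": "Old report", "meta": {"year": 2019, "type": "report"}},
]

year_filter = {"field": "year", "operator": "==", "value": 2021}
combined_filter = {
    "operator": "AND",
    "conditions": [
        year_filter,
        {"field": "type", "operator": "==", "value": "report"},
    ],
}

by_year = filter_documents(docs, year_filter)
by_year_and_type = filter_documents(docs, combined_filter)
```

Note that the result is an unordered, unscored selection: both 2021 documents come back for `year_filter`, and nothing in the output says which one is more relevant to a question.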
Parameters
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| document_store | DocumentStore | | An instance of a Document Store to use with the Retriever. |
| filters | Optional[Dict[str, Any]] | None | A dictionary with filters to narrow down the search space. |
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| filters | Optional[Dict[str, Any]] | None | A dictionary with filters to narrow down the search space. If not specified, the FilterRetriever uses the values provided at initialization. |
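The fallback described in the table can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the component's source code: `SketchStore` and `SketchFilterRetriever` are hypothetical stand-ins, the store supports only equality filters, and the sketch assumes that filters passed to `run()` replace the initialization filters rather than being merged with them.

```python
from typing import Any, Dict, List, Optional


class SketchStore:
    """Hypothetical stand-in for a Document Store (equality filters only)."""

    def __init__(self, docs: List[Dict[str, Any]]) -> None:
        self.docs = docs

    def filter_documents(self, filters: Optional[Dict[str, Any]] = None) -> List[Dict[str, Any]]:
        # No filters means every document is returned.
        if not filters:
            return list(self.docs)
        return [d for d in self.docs if d["meta"].get(filters["field"]) == filters["value"]]


class SketchFilterRetriever:
    """Hypothetical sketch of the init-time vs. run-time filter precedence."""

    def __init__(self, document_store: SketchStore, filters: Optional[Dict[str, Any]] = None) -> None:
        self.document_store = document_store
        self.filters = filters

    def run(self, filters: Optional[Dict[str, Any]] = None) -> Dict[str, List[Dict[str, Any]]]:
        # Assumption: run-time filters win; otherwise fall back to init filters.
        effective = filters or self.filters
        return {"documents": self.document_store.filter_documents(filters=effective)}


store = SketchStore([
    {"content": "A", "meta": {"year": 2021}},
    {"content": "B", "meta": {"year": 2019}},
])
retriever = SketchFilterRetriever(store, filters={"field": "year", "operator": "==", "value": 2021})

default_result = retriever.run()
override_result = retriever.run(filters={"field": "year", "operator": "==", "value": 2019})
unfiltered_result = SketchFilterRetriever(store).run()
```

In this sketch, calling `run()` without arguments uses the 2021 filter set at initialization, passing a filter at run time switches to 2019, and a retriever with no filters at all returns the whole store.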