DeepsetOpenSearchRecursiveRetriever
Recursively retrieve documents from a document store based on their metadata.
Basic Information
- Type: deepset_cloud_custom_nodes.retrievers.recursive_retriever.DeepsetOpenSearchRecursiveRetriever
- Components it can connect with:
  - Rankers: DeepsetOpenSearchRecursiveRetriever can receive documents from a Ranker.
  - DocumentJoiner: You can use DeepsetOpenSearchRecursiveRetriever together with another Retriever and then send the retrieved documents to a DocumentJoiner that combines two document lists into a single list.
Inputs
Parameter | Type | Default | Description |
---|---|---|---|
top_k | Optional[int] | None | The maximum number of documents to retrieve. |
documents | Optional[List[Document]] | | The documents to retrieve from. |
depth | Optional[int] | None | The depth of the recursive retrieval. For example, at depth 2, the component retrieves both the documents linked to the original documents and those linked to the retrieved documents. The default depth is 1, meaning only documents directly linked to the original documents are retrieved. |
filters | Optional[Dict[str, Any]] | None | Filters to narrow down the search. |
Outputs
Parameter | Type | Default | Description |
---|---|---|---|
documents | List[Document] | | The retrieved documents. |
Overview
To use this Retriever, make sure the documents it receives include links to other documents in their metadata. DeepsetOpenSearchRecursiveRetriever then retrieves the linked documents and adds them to the final document list.
It distinguishes between "original documents" (those initially passed to the component) and "recursively retrieved documents" (those retrieved based on the metadata of the original documents).
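The link-following behavior and the effect of depth can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the component's implementation: an in-memory dictionary stands in for the OpenSearch document store, and the metadata key `related_ids` and the helper `retrieve_linked` are hypothetical names chosen for the example. For readability, the sketch always keeps the original documents in the output, whereas the real component's final list also depends on top_k, sampling_strategy, and force_keep_original_documents.

```python
from typing import Dict, List

# Hypothetical in-memory "document store": id -> document with metadata links.
STORE: Dict[str, dict] = {
    "a": {"id": "a", "meta": {"related_ids": ["b"]}},
    "b": {"id": "b", "meta": {"related_ids": ["c"]}},
    "c": {"id": "c", "meta": {"related_ids": []}},
}


def retrieve_linked(originals: List[dict], depth: int = 1, link_key: str = "related_ids") -> List[dict]:
    """Follow metadata links for `depth` levels; return originals plus linked documents."""
    seen = {doc["id"] for doc in originals}
    result = list(originals)
    frontier = list(originals)  # documents whose links are followed at this level
    for _ in range(depth):
        next_frontier = []
        for doc in frontier:
            for linked_id in doc["meta"].get(link_key, []):
                # Skip documents already collected and links that point nowhere.
                if linked_id in seen or linked_id not in STORE:
                    continue
                seen.add(linked_id)
                next_frontier.append(STORE[linked_id])
        result.extend(next_frontier)
        frontier = next_frontier
    return result


ids_depth_1 = [d["id"] for d in retrieve_linked([STORE["a"]], depth=1)]
ids_depth_2 = [d["id"] for d in retrieve_linked([STORE["a"]], depth=2)]
```

With depth 1, only the document linked directly from the original is added. With depth 2, the document linked from that retrieved document is added as well.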
Usage Example
components:
  DeepsetOpenSearchRecursiveRetriever:
    type: retrievers.recursive_retriever.DeepsetOpenSearchRecursiveRetriever
    init_parameters:
Parameters
Init Parameters
These are the parameters you can configure in Pipeline Builder:
Parameter | Type | Default | Description |
---|---|---|---|
filter_key | str | | The key in the document metadata that is used to identify documents for filtering. |
relevant_doc_keys | List[str] | | The keys in the document metadata that are used to retrieve relevant documents. |
document_store | OpenSearchDocumentStore | | The document store instance to retrieve documents from. |
top_k | Optional[int] | 10 | The maximum number of documents the component outputs. |
depth | Optional[int] | 1 | The depth of the recursive retrieval. For example, at depth 2, the component retrieves both the documents linked to the original documents and those linked to the retrieved documents. The default depth is 1, meaning only documents directly linked to the original documents are retrieved. |
sampling_strategy | Optional[List[Literal['rank', 'source', 'depth']]] | None | The sampling strategy to use for the recursive retrieval. The strategy must include all of the following values: rank, source, depth. rank: The index of the document in the list for a given metadata key. source: The metadata key used to retrieve the document. depth: The document's depth in the recursive retrieval. The default strategy is ['rank', 'source', 'depth']. The order of values determines how documents are sampled to create the final list of top_k documents. The first value has the highest priority, and the last value has the lowest. For example, with the default strategy, documents are sampled by first iterating over depth, then source, and finally rank. At each step, one document is added to the final list until the top_k number is reached. The default order of iteration is rank0, source0, depth0, then rank0, source0, depth1, then rank0, source1, depth1, and so on, until top_k is reached. |
force_keep_original_documents | bool | False | Whether to force keeping the original documents in the final list. |
filters | Optional[Dict[str, Any]] | None | Filters to narrow down the search. |
filter_policy | Union[str, FilterPolicy] | FilterPolicy.REPLACE | The policy to determine how to apply filters. Possible values: REPLACE: The filters provided at search time replace the filters in the component configuration. MERGE: The filters provided at search time are merged with the filters in the component configuration. |
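One possible reading of the sampling_strategy ordering can be sketched in Python. The sketch assumes each candidate document carries a rank, a source index, and a depth, and that the first strategy value acts as the outermost sort key, so the last value varies fastest. That assumption reproduces the iteration order listed in the table for the default strategy. The `Candidate` class and the `sample` function are hypothetical and are not part of the component's API.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Candidate:
    doc_id: str
    rank: int    # position of the document in the list for its metadata key
    source: int  # index of the metadata key that led to the document
    depth: int   # recursion level at which the document was found


def sample(candidates: List[Candidate], strategy: Sequence[str], top_k: int) -> List[str]:
    """Order candidates by the strategy keys (first key outermost) and keep top_k."""
    ordered = sorted(candidates, key=lambda c: tuple(getattr(c, name) for name in strategy))
    return [c.doc_id for c in ordered[:top_k]]


candidates = [
    Candidate("d1", rank=1, source=0, depth=0),
    Candidate("d2", rank=0, source=0, depth=1),
    Candidate("d3", rank=0, source=1, depth=1),
    Candidate("d4", rank=0, source=0, depth=0),
]

default_order = sample(candidates, ["rank", "source", "depth"], top_k=3)
depth_first = sample(candidates, ["depth", "source", "rank"], top_k=3)
```

Under this reading, the default strategy picks d4 (rank0, source0, depth0), then d2 (rank0, source0, depth1), then d3 (rank0, source1, depth1). Reordering the strategy to start with depth favors documents found at shallower depths instead.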
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
Parameter | Type | Default | Description |
---|---|---|---|
top_k | Optional[int] | None | The maximum number of documents the component outputs. |
documents | Optional[List[Document]] | | The list of documents to retrieve from. |
depth | Optional[int] | None | The depth of the recursive retrieval. For example, at depth 2, the component retrieves both the documents linked to the original documents and those linked to the retrieved documents. The default depth is 1, meaning only documents directly linked to the original documents are retrieved. |
filters | Optional[Dict[str, Any]] | None | The filters to narrow down the search. |
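How filters passed at query time might interact with the filters set in the component configuration, depending on filter_policy, can be sketched as follows. This is a simplified illustration, not the component's actual logic. It assumes Haystack-style filter dictionaries, that MERGE combines both filter sets with a logical AND, and that the configured filters apply when no filters are passed at query time. The `Policy` enum and the `resolve_filters` function are hypothetical names.

```python
from enum import Enum
from typing import Any, Dict, Optional

Filters = Optional[Dict[str, Any]]


class Policy(Enum):
    """Hypothetical stand-in for the two filter policies named in the table."""
    REPLACE = "replace"
    MERGE = "merge"


def resolve_filters(policy: Policy, init_filters: Filters, run_filters: Filters) -> Filters:
    """Pick the filters for one run from configured and query-time filters."""
    if run_filters is None:
        # Assumption: nothing passed at query time, so use the configured filters.
        return init_filters
    if policy is Policy.REPLACE or init_filters is None:
        return run_filters
    # MERGE (assumption): a document must satisfy both filter sets.
    return {"operator": "AND", "conditions": [init_filters, run_filters]}


init_filters = {"field": "meta.language", "operator": "==", "value": "en"}
run_filters = {"field": "meta.year", "operator": ">=", "value": 2023}

replaced = resolve_filters(Policy.REPLACE, init_filters, run_filters)
merged = resolve_filters(Policy.MERGE, init_filters, run_filters)
```

With REPLACE, only the query-time filter on the year applies. With MERGE, the sketch keeps the configured language filter and adds the year filter to it.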