DeepsetOpenSearchRecursiveRetriever

Recursively retrieve documents from a document store based on their metadata.

Basic Information

  • Type: deepset_cloud_custom_nodes.retrievers.recursive_retriever.DeepsetOpenSearchRecursiveRetriever
  • Components it can connect with:
    • Rankers: DeepsetOpenSearchRecursiveRetriever can receive documents from a Ranker.
    • DocumentJoiner: You can use DeepsetOpenSearchRecursiveRetriever together with another Retriever and then send the retrieved documents to a DocumentJoiner that combines two document lists into a single list.

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| top_k | Optional[int] | None | The maximum number of documents to retrieve. |
| documents | Optional[List[Document]] | | The documents to retrieve from. |
| depth | Optional[int] | None | The depth of the recursive retrieval. For example, at depth 2, the component retrieves both the documents linked to the original documents and those linked to the retrieved documents. The default depth is 1, meaning only documents directly linked to the original documents are retrieved. |
| filters | Optional[Dict[str, Any]] | None | Filters to narrow down the search. |

Outputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| documents | List[Document] | | The retrieved documents. |

Overview

To use this Retriever, make sure the documents it receives include links to other documents in their metadata.
DeepsetOpenSearchRecursiveRetriever then retrieves the linked documents and adds them to the final document list.
It distinguishes between "original documents" (those initially passed to the component) and "recursively retrieved documents" (those retrieved based on the metadata of the original documents).
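
The recursion described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the component's implementation: the in-memory `store` dictionary, the `id` and `related_ids` metadata keys, and the function name `retrieve_recursively` are hypothetical stand-ins for an OpenSearch document store and for the `filter_key` and `relevant_doc_keys` settings. The sketch returns only the recursively retrieved documents, not the original ones.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for the document store. Each document is
# represented by its metadata only and is looked up by its "id" value.
store: Dict[str, dict] = {
    "a": {"id": "a", "related_ids": ["b"]},
    "b": {"id": "b", "related_ids": ["c"]},
    "c": {"id": "c", "related_ids": []},
}


def retrieve_recursively(
    originals: List[dict],
    relevant_doc_keys: List[str],
    filter_key: str = "id",
    depth: int = 1,
) -> List[dict]:
    """Follow metadata links up to `depth` levels away from the originals."""
    seen = {doc[filter_key] for doc in originals}  # avoid revisiting documents
    retrieved: List[dict] = []
    frontier = originals
    for _ in range(depth):
        next_frontier: List[dict] = []
        for doc in frontier:
            for key in relevant_doc_keys:
                for linked_id in doc.get(key, []):
                    if linked_id in store and linked_id not in seen:
                        seen.add(linked_id)
                        next_frontier.append(store[linked_id])
        retrieved.extend(next_frontier)
        frontier = next_frontier  # the next level starts from what was just found
    return retrieved


depth_1 = retrieve_recursively([store["a"]], ["related_ids"], depth=1)
depth_2 = retrieve_recursively([store["a"]], ["related_ids"], depth=2)
print([d["id"] for d in depth_1])
print([d["id"] for d in depth_2])
```

At depth 1 only the document linked directly from `a` is returned; at depth 2 the document linked from that retrieved document is returned as well.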

Usage Example

components:
  DeepsetOpenSearchRecursiveRetriever:
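    type: deepset_cloud_custom_nodes.retrievers.recursive_retriever.DeepsetOpenSearchRecursiveRetriever
    init_parameters:
      # Set filter_key, relevant_doc_keys, and document_store here.
      # They have no default value (see Init Parameters).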

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| filter_key | str | | The key in the document metadata that is used to identify documents for filtering. |
| relevant_doc_keys | List[str] | | The keys in the document metadata that are used to retrieve relevant documents. |
| document_store | OpenSearchDocumentStore | | The document store instance to retrieve documents from. |
| top_k | Optional[int] | 10 | The maximum number of documents the component outputs. |
| depth | Optional[int] | 1 | The depth of the recursive retrieval. For example, at depth 2, the component retrieves both the documents linked to the original documents and those linked to the retrieved documents. The default depth is 1, meaning only documents directly linked to the original documents are retrieved. |
| sampling_strategy | Optional[List[Literal['rank', 'source', 'depth']]] | None | The sampling strategy to use for the recursive retrieval. The strategy must include the following values: [rank, source, depth]. rank: The index of the document in the list for a given metadata key. source: The metadata key used to retrieve the document. depth: The document's depth in the recursive retrieval. The default strategy is ['rank', 'source', 'depth']. The order of values determines how documents are sampled to create the final list of top_k documents. The first value has the highest priority, and the last value has the lowest. For example, with the default strategy, documents are sampled by first iterating over depth, then source, and finally rank. At each step, one document is added to the final list until the top_k number is reached. The default order of iteration will be rank0, source0, depth0, then rank0, source0, depth1, rank0, source1, depth1, and so on, until the top_k is reached. |
| force_keep_original_documents | bool | False | Whether to force keeping the original documents in the final list. |
| filters | Optional[Dict[str, Any]] | None | Filters to narrow down the search. |
| filter_policy | Union[str, FilterPolicy] | FilterPolicy.REPLACE | The policy to determine how to apply filters. Possible values: REPLACE: The filters provided at search time replace the filters in the component configuration. MERGE: The filters provided at search time are merged with the filters in the component configuration. |
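
The ordering described for sampling_strategy can be read as a lexicographic sort in which the first value of the strategy is the primary key. The sketch below illustrates that reading only; it is not the component's code. The `Candidate` type, the `sample` function, and the use of an integer index for `source` (the position of the metadata key, rather than the key itself) are assumptions made for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    rank: int    # index of the document in the list for its metadata key
    source: int  # assumed: index of the metadata key used to retrieve it
    depth: int   # depth at which the document was retrieved


def sample(candidates: List[Candidate], strategy: List[str], top_k: int) -> List[Candidate]:
    """Order candidates by the strategy (first value = highest priority), keep top_k."""
    ordered = sorted(
        candidates,
        key=lambda c: tuple(getattr(c, field) for field in strategy),
    )
    return ordered[:top_k]


candidates = [
    Candidate("r0-s0-d1", rank=0, source=0, depth=1),
    Candidate("r1-s0-d1", rank=1, source=0, depth=1),
    Candidate("r0-s1-d1", rank=0, source=1, depth=1),
    Candidate("r0-s0-d2", rank=0, source=0, depth=2),
]

default_pick = sample(candidates, ["rank", "source", "depth"], top_k=3)
depth_first_pick = sample(candidates, ["depth", "source", "rank"], top_k=3)
print([c.name for c in default_pick])
print([c.name for c in depth_first_pick])
```

With the default strategy, all rank-0 documents are taken before any rank-1 document, so the depth-2 document makes it into the top 3. Putting depth first instead fills the list with depth-1 documents before going deeper.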

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| top_k | Optional[int] | None | The maximum number of documents the component outputs. |
| documents | Optional[List[Document]] | | The list of documents to retrieve from. |
| depth | Optional[int] | None | The depth of the recursive retrieval. For example, at depth 2, the component retrieves both the documents linked to the original documents and those linked to the retrieved documents. The default depth is 1, meaning only documents directly linked to the original documents are retrieved. |
| filters | Optional[Dict[str, Any]] | None | The filters to narrow down the search. |
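
How filters passed at query time interact with the filters set in the component configuration depends on filter_policy. The following sketch shows one plausible reading of the two policies. The `Policy` enum and `resolve_filters` function are hypothetical stand-ins, and the MERGE branch simply requires both filter sets to match; the real merge logic may differ in detail.

```python
from enum import Enum
from typing import Any, Dict, Optional


class Policy(Enum):  # hypothetical stand-in for FilterPolicy
    REPLACE = "replace"
    MERGE = "merge"


def resolve_filters(
    init_filters: Optional[Dict[str, Any]],
    runtime_filters: Optional[Dict[str, Any]],
    policy: Policy,
) -> Optional[Dict[str, Any]]:
    """Decide which filters apply to a query under the given policy."""
    if runtime_filters is None:
        # Assumed: with no query-time filters, the configured filters apply.
        return init_filters
    if policy is Policy.REPLACE or init_filters is None:
        return runtime_filters
    # MERGE (assumed semantics): a document must satisfy both filter sets.
    return {"operator": "AND", "conditions": [init_filters, runtime_filters]}


configured = {"field": "meta.lang", "operator": "==", "value": "en"}
at_query_time = {"field": "meta.year", "operator": ">=", "value": 2023}

replaced = resolve_filters(configured, at_query_time, Policy.REPLACE)
merged = resolve_filters(configured, at_query_time, Policy.MERGE)
print(replaced)
print(merged)
```

Under REPLACE only the query-time filter is applied; under MERGE, in this sketch, both conditions are combined.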