MongoDBAtlasFullTextRetriever
Retrieves documents from the MongoDBAtlasDocumentStore by full-text search. This retriever is only compatible with the MongoDBAtlasDocumentStore.
Key Features
- Retrieves documents from MongoDB Atlas using full-text search.
- Only compatible with MongoDBAtlasDocumentStore.
- Supports single or multiple query strings.
- Configurable fuzzy matching, match criteria (any or all), synonyms, and score customization.
- Supports runtime filter overrides with configurable filter_policy (MERGE or REPLACE).
- Relies on the full_text_search_index configured in MongoDBAtlasDocumentStore.
Configuration
- Drag the MongoDBAtlasFullTextRetriever component onto the canvas from the Component Library.
- Click on the component to open the configuration panel.
- On the General tab:
  - Configure the MongoDBAtlasDocumentStore connection, including mongo_connection_string, database_name, collection_name, and full_text_search_index. Create a secret with your MongoDB connection string using MONGO_CONNECTION_STRING as the secret key. For instructions, see Create Secrets.
  - Set top_k to control the maximum number of documents returned.
- Go to the Advanced tab to configure filters and filter_policy.
Connections
MongoDBAtlasFullTextRetriever accepts a query string (or list of strings) through its query input. It outputs retrieved documents through its documents output.
Connect the Input component's query output to MongoDBAtlasFullTextRetriever's query input. Connect its documents output to a Ranker or directly to the pipeline output.
Source Code
To check this component's source code, open full_text_retriever.py in the Haystack Core Integrations repository.
Usage Examples
Basic Configuration
MongoDBAtlasFullTextRetriever:
  type: haystack_integrations.components.retrievers.mongodb_atlas.full_text_retriever.MongoDBAtlasFullTextRetriever
  init_parameters:
    top_k: 10
    filter_policy: replace
    document_store:
      type: haystack_integrations.document_stores.mongodb_atlas.document_store.MongoDBAtlasDocumentStore
      init_parameters:
        mongo_connection_string:
          type: env_var
          env_vars:
            - MONGO_CONNECTION_STRING
          strict: false
        database_name: my-db
        collection_name: my-collection
        vector_search_index: vector-search
        full_text_search_index: full-text-search
        embedding_field: embedding
        content_field: content
This is a query pipeline with MongoDBAtlasFullTextRetriever that searches for documents in the MongoDBAtlasDocumentStore.
components:
  TransformersSimilarityRanker:
    type: haystack.components.rankers.transformers_similarity.TransformersSimilarityRanker
    init_parameters:
      model: cross-encoder/ms-marco-MiniLM-L-6-v2
      device:
      token:
        type: env_var
        env_vars:
          - HF_API_TOKEN
          - HF_TOKEN
        strict: false
      top_k: 10
      query_prefix: ''
      document_prefix: ''
      meta_fields_to_embed:
      embedding_separator: \n
      scale_score: true
      calibration_factor: 1
      score_threshold:
      model_kwargs:
      tokenizer_kwargs:
      batch_size: 16
  MongoDBAtlasFullTextRetriever:
    type: haystack_integrations.components.retrievers.mongodb_atlas.full_text_retriever.MongoDBAtlasFullTextRetriever
    init_parameters:
      filters:
      top_k: 10
      filter_policy: replace
      document_store:
        type: haystack_integrations.document_stores.mongodb_atlas.document_store.MongoDBAtlasDocumentStore
        init_parameters:
          mongo_connection_string:
            type: env_var
            env_vars:
              - MONGO_CONNECTION_STRING
            strict: false
          database_name: my-db
          collection_name: my-collection
          vector_search_index: vector-search
          full_text_search_index: full-text-search
          embedding_field: embedding
          content_field: content
connections:
  - sender: MongoDBAtlasFullTextRetriever.documents
    receiver: TransformersSimilarityRanker.documents
max_runs_per_component: 100
metadata: {}
inputs:
  query:
    - TransformersSimilarityRanker.query
    - MongoDBAtlasFullTextRetriever.query
outputs:
  documents: TransformersSimilarityRanker.documents
Parameters
Inputs
| Parameter | Type | Description |
|---|---|---|
| query | Union[str, List[str]] | The query string or a list of query strings to search for. If the query contains multiple terms, Atlas Search evaluates each term separately for matches. |
| fuzzy | Optional[Dict[str, int]] | Enables finding strings similar to the search terms. Note that you can't use fuzzy with synonyms. Configurable options include maxEdits, prefixLength, and maxExpansions. For more details, refer to MongoDB Atlas documentation. |
| match_criteria | Optional[Literal['any', 'all']] | Defines how terms in the query are matched. Supported options are "any" and "all". For more details, refer to MongoDB Atlas documentation. |
| score | Optional[Dict[str, Dict]] | Specifies the scoring method for matching results. Supported options include boost, constant, and function. For more details, refer to MongoDB Atlas documentation. |
| synonyms | Optional[str] | The name of the synonym mapping definition in the index. This value cannot be an empty string. Note that you can't use synonyms with fuzzy. |
| filters | Optional[Dict[str, Any]] | Filters applied to the retrieved Documents. The way runtime filters are applied depends on the filter_policy configured for the retriever. |
| top_k | int | Maximum number of Documents to return. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| documents | List[Document] | List of Documents most similar to the given query. |
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| document_store | MongoDBAtlasDocumentStore | | An instance of MongoDBAtlasDocumentStore. |
| filters | Optional[Dict[str, Any]] | None | Filters applied to the retrieved Documents. Make sure that the fields used in the filters are included in the configuration of the full_text_search_index. The configuration must be done manually in the Web UI of MongoDB Atlas. |
| top_k | int | 10 | Maximum number of Documents to return. |
| filter_policy | Union[str, FilterPolicy] | FilterPolicy.REPLACE | Policy that determines how filters are applied when they're configured for the component and also passed at runtime. Possible values: MERGE and REPLACE. REPLACE: Runtime filters replace the filters configured for the component. MERGE: If both filter types target the same field, the runtime filter takes precedence. Logical filters are combined only if they have the same operator (AND, OR). Comparison filters are combined using the default logical operator (defaults to AND). |
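To make the filter_policy row above more concrete, here is a minimal Python sketch of the behavior it describes. It is not the Haystack implementation: resolve_filters is a hypothetical helper, it handles only simple comparison filters (a dictionary with field, operator, and value), and it assumes AND as the default logical operator. The field names meta.lang and meta.year are made up for illustration.

```python
from typing import Any, Dict, Optional

Filter = Dict[str, Any]


def resolve_filters(
    policy: str,
    init_filters: Optional[Filter],
    runtime_filters: Optional[Filter],
) -> Optional[Filter]:
    """Hypothetical helper: pick the effective filter for one query.

    Simplified on purpose: only comparison filters are supported.
    """
    if policy not in ("merge", "replace"):
        raise ValueError("policy must be 'merge' or 'replace'")

    # With REPLACE, or when only one side is set, runtime filters win if present.
    if policy == "replace" or not init_filters or not runtime_filters:
        return runtime_filters or init_filters

    # MERGE, same field on both sides: the runtime filter takes precedence.
    if init_filters["field"] == runtime_filters["field"]:
        return runtime_filters

    # MERGE, different fields: combine both with the default operator (AND).
    return {"operator": "AND", "conditions": [init_filters, runtime_filters]}


init = {"field": "meta.lang", "operator": "==", "value": "en"}
runtime = {"field": "meta.year", "operator": ">=", "value": 2023}

replaced = resolve_filters("replace", init, runtime)
merged = resolve_filters("merge", init, runtime)
```

In this sketch, replaced is just the runtime filter, while merged is an AND filter whose conditions list contains both the configured and the runtime filter.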
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | Union[str, List[str]] | | The query string or a list of query strings to search for. If the query contains multiple terms, Atlas Search evaluates each term separately for matches. |
| fuzzy | Optional[Dict[str, int]] | None | Enables finding strings similar to the search terms. Note that you can't use fuzzy with synonyms. Configurable options include maxEdits, prefixLength, and maxExpansions. For more details, refer to MongoDB Atlas documentation. |
| match_criteria | Optional[Literal['any', 'all']] | None | Defines how terms in the query are matched. Supported options are "any" and "all". For more details, refer to MongoDB Atlas documentation. |
| score | Optional[Dict[str, Dict]] | None | Specifies the scoring method for matching results. Supported options include boost, constant, and function. For more details, refer to MongoDB Atlas documentation. |
| synonyms | Optional[str] | None | The name of the synonym mapping definition in the index. This value cannot be an empty string. Note that you can't use synonyms with fuzzy. |
| filters | Optional[Dict[str, Any]] | None | Filters applied to the retrieved Documents. The way runtime filters are applied depends on the filter_policy configured for the retriever. |
| top_k | int | 10 | Maximum number of Documents to return. |
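As a rough illustration of the constraints in the table above, the following Python sketch assembles query-time arguments for the retriever and rejects the combinations the table rules out. build_run_params is a hypothetical helper, not part of Haystack or deepset; its checks only mirror the table (fuzzy and synonyms are mutually exclusive, synonyms can't be empty, match_criteria is any or all). The fuzzy values and the query text are example data.

```python
from typing import Any, Dict, List, Optional, Union

# Fuzzy options named in the parameter table.
FUZZY_OPTIONS = {"maxEdits", "prefixLength", "maxExpansions"}


def build_run_params(
    query: Union[str, List[str]],
    fuzzy: Optional[Dict[str, int]] = None,
    match_criteria: Optional[str] = None,
    synonyms: Optional[str] = None,
    top_k: int = 10,
) -> Dict[str, Any]:
    """Hypothetical helper: validate and collect run() arguments."""
    if fuzzy is not None and synonyms is not None:
        raise ValueError("fuzzy and synonyms can't be used together")
    if synonyms is not None and not synonyms.strip():
        raise ValueError("synonyms can't be an empty string")
    if match_criteria not in (None, "any", "all"):
        raise ValueError("match_criteria must be 'any' or 'all'")
    if fuzzy is not None and not set(fuzzy) <= FUZZY_OPTIONS:
        raise ValueError(f"fuzzy supports only {sorted(FUZZY_OPTIONS)}")

    params: Dict[str, Any] = {"query": query, "top_k": top_k}
    # Leave out unset options so the component falls back to its defaults.
    optional = (
        ("fuzzy", fuzzy),
        ("match_criteria", match_criteria),
        ("synonyms", synonyms),
    )
    for name, value in optional:
        if value is not None:
            params[name] = value
    return params


params = build_run_params(
    "mongodb atlas search",
    fuzzy={"maxEdits": 1, "prefixLength": 2},
    match_criteria="all",
)
```

A dictionary like params could then be supplied for the MongoDBAtlasFullTextRetriever component when you modify pipeline parameters at query time.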