YAML Init Parameters

📘
When you create DeepsetCloudDocumentStore in the deepset Cloud Pipeline Designer, these parameters are ignored:

api_key

workspace

index

api_endpoint

label_index

In the Python SDK, all parameters are used.

These are the parameters you can specify for DeepsetCloudDocumentStore in the pipeline YAML:

Parameter	Type	Possible Values	Description
`api_key`	String		The secret value of the API key. This is the value that you copy in step 4 of Generate an API Key. If you don't specify it, it is read from the `DEEPSET_CLOUD_API_KEY` environment variable. Optional.
`workspace`	String	Default: `default`	Specifies the deepset Cloud workspace you want to use. Required.
`index`	String	Default: `None`	The name of the pipeline to access within the deepset Cloud workspace. In deepset Cloud, indexes share their names with their respective pipelines. Optional.
`duplicate_documents`	String	`skip` - Ignores duplicate documents. `overwrite` - Updates any existing documents with the same ID when adding documents. `fail` - Raises an error if the ID of a document being added already exists. Default: `overwrite`	Specifies how to handle duplicate documents. This setting only has an effect if you specify the fields you want to use to identify duplicate documents in the PreProcessor's `id_hash_keys` parameter. For example, to identify duplicate documents by their content, set `id_hash_keys: content`. Note that we add contextual metadata, like `file_id`, to your documents during indexing. This is why setting `id_hash_keys: meta` doesn't work. Required.
`api_endpoint`	String	Default: `None`	Specifies the URL of the deepset Cloud API. The API endpoint is: `https://api.cloud.deepset.ai/api/v1`. If you don't specify it, it's read from the `DEEPSET_CLOUD_API_ENDPOINT` environment variable. Optional.
`similarity`	String	`dot_product` - Recommended if the embedding model was optimized for dot product similarity. `cosine` - Recommended if the embedding model was optimized for cosine similarity. Default: `dot_product`	Specifies the similarity function used to compare document vectors. Required.
`label_index`	String	Default: `default`	Specifies the name of the evaluation set uploaded to deepset Cloud. In deepset Cloud, label indexes share their names with their corresponding evaluation sets. Required.
`return_embedding`	Boolean	`True`/`False` Default: `False`	Specifies whether to return document embeddings. Required.
`embedding_dim`	int	Default: `768`	Specifies the dimensionality of the embedding vector. You only need this parameter if you're using a vector-based retriever, such as a `DensePassageRetriever` or `EmbeddingRetriever`. Required.
`use_prefiltering`	Boolean	`True`/`False` Default: `False`	Specifies when to apply filters to search. This is only relevant if you use an `EmbeddingRetriever`. With `EmbeddingRetriever`, DeepsetCloudDocumentStore defaults to post-filtering when querying with filters. This means the filters are applied after the documents are retrieved. You can change it to pre-filtering, where the filters are applied before retrieving the documents. This comes at the cost of higher latency, though. For the `BM25Retriever`, filtering is always applied before a search. Required.
`search_fields`	Union[str, list]	Default: `content`	The names of the fields that `BM25Retriever` uses to match the incoming query against the documents. For example: `["content", "title"]`. Required.
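
As an illustration, here is a minimal sketch of how these parameters might look in the `components` section of a pipeline YAML. It assumes the usual Haystack-style layout with `name`, `type`, and `params` keys. The component name `DocumentStore` is a placeholder and the values are examples only, not recommendations. The connection parameters (`api_key`, `workspace`, `index`, `api_endpoint`, `label_index`) are left out because the Pipeline Designer ignores them.

```yaml
components:
  - name: DocumentStore                   # Placeholder name, choose your own
    type: DeepsetCloudDocumentStore
    params:
      duplicate_documents: overwrite      # One of: skip, overwrite, fail
      similarity: cosine                  # Should match how the embedding model was optimized
      return_embedding: False
      embedding_dim: 768                  # Only relevant for vector-based retrievers
      use_prefiltering: False             # Set to True to filter before retrieval with EmbeddingRetriever
      search_fields: ["content", "title"] # Fields BM25Retriever searches
```

In this sketch, `similarity` is set to `cosine` on the assumption that the embedding model was optimized for cosine similarity. If your model was optimized for dot product similarity, keep the default `dot_product`.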

REST API Runtime Parameters

There are no runtime parameters you can pass to this node when making a request to the Search REST API endpoint.