Check the init and runtime parameters you can configure for CohereDocumentEmbedder.
YAML Init Parameters
These are the parameters you can specify in pipeline YAML:
Parameter | Type | Possible values | Description |
---|---|---|---|
api_key | Secret | Default: {"type": "env_var", "env_vars": ["COHERE_API_KEY", "CO_API_KEY"], "strict": False} | The Cohere API key. Required. |
model | String | embed-english-v3.0, embed-english-light-v3.0, embed-multilingual-v3.0, embed-multilingual-light-v3.0, embed-english-v2.0, embed-english-light-v2.0, embed-multilingual-v2.0; Default: embed-english-v2.0 | The name of the model to use. For a list of supported models, see the Cohere documentation. Required. |
input_type | String | search_document, search_query, classification, clustering; Default: search_document | Specifies the type of input passed to the model. Optional for models older than v3, required for v3 and later. |
api_base_url | String | Default: https://api.cohere.com | The Cohere API base URL. Required. |
truncate | String | NONE, START, END | Truncates inputs that are too long, from the start or the end. Possible values: START discards the start of the input until the remaining input is exactly the maximum input token length for the model. END discards the end of the input until the remaining input is exactly the maximum input token length for the model. NONE returns an error if the input exceeds the maximum input token length. Required. |
use_async_client | Boolean | True, False; Default: False | Flag to select the AsyncClient. Recommended for applications with many concurrent calls. Required. |
timeout | Integer | Default: 120 | Request timeout in seconds. Required. |
batch_size | Integer | Default: 32 | Number of documents to encode at once. Required. |
progress_bar | Boolean | True, False; Default: True | Shows a progress bar. Disabling it can help keep the logs clean in production deployments. Required. |
meta_fields_to_embed | List of strings | Default: None | List of metadata fields that should be embedded along with the document text. Optional. |
embedding_separator | String | Default: \n | Separator used to concatenate the metadata fields to the document text. Required. |
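
To make the interplay of `meta_fields_to_embed`, `embedding_separator`, and `batch_size` more concrete, here is a minimal Python sketch. It is an illustration only, not the component's actual implementation: the helper names `prepare_text` and `batches` and the sample documents are hypothetical, and it assumes that missing or empty metadata fields are skipped and that metadata values are placed before the document text.

```python
from typing import Any, Dict, Iterator, List, Optional


def prepare_text(
    content: str,
    meta: Dict[str, Any],
    meta_fields_to_embed: Optional[List[str]] = None,
    embedding_separator: str = "\n",
) -> str:
    """Join the selected metadata values and the document text with the separator."""
    fields = meta_fields_to_embed or []
    # Assumption: fields that are missing or set to None are skipped.
    values = [str(meta[name]) for name in fields if meta.get(name) is not None]
    return embedding_separator.join(values + [content])


def batches(texts: List[str], batch_size: int = 32) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size texts."""
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


# Hypothetical sample documents, represented as plain dictionaries.
docs = [
    {"content": "Cohere offers embedding models.", "meta": {"title": "Intro", "author": None}},
    {"content": "Set input_type for v3 models.", "meta": {"title": "Usage"}},
    {"content": "No metadata here.", "meta": {}},
]

texts = [prepare_text(d["content"], d["meta"], ["title", "author"]) for d in docs]
grouped = list(batches(texts, batch_size=2))

print(repr(texts[0]))             # title and content joined by the separator
print([len(b) for b in grouped])  # number of texts sent per request
```

With a `batch_size` of 2, the three texts are split into two groups, which corresponds to two embedding requests instead of one. A larger `batch_size` means fewer requests, each carrying more text.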