CohereTextEmbedder

A component for embedding strings using Cohere models.

Basic Information

  • Type: haystack_integrations.components.embedders.cohere.text_embedder.CohereTextEmbedder
  • Components it can connect with:
    • Query: CohereTextEmbedder receives the query to embed from Query.
    • Embedding Retrievers: CohereTextEmbedder can send the embedded query to an embedding retriever that uses it to find matching documents.

Inputs

| Parameter | Type | Description |
| --- | --- | --- |
| text | str | The text to embed. |

Outputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| embedding | List[float] | | The embedding of the text. |
| meta | Dict[str, Any] | | Metadata about the request. |

Overview

CohereTextEmbedder uses Cohere models to embed strings, such as user queries. Use this component in apps with embedding retrieval to transform your query into a vector.

For a list of supported models, see the Cohere documentation.
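
Outside Pipeline Builder, the same component can also be called directly from Python. The following is a minimal sketch, assuming the Cohere integration for Haystack (the haystack_integrations package named in the Type field) is installed and a Cohere API key is available in COHERE_API_KEY or CO_API_KEY. The helper name embed_query is hypothetical and not part of the library.

```python
from typing import List


def embed_query(text: str, model: str = "embed-english-v2.0") -> List[float]:
    """Embed a single query string and return its vector.

    embed_query is a hypothetical helper used only for illustration.
    """
    # Imported inside the function so this sketch can be loaded even where
    # the integration is not installed.
    from haystack_integrations.components.embedders.cohere import CohereTextEmbedder

    # api_key is left at its default, which reads COHERE_API_KEY or CO_API_KEY.
    embedder = CohereTextEmbedder(model=model, input_type="search_query")

    # run() returns a dictionary with "embedding" and "meta" (see Outputs).
    result = embedder.run(text=text)
    return result["embedding"]
```

Calling embed_query("What is an embedding retriever?") sends one request to Cohere and returns the query vector as a list of floats, which is the value an embedding retriever expects as its query embedding.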

Embedding Models in Query Pipelines and Indexes

The embedding model you use to embed documents in your indexing pipeline must be the same as the embedding model you use to embed the query in your query pipeline.

This means the embedders for your indexing and query pipelines must match. For example, if you use CohereDocumentEmbedder to embed your documents, you should use CohereTextEmbedder with the same model to embed your queries.
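
One way to catch a mismatch early is to compare the model setting of both embedders before deploying. The sketch below assumes each embedder is available as a dictionary shaped like its YAML definition (a type plus init_parameters); the function name assert_same_embedding_model is hypothetical and is not part of Haystack or deepset.

```python
def assert_same_embedding_model(indexing_embedder: dict, query_embedder: dict) -> str:
    """Return the shared model name, or raise if the two embedders differ."""
    index_model = indexing_embedder["init_parameters"]["model"]
    query_model = query_embedder["init_parameters"]["model"]
    if index_model != query_model:
        raise ValueError(
            f"Indexing uses {index_model!r} but the query pipeline uses {query_model!r}."
        )
    return index_model


# Illustrative component definitions mirroring the YAML structure.
indexing_embedder = {
    "type": "haystack_integrations.components.embedders.cohere.document_embedder.CohereDocumentEmbedder",
    "init_parameters": {"model": "embed-english-v2.0"},
}
query_embedder = {
    "type": "haystack_integrations.components.embedders.cohere.text_embedder.CohereTextEmbedder",
    "init_parameters": {"model": "embed-english-v2.0", "input_type": "search_query"},
}

shared_model = assert_same_embedding_model(indexing_embedder, query_embedder)
print(shared_model)
```

The check only compares model names. It does not verify that documents already stored in the index were embedded with that model.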

Authorization

You need a Cohere API key to use this component. Connect deepset to your Cohere account on the Integrations page.

Connection Instructions

  1. Click your profile icon in the top right corner and choose Integrations.
  2. Click Connect next to the provider.
  3. Enter your API key and submit it.
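
When the component is configured with an api_key of type env_var, as in the pipeline example in this page, the key is looked up in a list of environment variables (COHERE_API_KEY, then CO_API_KEY) with a strict flag. The following sketch illustrates that lookup order in plain Python. It is an assumption about the behavior, written for illustration; resolve_api_key is a hypothetical helper and not the library's implementation.

```python
import os
from typing import Mapping, Optional, Sequence


def resolve_api_key(
    env_vars: Sequence[str],
    strict: bool = True,
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the value of the first environment variable that is set.

    Assumed semantics: with strict=True a missing key raises an error,
    with strict=False the function returns None instead.
    """
    environ = os.environ if environ is None else environ
    for name in env_vars:
        value = environ.get(name)
        if value is not None:
            return value
    if strict:
        raise ValueError(f"None of the environment variables {list(env_vars)} are set.")
    return None


# Example with an explicit mapping instead of the real process environment.
key = resolve_api_key(
    ["COHERE_API_KEY", "CO_API_KEY"],
    strict=False,
    environ={"CO_API_KEY": "example-key"},
)
```

Passing an explicit mapping keeps the example independent of the real process environment; omit the environ argument to read from os.environ.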

Usage Example

Initializing the Component

components:
  CohereTextEmbedder:
    type: haystack_integrations.components.embedders.cohere.text_embedder.CohereTextEmbedder
    init_parameters:
      api_key: <your-cohere-api-key>
      model: embed-english-v2.0
      input_type: search_query
      api_base_url: https://api.cohere.com
      truncate: END
      timeout: 120
      embedding_type: float

Using the Component in a Pipeline

This is an example of a query pipeline with CohereTextEmbedder that receives a query to embed and then sends the embedded query to OpenSearchEmbeddingRetriever to find matching documents.

components:
  CohereTextEmbedder:
    type: haystack_integrations.components.embedders.cohere.text_embedder.CohereTextEmbedder
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - COHERE_API_KEY
          - CO_API_KEY
        strict: false
      model: embed-english-v2.0
      input_type: search_query
      api_base_url: https://api.cohere.com
      truncate: END
      use_async_client: false
      timeout: 120
      embedding_type:
  OpenSearchEmbeddingRetriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      filters:
      top_k: 10
      filter_policy: replace
      custom_query:
      raise_on_failure: true
      efficient_filtering: true
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: Standard-Index-English
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:

connections:
  - sender: CohereTextEmbedder.embedding
    receiver: OpenSearchEmbeddingRetriever.query_embedding

max_runs_per_component: 100

metadata: {}

inputs:
  query:
    - CohereTextEmbedder.text

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | Secret | Secret.from_env_var(['COHERE_API_KEY', 'CO_API_KEY']) | The Cohere API key. |
| model | str | embed-english-v2.0 | The name of the model to use. Choose a model from the list on the component card. |
| input_type | str | search_query | Specifies the type of input you're giving to the model. Supported values are "search_document", "search_query", "classification", and "clustering". Not required for embedding models older than v3, but required for v3 and later. |
| api_base_url | str | https://api.cohere.com | The Cohere API base URL. |
| truncate | str | END | Truncates inputs that are too long, from the start or the end. Supported values are "NONE", "START", and "END". "START" discards the start of the input and "END" discards the end. In both cases, input is discarded until the remaining input is exactly the maximum input token length for the model. With "NONE", an input that exceeds the maximum input token length returns an error. |
| timeout | int | 120 | Request timeout in seconds. |
| embedding_type | Optional[EmbeddingTypes] | None | The type of embeddings to return. Defaults to float embeddings. Note that int8, uint8, binary, and ubinary are only valid for v3 models. |
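
The three truncate values can be illustrated with a short sketch. It is a simplified model of the behavior described for the truncate parameter, not Cohere's implementation: it assumes the input is already split into tokens, and both truncate_tokens and its max_tokens argument are hypothetical names introduced for this example.

```python
from typing import List, Sequence


def truncate_tokens(tokens: Sequence[str], max_tokens: int, truncate: str = "END") -> List[str]:
    """Apply START, END, or NONE truncation to a token sequence."""
    if truncate not in ("NONE", "START", "END"):
        raise ValueError(f"Unsupported truncate value: {truncate!r}")
    if len(tokens) <= max_tokens:
        # The input fits, so nothing is discarded.
        return list(tokens)
    if truncate == "START":
        # Discard tokens from the start and keep the last max_tokens tokens.
        return list(tokens[len(tokens) - max_tokens:])
    if truncate == "END":
        # Discard tokens from the end and keep the first max_tokens tokens.
        return list(tokens[:max_tokens])
    # "NONE": an input that is too long is an error rather than being shortened.
    raise ValueError("Input exceeds the maximum input token length.")


tokens = ["a", "b", "c", "d", "e"]
kept_from_start = truncate_tokens(tokens, 3, "END")   # keeps the first three tokens
kept_from_end = truncate_tokens(tokens, 3, "START")   # keeps the last three tokens
```

With "END" the beginning of a long query is preserved, and with "START" its ending is preserved. "NONE" is the option to choose when silently shortened input is not acceptable.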

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | str | | The text to embed. |