CohereRanker
Rank documents based on their semantic similarity to the query using Cohere models.
Key Features
- Uses Cohere's rerank endpoint to score documents by their relevance to the query.
- Returns documents ordered from most to least semantically relevant.
- Supports a configurable number of top results to return.
- Works with additional metadata fields for more context during reranking.
- For a list of supported ranking models, see the Cohere documentation.
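The ranking behavior listed above can be pictured with a small, self-contained sketch. It does not call Cohere: `toy_relevance` is a hypothetical word-overlap scorer standing in for the rerank endpoint, and `Doc` is a simplified stand-in for a Haystack `Document`. Only the score, sort, and truncate flow is meant to carry over.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Doc:
    """Simplified stand-in for a Haystack Document (assumption for illustration)."""
    content: str
    score: Optional[float] = None


def toy_relevance(query: str, text: str) -> float:
    """Hypothetical scorer: fraction of query words that appear in the text.
    The real component receives relevance scores from Cohere instead."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def rerank(query: str, docs: List[Doc], top_k: int) -> List[Doc]:
    """Score every document, sort from most to least relevant, keep top_k."""
    scored = [Doc(d.content, toy_relevance(query, d.content)) for d in docs]
    scored.sort(key=lambda d: d.score, reverse=True)
    return scored[:top_k]


docs = [
    Doc("Bananas are yellow"),
    Doc("Berlin is the capital of Germany"),
    Doc("Paris is the capital of France"),
]
ranked = rerank("capital of France", docs, top_k=2)
for doc in ranked:
    print(f"{doc.score:.2f}  {doc.content}")
```

The input order is irrelevant to the result: the document that matches the query best comes first, and anything beyond `top_k` is dropped.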
Configuration
- Drag the CohereRanker component onto the canvas from the Component Library.
- Click on the component to open the configuration panel.
- On the General tab:
  - Select the Cohere ranking model to use. Make sure Haystack Platform is connected to your Cohere account. For details, see Use Cohere Models.
  - Set top_k to the maximum number of documents to return.
- Go to the Advanced tab to configure additional settings such as api_base_url, meta_fields_to_embed, meta_data_separator, and max_tokens_per_doc.
Connections
CohereRanker receives documents from a retriever and the user query from Input. It outputs a ranked list of documents through its documents output. You can connect its output to PromptBuilder or use it as the pipeline's final document output.
It works well after DocumentJoiner in hybrid retrieval pipelines that combine keyword and semantic search results.
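To see why a ranker is useful after joining hybrid results, consider this minimal sketch. `Doc` and `join_concatenate` are hypothetical stand-ins, and the deduplication rule (keep the highest-scored copy of a repeated document) is an assumption about how a concatenating joiner behaves, not a copy of its implementation.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Doc:
    """Simplified stand-in for a Haystack Document (assumption for illustration)."""
    id: str
    content: str
    score: Optional[float] = None


def join_concatenate(*doc_lists: List[Doc]) -> List[Doc]:
    """Merge result lists; when an id appears more than once,
    keep the copy with the highest retriever score (assumed behavior)."""
    by_id: Dict[str, List[Doc]] = defaultdict(list)
    for docs in doc_lists:
        for doc in docs:
            by_id[doc.id].append(doc)
    return [
        max(copies, key=lambda d: d.score if d.score is not None else float("-inf"))
        for copies in by_id.values()
    ]


# Keyword and embedding retrievers score on different scales.
keyword_hits = [Doc("a", "keyword hit", 7.2), Doc("b", "shared hit", 5.1)]
semantic_hits = [Doc("b", "shared hit", 0.83), Doc("c", "embedding hit", 0.79)]

joined = join_concatenate(keyword_hits, semantic_hits)
print([(d.id, d.score) for d in joined])
```

The joined scores (7.2, 5.1, 0.79) come from two different scoring schemes and are not comparable with each other. A ranker placed after the joiner assigns every candidate a fresh score against the same query, which gives the combined list a single consistent order.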
Source Code
To check this component's source code, open ranker.py in the Haystack Core Integrations repository.
Usage Examples
Basic Configuration
CohereRanker:
  type: haystack_integrations.components.rankers.cohere.ranker.CohereRanker
  init_parameters:
    model: rerank-v3.5
    top_k: 10
    api_key:
      type: env_var
      env_vars:
        - COHERE_API_KEY
        - CO_API_KEY
      strict: false
    api_base_url: https://api.cohere.com
    meta_data_separator: "\n"
    max_tokens_per_doc: 4096
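The api_key entry above lists two environment variables and sets strict to false. The following sketch shows one plausible reading of that configuration: the first variable that is set wins, and a missing key only raises an error when strict is true. `resolve_api_key` is a hypothetical helper written for this illustration, not the platform's own code, and `example-key` is a placeholder.

```python
import os
from typing import Optional, Sequence


def resolve_api_key(env_vars: Sequence[str], strict: bool) -> Optional[str]:
    """Return the value of the first environment variable that is set.
    With strict=True, raise if none is set; otherwise return None."""
    for name in env_vars:
        value = os.environ.get(name)
        if value is not None:
            return value
    if strict:
        raise ValueError(f"None of the environment variables {list(env_vars)} are set")
    return None


# Simulate an environment where only the second variable is defined.
os.environ.pop("COHERE_API_KEY", None)
os.environ["CO_API_KEY"] = "example-key"  # placeholder value

key = resolve_api_key(["COHERE_API_KEY", "CO_API_KEY"], strict=False)
print(key is not None)
```

With this reading, strict: false lets the pipeline configuration load even when no key is present, and the missing key only becomes a problem when the ranker actually needs to call Cohere.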
Using the Component in a Pipeline
This is an example of a document search pipeline where CohereRanker receives joined documents from both a keyword and a semantic retriever. It then ranks the documents based on their similarity to the user query and outputs them as the final result.
components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 768
          similarity: cosine
          index: ''
          max_chunk_bytes: 104857600
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          timeout:
      top_k: 20
  embedding_retriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: true
          verify_certs: false
          hosts:
            - ${OPENSEARCH_HOST}
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          embedding_dim: 768
          similarity: cosine
          index: ''
          max_chunk_bytes: 104857600
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          timeout:
      top_k: 20
  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate
  DeepsetNvidiaTextEmbedder:
    type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
    init_parameters:
      model: intfloat/multilingual-e5-base
      prefix: ''
      suffix: ''
      truncate:
      normalize_embeddings: false
      timeout:
      backend_kwargs:
  CohereRanker:
    type: haystack_integrations.components.rankers.cohere.ranker.CohereRanker
    init_parameters:
      model: rerank-v3.5
      top_k: 10
      api_key:
        type: env_var
        env_vars:
          - COHERE_API_KEY
          - CO_API_KEY
        strict: false
      api_base_url: https://api.cohere.com
      meta_fields_to_embed:
      meta_data_separator: "\n"
      max_tokens_per_doc: 4096
connections:
  - sender: bm25_retriever.documents
    receiver: document_joiner.documents
  - sender: embedding_retriever.documents
    receiver: document_joiner.documents
  - sender: DeepsetNvidiaTextEmbedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: document_joiner.documents
    receiver: CohereRanker.documents
max_runs_per_component: 100
metadata: {}
inputs:
  query:
    - bm25_retriever.query
    - DeepsetNvidiaTextEmbedder.text
    - CohereRanker.query
  filters:
    - bm25_retriever.filters
    - embedding_retriever.filters
outputs:
  documents: CohereRanker.documents
Parameters
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | The user query. |
| documents | List[Document] | | The documents to rank. |
| top_k | Optional[int] | None | The maximum number of documents you want the Ranker to return. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | List of documents most similar to the given query, in descending order of similarity. |
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | str | rerank-v3.5 | Cohere model name. Check the list of supported models in the Cohere documentation. |
| top_k | int | 10 | The maximum number of documents to return. |
| api_key | Secret | Secret.from_env_var(['COHERE_API_KEY', 'CO_API_KEY']) | Cohere API key. |
| api_base_url | str | https://api.cohere.com | The base URL of the Cohere API. |
| meta_fields_to_embed | Optional[List[str]] | None | List of meta fields that should be concatenated with the document content for reranking. |
| meta_data_separator | str | \n | Separator used to concatenate the meta fields to the Document content. |
| max_tokens_per_doc | int | 4096 | The maximum number of tokens to embed for each document. |
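As an illustration of how meta_fields_to_embed and meta_data_separator work together, here is a minimal sketch. `Doc` and `build_rerank_text` are hypothetical names, and the rule of skipping missing or empty metadata fields is an assumption. The idea is that selected metadata values are placed in front of the document content, joined by the separator, before the text is sent for reranking.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Doc:
    """Simplified stand-in for a Haystack Document (assumption for illustration)."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def build_rerank_text(
    doc: Doc,
    meta_fields_to_embed: List[str],
    meta_data_separator: str = "\n",
) -> str:
    """Join the selected metadata values and the content with the separator.
    Fields that are missing or empty are skipped (assumed behavior)."""
    meta_values = [str(doc.meta[key]) for key in meta_fields_to_embed if doc.meta.get(key)]
    return meta_data_separator.join(meta_values + [doc.content])


doc = Doc(
    content="Reranking reorders retrieved documents.",
    meta={"title": "Rankers", "page": 3},
)

# "section" is not present in the metadata, so only the title is prepended.
text = build_rerank_text(doc, ["title", "section"])
print(text)
```

Adding a field such as a title can help when the content alone is ambiguous, at the cost of using part of the max_tokens_per_doc budget for metadata.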
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | Query string. |
| documents | List[Document] | | List of documents to rank. |
| top_k | Optional[int] | None | The maximum number of documents you want the Ranker to return. |
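Because top_k appears both as an init parameter and as a run method parameter, it helps to spell out how the two could interact. The sketch below assumes that a value passed at query time takes precedence and that the configured value is the fallback. `effective_top_k` is a hypothetical helper, not part of the component.

```python
from typing import Optional


def effective_top_k(init_top_k: int, run_top_k: Optional[int] = None) -> int:
    """Pick the top_k to apply for one query.
    Assumption: a run-time value overrides the value set at initialization."""
    top_k = run_top_k if run_top_k is not None else init_top_k
    if top_k <= 0:
        raise ValueError(f"top_k must be greater than 0, got {top_k}")
    return top_k


configured = effective_top_k(init_top_k=10)               # no override at query time
overridden = effective_top_k(init_top_k=10, run_top_k=3)  # query-time override
print(configured, overridden)
```

Under this assumption, the pipeline configuration sets a sensible default, and individual queries can request fewer or more documents without changing the pipeline.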