Ranker

Ranker orders documents according to the given criteria. For example, you can use it in your query pipeline to prioritize the most recent documents.

Rankers are made to improve your retrieval results. Besides fast and simple feature-based rankers (such as RecentnessRanker), you can use rich transformer-based models, which are sensitive to word order and syntax. In a pipeline, you can pair Ranker with a keyword-based Retriever, like BM25Retriever. BM25Retriever is lightweight but insensitive to word order. Adding a Ranker after BM25Retriever in your query pipeline offsets this weakness and results in a better-sorted list of relevant documents.
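To illustrate why a keyword-based retriever is insensitive to word order, here is a minimal sketch. It is not BM25 itself, only the bag-of-words view that keyword matching builds on. The function name `bag_of_words` is hypothetical and used for illustration only.

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count lowercase, whitespace-separated tokens, discarding their order."""
    return Counter(text.lower().split())


# Two sentences with opposite meanings contain the same tokens,
# so a pure keyword representation cannot tell them apart.
first = bag_of_words("dog bites man")
second = bag_of_words("man bites dog")
print(first == second)
```

A transformer-based Ranker reads the tokens in sequence, which is why placing it after a keyword-based Retriever can recover distinctions like this one.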

The improvement the Ranker brings comes at the cost of computational time.

Basic Information

  • Pipeline type: Used in query pipelines.
  • Nodes that can precede it in a pipeline: JoinDocuments, Retriever
  • Nodes that can follow it in a pipeline: JoinDocuments, PromptNode, Reader
  • Node input: Documents
  • Node output: Documents
  • Available node classes: CohereRanker, DiversityRanker, EmbeddingRanker, LostInTheMiddleRanker, RecentnessRanker, SentenceTransformersRanker

Rankers Overview

There are two categories of rankers: model-based and feature-based. See the table below for a comparison of the two:

| | Model-Based Rankers | Feature-Based Rankers |
|---|---|---|
| Description | Model-based rankers work with document embeddings. They embed both the documents and the query using a transformer model. They then sort the documents by their similarity to the query. These rankers are trained using labeled datasets. They're language-specific. | Feature-based rankers work with document metadata. They sort documents based on features in their metadata fields, such as recentness. |
| Advantages | Powerful. Use transformer encoder models that take word order and syntax into account. Build strong semantic representations of text. Can improve the initial ranking done by a weak but fast retriever. | Lightweight. No training required. Can sort based on document features, such as recentness. Can improve the initial ranking that lacks a given document feature. |
| Disadvantages | More expensive computationally. Have problems with out-of-vocabulary words. | |
| Available classes | SentenceTransformersRanker, EmbeddingRanker, CohereRanker, DiversityRanker | RecentnessRanker, LostInTheMiddleRanker |

CohereRanker

CohereRanker uses models by Cohere to rerank documents. Cohere models are trained with a context length of 512 tokens. The model takes into account both the tokens from the query and the document. If your query is longer than 256 tokens, it's shortened to the first 256 tokens.

For more information and best practices on re-ranking in Cohere, see Cohere documentation.

DiversityRanker

This ranker is typically used in retrieval augmented generation (RAG) pipelines where a large language model (LLM) generates the answer based on the documents you feed to it in the prompt. DiversityRanker ensures that the generated answer is based on diverse documents.

It uses a sentence transformer model to calculate the semantic representation (embedding) for each document. Then, it ranks the documents so that each subsequent document is the least similar to the ones it already selected. This results in a diverse set of documents.
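The greedy ordering described above can be sketched as follows. This is an illustration under stated assumptions, not the node's actual implementation: the embeddings are toy vectors rather than sentence transformer outputs, the first document is assumed to be the one closest to the query, and "least similar" is assumed to mean the lowest average cosine similarity to the documents already selected. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def greedy_diversity_order(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[int]:
    """Return document indices ordered so each pick is least similar to earlier picks."""
    if not doc_vecs:
        return []
    remaining = list(range(len(doc_vecs)))
    # Assumption: start with the document most similar to the query.
    first = max(remaining, key=lambda i: cosine(query_vec, doc_vecs[i]))
    selected = [first]
    remaining.remove(first)
    while remaining:
        # Average similarity of a candidate to everything selected so far.
        def avg_similarity(i: int) -> float:
            return sum(cosine(doc_vecs[i], doc_vecs[j]) for j in selected) / len(selected)

        next_doc = min(remaining, key=avg_similarity)
        selected.append(next_doc)
        remaining.remove(next_doc)
    return selected


# Documents 0 and 1 are near-duplicates; document 2 covers a different topic.
order = greedy_diversity_order([1.0, 0.0], [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
print(order)  # [0, 2, 1]
```

In this toy example, the near-duplicate document moves to the end, so the first two documents passed to the LLM cover different content.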

You can use it in combination with other rankers. If you do so, place it after the similarity ranker, like SentenceTransformersRanker, but before the LostInTheMiddleRanker. Such a setup is typical for long-form question answering.

For more information, see our blog post.

EmbeddingRanker

EmbeddingRanker re-ranks documents based on the similarity score calculated by comparing the query vector embedding with each document vector embedding. It uses bi-encoder models. When combined with a keyword-based retriever (like BM25Retriever), it brings better results without the additional indexing costs of vector-based retrievers.
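As a rough sketch of bi-encoder re-ranking, the example below scores each document vector against the query vector and keeps the best matches. It assumes the embeddings are already computed (toy vectors stand in for model outputs) and uses a plain dot product. The function name `rank_by_embedding` is hypothetical.

```python
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_by_embedding(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    top_k: int = 10,
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, most similar first, cut to top_k."""
    scored = [(i, dot(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# The query and the documents are embedded independently of each other,
# which is what distinguishes a bi-encoder from a cross-encoder.
ranked = rank_by_embedding([1.0, 0.0], [[0.2, 0.9], [0.8, 0.1], [0.5, 0.5]], top_k=2)
print([index for index, _ in ranked])  # [1, 2]
```

Because only the documents returned by the Retriever are embedded at query time, this approach needs no vector index in the document store.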

LostInTheMiddleRanker

This ranker is intended for pipelines using a large language model (LLM). Recent research showed that LLMs struggle with focusing on relevant passages located in the middle of a long context. The goal of the LostInTheMiddleRanker is to make it easy for an LLM to access the most relevant documents by placing them at the beginning and at the end of the context window.
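One way to place the most relevant documents at both ends of the context is to alternate them between the front and the back of the list. The sketch below assumes the input is already sorted from most to least relevant. It illustrates the idea and is not necessarily the exact ordering the node produces. The function name is hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def lost_in_the_middle_order(docs: Sequence[T]) -> List[T]:
    """Reorder relevance-sorted docs so the best ones sit at the start and the end."""
    front: List[T] = []
    back: List[T] = []
    for position, doc in enumerate(docs):
        # Even positions fill the front, odd positions fill the back.
        (front if position % 2 == 0 else back).append(doc)
    # Reverse the back half so the second-best document ends up last.
    return front + back[::-1]


# Numbers stand for relevance ranks: 1 is the most relevant document.
print(lost_in_the_middle_order([1, 2, 3, 4, 5]))  # [1, 3, 5, 4, 2]
```

The least relevant documents end up in the middle of the list, where the LLM is least likely to pay attention to them.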

LostInTheMiddleRanker is meant to be used in combination with other rankers. In a RAG pipeline, place it as the last ranker after the relevance and diversity rankers.

For more information, see the Lost in the Middle: How Language Models Use Long Contexts paper by Liu et al. and our blog post.

RecentnessRanker

This ranker re-ranks documents based on both their age and their relevance. You can specify the date metadata field you want it to rank on. Additionally, you can set the weight and ranking_mode parameters for RecentnessRanker. The weight parameter controls whether you want to rank documents based on their recentness, relevance, or both. The ranking_mode parameter determines how the recentness and relevance rankings are combined.
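The interplay between relevance, recentness, and weight can be sketched with reciprocal rank fusion. This is a simplified illustration, not the node's actual code: the constant `k=60`, the function names, and the dictionary-based documents are assumptions made for the example. The input list is assumed to be sorted by relevance, most relevant first.

```python
from datetime import date
from typing import Dict, List


def rrf(rank: int, k: int = 60) -> float:
    """Reciprocal rank fusion score for a 1-based rank (k is an assumed constant)."""
    return 1.0 / (k + rank)


def recentness_rerank(docs: List[Dict], date_field: str, weight: float = 0.5) -> List[Dict]:
    """Blend relevance rank and recency rank; weight=0 ignores age, weight=1 ignores relevance."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in the range [0, 1]")
    n = len(docs)
    # Rank documents by date, newest first (1-based ranks).
    newest_first = sorted(range(n), key=lambda i: docs[i][date_field], reverse=True)
    recency_rank = {i: rank for rank, i in enumerate(newest_first, start=1)}
    # The position in the input list is the relevance rank.
    scores = [
        (1 - weight) * rrf(i + 1) + weight * rrf(recency_rank[i]) for i in range(n)
    ]
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in order]


docs = [
    {"id": "a", "updated_at": date(2020, 1, 1)},  # most relevant, oldest
    {"id": "b", "updated_at": date(2023, 1, 1)},  # newest
    {"id": "c", "updated_at": date(2021, 1, 1)},
]
print([d["id"] for d in recentness_rerank(docs, "updated_at", weight=1.0)])  # ['b', 'c', 'a']
```

Rank-based fusion only needs the order of the documents, not their scores, so it also works when the Retriever's scores are not on a comparable scale.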

SentenceTransformersRanker

This ranker uses a single cross-encoder model to encode the query and each document together. You can use any sentence-transformers-based cross-encoder model. Compared to bi-encoders, cross-encoders typically produce better results at the cost of higher latency.

Recommended Models

For an overview of the ranking models we recommend, see Models in deepset Cloud.

Usage Example

name: 'cohereranker_pipeline'

components:
  - name: Ranker
    type: CohereRanker
    params:
      model_name_or_path: rerank-multilingual-v2.0
      api_key: <your_cohere_api_key>
      top_k: 10
      ...
pipelines:
  - name: query
    nodes:
      - name: Retriever
        inputs: [Query]
      - name: Ranker
        inputs: [Retriever]
      - name: PromptNode
        inputs: [Ranker]
  - name: indexing
    nodes:
      # here comes your indexing pipeline configuration

# These rankers are usually used in combination with other rankers
# LostInTheMiddleRanker comes at the end of the query pipeline, before PromptNode
# DiversityRanker comes after a similarity ranker but before LostInTheMiddleRanker

name: 'diversity_lostinthemiddle_rankers_pipeline'

components:
  - name: DiversityRanker
    type: DiversityRanker
  - name: LostInTheMiddle
    type: LostInTheMiddleRanker
    params:
      word_count_threshold: 1024
  ...
pipelines:
  - name: query
    nodes:
      - name: Retriever
        inputs: [Query]
      - name: DiversityRanker
        inputs: [Retriever]
      - name: LostInTheMiddle
        inputs: [DiversityRanker]
      - name: PromptNode
        inputs: [LostInTheMiddle]
  - name: indexing
    nodes:
      # here comes your indexing pipeline configuration

name: 'embedding_ranker_pipeline'

components:
  - name: Ranker
    type: EmbeddingRanker
    params:
      embedding_model: sentence-transformers/all-mpnet-base-v2
      ...
pipelines:
  - name: query
    nodes:
      - name: Retriever
        inputs: [Query]
      - name: Ranker
        inputs: [Retriever]
      - name: PromptNode
        inputs: [Ranker]
  - name: indexing
    nodes:
      # here comes your indexing pipeline configuration

name: 'search_by_recency'

components:
  - name: Ranker
    type: RecentnessRanker
    params:
      date_meta_field: updated_at # this is the name of the document's metadata field containing the date
      ...
pipelines:
  - name: query
    nodes:
      - name: Retriever
        inputs: [Query]
      - name: Ranker
        inputs: [Retriever]
      - name: PromptNode
        inputs: [Ranker]
  - name: indexing
    nodes:
      # here comes your indexing pipeline configuration

name: 'ranker_pipeline'

components:
  - name: Ranker
    type: SentenceTransformersRanker
    params:
      model_name_or_path: cross-encoder/ms-marco-MiniLM-L-12-v2
      ...
pipelines:
  - name: query
    nodes:
      - name: Retriever
        inputs: [Query]
      - name: Ranker
        inputs: [Retriever]
      - name: PromptNode
        inputs: [Ranker]
  - name: indexing
    nodes:
      # here comes your indexing pipeline configuration

Combining Rankers

You can use multiple Rankers in your pipeline. Combining RecentnessRanker with a model-based Ranker is a handy way to handle situations where the Retriever you're using returns improper scores (scores that don't fall in the range of 0 to 1), but you want RecentnessRanker to combine rankings based on the score. In that case, place the model-based Ranker before RecentnessRanker so that RecentnessRanker receives properly scaled scores.

...
components:
  - name: Retriever 
    type: EmbeddingRetriever # Uses a Transformer model to encode the document and the query
    params:
      document_store: DocumentStore
      embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1 # Model optimized for semantic search
      model_format: sentence_transformers
      top_k: 20
  - name: RecentnessRanker
    type: RecentnessRanker # Incorporate recentness into the semantic ranking
    params:
      date_meta_field: updated_at
      ranking_mode: score
  - name: SentenceTransformersRanker 
    type: SentenceTransformersRanker # Model-based Ranker
    params:
      model_name_or_path: cross-encoder/ms-marco-MiniLM-L-12-v2
      top_k: 15

pipelines:
  - name: query
    nodes:
      - name: Retriever
        inputs: [Query]
      - name: SentenceTransformersRanker
        inputs: [Retriever]
      - name: RecentnessRanker
        inputs: [SentenceTransformersRanker]
      - name: PromptNode
        inputs: [RecentnessRanker]
        ...

Arguments

Here are the arguments each ranker type can take:

CohereRanker Arguments

| Argument | Type | Possible Values | Description |
|---|---|---|---|
| api_key | String | | The Cohere API key so that you can use Cohere models. Required. |
| model_name_or_path | String | | The name of the Cohere model you want to use for reranking. Check the list of supported models in the Cohere documentation. Required. |
| top_k | Integer | Default: 10 | The maximum number of documents to return. Required. |
| max_chunks_per_doc | Integer | Default: None | Specifies the maximum number of chunks your document is split into if the document exceeds 512 tokens. (This is because Cohere models break documents into chunks of 512 tokens.) The default None setting splits your documents into a maximum of 10 chunks. Note that this parameter counts in the length of the query as well. So if your query is 32 tokens, the split is: first chunk: 32 tokens from the query + the first 480 tokens from the document; second chunk: 32 tokens from the query + tokens 481 to 960 from the document, and so on. If there are still some tokens from the document left after splitting into 10 chunks, they're disregarded. Optional. |
| embed_meta_fields | List of strings | | Concatenates the provided metadata fields into the text passage that is then used in reranking. The concatenated metadata are not included in the documents that are returned. (Original documents are returned.) |
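The chunking arithmetic described for max_chunks_per_doc can be checked with a short sketch. It only computes which document tokens land in which chunk and is not Cohere's implementation. The function name and its parameters are hypothetical; the 256-token query limit and the 512-token context length come from the CohereRanker description on this page.

```python
from typing import List, Tuple


def chunk_ranges(
    query_len: int,
    doc_len: int,
    context_len: int = 512,
    max_chunks: int = 10,
    max_query_len: int = 256,
) -> List[Tuple[int, int]]:
    """Return 1-based, inclusive (start, end) document token ranges, one per chunk."""
    query_len = min(query_len, max_query_len)  # long queries are cut to 256 tokens
    per_chunk = context_len - query_len  # room left for document tokens in each chunk
    ranges: List[Tuple[int, int]] = []
    start = 1
    while start <= doc_len and len(ranges) < max_chunks:
        end = min(start + per_chunk - 1, doc_len)
        ranges.append((start, end))
        start = end + 1
    return ranges  # tokens beyond the last range are disregarded


# A 32-token query leaves 480 document tokens per chunk.
print(chunk_ranges(query_len=32, doc_len=1000))  # [(1, 480), (481, 960), (961, 1000)]
```

With a 32-token query and the default limit of 10 chunks, at most 4,800 document tokens are considered, so very long documents are worth splitting before they reach the Ranker.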

DiversityRanker Arguments

| Argument | Type | Possible Values | Description |
|---|---|---|---|
| model_name_or_path | Union[String, Path] | Model path. Default: all-MiniLM-L6-v2 | The path to the directory of a saved sentence transformers model. Required. |
| top_k | Integer | Default: None | The maximum number of documents the Ranker should return. Optional. |
| use_gpu | Boolean | True, False. Default: True | Specifies whether to use all available GPUs or a CPU. Falls back on a CPU if no GPU is available. Optional. |
| devices | String, torch.device | Default: None | A list of devices to use for inference. Optional. |
| similarity | Literal | dot_product, cosine. Default: dot_product | Specifies the function to apply for calculating the similarity of query and passage embeddings. Required. |

EmbeddingRanker Arguments

| Argument | Type | Possible Values | Description |
|---|---|---|---|
| embedding_model | Union[String, Path] | Model path | The path to the directory of a saved model or the name of a public model, for example: sentence-transformers/all-mpnet-base-v2. Required. |
| top_k | Integer | Default: 10 | The maximum number of documents the Ranker should return. Required. |
| use_gpu | Boolean | True, False. Default: True | Specifies whether to use all available GPUs or a CPU. Falls back on a CPU if no GPU is available. Required. |
| devices | String, torch.device | Default: None | A list of devices to use for inference. Optional. |
| batch_size | Integer | Default: 16 | The number of documents you want the ranker to process at a time. Required. |
| scale_score | Boolean | True, False. Default: True | Scales the similarity score to a unit interval in the range of 0 to 1, where 1 means extremely relevant. True: scales similarity scores that naturally have a different value range, such as cosine or dot_product. False: uses raw similarity scores. Required. |
| max_seq_len | Integer | Default: 512 | Specifies the maximum number of tokens the document text can have. Longer documents are truncated. Required. |
| similarity | String | dot-product, cosine. Default: dot-product | Specifies the function to apply for calculating the similarity of query and passage embeddings. Required. |
| embedding_dim | Integer | Default: 768 | Specifies the dimensionality of the embedding vector. |
| return_embedding | Boolean | True, False. Default: False | Returns document embeddings. |
| use_auth_token | Union[string, Boolean] | Default: None | If you're using a private model from Hugging Face, pass the API token used to download the model in this parameter. If this parameter is set to True, then the token generated when running transformers-cli login (stored in ~/.huggingface) is used. Optional. |
| raise_for_missing_embeddings | Boolean | True, False. Default: True | Raises an error if there are embeddings missing. |

LostInTheMiddleRanker Arguments

| Argument | Type | Possible Values | Description |
|---|---|---|---|
| word_count_threshold | Integer | Default: None | The maximum total number of words across all documents the ranker selects. If you specify this parameter, the ranker includes all documents up to the point where adding another document would exceed the word_count_threshold. The last document that exceeds the threshold is included in the resulting list of documents, but all subsequent documents are discarded. Optional. |
| top_k | Integer | Default: None | The maximum number of documents you want the ranker to return. Optional. |
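The cut-off behavior of word_count_threshold can be sketched as below. This is an illustration of the description in the table, not the node's code: words are counted by splitting on whitespace, and a running total that exactly equals the threshold is assumed not to exceed it. The function name is hypothetical.

```python
from typing import List


def select_by_word_count(docs: List[str], word_count_threshold: int) -> List[str]:
    """Keep documents until the running word count exceeds the threshold."""
    selected: List[str] = []
    total_words = 0
    for doc in docs:
        selected.append(doc)  # the document that crosses the threshold is still kept
        total_words += len(doc.split())
        if total_words > word_count_threshold:
            break  # everything after this document is discarded
    return selected


docs = ["one two three", "four five six", "seven eight nine"]
print(len(select_by_word_count(docs, word_count_threshold=5)))  # 2
```

Because the document that crosses the threshold is kept, the total word count of the result can be higher than word_count_threshold, so leave some headroom when sizing it against the LLM's context window.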

RecentnessRanker Arguments

| Argument | Type | Possible Values | Description |
|---|---|---|---|
| date_meta_field | String | | The name of the metadata field in your documents that contains the date, for example updated_at. This is a required parameter, as dates are needed for sorting documents from newest to oldest. Required. |
| top_k | Integer | Default: None (all documents are returned) | Specifies how many documents to return. You may want to set a larger top_k for the Retriever and then use the RecentnessRanker top_k to filter the documents down. Optional. |
| weight | Float | Values in range [0,1]. Default: 0.5 | Specifies how documents are sorted. 0 means sorting by document age is disabled. 0.5 means both document content and its age have the same impact. 1 means documents are sorted by age only. The most recent documents come first. Optional. |
| ranking_mode | String | score, reciprocal_rank_fusion. Default: reciprocal_rank_fusion | Specifies the method used to combine the documents fetched by the Retriever with recentness ranking. reciprocal_rank_fusion: combines individual document rankings and uses the outcome to rank the documents. score: uses the document's score returned by the Retriever. Make sure you use score only with Retrievers or Rankers that return properly distributed scores in the range [0,1]. If your Retriever doesn't return properly distributed scores (like BM25Retriever), you can either set ranking_mode to reciprocal_rank_fusion or combine RecentnessRanker with a model-based Ranker, such as SentenceTransformersRanker. |

SentenceTransformersRanker Arguments

| Argument | Type | Possible Values | Description |
|---|---|---|---|
| model_name_or_path | String | Example: cross-encoder/ms-marco-MiniLM-L-12-v2 | The path to a saved model or the name of a public model from Hugging Face. For a list of available models, see cross encoders. Required. |
| model_version | String | Default: None | The version of the model from Hugging Face. This can be a tag name, a branch name, or a commit hash. Optional. |
| top_k | Integer | Default: 10 | The maximum number of documents to return. Required. |
| use_gpu | Boolean | True, False. Default: True | Specifies whether to use all available GPUs or a CPU. Falls back on a CPU if no GPU is available. Required. |
| batch_size | Integer | Default: 16 | The number of documents you want the ranker to process at a time. Required. |
| scale_score | Boolean | True, False. Default: True | If the model only predicts a single label, the raw predictions are transformed using a Sigmoid activation function. No scaling is applied to multi-label predictions. If you don't want to scale raw predictions, set this value to False. Required. |
| progress_bar | Boolean | True, False. Default: True | Shows a progress bar when processing the documents. Required. |
| use_auth_token | Union[string, Boolean] | Default: None | If you're using a private model from Hugging Face, pass the API token used to download the model in this parameter. If this parameter is set to True, then the token generated when running transformers-cli login (stored in ~/.huggingface) is used. Optional. |
| embed_meta_fields | List of strings | Default: None | Concatenates the provided meta fields to a text passage that is then used in reranking. The concatenated metadata are not included in the returned documents. |
| model_kwargs | Dictionary | Default: None | Additional keyword arguments passed to AutoModelForSequenceClassification.from_pretrained when loading the model specified in model_name_or_path. See the model's documentation for details on what kwargs you can pass. |
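To show what the Sigmoid scaling mentioned for scale_score does to single-label predictions, here is a minimal sketch. The raw values are made up for illustration and do not come from a real model. The function name `sigmoid` is defined locally and is not a call into the ranker.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Numerically stable logistic function mapping any real number into (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


# Hypothetical raw cross-encoder predictions for four documents.
raw_predictions: List[float] = [7.2, 0.0, -3.5, 1.1]
scaled = [sigmoid(value) for value in raw_predictions]

# Scaling changes the values, not the order of the documents.
raw_order = sorted(range(len(raw_predictions)), key=lambda i: raw_predictions[i], reverse=True)
scaled_order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
print(raw_order == scaled_order)  # True
```

Scaled scores fall between 0 and 1, which is the range RecentnessRanker expects when it combines rankings by score.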