SentenceTransformersSimilarityRanker

Rank documents based on their semantic similarity to the query.

Key Features

  • Uses a pre-trained cross-encoder model from Hugging Face to rank documents by semantic relevance.
  • Configurable top_k and score_threshold to control the number and quality of returned documents.
  • Supports score scaling with Sigmoid activation for normalized confidence scores.
  • Configurable query and document prefix strings for instruction-tuned reranking models.
  • Supports multiple backends including torch, ONNX, and OpenVINO for performance optimization.
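
To make the score scaling feature concrete, here is a minimal sketch of what applying a Sigmoid to raw logits does. It uses only the Python standard library and is not the component's own code; the logit values are made up for illustration.

```python
import math


def sigmoid(logit: float) -> float:
    """Map a raw logit to the 0..1 range, using a numerically stable form."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


# Hypothetical raw cross-encoder outputs (not produced by a real model).
raw_logits = [4.2, 0.0, -3.1]

# Scaling keeps the ordering of the documents but bounds every score,
# which makes a fixed score_threshold easier to choose.
scaled = [sigmoid(x) for x in raw_logits]
print(scaled)
```

Because the Sigmoid is monotonic, scaling changes the score values but not the ranking order.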

Configuration

  1. Drag the SentenceTransformersSimilarityRanker component onto the canvas from the Component Library.
  2. Click the component to open the configuration panel.
  3. On the General tab:
    1. Enter the model name or local path, such as cross-encoder/ms-marco-MiniLM-L-6-v2.
  4. Go to the Advanced tab to configure the device, token, top_k, query prefix, document prefix, score threshold, model kwargs, and tokenizer kwargs.

Connections

SentenceTransformersSimilarityRanker receives a query string and a list of documents, typically from a Retriever or DocumentJoiner. It outputs the documents as a ranked list, sorted from most to least relevant. Connect its output to ChatPromptBuilder or AnswerBuilder.
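
The data flow described above (a query plus a list of documents in, a sorted list out) can be sketched in plain Python. This is an illustration only: `Doc`, `toy_score`, and `rank` are hypothetical names, and the word-overlap scorer merely stands in for the cross-encoder model that the real component uses.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a document with content and a score."""
    content: str
    score: Optional[float] = None


def toy_score(query: str, text: str) -> float:
    """Fraction of query words found in the text (NOT a cross-encoder)."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def rank(query: str, documents: List[Doc],
         score_fn: Callable[[str, str], float] = toy_score) -> List[Doc]:
    """Score every (query, document) pair and sort from most to least relevant."""
    scored = [Doc(d.content, score_fn(query, d.content)) for d in documents]
    return sorted(scored, key=lambda d: d.score, reverse=True)


docs = [
    Doc("Berlin is in Germany"),
    Doc("Paris is the capital of France"),
    Doc("France hosts the Louvre"),
]
ranked = rank("capital of France", docs)
for d in ranked:
    print(round(d.score, 2), d.content)
```

The downstream component (for example a prompt builder) then receives the documents in this sorted order.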

Usage Example

components:
  SentenceTransformersSimilarityRanker:
    type: components.rankers.sentence_transformers_similarity.SentenceTransformersSimilarityRanker
    init_parameters:

Parameters

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | | The input query to compare the documents to. |
| documents | List[Document] | | A list of documents to be ranked. |
| top_k | Optional[int] | None | The maximum number of documents to return. |
| scale_score | Optional[bool] | None | If True, scales the raw logit predictions using a Sigmoid activation function. If False, disables scaling of the raw logit predictions. If set, overrides the value set at initialization. |
| score_threshold | Optional[float] | None | Use it to return documents only with a score above this threshold. If set, overrides the value set at initialization. |
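
The "if set, overrides the value set at initialization" rule, together with top_k and score_threshold, can be sketched as follows. This is a simplified illustration of the documented semantics, not the component's implementation: `resolve`, `select`, and the `INIT_*` constants are hypothetical names, and whether a score exactly equal to the threshold is kept is an assumption of this sketch.

```python
from typing import List, Optional, Tuple

# Hypothetical values configured at initialization.
INIT_TOP_K = 10
INIT_SCORE_THRESHOLD: Optional[float] = None


def resolve(run_value, init_value):
    """Use the run-time value when it is set, else fall back to the init value."""
    return init_value if run_value is None else run_value


def select(scored: List[Tuple[str, float]],
           top_k: Optional[int] = None,
           score_threshold: Optional[float] = None) -> List[Tuple[str, float]]:
    k = resolve(top_k, INIT_TOP_K)
    threshold = resolve(score_threshold, INIT_SCORE_THRESHOLD)
    # Sort from highest to lowest score and keep at most k entries.
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
    if threshold is not None:
        # Assumption: a score equal to the threshold is dropped ("above").
        ranked = [pair for pair in ranked if pair[1] > threshold]
    return ranked


scored = [("a", 0.91), ("b", 0.15), ("c", 0.62), ("d", 0.40)]
default_result = select(scored)                          # init values apply
overridden = select(scored, top_k=3, score_threshold=0.5)  # run-time overrides
print(default_result)
print(overridden)
```

Leaving a run-time parameter as None keeps the value configured on the component, so overrides only need to be passed for the queries that require them.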

Outputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| documents | List[Document] | | A list of documents closest to the query, sorted from most similar to least similar. |

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | Union[str, Path] | cross-encoder/ms-marco-MiniLM-L-6-v2 | The ranking model. Pass a local path or the Hugging Face model name of a cross-encoder model. |
| device | Optional[ComponentDevice] | None | The device on which the model is loaded. If None, the default device is automatically selected. |
| token | Optional[Secret] | Secret.from_env_var(['HF_API_TOKEN', 'HF_TOKEN'], strict=False) | The API token to download private models from Hugging Face. |
| top_k | int | 10 | The maximum number of documents to return per query. |
| query_prefix | str | "" | A string to add at the beginning of the query text before ranking. Use it to prepend an instruction to the query, as required by reranking models like bge. |
| document_prefix | str | "" | A string to add at the beginning of each document before ranking. Use it to prepend an instruction to each document, as required by reranking models like bge. |
| meta_fields_to_embed | Optional[List[str]] | None | List of metadata fields to embed with the document. |
| embedding_separator | str | \n | Separator used to concatenate metadata fields to the document. |
| scale_score | bool | True | If True, scales the raw logit predictions using a Sigmoid activation function. If False, disables scaling of the raw logit predictions. |
| score_threshold | Optional[float] | None | Use it to return only documents with a score above this threshold. |
| trust_remote_code | bool | False | If False, allows only Hugging Face verified model architectures. If True, allows custom models and scripts. |
| model_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for AutoModelForSequenceClassification.from_pretrained when loading the model. |
| tokenizer_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for AutoTokenizer.from_pretrained when loading the tokenizer. |
| config_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for AutoConfig.from_pretrained when loading the model configuration. |
| backend | Literal['torch', 'onnx', 'openvino'] | torch | The backend to use for the Sentence Transformers model. Choose from "torch", "onnx", or "openvino". Refer to the Sentence Transformers documentation for more information on acceleration and quantization options. |
| batch_size | int | 16 | The batch size to use for inference. The higher the batch size, the more memory is required. If you run into memory issues, reduce the batch size. |
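
To show how query_prefix, document_prefix, meta_fields_to_embed, and embedding_separator might combine, here is a minimal sketch of building the text that is passed to the model. The helper names are hypothetical, and the exact ordering (metadata values first, then the content, with the prefix in front of the whole string) is an assumption of this sketch rather than something the tables above state.

```python
from typing import Any, Dict, List, Optional


def build_query_text(query: str, query_prefix: str = "") -> str:
    """Prepend the optional instruction prefix to the query."""
    return query_prefix + query


def build_document_text(content: str,
                        meta: Dict[str, Any],
                        meta_fields_to_embed: Optional[List[str]] = None,
                        embedding_separator: str = "\n",
                        document_prefix: str = "") -> str:
    """Join selected metadata values and the content, then add the prefix."""
    fields = meta_fields_to_embed or []
    # Assumption: fields that are missing or empty are skipped.
    meta_values = [str(meta[key]) for key in fields if meta.get(key)]
    return document_prefix + embedding_separator.join(meta_values + [content])


text = build_document_text(
    content="The ranker scores query-document pairs.",
    meta={"title": "Rankers", "page": 3},
    meta_fields_to_embed=["title", "author"],  # "author" is absent and skipped
)
prefixed = build_document_text(
    content="The ranker scores query-document pairs.",
    meta={"title": "Rankers"},
    meta_fields_to_embed=["title"],
    embedding_separator=" | ",
    document_prefix="passage: ",  # hypothetical instruction prefix
)
print(repr(text))
print(repr(prefixed))
```

Embedding a short metadata field such as a title can give the model extra context for ranking, at the cost of longer inputs per document.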

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | | The input query to compare the documents to. |
| documents | List[Document] | | A list of documents to be ranked. |
| top_k | Optional[int] | None | The maximum number of documents to return. |
| scale_score | Optional[bool] | None | If True, scales the raw logit predictions using a Sigmoid activation function. If False, disables scaling of the raw logit predictions. If set, overrides the value set at initialization. |
| score_threshold | Optional[float] | None | Use it to return documents only with a score above this threshold. If set, overrides the value set at initialization. |