LLMRanker

Rank documents by relevance to a query using an LLM. Unlike traditional rankers that rely on similarity scoring, LLMRanker treats relevance as a semantic reasoning task. This can produce better results for complex or multi-step queries. The component can also filter out irrelevant or duplicate documents entirely, helping keep context windows lean.

When prompted with the query and document contents, the LLM returns a JSON response containing ranked document indices, ordered from most to least relevant. LLMRanker parses this response and reorders the documents accordingly.
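The reordering step can be illustrated with a short sketch. The shapes here are hypothetical and only mirror the behavior described on this page (the component handles this internally); indices are 1-based, as the built-in prompt requires:

```python
import json

# Candidate documents, in retrieval order.
docs = ["doc about cats", "doc about dogs", "doc about birds"]

# Example LLM reply: 1-based indices, most relevant first;
# irrelevant documents are simply omitted from the list.
llm_reply = '{"documents": [{"index": 3}, {"index": 1}]}'

ranking = json.loads(llm_reply)["documents"]
reranked = [docs[entry["index"] - 1] for entry in ranking]
print(reranked)  # ['doc about birds', 'doc about cats']
```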

Key Features

  • Model-agnostic: Works with any LLM.
  • Semantic reasoning: Treats relevance as a semantic reasoning task.
  • Duplicate document filtering: Filters out irrelevant or duplicate documents.
  • Context window management: Helps keep context windows lean.

When to Use LLMRanker

Use LLMRanker when:

  • Your queries are complex, multi-step, or require nuanced understanding of relevance.
  • You want to filter out irrelevant documents, not just reorder them.
  • You need higher-quality context in RAG pipelines or agent workflows.

For simpler similarity-based reranking, consider TransformersSimilarityRanker or SentenceTransformersSimilarityRanker.

How It Works

  1. The component receives a query and a list of candidate documents.
  2. It builds a prompt that includes the query and the documents.
  3. The prompt is sent to an LLM (by default, gpt-4.1-mini).
  4. The LLM returns a JSON response with ranked document indices, listing only the relevant documents.
  5. The component reorders the documents based on the LLM's ranking and returns the top results.

Before ranking, the component automatically removes duplicate documents. If the query is empty, the documents are returned without reranking.
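A minimal sketch of this pre-ranking behavior (illustrative only, not the component's actual code; whether deduplication also applies on the empty-query path is an assumption here):

```python
def prepare(query, documents):
    """Drop exact duplicates (preserving order); signal whether to rerank."""
    seen, unique = set(), []
    for doc in documents:
        if doc not in seen:
            seen.add(doc)
            unique.append(doc)
    # An empty query short-circuits: documents are returned without reranking.
    return unique, bool(query.strip())

docs = ["a", "b", "a", "c"]
print(prepare("", docs))        # (['a', 'b', 'c'], False)
print(prepare("find b", docs))  # (['a', 'b', 'c'], True)
```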

Usage Example

Basic Configuration

```yaml
components:
  LLMRanker:
    type: haystack.components.rankers.llm_ranker.LLMRanker
    init_parameters:
      top_k: 5
```

This uses the default OpenAIChatGenerator with gpt-4.1-mini and the built-in ranking prompt.

In a RAG Pipeline

Use LLMRanker after a retriever to improve the quality of documents passed to the LLM:

```yaml
components:
  retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
            - ${OPENSEARCH_HOST}
          index: standard-index
          embedding_dim: 768
          return_embedding: false
          create_index: true
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          use_ssl: true
          verify_certs: false
      top_k: 20

  llm_ranker:
    type: haystack.components.rankers.llm_ranker.LLMRanker
    init_parameters:
      top_k: 5

  llm:
    type: haystack.components.generators.chat.llm.LLM
    init_parameters:
      chat_generator:
        type: haystack.components.generators.chat.openai.OpenAIChatGenerator
        init_parameters:
          model: gpt-5-mini
      system_prompt: "Answer the question based on the provided documents."
      user_prompt: |-
        {% message role="user" %}
        Documents:
        {% for document in documents %}
        {{ document.content }}
        {% endfor %}
        Question: {{ question }}
        {% endmessage %}
      required_variables:
        - documents
        - question

  AnswerBuilder:
    type: haystack.components.builders.answer_builder.AnswerBuilder
    init_parameters:
      pattern:
      reference_pattern:
      last_message_only: false

connections:
  - sender: retriever.documents
    receiver: llm_ranker.documents
  - sender: llm_ranker.documents
    receiver: llm.documents
  - sender: llm.messages
    receiver: AnswerBuilder.replies

max_runs_per_component: 100

inputs:
  query:
    - retriever.query
    - llm_ranker.query
    - AnswerBuilder.query
  question:
    - llm.question

outputs:
  answers: AnswerBuilder.answers
```

With a Custom Chat Generator

You can use a different LLM for reranking by passing a custom chat_generator. The chat generator must return JSON output with ranked document indices.

```yaml
components:
  LLMRanker:
    type: haystack.components.rankers.llm_ranker.LLMRanker
    init_parameters:
      chat_generator:
        type: haystack.components.generators.chat.openai.OpenAIChatGenerator
        init_parameters:
          model: gpt-5-mini
          generation_kwargs:
            temperature: 0.0
            response_format:
              type: json_schema
              json_schema:
                name: document_ranking
                schema:
                  type: object
                  properties:
                    documents:
                      type: array
                      items:
                        type: object
                        properties:
                          index:
                            type: integer
                        required:
                          - index
                        additionalProperties: false
                  required:
                    - documents
                  additionalProperties: false
      top_k: 5
```
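For reference, a reply that conforms to this schema looks like the following (the specific indices are illustrative):

```python
import json

# A reply the `document_ranking` schema accepts: an object with a single
# `documents` array whose items each carry one integer `index`.
reply = '{"documents": [{"index": 2}, {"index": 5}, {"index": 1}]}'
parsed = json.loads(reply)

indices = [entry["index"] for entry in parsed["documents"]]
print(indices)  # [2, 5, 1]
```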

Parameters

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `query` | `str` | | The query to rank the documents against. |
| `documents` | `List[Document]` | | The candidate documents to rerank. |
| `top_k` | `Optional[int]` | `None` | The maximum number of documents to return. Overrides the `top_k` value set during initialization. |

Outputs

| Parameter | Type | Description |
| --- | --- | --- |
| `documents` | `List[Document]` | The documents ranked by relevance to the query. |

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `chat_generator` | `Optional[ChatGenerator]` | `None` (uses `OpenAIChatGenerator` with `gpt-4.1-mini`) | The chat generator to use for reranking. If not provided, a default `OpenAIChatGenerator` configured for JSON output is used. The chat generator must return JSON with ranked document indices. |
| `prompt` | `str` | Built-in ranking prompt | Custom prompt template for reranking. The prompt must include exactly the variables `query` and `documents` and instruct the LLM to return ranked 1-based document indices as JSON. |
| `top_k` | `int` | `10` | The maximum number of documents to return. |
| `raise_on_failure` | `bool` | `False` | If `True`, raises an error when the LLM call or response parsing fails. If `False`, logs the failure and returns the input documents in their original order as a fallback. |
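As an illustration of the `prompt` requirements, a custom template might look like the following. This is a hedged sketch, not the built-in prompt; it assumes the same Jinja templating used in the pipeline examples above:

```
Rank the documents below by relevance to the query.
Respond with JSON of the form {"documents": [{"index": <1-based index>}, ...]},
most relevant first. Omit documents that are irrelevant to the query.

Query: {{ query }}

{% for document in documents %}
Document {{ loop.index }}: {{ document.content }}
{% endfor %}
```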

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `query` | `str` | | The query to rank the documents against. |
| `documents` | `List[Document]` | | The candidate documents to rerank. |
| `top_k` | `Optional[int]` | `None` | The maximum number of documents to return. Overrides the `top_k` value set during initialization. |