LLMRanker
Rank documents by relevance to a query using an LLM. Unlike traditional rankers that rely on similarity scoring, LLMRanker treats relevance as a semantic reasoning task. This can produce better results for complex or multi-step queries. The component can also filter out irrelevant or duplicate documents entirely, helping keep context windows lean.
When prompted with the query and document contents, LLMRanker returns a JSON response with ranked document indices ordered from most to least relevant.
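For illustration, a response that places the third candidate document first and omits the second as irrelevant could look like this (indices are 1-based; the shape follows the JSON schema shown in the custom chat generator example later on this page):

```json
{
  "documents": [
    { "index": 3 },
    { "index": 1 }
  ]
}
```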
Key Features
- Model-agnostic: Works with any LLM.
- Semantic reasoning: Judges relevance by reasoning about the query and document content rather than by embedding similarity.
- Duplicate document filtering: Filters out irrelevant or duplicate documents.
- Context window management: Helps keep context windows lean.
When to Use LLMRanker
Use LLMRanker when:
- Your queries are complex, multi-step, or require nuanced understanding of relevance.
- You want to filter out irrelevant documents, not just reorder them.
- You need higher-quality context in RAG pipelines or agent workflows.
For simpler similarity-based reranking, consider TransformersSimilarityRanker or SentenceTransformersSimilarityRanker.
How It Works
- The component receives a query and a list of candidate documents.
- It builds a prompt that includes the query and the documents.
- The prompt is sent to an LLM (by default, gpt-4.1-mini).
- The LLM returns a JSON response with ranked document indices, listing only the relevant documents.
- The component reorders the documents based on the LLM's ranking and returns the top results.
Before ranking, the component automatically removes duplicate documents. If the query is empty, the documents are returned without reranking.
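The steps above can be sketched in plain Python. This is a simplified illustration, not the component's real implementation: `rank_documents` is a hypothetical helper, and `llm` stands in for any callable that takes a prompt string and returns a JSON string of ranked 1-based indices.

```python
import json

def rank_documents(query, documents, llm, top_k=10):
    """Sketch of the ranking flow: dedupe, prompt, parse JSON, reorder."""
    # Remove exact duplicates, keeping the first occurrence
    seen, unique = set(), []
    for doc in documents:
        if doc not in seen:
            seen.add(doc)
            unique.append(doc)

    # An empty query returns the documents without reranking
    if not query.strip():
        return unique[:top_k]

    # Build a prompt containing the query and the numbered documents
    numbered = "\n".join(f"{i + 1}. {doc}" for i, doc in enumerate(unique))
    prompt = (
        "Rank the documents below by relevance to the query and return "
        'JSON like {"documents": [{"index": 1}]}, most relevant first, '
        f"listing only relevant documents.\nQuery: {query}\nDocuments:\n{numbered}"
    )

    # Call the LLM, parse the ranked indices, and reorder the documents
    ranked = json.loads(llm(prompt))["documents"]
    ordered = [unique[item["index"] - 1] for item in ranked
               if 1 <= item["index"] <= len(unique)]
    return ordered[:top_k]

# Example with a stubbed LLM that ranks document 2 above document 1
# and drops document 3 as irrelevant:
stub = lambda prompt: '{"documents": [{"index": 2}, {"index": 1}]}'
print(rank_documents("what is X?", ["a", "b", "b", "c"], stub, top_k=2))  # -> ['b', 'a']
```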
Usage Example
Basic Configuration
```yaml
components:
  LLMRanker:
    type: haystack.components.rankers.llm_ranker.LLMRanker
    init_parameters:
      top_k: 5
```
This uses the default OpenAIChatGenerator with gpt-4.1-mini and the built-in ranking prompt.
In a RAG Pipeline
Use LLMRanker after a retriever to improve the quality of documents passed to the LLM:
```yaml
components:
  retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
            - ${OPENSEARCH_HOST}
          index: standard-index
          embedding_dim: 768
          return_embedding: false
          create_index: true
          http_auth:
            - ${OPENSEARCH_USER}
            - ${OPENSEARCH_PASSWORD}
          use_ssl: true
          verify_certs: false
      top_k: 20

  llm_ranker:
    type: haystack.components.rankers.llm_ranker.LLMRanker
    init_parameters:
      top_k: 5

  llm:
    type: haystack.components.generators.chat.llm.LLM
    init_parameters:
      chat_generator:
        type: haystack.components.generators.chat.openai.OpenAIChatGenerator
        init_parameters:
          model: gpt-5-mini
      system_prompt: "Answer the question based on the provided documents."
      user_prompt: |-
        {% message role="user" %}
        Documents:
        {% for document in documents %}
        {{ document.content }}
        {% endfor %}
        Question: {{ question }}
        {% endmessage %}
      required_variables:
        - documents
        - question

  AnswerBuilder:
    type: haystack.components.builders.answer_builder.AnswerBuilder
    init_parameters:
      pattern:
      reference_pattern:
      last_message_only: false

connections:
  - sender: retriever.documents
    receiver: llm_ranker.documents
  - sender: llm_ranker.documents
    receiver: llm.documents
  - sender: llm.messages
    receiver: AnswerBuilder.replies

max_runs_per_component: 100

inputs:
  query:
    - retriever.query
    - llm_ranker.query
    - AnswerBuilder.query
  question:
    - llm.question

outputs:
  answers: AnswerBuilder.answers
```
With a Custom Chat Generator
You can use a different LLM for reranking by passing a custom chat_generator. The chat generator must return JSON output with ranked document indices.
```yaml
components:
  LLMRanker:
    type: haystack.components.rankers.llm_ranker.LLMRanker
    init_parameters:
      chat_generator:
        type: haystack.components.generators.chat.openai.OpenAIChatGenerator
        init_parameters:
          model: gpt-5-mini
          generation_kwargs:
            temperature: 0.0
            response_format:
              type: json_schema
              json_schema:
                name: document_ranking
                schema:
                  type: object
                  properties:
                    documents:
                      type: array
                      items:
                        type: object
                        properties:
                          index:
                            type: integer
                        required:
                          - index
                        additionalProperties: false
                  required:
                    - documents
                  additionalProperties: false
      top_k: 5
```
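Downstream code can check replies against this schema before reordering documents. The following is a minimal stdlib-only sketch; `parse_ranking` is a hypothetical helper for illustration, not part of the component:

```python
import json

def parse_ranking(reply: str) -> list[int]:
    """Extract 1-based indices from a reply that should match the
    document_ranking schema above."""
    data = json.loads(reply)
    indices = []
    for item in data["documents"]:
        # additionalProperties: false means each item has exactly "index"
        if set(item) != {"index"} or not isinstance(item["index"], int):
            raise ValueError(f"unexpected ranking item: {item!r}")
        indices.append(item["index"])
    return indices

print(parse_ranking('{"documents": [{"index": 3}, {"index": 1}]}'))  # -> [3, 1]
```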
Parameters
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | Required | The query to rank the documents against. |
| documents | List[Document] | Required | The candidate documents to rerank. |
| top_k | Optional[int] | None | The maximum number of documents to return. Overrides the top_k value set during initialization. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| documents | List[Document] | The documents ranked by relevance to the query. |
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| chat_generator | Optional[ChatGenerator] | None (uses OpenAIChatGenerator with gpt-4.1-mini) | The chat generator to use for reranking. If not provided, a default OpenAIChatGenerator configured for JSON output is used. The chat generator must return JSON with ranked document indices. |
| prompt | str | Built-in ranking prompt | Custom prompt template for reranking. The prompt must include exactly the variables query and documents and instruct the LLM to return ranked 1-based document indices as JSON. |
| top_k | int | 10 | The maximum number of documents to return. |
| raise_on_failure | bool | False | If True, raises an error when the LLM call or response parsing fails. If False, logs the failure and returns the input documents in their original order as a fallback. |
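The raise_on_failure behavior can be sketched as follows. `rank_with_fallback` and `call_llm` are hypothetical names for illustration, not the component's real API:

```python
def rank_with_fallback(query, documents, call_llm, raise_on_failure=False):
    """Return ranked documents, falling back to the original order on failure."""
    try:
        # call_llm is expected to return the reranked document list;
        # it may raise on network errors or unparsable JSON
        return call_llm(query, documents)
    except Exception as err:
        if raise_on_failure:
            raise
        # Log the failure and return the input documents unchanged
        print(f"Ranking failed ({err}); returning documents in original order")
        return documents
```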
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | Required | The query to rank the documents against. |
| documents | List[Document] | Required | The candidate documents to rerank. |
| top_k | Optional[int] | None | The maximum number of documents to return. Overrides the top_k value set during initialization. |