JinaRanker
Re-rank documents based on their relevance to a query using Jina AI reranker models.
Key Features
- Re-ranks documents by relevance to a query using Jina AI cross-encoder models.
- Improves retrieval quality by directly comparing query-document pairs.
- Configurable top_k to limit the number of returned documents.
- Configurable score_threshold to filter out low-relevance documents.
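To illustrate what comparing query-document pairs means, here is a minimal sketch of pairwise re-ranking. It does not call Jina AI: `toy_pair_score` is a hypothetical stand-in that counts shared tokens, whereas a real cross-encoder model reads the query and the document together to produce the score. The function names and sample texts are assumptions made for illustration only.

```python
from typing import Callable, List, Tuple


def toy_pair_score(query: str, document: str) -> float:
    """Hypothetical stand-in for a cross-encoder score.

    Returns the fraction of query tokens that also occur in the document.
    """
    query_tokens = set(query.lower().split())
    doc_tokens = set(document.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & doc_tokens) / len(query_tokens)


def rerank(
    query: str,
    documents: List[str],
    score_fn: Callable[[str, str], float] = toy_pair_score,
) -> List[Tuple[str, float]]:
    """Score every (query, document) pair, then sort by score, highest first."""
    scored = [(doc, score_fn(query, doc)) for doc in documents]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


candidates = [
    "Bananas are rich in potassium",
    "The capital of Germany is Berlin",
    "Paris is the capital of France",
]
ranked = rerank("capital of France", candidates)
for text, score in ranked:
    print(f"{score:.2f}  {text}")
```

Each document is scored against the query independently, so the ranking depends on the query-document pair rather than on the order in which the retriever returned the documents.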
Configuration
- Drag the JinaRanker component onto the canvas from the Component Library.
- Click on the component to open the configuration panel.
- On the General tab:
  - Create a secret with your Jina API key. Use JINA_API_KEY as the secret key. For instructions, see Create Secrets. Get your API key from Jina AI.
  - Select the reranker model. Check available models on the Jina documentation.
  - Set top_k to control the number of documents returned.
- Go to the Advanced tab to configure score_threshold.
Connections
JinaRanker accepts a query string through its query input, a list of documents through its documents input, and optional top_k and score_threshold overrides at runtime. It outputs ranked documents through its documents output, sorted from most to least relevant.
Connect a Retriever's documents output to JinaRanker's documents input. Then connect JinaRanker's documents output to PromptBuilder or another component that uses ranked documents.
Source Code
To check this component's source code, open ranker.py in the Haystack Core Integrations repository.
Usage Examples
Basic Configuration
JinaRanker:
  type: haystack_integrations.components.rankers.jina.ranker.JinaRanker
  init_parameters:
    api_key:
      type: env_var
      env_vars:
        - JINA_API_KEY
      strict: false
    model: jina-reranker-v1-base-en
    top_k: 3
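The api_key block above reads the key from an environment variable, and strict: false controls what happens when that variable is missing. The following is a minimal sketch of those semantics as commonly understood, not the library's own implementation: `resolve_env_secret` is a hypothetical helper, and the assumption is that the first set variable wins, a missing variable yields no value when strict is false, and an error is raised when strict is true.

```python
import os
from typing import List, Optional


def resolve_env_secret(env_vars: List[str], strict: bool) -> Optional[str]:
    """Hypothetical helper: return the first environment variable that is set.

    Assumption: with strict=False a missing key resolves to None,
    with strict=True it raises an error instead.
    """
    for name in env_vars:
        value = os.environ.get(name)
        if value is not None:
            return value
    if strict:
        raise ValueError(f"None of the environment variables {env_vars} are set")
    return None


# Case 1: the variable is not set and strict is false, so nothing is resolved.
os.environ.pop("JINA_API_KEY", None)
missing = resolve_env_secret(["JINA_API_KEY"], strict=False)

# Case 2: the variable is set (dummy value for illustration only).
os.environ["JINA_API_KEY"] = "dummy-key-for-illustration"
found = resolve_env_secret(["JINA_API_KEY"], strict=False)

print(missing, found)
```

Under this assumption, strict: false lets the pipeline configuration load without the key, but a request to Jina AI would still need a valid key at run time.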
This example shows a RAG pipeline that retrieves documents, re-ranks them using Jina, and generates an answer.
components:
  InMemoryBM25Retriever:
    type: haystack.components.retrievers.in_memory.bm25_retriever.InMemoryBM25Retriever
    init_parameters:
      document_store:
        type: haystack.document_stores.in_memory.document_store.InMemoryDocumentStore
        init_parameters:
          bm25_tokenization_regex: (?u)\b\w\w+\b
          bm25_algorithm: BM25L
          bm25_parameters:
          embedding_similarity_function: dot_product
          index: 'default'
          async_executor:
      top_k: 10
  JinaRanker:
    type: haystack_integrations.components.rankers.jina.ranker.JinaRanker
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - JINA_API_KEY
        strict: false
      model: jina-reranker-v1-base-en
      top_k: 3
  PromptBuilder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: "Given the following documents, answer the question.\n\nDocuments:\n{% for doc in documents %}{{ doc.content }}\n{% endfor %}\n\nQuestion: {{ query }}"
  OpenAIGenerator:
    type: haystack.components.generators.openai.OpenAIGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - OPENAI_API_KEY
        strict: false
      model: gpt-4o-mini
  AnswerBuilder:
    type: haystack.components.builders.answer_builder.AnswerBuilder
    init_parameters:
      pattern:
      reference_pattern:
      last_message_only: false
      return_only_referenced_documents: true

connections:
  - sender: InMemoryBM25Retriever.documents
    receiver: JinaRanker.documents
  - sender: JinaRanker.documents
    receiver: PromptBuilder.documents
  - sender: PromptBuilder.prompt
    receiver: OpenAIGenerator.prompt
  - sender: OpenAIGenerator.replies
    receiver: AnswerBuilder.replies
  - sender: JinaRanker.documents
    receiver: AnswerBuilder.documents

max_runs_per_component: 100

metadata: {}

inputs:
  query:
    - InMemoryBM25Retriever.query
    - JinaRanker.query
    - PromptBuilder.query
    - AnswerBuilder.query

outputs:
  answers: AnswerBuilder.answers
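To see how data flows through the connections in this pipeline, you can treat them as a directed graph. The sketch below copies the sender and receiver pairs from the example and derives an order in which each component has its inputs available. It only illustrates the wiring; it is not how the pipeline engine schedules components, and `component_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set, Tuple

# Sender/receiver pairs copied from the connections section of the example.
connections: List[Tuple[str, str]] = [
    ("InMemoryBM25Retriever.documents", "JinaRanker.documents"),
    ("JinaRanker.documents", "PromptBuilder.documents"),
    ("PromptBuilder.prompt", "OpenAIGenerator.prompt"),
    ("OpenAIGenerator.replies", "AnswerBuilder.replies"),
    ("JinaRanker.documents", "AnswerBuilder.documents"),
]


def component_order(pairs: List[Tuple[str, str]]) -> List[str]:
    """Hypothetical helper: order components so that senders precede receivers."""
    predecessors: Dict[str, Set[str]] = {}
    for sender, receiver in pairs:
        # "Component.socket" -> "Component"
        sender_component = sender.split(".", 1)[0]
        receiver_component = receiver.split(".", 1)[0]
        predecessors.setdefault(sender_component, set())
        predecessors.setdefault(receiver_component, set()).add(sender_component)
    return list(TopologicalSorter(predecessors).static_order())


order = component_order(connections)
print(" -> ".join(order))
```

JinaRanker sits directly after the retriever, and its documents output feeds both PromptBuilder and AnswerBuilder, so the generated answer and its references use the same re-ranked documents.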
Parameters
Inputs
| Parameter | Type | Description |
|---|---|---|
| query | str | Query string. |
| documents | List[Document] | List of Documents. |
| top_k | Optional[int] | The maximum number of Documents you want the Ranker to return. |
| score_threshold | Optional[float] | If provided, only returns documents with a score above this threshold. |
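The way top_k and score_threshold combine can be sketched in plain Python. This is an illustration under stated assumptions, not the component's code: `ScoredDoc` and `select_documents` are hypothetical names, the scores are made up, and the comparison is strictly greater than the threshold because the description says "above". Check the source code for the exact behavior when a score equals the threshold.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScoredDoc:
    """Hypothetical minimal document: text plus a relevance score."""
    content: str
    score: float


def select_documents(
    docs: List[ScoredDoc],
    top_k: Optional[int] = None,
    score_threshold: Optional[float] = None,
) -> List[ScoredDoc]:
    """Sort by score (highest first), drop low scores, then keep at most top_k."""
    ranked = sorted(docs, key=lambda d: d.score, reverse=True)
    if score_threshold is not None:
        # Assumption: "above" means strictly greater than the threshold.
        ranked = [d for d in ranked if d.score > score_threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return ranked


docs = [
    ScoredDoc("doc A", 0.40),
    ScoredDoc("doc B", 0.91),
    ScoredDoc("doc C", 0.10),
    ScoredDoc("doc D", 0.75),
]
selected = select_documents(docs, top_k=3, score_threshold=0.5)
print([d.content for d in selected])
```

Here top_k would allow three documents, but only two pass the threshold, so the ranker can return fewer documents than top_k.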
Outputs
| Parameter | Type | Description |
|---|---|---|
| documents | List[Document] | List of Documents most similar to the given query in descending order of similarity. |
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | Secret | Secret.from_env_var('JINA_API_KEY') | The Jina API key. It can be explicitly provided or automatically read from the environment variable JINA_API_KEY (recommended). |
| model | str | jina-reranker-v1-base-en | The name of the Jina model to use. Check the list of available models on https://jina.ai/reranker/ |
| top_k | Optional[int] | None | The maximum number of Documents to return per query. If None, all documents are returned. |
| score_threshold | Optional[float] | None | If provided, only returns documents with a score above this threshold. |
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | Query string. |
| documents | List[Document] | | List of Documents. |
| top_k | Optional[int] | None | The maximum number of Documents you want the Ranker to return. |
| score_threshold | Optional[float] | None | If provided, only returns documents with a score above this threshold. |
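Because top_k and score_threshold appear both as init parameters and as run() parameters, it helps to see how the two levels might combine. The sketch below assumes that a value passed at query time takes precedence and that None at query time falls back to the configured init value. `effective_settings` is a hypothetical helper written for illustration, not part of the component.

```python
from typing import Optional, Tuple


def effective_settings(
    init_top_k: Optional[int],
    init_score_threshold: Optional[float],
    run_top_k: Optional[int] = None,
    run_score_threshold: Optional[float] = None,
) -> Tuple[Optional[int], Optional[float]]:
    """Hypothetical helper: a run-time value wins, None falls back to the init value."""
    top_k = run_top_k if run_top_k is not None else init_top_k
    score_threshold = (
        run_score_threshold if run_score_threshold is not None else init_score_threshold
    )
    return top_k, score_threshold


# Configured with top_k=3 and no threshold; no overrides at query time.
defaults = effective_settings(init_top_k=3, init_score_threshold=None)

# Same configuration, but this query asks for 5 documents scoring above 0.2.
overridden = effective_settings(
    init_top_k=3,
    init_score_threshold=None,
    run_top_k=5,
    run_score_threshold=0.2,
)
print(defaults, overridden)
```

Under this assumption, the values set in Pipeline Builder act as defaults, and a single query can widen or narrow the result set without changing the pipeline.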