JinaRanker
Ranks Documents based on their similarity to the query using Jina AI models.
Basic Information
- Type: haystack_integrations.components.rankers.jina.ranker.JinaRanker
- Components it can connect with:
  - Retrievers: JinaRanker receives documents to rank from a Retriever.
  - PromptBuilder: JinaRanker sends the re-ranked documents to PromptBuilder to be included in the prompt.
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | Query string. |
| documents | List[Document] | | List of Documents. |
| top_k | Optional[int] | None | The maximum number of Documents you want the Ranker to return. |
| score_threshold | Optional[float] | None | If provided, returns only Documents with a score above this threshold. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | List of Documents most similar to the given query, in descending order of similarity. Returned in a dictionary under the documents key. |
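The way `top_k` and `score_threshold` shape the output can be illustrated with a small plain-Python sketch. It is not the component's implementation: `ScoredDoc` and `select_ranked` are hypothetical names, and treating a score exactly equal to the threshold as excluded is an assumption based on the "above this threshold" wording.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScoredDoc:
    """Hypothetical stand-in for a Document that already carries a relevance score."""
    content: str
    score: float


def select_ranked(
    docs: List[ScoredDoc],
    top_k: Optional[int] = None,
    score_threshold: Optional[float] = None,
) -> List[ScoredDoc]:
    # Order by relevance, most similar first.
    ranked = sorted(docs, key=lambda d: d.score, reverse=True)
    # Assumption: only scores strictly above the threshold are kept.
    if score_threshold is not None:
        ranked = [d for d in ranked if d.score > score_threshold]
    # Cap the number of returned documents when top_k is set.
    if top_k is not None:
        ranked = ranked[:top_k]
    return ranked


docs = [ScoredDoc("a", 0.2), ScoredDoc("b", 0.9), ScoredDoc("c", 0.6)]
top = select_ranked(docs, top_k=2, score_threshold=0.5)
print([d.content for d in top])  # ['b', 'c']
```

With both parameters left at `None`, the sketch returns every document, only reordered.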
Overview
JinaRanker re-ranks documents based on their relevance to a query using Jina AI reranker models. Reranking improves retrieval quality by applying a cross-encoder model that directly compares query-document pairs.
Use JinaRanker after an initial retrieval step to improve the precision of results before passing them to a generator.
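Outside Pipeline Builder, the same re-ranking step can be sketched in Python. The snippet assumes the `jina-haystack` integration package is installed and that the `JINA_API_KEY` environment variable is set. The helper name `rerank` and the sample passages are illustrative and not part of the integration.

```python
import os
from typing import List


def rerank(query: str, texts: List[str], top_k: int = 3):
    """Re-rank plain-text passages with JinaRanker (needs network access and JINA_API_KEY)."""
    # Imported lazily so that defining this helper does not require the packages.
    from haystack import Document
    from haystack_integrations.components.rankers.jina import JinaRanker

    # The API key is read from JINA_API_KEY by default.
    ranker = JinaRanker(model="jina-reranker-v1-base-en", top_k=top_k)
    documents = [Document(content=text) for text in texts]
    return ranker.run(query=query, documents=documents)["documents"]


if __name__ == "__main__" and os.environ.get("JINA_API_KEY"):
    passages = [
        "Paris is the capital of France.",
        "Berlin is the capital of Germany.",
        "The Eiffel Tower is in Paris.",
    ]
    for doc in rerank("What is the capital of France?", passages, top_k=2):
        print(doc.score, doc.content)
```

In a pipeline, the `documents` input would typically come from a Retriever rather than from a hand-written list.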
Authorization
Create a secret with your Jina API key. Type JINA_API_KEY as the secret key. For detailed instructions on creating secrets, see Create Secrets.
Get your API key from Jina AI.
Usage Example
This example shows a RAG pipeline that retrieves documents, re-ranks them using Jina, and generates an answer.
components:
  InMemoryBM25Retriever:
    type: haystack.components.retrievers.in_memory.bm25_retriever.InMemoryBM25Retriever
    init_parameters:
      document_store:
        type: haystack.document_stores.in_memory.document_store.InMemoryDocumentStore
        init_parameters:
          bm25_tokenization_regex: (?u)\b\w\w+\b
          bm25_algorithm: BM25L
          bm25_parameters:
          embedding_similarity_function: dot_product
          index: 'default'
          async_executor:
      top_k: 10
  JinaRanker:
    type: haystack_integrations.components.rankers.jina.ranker.JinaRanker
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - JINA_API_KEY
        strict: false
      model: jina-reranker-v1-base-en
      top_k: 3
  PromptBuilder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: "Given the following documents, answer the question.\n\nDocuments:\n{% for doc in documents %}{{ doc.content }}\n{% endfor %}\n\nQuestion: {{ query }}"
  OpenAIGenerator:
    type: haystack.components.generators.openai.OpenAIGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - OPENAI_API_KEY
        strict: false
      model: gpt-4o-mini
  AnswerBuilder:
    type: haystack.components.builders.answer_builder.AnswerBuilder
    init_parameters:
      pattern:
      reference_pattern:
      last_message_only: false
      return_only_referenced_documents: true

connections:
  - sender: InMemoryBM25Retriever.documents
    receiver: JinaRanker.documents
  - sender: JinaRanker.documents
    receiver: PromptBuilder.documents
  - sender: PromptBuilder.prompt
    receiver: OpenAIGenerator.prompt
  - sender: OpenAIGenerator.replies
    receiver: AnswerBuilder.replies
  - sender: JinaRanker.documents
    receiver: AnswerBuilder.documents

max_runs_per_component: 100

metadata: {}

inputs:
  query:
    - InMemoryBM25Retriever.query
    - JinaRanker.query
    - PromptBuilder.query
    - AnswerBuilder.query

outputs:
  answers: AnswerBuilder.answers
Parameters
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | Secret | Secret.from_env_var('JINA_API_KEY') | The Jina API key. It can be explicitly provided or automatically read from the environment variable JINA_API_KEY (recommended). |
| model | str | jina-reranker-v1-base-en | The name of the Jina model to use. For the list of available models, see https://jina.ai/reranker/. |
| top_k | Optional[int] | None | The maximum number of Documents to return per query. If None, all Documents are returned. |
| score_threshold | Optional[float] | None | If provided, returns only Documents with a score above this threshold. |
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | | Query string. |
| documents | List[Document] | | List of Documents. |
| top_k | Optional[int] | None | The maximum number of Documents you want the Ranker to return. |
| score_threshold | Optional[float] | None | If provided, returns only Documents with a score above this threshold. |
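Because `top_k` and `score_threshold` appear both as init parameters and as run() parameters, it helps to think about which value applies to a given query. The sketch below assumes that a value passed at query time takes precedence and that the init value is used otherwise. The function `effective_settings` is hypothetical and only illustrates that assumed precedence.

```python
from typing import Optional, Tuple


def effective_settings(
    init_top_k: Optional[int],
    init_score_threshold: Optional[float],
    run_top_k: Optional[int] = None,
    run_score_threshold: Optional[float] = None,
) -> Tuple[Optional[int], Optional[float]]:
    # Assumption: a run-time value, when given, overrides the init value.
    top_k = run_top_k if run_top_k is not None else init_top_k
    score_threshold = (
        run_score_threshold if run_score_threshold is not None else init_score_threshold
    )
    return top_k, score_threshold


# The pipeline was configured with top_k=3 and no threshold.
print(effective_settings(3, None))                       # (3, None)
# The query overrides both settings.
print(effective_settings(3, None, run_top_k=5, run_score_threshold=0.4))  # (5, 0.4)
```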