AmazonBedrockRanker

Rank documents based on their similarity to the query using models hosted on Amazon Bedrock. Documents are returned in descending order of relevance.

Amazon Bedrock is a fully managed service that makes state-of-the-art language models available for use through a unified API. To learn more, see Amazon Bedrock documentation.

Key Features

  • Ranks documents by semantic relevance to the user query.
  • Supports Cohere and Amazon ranking models.
  • Configurable number of top results to return.
  • Allows embedding document metadata along with document content for richer ranking.
  • Works with any component that outputs or accepts a list of documents.

Configuration

Authentication

To use this component, connect Haystack Platform with Amazon Bedrock first. You'll need the region name, access key ID, and secret access key.

For a detailed explanation, see Use Amazon Bedrock and SageMaker Models.

  1. Drag the AmazonBedrockRanker component onto the canvas from the Component Library.
  2. Click the component to open the configuration panel.
  3. On the General tab:
    1. Select the ranking model from the list.
  4. Go to the Advanced tab to configure the AWS credentials, number of top results, maximum chunks per document, metadata fields to embed, and metadata separator.

Connections

AmazonBedrockRanker accepts a query string, a list of documents to rank, and an optional top_k value as inputs. It outputs a list of documents sorted by relevance to the query.

Connect a retriever's documents output to the documents input. Connect the pipeline's query input to the query input. Connect the documents output to PromptBuilder or another component that processes ranked documents.
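
The input and output contract described above can be sketched in plain Python. This is only an illustration: `Doc` and `toy_score` are hypothetical stand-ins (a word-overlap score replaces the call to the model hosted on Amazon Bedrock), and the real component works on Haystack `Document` objects.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a Haystack Document."""
    content: str
    score: Optional[float] = None


def toy_score(query: str, text: str) -> float:
    """Placeholder relevance score: fraction of query words found in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def rank(query: str, documents: List[Doc], top_k: int = 10) -> List[Doc]:
    """Score every document, sort by descending score, keep the top_k best."""
    scored = [Doc(d.content, toy_score(query, d.content)) for d in documents]
    scored.sort(key=lambda d: d.score, reverse=True)
    return scored[:top_k]


docs = [
    Doc("Paris is in France"),
    Doc("Bedrock hosts rerank models"),
    Doc("Rerank models on Amazon Bedrock"),
]
ranked = rank("amazon bedrock rerank", docs, top_k=2)
for doc in ranked:
    print(round(doc.score, 2), doc.content)
```

A downstream component such as PromptBuilder would receive the `ranked` list in this order, with the most relevant document first.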

Usage Example

Using the Component in a Pipeline

This is an example of a document search pipeline that uses AmazonBedrockRanker with the Cohere ranking model:

components:
  bm25_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: Standard-Index-English
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20 # The number of results to return
      fuzziness: 0

  query_embedder:
    type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
    init_parameters:
      normalize_embeddings: true
      model: intfloat/e5-base-v2

  embedding_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: Standard-Index-English
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20 # The number of results to return

  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate

  AmazonBedrockRanker:
    type: haystack_integrations.components.rankers.amazon_bedrock.ranker.AmazonBedrockRanker
    init_parameters:
      model: cohere.rerank-v3-5:0
      top_k: 10
      aws_access_key_id:
        type: env_var
        env_vars:
          - AWS_ACCESS_KEY_ID
        strict: false
      aws_secret_access_key:
        type: env_var
        env_vars:
          - AWS_SECRET_ACCESS_KEY
        strict: false
      aws_session_token:
        type: env_var
        env_vars:
          - AWS_SESSION_TOKEN
        strict: false
      aws_region_name:
        type: env_var
        env_vars:
          - AWS_DEFAULT_REGION
        strict: false
      aws_profile_name:
        type: env_var
        env_vars:
          - AWS_PROFILE
        strict: false
      max_chunks_per_doc:
      meta_fields_to_embed:
      meta_data_separator: "\n"

connections: # Defines how the components are connected
  - sender: bm25_retriever.documents
    receiver: document_joiner.documents
  - sender: query_embedder.embedding
    receiver: embedding_retriever.query_embedding
  - sender: embedding_retriever.documents
    receiver: document_joiner.documents
  - sender: document_joiner.documents
    receiver: AmazonBedrockRanker.documents

inputs: # Define the inputs for your pipeline
  query: # These components will receive the query as input
    - "bm25_retriever.query"
    - "query_embedder.text"
    - "AmazonBedrockRanker.query"
  filters: # These components will receive a potential query filter as input
    - "bm25_retriever.filters"
    - "embedding_retriever.filters"

outputs: # Defines the output of your pipeline
  documents: "AmazonBedrockRanker.documents" # The output of the pipeline is the ranked documents

max_runs_per_component: 100

metadata: {}

Parameters

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | | The query to rank the documents against. |
| documents | List[Document] | | The documents to be ranked. |
| top_k | Optional[int] | None | The maximum number of documents you want the Ranker to return. |

Outputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| documents | List[Document] | | Documents most similar to the query in descending order of similarity. |

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | str | cohere.rerank-v3-5:0 | The ranking model to use. |
| top_k | int | 10 | The maximum number of documents to return. |
| aws_access_key_id | Optional[Secret] | Secret.from_env_var(["AWS_ACCESS_KEY_ID"], strict=False) | AWS access key ID. |
| aws_secret_access_key | Optional[Secret] | Secret.from_env_var(["AWS_SECRET_ACCESS_KEY"], strict=False) | AWS secret access key. |
| aws_session_token | Optional[Secret] | Secret.from_env_var(["AWS_SESSION_TOKEN"], strict=False) | AWS session token. |
| aws_region_name | Optional[Secret] | Secret.from_env_var(["AWS_DEFAULT_REGION"], strict=False) | AWS region name. |
| aws_profile_name | Optional[Secret] | Secret.from_env_var(["AWS_PROFILE"], strict=False) | AWS profile name. |
| max_chunks_per_doc | Optional[int] | None | If your document exceeds 512 tokens, this setting determines the maximum number of chunks a document can be split into. If set to None, uses the default of 10 chunks. This parameter is currently unused and is included for future compatibility. |
| meta_fields_to_embed | Optional[List[str]] | None | A list of metadata fields to embed in the document content. |
| meta_data_separator | str | \n | The separator used to concatenate the metadata fields to the document content. |
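
To illustrate how meta_fields_to_embed and meta_data_separator might interact, here is a minimal sketch. It assumes the selected metadata values are joined in front of the document content and that missing or empty values are skipped; both behaviors are assumptions for illustration, not confirmed by this page. The `build_ranker_input` helper is hypothetical and not part of the component's API.

```python
from typing import Any, Dict, List


def build_ranker_input(
    content: str,
    meta: Dict[str, Any],
    meta_fields_to_embed: List[str],
    meta_data_separator: str = "\n",
) -> str:
    """Join selected metadata values and the content into one string to rank.

    Assumption: fields that are missing or empty are skipped, and the
    metadata values come before the content.
    """
    values = [str(meta[key]) for key in meta_fields_to_embed if meta.get(key)]
    return meta_data_separator.join(values + [content])


text = build_ranker_input(
    content="Reranking improves retrieval precision.",
    meta={"title": "Rerankers", "author": "", "year": 2024},
    meta_fields_to_embed=["title", "author", "year"],
)
print(text)
```

With the default separator, each embedded metadata value lands on its own line above the content, so the ranking model sees the title and year as extra context.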

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | | The user query for ranking the documents. |
| documents | List[Document] | | The documents to rank. |
| top_k | Optional[int] | None | The maximum number of documents you want the Ranker to return. |
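
Because top_k appears both as an init parameter (default 10) and as a run() parameter (default None), a reasonable reading is that a value passed at query time takes precedence and the init value is the fallback. The sketch below shows that precedence under this assumption; `resolve_top_k` is a hypothetical helper, and the rejection of non-positive values is likewise an assumption.

```python
from typing import Optional


def resolve_top_k(init_top_k: int, run_top_k: Optional[int] = None) -> int:
    """Pick the effective top_k: the run-time value if given, else the init value."""
    top_k = init_top_k if run_top_k is None else run_top_k
    if top_k <= 0:
        # Assumption: a ranker cannot return zero or a negative number of documents.
        raise ValueError(f"top_k must be > 0, but got {top_k}")
    return top_k


print(resolve_top_k(10))     # no run-time value: falls back to the init value
print(resolve_top_k(10, 3))  # run-time value overrides the init value
```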