AmazonBedrockRanker

Rank documents based on their similarity to the query using models hosted on Amazon Bedrock.

Basic Information

Type: haystack_integrations.components.rankers.amazon_bedrock.ranker.AmazonBedrockRanker
Components it can connect with:
- Retrievers: AmazonBedrockRanker can receive documents from a Retriever and then rank them.
- PromptBuilder: AmazonBedrockRanker can send ranked documents to PromptBuilder, which includes them in the prompt for the language model.
- Any component that outputs a list of documents or accepts a list of documents as input.

Inputs

Parameter	Type	Default	Description
query	str		The query to rank the documents against.
documents	List[Document]		The documents to be ranked.
top_k	Optional[int]	None	The maximum number of documents you want the Ranker to return.

Outputs

Parameter	Type	Default	Description
documents	List[Document]		Documents most similar to the query in descending order of similarity.

Overview

Amazon Bedrock is a fully managed service that makes state-of-the-art language models available through a unified API. To learn more, see the Amazon Bedrock documentation.

AmazonBedrockRanker orders documents from most to least semantically relevant to the query.

You can use the following ranking models:

cohere.rerank-v3-5:0
amazon.rerank-v1:0

Authentication

To use this component, connect deepset with Amazon Bedrock first. You'll need:

The region name
Access key ID
Secret access key

Add Workspace-Level Integration

1. Click your profile icon and choose Settings.
2. Go to Workspace > Integrations.
3. Find the provider you want to connect and click Connect next to it.
4. Enter the required details. For Amazon Bedrock, these are the region name, access key ID, and secret access key.
5. Click Connect. You can use this integration in pipelines and indexes in the current workspace.

Add Organization-Level Integration

1. Click your profile icon and choose Settings.
2. Go to Organization > Integrations.
3. Find the provider you want to connect and click Connect next to it.
4. Enter the required details. For Amazon Bedrock, these are the region name, access key ID, and secret access key.
5. Click Connect. You can use this integration in pipelines and indexes in all workspaces in the current organization.

For a detailed explanation, see Use Amazon Bedrock and SageMaker Models.

Usage Example

Initializing the Component

components:
  AmazonBedrockRanker:
    type: haystack_integrations.components.rankers.amazon_bedrock.ranker.AmazonBedrockRanker
    init_parameters: {} # An empty mapping means all parameters use their defaults
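
The snippet above relies on the component's defaults. As a minimal sketch of an explicit configuration, the fragment below sets only `model` and `top_k`. Both parameter names and the model IDs come from this page; the `top_k` value is illustrative, and any parameter left out is assumed to fall back to its default.

```yaml
components:
  AmazonBedrockRanker:
    type: haystack_integrations.components.rankers.amazon_bedrock.ranker.AmazonBedrockRanker
    init_parameters:
      model: amazon.rerank-v1:0 # Or cohere.rerank-v3-5:0, which is the default
      top_k: 5 # Illustrative value; the default is 10
```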

Using the Component in a Pipeline

This is an example of a document search pipeline that uses AmazonBedrockRanker with the Cohere ranking model (cohere.rerank-v3-5:0):

# If you need help with the YAML format, have a look at https://docs.cloud.deepset.ai/v2.0/docs/create-a-pipeline#create-a-pipeline-using-pipeline-editor.
# This section defines components that you want to use in your pipelines. Each component must have a name and a type. You can also set the component's parameters here.
# The name is up to you, you can give your component a friendly name. You then use components' names when specifying the connections in the pipeline.
# Type is the class path of the component. You can check the type on the component's documentation page.
components:
  bm25_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: Standard-Index-English
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20 # The number of results to return
      fuzziness: 0

  query_embedder:
    type: deepset_cloud_custom_nodes.embedders.nvidia.text_embedder.DeepsetNvidiaTextEmbedder
    init_parameters:
      normalize_embeddings: true
      model: intfloat/e5-base-v2

  embedding_retriever: # Selects the most similar documents from the document store
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: Standard-Index-English
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 20 # The number of results to return

  document_joiner:
    type: haystack.components.joiners.document_joiner.DocumentJoiner
    init_parameters:
      join_mode: concatenate

  AmazonBedrockRanker:
    type: haystack_integrations.components.rankers.amazon_bedrock.ranker.AmazonBedrockRanker
    init_parameters:
      model: cohere.rerank-v3-5:0
      top_k: 10
      aws_access_key_id:
        type: env_var
        env_vars:
        - AWS_ACCESS_KEY_ID
        strict: false
      aws_secret_access_key:
        type: env_var
        env_vars:
        - AWS_SECRET_ACCESS_KEY
        strict: false
      aws_session_token:
        type: env_var
        env_vars:
        - AWS_SESSION_TOKEN
        strict: false
      aws_region_name:
        type: env_var
        env_vars:
        - AWS_DEFAULT_REGION
        strict: false
      aws_profile_name:
        type: env_var
        env_vars:
        - AWS_PROFILE
        strict: false
      max_chunks_per_doc:
      meta_fields_to_embed:
      meta_data_separator: "\n"

connections:  # Defines how the components are connected
- sender: bm25_retriever.documents
  receiver: document_joiner.documents
- sender: query_embedder.embedding
  receiver: embedding_retriever.query_embedding
- sender: embedding_retriever.documents
  receiver: document_joiner.documents
- sender: document_joiner.documents
  receiver: AmazonBedrockRanker.documents

inputs:  # Define the inputs for your pipeline
  query:  # These components will receive the query as input
  - "bm25_retriever.query"
  - "query_embedder.text"
  - "AmazonBedrockRanker.query"
  filters:  # These components will receive a potential query filter as input
  - "bm25_retriever.filters"
  - "embedding_retriever.filters"

outputs:  # Defines the output of your pipeline
  documents: "AmazonBedrockRanker.documents" # The output of the pipeline is the ranked documents

max_runs_per_component: 100

metadata: {}
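
To pass the ranked documents on to a language model prompt, as mentioned under Basic Information, you could connect the Ranker to a PromptBuilder. The fragment below is a sketch only: the component name `prompt_builder`, the template text, and the template variables `documents` and `question` are assumptions made for illustration and are not part of the pipeline above.

```yaml
components:
  prompt_builder: # Hypothetical component name
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: |-
        Answer the question using the documents below.
        {% for document in documents %}
        {{ document.content }}
        {% endfor %}
        Question: {{ question }}

connections:
- sender: AmazonBedrockRanker.documents
  receiver: prompt_builder.documents # "documents" is the template variable assumed above
```

With this template, you would also add `prompt_builder.question` to the components that receive `query` under `inputs`.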

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

Parameter	Type	Default	Description
model	str	cohere.rerank-v3-5:0	The ranking model to use.
top_k	int	10	The maximum number of documents to return.
aws_access_key_id	Optional[Secret]	Secret.from_env_var(["AWS_ACCESS_KEY_ID"], strict=False)	AWS access key ID.
aws_secret_access_key	Optional[Secret]	Secret.from_env_var(["AWS_SECRET_ACCESS_KEY"], strict=False)	AWS secret access key.
aws_session_token	Optional[Secret]	Secret.from_env_var(["AWS_SESSION_TOKEN"], strict=False)	AWS session token.
aws_region_name	Optional[Secret]	Secret.from_env_var(["AWS_DEFAULT_REGION"], strict=False)	AWS region name.
aws_profile_name	Optional[Secret]	Secret.from_env_var(["AWS_PROFILE"], strict=False)	AWS profile name.
max_chunks_per_doc	Optional[int]	None	If a document exceeds 512 tokens, this setting determines the maximum number of chunks the document can be split into. If set to None, the default of 10 chunks is used. This parameter is currently unused and is included for future compatibility.
meta_fields_to_embed	Optional[List[str]]	None	A list of metadata fields to embed in the document content.
meta_data_separator	str	\n	The separator used to concatenate the metadata fields to the document content.
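
As an illustration of `meta_fields_to_embed` and `meta_data_separator`, the sketch below assumes that each document carries `title` and `author` metadata fields. These field names are hypothetical; replace them with fields that exist in your index. Based on the parameter descriptions above, the listed fields are assumed to be joined to the document content with the separator before ranking.

```yaml
components:
  AmazonBedrockRanker:
    type: haystack_integrations.components.rankers.amazon_bedrock.ranker.AmazonBedrockRanker
    init_parameters:
      model: cohere.rerank-v3-5:0
      meta_fields_to_embed:
      - title # Hypothetical metadata field
      - author # Hypothetical metadata field
      meta_data_separator: "\n" # Double quotes make YAML read this as a newline
```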

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

Parameter	Type	Default	Description
query	str		The user query for ranking the documents.
documents	List[Document]		The documents to rank.
top_k	Optional[int]	None	The maximum number of documents you want the Ranker to return.
