QueryClassifier

QueryClassifier distinguishes between different types of queries and routes them to the pipeline branch that can handle them best.

QueryClassifier can categorize queries into keyword-based and natural language queries. A common use case for QueryClassifier is in a question answering pipeline where it routes keyword queries to a less computationally expensive keyword-based Retriever and natural language questions to a vector-based Retriever. This helps you save time and can produce better results for your keyword queries.

To handle these tasks, QueryClassifier uses a classification model.

Basic Information

  • Pipeline type: Used in query pipelines.
  • Nodes that can precede it in a pipeline: None. QueryClassifier is the first node in query pipelines and takes [Query] as input.
  • Nodes that can follow it in a pipeline: Ranker, Retriever
  • Node input: Query
  • Node output: Query
  • Available node classes: TransformersQueryClassifier

When used in a pipeline, QueryClassifier acts as a decision node, which means it routes each query to a specific node depending on how the query is classified.

Overview

TransformersQueryClassifier uses a transformer model to classify queries, so it is sensitive to the syntax of a sentence. The default model is shahrukhx01/bert-mini-finetune-question-detection. It is based on the mini BERT architecture (about 50 MB), which allows relatively fast inference on a CPU. TransformersQueryClassifier also supports zero-shot classification.
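The routing behavior of a decision node can be sketched without Haystack itself: a classifier assigns a label to the query, and the label's position in the configured label list picks the output edge. The sketch below is purely illustrative; `toy_classifier` is a hypothetical stand-in for the real transformer model, not part of the Haystack API.

```python
# Hypothetical sketch of decision-node routing: the index of the predicted
# label in the ordered label list selects the output edge.
def route(query: str, classify) -> str:
    labels = ["LABEL_1", "LABEL_0"]  # e.g. keyword query, natural language question
    predicted = classify(query)
    return f"output_{labels.index(predicted) + 1}"

# Toy classifier: treat queries containing a question word as natural language.
def toy_classifier(query: str) -> str:
    question_words = {"who", "what", "where", "when", "why", "how"}
    return "LABEL_0" if question_words & set(query.lower().split()) else "LABEL_1"

route("arya stark father", toy_classifier)               # -> "output_1"
route("who is the father of arya stark", toy_classifier)  # -> "output_2"
```

In a real pipeline, the edge name returned by the node (output_1, output_2, and so on) determines which branch receives the query.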

Usage Examples

...
components:
  - name: QueryClassifier
    type: TransformersQueryClassifier
    params:
      model_name_or_path: "shahrukhx01/bert-mini-finetune-question-detection"
...
pipelines:
  - name: query
    nodes:
      - name: QueryClassifier
        inputs: [Query]
      - name: KeywordRetriever # such as BM25Retriever
        inputs: [QueryClassifier.output_1] # This output edge routes keyword queries further down the pipeline
      - name: VectorRetriever # such as DensePassageRetriever
        inputs: [QueryClassifier.output_2] # This output edge routes natural language queries further down the pipeline
    ...
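If you want to define the categories at query time instead of relying on a model trained with fixed labels, you can configure the node for zero-shot classification. The following is a sketch only: the model and labels shown are illustrative assumptions (zero-shot classification requires an NLI-style model, such as typeform/distilbert-base-uncased-mnli), not required values.

```yaml
components:
  - name: QueryClassifier
    type: TransformersQueryClassifier
    params:
      model_name_or_path: "typeform/distilbert-base-uncased-mnli" # an NLI model; illustrative choice
      task: zero-shot-classification
      labels: ["keyword query", "natural language question"] # illustrative candidate labels
```

With zero-shot classification, the first label routes to output_1 and the second to output_2, so the downstream wiring stays the same as in the example above.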

Arguments

| Argument | Type | Possible Values | Description |
| --- | --- | --- | --- |
| model_name_or_path | String | Default: shahrukhx01/bert-mini-finetune-question-detection | Specifies the model you want to use. You can either type a path to the model stored on your computer or the name of a public model from Hugging Face. The default model is based on the mini BERT architecture and distinguishes between keyword queries and natural language questions. Mandatory. |
| model_version | String | Tag name, branch name, or commit hash | The version of the model from Hugging Face. Optional. |
| tokenizer | String | Default: None | The name of the tokenizer, usually the same as the model name. Optional. |
| use_gpu | Boolean | True (default), False | Specifies if GPU should be used. Mandatory. |
| task | String | text-classification (default), zero-shot-classification | Specifies the type of classification the node performs. Choose text-classification if you have a model trained with a defined list of labels. Choose zero-shot-classification if you want to define labels at runtime. Mandatory. |
| labels | List of strings | Default: None | If you choose text-classification as the task and provide an ordered list of labels, the first label corresponds to output_1, the second label corresponds to output_2, and so on. The labels must match the model labels; only their order can differ. If you selected zero-shot-classification as the task, these are the candidate labels. Mandatory. |
| batch_size | Integer | Default: 16 | The number of queries you want to process at one time. Mandatory. |
| progress_bar | Boolean | True (default), False | Shows a progress bar when processing queries. Mandatory. |
| use_auth_token | String or Boolean | Default: None | Specifies the API token used to download private models from Hugging Face. If you set it to True, it uses the token generated when running transformers-cli login. Optional. |
| devices | String or torch.device | Default: None | A list of torch devices, such as cuda, cpu, or mps, to limit inference to specific devices. Example: [torch.device("cuda:0"), "mps", "cuda:1"]. If you set use_gpu to False, this parameter is not used and a single cpu device is used for inference. Optional. |