Language Models in deepset Cloud

deepset Cloud loads models directly from Hugging Face. You can use publicly available models as well as your private ones if you connect deepset Cloud to Hugging Face.

To find the right model on Hugging Face, go to the Models menu, where you can filter models by task.

Screenshot: Model tasks on Hugging Face

In deepset Cloud, two types of pipeline nodes use models: dense retrievers and readers. Readers use models for question answering, while retrievers use sentence similarity or DPR models. For information about choosing the models for pipeline nodes, see EmbeddingRetriever, DensePassageRetriever, and Reader.
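For example, a query pipeline for extractive question answering chains a retriever and a reader, and each of these nodes loads its own model. Here's a minimal sketch of what that looks like in pipeline YAML; the FARMReader node type, the node names, and the pipeline layout are illustrative assumptions, not a complete pipeline definition:

components:
  - name: DocumentStore
    type: DeepsetCloudDocumentStore
  - name: Retriever
    type: EmbeddingRetriever
    params:
      document_store: DocumentStore
      embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1 # sentence similarity model
      model_format: sentence_transformers
  - name: Reader
    type: FARMReader # assumed reader node type
    params:
      model_name_or_path: deepset/roberta-base-squad2-distilled # question answering model

pipelines:
  - name: query
    nodes:
      - name: Retriever
        inputs: [Query]
      - name: Reader
        inputs: [Retriever]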

Recommended Models

If you don't know which model to start with, you can use one of the models we recommend.

Models for Question Answering

This table describes the models that we recommend for the Question Answering task:

| Model URL | Description | Language |
| --- | --- | --- |
| deepset/roberta-base-squad2-distilled | A distilled model, relatively fast and with good performance. | English |
| deepset/roberta-large-squad2 | A large model with good performance. Slower than the distilled one. | English |
| deepset/xlm-roberta-base-squad2 | A base model with good speed and performance. | Multilingual |
| deepset/tinyroberta-squad2 | A very fast model. | English |

You can also view state-of-the-art question answering models on the Hugging Face leaderboard.
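For example, to run question answering on multilingual documents, you could point your reader at the xlm-roberta model from the table. This is a sketch that assumes a FARMReader-based reader node and its model_name_or_path parameter:

- name: Reader
  type: FARMReader # assumed reader node type
  params:
    model_name_or_path: deepset/xlm-roberta-base-squad2 # multilingual question answering model
    top_k: 5 # number of answers to return

To try a different model from the table, change only the model name.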

Models for Information Retrieval

This table describes the models that we recommend for the Information Retrieval task:

| Model URL | Description | Language | Similarity Measure |
| --- | --- | --- | --- |
| sentence-transformers/all-mpnet-base-v2 | A model with good speed and performance. | English | dot_product, cosine |
| sentence-transformers/multi-qa-mpnet-base-dot-v1 | A model with good speed and performance. | English | dot_product |
| sentence-transformers/all-MiniLM-L12-v2 | A model faster than the base models, with still good performance. | English | dot_product, cosine |
| sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | A fast multilingual model. | Multilingual | cosine |
| sentence-transformers/paraphrase-multilingual-mpnet-base-v2 | A relatively big model, slower than the mini one but with better performance. | Multilingual | cosine |

For more information about these models, see the Sentence Transformers documentation.

Using a Model

To use a model, provide its Hugging Face model name as a parameter to the node, and deepset Cloud takes care of loading it. For example:

- name: DocumentStore
  type: DeepsetCloudDocumentStore
  params:
    similarity: dot_product # Make sure to choose the correct similarity function for the chosen embedding model.
- name: Retriever
  type: EmbeddingRetriever
  params:
    document_store: DocumentStore
    embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1 # Hugging Face model name
    model_format: sentence_transformers
    top_k: 20
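If you choose a model that expects cosine similarity instead, such as sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 from the table above, change the document store's similarity to match. Here's a sketch of the same two nodes with a cosine-based multilingual model:

- name: DocumentStore
  type: DeepsetCloudDocumentStore
  params:
    similarity: cosine # matches the Similarity Measure listed for the model
- name: Retriever
  type: EmbeddingRetriever
  params:
    document_store: DocumentStore
    embedding_model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
    model_format: sentence_transformers
    top_k: 20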