You use models through pipeline nodes. This table gives an overview of model applications and the nodes that use them:
|Model Type or Application||Node That Uses It||Description|
|Large language models||PromptNode||Use LLMs for various NLP tasks, like generative QA, through PromptNode. You can use models from providers such as OpenAI, Cohere, Azure, and more, or models hosted on AWS SageMaker.|
|Information retrieval models||Vector-based retrievers: EmbeddingRetriever and DensePassageRetriever||Retrievers act like filters that go through the documents and fetch the ones most relevant to the query. Vector-based retrievers use models to encode both the documents and the query for best results.|
|Question answering models||Readers||Readers are used in extractive question answering. They use transformer-based models to pinpoint and highlight the answer in the document.|
|Ranking models||Model-based rankers: SentenceTransformersRanker and EmbeddingRanker||Rankers prioritize documents based on the criteria you specify, for example, a particular value in a document's metadata field. Model-based rankers are powerful: they use transformer models to embed the documents and the query and thus build a strong semantic representation of the text.|
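As a sketch of how these nodes fit together, here is a minimal query pipeline that combines a vector-based retriever with a reader. The document store name, node names, and model choices are illustrative, not prescriptive:

```yaml
components:
  - name: DocumentStore
    type: DeepsetCloudDocumentStore
  - name: Retriever
    type: EmbeddingRetriever
    params:
      document_store: DocumentStore
      embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1
  - name: Reader
    type: FARMReader
    params:
      model_name_or_path: deepset/roberta-base-squad2-distilled
pipelines:
  - name: query
    nodes:
      - name: Retriever
        inputs: [Query]
      - name: Reader
        inputs: [Retriever]
```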
To use a model, you simply provide its location as a parameter to the node. If you're using a proprietary model, you can either pass the API key to the node or connect deepset Cloud to the model provider. deepset Cloud takes care of loading the models.
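For instance, a PromptNode configuration that selects a proprietary model and passes an API key directly to the node might look like this. The model name and key placeholder are illustrative; if you connect deepset Cloud to the model provider instead, you can omit the key:

```yaml
components:
  - name: PromptNode
    type: PromptNode
    params:
      model_name_or_path: gpt-3.5-turbo
      api_key: <your_OpenAI_API_key>
```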
When using LLMs with PromptNode, you can specify additional model settings, like temperature, in the `model_kwargs` parameter, for example:
```yaml
components:
  - name: PromptNode
    type: PromptNode
    params:
      model_name_or_path: google/flan-t5-xl
      model_kwargs:
        temperature: 0.6
```
Larger models are generally more accurate at the cost of speed.
If you don't know which model to start with, you can use one of the models we recommend.
This table lists the models that we recommend for generative QA. You can use them with PromptNode in your pipelines.
|Model||Description||Type|
|Falcon models||Currently, the most performant open-source LLMs.||Open source|
|Claude models by Anthropic||Transformer-based LLMs that can be an alternative to the GPT models. They can generate natural language and assist with code and translations.||Proprietary|
|GPT-3.5 models by OpenAI||Faster and cheaper than GPT-4, can generate and understand natural language and code.||Proprietary|
|GPT-4 models by OpenAI||Large multimodal models. More expensive and slower than GPT-3.5.||Proprietary|
PromptNode supports models hosted on AWS SageMaker. Contact your deepset Cloud representative to set up the model for you. Once it's ready, you'll get the model name that you then pass in the `model_name_or_path` parameter of PromptNode, like this:
```yaml
...
components:
  - name: PromptNode
    type: PromptNode
    params:
      model_name_or_path: <the_model_name_you_got_from_deepset_Cloud_rep>
      model_kwargs:
        temperature: 0.6 # additional model parameters that you can configure
```
This table describes the models that we recommend for the Question Answering task. You can use them with your Readers.
|Model||Description||Language|
|deepset/roberta-base-squad2-distilled||A distilled model, relatively fast and with good performance.||English|
|deepset/roberta-large-squad2||A large model with good performance. Slower than the distilled one.||English|
|deepset/xlm-roberta-base-squad2||A base model with good speed and performance.||Multilingual|
|deepset/tinyroberta-squad2||A very fast model.||English|
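To use one of these models, point your Reader at it. This sketch assumes the FARMReader node type; the model choice and `top_k` value are illustrative:

```yaml
components:
  - name: Reader
    type: FARMReader
    params:
      model_name_or_path: deepset/xlm-roberta-base-squad2
      top_k: 3 # illustrative: number of answer candidates to return
```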
You can also view state-of-the-art question answering models on the Hugging Face leaderboard.
This table describes the models that we recommend for the Information Retrieval task. You can use them with your Retrievers.
|Model Provider||Model Name||Description||Language|
|Cohere||embed-english-v2.0||See Cohere documentation.||English|
|Cohere||embed-multilingual-v2.0||See Cohere documentation.||Multilingual|
|OpenAI||text-embedding-ada-002||See OpenAI documentation.||English|
|Sentence Transformers||multi-qa-mpnet-base-dot-v1||Vector dimension: 768||English|
|Sentence Transformers||e5-base-v2||Vector dimension: 768||English|
|Sentence Transformers||e5-large-v2||Vector dimension: 1024. Slower than e5-base-v2 but performs better.||English|
|Sentence Transformers||multilingual-e5-base||Vector dimension: 768||Multilingual|
|Sentence Transformers||multilingual-e5-large||Vector dimension: 1024||Multilingual|
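As an illustration, an EmbeddingRetriever could be configured with one of these models as follows. Note that the document store's embedding dimension must match the model's vector dimension; the document store type and node names here are illustrative:

```yaml
components:
  - name: DocumentStore
    type: DeepsetCloudDocumentStore
    params:
      embedding_dim: 768 # must match the model's vector dimension
  - name: Retriever
    type: EmbeddingRetriever
    params:
      document_store: DocumentStore
      embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1
```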
It's best to try out different models and see what works best for your data.
This table lists models you can use with SentenceTransformersRanker to rank documents.
|Model||Description||Language|
|simlm-msmarco-reranker||The best ranker model currently available.||English|
|cross-encoder/ms-marco-MiniLM-L-12-v2||A slightly bigger and slower model.||English|
|cross-encoder/ms-marco-MiniLM-L-6-v2||Slightly faster than ms-marco-MiniLM-L-12-v2.||English|
|svalabs/cross-electra-ms-marco-german-uncased||In our practice, this is the best model for German.||German|
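A minimal sketch of a SentenceTransformersRanker configured with one of these models; the node name and `top_k` value are illustrative:

```yaml
components:
  - name: Ranker
    type: SentenceTransformersRanker
    params:
      model_name_or_path: cross-encoder/ms-marco-MiniLM-L-6-v2
      top_k: 5 # illustrative: number of documents to keep after ranking
```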
These are the recommended models for CohereRanker: