Ranking and performance of all 92 ranked microsoft_deberta-v3-base models (full table). The top 78 models were fully tested.
Notes:
- The baseline results can be found here
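- While the average improvement is small (the top-ranked model scores 80.73 against the baseline's 79.04, a gain of 1.69 points), individual datasets show much larger gains: for the same model, copa rises from 58.40 to 81.00 and rte from 82.35 to 90.61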
rank | model_name | avg | mnli_lp | 20_newsgroup | ag_news | amazon_reviews_multi | anli | boolq | cb | cola | copa | dbpedia | esnli | financial_phrasebank | imdb | isear | mnli | mrpc | multirc | poem_sentiment | qnli | qqp | rotten_tomatoes | rte | sst2 | sst_5bins | stsb | trec_coarse | trec_fine | tweet_ev_emoji | tweet_ev_emotion | tweet_ev_hate | tweet_ev_irony | tweet_ev_offensive | tweet_ev_sentiment | wic | wnli | wsc | yahoo_answers |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
baseline | microsoft/deberta-v3-base | 79.04 | nan | 86.41 | 90.44 | 66.86 | 58.78 | 82.99 | 75.00 | 86.57 | 58.40 | 79.43 | 91.93 | 84.48 | 94.49 | 71.86 | 89.78 | 89.20 | 62.26 | 86.73 | 93.51 | 91.79 | 90.42 | 82.35 | 95.06 | 56.98 | 90.28 | 97.76 | 91.02 | 46.19 | 83.95 | 56.21 | 79.82 | 85.06 | 71.80 | 71.21 | 70.21 | 64.09 | 72.03 |
1 | sileod/deberta-v3-base-tasksource-nli | 80.73 | 93.73 | 86.46 | 90.67 | 66.90 | 60.38 | 85.66 | 82.14 | 87.15 | 81.00 | 79.20 | 91.54 | 85.20 | 94.67 | 71.90 | 91.14 | 88.73 | 63.82 | 92.31 | 93.72 | 91.92 | 90.99 | 90.61 | 95.41 | 58.60 | 91.81 | 96.80 | 90.80 | 47.82 | 85.71 | 57.47 | 83.04 | 85.23 | 72.01 | 69.44 | 67.61 | 66.35 | 72.07 |
2 | sileod/deberta-v3-base_tasksource-420 | 80.45 | 89.82 | 87.04 | 90.90 | 66.46 | 59.72 | 85.54 | 85.71 | 87.06 | 69.00 | 79.53 | 91.67 | 85.80 | 94.32 | 72.49 | 90.21 | 88.97 | 63.99 | 87.50 | 93.63 | 91.74 | 91.09 | 84.48 | 95.07 | 56.97 | 91.67 | 98.00 | 91.20 | 46.81 | 84.38 | 58.05 | 81.25 | 85.23 | 71.88 | 69.44 | 73.24 | 74.04 | 72.20 |
3 | MoritzLaurer/DeBERTa-v3-base-mnli | 80.37 | 89.77 | 86.42 | 90.63 | 67.18 | 59.34 | 84.16 | 83.93 | 86.29 | 72.00 | 79.20 | 91.42 | 85.10 | 94.21 | 71.51 | 89.88 | 89.95 | 64.03 | 86.54 | 93.67 | 91.69 | 89.87 | 88.09 | 95.41 | 57.38 | 91.38 | 97.40 | 91.60 | 47.30 | 81.28 | 57.64 | 77.17 | 85.35 | 70.29 | 71.79 | 74.65 | 77.88 | 71.70 |
4 | mariolinml/deberta-v3-base_MNLI_10_19_v0 | 79.75 | 86.71 | 85.85 | 90.23 | 66.74 | 60.06 | 81.83 | 82.14 | 84.85 | 69.00 | 79.43 | 91.11 | 86.90 | 94.37 | 71.38 | 89.72 | 88.24 | 64.38 | 88.46 | 93.76 | 91.87 | 89.77 | 85.56 | 95.18 | 57.47 | 91.74 | 97.60 | 91.80 | 45.53 | 84.24 | 55.99 | 79.85 | 84.30 | 71.26 | 70.06 | 74.65 | 63.46 | 72.13 |
5 | devpranjal/deberta-v3-base-devrev-data | 79.58 | 56.73 | 85.49 | 89.93 | 66.72 | 58.19 | 85.26 | 80.36 | 86.29 | 70.00 | 79.03 | 91.18 | 87.70 | 94.01 | 71.25 | 89.48 | 89.95 | 63.63 | 83.65 | 93.81 | 92.03 | 90.24 | 84.12 | 94.72 | 57.65 | 91.44 | 97.20 | 91.40 | 46.37 | 83.25 | 57.68 | 77.42 | 85.93 | 70.93 | 70.85 | 71.83 | 63.46 | 72.50 |
6 | nc33/deberta_finetune | 79.51 | 75.33 | 86.19 | 90.37 | 67.48 | 58.56 | 84.34 | 73.21 | 86.58 | 68.00 | 79.67 | 91.57 | 88.60 | 94.47 | 72.23 | 89.64 | 90.20 | 63.53 | 87.50 | 93.56 | 91.67 | 90.24 | 83.03 | 95.18 | 58.37 | 90.41 | 97.20 | 90.80 | 47.12 | 85.08 | 59.39 | 79.08 | 83.72 | 70.20 | 70.69 | 67.61 | 64.42 | 72.33 |
7 | nc33/finetune_rte_model | 79.38 | 78.60 | 86.98 | 90.40 | 66.98 | 59.38 | 84.56 | 78.57 | 86.58 | 60.00 | 80.00 | 91.12 | 85.90 | 94.78 | 72.49 | 89.71 | 88.97 | 63.41 | 85.58 | 93.61 | 92.13 | 90.62 | 83.03 | 94.95 | 55.97 | 91.70 | 97.60 | 91.20 | 47.05 | 82.48 | 58.62 | 78.32 | 85.12 | 71.88 | 71.94 | 70.42 | 63.46 | 72.13 |
8 | MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli | 79.37 | 89.93 | 86.42 | 90.40 | 67.42 | 58.47 | 84.50 | 92.86 | 85.43 | 50.00 | 79.40 | 91.44 | 78.60 | 94.36 | 71.90 | 89.39 | 89.95 | 64.11 | 85.58 | 93.81 | 92.01 | 90.15 | 88.45 | 95.30 | 55.48 | 91.81 | 97.40 | 91.60 | 45.25 | 82.76 | 54.95 | 77.30 | 85.58 | 71.53 | 68.81 | 73.24 | 70.19 | 71.40 |
9 | s8n29/finetuned_model_1 | 79.25 | 57.45 | 86.06 | 90.43 | 66.52 | 57.78 | 85.08 | 76.79 | 86.86 | 72.00 | 78.97 | 90.91 | 83.50 | 94.06 | 72.95 | 89.70 | 89.22 | 63.26 | 85.58 | 94.14 | 92.05 | 90.34 | 79.78 | 94.84 | 57.01 | 91.05 | 97.00 | 91.20 | 45.48 | 83.18 | 57.71 | 80.74 | 85.12 | 69.96 | 70.06 | 67.61 | 63.46 | 72.57 |
10 | koolerkx/autotrain-sns-fake-news-3229590413 | 79.24 | 65.50 | 86.78 | 91.07 | 66.98 | 60.44 | 85.14 | 75.00 | 86.19 | 52.00 | 79.43 | 90.90 | 86.30 | 94.39 | 72.29 | 89.36 | 89.22 | 62.48 | 89.42 | 93.68 | 92.04 | 90.71 | 81.23 | 94.84 | 58.73 | 91.20 | 96.80 | 91.80 | 46.93 | 83.67 | 54.78 | 79.08 | 85.00 | 71.61 | 71.94 | 76.06 | 63.46 | 71.83 |
Download full models ranking table: csv
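
To make the note about per-dataset gains concrete, here is a minimal sketch that computes the difference between the top-ranked model and the baseline for a handful of datasets. The scores are copied from the table above. The names `BASELINE`, `TOP_MODEL` and `gains_over_baseline` are hypothetical helpers introduced for illustration and are not part of any published tooling.

```python
# Scores copied from the table above for a subset of columns.
# BASELINE is microsoft/deberta-v3-base, TOP_MODEL is the rank 1 row.
BASELINE = {
    "avg": 79.04, "boolq": 82.99, "cb": 75.00, "copa": 58.40,
    "poem_sentiment": 86.73, "rte": 82.35, "trec_coarse": 97.76, "wnli": 70.21,
}
TOP_MODEL = {
    "avg": 80.73, "boolq": 85.66, "cb": 82.14, "copa": 81.00,
    "poem_sentiment": 92.31, "rte": 90.61, "trec_coarse": 96.80, "wnli": 67.61,
}


def gains_over_baseline(model, baseline):
    """Return (dataset, gain) pairs sorted from largest gain to largest loss."""
    gains = {name: round(model[name] - baseline[name], 2) for name in baseline}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, gain in gains_over_baseline(TOP_MODEL, BASELINE):
        print(f"{name:15s} {gain:+.2f}")
```

In this subset the largest gain is on copa and the average moves by under two points, while trec_coarse and wnli fall slightly below the baseline, so per-dataset columns are worth checking before picking a model for a specific task.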