Ranking and performance of all 1277 ranked roberta-base models (full table). The top 386 models were fully tested.
Notes:
- The baseline results can be found here
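- While the average improvement is small (the top-ranked model gains 2.25 points over the baseline), individual datasets show large gains: for ibm/ColD-Fusion, cb rises from 67.77 to 87.50 and copa from 48.70 to 72.00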
- ColD Fusion variations were removed to avoid cluttering the table
rank | model_name | avg | mnli_lp | 20_newsgroup | ag_news | amazon_reviews_multi | anli | boolq | cb | cola | copa | dbpedia | esnli | financial_phrasebank | imdb | isear | mnli | mrpc | multirc | poem_sentiment | qnli | qqp | rotten_tomatoes | rte | sst2 | sst_5bins | stsb | trec_coarse | trec_fine | tweet_ev_emoji | tweet_ev_emotion | tweet_ev_hate | tweet_ev_irony | tweet_ev_offensive | tweet_ev_sentiment | wic | wnli | wsc | yahoo_answers |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
baseline | roberta-base | 76.22 | nan | 85.28 | 89.77 | 66.58 | 50.35 | 78.69 | 67.77 | 83.53 | 48.70 | 77.30 | 90.99 | 85.11 | 93.90 | 72.47 | 86.98 | 87.87 | 61.22 | 83.94 | 92.41 | 90.71 | 88.42 | 72.40 | 94.12 | 56.68 | 89.92 | 97.11 | 87.76 | 46.30 | 81.82 | 52.89 | 71.56 | 84.55 | 71.03 | 65.48 | 54.79 | 63.27 | 72.40 |
1 | ibm/ColD-Fusion | 78.47 | 86.09 | 85.82 | 89.80 | 66.26 | 51.94 | 81.38 | 87.50 | 83.32 | 72.00 | 78.63 | 91.14 | 88.10 | 93.86 | 73.53 | 87.30 | 87.01 | 63.72 | 85.58 | 92.40 | 91.11 | 91.84 | 85.20 | 95.41 | 56.38 | 91.30 | 97.00 | 90.40 | 46.31 | 83.04 | 54.44 | 77.93 | 85.93 | 70.43 | 68.65 | 47.89 | 60.58 | 71.87 |
2 | gustavecortal/roberta_emo | 78.47 | 84.87 | 85.82 | 90.23 | 66.08 | 52.16 | 81.62 | 89.29 | 83.41 | 71.00 | 77.50 | 90.70 | 86.10 | 93.78 | 73.01 | 86.82 | 88.24 | 64.07 | 88.46 | 92.88 | 90.95 | 91.37 | 83.39 | 95.76 | 57.47 | 91.51 | 97.20 | 91.20 | 45.99 | 82.48 | 52.49 | 75.64 | 86.63 | 70.87 | 68.50 | 46.48 | 63.46 | 72.27 |
3 | jakub014/ColD-Fusion-finetuned-convincingness-acl2016 | 78.39 | 84.05 | 86.13 | 89.17 | 66.68 | 52.22 | 81.44 | 85.71 | 82.84 | 67.00 | 77.77 | 91.10 | 85.60 | 93.56 | 71.84 | 87.53 | 87.50 | 64.03 | 91.35 | 93.21 | 91.28 | 91.93 | 86.28 | 95.41 | 58.28 | 91.34 | 97.20 | 88.80 | 46.49 | 83.46 | 54.95 | 73.98 | 85.93 | 70.88 | 66.93 | 49.30 | 62.50 | 72.50 |
4 | jakub014/ColD-Fusion-finetuned-convincingness-IBM | 78.36 | 85.08 | 85.98 | 89.37 | 67.26 | 51.31 | 81.56 | 89.29 | 82.84 | 75.00 | 77.00 | 91.26 | 87.60 | 94.18 | 72.43 | 87.23 | 89.22 | 64.17 | 88.46 | 92.22 | 90.90 | 91.46 | 85.20 | 95.53 | 57.65 | 91.52 | 97.40 | 87.40 | 46.24 | 82.83 | 54.01 | 75.51 | 85.23 | 69.85 | 66.77 | 47.89 | 57.69 | 71.60 |
5 | janeel/muppet-roberta-base-finetuned-squad | 78.04 | 83.24 | 84.89 | 89.67 | 67.16 | 53.59 | 82.39 | 82.14 | 81.88 | 62.00 | 77.77 | 91.34 | 85.60 | 94.12 | 72.95 | 86.55 | 89.46 | 64.25 | 87.50 | 92.70 | 91.00 | 90.71 | 83.75 | 95.99 | 58.14 | 91.29 | 97.00 | 90.60 | 46.46 | 82.20 | 54.38 | 80.10 | 84.88 | 71.85 | 70.22 | 39.44 | 63.46 | 71.93 |
6 | mwong/roberta-base-climate-evidence-related | 77.21 | 55.09 | 85.13 | 89.93 | 66.54 | 50.22 | 72.40 | 77.70 | 83.22 | 84.60 | 77.70 | 89.65 | 84.60 | 93.99 | 73.14 | 87.12 | 89.96 | 87.12 | 83.65 | 92.29 | 89.93 | 88.93 | 72.20 | 95.07 | 54.71 | 73.14 | 96.80 | 87.40 | 46.57 | 81.42 | 51.65 | 71.43 | 85.12 | 70.34 | 54.93 | 54.93 | 63.46 | 72.40 |
7 | k4black/roberta-base-e-snli-classification-nli-base | 77.06 | 80.54 | 85.42 | 89.30 | 66.54 | 51.66 | 79.88 | 78.57 | 83.32 | 59.00 | 77.30 | 90.70 | 86.20 | 93.96 | 73.21 | 86.80 | 85.78 | 62.09 | 81.73 | 92.35 | 91.04 | 88.09 | 80.14 | 94.38 | 56.47 | 90.97 | 97.80 | 87.60 | 46.40 | 81.42 | 52.83 | 68.88 | 83.60 | 69.36 | 69.75 | 56.34 | 63.46 | 71.77 |
8 | facebook/muppet-roberta-base | 77.00 | 84.75 | 90.00 | 89.77 | 86.50 | 52.59 | 82.17 | 80.36 | 81.21 | 65.00 | 85.17 | 52.59 | 46.10 | 91.74 | 73.01 | 93.04 | 88.97 | 64.15 | 94.14 | 84.48 | 91.25 | 58.10 | 39.44 | 67.06 | 94.84 | 91.58 | 85.58 | 96.80 | 82.76 | 51.11 | 76.02 | 84.77 | 71.57 | 87.07 | 66.61 | 91.10 | 63.46 | 71.90 |
9 | WillHeld/roberta-base-mnli | 76.93 | 86.22 | 83.48 | 90.07 | 84.50 | 50.75 | 80.18 | 82.14 | 80.63 | 72.00 | 77.43 | 50.75 | 45.65 | 92.98 | 70.34 | 91.76 | 88.48 | 62.81 | 81.73 | 82.67 | 91.20 | 86.21 | 57.75 | 65.84 | 94.15 | 89.82 | 96.00 | 85.00 | 78.61 | 52.15 | 70.15 | 83.60 | 70.23 | 86.98 | 66.77 | 89.86 | 65.38 | 71.27 |
10 | deepakvk/roberta-base-squad2-finetuned-squad | 76.89 | 61.13 | 85.41 | 89.37 | 66.62 | 52.22 | 79.11 | 69.64 | 82.74 | 55.00 | 77.60 | 90.65 | 88.80 | 93.43 | 71.84 | 86.49 | 88.24 | 63.51 | 85.58 | 92.84 | 90.69 | 87.52 | 77.26 | 93.12 | 56.61 | 90.09 | 97.80 | 89.00 | 45.60 | 81.14 | 53.50 | 71.56 | 83.84 | 70.01 | 69.59 | 56.34 | 63.46 | 72.00 |
Download full models ranking table: csv
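
To see where a model's gain over the baseline is concentrated, the per-column differences can be computed directly from two rows of the table. The following is a minimal sketch, assuming rows are available as pipe-separated text exactly as shown above; the helper names (`parse_row`, `gains_over_baseline`, `largest_gains`) are hypothetical and not part of any published tooling for this table.

```python
# Score columns in table order, i.e. everything after rank and model_name.
COLUMNS = (
    "avg mnli_lp 20_newsgroup ag_news amazon_reviews_multi anli boolq cb cola copa "
    "dbpedia esnli financial_phrasebank imdb isear mnli mrpc multirc poem_sentiment "
    "qnli qqp rotten_tomatoes rte sst2 sst_5bins stsb trec_coarse trec_fine "
    "tweet_ev_emoji tweet_ev_emotion tweet_ev_hate tweet_ev_irony tweet_ev_offensive "
    "tweet_ev_sentiment wic wnli wsc yahoo_answers"
).split()

# Two rows copied verbatim from the table above.
BASELINE_ROW = (
    "baseline | roberta-base | 76.22 | nan | 85.28 | 89.77 | 66.58 | 50.35 | 78.69 | "
    "67.77 | 83.53 | 48.70 | 77.30 | 90.99 | 85.11 | 93.90 | 72.47 | 86.98 | 87.87 | "
    "61.22 | 83.94 | 92.41 | 90.71 | 88.42 | 72.40 | 94.12 | 56.68 | 89.92 | 97.11 | "
    "87.76 | 46.30 | 81.82 | 52.89 | 71.56 | 84.55 | 71.03 | 65.48 | 54.79 | 63.27 | 72.40 |"
)
COLD_FUSION_ROW = (
    "1 | ibm/ColD-Fusion | 78.47 | 86.09 | 85.82 | 89.80 | 66.26 | 51.94 | 81.38 | "
    "87.50 | 83.32 | 72.00 | 78.63 | 91.14 | 88.10 | 93.86 | 73.53 | 87.30 | 87.01 | "
    "63.72 | 85.58 | 92.40 | 91.11 | 91.84 | 85.20 | 95.41 | 56.38 | 91.30 | 97.00 | "
    "90.40 | 46.31 | 83.04 | 54.44 | 77.93 | 85.93 | 70.43 | 68.65 | 47.89 | 60.58 | 71.87 |"
)


def parse_row(line):
    """Split one pipe-separated row into (rank, model_name, {column: score})."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    rank, model_name, values = cells[0], cells[1], cells[2:]
    # float("nan") is valid, so missing scores become NaN rather than raising.
    scores = {col: float(value) for col, value in zip(COLUMNS, values)}
    return rank, model_name, scores


def gains_over_baseline(model_scores, baseline_scores):
    """Return model minus baseline per column, skipping columns where either is NaN."""
    gains = {}
    for col, base in baseline_scores.items():
        diff = model_scores[col] - base
        if diff == diff:  # NaN is the only float that is not equal to itself
            gains[col] = round(diff, 2)
    return gains


def largest_gains(gains, k=3):
    """Return the k columns with the largest positive difference, best first."""
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)[:k]


_, _, baseline = parse_row(BASELINE_ROW)
_, _, cold_fusion = parse_row(COLD_FUSION_ROW)
gains = gains_over_baseline(cold_fusion, baseline)
top3 = largest_gains(gains)
print(top3)
```

For ibm/ColD-Fusion this singles out copa, cb, and rte as the datasets with the largest gains, while the `avg` column moves by only 2.25 points. The `mnli_lp` column is skipped because the baseline row reports `nan` there.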