Training a Sequence Classifier
Let us now look at a short tutorial on training a sequence classifier using a pre-trained language model.
For this tutorial, we provide a sample corpus in the folder demo/data/sentiment/.
$ ls demo/data/sentiment/
dev.txt
test.txt
train.txt
The train, dev, and test files are in tab-separated format. A sample snippet of the train corpus is shown below. The first line of the file should contain sentence as the name of the first column and Label as the name of the second column (which is also the column containing the class labels).
$ cat demo/data/sentiment/train.txt
sentence Label
I liked the movie 1
I hated the movie 0
The movie was good 1
The filenames should be the same as those mentioned above.
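For a quick sanity check of this layout, the splits can be loaded with pandas. The snippet below is only an illustration (it is not part of the toolkit) and assumes the column names described above.

import pandas as pd

# Load the tab-separated training split; the header must be "sentence<TAB>Label".
train = pd.read_csv("demo/data/sentiment/train.txt", sep="\t")

# Verify the expected column names and inspect the class distribution.
assert list(train.columns) == ["sentence", "Label"]
print(train["Label"].value_counts())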
Hyper-Parameter Tuning
We first have to select the best hyper-parameter values. For this, we monitor the loss/accuracy/F1-score on the dev set and select the best configuration. We perform a grid search over batch size and learning rate only.
| Hyper-Parameter | Values |
|---|---|
| Batch Size | 8, 16, 32 |
| Learning Rate | 1e-3, 1e-4, 1e-5, 1e-6, 3e-3, 3e-4, 3e-5, 3e-6, 5e-3, 5e-4, 5e-5, 5e-6 |
We now perform hyper-parameter tuning of the sequence classifier:
$ python src/sequenceclassifier/helper_scripts/tune_hyper_parameter.py \
    --data_dir demo/data/sentiment/ \
    --configuration_name bert-custom \
    --model_name demo/model/mlm/checkpoint-200/ \
    --output_dir demo/model/sentiment/ \
    --tokenizer_name demo/model/tokenizer/ \
    --task_name sentiment \
    --log_dir logs
The script performs hyper-parameter tuning, and the Aim library tracks the experiments in the logs folder.
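Conceptually, the tuning step sweeps the grid shown above: every (batch size, learning rate) pair is used to fine-tune the classifier once, and the dev-set F1-score is recorded. The sketch below only illustrates that idea; it is not the repository's implementation, and fine_tune_and_evaluate is a hypothetical stand-in for a single fine-tuning run.

from itertools import product

def fine_tune_and_evaluate(batch_size: int, learning_rate: float) -> float:
    # Hypothetical placeholder: fine-tune with the given hyper-parameters
    # and return the F1-score on the dev set.
    raise NotImplementedError

batch_sizes = [8, 16, 32]
learning_rates = [1e-3, 1e-4, 1e-5, 1e-6,
                  3e-3, 3e-4, 3e-5, 3e-6,
                  5e-3, 5e-4, 5e-5, 5e-6]

results = []
for batch_size, lr in product(batch_sizes, learning_rates):
    dev_f1 = fine_tune_and_evaluate(batch_size=batch_size, learning_rate=lr)
    results.append({"BatchSize": batch_size, "LearningRate": lr, "F1-Score": dev_f1})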
Fine-Tuning Using the Best Hyper-Parameters
We now run the script src/sequenceclassifier/helper_scripts/get_best_hyper_parameter_and_train.py to find the best hyper-parameter configuration and fine-tune the model with it:
$ python src/sequenceclassifier/helper_scripts/get_best_hyper_parameter_and_train.py \
    --data_dir demo/data/sentiment/ \
    --configuration_name bert-custom \
    --model_name demo/model/mlm/checkpoint-200/ \
    --output_dir demo/model/sentiment/ \
    --tokenizer_name demo/model/tokenizer/ \
    --log_dir logs

+----+------------+-------------+----------------+
|    |   F1-Score |   BatchSize |   LearningRate |
+====+============+=============+================+
|  0 |   0.666667 |          16 |          0.001 |
+----+------------+-------------+----------------+
|  1 |   0.666667 |          16 |         0.0001 |
+----+------------+-------------+----------------+
|  2 |          0 |          16 |          1e-05 |
+----+------------+-------------+----------------+
|  3 |          0 |          16 |          1e-06 |
+----+------------+-------------+----------------+
|  4 |   0.666667 |          16 |          0.003 |
+----+------------+-------------+----------------+
|  5 |   0.666667 |          16 |         0.0003 |
+----+------------+-------------+----------------+
|  6 |          0 |          16 |          3e-05 |
+----+------------+-------------+----------------+
|  7 |          0 |          16 |          3e-06 |
+----+------------+-------------+----------------+
|  8 |          0 |          16 |          0.005 |
+----+------------+-------------+----------------+
|  9 |   0.666667 |          16 |         0.0005 |
+----+------------+-------------+----------------+
| 10 |          0 |          16 |          5e-05 |
+----+------------+-------------+----------------+
| 11 |          0 |          16 |          5e-06 |
+----+------------+-------------+----------------+
| 12 |   0.666667 |          32 |          0.001 |
+----+------------+-------------+----------------+
| 13 |   0.666667 |          32 |         0.0001 |
+----+------------+-------------+----------------+
| 14 |          0 |          32 |          1e-05 |
+----+------------+-------------+----------------+
| 15 |          0 |          32 |          1e-06 |
+----+------------+-------------+----------------+
| 16 |   0.666667 |          32 |          0.003 |
+----+------------+-------------+----------------+
| 17 |   0.666667 |          32 |         0.0003 |
+----+------------+-------------+----------------+
| 18 |          0 |          32 |          3e-05 |
+----+------------+-------------+----------------+
| 19 |          0 |          32 |          3e-06 |
+----+------------+-------------+----------------+
| 20 |          0 |          32 |          0.005 |
+----+------------+-------------+----------------+
| 21 |   0.666667 |          32 |         0.0005 |
+----+------------+-------------+----------------+
| 22 |          0 |          32 |          5e-05 |
+----+------------+-------------+----------------+
| 23 |          0 |          32 |          5e-06 |
+----+------------+-------------+----------------+
| 24 |   0.666667 |           8 |          0.001 |
+----+------------+-------------+----------------+
| 25 |   0.666667 |           8 |         0.0001 |
+----+------------+-------------+----------------+
| 26 |          0 |           8 |          1e-05 |
+----+------------+-------------+----------------+
| 27 |          0 |           8 |          1e-06 |
+----+------------+-------------+----------------+
| 28 |   0.666667 |           8 |          0.003 |
+----+------------+-------------+----------------+
| 29 |   0.666667 |           8 |         0.0003 |
+----+------------+-------------+----------------+
| 30 |          0 |           8 |          3e-05 |
+----+------------+-------------+----------------+
| 31 |          0 |           8 |          3e-06 |
+----+------------+-------------+----------------+
| 32 |          0 |           8 |          0.005 |
+----+------------+-------------+----------------+
| 33 |   0.666667 |           8 |         0.0005 |
+----+------------+-------------+----------------+
| 34 |          0 |           8 |          5e-05 |
+----+------------+-------------+----------------+
| 35 |          0 |           8 |          5e-06 |
+----+------------+-------------+----------------+
Model is demo/model/mlm/checkpoint-200/
Best Configuration is 16 0.001
Best F1 is 0.6666666666666666
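Selecting the winning configuration from a table like this boils down to taking the row with the highest dev F1-score, which the script does for you. A minimal sketch of that selection step, using a few rows from the table above as example data:

# Each entry corresponds to one row of the results table above (truncated here).
results = [
    {"BatchSize": 16, "LearningRate": 1e-3, "F1-Score": 0.666667},
    {"BatchSize": 16, "LearningRate": 1e-5, "F1-Score": 0.0},
    # ... remaining configurations ...
]

best = max(results, key=lambda row: row["F1-Score"])
print(f"Best Configuration is {best['BatchSize']} {best['LearningRate']}")
print(f"Best F1 is {best['F1-Score']}")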
The command fine-tunes the model with 5 different random seeds. The resulting models can be found in the folder demo/model/sentiment/:
$ ls -lh demo/model/sentiment/ | grep '^d' | awk '{print $9}'
bert-custom-model_sentiment_16_0.001_4_1
bert-custom-model_sentiment_16_0.001_4_2
bert-custom-model_sentiment_16_0.001_4_3
bert-custom-model_sentiment_16_0.001_4_4
bert-custom-model_sentiment_16_0.001_4_5
Each of these folders contains the following files:
$ ls -lh demo/model/sentiment/bert-custom-model_sentiment_16_0.001_4_1/ | awk '{print $5, $9}'
386B all_results.json
700B config.json
219B eval_results.json
41B predict_results_sentiment.txt
3.6M pytorch_model.bin
96B runs
48B test_predictions.txt
147B test_results.json
187B train_results.json
808B trainer_state.json
2.9K training_args.bin
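Since config.json and pytorch_model.bin form a standard Hugging Face checkpoint, the fine-tuned classifier can presumably be loaded with the transformers API for quick inference. A minimal sketch, assuming the tokenizer saved in demo/model/tokenizer/ is the one used during training:

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_dir = "demo/model/sentiment/bert-custom-model_sentiment_16_0.001_4_1/"
tokenizer = AutoTokenizer.from_pretrained("demo/model/tokenizer/")
model = AutoModelForSequenceClassification.from_pretrained(model_dir)
model.eval()

# Classify a single sentence; the predicted id maps to the labels in the training file.
inputs = tokenizer("I liked the movie", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())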
The file test_predictions.txt contains the predictions from the model on the test set. Similarly, the files test_results.json and eval_results.json contain the results (F1-score, accuracy, etc.) from the model on the test and dev sets, respectively.
A sample snippet of eval_results.json is presented here:
$ head demo/model/sentiment/bert-custom-model_sentiment_16_0.001_4_1/eval_results.json
{
    "epoch": 4.0,
    "eval_f1": 0.6666666666666666,
    "eval_loss": 0.7115099430084229,
    "eval_runtime": 0.0788,
    "eval_samples": 6,
    "eval_samples_per_second": 76.159,
    "eval_steps_per_second": 12.693
}
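Because one model is trained per random seed, it can be useful to aggregate the dev metric over the five runs. A small sketch, assuming the folder layout shown in the listing above:

import json
from glob import glob
from statistics import mean, stdev

pattern = "demo/model/sentiment/bert-custom-model_sentiment_16_0.001_4_*/eval_results.json"
scores = []
for path in sorted(glob(pattern)):
    with open(path) as f:
        scores.append(json.load(f)["eval_f1"])

print(f"dev F1 over {len(scores)} runs: mean={mean(scores):.4f}, std={stdev(scores):.4f}")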
The scores are poor because we trained on a tiny corpus; training on a larger corpus should give better results.