Util
checkpoint
class seq2seq.util.checkpoint.Checkpoint(model, optimizer, epoch, step, input_vocab, output_vocab, path=None)
The Checkpoint class manages the saving and loading of a model during training. It allows training to be suspended and resumed at a later time (e.g. when running on a cluster using sequential jobs).
To make a checkpoint, initialize a Checkpoint object with the arguments below, then call that object’s save() method to write the parameters to disk.
Parameters:
- model (seq2seq) – seq2seq model being trained
- optimizer (Optimizer) – stores the state of the optimizer
- epoch (int) – current epoch (an epoch is a loop through the full training data)
- step (int) – number of examples seen within the current epoch
- input_vocab (Vocabulary) – vocabulary for the input language
- output_vocab (Vocabulary) – vocabulary for the output language
Variables:
- CHECKPOINT_DIR_NAME (str) – name of the checkpoint directory
- TRAINER_STATE_NAME (str) – name of the file storing trainer states
- MODEL_NAME (str) – name of the file storing the model
- INPUT_VOCAB_FILE (str) – name of the input vocab file
- OUTPUT_VOCAB_FILE (str) – name of the output vocab file
CHECKPOINT_DIR_NAME = 'checkpoints'
INPUT_VOCAB_FILE = 'input_vocab.pt'
MODEL_NAME = 'model.pt'
OUTPUT_VOCAB_FILE = 'output_vocab.pt'
TRAINER_STATE_NAME = 'trainer_states.pt'
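The constants above suggest how a checkpoint is laid out on disk. The following is a minimal sketch, not the library's code: it assumes that each checkpoint subdirectory sits under `<experiment_dir>/checkpoints/` and holds one file per constant. The helper name `expected_checkpoint_files` is hypothetical and exists only for illustration.

```python
import os

# Values copied from the Checkpoint class attributes documented above.
CHECKPOINT_DIR_NAME = 'checkpoints'
TRAINER_STATE_NAME = 'trainer_states.pt'
MODEL_NAME = 'model.pt'
INPUT_VOCAB_FILE = 'input_vocab.pt'
OUTPUT_VOCAB_FILE = 'output_vocab.pt'


def expected_checkpoint_files(experiment_dir, subdir_name):
    """Hypothetical helper: list the files one checkpoint is assumed to hold.

    Assumes the layout <experiment_dir>/checkpoints/<subdir_name>/<file>.
    """
    checkpoint_path = os.path.join(experiment_dir, CHECKPOINT_DIR_NAME, subdir_name)
    names = [TRAINER_STATE_NAME, MODEL_NAME, INPUT_VOCAB_FILE, OUTPUT_VOCAB_FILE]
    return [os.path.join(checkpoint_path, name) for name in names]


if __name__ == '__main__':
    for path in expected_checkpoint_files('experiment', '2020_01_02_03_04_05'):
        print(path)
```

A listing like this can be handy for checking that a checkpoint copied between machines is complete before trying to load it.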
classmethod get_latest_checkpoint(experiment_path)
Given the path to an experiment directory, returns the path to the last saved checkpoint’s subdirectory.
Precondition: at least one checkpoint has been made (i.e., the latest checkpoint subdirectory exists).
Parameters:
- experiment_path (str) – path to the experiment directory
Returns: path to the last saved checkpoint’s subdirectory
Return type: str
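Because checkpoint subdirectories are named by timestamp, the most recent one can be found by comparing names. The sketch below is an assumption about how such a lookup could work, not the library's implementation: it assumes zero-padded, year-first names, so that lexicographic order matches chronological order. Unlike the documented method, the hypothetical `find_latest_checkpoint` returns `None` when the precondition does not hold.

```python
import os

CHECKPOINT_DIR_NAME = 'checkpoints'  # value documented for Checkpoint


def find_latest_checkpoint(experiment_path):
    """Hypothetical stand-in: return the newest checkpoint subdirectory, or None.

    Assumes subdirectory names are zero-padded timestamps (year first),
    so the lexicographically largest name is also the most recent.
    """
    checkpoints_path = os.path.join(experiment_path, CHECKPOINT_DIR_NAME)
    if not os.path.isdir(checkpoints_path):
        return None  # no checkpoint directory yet
    subdirs = [name for name in os.listdir(checkpoints_path)
               if os.path.isdir(os.path.join(checkpoints_path, name))]
    if not subdirs:
        return None  # directory exists but holds no checkpoints
    return os.path.join(checkpoints_path, max(subdirs))
```

Returning `None` makes it easy to choose between resuming and starting from scratch in a job script. The documented method instead requires that at least one checkpoint already exists.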
classmethod load(path)
Loads a Checkpoint object that was previously saved to disk.
Parameters:
- path (str) – path to the checkpoint subdirectory
Returns: checkpoint object with fields copied from those stored on disk
Return type: Checkpoint
path
save(experiment_dir)
Saves the current model and related training parameters into a subdirectory of the checkpoint directory. The name of the subdirectory is the current local time in Y_M_D_H_M_S (year, month, day, hour, minute, second) format.
Parameters:
- experiment_dir (str) – path to the experiment root directory
Returns: path to the saved checkpoint subdirectory
Return type: str
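To make the naming scheme concrete, here is a minimal sketch of how a checkpoint path of this form could be built with the standard library. It assumes that Y_M_D_H_M_S corresponds to the `strftime` pattern `%Y_%m_%d_%H_%M_%S` (four-digit year, zero-padded fields). The helper `checkpoint_subdir` and its `when` parameter are hypothetical and not part of the library.

```python
import os
import time

CHECKPOINT_DIR_NAME = 'checkpoints'  # value documented for Checkpoint


def checkpoint_subdir(experiment_dir, when=None):
    """Hypothetical helper: build the path save() is described as writing to.

    `when` is an optional time.struct_time; it defaults to the current
    local time, matching the documented behaviour.
    """
    local = time.localtime() if when is None else when
    # Assumed pattern for the documented Y_M_D_H_M_S format.
    name = time.strftime('%Y_%m_%d_%H_%M_%S', local)
    return os.path.join(experiment_dir, CHECKPOINT_DIR_NAME, name)


if __name__ == '__main__':
    print(checkpoint_subdir('experiment'))
```

Names at one-second resolution mean that two saves within the same second would map to the same subdirectory. How the library handles that case is not stated here.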