Modules
- train_tokenizer module
- create_config module
- tokenize_corpus module
- run_clm module
  - DataTrainingArguments
    - DataTrainingArguments.block_size
    - DataTrainingArguments.dataset_config_name
    - DataTrainingArguments.dataset_name
    - DataTrainingArguments.keep_linebreaks
    - DataTrainingArguments.line_by_line
    - DataTrainingArguments.max_eval_samples
    - DataTrainingArguments.max_seq_length
    - DataTrainingArguments.max_train_samples
    - DataTrainingArguments.overwrite_cache
    - DataTrainingArguments.pad_to_max_length
    - DataTrainingArguments.preprocessing_num_workers
    - DataTrainingArguments.test_file
    - DataTrainingArguments.train_file
    - DataTrainingArguments.validation_file
    - DataTrainingArguments.validation_split_percentage
  - ModelArguments
  - main()
- run_mlm module
  - DataTrainingArguments
    - DataTrainingArguments.dataset_config_name
    - DataTrainingArguments.dataset_name
    - DataTrainingArguments.keep_linebreaks
    - DataTrainingArguments.line_by_line
    - DataTrainingArguments.max_eval_samples
    - DataTrainingArguments.max_seq_length
    - DataTrainingArguments.max_train_samples
    - DataTrainingArguments.mlm_probability
    - DataTrainingArguments.overwrite_cache
    - DataTrainingArguments.pad_to_max_length
    - DataTrainingArguments.preprocessing_num_workers
    - DataTrainingArguments.test_file
    - DataTrainingArguments.train_file
    - DataTrainingArguments.validation_file
    - DataTrainingArguments.validation_split_percentage
  - ModelArguments
    - ModelArguments.cache_dir
    - ModelArguments.config_name
    - ModelArguments.config_overrides
    - ModelArguments.freeze_token_embed
    - ModelArguments.model_name_or_path
    - ModelArguments.model_revision
    - ModelArguments.model_type
    - ModelArguments.pretrained_token_embed
    - ModelArguments.tokenizer_name
    - ModelArguments.use_auth_token
    - ModelArguments.use_fast_tokenizer
  - main()
  - read_txt_embeddings()
- run_seq_to_seq_pretrain module
- run_tc module
  - DataTrainingArguments
    - DataTrainingArguments.dataset_config_name
    - DataTrainingArguments.dataset_name
    - DataTrainingArguments.early_stop
    - DataTrainingArguments.label_column_name
    - DataTrainingArguments.max_seq_length
    - DataTrainingArguments.overwrite_cache
    - DataTrainingArguments.pad_to_max_length
    - DataTrainingArguments.preprocessing_num_workers
    - DataTrainingArguments.task_name
    - DataTrainingArguments.test_file
    - DataTrainingArguments.text_column_name
    - DataTrainingArguments.train_file
    - DataTrainingArguments.validation_file
  - ModelArguments
  - main()
- run_seq module
  - DataTrainingArguments
    - DataTrainingArguments.dataset_config_name
    - DataTrainingArguments.dataset_name
    - DataTrainingArguments.max_eval_samples
    - DataTrainingArguments.max_predict_samples
    - DataTrainingArguments.max_seq_length
    - DataTrainingArguments.max_train_samples
    - DataTrainingArguments.overwrite_cache
    - DataTrainingArguments.pad_to_max_length
    - DataTrainingArguments.task_name
    - DataTrainingArguments.test_file
    - DataTrainingArguments.train_file
    - DataTrainingArguments.validation_file
  - ModelArguments
  - TaskArguments
  - main()
- data_collator_for_seq_to_seq module