run_seq module
Fine-tuning the library models for sequence classification tasks.
- class run_seq.DataTrainingArguments(task_name: Optional[str] = None, dataset_name: Optional[str] = None, dataset_config_name: Optional[str] = None, max_seq_length: int = 128, overwrite_cache: bool = False, pad_to_max_length: bool = True, max_train_samples: Optional[int] = None, max_eval_samples: Optional[int] = None, max_predict_samples: Optional[int] = None, train_file: Optional[str] = None, validation_file: Optional[str] = None, test_file: Optional[str] = None)[source]
Bases: object

Arguments pertaining to what data we are going to input to our model for training and evaluation.
Using HfArgumentParser we can turn this class into argparse arguments so they can be specified on the command line (see the usage sketch after the attribute list below).
- dataset_config_name: Optional[str] = None
- dataset_name: Optional[str] = None
- max_eval_samples: Optional[int] = None
- max_predict_samples: Optional[int] = None
- max_seq_length: int = 128
- max_train_samples: Optional[int] = None
- overwrite_cache: bool = False
- pad_to_max_length: bool = True
- task_name: Optional[str] = None
- test_file: Optional[str] = None
- train_file: Optional[str] = None
- validation_file: Optional[str] = None
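A minimal usage sketch, assuming run_seq.py follows the standard transformers example pattern of pairing these dataclasses with transformers.TrainingArguments; the flag values shown are illustrative, not defaults taken from the script.

    from transformers import HfArgumentParser, TrainingArguments

    from run_seq import DataTrainingArguments, ModelArguments

    # Parse command-line flags directly into the three dataclasses.
    parser = HfArgumentParser((ModelArguments, DataTrainingArguments, TrainingArguments))
    model_args, data_args, training_args = parser.parse_args_into_dataclasses()

    # Equivalent invocation from the shell (values are examples only):
    #   python run_seq.py --model_name_or_path bert-base-uncased \
    #       --train_file train.csv --validation_file dev.csv \
    #       --max_seq_length 128 --output_dir ./output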
- class run_seq.ModelArguments(model_name_or_path: str, config_name: Optional[str] = None, tokenizer_name: Optional[str] = None, cache_dir: Optional[str] = None, use_fast_tokenizer: bool = True, model_revision: str = 'main', use_auth_token: bool = False, log_dir: Optional[str] = None)[source]
Bases: object

Arguments pertaining to which model/config/tokenizer we are going to fine-tune from.
- cache_dir: Optional[str] = None
- config_name: Optional[str] = None
- log_dir: Optional[str] = None
- model_name_or_path: str
- model_revision: str = 'main'
- tokenizer_name: Optional[str] = None
- use_auth_token: bool = False
- use_fast_tokenizer: bool = True
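For illustration, a hedged sketch of how these fields typically feed the transformers Auto* loaders in the standard fine-tuning examples; this is an assumption about run_seq.py's internals rather than its confirmed code, and model_args is the instance parsed in the sketch above.

    from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer

    # Fall back to model_name_or_path when no separate config/tokenizer name is given.
    config = AutoConfig.from_pretrained(
        model_args.config_name or model_args.model_name_or_path,
        cache_dir=model_args.cache_dir,
        revision=model_args.model_revision,
        use_auth_token=True if model_args.use_auth_token else None,
    )
    tokenizer = AutoTokenizer.from_pretrained(
        model_args.tokenizer_name or model_args.model_name_or_path,
        cache_dir=model_args.cache_dir,
        use_fast=model_args.use_fast_tokenizer,
        revision=model_args.model_revision,
        use_auth_token=True if model_args.use_auth_token else None,
    )
    model = AutoModelForSequenceClassification.from_pretrained(
        model_args.model_name_or_path,
        config=config,
        cache_dir=model_args.cache_dir,
        revision=model_args.model_revision,
        use_auth_token=True if model_args.use_auth_token else None,
    )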