TRL
aisteer360.algorithms.structural_control.wrappers.trl
The TRL wrapper implements a variety of methods from Hugging Face's TRL library.
The current functionality spans the following methods:
- SFT (Supervised Fine-Tuning): Standard supervised learning to fine-tune language models on demonstration data
- DPO (Direct Preference Optimization): Trains models directly on preference data without requiring a separate reward model
- APO (Anchored Preference Optimization): A variant of DPO that uses an anchor model to improve training stability and performance
- SPPO (Self-Play Preference Optimization): Iterative preference optimization using self-generated synthetic data to reduce dependency on external preference datasets
For further documentation, please refer to the TRL documentation and the SPPO repository.
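As a reference for what the wrapper builds on, the sketch below shows a minimal supervised fine-tuning run using TRL's `SFTTrainer` directly. The model checkpoint, dataset, and output directory are placeholders, and the wrapper's own configuration surface (its `Args` classes) may expose these options differently.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder model and dataset; substitute your own checkpoint and demonstration data.
dataset = load_dataset("trl-lib/Capybara", split="train")

training_args = SFTConfig(output_dir="./sft_output", num_train_epochs=1)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",  # model name or a preloaded transformers model
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```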
args
base_mixin
TRLMixin
Bases: StructuralControl
Source code in aisteer360/algorithms/structural_control/wrappers/trl/base_mixin.py
args = self.Args.validate(*args, **kwargs) (instance-attribute)
enabled = True (class-attribute, instance-attribute)
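The `args` attribute indicates that each control validates its hyperparameters through an `Args.validate` call at construction time. Below is a hypothetical sketch of that pattern; the class name, fields, and checks are illustrative assumptions, not the library's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Args:
    """Hypothetical hyperparameter container; fields are illustrative only."""
    learning_rate: float = 5e-5
    num_train_epochs: int = 1

    @classmethod
    def validate(cls, *args, **kwargs) -> "Args":
        # Build the dataclass from the supplied arguments, then sanity-check values.
        instance = cls(*args, **kwargs)
        if instance.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")
        if instance.num_train_epochs < 1:
            raise ValueError("num_train_epochs must be at least 1")
        return instance
```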
steer(model, tokenizer=None, **kwargs)
abstractmethod
Required steering/preparation.
Source code in aisteer360/algorithms/structural_control/base.py
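Since `steer` is abstract, every concrete structural control supplies its own implementation. The sketch below shows, under stated assumptions, what an SFT-style implementation built on TRL's `SFTTrainer` might look like; the subclass name, the way the training dataset and output directory are passed in, and the convention of returning the fine-tuned model are assumptions rather than the library's actual code.

```python
from trl import SFTConfig, SFTTrainer

from aisteer360.algorithms.structural_control.wrappers.trl.base_mixin import TRLMixin


class ExampleSFTControl(TRLMixin):  # hypothetical subclass for illustration
    """Illustrative control that fine-tunes the model with TRL's SFTTrainer.

    A real control would also define the nested Args class required by the
    base class; it is omitted here to keep the sketch focused on steer().
    """

    def steer(self, model, tokenizer=None, **kwargs):
        # Assumed inputs: a demonstration dataset and an optional output directory.
        train_dataset = kwargs["train_dataset"]
        output_dir = kwargs.get("output_dir", "./sft_output")

        trainer = SFTTrainer(
            model=model,
            args=SFTConfig(output_dir=output_dir),
            train_dataset=train_dataset,
            processing_class=tokenizer,  # current TRL argument name (older releases use `tokenizer=`)
        )
        trainer.train()

        # Return the fine-tuned model so later pipeline stages operate on the updated weights.
        return trainer.model
```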