TRL

aisteer360.algorithms.structural_control.wrappers.trl

The TRL wrapper implements a variety of methods from Hugging Face's TRL library.

The current functionality spans the following methods:

  • SFT (Supervised Fine-Tuning): Standard supervised learning to fine-tune language models on demonstration data
  • DPO (Direct Preference Optimization): Trains models directly on preference data without requiring a separate reward model
  • APO (Anchored Preference Optimization): A variant of DPO that uses an anchor model to improve training stability and performance
  • SPPO (Self-Play Preference Optimization): Iterative preference optimization using self-generated synthetic data to reduce dependency on external preference datasets

For further documentation, refer to the TRL documentation and the SPPO repository. A brief sketch of the preference-data format consumed by the DPO-style trainers is shown below.
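
The sketch below is illustrative only: it shows the prompt/chosen/rejected record layout that TRL's preference trainers (e.g. DPOTrainer) expect. The example records and the use of datasets.Dataset are assumptions for illustration, not part of the wrapper's own API.

# Illustrative sketch of DPO-style preference data (not the wrapper's exact API).
# TRL's preference trainers expect records with "prompt", "chosen", and "rejected" fields.
from datasets import Dataset

preference_records = [
    {
        "prompt": "Explain overfitting in one sentence.",
        "chosen": "Overfitting is when a model memorizes the training data and generalizes poorly.",
        "rejected": "Overfitting means the model is too small.",
    },
]
train_dataset = Dataset.from_list(preference_records)
# A dataset like this would then be handed to the corresponding TRL trainer
# during steering; exact trainer arguments vary across TRL versions.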

args

base_mixin

TRLMixin

Bases: StructuralControl

Source code in aisteer360/algorithms/structural_control/wrappers/trl/base_mixin.py
class TRLMixin(StructuralControl):
    """Shared base for the TRL-backed structural control wrappers (SFT, DPO, APO, SPPO)."""
    pass
args = self.Args.validate(*args, **kwargs) (instance attribute)

enabled = True (class attribute, instance attribute)

steer(model, tokenizer=None, **kwargs) (abstract method)

Required steering/preparation. Concrete controls override this to run the appropriate TRL training routine and return the updated model; a minimal sketch follows the source listing below.

Source code in aisteer360/algorithms/structural_control/base.py
@abstractmethod
def steer(
        self,
        model: PreTrainedModel,
        tokenizer: PreTrainedTokenizer = None,
        **kwargs
) -> PreTrainedModel:
    """Required steering/preparation."""
    pass
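
To make the abstract interface concrete, here is a hedged sketch of what a subclass implementing steer might look like when backed by TRL's SFTTrainer. The class name SFTControl, the train_dataset keyword, and the trainer configuration are illustrative assumptions rather than the wrapper's actual source, and SFTTrainer/SFTConfig arguments vary across TRL versions.

# Hypothetical sketch of a concrete control; names other than TRLMixin and steer are assumptions.
from transformers import PreTrainedModel, PreTrainedTokenizer
from trl import SFTConfig, SFTTrainer

from aisteer360.algorithms.structural_control.wrappers.trl.base_mixin import TRLMixin


class SFTControl(TRLMixin):
    def steer(
        self,
        model: PreTrainedModel,
        tokenizer: PreTrainedTokenizer = None,
        **kwargs,
    ) -> PreTrainedModel:
        # Run supervised fine-tuning on demonstration data and return the tuned model.
        trainer = SFTTrainer(
            model=model,
            args=SFTConfig(output_dir="sft_output"),  # illustrative config; options vary by TRL version
            train_dataset=kwargs["train_dataset"],    # assumed to be a datasets.Dataset of demonstrations
            processing_class=tokenizer,               # older TRL versions take `tokenizer=` instead
        )
        trainer.train()
        return trainer.model

The same pattern would apply to the preference-based methods (DPO, APO, SPPO), swapping in the corresponding TRL trainer and a preference dataset such as the one sketched earlier.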