Notebooks

The notebooks cover basic implementations of each control in our toolkit, including examples of how to use methods provided through wrappers, as well as implementations of the benchmarks.

Controls

  • Input control


    Input control methods adapt the input (prompt) before the model is called. Current notebooks cover:

    FewShot

  • Structural control


    Structural control methods adapt the model's weights/architecture. Current notebooks cover:

    MergeKit wrapper

    TRL wrapper

  • State control


    State control methods influence the model's internal states (activations, attention, etc.) at inference time. Current notebooks cover:

    CAST

    PASTA

  • Output control


    Output control methods influence the model's behavior via the generate() method. Current notebooks cover:

    DeAL

    RAD

    SASA

    ThinkingIntervention
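
To make the input-control category concrete, here is a minimal sketch of few-shot prompting in plain Python: worked examples are prepended to the query before the prompt is passed to a model. The names `Example` and `build_few_shot_prompt` and the `Q:`/`A:` layout are hypothetical and chosen for illustration only; this is not the toolkit's FewShot API, which is documented in its notebook.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """A single demonstration pair (hypothetical helper type)."""
    question: str
    answer: str


def build_few_shot_prompt(examples, query):
    """Prepend demonstration pairs to the query; the model itself is untouched."""
    # One "Q/A" block per demonstration.
    blocks = [f"Q: {ex.question}\nA: {ex.answer}" for ex in examples]
    # The final block leaves the answer open for the model to complete.
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)


# Toy demonstrations, for illustration only.
examples = [
    Example("What is 2 + 2?", "4"),
    Example("What is 3 + 5?", "8"),
]
prompt = build_few_shot_prompt(examples, "What is 7 + 1?")
print(prompt)
```

The resulting string would then be tokenized and passed to the model as usual; only the input changes, which is what distinguishes input control from the structural, state, and output categories above.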

Benchmarks

  • Commonsense MCQA


    This benchmark compares a steered model (under FewShot and LoRA) against a base model on commonsense multiple-choice question answering (MCQA).

    See the benchmark

  • Instruction following


    This benchmark evaluates a steered model's ability to follow instructions. We compare the performance of the baseline model to the steered model under PASTA, DeAL, and ThinkingIntervention.

    See the benchmark
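
Both benchmarks come down to scoring a base model and a steered model on the same items. The sketch below shows that comparison for a multiple-choice setting, assuming accuracy is the metric of interest; the benchmark notebooks may report other metrics. The `accuracy` helper and the answer labels are hypothetical toy values, not results from the toolkit.

```python
def accuracy(predictions, gold):
    """Fraction of predicted choices that match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Toy labels for illustration only (not real benchmark outputs).
gold = ["A", "C", "B", "D"]
base_predictions = ["A", "B", "B", "A"]
steered_predictions = ["A", "C", "B", "A"]

base_acc = accuracy(base_predictions, gold)
steered_acc = accuracy(steered_predictions, gold)

print(f"base: {base_acc:.2f}, steered: {steered_acc:.2f}, "
      f"delta: {steered_acc - base_acc:+.2f}")
```

Scoring both models on the same items keeps the comparison paired, so any difference reflects the steering method and not a change in the evaluation set.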