Actuators, Experiments & Measurement Spaces
Experiments¶
To find the values of certain properties of Entities we need to perform measurements on them. We use the term "experiment" to describe a particular type of measurement. This could also be called an "experiment protocol".
An experiment will define its inputs - the set of constitutive and observed properties it requires entities to have. It will also define the properties it measures.
Actuators¶
Experiments are provided by Actuators. An Actuator usually provides sets of experiments that work on the same types of entities i.e. have the same or similar input requirements. As such Actuators usually are related to a particular domain e.g. computational chemistry, foundation model inference, robotic biology lab.
ado get actuators --details
lists the available actuators and experiments. Here is a truncated example of the output:
ACTUATOR ID CATALOG ID EXPERIMENT ID SUPPORTED
0 molecule-embeddings Embeddings calculate-morgan-fingerprint True
1 molformer-toxicity molformer-toxicity predict-toxicity True
2 mordred Mordred Descriptor mordred-descriptor-calculator True
3 st4sd ST4SD toxicity-prediction-opera True
4 st4sd ST4SD band-gap-pm3-gamess-us:1.0.0 True
5 materials-model-evaluator materials-model-evaluator evaluate_with_clintox True
6 materials-model-evaluator materials-model-evaluator evaluate_sider_for_target True
7 caikit-config-explorer caikit-config-explorer fmaas-perf True
8 caikit-config-explorer caikit-config-explorer fmaas-perf-composable-gigaio True
One of the primary ways to extend ado
is to develop new Actuators providing the ability to do experiments on entities in a new domain.
Example: Experiment from the SFTTrainer actuator¶
Here is an example description of an experiment from the SFTTrainer actuator.
Identifier: SFTTrainer.finetune-full-fsdp-v1.6.0
Measures the performance of full-fine tuning a model with FSDP+flash-attention for a given (GPU model, number GPUS, batch_size, model_max_length) combination.
Inputs:
Constitutive Properties:
dataset_id
Domain:
Type: CATEGORICAL_VARIABLE_TYPE
Values: ['news-chars-1024-entries-1024', 'news-chars-1024-entries-256', 'news-chars-1024-entries-4096', 'news-chars-2048-entries-1024', 'news-chars-2048-entries-256', 'news-chars-2048-entries-4096', 'news-chars-512-entries-1024', 'news-chars-512-entries-256', 'news-chars-512-entries-4096', 'news-tokens-128kplus-entries-320', 'news-tokens-128kplus-entries-4096', 'news-tokens-16384plus-entries-4096']
model_name
Domain:
Type: CATEGORICAL_VARIABLE_TYPE
Values: ['granite-13b-v2', 'granite-20b-v2', 'granite-34b-code-base', 'granite-3b-1.5', 'granite-3b-code-base-128k', 'granite-7b-base', 'granite-8b-code-base', 'granite-8b-code-base-128k', 'granite-8b-japanese', 'hf-tiny-model-private/tiny-random-BloomForCausalLM', 'llama-13b', 'llama-7b', 'llama2-70b', 'llama3-70b', 'llama3-8b', 'llama3.1-405b', 'llama3.1-70b', 'llama3.1-8b', 'mistral-7b-v0.1', 'mixtral-8x7b-instruct-v0.1']
model_max_length
Domain:
Type: DISCRETE_VARIABLE_TYPE Interval: 1.0 Range: [1, 131073]
torch_dtype
Domain:
Type: CATEGORICAL_VARIABLE_TYPE Values: ['bfloat16']
number_gpus
Domain:
Type: DISCRETE_VARIABLE_TYPE Interval: 1.0 Range: [2, 9]
gpu_model
Domain:
Type: CATEGORICAL_VARIABLE_TYPE
Values: ['NVIDIA-A100-SXM4-80GB', 'NVIDIA-A100-80GB-PCIe', 'Tesla-T4', 'L40S', 'Tesla-V100-PCIE-16GB']
batch_size
Domain:
Type: DISCRETE_VARIABLE_TYPE Interval: 1.0 Range: [1, 129]
Outputs:
finetune-full-fsdp-v1.6.0-gpu_compute_utilization_min
finetune-full-fsdp-v1.6.0-gpu_compute_utilization_avg
finetune-full-fsdp-v1.6.0-gpu_compute_utilization_max
finetune-full-fsdp-v1.6.0-gpu_memory_utilization_min
finetune-full-fsdp-v1.6.0-gpu_memory_utilization_avg
finetune-full-fsdp-v1.6.0-gpu_memory_utilization_max
finetune-full-fsdp-v1.6.0-gpu_memory_utilization_peak
finetune-full-fsdp-v1.6.0-gpu_power_watts_min
finetune-full-fsdp-v1.6.0-gpu_power_watts_avg
finetune-full-fsdp-v1.6.0-gpu_power_watts_max
finetune-full-fsdp-v1.6.0-gpu_power_percent_min
finetune-full-fsdp-v1.6.0-gpu_power_percent_avg
finetune-full-fsdp-v1.6.0-gpu_power_percent_max
finetune-full-fsdp-v1.6.0-cpu_compute_utilization
finetune-full-fsdp-v1.6.0-cpu_memory_utilization
finetune-full-fsdp-v1.6.0-train_runtime
finetune-full-fsdp-v1.6.0-train_samples_per_second
finetune-full-fsdp-v1.6.0-train_steps_per_second
finetune-full-fsdp-v1.6.0-train_tokens_per_second
finetune-full-fsdp-v1.6.0-train_tokens_per_gpu_per_second
finetune-full-fsdp-v1.6.0-model_load_time
finetune-full-fsdp-v1.6.0-dataset_tokens_per_second
finetune-full-fsdp-v1.6.0-dataset_tokens_per_second_per_gpu
finetune-full-fsdp-v1.6.0-is_valid
The SFTTrainer actuator provides experiments which measure the performance of different fine-tuning techniques on a foundation model fine-tuning deployment configuration. Therefore, the entities it takes as input represent fine-tuning deployment configuration.
Example: Experiment from the ST4SD actuator¶
Here is an example description of an experiment from the ST4SD actuator.
Identifier: st4sd.band-gap-pm3-gamess-us:1.0.0
Required Inputs:
Constitutive Properties:
smiles
Domain:
Type: UNKNOWN_VARIABLE_TYPE
Outputs:
band-gap-pm3-gamess-us:1.0.0-band-gap
band-gap-pm3-gamess-us:1.0.0-homo
band-gap-pm3-gamess-us:1.0.0-lumo
band-gap-pm3-gamess-us:1.0.0-electric-moments
band-gap-pm3-gamess-us:1.0.0-total-energy
The ST4SD actuator provides experiments which perform computational measurements of entities, often molecules. Therefore, the entities it takes as input often represent molecules.
Experiment Inputs¶
Experiments define their inputs they require along with valid values for those inputs.
Required Inputs¶
Experiments can define required inputs. There are properties an Entity must have values for, for it to be a valid input to the Experiment.
For example for SFTTrainer.finetune-full-fsdp-v1.6.0
shown above we can see it requires an Entity to have 7 constitutive properties defined: dataset_id
, model_name
, model_max_length
, torch_dtype
, number_gpus
, gpu_model
, and batch_size
. Each one has a domain which defines the allowed values for that property - if an Entity has a value for a property that is not in the defined domain the experiment cannot run on it.
For example, the number_gpu
property can only have the values 2,3,4,5,6,7 and 8 (range is exclusive of upper bound)
number_gpus
Domain:
Type: DISCRETE_VARIABLE_TYPE Interval: 1.0 Range: [2, 9]
All the required inputs in the examples above are constitutive properties. However, they can also be observed properties (see next section) i.e. properties measured by other experiments. If an Experiment, B
has a required input that is an observed property it means the experiment measuring that property has to be run on an Entity before Experiment B
can be run on it.
Optional Properties¶
Experiment can also define optional properties. These are properties an Entity can have but if they don't the Experiment will give it a default value. In addition, the default values of optional properties can be overridden to create parameterized experiments. This is described further in the discoveryspace
resource documentation.
An example experiment with optional properties is
Identifier: robotic_lab.peptide_mineralization
Measures adsorption of peptide lanthanide combinations
Required Inputs:
Constitutive Properties:
peptide_identifier
Domain:
Type: CATEGORICAL_VARIABLE_TYPE
Values: ['test_peptide', 'test_peptide_new']
peptide_concentration
Domain:
Type: DISCRETE_VARIABLE_TYPE
Values: [0.1, 0.4, 0.6, 0.8]
Range: [0.1, 1.8]
lanthanide_concentration
Domain:
Type: DISCRETE_VARIABLE_TYPE
Values: [0.1, 0.4, 0.6, 0.8]
Range: [0.1, 1.8]
Optional Inputs and Default Values:
temperature
Domain:
Type: CONTINUOUS_VARIABLE_TYPE Range: [0, 100]
Default value: 23.0
replicas
Domain:
Type: DISCRETE_VARIABLE_TYPE Interval: 1.0 Range: [1, 4]
Default value: 1.0
robot_identifier
Domain:
Type: CATEGORICAL_VARIABLE_TYPE Values: ['harry', 'hermione']
Default value: hermione
Outputs:
peptide_mineralization-adsorption_timeseries
peptide_mineralization-adsorption_plateau_value
Here you can see three optional properties, temperature
, replicas
and robot_identifier
that are given default values.
Target and Observed Properties¶
Experiments define properties the properties they measure. However, there may be many experiments that measure the same property in different ways so we need a way to differentiate them.
The properties the experiment targets measuring are called target properties
, and the properties it actually measures observed properties
. If experiment A
has target property X
, then the observed property is A-X
i.e. the value of target property X
measured by experiment A
.
Looking at the definitions above for the st4sd.band-gap-pm3-gamess-us:1.0.0
experiment we can see a target is band-gap
and the corresponding observed property is band-gap-pm3-gamess-us:1.0.0-band-gap
Measurement Space¶
A measurement space is simply a set of experiments.
Since each experiment has a set of observed properties, a measurement space also defines a set of observed properties.
Since each observed property is an observation of a target property, a measurement space also defines a set of target properties.