Working with TuneExperiment and ILabTuner (BETA)

The TuneExperiment class is responsible for creating experiments and scheduling tunings. All experiment results are stored automatically in your chosen Cloud Object Storage (COS) for SaaS, from which TuneExperiment can fetch the results and provide them to you for further use. InstructLab fine-tuning is available only with IBM watsonx.ai for IBM Cloud.

Note

InstructLab fine-tuning is currently in the closed beta stage. The feature is available only to allowlisted users. Breaking API changes may be introduced in the future.

Preparations

The ILab scenario requires the additional preparation steps listed below:

  1. Your IBM Cloud IBMid must be added to allowlists that control access to the Tuning Studio user interface, the watsonx.ai API, and the InstructLab service.

  2. Upload the taxonomy to GitHub: clone the taxonomy from https://github.com/instructlab/taxonomy into your own GitHub repository.

  3. Prepare a Personal Access Token (PAT) for the GitHub repository that contains your taxonomy.

  4. Prepare a Secrets Manager instance: if you don’t have one yet, go to the resource list at https://cloud.ibm.com and add a new Secrets Manager resource. Provisioning a new instance may take a few hours.

  5. Add a record with the GitHub PAT to Secrets Manager: open Secrets Manager, choose the “Key-Value” secret type, and add the following record:

    {
        "github_pat": "<Personal Access Token>",
        "github_url": "<GitHub repository URL>"
    }
    
  6. Run the following ibmcloud CLI command to allow the connection between the InstructLab service and COS:

    ibmcloud iam authorization-policy-create Writer --source-service-name instructlab --target-service-name cloud-object-storage --target-service-instance-id <service_instance_id> --target-resource <bucket_name> --target-resource-type bucket
    

Configure ILabTuner

To initialize a TuneExperiment object, you need authentication credentials (for examples, see Setup) and the project_id or the space_id.

Hint

You can copy the project_id from the Project’s Manage tab (Project -> Manage -> General -> Details).

from ibm_watsonx_ai.experiment import TuneExperiment

experiment = TuneExperiment(
    credentials,
    project_id="7ac03029-8bdd-4d5f-a561-2c4fd1e40705"
)

ilab_tuner = experiment.ilab_tuner(
    name="ilab_tuning_name",
)
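
If you work in a deployment space rather than a project, pass space_id instead of project_id (the space ID below is a placeholder):

# Initialize the experiment against a deployment space instead of a project.
experiment = TuneExperiment(
    credentials,
    space_id="PASTE YOUR SPACE ID HERE"
)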

Document extraction

To extract a PDF file placed in the container into the GitHub repository, use document extraction.

from ibm_watsonx_ai.helpers import DataConnection
from ibm_watsonx_ai.helpers.connections.connections import GithubLocation, ContainerLocation


document_extraction = ilab_tuner.documents.extract(
    name="document_extraction",
    document_references=[DataConnection(
        location=ContainerLocation(path=extract_file))],
    results_reference=DataConnection(
        location=GithubLocation(
            secret_manager_url=secrets_manager_url,
            secret_id=secret_id,
            path=extracted_file
        )
    )
)

# OR

document_extraction = ilab_tuner.documents.extract(
    name="document_extraction",
    document_references=[DataConnection(
        location=ContainerLocation(path=extract_file))],
    results_reference=DataConnection(
        location=GithubLocation(
            secret_manager_url=secrets_manager_url,
            secret_id=secret_id,
            path=extracted_file
        )
    ),
    background_mode=True
)

print(document_extraction.get_run_status())
document_extraction_details = document_extraction.get_run_details()
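
When background_mode=True, the extraction runs asynchronously and the call returns immediately, so you can poll its state yourself. A minimal sketch, assuming get_run_status() eventually reports a terminal state such as 'completed' or 'failed':

import time

# Poll the background extraction run until it reaches a terminal state.
while document_extraction.get_run_status() not in ('completed', 'failed'):
    time.sleep(30)

print(document_extraction.get_run_status())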

Taxonomy import

The taxonomy must be imported from the GitHub repository into COS before it can be used in ILab tuning.

taxonomy_import = ilab_tuner.taxonomies.run_import(
    name="taxonomy",
    data_reference=DataConnection(
        location=GithubLocation(
            secret_manager_url=secrets_manager_url,
            secret_id=secret_id,
            path="."
        )
    )
)

# OR

taxonomy_import = ilab_tuner.taxonomies.run_import(
    name="taxonomy",
    data_reference=DataConnection(
        location=GithubLocation(
            secret_manager_url=secrets_manager_url,
            secret_id=secret_id,
            path="."
        )
    ),
    background_mode=True
)

print(taxonomy_import.get_run_status())
taxonomy_import_details = taxonomy_import.get_run_details()
taxonomy = taxonomy_import.get_taxonomy()

Taxonomy tree update

Knowledge and skills are added by updating the taxonomy tree and adding files to the taxonomy. To update the taxonomy tree, get the tree, modify it, and send the update.

taxonomy.get_taxonomy_tree()
taxonomy.update_taxonomy_tree({'compositional_skills': {...}})
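
For example, a possible update flow, assuming get_taxonomy_tree() returns the tree as a plain dictionary (the nested skill content below is a hypothetical placeholder):

# Fetch the current tree, adjust it locally, and send the update back.
taxonomy_tree = taxonomy.get_taxonomy_tree()

taxonomy_tree['compositional_skills'] = {
    # ... your added or modified skill nodes (hypothetical content) ...
}

taxonomy.update_taxonomy_tree(taxonomy_tree)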

Synthetic data generation

After the taxonomy is enhanced with additional knowledge and skills, run synthetic data generation to prepare the files for tuning.

sdg = ilab_tuner.synthetic_data.generate(
    name="sdg",
    taxonomy=taxonomy,
)

# OR

sdg = ilab_tuner.synthetic_data.generate(
    name="sdg",
    taxonomy=taxonomy,
    background_mode=True
)

print(sdg.get_run_status())
sdg_details = sdg.get_run_details()

Run ilab tuning

To schedule a tuning experiment, call the run() method, which triggers the training process. The run() method can be synchronous (background_mode=False) or asynchronous (background_mode=True). If you don’t want to wait for the training to end, invoke the asynchronous version, which returns immediately with only the run details.

tuning_details = ilab_tuner.run(
    training_data_references=[sdg.get_results_reference()],
    training_results_reference=DataConnection(
        location=ContainerLocation(
            path="."
        )
    )
)

# OR

tuning_details = ilab_tuner.run(
    training_data_references=[sdg.get_results_reference()],
    training_results_reference=DataConnection(
        location=ContainerLocation(
            path="."
        )
    ),
    background_mode=True
)

Get run status, get run details

If you use the run() method asynchronously, you can monitor the run details and status by using the following two methods:

status = ilab_tuner.get_run_status()
print(status)
'running'

# OR

'completed'

run_details = ilab_tuner.get_run_details(include_metrics=True)
print(run_details)
{'entity': {'results_reference': {'location': {'assets_path': 'f4181378-bd39-4240-aa0c-5b0adac26ff6/assets',
    'path': '.',
    'training': 'f4181378-bd39-4240-aa0c-5b0adac26ff6',
    'training_status': 'f4181378-bd39-4240-aa0c-5b0adac26ff6/training-status.json'},
   'type': 'container'},
  'status': {'completed_at': '2025-01-28T21:12:40.533Z',
   'locations': {'logs': 'trained_models/3501ef6c-a1d3-424a-afd3-079dfffb2939/logs'},
   'metrics': [{'context': {'locations': {'artifacts': 'trained_models/3501ef6c-a1d3-424a-afd3-079dfffb2939/artifacts',
       'eval': 'trained_models/3501ef6c-a1d3-424a-afd3-079dfffb2939/eval',
       'model': 'trained_models/3501ef6c-a1d3-424a-afd3-079dfffb2939/model'}},
     'fine_tuning_metrics': {'mmlu': {'overall_average': 0.57,
       'scores': {'mmlu_abstract_algebra': 0.37,
        'mmlu_anatomy': 0.59,
        'mmlu_astronomy': 0.72,
        'mmlu_business_ethics': 0.53,
        'mmlu_clinical_knowledge': 0.6,
        'mmlu_college_biology': 0.66,
        'mmlu_college_chemistry': 0.4,
        'mmlu_college_computer_science': 0.37,
        'mmlu_college_mathematics': 0.36,
        'mmlu_college_medicine': 0.59,
        'mmlu_college_physics': 0.32,
        'mmlu_computer_security': 0.7,
        'mmlu_conceptual_physics': 0.49,
        'mmlu_econometrics': 0.37,
        'mmlu_electrical_engineering': 0.59,
        'mmlu_elementary_mathematics': 0.41,
        'mmlu_formal_logic': 0.32,
        'mmlu_global_facts': 0.36,
        'mmlu_high_school_biology': 0.69,
        'mmlu_high_school_chemistry': 0.5,
        'mmlu_high_school_computer_science': 0.57,
        'mmlu_high_school_european_history': 0.73,
        'mmlu_high_school_geography': 0.72,
        'mmlu_high_school_government_and_politics': 0.79,
        'mmlu_high_school_macroeconomics': 0.55,
        'mmlu_high_school_mathematics': 0.32,
        'mmlu_high_school_microeconomics': 0.58,
        'mmlu_high_school_physics': 0.34,
        'mmlu_high_school_psychology': 0.75,
        'mmlu_high_school_statistics': 0.45,
        'mmlu_high_school_us_history': 0.75,
        'mmlu_high_school_world_history': 0.76,
        'mmlu_human_aging': 0.56,
        'mmlu_human_sexuality': 0.64,
        'mmlu_international_law': 0.68,
        'mmlu_jurisprudence': 0.68,
        'mmlu_logical_fallacies': 0.69,
        'mmlu_machine_learning': 0.39,
        'mmlu_management': 0.77,
        'mmlu_marketing': 0.83,
        'mmlu_medical_genetics': 0.58,
        'mmlu_miscellaneous': 0.74,
        'mmlu_moral_disputes': 0.62,
        'mmlu_moral_scenarios': 0.3,
        'mmlu_nutrition': 0.58,
        'mmlu_philosophy': 0.65,
        'mmlu_prehistory': 0.68,
        'mmlu_professional_accounting': 0.39,
        'mmlu_professional_law': 0.42,
        'mmlu_professional_medicine': 0.47,
        'mmlu_professional_psychology': 0.55,
        'mmlu_public_relations': 0.61,
        'mmlu_security_studies': 0.71,
        'mmlu_sociology': 0.8,
        'mmlu_us_foreign_policy': 0.81,
        'mmlu_virology': 0.42,
        'mmlu_world_religions': 0.73}},
      'mmlu_branch': {'improvements': {'knowledge_science_animals_birds_black_capped_chickadee': 0.1},
       'no_change': {},
       'regressions': {}},
      'mt_bench': {'error_rate': 0.03,
       'overall_average': 7.25,
       'scores': {'turn_one': 7.58, 'turn_two': 6.91}},
      'mt_bench_branch': {'error_rate': 0,
       'improvements': {'compositional_skills/grounded/linguistics/inclusion/qna.yaml': 2.17,
        'compositional_skills/linguistics/synonyms/qna.yaml': 1.5,
        'foundational_skills/reasoning/linguistics_reasoning/logical_sequence_of_words/qna.yaml': 2.67,
        'foundational_skills/reasoning/linguistics_reasoning/object_identification/qna.yaml': 1.67,
        'foundational_skills/reasoning/linguistics_reasoning/odd_one_out/qna.yaml': 5.33,
        'foundational_skills/reasoning/logical_reasoning/causal/qna.yaml': 2,
        'foundational_skills/reasoning/logical_reasoning/general/qna.yaml': 0.93,
        'foundational_skills/reasoning/mathematical_reasoning/qna.yaml': 1.67,
        'foundational_skills/reasoning/temporal_reasoning/qna.yaml': 0.33,
        'foundational_skills/reasoning/theory_of_mind/qna.yaml': 1.38,
        'foundational_skills/reasoning/unconventional_reasoning/lower_score_wins/qna.yaml': 2.67},
       'no_change': {},
       'regressions': {'compositional_skills/grounded/linguistics/writing/rewriting/qna.yaml': -0.4,
        'foundational_skills/reasoning/common_sense_reasoning/qna.yaml': -2.33,
        'foundational_skills/reasoning/logical_reasoning/tabular/qna.yaml': -5.33}},
      'tokens': {'training_phases': {}}},
     'timestamp': '2025-01-28T21:12:38.121Z'}],
   'running_at': '2025-01-27T13:01:36.331Z',
   'state': 'completed',
   'step': 'completed'},
  'training_data_references': [{'location': {'href': '/v2/assets/dd585535-1bd3-4989-b552-ae5602cb6728',
     'id': 'dd585535-1bd3-4989-b552-ae5602cb6728'},
    'type': 'data_asset'}],
  'type': 'ilab'},
 'metadata': {'created_at': '2025-01-27T13:00:09.492Z',
  'description': 'my_ilab_tuning',
  'id': 'f4181378-bd39-4240-aa0c-5b0adac26ff6',
  'modified_at': '2025-01-28T21:12:40.611Z',
  'name': 'my_ilab_tuning',
  'project_id': '6305b637-e91c-4e3c-b865-362388a81e48'},
 'system': {'warnings': [{'message': 'This beta release is a preview IBM Cloud service and is not meant for production use. Usage limitations that are explained in the Service Description apply.'}]}}
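
Because get_run_details() returns a plain dictionary, the evaluation metrics can be read directly from it. A short sketch based on the structure shown above:

# Read the evaluation metrics out of the run details shown above.
metrics = run_details['entity']['status']['metrics'][0]['fine_tuning_metrics']

print(metrics['mmlu']['overall_average'])      # e.g. 0.57
print(metrics['mt_bench']['overall_average'])  # e.g. 7.25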

Get data connections

The data_connections list contains all the training connections that you referenced while calling the run() method.

data_connections = ilab_tuner.get_data_connections()

# Get data in binary format
binary_data = data_connections[0].read(binary=True)
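
The binary content can then be written to a local file with standard Python (the file name below is arbitrary):

# Save the fetched training results locally; the file name is arbitrary.
with open('ilab_training_results.bin', 'wb') as f:
    f.write(binary_data)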