InstructLab Experiment (BETA)

TuneExperiment can create a dedicated InstructLab tuner. Its usage is similar to that of the fine tuner and the prompt tuner. InstructLab fine tuning is available only with IBM watsonx.ai for IBM Cloud.

Note

InstructLab fine tuning is currently in closed beta. The feature is available only to whitelisted users. Breaking changes to the API may be introduced in the future.

ILabTuner

class ibm_watsonx_ai.foundation_models.ILabTuner(name, api_client)[source]

Class of InstructLab fine tuner.

cancel_run()[source]

Cancel an ILab Tuning run.

delete_run()[source]

Delete an ILab Tuning run.

get_data_connections()[source]

Create DataConnection objects for further usage (e.g. to handle data storage connection).

Returns:

list of DataConnections

Return type:

list[DataConnection]

Example:

from ibm_watsonx_ai.experiment import TuneExperiment
experiment = TuneExperiment(credentials, ...)
ilab_tuner = experiment.ilab_tuner(...)
ilab_tuner.run(...)

data_connections = ilab_tuner.get_data_connections()

get_params()[source]

Get configuration parameters of ILabTuner.

Returns:

ILabTuner parameters

Return type:

dict

Example:

from ibm_watsonx_ai.experiment import TuneExperiment

experiment = TuneExperiment(credentials, ...)
ilab_tuner = experiment.ilab_tuner(...)

ilab_tuner.get_params()

# Result:
#
# {'name': 'ILab tuning'}

get_run_details(include_metrics=False)[source]

Get details of an ilab tuning run.

Parameters:

include_metrics (bool, optional) – whether to include metrics in the training details output

Returns:

details of the ilab tuning

Return type:

dict

Example:

from ibm_watsonx_ai.experiment import TuneExperiment

experiment = TuneExperiment(credentials, ...)
ilab_tuner = experiment.ilab_tuner(...)
ilab_tuner.run(...)

ilab_tuner.get_run_details()

get_run_status()[source]

Check the status/state of an initialized ilab tuning run if it was run in background mode.

Returns:

status of the ILab Tuning run

Return type:

str

Example:

from ibm_watsonx_ai.experiment import TuneExperiment

experiment = TuneExperiment(credentials, ...)
ilab_tuner = experiment.ilab_tuner(...)
ilab_tuner.run(...)

ilab_tuner.get_run_status()

# Result:
# 'completed'
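
When a run is started with background_mode=True, the status has to be polled until the run finishes. The following is a minimal sketch of such a loop, not part of the library: wait_for_run is a hypothetical helper, and only the 'completed' status appears in this documentation, so the other terminal state names are assumptions to adjust for your environment.

```python
import time


def wait_for_run(run, poll_interval=30.0, timeout=3600.0,
                 terminal_states=("completed", "failed", "canceled")):
    """Poll run.get_run_status() until it reports a terminal state.

    `run` is any object exposing get_run_status(), for example an
    ILabTuner whose run() was called with background_mode=True.
    Only 'completed' is documented; 'failed' and 'canceled' are assumed names.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = run.get_run_status()
        if status in terminal_states:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still in state {status!r} after {timeout} s")
        # Wait before asking the service again.
        time.sleep(poll_interval)
```

Because the helper relies only on get_run_status(), the same loop should also work for the taxonomy import, document extraction, and synthetic data generation run objects described below.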
run(training_data_references, training_results_reference=None, background_mode=False)[source]

Run an ilab tuning process of a foundation model on top of the training data referenced by DataConnection.

Parameters:
  • training_data_references (list[DataConnection]) – data storage connection details to inform where the training data is stored

  • training_results_reference (DataConnection, optional) – data storage connection details to store pipeline training results

  • background_mode (bool, optional) – indicator if the run() method will run in the background (async) or wait for completion (sync)

Returns:

run details

Return type:

dict

Example:

from ibm_watsonx_ai.experiment import TuneExperiment
from ibm_watsonx_ai.helpers import ContainerLocation, DataConnection, GithubLocation

experiment = TuneExperiment(credentials, ...)
ilab_tuner = experiment.ilab_tuner(...)

taxonomy_import = ilab_tuner.taxonomies.run_import(
    name="my_taxonomy",
    data_reference=DataConnection(
        location=GithubLocation(
            secret_manager_url="...",
            secret_id="...",
            path="."
        )
    )
)

taxonomy = taxonomy_import.get_taxonomy()

sdg = ilab_tuner.synthetic_data.generate(
    name="my_sdg",
    taxonomy=taxonomy
)

ilab_tuner.run(
    training_data_references=[sdg.get_results_reference()],
    training_results_reference=DataConnection(
        location=ContainerLocation(
            path="fine_tuning_result"
        )
    )
)

Taxonomy import

This section contains classes used for the taxonomy import step.

Taxonomy

class ibm_watsonx_ai.foundation_models.ilab.taxonomies.Taxonomy(id, api_client)[source]

Class of InstructLab taxonomy.

delete()[source]

Delete taxonomy import

get_details()[source]

Get taxonomy import details

Returns:

details of taxonomy import

Return type:

dict

get_taxonomy_import()[source]

Get taxonomy import object

Returns:

taxonomy import

Return type:

TaxonomyImport

get_taxonomy_tree()[source]

Get taxonomy import tree

Returns:

taxonomy import tree

Return type:

dict

update_taxonomy_tree(updated_taxonomy_tree)[source]

Update taxonomy import tree

Parameters:

updated_taxonomy_tree (dict) – taxonomy tree with updated nodes
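
get_taxonomy_tree() and update_taxonomy_tree() together form a read-modify-write cycle. A minimal sketch of that cycle follows; edit_taxonomy_tree and the transform callable are hypothetical names, and the internal layout of the tree dict is not described in this documentation, so the sketch treats it as opaque.

```python
import copy


def edit_taxonomy_tree(taxonomy, transform):
    """Fetch the tree, apply `transform` to a copy, and push the result.

    `taxonomy` is a Taxonomy-like object exposing get_taxonomy_tree()
    and update_taxonomy_tree(). `transform` is a user-supplied callable
    (hypothetical) that takes a dict and returns the updated dict.
    """
    current_tree = taxonomy.get_taxonomy_tree()
    # Work on a deep copy so the fetched tree stays untouched if transform fails.
    updated_tree = transform(copy.deepcopy(current_tree))
    taxonomy.update_taxonomy_tree(updated_tree)
    return updated_tree
```

Keeping the modification in a separate callable makes it easy to inspect the tree returned by get_taxonomy_tree() first and decide which nodes to change.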

TaxonomyImport

class ibm_watsonx_ai.foundation_models.ilab.taxonomies.TaxonomyImport(name, api_client)[source]

Class of InstructLab taxonomy import.

cancel_run()[source]

Cancel taxonomy import run

delete_run()[source]

Delete taxonomy import run

get_run_details()[source]

Get details of taxonomy import run

Returns:

details of taxonomy import

Return type:

dict

get_run_status()[source]

Get status of taxonomy import run

Returns:

status of taxonomy import

Return type:

str

get_taxonomy()[source]

Get taxonomy object for given taxonomy import

Returns:

taxonomy object

Return type:

Taxonomy

TaxonomiesRuns

class ibm_watsonx_ai.foundation_models.ilab.taxonomies.TaxonomiesRuns(api_client)[source]

Class of InstructLab taxonomy import runs.

get_taxonomy_import(taxonomy_import_id)[source]

Get taxonomy import object by id.

Parameters:

taxonomy_import_id (str) – id of given taxonomy import

Returns:

taxonomy import object

Return type:

TaxonomyImport

Taxonomies

class ibm_watsonx_ai.foundation_models.ilab.taxonomies.Taxonomies(ilab_tuner_name, api_client)[source]

Class of InstructLab taxonomy import module.

run_import(*, data_reference, name=None, background_mode=False)[source]

Run a taxonomy import process, using data_reference to locate the taxonomy in a GitHub repository.

Parameters:
  • data_reference (DataConnection) – reference to the GitHub repository where the taxonomy is stored

  • name (str, optional) – name of the taxonomy import run

  • background_mode (bool, optional) – indicator if the method will run in the background (async) or wait for completion (sync)

Returns:

taxonomy import object

Return type:

TaxonomyImport

Example:

from ibm_watsonx_ai.experiment import TuneExperiment
from ibm_watsonx_ai.helpers import DataConnection, GithubLocation

experiment = TuneExperiment(credentials, ...)
ilab_tuner = experiment.ilab_tuner(...)

taxonomy_import = ilab_tuner.taxonomies.run_import(
    name="my_taxonomy",
    data_reference=DataConnection(
        location=GithubLocation(
            secret_manager_url="...",
            secret_id="...",
            path="."
        )
    ))

runs()[source]

Get the historical runs.

Returns:

runs object

Return type:

TaxonomiesRuns
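
The classes in this section can be chained to get back to a taxonomy from an earlier import. The sketch below assumes the Taxonomies object is reached as ilab_tuner.taxonomies, as in the examples above; taxonomy_from_import_id is a hypothetical helper name, not a library function.

```python
def taxonomy_from_import_id(taxonomies, taxonomy_import_id):
    """Return (TaxonomyImport, Taxonomy) for a previously started import.

    `taxonomies` is a Taxonomies-like object, e.g. ilab_tuner.taxonomies.
    """
    # runs() gives access to historical runs; look the import up by its id.
    taxonomy_import = taxonomies.runs().get_taxonomy_import(taxonomy_import_id)
    # Each taxonomy import exposes the taxonomy it produced.
    return taxonomy_import, taxonomy_import.get_taxonomy()
```

The returned Taxonomy can then be passed to synthetic data generation without repeating the import.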

Document extraction

This section contains classes used for the document extraction step.

DocumentExtraction

class ibm_watsonx_ai.foundation_models.ilab.documents.DocumentExtraction(name, api_client)[source]

Class of InstructLab document extraction.

cancel_run()[source]

Cancel document extraction run

delete_run()[source]

Delete document extraction run

get_run_details()[source]

Get document extraction details

Returns:

details of document extraction

Return type:

dict

get_run_status()[source]

Get document extraction status

Returns:

status of document extraction

Return type:

str

DocumentExtractionsRuns

class ibm_watsonx_ai.foundation_models.ilab.documents.DocumentExtractionsRuns(api_client)[source]

Class of InstructLab document extraction runs.

get_document_extraction(document_extraction_id)[source]

Get document extraction object

Parameters:

document_extraction_id (str) – id of document extraction object

Returns:

document extraction object

Return type:

DocumentExtraction

DocumentExtractions

class ibm_watsonx_ai.foundation_models.ilab.documents.DocumentExtractions(ilab_tuner_name, api_client)[source]

Class of InstructLab document extraction module.

extract(*, name=None, document_references, results_reference, background_mode=False)[source]

Extract .md documents from the given .pdf documents.

Parameters:
  • name (str, optional) – document extraction run name

  • document_references (list[DataConnection]) – .pdf document location

  • results_reference (DataConnection) – .md file extraction location

  • background_mode (bool, optional) – indicator if the method will run in the background, async or sync

Returns:

document extraction run

Return type:

DocumentExtraction
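
No example accompanies extract(), so here is a minimal sketch of a call that uses only the keyword arguments from the signature above. start_pdf_extraction is a hypothetical helper, and how the DocumentExtractions object is obtained from the tuner is not shown in this documentation, so it is passed in as a parameter.

```python
def start_pdf_extraction(document_extractions, pdf_references, results_reference,
                         name=None):
    """Start a PDF-to-Markdown extraction in the background.

    `document_extractions` is a DocumentExtractions-like object.
    `pdf_references` is a list of DataConnection objects pointing at .pdf files.
    `results_reference` is a DataConnection for the extracted .md files.
    Returns the run object together with its initial status.
    """
    if not pdf_references:
        raise ValueError("at least one document reference is required")
    extraction = document_extractions.extract(
        name=name,
        document_references=list(pdf_references),
        results_reference=results_reference,
        background_mode=True,  # return immediately instead of blocking
    )
    return extraction, extraction.get_run_status()
```

All arguments of extract() are keyword-only, which is why the sketch names each one explicitly.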

runs()[source]

Get the historical runs.

Returns:

runs object

Return type:

DocumentExtractionsRuns

Synthetic data generation

This section contains classes used for the synthetic data generation step.

SyntheticDataGeneration

class ibm_watsonx_ai.foundation_models.ilab.synthetic_data.SyntheticDataGeneration(name, api_client)[source]

Class of InstructLab synthetic data generation run.

cancel_run()[source]

Cancel synthetic data generation run

delete_run()[source]

Delete synthetic data generation run

get_results_reference()[source]

Get results reference to generated synthetic data.

Returns:

data connection to generated synthetic data

Return type:

DataConnection

get_run_details()[source]

Get synthetic data generation details

Returns:

details of synthetic data generation

Return type:

dict

get_run_status()[source]

Get synthetic data generation status

Returns:

status of synthetic data generation

Return type:

str

SDGRuns

class ibm_watsonx_ai.foundation_models.ilab.synthetic_data.SDGRuns(api_client)[source]

Class of InstructLab synthetic data generation runs.

get_synthetic_data_generation(sdg_id)[source]

Get synthetic data generation object

Parameters:

sdg_id (str) – id of synthetic data generation object

Returns:

synthetic data generation object

Return type:

SyntheticDataGeneration

SyntheticData

class ibm_watsonx_ai.foundation_models.ilab.synthetic_data.SyntheticData(ilab_tuner_name, api_client)[source]

Class of InstructLab synthetic data generation module.

generate(*, name=None, taxonomy, background_mode=False)[source]

Generate synthetic data from updated taxonomy

Parameters:
  • name (str, optional) – name of synthetic data generation run

  • taxonomy (Taxonomy) – taxonomy object

  • background_mode (bool, optional) – indicator if the method will run in the background, async or sync

Returns:

synthetic data generation run object

Return type:

SyntheticDataGeneration

runs()[source]

Get the historical runs.

Returns:

runs object

Return type:

SDGRuns
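
A finished synthetic data generation run can feed a later tuning run without being regenerated. The sketch below combines runs(), get_synthetic_data_generation(), get_results_reference(), and ILabTuner.run() as documented above; tune_from_existing_sdg is a hypothetical helper, and it assumes the SyntheticData object is reached as ilab_tuner.synthetic_data, as in the ILabTuner.run() example.

```python
def tune_from_existing_sdg(ilab_tuner, sdg_id, training_results_reference=None):
    """Start a tuning run on data from an earlier synthetic data generation.

    `ilab_tuner` is an ILabTuner-like object and `sdg_id` is the id of a
    previous synthetic data generation run.
    """
    # Look up the historical run and reuse the location of its output.
    sdg = ilab_tuner.synthetic_data.runs().get_synthetic_data_generation(sdg_id)
    return ilab_tuner.run(
        training_data_references=[sdg.get_results_reference()],
        training_results_reference=training_results_reference,
        background_mode=True,  # do not block while the model is tuned
    )
```

The sketch does not check whether the earlier run actually completed; calling get_run_status() on the retrieved object first would be a reasonable guard.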