Custom models

The steps for working with custom models in the watsonx.ai client differ depending on the product offering. Choose an option from the list below to see the steps.

IBM watsonx.ai for IBM Cloud

This section shows how to create task credentials, store and deploy a model, and use the ModelInference module with the created deployment on IBM watsonx.ai for IBM Cloud.

Initialize APIClient object

Initialize the APIClient object if needed. More details about supported APIClient initialization can be found in the Setup section.

from ibm_watsonx_ai import APIClient

client = APIClient(credentials)
client.set.default_project(project_id=project_id)
# or client.set.default_space(space_id=space_id)
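The snippet above uses a `credentials` object without defining it. One way to keep secrets out of source code is to read them from the environment first. The sketch below only collects keyword arguments; the helper `credential_kwargs` and the variable names `WATSONX_URL` and `WATSONX_APIKEY` are assumptions made for illustration and are not part of the SDK.

```python
import os
from typing import Mapping, Optional


def credential_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect url and api_key from environment variables.

    WATSONX_URL and WATSONX_APIKEY are hypothetical names chosen for this sketch.
    """
    env = os.environ if env is None else env
    # Report every missing variable at once instead of failing on the first
    missing = [name for name in ("WATSONX_URL", "WATSONX_APIKEY") if not env.get(name)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {"url": env["WATSONX_URL"], "api_key": env["WATSONX_APIKEY"]}
```

The result can then be passed on, for example as `Credentials(**credential_kwargs())` with `Credentials` imported from `ibm_watsonx_ai`. The Setup section lists the authentication options supported on each offering.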

Add Task Credentials

Warning

Task credentials are required on IBM watsonx.ai for IBM Cloud to make a deployment. Add them if you have not already done so.

Task credentials enable you to deploy a custom foundation model and avoid token expiration issues. More details can be found in: Adding task credentials.

To list the available task credentials, use the list method:

client.task_credentials.list()

If the list is empty, create new task credentials with the store method:

client.task_credentials.store()

To get the status of the available task credentials, use the get_details method:

client.task_credentials.get_details()
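The three calls above can be combined so that credentials are created only when none exist. This is a minimal sketch: `ensure_task_credentials` is a hypothetical helper, and it assumes the object returned by `list()` supports `len()`, as a pandas DataFrame or a plain list would.

```python
def ensure_task_credentials(client):
    """Create task credentials only when none are listed.

    Assumption: the object returned by list() supports len(),
    as a pandas DataFrame or a plain list would.
    """
    listing = client.task_credentials.list()
    if listing is None or len(listing) == 0:
        # Nothing listed yet: create credentials, as in the store() example
        return client.task_credentials.store()
    # Credentials already exist: return their status instead
    return client.task_credentials.get_details()
```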

Storing model in service repository

To store a model as an asset in the repository, first create the proper metadata.

In the Cloud scenario, an active connection to Cloud Object Storage (COS) is required. For more information on how to create a COS connection, see Connection Asset. The connection provides the parameters needed to fill in the MODEL_LOCATION field.

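# sw_spec_id is the ID of the software specification for custom foundation models,
# for example client.software_specifications.get_id_by_name('watsonx-cfm-caikit-1.0')
metadata = {
    client.repository.ModelMetaNames.NAME: "custom FM asset",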
    client.repository.ModelMetaNames.SOFTWARE_SPEC_ID: sw_spec_id,
    client.repository.ModelMetaNames.TYPE: client.repository.ModelAssetTypes.CUSTOM_FOUNDATION_MODEL_1_0,
    client.repository.ModelMetaNames.MODEL_LOCATION: {
        "file_path": "path/to/pvc",
        "bucket": "watsonx-llm-models",
        "connection_id": "5e891c6b-3aa9-4f01-8e2d-785d81797466",
    },
}
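Because a blank value in MODEL_LOCATION is easy to miss until the store call fails, the dictionary can be built through a small checking helper. This is only a sketch: `model_location` is a hypothetical function, and the only rule it enforces (no empty fields) is an assumption rather than a documented service requirement.

```python
def model_location(file_path: str, bucket: str, connection_id: str) -> dict:
    """Build the MODEL_LOCATION value with the three keys used above."""
    fields = {"file_path": file_path, "bucket": bucket, "connection_id": connection_id}
    # Collect the names of fields that are None, empty, or whitespace only
    empty = [key for key, value in fields.items() if not value or not value.strip()]
    if empty:
        raise ValueError(f"empty MODEL_LOCATION fields: {', '.join(empty)}")
    return fields
```

The returned dictionary can be used directly as the value for `client.repository.ModelMetaNames.MODEL_LOCATION`.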

Storing model

After that, store the model by using client.repository.store_model().

stored_model_details = client.repository.store_model(model='Google/flan-t5-small', meta_props=metadata)

To get the ID of the stored asset, use the details obtained.

model_asset_id = client.repository.get_model_id(stored_model_details)

All stored custom foundation models can be listed with the client.repository.list() method by filtering on framework type.

client.repository.list(framework_filter='custom_foundation_model_1.0')

Deployment of custom foundation model

To create a new deployment of a custom foundation model, define a dictionary with the deployment metadata. It can specify the NAME of the new deployment, a DESCRIPTION, and the HARDWARE_REQUEST field. The requested hardware must specify a size and num_nodes. For size, use client.deployments.HardwareRequestSizes.Small or client.deployments.HardwareRequestSizes.Medium; num_nodes is a number.

Currently only online deployments are supported, so the ONLINE field is required. At this stage you can optionally overwrite model parameters by passing a dictionary with the new parameter values in the FOUNDATION_MODEL field.

In addition to the deployment metadata, the ID of the stored model asset is required to create the deployment.

meta_props = {
    client.deployments.ConfigurationMetaNames.NAME: "custom_fm_deployment",
    client.deployments.ConfigurationMetaNames.DESCRIPTION: "Testing deployment using custom foundation model",
    client.deployments.ConfigurationMetaNames.ONLINE: {},
    client.deployments.ConfigurationMetaNames.HARDWARE_REQUEST: {
        'size': client.deployments.HardwareRequestSizes.Small,
        # or 'size': client.deployments.HardwareRequestSizes.Medium
        'num_nodes': 1
    },
    # optionally overwrite model parameters here
    client.deployments.ConfigurationMetaNames.FOUNDATION_MODEL: {"max_input_tokens": 256},
    client.deployments.ConfigurationMetaNames.SERVING_NAME: "test_byom_fm_01"
}
deployment_details = client.deployments.create(model_asset_id, meta_props)
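The HARDWARE_REQUEST value used above can also be produced by a helper that checks num_nodes before the request is sent. This is a sketch under assumptions: `hardware_request` is a hypothetical function, and the requirement that num_nodes be a positive integer is an assumption, since the text only says it is a number.

```python
def hardware_request(size, num_nodes: int = 1) -> dict:
    """Build a HARDWARE_REQUEST value with the 'size' and 'num_nodes' keys.

    `size` is expected to be one of the client.deployments.HardwareRequestSizes
    members named in the text; it is passed through unchanged.
    """
    # bool is a subclass of int, so reject it explicitly
    if isinstance(num_nodes, bool) or not isinstance(num_nodes, int) or num_nodes < 1:
        raise ValueError("num_nodes must be a positive integer")
    return {"size": size, "num_nodes": num_nodes}
```

For example, `hardware_request(client.deployments.HardwareRequestSizes.Small)` yields a dictionary with the same shape as the one in the metadata above.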

Once the deployment creation process is done, client.deployments.create returns a dictionary with deployment details, which can be used to retrieve the ID of the deployment.

deployment_id = client.deployments.get_id(deployment_details)

All existing deployments in the working space or project scope can be listed with the list method:

client.deployments.list()

Working with deployments

Working with deployments of foundation models is described in the section Models/ModelInference for Deployments.
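For orientation, inference against the deployment created above might look like the following sketch. It assumes that `ModelInference` from `ibm_watsonx_ai.foundation_models` accepts the `deployment_id` and `api_client` keyword arguments and provides `generate_text`; check the referenced section for the authoritative interface. The function `generate_with_deployment` and its `inference_cls` parameter are hypothetical, the latter added only so the helper can be exercised without the SDK installed.

```python
def generate_with_deployment(client, deployment_id, prompt, inference_cls=None):
    """Send one prompt to a deployed custom foundation model."""
    if inference_cls is None:
        # Imported lazily so the helper can be tested without the SDK
        from ibm_watsonx_ai.foundation_models import ModelInference as inference_cls
    # Bind the inference object to the existing deployment
    model = inference_cls(deployment_id=deployment_id, api_client=client)
    return model.generate_text(prompt=prompt)
```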

IBM watsonx.ai software with IBM Cloud Pak® for Data

Note

Available in version IBM watsonx.ai for IBM Cloud Pak® for Data 4.8.4 and later.

This section shows how to list custom model specifications, store and deploy a model, and use the ModelInference module with the created deployment on IBM watsonx.ai for IBM Cloud Pak® for Data.

Initialize APIClient object

Initialize the APIClient object if needed. More details about supported APIClient initialization can be found in the Setup section.

from ibm_watsonx_ai import APIClient

client = APIClient(credentials)
client.set.default_project(project_id=project_id)
# or client.set.default_space(space_id=space_id)

Listing models specification

Warning

Only applicable for IBM watsonx.ai for IBM Cloud Pak® for Data 4.8.4 and later.

Warning

The model needs to be explicitly stored and deployed in the repository to be used or listed.

To list the custom models available on the PVC, use the example below. To get the specification of a specific model, provide its model_id.

from ibm_watsonx_ai.foundation_models import get_custom_model_specs

get_custom_model_specs(api_client=client)
# OR
get_custom_model_specs(credentials=credentials)
# OR
get_custom_model_specs(api_client=client, model_id='mistralai/Mistral-7B-Instruct-v0.2')
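When only the model identifiers are of interest, the returned listing can be reduced to a list of IDs. The sketch below rests on an assumption about the response shape: a dictionary with a "resources" list whose entries carry a "model_id" key. That shape is not confirmed by this page, so verify it against your SDK version. The helper `custom_model_ids` is hypothetical.

```python
def custom_model_ids(specs) -> list:
    """Return the model IDs found in a listing of custom model specs.

    Assumption: the listing is a dict with a "resources" list whose
    entries carry a "model_id" key.
    """
    resources = specs.get("resources", []) if isinstance(specs, dict) else []
    # Skip entries that are not dicts or lack the expected key
    return [item["model_id"] for item in resources
            if isinstance(item, dict) and "model_id" in item]
```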

Storing model in service repository

To store a model as an asset in the repository, first create the proper metadata.

sw_spec_id = client.software_specifications.get_id_by_name('watsonx-cfm-caikit-1.0')

metadata = {
    client.repository.ModelMetaNames.NAME: 'custom FM asset',
    client.repository.ModelMetaNames.SOFTWARE_SPEC_ID: sw_spec_id,
    client.repository.ModelMetaNames.TYPE: client.repository.ModelAssetTypes.CUSTOM_FOUNDATION_MODEL_1_0
}

Storing model

After that, store the model by using client.repository.store_model().

stored_model_details = client.repository.store_model(model='mistralai/Mistral-7B-Instruct-v0.2', meta_props=metadata)

To get the ID of the stored asset, use the details obtained.

model_asset_id = client.repository.get_model_id(stored_model_details)

All stored custom foundation models can be listed with the client.repository.list() method by filtering on framework type.

client.repository.list(framework_filter='custom_foundation_model_1.0')
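The metadata, store, and ID lookup steps above can be wrapped into one function. This is a sketch that only chains the calls already shown in this section; `store_custom_model` is a hypothetical helper, and its default asset name and software specification name are taken from the examples above.

```python
def store_custom_model(client, model_name, asset_name="custom FM asset",
                       sw_spec_name="watsonx-cfm-caikit-1.0"):
    """Store a custom foundation model and return the ID of the new asset."""
    names = client.repository.ModelMetaNames
    metadata = {
        names.NAME: asset_name,
        # Resolve the software specification name to its ID
        names.SOFTWARE_SPEC_ID: client.software_specifications.get_id_by_name(sw_spec_name),
        names.TYPE: client.repository.ModelAssetTypes.CUSTOM_FOUNDATION_MODEL_1_0,
    }
    details = client.repository.store_model(model=model_name, meta_props=metadata)
    return client.repository.get_model_id(details)
```

A call such as `store_custom_model(client, 'mistralai/Mistral-7B-Instruct-v0.2')` would then return the asset ID needed for deployment.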

Defining hardware specification

To deploy a stored custom foundation model, a hardware specification needs to be defined. You can use a custom hardware specification or one of the pre-defined T-shirt sizes. APIClient has a dedicated module for working with hardware specifications. A few key methods are:

  • List all defined hardware specifications:

client.hardware_specifications.list()

  • Retrieve details of defined hardware specifications:

client.hardware_specifications.get_details(client.hardware_specifications.get_id_by_name('M'))

  • Define custom hardware specification:

meta_props = {
    client.hardware_specifications.ConfigurationMetaNames.NAME: "Custom GPU hw spec",
    client.hardware_specifications.ConfigurationMetaNames.NODES: {"cpu": {"units": "2"}, "mem": {"size": "128Gi"}, "gpu": {"num_gpu": 1}},
}

hw_spec_details = client.hardware_specifications.store(meta_props)
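The nested NODES value in the custom specification above is easy to mistype, so it can be generated from three numbers instead. This is a sketch: `gpu_nodes` is a hypothetical helper that mirrors the shape of the example (string CPU units, memory with a Gi suffix, integer GPU count), and the non-negative integer check is an assumption.

```python
def gpu_nodes(cpu_units: int, mem_gib: int, num_gpu: int) -> dict:
    """Build a NODES value shaped like the example above."""
    for label, value in (("cpu_units", cpu_units), ("mem_gib", mem_gib), ("num_gpu", num_gpu)):
        # bool is a subclass of int, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{label} must be a non-negative integer")
    return {
        "cpu": {"units": str(cpu_units)},  # units is a string in the example
        "mem": {"size": f"{mem_gib}Gi"},   # memory size uses the Gi suffix
        "gpu": {"num_gpu": num_gpu},       # GPU count stays an integer
    }
```

Calling `gpu_nodes(2, 128, 1)` reproduces the NODES dictionary from the example.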

Deployment of custom foundation model

To create a new deployment of a custom foundation model, define a dictionary with the deployment metadata. It can specify the NAME of the new deployment, a DESCRIPTION, and the hardware specification. Currently only online deployments are supported, so the ONLINE field is required. At this stage you can optionally overwrite model parameters by passing a dictionary with the new parameter values in the FOUNDATION_MODEL field.

In addition to the deployment metadata, the ID of the stored model asset is required to create the deployment.

metadata = {
    client.deployments.ConfigurationMetaNames.NAME: "Custom FM Deployment",
    client.deployments.ConfigurationMetaNames.DESCRIPTION: "Deployment of custom foundation model with SDK",
    client.deployments.ConfigurationMetaNames.ONLINE: {},
    client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: {"name": "Custom GPU hw spec"},  # name or id supported here
    client.deployments.ConfigurationMetaNames.FOUNDATION_MODEL: {"max_new_tokens": 128},  # optional
}
deployment_details = client.deployments.create(model_asset_id, metadata)
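Since the hardware specification can be referenced by name or by id, and the description and parameter overrides are optional, the metadata can be assembled conditionally. The sketch below uses two hypothetical helpers, `hardware_spec_ref` and `deployment_metadata`; the `"id"` key for referencing a specification by id is an assumption based on the inline comment in the example.

```python
def hardware_spec_ref(name=None, spec_id=None) -> dict:
    """Return a HARDWARE_SPEC value that refers to a spec by name or by id."""
    if (name is None) == (spec_id is None):
        raise ValueError("provide exactly one of name or spec_id")
    return {"name": name} if name is not None else {"id": spec_id}


def deployment_metadata(client, name, hardware_spec, description=None, overrides=None):
    """Assemble deployment metadata, adding optional fields only when given."""
    keys = client.deployments.ConfigurationMetaNames
    # ONLINE is always set because only online deployments are supported
    meta = {keys.NAME: name, keys.ONLINE: {}, keys.HARDWARE_SPEC: hardware_spec}
    if description:
        meta[keys.DESCRIPTION] = description
    if overrides:
        # Copy so later changes by the caller do not alter the metadata
        meta[keys.FOUNDATION_MODEL] = dict(overrides)
    return meta
```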

Once the deployment creation process is done, client.deployments.create returns a dictionary with deployment details, which can be used to retrieve the ID of the deployment.

deployment_id = client.deployments.get_id(deployment_details)

All existing deployments in the working space or project scope can be listed with the list method:

client.deployments.list()

Working with deployments

Working with deployments of foundation models is described in the section Models/ModelInference for Deployments.