Custom models

Note

Available in IBM watsonx.ai software with IBM Cloud Pak for Data version 4.8.4 and later.

This section shows how to list custom model specifications, store and deploy a model, and use the ModelInference module with the created deployment.

Initialize APIClient object

Initialize the APIClient object if needed. For details about supported APIClient initialization, see the Setup section.

from ibm_watsonx_ai import APIClient

client = APIClient(credentials)
client.set.default_project(project_id=project_id)
# or client.set.default_space(space_id=space_id)

Listing model specifications

Warning

The model must be explicitly stored in the repository and deployed before it can be used or listed.

To list the custom models available on the PVC (persistent volume claim), use the example below. To get the specification of a specific model, provide its model_id.

from ibm_watsonx_ai.foundation_models import get_custom_model_specs

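# List all available custom models using an existing client
get_custom_model_specs(api_client=client)
# OR pass credentials directly
get_custom_model_specs(credentials=credentials)
# OR get the specification of a single model
get_custom_model_specs(api_client=client, model_id='mistralai/Mistral-7B-Instruct-v0.2')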
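
When only the model identifiers are needed, the listing result can be reduced to a plain list. The sketch below assumes the returned dictionary holds its entries under a "resources" key and that each entry carries a "model_id" field. That shape is an assumption, not something this section confirms, and list_custom_model_ids is a hypothetical helper, not part of the SDK.

```python
def list_custom_model_ids(specs):
    """Return the model IDs found in a custom model specs listing.

    Assumption: `specs` looks like {"resources": [{"model_id": "..."}, ...]}.
    Entries without a "model_id" key are skipped rather than raising.
    """
    resources = specs.get("resources", [])
    return [entry["model_id"] for entry in resources if "model_id" in entry]


# Hand-written sample shaped like the assumed response (not real output).
sample_specs = {
    "resources": [
        {"model_id": "mistralai/Mistral-7B-Instruct-v0.2"},
        {"note": "entry without a model_id is ignored"},
    ]
}
model_ids = list_custom_model_ids(sample_specs)
```

With a live client, the argument would be the value returned by get_custom_model_specs(api_client=client).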

Storing model in service repository

To store a model as an asset in the repository, first create the required metadata.

sw_spec_id = client.software_specifications.get_id_by_name('watsonx-cfm-caikit-1.0')

metadata = {
    client.repository.ModelMetaNames.NAME: 'custom FM asset',
    client.repository.ModelMetaNames.SOFTWARE_SPEC_UID: sw_spec_id,
    client.repository.ModelMetaNames.TYPE: client.repository.ModelAssetTypes.CUSTOM_FOUNDATION_MODEL_1_0
}

After that, store the model using client.repository.store_model().

stored_model_details = client.repository.store_model(model='mistralai/Mistral-7B-Instruct-v0.2', meta_props=metadata)

To get the ID of the stored asset, use the returned details.

model_asset_id = client.repository.get_model_id(stored_model_details)

All stored custom foundation models can be listed with the client.repository.list() method, filtered by framework type.

client.repository.list(framework_filter='custom_foundation_model_1.0')

Defining hardware specification

To deploy a stored custom foundation model, a hardware specification must be defined. You can use a custom hardware specification or one of the predefined T-shirt sizes. APIClient has a dedicated module for working with hardware specifications. A few key methods are:

  • List all defined hardware specifications:

client.hardware_specifications.list()

  • Retrieve the details of a defined hardware specification:

client.hardware_specifications.get_details(client.hardware_specifications.get_id_by_name('M'))

  • Define a custom hardware specification:

meta_props = {
    client.hardware_specifications.ConfigurationMetaNames.NAME: "Custom GPU hw spec",
    client.hardware_specifications.ConfigurationMetaNames.NODES: {
        "cpu": {"units": "2"},
        "mem": {"size": "128Gi"},
        "gpu": {"num_gpu": 1},
    },
}

hw_spec_details = client.hardware_specifications.store(meta_props)
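
The nested NODES value is easy to mistype, so it can help to build it from plain arguments. The following is a minimal sketch that reproduces the dictionary shape used in the example above. build_nodes_config is a hypothetical helper, not an SDK function, and leaving out the "gpu" entry when no GPU is requested is an assumption about what the service accepts.

```python
def build_nodes_config(cpu_units, memory_size, num_gpu=0):
    """Build a NODES dictionary for a custom hardware specification.

    cpu_units is stored as a string and memory_size is passed through
    unchanged (for example "128Gi"), mirroring the example above.
    """
    if num_gpu < 0:
        raise ValueError("num_gpu must not be negative")

    nodes = {
        "cpu": {"units": str(cpu_units)},
        "mem": {"size": memory_size},
    }
    # Assumption: the "gpu" entry may be omitted when no GPU is requested.
    if num_gpu > 0:
        nodes["gpu"] = {"num_gpu": num_gpu}
    return nodes


nodes = build_nodes_config(cpu_units=2, memory_size="128Gi", num_gpu=1)
```

The result can be used as the value for client.hardware_specifications.ConfigurationMetaNames.NODES.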

Deployment of custom foundation model

To create a new deployment of a custom foundation model, define a dictionary with the deployment metadata. It can specify the NAME of the new deployment, a DESCRIPTION, and the hardware specification (HARDWARE_SPEC). Currently only online deployments are supported, so the ONLINE field is required. At this stage you can optionally overwrite model parameters by passing a dictionary with the new parameter values in the FOUNDATION_MODEL field.

Besides the metadata with the deployment configuration, the ID of the stored model asset is required to create the deployment.

metadata = {
    client.deployments.ConfigurationMetaNames.NAME: "Custom FM Deployment",
    client.deployments.ConfigurationMetaNames.DESCRIPTION: "Deployment of custom foundation model with SDK",
    client.deployments.ConfigurationMetaNames.ONLINE: {},
    client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: {"name": "Custom GPU hw spec"},  # name or id supported here
    client.deployments.ConfigurationMetaNames.FOUNDATION_MODEL: {"max_new_tokens": 128},  # optional
}
deployment_details = client.deployments.create(model_asset_id, metadata)
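
Because DESCRIPTION and FOUNDATION_MODEL are optional while ONLINE is required, the metadata dictionary can be assembled by a small function that adds only the optional fields that were supplied. This is a sketch under stated assumptions: build_deployment_metadata is a hypothetical helper, and it uses only the ConfigurationMetaNames attributes that appear in the example above.

```python
def build_deployment_metadata(client, name, hardware_spec_name,
                              description=None, model_parameters=None):
    """Assemble deployment metadata for a custom foundation model.

    `client` is an initialized APIClient. The hardware specification is
    referenced by name here; the example above notes that an id is also
    supported.
    """
    names = client.deployments.ConfigurationMetaNames
    metadata = {
        names.NAME: name,
        names.ONLINE: {},  # required: only online deployments are supported
        names.HARDWARE_SPEC: {"name": hardware_spec_name},
    }
    if description is not None:
        metadata[names.DESCRIPTION] = description
    if model_parameters:
        # Optional overrides of model parameters, e.g. {"max_new_tokens": 128}
        metadata[names.FOUNDATION_MODEL] = dict(model_parameters)
    return metadata
```

A call such as build_deployment_metadata(client, "Custom FM Deployment", "Custom GPU hw spec", model_parameters={"max_new_tokens": 128}) yields a dictionary that can be passed to client.deployments.create together with the model asset ID.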

Once the deployment creation process is done, client.deployments.create returns a dictionary with the deployment details, which can be used to retrieve the ID of the deployment.

deployment_id = client.deployments.get_id(deployment_details)

All deployments existing in the working space or project scope can be listed with the list method:

client.deployments.list()

Working with deployments

Working with deployments of foundation models is described in the section Models / ModelInference for Deployments.
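
As a brief illustration of how the deployment ID obtained earlier might be used, the sketch below wraps ModelInference in a small function. generate_from_deployment is a hypothetical helper name, and it is assumed here that ModelInference accepts deployment_id and api_client keyword arguments and offers a generate_text method. The referenced section remains the authoritative description.

```python
def generate_from_deployment(client, deployment_id, prompt):
    """Send a single prompt to an existing custom foundation model deployment.

    `client` is an initialized APIClient and `deployment_id` is the value
    returned by client.deployments.get_id(deployment_details).
    """
    # Imported inside the function so that defining this helper does not
    # itself require the SDK to be importable.
    from ibm_watsonx_ai.foundation_models import ModelInference

    model = ModelInference(deployment_id=deployment_id, api_client=client)
    return model.generate_text(prompt=prompt)
```

For example, generate_from_deployment(client, deployment_id, "What is 1 + 1?") would return the generated text for that prompt.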