Custom models¶
The steps for working with custom models in the watsonx.ai client might differ depending on the product offering. Go to the section for your product to see the steps for model deployment.
IBM watsonx.ai for IBM Cloud¶
This section shows how to create task credentials, store and deploy a model, and use the ModelInference module with the created deployment on IBM watsonx.ai for IBM Cloud.
Initialize an APIClient object¶
Initialize an APIClient object if needed. For more details about supported APIClient initialization, see Setup.
from ibm_watsonx_ai import APIClient
client = APIClient(credentials)
client.set.default_project(project_id=project_id)
# or client.set.default_space(space_id=space_id)
Add Task Credentials¶
Warning
If not already added, Task Credentials are required on IBM watsonx.ai for IBM Cloud to make a deployment.
With task credentials, you can deploy a custom foundation model and avoid token expiration issues. For more details, see Adding task credentials.
To list available task credentials, use the list method:
client.task_credentials.list()
If the list is empty, you can create new task credentials with the store method:
client.task_credentials.store()
To get the status of available task credentials, use the get_details method:
client.task_credentials.get_details()
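The three calls above can be combined into a small convenience helper that creates task credentials only when none exist yet. This is a sketch, not an SDK API; the stub object below only imitates the client.task_credentials interface for illustration, and the details layout it returns is an assumption.

```python
# Sketch: create task credentials only if none exist yet.
# `ensure_task_credentials` is a hypothetical helper, not part of the
# SDK; any object exposing store/get_details in the same shape as
# client.task_credentials would work here.

def ensure_task_credentials(task_credentials):
    """Store new task credentials when the current list is empty."""
    details = task_credentials.get_details()
    if not details.get("credentials"):
        task_credentials.store()
    return task_credentials.get_details()


class StubTaskCredentials:
    """Stand-in imitating client.task_credentials for the example."""
    def __init__(self):
        self._creds = []

    def get_details(self):
        return {"credentials": list(self._creds)}

    def store(self):
        self._creds.append({"id": "task-cred-1"})


stub = StubTaskCredentials()
details = ensure_task_credentials(stub)
print(len(details["credentials"]))  # 1
```

Calling the helper a second time leaves the stored credentials unchanged, since the list is no longer empty.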
Store the model in the service repository¶
To store a model as an asset in the repository, you must first create proper metadata.
For Cloud, you need to have an active connection to Cloud Object Storage. To see how to create a COS connection, refer to Connection Asset.
Get the parameters for the MODEL_LOCATION field below from the COS connection.
metadata = {
    client.repository.ModelMetaNames.NAME: "custom FM asset",
    client.repository.ModelMetaNames.SOFTWARE_SPEC_ID: sw_spec_id,
    client.repository.ModelMetaNames.TYPE: client.repository.ModelAssetTypes.CUSTOM_FOUNDATION_MODEL_1_0,
    client.repository.ModelMetaNames.MODEL_LOCATION: {
        "file_path": "path/to/pvc",
        "bucket": "watsonx-llm-models",
        "connection_id": "5e891c6b-3aa9-4f01-8e2d-785d81797466",
    },
}
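The bucket and connection_id values for MODEL_LOCATION come from the Cloud Object Storage connection asset. Connection details are returned as plain dictionaries, so a lookup along these lines can assemble the field; the details layout shown here is an assumed example rather than a guaranteed SDK response shape, and file_path still has to point at the model files.

```python
# Sketch: build the MODEL_LOCATION dictionary from COS connection
# details. The `connection_details` layout below is an assumed
# example, not a guaranteed SDK response shape.
connection_details = {
    "metadata": {"asset_id": "5e891c6b-3aa9-4f01-8e2d-785d81797466"},
    "entity": {"properties": {"bucket": "watsonx-llm-models"}},
}

model_location = {
    "file_path": "path/to/pvc",  # path to the model files in the bucket
    "bucket": connection_details["entity"]["properties"]["bucket"],
    "connection_id": connection_details["metadata"]["asset_id"],
}
print(model_location["bucket"])  # watsonx-llm-models
```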
Store the model¶
After creating the proper metadata, you can store the model using client.repository.store_model().
stored_model_details = client.repository.store_model(model='Google/flan-t5-small', meta_props=metadata)
To get the id of a stored asset, use the obtained details.
model_asset_id = client.repository.get_model_id(stored_model_details)
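Under the hood, get_model_id reads the asset id out of the details dictionary. Assuming the typical metadata/id layout of repository responses (an assumption for illustration), the lookup is equivalent to this sketch:

```python
# Sketch: the asset id lives in the metadata section of the details
# dictionary. The sample layout below is an assumption based on the
# typical shape of repository responses.
sample_details = {
    "metadata": {"id": "a1b2c3d4-0000-1111-2222-333344445555",
                 "name": "custom FM asset"},
    "entity": {"type": "custom_foundation_model_1.0"},
}

asset_id = sample_details["metadata"]["id"]
print(asset_id)  # a1b2c3d4-0000-1111-2222-333344445555
```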
List all stored custom foundation models and filter them by framework type.
client.repository.list(framework_filter='custom_foundation_model_1.0')
Deploy the custom foundation model¶
To create a new deployment of a custom foundation model, you need to define a dictionary with deployment metadata.
Specify the deployment NAME, DESCRIPTION, and HARDWARE_REQUEST fields, and the size and num_nodes of the requested hardware.
For size, enter client.deployments.HardwareRequestSizes.Small, client.deployments.HardwareRequestSizes.Medium, or client.deployments.HardwareRequestSizes.Large.
For the num_nodes field, provide a number.
Only online deployments are supported, so the ONLINE field is required.
Optional: At this stage, you can overwrite model parameters. To overwrite them, pass a dictionary with new parameter values in the FOUNDATION_MODEL field.
You also need to provide the id of the stored model asset to create the deployment.
meta_props = {
    client.deployments.ConfigurationMetaNames.NAME: "custom_fm_deployment",
    client.deployments.ConfigurationMetaNames.DESCRIPTION: "Testing deployment using custom foundation model",
    client.deployments.ConfigurationMetaNames.ONLINE: {},
    client.deployments.ConfigurationMetaNames.HARDWARE_REQUEST: {
        'size': client.deployments.HardwareRequestSizes.Small,
        # or 'size': client.deployments.HardwareRequestSizes.Medium
        # or 'size': client.deployments.HardwareRequestSizes.Large
        'num_nodes': 1,
    },
    # optionally overwrite model parameters here
    client.deployments.ConfigurationMetaNames.FOUNDATION_MODEL: {"max_input_tokens": 256},
    client.deployments.ConfigurationMetaNames.SERVING_NAME: "test_byom_fm_01",
}
deployment_details = client.deployments.create(model_asset_id, meta_props)
Once the deployment creation process is done, client.deployments.create returns a dictionary with the deployment details, which can be used to retrieve the deployment id.
deployment_id = client.deployments.get_id(deployment_details)
You can list all existing deployments in the working space or project scope with the list method:
client.deployments.list()
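When you need a deployment id by name rather than from the creation response, a small filter over the deployment details can help. find_deployment_id is a hypothetical helper, and the resources layout below is assumed for illustration; the actual response shape may differ.

```python
# Hypothetical helper: find a deployment id by name in a details
# dictionary with the assumed {"resources": [...]} layout.
def find_deployment_id(deployments_details, name):
    for resource in deployments_details.get("resources", []):
        if resource["metadata"]["name"] == name:
            return resource["metadata"]["id"]
    return None


sample_details = {
    "resources": [
        {"metadata": {"name": "custom_fm_deployment", "id": "dep-123"}},
        {"metadata": {"name": "other_deployment", "id": "dep-456"}},
    ]
}
print(find_deployment_id(sample_details, "custom_fm_deployment"))  # dep-123
```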
Work with deployments¶
For information on working with foundation model deployments, see Models/ ModelInference for Deployments.
IBM watsonx.ai software with IBM Cloud Pak® for Data¶
Note
Available in IBM watsonx.ai for IBM Cloud Pak® for Data 4.8.4 and later.
This section shows how to list custom models specifications, store and deploy a model, and use the ModelInference module with the created deployment on IBM watsonx.ai for IBM Cloud Pak® for Data.
Initialize the APIClient object¶
Initialize the APIClient object if needed. For information about supported APIClient initialization, see Setup.
from ibm_watsonx_ai import APIClient
client = APIClient(credentials)
client.set.default_project(project_id=project_id)
# or client.set.default_space(space_id=space_id)
List model specifications¶
Warning
Only applicable for IBM watsonx.ai for IBM Cloud Pak® for Data 4.8.4 and later.
Warning
The model must be explicitly stored and deployed in the repository before it can be listed or used.
To list available custom models on PVC, use the code block below. To get the specification of a specific model, provide the model_id.
from ibm_watsonx_ai.foundation_models import get_custom_model_specs
get_custom_model_specs(api_client=client)
# OR
get_custom_model_specs(credentials=credentials)
# OR
get_custom_model_specs(api_client=client, model_id='mistralai/Mistral-7B-Instruct-v0.2')
Store the model in the service repository¶
To store a model as an asset in the repository, you must first create proper metadata.
Note
There are two distinct software specifications available for custom models, each optimized for a specific model architecture. These specifications ensure the most effective deployment and utilization of the model’s capabilities:
watsonx-cfm-caikit-1.0 - recommended for models with t5 or mt5 architecture,
watsonx-cfm-caikit-1.1 - recommended for all other model architectures.
sw_spec_id = client.software_specifications.get_id_by_name('watsonx-cfm-caikit-1.0')
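Since the choice between the two specifications depends only on the model architecture, it can be captured in a small helper. recommended_software_spec is a hypothetical convenience, not an SDK function:

```python
# Hypothetical helper: map a model architecture to the recommended
# custom-model software specification name.
def recommended_software_spec(architecture: str) -> str:
    if architecture in ("t5", "mt5"):
        return "watsonx-cfm-caikit-1.0"
    return "watsonx-cfm-caikit-1.1"


print(recommended_software_spec("t5"))       # watsonx-cfm-caikit-1.0
print(recommended_software_spec("mistral"))  # watsonx-cfm-caikit-1.1
```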
metadata = {
    client.repository.ModelMetaNames.NAME: 'custom FM asset',
    client.repository.ModelMetaNames.SOFTWARE_SPEC_ID: sw_spec_id,
    client.repository.ModelMetaNames.TYPE: client.repository.ModelAssetTypes.CUSTOM_FOUNDATION_MODEL_1_0,
}
Store the model¶
After creating the proper metadata, you can store the model using client.repository.store_model().
stored_model_details = client.repository.store_model(model='mistralai/Mistral-7B-Instruct-v0.2', meta_props=metadata)
To get the id of the stored asset, use the obtained details.
model_asset_id = client.repository.get_model_id(stored_model_details)
List all stored custom foundation models and filter them by framework type.
client.repository.list(framework_filter='custom_foundation_model_1.0')
Define the hardware specification¶
To deploy a stored custom foundation model, you need to define hardware specifications.
You can use a custom hardware specification or pre-defined T-shirt sizes.
APIClient has a dedicated module to work with hardware specifications. A few key methods are:
List all defined hardware specifications:
client.hardware_specifications.list()
Retrieve the details of defined hardware specifications:
client.hardware_specifications.get_details(client.hardware_specifications.get_id_by_name('M'))
Define custom hardware specifications:
meta_props = {
    client.hardware_specifications.ConfigurationMetaNames.NAME: "Custom GPU hw spec",
    client.hardware_specifications.ConfigurationMetaNames.NODES: {"cpu": {"units": "2"}, "mem": {"size": "128Gi"}, "gpu": {"num_gpu": 1}},
}
hw_spec_details = client.hardware_specifications.store(meta_props)
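The NODES dictionary follows a fixed shape (CPU units, memory size, GPU count), so assembling it with a small helper keeps the units consistent. make_nodes is a hypothetical convenience, not part of the SDK:

```python
# Hypothetical helper: assemble the NODES section of a custom
# hardware specification from plain values. CPU units and memory
# size are strings in the expected format; GPU count is an integer.
def make_nodes(cpu_units: int, mem_gi: int, num_gpu: int) -> dict:
    return {
        "cpu": {"units": str(cpu_units)},
        "mem": {"size": f"{mem_gi}Gi"},
        "gpu": {"num_gpu": num_gpu},
    }


print(make_nodes(2, 128, 1))
```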
Deploy the custom foundation model¶
To create a new deployment of a custom foundation model, you need to define a dictionary with deployment metadata.
Specify the deployment NAME, DESCRIPTION, and hardware specification.
Only online deployments are supported, so the ONLINE field is required.
Optional: At this stage, you can overwrite model parameters. To overwrite them, pass a dictionary with new parameter values in the FOUNDATION_MODEL field.
You also need to provide the id of the stored model asset to create the deployment.
metadata = {
    client.deployments.ConfigurationMetaNames.NAME: "Custom FM Deployment",
    client.deployments.ConfigurationMetaNames.DESCRIPTION: "Deployment of custom foundation model with SDK",
    client.deployments.ConfigurationMetaNames.ONLINE: {},
    client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: {"name": "Custom GPU hw spec"},  # name or id supported here
    client.deployments.ConfigurationMetaNames.FOUNDATION_MODEL: {"max_new_tokens": 128},  # optional
}
deployment_details = client.deployments.create(model_asset_id, metadata)
Once the deployment creation process is done, client.deployments.create returns a dictionary with deployment details, which can be used to retrieve the deployment id.
deployment_id = client.deployments.get_id(deployment_details)
You can list all existing deployments in the working space or project scope with the list method:
client.deployments.list()
Work with deployments¶
For information on working with foundation model deployments, see Models/ ModelInference for Deployments.