.. _fm_custom_models:

Custom models
=============

Custom model support in the `watsonx.ai` client might differ depending on the product offering. Go to your product section to see the steps for model deployment.

- `IBM watsonx.ai for IBM Cloud <#id1>`_
- `IBM watsonx.ai software with IBM Cloud Pak for Data <#id2>`_

IBM watsonx.ai for IBM Cloud
----------------------------

This section shows how to create task credentials, store and deploy a model, and use the ModelInference module with the created deployment on IBM watsonx.ai for IBM Cloud.

Initialize an APIClient object
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Initialize an ``APIClient`` object if needed. For more details about supported ``APIClient`` initialization, see :doc:`setup`.

.. code-block:: python

    from ibm_watsonx_ai import APIClient

    client = APIClient(credentials)
    client.set.default_project(project_id=project_id)
    # or
    client.set.default_space(space_id=space_id)

Add task credentials
^^^^^^^^^^^^^^^^^^^^

.. warning::

    If not already added, task credentials are required on IBM watsonx.ai for IBM Cloud to create a deployment. With task credentials, you can deploy a custom foundation model and avoid token expiration issues. For more details, see `Adding task credentials `_.

To list available task credentials, use the ``list`` method:

.. code-block:: python

    client.task_credentials.list()

If the list is empty, you can create new task credentials with the ``store`` method:

.. code-block:: python

    client.task_credentials.store()

To get the status of available task credentials, use the ``get_details`` method:

.. code-block:: python

    client.task_credentials.get_details()

Store the model in the service repository
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To store a model as an asset in the repository, you must first create proper ``metadata``. For IBM Cloud, you need an active connection to Cloud Object Storage. To see how to create a COS connection, refer to :ref:`working-with-connection-asset`.
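The ``metadata`` in the next code block references a software specification id, ``sw_spec_id``. As a minimal sketch (the helper name is hypothetical, and the specification name shown is only an example; check which specifications your environment provides), you can resolve it by name:

.. code-block:: python

    # Sketch: resolve a software-specification id by name.
    # Assumes `client` is an initialized APIClient; the default name is an example.
    def resolve_sw_spec_id(client, name="watsonx-cfm-caikit-1.1"):
        """Return the id of the software specification with the given name."""
        return client.software_specifications.get_id_by_name(name)

    # Usage (requires a live client):
    # sw_spec_id = resolve_sw_spec_id(client)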
Get the parameters for the ``MODEL_LOCATION`` field below from the COS connection.

.. code-block:: python

    metadata = {
        client.repository.ModelMetaNames.NAME: "custom FM asset",
        client.repository.ModelMetaNames.SOFTWARE_SPEC_ID: sw_spec_id,
        client.repository.ModelMetaNames.TYPE: client.repository.ModelAssetTypes.CUSTOM_FOUNDATION_MODEL_1_0,
        client.repository.ModelMetaNames.MODEL_LOCATION: {
            "file_path": "path/to/custom/model/files/inside/bucket",
            "bucket": "watsonx-llm-models",
            "connection_id": "5e891c6b-3aa9-4f01-8e2d-785d81797466",
        },
    }

Store the model
^^^^^^^^^^^^^^^

After creating the proper ``metadata``, you can store the model using ``client.repository.store_model()``.

.. code-block:: python

    stored_model_details = client.repository.store_model(model='Google/flan-t5-small', meta_props=metadata)

To get the ``id`` of the stored asset, use the obtained details.

.. code-block:: python

    stored_model_asset_id = client.repository.get_model_id(stored_model_details)

You can list all stored custom foundation models and filter them by framework type.

.. code-block:: python

    client.repository.list(framework_filter='custom_foundation_model_1.0')

Deploy the custom foundation model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To create a new deployment of a custom foundation model, define a dictionary with deployment ``metadata``. Specify the deployment ``NAME``, ``DESCRIPTION``, and ``HARDWARE_SPEC`` fields, including the ``name`` and ``num_nodes`` of the requested hardware. For the ``name`` field, enter a GPU ID available in your cluster. For more information about listing available GPU configurations, see https://cloud.ibm.com/apidocs/watsonx-ai#mgerrorresponse-yaml. For the ``num_nodes`` field, provide a number. Only online deployments are supported, so the ``ONLINE`` field is required.

Optional: At this stage, you can overwrite model parameters. To overwrite them, pass a dictionary with the new parameter values in the ``FOUNDATION_MODEL`` field.
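For illustration, such an override dictionary might look like this (the key shown is taken from this guide; the full set of supported parameters depends on the model):

.. code-block:: python

    # Illustrative model-parameter overrides; supported keys depend on the model.
    fm_overrides = {"max_input_tokens": 256}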
You also need to provide the ``id`` of the stored model asset to create the deployment.

.. code-block:: python

    meta_props = {
        client.deployments.ConfigurationMetaNames.NAME: "custom_fm_deployment",
        client.deployments.ConfigurationMetaNames.DESCRIPTION: "Testing deployment using custom foundation model",
        client.deployments.ConfigurationMetaNames.ONLINE: {},
        client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: {
            "name": "1a100-80g",
            "num_nodes": 1
        },
        # optionally overwrite model parameters here
        client.deployments.ConfigurationMetaNames.FOUNDATION_MODEL: {"max_input_tokens": 256},
        client.deployments.ConfigurationMetaNames.SERVING_NAME: "test_byom_fm_01"
    }

    deployment_details = client.deployments.create(stored_model_asset_id, meta_props)

Once the deployment creation process is done, ``client.deployments.create`` returns a dictionary with the deployment details, which can be used to retrieve the deployment ``id``.

.. code-block:: python

    deployment_id = client.deployments.get_id(deployment_details)

You can list all existing deployments in the working space or project scope with the ``list`` method:

.. code-block:: python

    client.deployments.list()

Work with deployments
^^^^^^^^^^^^^^^^^^^^^

For information on working with foundation model deployments, see :doc:`Models/ ModelInference for Deployments`.

IBM watsonx.ai software with IBM Cloud Pak® for Data
-----------------------------------------------------

.. note::

    Available in IBM watsonx.ai for IBM Cloud Pak® for Data 5.0 and later.

This section shows how to list custom model specifications, store and deploy a model, and use the ModelInference module with the created deployment on IBM watsonx.ai for IBM Cloud Pak® for Data.

Initialize the APIClient object
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Initialize the ``APIClient`` object if needed. For information about supported ``APIClient`` initialization, see :doc:`setup`.

.. code-block:: python

    from ibm_watsonx_ai import APIClient

    client = APIClient(credentials)
    client.set.default_project(project_id=project_id)
    # or
    client.set.default_space(space_id=space_id)

List model specifications
^^^^^^^^^^^^^^^^^^^^^^^^^

.. warning::

    Only applicable for IBM watsonx.ai for IBM Cloud Pak® for Data 5.0 and later.

.. warning::

    The model needs to be explicitly stored and deployed in the repository to be used/listed.

To list available custom models on PVC, use the code block below. To get the specification of a specific model, provide the ``model_id``.

.. code-block:: python

    from ibm_watsonx_ai.foundation_models import get_custom_model_specs

    get_custom_model_specs(api_client=client)
    # OR
    get_custom_model_specs(credentials=credentials)
    # OR
    get_custom_model_specs(api_client=client, model_id='mistralai/Mistral-7B-Instruct-v0.2')

Store the model in the service repository
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To store a model as an asset in the repository, you must first create proper ``metadata``.

.. note::

    There are two distinct software specifications available for custom models, each optimized for a specific model architecture. These specifications ensure the most effective deployment and utilization of the model's capabilities:

    * ``watsonx-cfm-caikit-1.0`` - recommended for models with ``t5`` or ``mt5`` architecture,
    * ``watsonx-cfm-caikit-1.1`` - recommended for all other model architectures.

.. code-block:: python

    sw_spec_id = client.software_specifications.get_id_by_name('watsonx-cfm-caikit-1.0')

    metadata = {
        client.repository.ModelMetaNames.NAME: 'custom FM asset',
        client.repository.ModelMetaNames.SOFTWARE_SPEC_ID: sw_spec_id,
        client.repository.ModelMetaNames.TYPE: client.repository.ModelAssetTypes.CUSTOM_FOUNDATION_MODEL_1_0
    }

Store the model
^^^^^^^^^^^^^^^

After creating the proper ``metadata``, you can store the model using ``client.repository.store_model()``.

.. code-block:: python

    stored_model_details = client.repository.store_model(model='mistralai/Mistral-7B-Instruct-v0.2', meta_props=metadata)

To get the ``id`` of the stored asset, use the obtained details.

.. code-block:: python

    model_asset_id = client.repository.get_model_id(stored_model_details)

You can list all stored custom foundation models and filter them by framework type.

.. code-block:: python

    client.repository.list(framework_filter='custom_foundation_model_1.0')

Define the hardware specification
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To deploy a stored custom foundation model, you need to define a hardware specification. You can use a custom hardware specification or the predefined T-shirt sizes. ``APIClient`` has a dedicated module for working with :ref:`Hardware Specifications`. A few key methods are:

- List all defined hardware specifications:

  .. code-block:: python

      client.hardware_specifications.list()

- Retrieve the details of a defined hardware specification:

  .. code-block:: python

      client.hardware_specifications.get_details(client.hardware_specifications.get_id_by_name('M'))

- Define a custom hardware specification:

  .. code-block:: python

      meta_props = {
          client.hardware_specifications.ConfigurationMetaNames.NAME: "Custom GPU hw spec",
          client.hardware_specifications.ConfigurationMetaNames.NODES: {"cpu": {"units": "2"}, "mem": {"size": "128Gi"}, "gpu": {"num_gpu": 1}}
      }

      hw_spec_details = client.hardware_specifications.store(meta_props)

Deploy the custom foundation model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To create a new deployment of a custom foundation model, define a dictionary with deployment ``metadata``. Specify the deployment ``NAME``, ``DESCRIPTION``, and hardware specification. Only online deployments are supported, so the ``ONLINE`` field is required.

Optional: At this stage, you can overwrite model parameters. To overwrite them, pass a dictionary with the new parameter values in the ``FOUNDATION_MODEL`` field.
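The deployment metadata in the next code block references the hardware specification by name; its ``id`` is accepted as well. A minimal sketch (the helper name is hypothetical) for resolving the id of the custom specification stored earlier:

.. code-block:: python

    # Sketch: resolve the id of a stored hardware specification by its name.
    # Assumes `client` is an initialized APIClient and the spec exists in your scope.
    def resolve_hw_spec_id(client, name="Custom GPU hw spec"):
        """Return the id of the hardware specification with the given name."""
        return client.hardware_specifications.get_id_by_name(name)

    # Usage (requires a live client):
    # hw_spec_id = resolve_hw_spec_id(client)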
You also need to provide the ``id`` of the stored model asset to create the deployment.

.. code-block:: python

    metadata = {
        client.deployments.ConfigurationMetaNames.NAME: "Custom FM Deployment",
        client.deployments.ConfigurationMetaNames.DESCRIPTION: "Deployment of custom foundation model with SDK",
        client.deployments.ConfigurationMetaNames.ONLINE: {},
        client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: {"name": "Custom GPU hw spec"},  # name or id supported here
        client.deployments.ConfigurationMetaNames.FOUNDATION_MODEL: {"max_new_tokens": 128}  # optional
    }

    deployment_details = client.deployments.create(model_asset_id, metadata)

Once the deployment creation process is done, ``client.deployments.create`` returns a dictionary with the deployment details, which can be used to retrieve the deployment ``id``.

.. code-block:: python

    deployment_id = client.deployments.get_id(deployment_details)

You can list all existing deployments in the working space or project scope with the ``list`` method:

.. code-block:: python

    client.deployments.list()

Work with deployments
^^^^^^^^^^^^^^^^^^^^^

For information on working with foundation model deployments, see :doc:`Models/ ModelInference for Deployments`.
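As a minimal sketch of that workflow (the helper name is hypothetical; ``client`` and ``deployment_id`` come from the steps above), inference against the new deployment can look like this:

.. code-block:: python

    # Sketch: generate text with a custom foundation model deployment.
    # Assumes `client` is an initialized APIClient and `deployment_id`
    # identifies the deployment created above.
    def generate_with_deployment(client, deployment_id, prompt):
        """Bind a ModelInference object to the deployment and generate text."""
        from ibm_watsonx_ai.foundation_models import ModelInference

        deployed_model = ModelInference(deployment_id=deployment_id, api_client=client)
        return deployed_model.generate_text(prompt=prompt)

    # Usage (requires a live deployment):
    # print(generate_with_deployment(client, deployment_id, "What is a custom foundation model?"))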