Custom models
=============

The custom models support of the ``watsonx.ai`` client might differ depending on the product offering. Choose an option from the list below to see the steps.

- `IBM watsonx.ai for IBM Cloud <#id1>`_
- `IBM watsonx.ai software with IBM Cloud Pak for Data <#id2>`_

IBM watsonx.ai for IBM Cloud
----------------------------

This section shows how to create task credentials, store and deploy a model, and use the ModelInference module with the created deployment on IBM watsonx.ai for IBM Cloud.

Initialize APIClient object
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Initialize the ``APIClient`` object if needed. More details about supported ``APIClient`` initialization can be found in the :doc:`setup` section.

.. code-block:: python

    from ibm_watsonx_ai import APIClient

    client = APIClient(credentials)
    client.set.default_project(project_id=project_id)
    # or
    client.set.default_space(space_id=space_id)

Add Task Credentials
^^^^^^^^^^^^^^^^^^^^

.. warning::
    If not already added, Task Credentials are required on IBM watsonx.ai for IBM Cloud to make a deployment.

Task Credentials enable you to deploy a custom foundation model and avoid token expiration issues. More details can be found in `Adding task credentials `_.

To list available task credentials, use the ``list`` method:

.. code-block:: python

    client.task_credentials.list()

If the list is empty, you can create new task credentials by using the ``store`` method:

.. code-block:: python

    client.task_credentials.store()

To get the status of available task credentials, use the ``get_details`` method:

.. code-block:: python

    client.task_credentials.get_details()

Storing model in service repository
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To store a model as an asset in the repository, first create proper ``metadata``. In the Cloud scenario, an active connection to Cloud Object Storage is needed. For more information on how to create a COS connection, see :ref:`working-with-connection-asset`. From such a connection, you can obtain the parameters needed to fill the ``MODEL_LOCATION`` field.
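The values for ``MODEL_LOCATION`` (the bucket name and connection ``id``) come from that connection asset, for example via ``client.connections.get_details(connection_id)``. As a minimal sketch of assembling the field, the helper function and the ``bucket`` property key below are illustrative assumptions, not part of the SDK; check them against your connection's actual details:

```python
# hypothetical helper: assemble the MODEL_LOCATION metadata field from the
# properties of a COS connection asset (as returned, e.g., by
# client.connections.get_details(connection_id))
def build_model_location(connection_id, connection_properties, file_path):
    return {
        "file_path": file_path,                     # path to the model files in the bucket
        "bucket": connection_properties["bucket"],  # bucket name stored on the connection
        "connection_id": connection_id,
    }

# illustrative values only
location = build_model_location(
    connection_id="5e891c6b-3aa9-4f01-8e2d-785d81797466",
    connection_properties={"bucket": "watsonx-llm-models"},
    file_path="path/to/pvc",
)
```

The resulting dictionary can be passed directly as the ``MODEL_LOCATION`` value in the ``metadata`` below.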
.. code-block:: python

    sw_spec_id = client.software_specifications.get_id_by_name('watsonx-cfm-caikit-1.0')

    metadata = {
        client.repository.ModelMetaNames.NAME: "custom FM asset",
        client.repository.ModelMetaNames.SOFTWARE_SPEC_ID: sw_spec_id,
        client.repository.ModelMetaNames.TYPE: client.repository.ModelAssetTypes.CUSTOM_FOUNDATION_MODEL_1_0,
        client.repository.ModelMetaNames.MODEL_LOCATION: {
            "file_path": "path/to/pvc",
            "bucket": "watsonx-llm-models",
            "connection_id": "5e891c6b-3aa9-4f01-8e2d-785d81797466",
        },
    }

Storing model
^^^^^^^^^^^^^

After that, it is possible to store the model using ``client.repository.store_model()``.

.. code-block:: python

    stored_model_details = client.repository.store_model(model='google/flan-t5-small', meta_props=metadata)

To get the ``id`` of the stored asset, use the obtained details.

.. code-block:: python

    model_asset_id = client.repository.get_model_id(stored_model_details)

All stored custom foundation models can be listed with the ``client.repository.list()`` method, filtering by framework type.

.. code-block:: python

    client.repository.list(framework_filter='custom_foundation_model_1.0')

Deployment of custom foundation model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To create a new deployment of a custom foundation model, a dictionary with deployment ``metadata`` needs to be defined. You can specify the ``NAME`` of the new deployment, its ``DESCRIPTION``, and the ``HARDWARE_REQUEST`` field. The requested hardware must have a specific ``size`` and ``num_nodes``. For ``size``, you can use ``client.deployments.HardwareRequestSizes.Small`` or ``client.deployments.HardwareRequestSizes.Medium``, while ``num_nodes`` is a number. For now, only online deployments are supported, so the ``ONLINE`` field is required. At this stage, you can optionally overwrite model parameters by passing a dictionary with new parameter values in the ``FOUNDATION_MODEL`` field. Besides the ``metadata`` with the deployment configuration, the ``id`` of the stored model asset is required for deployment creation.
.. code-block:: python

    meta_props = {
        client.deployments.ConfigurationMetaNames.NAME: "custom_fm_deployment",
        client.deployments.ConfigurationMetaNames.DESCRIPTION: "Testing deployment using custom foundation model",
        client.deployments.ConfigurationMetaNames.ONLINE: {},
        client.deployments.ConfigurationMetaNames.HARDWARE_REQUEST: {
            'size': client.deployments.HardwareRequestSizes.Small,  # or client.deployments.HardwareRequestSizes.Medium
            'num_nodes': 1
        },
        # optionally overwrite model parameters here
        client.deployments.ConfigurationMetaNames.FOUNDATION_MODEL: {"max_input_tokens": 256},
        client.deployments.ConfigurationMetaNames.SERVING_NAME: "test_byom_fm_01"
    }

    deployment_details = client.deployments.create(model_asset_id, meta_props)

Once the deployment creation process is done, ``client.deployments.create`` returns a dictionary with the deployment details, which can be used to retrieve the ``id`` of the deployment.

.. code-block:: python

    deployment_id = client.deployments.get_id(deployment_details)

All deployments existing in the working space or project scope can be listed with the ``list`` method:

.. code-block:: python

    client.deployments.list()

Working with deployments
^^^^^^^^^^^^^^^^^^^^^^^^

Working with deployments of foundation models is described in the section :doc:`Models/ ModelInference for Deployments`.

IBM watsonx.ai software with IBM Cloud Pak® for Data
----------------------------------------------------

.. note::
    Available in IBM watsonx.ai for IBM Cloud Pak® for Data version 4.8.4 and later.

This section shows how to list custom model specifications, store and deploy a model, and use the ModelInference module with the created deployment on IBM watsonx.ai for IBM Cloud Pak® for Data.

Initialize APIClient object
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Initialize the ``APIClient`` object if needed. More details about supported ``APIClient`` initialization can be found in the :doc:`setup` section.
.. code-block:: python

    from ibm_watsonx_ai import APIClient

    client = APIClient(credentials)
    client.set.default_project(project_id=project_id)
    # or
    client.set.default_space(space_id=space_id)

Listing models specification
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. warning::
    Only applicable for IBM watsonx.ai for IBM Cloud Pak® for Data 4.8.4 and later.

.. warning::
    The model needs to be explicitly stored and deployed in the repository to be used/listed.

To list available custom models on PVC, use the example below. To get the specification of a specific model, provide its ``model_id``.

.. code-block:: python

    from ibm_watsonx_ai.foundation_models import get_custom_model_specs

    get_custom_model_specs(api_client=client)
    # OR
    get_custom_model_specs(credentials=credentials)
    # OR
    get_custom_model_specs(api_client=client, model_id='mistralai/Mistral-7B-Instruct-v0.2')

Storing model in service repository
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To store a model as an asset in the repository, first create proper ``metadata``.

.. code-block:: python

    sw_spec_id = client.software_specifications.get_id_by_name('watsonx-cfm-caikit-1.0')

    metadata = {
        client.repository.ModelMetaNames.NAME: 'custom FM asset',
        client.repository.ModelMetaNames.SOFTWARE_SPEC_ID: sw_spec_id,
        client.repository.ModelMetaNames.TYPE: client.repository.ModelAssetTypes.CUSTOM_FOUNDATION_MODEL_1_0
    }

Storing model
^^^^^^^^^^^^^

After that, it is possible to store the model using ``client.repository.store_model()``.

.. code-block:: python

    stored_model_details = client.repository.store_model(model='mistralai/Mistral-7B-Instruct-v0.2', meta_props=metadata)

To get the ``id`` of the stored asset, use the obtained details.

.. code-block:: python

    model_asset_id = client.repository.get_model_id(stored_model_details)

All stored custom foundation models can be listed with the ``client.repository.list()`` method, filtering by framework type.
.. code-block:: python

    client.repository.list(framework_filter='custom_foundation_model_1.0')

Defining hardware specification
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For the deployment of a stored custom foundation model, a hardware specification needs to be defined. You can use a custom hardware specification or pre-defined T-shirt sizes. ``APIClient`` has a dedicated module to work with :ref:`Hardware Specifications`. A few key methods are:

- List all defined hardware specifications:

  .. code-block:: python

      client.hardware_specifications.list()

- Retrieve the details of a defined hardware specification:

  .. code-block:: python

      client.hardware_specifications.get_details(client.hardware_specifications.get_id_by_name('M'))

- Define a custom hardware specification:

  .. code-block:: python

      meta_props = {
          client.hardware_specifications.ConfigurationMetaNames.NAME: "Custom GPU hw spec",
          client.hardware_specifications.ConfigurationMetaNames.NODES: {"cpu": {"units": "2"}, "mem": {"size": "128Gi"}, "gpu": {"num_gpu": 1}}
      }

      hw_spec_details = client.hardware_specifications.store(meta_props)

Deployment of custom foundation model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To create a new deployment of a custom foundation model, a dictionary with deployment ``metadata`` needs to be defined. You can specify the ``NAME`` of the new deployment, its ``DESCRIPTION``, and a hardware specification. For now, only online deployments are supported, so the ``ONLINE`` field is required. At this stage, you can optionally overwrite model parameters by passing a dictionary with new parameter values in the ``FOUNDATION_MODEL`` field. Besides the ``metadata`` with the deployment configuration, the ``id`` of the stored model asset is required for deployment creation.
.. code-block:: python

    metadata = {
        client.deployments.ConfigurationMetaNames.NAME: "Custom FM Deployment",
        client.deployments.ConfigurationMetaNames.DESCRIPTION: "Deployment of custom foundation model with SDK",
        client.deployments.ConfigurationMetaNames.ONLINE: {},
        client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: {"name": "Custom GPU hw spec"},  # name or id supported here
        client.deployments.ConfigurationMetaNames.FOUNDATION_MODEL: {"max_new_tokens": 128},  # optional
    }

    deployment_details = client.deployments.create(model_asset_id, metadata)

Once the deployment creation process is done, ``client.deployments.create`` returns a dictionary with the deployment details, which can be used to retrieve the ``id`` of the deployment.

.. code-block:: python

    deployment_id = client.deployments.get_id(deployment_details)

All deployments existing in the working space or project scope can be listed with the ``list`` method:

.. code-block:: python

    client.deployments.list()

Working with deployments
^^^^^^^^^^^^^^^^^^^^^^^^

Working with deployments of foundation models is described in the section :doc:`Models/ ModelInference for Deployments`.
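As a brief preview of that section, a created deployment is scored through the ``ModelInference`` module, with generation options passed as a plain parameters dictionary. A minimal sketch of such a dictionary follows; the parameter names match the text-generation options shown in this document (``max_new_tokens``) and commonly documented alongside them, while the values are illustrative only:

```python
# illustrative text-generation parameters for use with ModelInference;
# values are examples, not recommendations
generation_params = {
    "decoding_method": "greedy",   # deterministic decoding; "sample" enables sampling
    "max_new_tokens": 128,         # upper bound on the number of generated tokens
    "repetition_penalty": 1.05,    # values > 1 discourage repetitive output
}
```

Assuming the ``deployment_id`` obtained above and an initialized ``client``, a call such as ``ModelInference(deployment_id=deployment_id, api_client=client).generate_text(prompt=..., params=generation_params)`` would then generate text from the deployed model.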