Custom models
=============

The custom models support of the ``watsonx.ai`` client might differ depending on the product offering. Choose an option from the list below to see the steps.

- `IBM watsonx.ai for IBM Cloud <#id1>`_
- `IBM watsonx.ai software with IBM Cloud Pak for Data <#id2>`_

IBM watsonx.ai for IBM Cloud
----------------------------

This section shows how to create task credentials, store and deploy a model, and use the ModelInference module with the created deployment on IBM watsonx.ai for IBM Cloud.

Initialize APIClient object
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Initialize the ``APIClient`` object if needed. More details about supported ``APIClient`` initialization can be found in the :doc:`setup` section.

.. code-block:: python

    from ibm_watsonx_ai import APIClient

    client = APIClient(credentials)
    client.set.default_project(project_id=project_id)
    # or
    client.set.default_space(space_id=space_id)

Add Task Credentials
^^^^^^^^^^^^^^^^^^^^

.. warning::
    If not already added, Task Credentials are required on IBM watsonx.ai for IBM Cloud to make a deployment.

Task Credentials enable you to deploy a custom foundation model and avoid token expiration issues. More details can be found in `Adding task credentials `_.

To list available task credentials, use the ``list`` method:

.. code-block:: python

    client.task_credentials.list()

If the list is empty, you can create new task credentials by using the ``store`` method:

.. code-block:: python

    client.task_credentials.store()

To get the status of available task credentials, use the ``get_details`` method:

.. code-block:: python

    client.task_credentials.get_details()

Storing model in service repository
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To store a model as an asset in the repository, first create proper ``metadata``. In the Cloud scenario, an active connection to Cloud Object Storage is needed. For more information on how to create a COS connection, see :ref:`working-with-connection-asset`. From such a connection, you can obtain the parameters needed to fill the ``MODEL_LOCATION`` field.
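The values for ``MODEL_LOCATION`` (the bucket name and connection ``id``) come from that connection asset, for example via ``client.connections.get_details(connection_id)``. As a minimal sketch of assembling the field, the helper function and the ``bucket`` property key below are illustrative assumptions, not part of the SDK; check them against your connection's actual details:

```python
# hypothetical helper: assemble the MODEL_LOCATION metadata field from the
# properties of a COS connection asset (as returned, e.g., by
# client.connections.get_details(connection_id))
def build_model_location(connection_id, connection_properties, file_path):
    return {
        "file_path": file_path,                     # path to the model files in the bucket
        "bucket": connection_properties["bucket"],  # bucket name stored on the connection
        "connection_id": connection_id,
    }

# illustrative values only
location = build_model_location(
    connection_id="5e891c6b-3aa9-4f01-8e2d-785d81797466",
    connection_properties={"bucket": "watsonx-llm-models"},
    file_path="path/to/pvc",
)
```

The resulting dictionary can be passed directly as the ``MODEL_LOCATION`` value in the ``metadata`` below.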
.. code-block:: python

    sw_spec_id = client.software_specifications.get_id_by_name('watsonx-cfm-caikit-1.0')

    metadata = {
        client.repository.ModelMetaNames.NAME: "custom FM asset",
        client.repository.ModelMetaNames.SOFTWARE_SPEC_ID: sw_spec_id,
        client.repository.ModelMetaNames.TYPE: client.repository.ModelAssetTypes.CUSTOM_FOUNDATION_MODEL_1_0,
        client.repository.ModelMetaNames.MODEL_LOCATION: {
            "file_path": "path/to/pvc",
            "bucket": "watsonx-llm-models",
            "connection_id": "5e891c6b-3aa9-4f01-8e2d-785d81797466",
        },
    }

Storing model
^^^^^^^^^^^^^

After that, it is possible to store the model using ``client.repository.store_model()``.

.. code-block:: python

    stored_model_details = client.repository.store_model(model='google/flan-t5-small', meta_props=metadata)

To get the ``id`` of the stored asset, use the obtained details.

.. code-block:: python

    model_asset_id = client.repository.get_model_id(stored_model_details)

All stored custom foundation models can be listed with the ``client.repository.list()`` method, filtering by framework type.

.. code-block:: python

    client.repository.list(framework_filter='custom_foundation_model_1.0')

Deployment of custom foundation model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To create a new deployment of a custom foundation model, a dictionary with deployment ``metadata`` needs to be defined. You can specify the ``NAME`` of the new deployment, its ``DESCRIPTION``, and the ``HARDWARE_REQUEST`` field. The requested hardware must have a specific ``size`` and ``num_nodes``. For ``size``, you can use ``client.deployments.HardwareRequestSizes.Small`` or ``client.deployments.HardwareRequestSizes.Medium``, while ``num_nodes`` is a number. For now, only online deployments are supported, so the ``ONLINE`` field is required. At this stage, you can optionally overwrite model parameters by passing a dictionary with new parameter values in the ``FOUNDATION_MODEL`` field. Besides the ``metadata`` with the deployment configuration, the ``id`` of the stored model asset is required for deployment creation.
.. code-block:: python

    meta_props = {
        client.deployments.ConfigurationMetaNames.NAME: "custom_fm_deployment",
        client.deployments.ConfigurationMetaNames.DESCRIPTION: "Testing deployment using custom foundation model",
        client.deployments.ConfigurationMetaNames.ONLINE: {},
        client.deployments.ConfigurationMetaNames.HARDWARE_REQUEST: {
            'size': client.deployments.HardwareRequestSizes.Small,  # or client.deployments.HardwareRequestSizes.Medium
            'num_nodes': 1
        },
        # optionally overwrite model parameters here
        client.deployments.ConfigurationMetaNames.FOUNDATION_MODEL: {"max_input_tokens": 256},
        client.deployments.ConfigurationMetaNames.SERVING_NAME: "test_byom_fm_01"
    }

    deployment_details = client.deployments.create(model_asset_id, meta_props)

Once the deployment creation process is done, ``client.deployments.create`` returns a dictionary with the deployment details, which can be used to retrieve the ``id`` of the deployment.

.. code-block:: python

    deployment_id = client.deployments.get_id(deployment_details)

All deployments existing in the working space or project scope can be listed with the ``list`` method:

.. code-block:: python

    client.deployments.list()

Working with deployments
^^^^^^^^^^^^^^^^^^^^^^^^

Working with deployments of foundation models is described in the section :doc:`Models/ ModelInference for Deployments`.

IBM watsonx.ai software with IBM Cloud Pak® for Data
----------------------------------------------------

.. note::
    Available in IBM watsonx.ai for IBM Cloud Pak® for Data version 4.8.4 and later.

This section shows how to list custom model specifications, store and deploy a model, and use the ModelInference module with the created deployment on IBM watsonx.ai for IBM Cloud Pak® for Data.

Initialize APIClient object
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Initialize the ``APIClient`` object if needed. More details about supported ``APIClient`` initialization can be found in the :doc:`setup` section.
.. code-block:: python

    from ibm_watsonx_ai import APIClient

    client = APIClient(credentials)
    client.set.default_project(project_id=project_id)
    # or
    client.set.default_space(space_id=space_id)

Listing models specification
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. warning::
    Only applicable for IBM watsonx.ai for IBM Cloud Pak® for Data 4.8.4 and later.

.. warning::
    The model needs to be explicitly stored and deployed in the repository to be used/listed.

To list available custom models on PVC, use the example below. To get the specification of a specific model, provide its ``model_id``.

.. code-block:: python

    from ibm_watsonx_ai.foundation_models import get_custom_model_specs

    get_custom_model_specs(api_client=client)
    # OR
    get_custom_model_specs(credentials=credentials)
    # OR
    get_custom_model_specs(api_client=client, model_id='mistralai/Mistral-7B-Instruct-v0.2')

Storing model in service repository
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To store a model as an asset in the repository, first create proper ``metadata``.

.. code-block:: python

    sw_spec_id = client.software_specifications.get_id_by_name('watsonx-cfm-caikit-1.0')

    metadata = {
        client.repository.ModelMetaNames.NAME: 'custom FM asset',
        client.repository.ModelMetaNames.SOFTWARE_SPEC_ID: sw_spec_id,
        client.repository.ModelMetaNames.TYPE: client.repository.ModelAssetTypes.CUSTOM_FOUNDATION_MODEL_1_0
    }

Storing model
^^^^^^^^^^^^^

After that, it is possible to store the model using ``client.repository.store_model()``.

.. code-block:: python

    stored_model_details = client.repository.store_model(model='mistralai/Mistral-7B-Instruct-v0.2', meta_props=metadata)

To get the ``id`` of the stored asset, use the obtained details.

.. code-block:: python

    model_asset_id = client.repository.get_model_id(stored_model_details)

All stored custom foundation models can be listed with the ``client.repository.list()`` method, filtering by framework type.
.. code-block:: python

    client.repository.list(framework_filter='custom_foundation_model_1.0')

Defining hardware specification
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For the deployment of a stored custom foundation model, a hardware specification needs to be defined. You can use a custom hardware specification or pre-defined T-shirt sizes. ``APIClient`` has a dedicated module to work with :ref:`Hardware Specifications`. A few key methods are:

- List all defined hardware specifications:

  .. code-block:: python

      client.hardware_specifications.list()

- Retrieve the details of a defined hardware specification:

  .. code-block:: python

      client.hardware_specifications.get_details(client.hardware_specifications.get_id_by_name('M'))

- Define a custom hardware specification:

  .. code-block:: python

      meta_props = {
          client.hardware_specifications.ConfigurationMetaNames.NAME: "Custom GPU hw spec",
          client.hardware_specifications.ConfigurationMetaNames.NODES: {"cpu": {"units": "2"}, "mem": {"size": "128Gi"}, "gpu": {"num_gpu": 1}}
      }

      hw_spec_details = client.hardware_specifications.store(meta_props)

Deployment of custom foundation model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To create a new deployment of a custom foundation model, a dictionary with deployment ``metadata`` needs to be defined. You can specify the ``NAME`` of the new deployment, its ``DESCRIPTION``, and a hardware specification. For now, only online deployments are supported, so the ``ONLINE`` field is required. At this stage, you can optionally overwrite model parameters by passing a dictionary with new parameter values in the ``FOUNDATION_MODEL`` field. Besides the ``metadata`` with the deployment configuration, the ``id`` of the stored model asset is required for deployment creation.
.. code-block:: python

    metadata = {
        client.deployments.ConfigurationMetaNames.NAME: "Custom FM Deployment",
        client.deployments.ConfigurationMetaNames.DESCRIPTION: "Deployment of custom foundation model with SDK",
        client.deployments.ConfigurationMetaNames.ONLINE: {},
        client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: {"name": "Custom GPU hw spec"},  # name or id supported here
        client.deployments.ConfigurationMetaNames.FOUNDATION_MODEL: {"max_new_tokens": 128},  # optional
    }

    deployment_details = client.deployments.create(model_asset_id, metadata)

Once the deployment creation process is done, ``client.deployments.create`` returns a dictionary with the deployment details, which can be used to retrieve the ``id`` of the deployment.

.. code-block:: python

    deployment_id = client.deployments.get_id(deployment_details)

All deployments existing in the working space or project scope can be listed with the ``list`` method:

.. code-block:: python

    client.deployments.list()

Working with deployments
^^^^^^^^^^^^^^^^^^^^^^^^

Working with deployments of foundation models is described in the section :doc:`Models/ ModelInference for Deployments`.
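As a brief preview of that section, a created deployment is scored through the ``ModelInference`` module, with generation options passed as a plain parameters dictionary. A minimal sketch of such a dictionary follows; the parameter names match the text-generation options shown in this document (``max_new_tokens``) and commonly documented alongside them, while the values are illustrative only:

```python
# illustrative text-generation parameters for use with ModelInference;
# values are examples, not recommendations
generation_params = {
    "decoding_method": "greedy",   # deterministic decoding; "sample" enables sampling
    "max_new_tokens": 128,         # upper bound on the number of generated tokens
    "repetition_penalty": 1.05,    # values > 1 discourage repetitive output
}
```

Assuming the ``deployment_id`` obtained above and an initialized ``client``, a call such as ``ModelInference(deployment_id=deployment_id, api_client=client).generate_text(prompt=..., params=generation_params)`` would then generate text from the deployed model.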