Deploy on Demand
================

IBM watsonx.ai makes a curated collection of foundation models available for you to deploy on-demand on dedicated hardware for the exclusive use of your organization. With this approach, you can use these foundation models without provisioning extensive computational resources yourself. Foundation models that you deploy on-demand are hosted in a dedicated deployment space where you can use them for inferencing.

- `IBM watsonx.ai for IBM Cloud <#id1>`_

IBM watsonx.ai for IBM Cloud
----------------------------

This section shows how to create task credentials, store and deploy a model, and use the ModelInference module with the created deployment on IBM watsonx.ai for IBM Cloud.

Initialize an APIClient object
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Initialize an ``APIClient`` object if needed. For more details about supported ``APIClient`` initialization, see :doc:`setup`.

.. code-block:: python

    from ibm_watsonx_ai import APIClient

    client = APIClient(credentials, project_id=project_id)

    # or:
    client = APIClient(credentials, space_id=space_id)

Add Task Credentials
^^^^^^^^^^^^^^^^^^^^

.. warning::
    If not already added, Task Credentials are required on IBM watsonx.ai for IBM Cloud to make a deployment.

With task credentials, you can deploy a curated foundation model and avoid token expiration issues.
For more details, see *Adding task credentials*.

To list available task credentials, use the ``list`` method:

.. code-block:: python

    client.task_credentials.list()

If the list is empty, you can create new task credentials with the ``store`` method:

.. code-block:: python

    client.task_credentials.store()

To get details of available task credentials, use the ``get_details`` method:

.. code-block:: python

    client.task_credentials.get_details()

Store the model in the service repository
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To store a model as an asset in the repository, you must first create the required ``metadata``.

.. code-block:: python

    metadata = {
        client.repository.ModelMetaNames.NAME: "curated FM asset",
        client.repository.ModelMetaNames.TYPE: client.repository.ModelAssetTypes.CURATED_FOUNDATION_MODEL_1_0,
    }

Store the model
^^^^^^^^^^^^^^^

After creating the ``metadata``, you can store the model using ``client.repository.store_model()``.

.. code-block:: python

    stored_model_details = client.repository.store_model(model='ibm/granite-13b-chat-v2-curated', meta_props=metadata)

To get the ``id`` of the stored model asset, pass the returned details to ``get_model_id``.

.. code-block:: python

    stored_model_asset_id = client.repository.get_model_id(stored_model_details)

To list only the stored curated foundation models, filter the repository listing by framework type.

.. code-block:: python

    client.repository.list(framework_filter='curated_foundation_model_1.0')

Deploy the curated foundation model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To create a new deployment of a curated foundation model, you need to define a dictionary with deployment metadata: ``meta_props``. Specify the deployment ``NAME`` and ``DESCRIPTION`` fields. Only online deployments are supported, so the ``ONLINE`` field is required.

Optional: At this stage, you can overwrite model parameters. To overwrite them, pass a dictionary with the new parameter values in the ``FOUNDATION_MODEL`` field.

You also need to provide the ``stored_model_asset_id`` to create the deployment.

.. code-block:: python

    meta_props = {
        client.deployments.ConfigurationMetaNames.NAME: "curated_fm_deployment",
        client.deployments.ConfigurationMetaNames.DESCRIPTION: "Testing deployment using curated foundation model",
        client.deployments.ConfigurationMetaNames.ONLINE: {},
        client.deployments.ConfigurationMetaNames.SERVING_NAME: "test_curated_fm_01"
    }

    deployment_details = client.deployments.create(stored_model_asset_id, meta_props)

Once the deployment creation process is done, the ``create`` method returns a dictionary with the deployment details, which can be used to retrieve the ``deployment_id``.

.. code-block:: python

    deployment_id = client.deployments.get_id(deployment_details)

You can list all existing deployments in the working space or project scope with the ``list`` method:

.. code-block:: python

    client.deployments.list()

Work with deployments
^^^^^^^^^^^^^^^^^^^^^

For information on working with foundation model deployments, see :doc:`Models/ ModelInference for Deployments`.
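
As a recap of the store and deploy steps described above, the following sketch wraps them in two helper functions. The helper names ``build_deployment_meta_props`` and ``store_and_deploy`` and their parameters are illustrative assumptions and are not part of ``ibm_watsonx_ai``; the helpers call only the client methods and meta-name constants already shown on this page. Any keys passed as ``model_overrides`` are placeholders, so check which parameters your model accepts before using them.

.. code-block:: python

    def build_deployment_meta_props(client, name, description,
                                    serving_name=None, model_overrides=None):
        """Assemble deployment metadata (hypothetical helper, not a library API)."""
        names = client.deployments.ConfigurationMetaNames
        meta_props = {
            names.NAME: name,
            names.DESCRIPTION: description,
            names.ONLINE: {},  # only online deployments are supported
        }
        if serving_name is not None:
            meta_props[names.SERVING_NAME] = serving_name
        if model_overrides:
            # Optional parameter overrides go in the FOUNDATION_MODEL field.
            meta_props[names.FOUNDATION_MODEL] = dict(model_overrides)
        return meta_props


    def store_and_deploy(client, model_id, asset_name, deployment_name,
                         description, serving_name=None, model_overrides=None):
        """Store a curated model asset, deploy it, and return the deployment id."""
        repository = client.repository
        metadata = {
            repository.ModelMetaNames.NAME: asset_name,
            repository.ModelMetaNames.TYPE:
                repository.ModelAssetTypes.CURATED_FOUNDATION_MODEL_1_0,
        }
        stored_model_details = repository.store_model(model=model_id,
                                                      meta_props=metadata)
        stored_model_asset_id = repository.get_model_id(stored_model_details)

        meta_props = build_deployment_meta_props(
            client, deployment_name, description,
            serving_name=serving_name, model_overrides=model_overrides,
        )
        deployment_details = client.deployments.create(stored_model_asset_id,
                                                       meta_props)
        return client.deployments.get_id(deployment_details)

With an initialized ``client`` and task credentials in place, a call such as ``store_and_deploy(client, "ibm/granite-13b-chat-v2-curated", "curated FM asset", "curated_fm_deployment", "Testing deployment using curated foundation model", serving_name="test_curated_fm_01")`` would return the ``deployment_id`` to use for inferencing. Keeping the metadata assembly in its own function makes it easy to inspect the dictionary before anything is created on the service.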