Cloud Pak Deployer Monitor

Manual Deployment of Watson Machine Learning Deployment Space Job Information

This page walks through the manual steps to deploy the Watson Machine Learning Deployment Space Job Information monitor, and to delete it afterwards.

The following pre-requisites are assumed:

This manual deployment will be based on:

  • Source of the Monitor is located in a Git Repository
  • The Image will be pushed to the internal OpenShift Image Registry

Deploy Watson Machine Learning Deployment Space Job Information

Build the Monitor Image and push to the registry

export CP4D_PROJECT=<CP4D_PROJECT>
export OPENSHIFT_IMAGE_REGISTRY=image-registry.openshift-image-registry.svc:5000/${CP4D_PROJECT}
oc new-build https://github.com/IBM/cp4d-monitors \
--context-dir cp4d-wml-deployment-space-job-info \
--name cp4d-platform-wml-deployment-space-job-info \
--to ${OPENSHIFT_IMAGE_REGISTRY}/cp4d-platform-wml-deployment-space-job-info:latest \
--to-docker=true \
--namespace ${CP4D_PROJECT}

Wait for the build to complete successfully

export CP4D_PROJECT=<CP4D_PROJECT>
oc wait -n ${CP4D_PROJECT} --for=condition=Complete build/cp4d-platform-wml-deployment-space-job-info-1 --timeout=300s

or to monitor the build process:

export CP4D_PROJECT=<CP4D_PROJECT>
oc logs -n ${CP4D_PROJECT} build/cp4d-platform-wml-deployment-space-job-info-1 -f

Ensure the build finishes with the message Push successful
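The check for the Push successful message can also be scripted. The following is a minimal sketch, not part of the monitor itself; the function name build_push_succeeded is hypothetical, and it assumes the build log is still available through oc logs.

```shell
# Hypothetical helper: succeeds only if the build log contains "Push successful".
# $1 = project (namespace), $2 = build name
build_push_succeeded() {
  oc logs -n "$1" "build/$2" | grep -q "Push successful"
}

# Example use:
#   build_push_succeeded "${CP4D_PROJECT}" cp4d-platform-wml-deployment-space-job-info-1 \
#     && echo "image pushed" || echo "push not confirmed"
```

The function returns a non-zero status when the message is absent, so it can gate later steps in a script.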

Create the Cloud Pak for Data zen-watchdog monitor configuration

export CP4D_PROJECT=<CP4D_PROJECT>
export OPENSHIFT_IMAGE_REGISTRY=image-registry.openshift-image-registry.svc:5000/${CP4D_PROJECT}
export ICPDATA_ADDON_VERSION=$(oc get po -n ${CP4D_PROJECT} -l component=zen-watcher -o jsonpath='{.items[0].metadata.annotations.productVersion}')
cat << EOF | oc apply -f -
kind: ConfigMap
apiVersion: v1
metadata:
name: zen-alert-cp4d-platform-wml-deployment-space-job-info-monitor-extension

Note: Once the ConfigMap above is created, the zen-watcher pod detects it. Check the log of the zen-watcher pod for details. For example:

export CP4D_PROJECT=<CP4D_PROJECT>
export ZEN_WATCHER_POD=$(oc get po -n ${CP4D_PROJECT} -l component=zen-watcher -o custom-columns=CONTAINER:.metadata.name --no-headers)
oc logs -n ${CP4D_PROJECT} ${ZEN_WATCHER_POD}
time="2022-05-31 05:25:59" level=info msg=processConfigData event="adding extensions from zen-alert-cp4d-platform-wml-deployment-space-job-info-monitor-extension to the database"
time="2022-05-31 05:25:59" level=info msg=CleanUpStaleExtensions event="upgrade extensions: removing stale extensions from zen-alert-cp4d-platform-wml-deployment-space-job-info-monitor-extension to the database"
time="2022-05-31 05:25:59" level=info msg=processExtensionHandler event="processing action: update for extension" extension_name=zen_alert_cp4d_platform_wml_deployment_space_job_info
time="2022-05-31 05:25:59" level=info msg=watchConfigMap event="config zen-alert-cp4d-platform-wml-deployment-space-job-info-monitor-extension added"
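Rather than reading the log by eye, the watchConfigMap event shown above can be searched for. This is a sketch under the assumption that the log line format matches the sample output on this page; configmap_was_added is a hypothetical helper name.

```shell
# Hypothetical helper: reads zen-watcher log lines on stdin and succeeds
# if the watchConfigMap "added" event for the given ConfigMap is present.
# $1 = ConfigMap name
configmap_was_added() {
  grep -qF "msg=watchConfigMap event=\"config $1 added\""
}

# Example use:
#   oc logs -n "${CP4D_PROJECT}" "${ZEN_WATCHER_POD}" \
#     | configmap_was_added zen-alert-cp4d-platform-wml-deployment-space-job-info-monitor-extension
```

grep -F treats the pattern as a fixed string, so the ConfigMap name is matched literally.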

Wait for zen-watchdog to create cronjob

Get the watchdog-alert-monitoring-cronjob cronjob details of Cloud Pak for Data

export CP4D_PROJECT=<CP4D_PROJECT>
oc get cronjob watchdog-alert-monitoring-cronjob -n ${CP4D_PROJECT}
NAME                                SCHEDULE       SUSPEND   ACTIVE   LAST SCHEDULE   AGE
watchdog-alert-monitoring-cronjob   */10 * * * *   False     0        3m46s           7d3h

This cronjob must run before the cronjob for the Watson Machine Learning Deployment Space Job Information monitor is created. Optionally, its schedule can be changed to trigger it sooner. The zen-watchdog pod can be monitored for error messages:

export CP4D_PROJECT=<CP4D_PROJECT>
export ZEN_WATCHDOG_POD=$(oc get po -n ${CP4D_PROJECT} -l component=zen-watchdog -o custom-columns=CONTAINER:.metadata.name --no-headers)
oc logs -n ${CP4D_PROJECT} ${ZEN_WATCHDOG_POD} -f
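Changing the schedule of the watchdog cronjob, as mentioned above, could look like the following sketch. The function name set_cronjob_schedule is hypothetical, and it is an assumption that Cloud Pak for Data leaves a manually changed schedule in place; the platform may reset it.

```shell
# Hypothetical helper: sets the schedule of a cronjob with a merge patch.
# $1 = project (namespace), $2 = cronjob name, $3 = cron expression
set_cronjob_schedule() {
  oc patch cronjob "$2" -n "$1" --type=merge \
    --patch "{\"spec\":{\"schedule\":\"$3\"}}"
}

# Example use (run every 2 minutes instead of every 10):
#   set_cronjob_schedule "${CP4D_PROJECT}" watchdog-alert-monitoring-cronjob '*/2 * * * *'
```

Quote the cron expression when calling the function so the shell does not expand the asterisks.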

The new monitor cronjob is created:

export CP4D_PROJECT=<CP4D_PROJECT>
oc get cronjob -n ${CP4D_PROJECT}
NAME                                            SCHEDULE       SUSPEND   ACTIVE   LAST SCHEDULE   AGE
cp4dplatformwmldeploymentspacejobinfo-cronjob   */15 * * * *   False     0        <none>          3s
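Because the monitor cronjob only appears after the watchdog cronjob has run, a script may need to poll for it. The following is a minimal sketch; wait_for_cronjob and its default attempt count and delay are assumptions chosen for illustration.

```shell
# Hypothetical helper: polls until the cronjob exists or the attempts run out.
# $1 = project, $2 = cronjob name,
# $3 = max attempts (default 30), $4 = seconds between attempts (default 60)
wait_for_cronjob() {
  attempts="${3:-30}"
  delay="${4:-60}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if oc get cronjob "$2" -n "$1" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example use:
#   wait_for_cronjob "${CP4D_PROJECT}" cp4dplatformwmldeploymentspacejobinfo-cronjob
```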

Most monitors require access to the Cloud Pak for Data /user-home folder to cache information. To test whether this mount point is already present on the monitor cronjob, use the following command:

export CP4D_PROJECT=<CP4D_PROJECT>
export CP4D_CRONJOB=cp4dplatformwmldeploymentspacejobinfo-cronjob
oc set volume -n ${CP4D_PROJECT} cronjobs/${CP4D_CRONJOB} | grep "mounted at /user-home" | wc -l

If the result is 0, patch the cronjob using the following command:

oc patch cronjob -n ${CP4D_PROJECT} ${CP4D_CRONJOB} \
--type=json \
--patch '[{"op": "add","path": "/spec/jobTemplate/spec/template/spec/containers/0/volumeMounts/-","value": {"name": "user-home-mount","mountPath": "/user-home"}}]'
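The check and the conditional patch above can be combined so the patch is only applied when needed. This is a sketch; user_home_mount_missing is a hypothetical helper that inspects the same oc set volume output as the check above.

```shell
# Hypothetical helper: reads `oc set volume` output on stdin and succeeds
# when nothing is mounted at /user-home, i.e. the patch is still needed.
user_home_mount_missing() {
  ! grep -q "mounted at /user-home"
}

# Example use:
#   if oc set volume -n "${CP4D_PROJECT}" "cronjobs/${CP4D_CRONJOB}" | user_home_mount_missing; then
#     oc patch cronjob -n "${CP4D_PROJECT}" "${CP4D_CRONJOB}" --type=json \
#       --patch '[{"op": "add","path": "/spec/jobTemplate/spec/template/spec/containers/0/volumeMounts/-","value": {"name": "user-home-mount","mountPath": "/user-home"}}]'
#   fi
```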

Based on the schedule the cronjob will be executed. This will create a pod, which can be monitored:

export CP4D_PROJECT=<CP4D_PROJECT>
oc logs -n ${CP4D_PROJECT} <PODNAME>
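To find a value for <PODNAME>, the pod list can be filtered by the cronjob name. The sketch below assumes the usual Kubernetes naming, where jobs and pods created by a cronjob carry the cronjob name as a prefix; pods_of_cronjob is a hypothetical helper.

```shell
# Hypothetical helper: reads pod names on stdin (one per line) and prints
# those whose name starts with the cronjob name followed by a dash.
# $1 = cronjob name
pods_of_cronjob() {
  grep "^$1-" || true
}

# Example use:
#   oc get pods -n "${CP4D_PROJECT}" --no-headers -o custom-columns=NAME:.metadata.name \
#     | pods_of_cronjob cp4dplatformwmldeploymentspacejobinfo-cronjob
```

If no pod matches, the function prints nothing and still returns success, so an empty result means the cronjob has not run yet or its pods have been cleaned up.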

Rebuilding the image

When changes are applied to the monitor, restarting the Build Config rebuilds the image and pushes it to the image registry. No other changes are required. The next time the cronjob is executed, the new version of the monitor image will be used.

export CP4D_PROJECT=<CP4D_PROJECT>
export CP4D_CRONJOB=cp4dplatformwmldeploymentspacejobinfo-cronjob
oc start-build -n ${CP4D_PROJECT} cp4d-platform-wml-deployment-space-job-info

Monitor the build with the following command (use -2, -3, and so on, matching the build created by the previous command):

export CP4D_PROJECT=<CP4D_PROJECT>
oc logs -n ${CP4D_PROJECT} build/cp4d-platform-wml-deployment-space-job-info-2 -f
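The build number suffix can also be looked up instead of guessed. This sketch reads the status.lastVersion field of the Build Config and assumes it reflects the most recently started build; latest_build_name is a hypothetical helper.

```shell
# Hypothetical helper: prints the name of the most recent build of a Build Config.
# $1 = project (namespace), $2 = Build Config name
latest_build_name() {
  version=$(oc get bc "$2" -n "$1" -o jsonpath='{.status.lastVersion}')
  echo "$2-${version}"
}

# Example use:
#   oc logs -n "${CP4D_PROJECT}" "build/$(latest_build_name "${CP4D_PROJECT}" cp4d-platform-wml-deployment-space-job-info)" -f
```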

Patch the cronjob to set imagePullPolicy to Always, so that it fetches the latest version of the image each time it is triggered:

oc patch cronjob -n ${CP4D_PROJECT} ${CP4D_CRONJOB} \
--type=json \
--patch '[{"op":"replace","path":"/spec/jobTemplate/spec/template/spec/containers/0/imagePullPolicy","value":"Always"}]'

Remove the Monitor

Use the following commands to delete the monitor

export CP4D_PROJECT=<CP4D_PROJECT>
export CP4D_CRONJOB=cp4dplatformwmldeploymentspacejobinfo-cronjob
oc delete bc -n ${CP4D_PROJECT} cp4d-platform-wml-deployment-space-job-info
oc delete cm -n ${CP4D_PROJECT} zen-alert-cp4d-platform-wml-deployment-space-job-info-monitor-extension
oc delete cronjob -n ${CP4D_PROJECT} ${CP4D_CRONJOB}
oc delete cm -n ${CP4D_PROJECT} cp4d-monitor-configuration
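Before deleting cp4d-monitor-configuration, it may help to check whether other monitors are still deployed. The sketch below is an assumption-based check: it presumes other monitors use the same zen-alert-...-monitor-extension ConfigMap naming seen on this page, and other_monitor_extensions is a hypothetical helper.

```shell
# Hypothetical helper: reads ConfigMap names on stdin and prints the monitor
# extension ConfigMaps other than the one given as $1.
# Assumption: monitors follow the zen-alert-*-monitor-extension naming.
other_monitor_extensions() {
  grep '^zen-alert-.*-monitor-extension$' | grep -v -x -F "$1" || true
}

# Example use:
#   oc get cm -n "${CP4D_PROJECT}" --no-headers -o custom-columns=NAME:.metadata.name \
#     | other_monitor_extensions zen-alert-cp4d-platform-wml-deployment-space-job-info-monitor-extension
```

An empty result suggests no other monitor extension is registered under that naming pattern; it does not prove that no other monitor exists.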

Note: Only delete the configmap cp4d-monitor-configuration if no other monitors are deployed. This configmap is generated by the monitor: when a monitor is deployed and runs on its schedule, it creates the configmap with the following default values.

"cp4d-job-last-refresh" : "0"
"cp4d-job-refresh-interval-minutes": "120"
"cp4d-project-last-refresh": "0"
"cp4d-project-refresh-interval-minutes": "240"
"cp4d-space-last-refresh": "0"
"cp4d-space-refresh-interval-minutes": "120"
"cp4d-wkc-last-refresh": "0"
"cp4d-wkc-refresh-interval-minutes": "120"

Reset Cloud Pak for Data metrics configuration and influxdb

If, during development, the zen-watchdog is unable to process events because of an incorrect configuration or naming convention, use the following steps to reset the zen-watchdog and its influxdb.

export CP4D_PROJECT=<CP4D_PROJECT>
oc project ${CP4D_PROJECT}
oc exec -it zen-metastoredb-0 -- /bin/bash
mkdir /tmp/certs
cp -Lr /certs/* /tmp/certs
cd /tmp && chmod 0600 ./certs/*
cd /cockroach

Wait for the cronjobs to be re-created

Retrieve the password for influxdb and copy it.

oc extract secret/dsx-influxdb-auth --keys=influxdb-password --to=-
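Instead of copying the password by hand, it can be captured in a shell variable. This is a minimal sketch; influxdb_password is a hypothetical helper, and it assumes oc extract writes only the secret value to standard output.

```shell
# Hypothetical helper: prints the influxdb password stored in the secret.
# $1 = project (namespace)
influxdb_password() {
  oc extract secret/dsx-influxdb-auth -n "$1" \
    --keys=influxdb-password --to=- 2>/dev/null
}

# Example use:
#   INFLUXDB_PASSWORD=$(influxdb_password "${CP4D_PROJECT}")
```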

Delete the influxdb entries

oc exec -it dsx-influxdb-0 -- bash
influx -ssl -unsafeSsl
auth
<enter>admin
#Delete the events
use WATCHDOG;
drop measurement events;
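The interactive session above could also be run as a single command. The sketch below assumes the influx client in the pod is the InfluxDB 1.x CLI, which accepts -username, -password, -database and -execute, and that admin is the user as shown above; drop_watchdog_events is a hypothetical helper. The password is passed on the command line, so it may be visible in process listings.

```shell
# Hypothetical helper: drops the events measurement in the WATCHDOG database.
# $1 = project (namespace), $2 = influxdb password
drop_watchdog_events() {
  oc exec -n "$1" dsx-influxdb-0 -- \
    influx -ssl -unsafeSsl -username admin -password "$2" \
      -database WATCHDOG -execute 'drop measurement events'
}

# Example use (destructive, removes all stored events):
#   drop_watchdog_events "${CP4D_PROJECT}" "${INFLUXDB_PASSWORD}"
```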