Attention: This version of Event Streams has reached End of Support. For more information about supported versions, see the support matrix.

Monitoring Kafka cluster health

Monitoring the health of your Kafka cluster helps ensure that your operations run smoothly. Event Streams collects metrics from all of the Kafka brokers and exports them to a Prometheus-based monitoring platform. The metrics are useful indicators of the health of the cluster, and can provide early warning of potential problems.

You can use the metrics as follows:

  • View a selection of metrics on a preconfigured dashboard in the Event Streams UI.
  • Create dashboards in the Grafana service that is provided in IBM Cloud Private, and use the dashboards to monitor your Event Streams instance, including Kafka health and performance details. You can have the dashboards created in the IBM Cloud Private monitoring service by selecting Export the Event Streams dashboards when configuring your Event Streams installation.

    For more information about the monitoring capabilities provided in IBM Cloud Private, including Grafana, see the IBM Cloud Private documentation.

    To install the configured Grafana dashboards, follow these steps:

    1. Download the dashboards you want to install from GitHub.
    2. Log in to your IBM Cloud Private cluster management console from a supported web browser by using the URL https://<Cluster Master Host>:<Cluster Master API Port>. The master host and port for your cluster are set during the installation of IBM Cloud Private. For more information, see the IBM Cloud Private documentation.
    3. Navigate to the IBM Cloud Private console homepage.
    4. Click the hamburger icon in the top left.
    5. Expand Platform.
    6. Click Monitoring to navigate to the Grafana homepage.
    7. On the Grafana homepage, click the Home icon in the top left to view all pre-installed dashboards.
    8. Click Import Dashboards, and either paste the JSON of the dashboard you want to install or import the dashboard’s JSON file that you downloaded in step 1.
    9. Navigate to the Grafana homepage again and click the Home icon, then find the dashboard you have installed to view it.

    Ensure you select your namespace, release name, and other filters at the top of the dashboard to view the required information.

  • Create alerts so that metrics that meet predefined criteria trigger notifications by email, Slack, PagerDuty, and so on. For an example of how to use the metrics to trigger alert notifications, see how you can set up notifications to Slack.
  • Create dashboards in the Kibana service that is provided in IBM Cloud Private. You can download example Kibana dashboards for Event Streams from GitHub. Use the dashboards to monitor for specific errors in the logs, and to set up alerts for when a number of errors occur over a period of time in your Event Streams instance.

    For more information about the logging capabilities provided in IBM Cloud Private, including Kibana, see the IBM Cloud Private documentation.

    To download and import the preconfigured Kibana dashboards, follow these steps:

    1. Download Event Streams Kibana Dashboard.json from GitHub.
    2. Log in to your IBM Cloud Private cluster management console from a supported web browser by using the URL https://<Cluster Master Host>:<Cluster Master API Port>. The master host and port for your cluster are set during the installation of IBM Cloud Private. For more information, see the IBM Cloud Private documentation.
    3. Navigate to the IBM Cloud Private console homepage.
    4. Click the hamburger icon in the top left.
    5. Expand Platform.
    6. Click Logging to navigate to the Kibana homepage.
    7. Click Management on the left.
    8. Click Saved Objects.
    9. Click the Import icon and navigate to the Event Streams Kibana Dashboard.json file that you downloaded.
    10. Click the Dashboard tab on the left to view the imported dashboards.
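
Both import procedures in the list above start from a JSON file downloaded from GitHub. Before importing, it can help to confirm that a file is valid JSON and has the shape the target tool expects. The following is a minimal sketch only, not part of Event Streams: the function name classify_dashboard_export is hypothetical, and the shape checks rest on assumptions (a Grafana dashboard is a JSON object with a title plus panels or rows; a Kibana saved-objects export is a JSON array of objects that each carry a "_type" field). Check these against the files you actually download.

```python
import json


def classify_dashboard_export(text):
    """Guess whether JSON text is a Grafana dashboard or a Kibana export.

    Returns "grafana", "kibana", or "unknown". The shape checks below are
    assumptions about typical export formats, not a formal schema.
    """
    try:
        data = json.loads(text)
    except ValueError:
        # Not valid JSON at all, so neither tool could import it.
        return "unknown"

    # Assumed Grafana shape: an object with a title and panels (or rows).
    if isinstance(data, dict) and "title" in data and (
        "panels" in data or "rows" in data
    ):
        return "grafana"

    # Assumed Kibana shape: a non-empty array of objects with a "_type" field.
    if isinstance(data, list) and data and all(
        isinstance(item, dict) and "_type" in item for item in data
    ):
        return "kibana"

    return "unknown"
```

To check a downloaded file, read its contents (for example with open(path, encoding="utf-8").read()) and pass the text to the function. A result of "unknown" suggests that the download is incomplete or that the file is in a different format from the one assumed here.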

You can also use external monitoring tools to monitor the deployed Event Streams Kafka cluster.
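
Because the metrics are exported to a Prometheus-based platform, one way an external tool could read them is through the standard Prometheus HTTP API, which serves instant queries at /api/v1/query. The following is a hedged sketch of building such a request and parsing the response. The helper names are hypothetical, and how the Prometheus endpoint is exposed and authenticated in your IBM Cloud Private cluster is an assumption to verify for your environment.

```python
import json
import urllib.parse


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    query_string = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def parse_instant_vector(body):
    """Parse an instant-query response into (labels, value) pairs.

    Raises ValueError if the query failed or did not return a vector.
    """
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError("query failed: %s" % payload.get("error", "unknown error"))
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected a vector, got %s" % data["resultType"])
    # Each sample is {"metric": {...labels...}, "value": [timestamp, "value"]}.
    return [(sample["metric"], float(sample["value"][1])) for sample in data["result"]]
```

A caller would fetch the URL returned by build_query_url with an HTTP client of its choice, supplying whatever credentials the cluster requires, and pass the response body to parse_instant_vector. The metric names to query depend on what your Event Streams release exports, so look them up in the Prometheus or Grafana UI rather than assuming them.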

For information about the health of your topics, check the producer activity dashboard.

Important: By default, the metrics data used to provide monitoring information is stored for only one day. To view monitoring data for longer time periods, such as 1 week or 1 month, modify the time period for metric retention.

Viewing the preconfigured dashboard

To get an overview of the cluster health, you can view a selection of metrics on the Event Streams Monitor dashboard.

  1. Log in to your Event Streams UI as an administrator from a supported web browser (see how to determine the login URL for your Event Streams UI).
  2. Click Monitoring in the primary navigation. A dashboard is displayed with overview charts for messages, partitions, and replicas.
  3. Select 1 hour, 1 day, 1 week, or 1 month to view data for different time periods.