Attention: This version of Event Streams has reached End of Support. For more information about supported versions, see the support matrix.

Monitoring Kafka cluster health

Monitoring the health of your Kafka cluster helps ensure that your operations run smoothly. Event Streams collects metrics from all of the Kafka brokers and exports them to a Prometheus-based monitoring platform. The metrics are useful indicators of cluster health and can provide early warning of potential problems.

You can use the metrics as follows:

  • View a selection of metrics on a preconfigured dashboard in the Event Streams UI.
  • Create dashboards in the Grafana service that is provided in IBM Cloud Private. You can download example Grafana dashboards for Event Streams from GitHub.

    For more information about the monitoring capabilities provided in IBM Cloud Private, including Grafana, see the IBM Cloud Private documentation.

  • Create alerts so that notifications are sent by email, Slack, PagerDuty, and so on when metrics meet predefined criteria. For an example of how to use the metrics to trigger alert notifications, see how you can set up notifications to Slack.
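
As a rough illustration of what "metrics that meet predefined criteria" can mean in practice, the following Python sketch checks an instant-query response from the Prometheus HTTP API (`/api/v1/query`) against a threshold. The server address, the metric name, and the threshold are assumptions made for illustration only; the metric names that Event Streams actually exports, and any authentication or TLS settings your monitoring service requires, are not shown here.

```python
import json
import urllib.parse
import urllib.request

# Assumption: address of a Prometheus server reachable from where this runs.
PROMETHEUS_URL = "http://localhost:9090"
# Assumption: hypothetical metric name; the real name depends on how the
# Kafka broker metrics are exported in your installation.
QUERY = "kafka_server_replicamanager_underreplicatedpartitions_value"


def build_query_url(base_url, promql):
    """Return the instant-query URL for the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def breaching_series(response, threshold):
    """Return (labels, value) pairs whose sample value exceeds the threshold."""
    if response.get("status") != "success":
        raise ValueError("Prometheus query failed: %r" % response.get("error"))
    breaches = []
    for series in response["data"]["result"]:
        # Each instant-vector sample is a [timestamp, "value-as-string"] pair.
        _timestamp, raw_value = series["value"]
        value = float(raw_value)
        if value > threshold:
            breaches.append((series["metric"], value))
    return breaches


def check(base_url=PROMETHEUS_URL, promql=QUERY, threshold=0.0):
    """Query Prometheus once and return the series that breach the threshold."""
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return breaching_series(json.load(resp), threshold)
```

Calling `check()` returns a list of label sets and values for every series above the threshold, which a script could then forward to a notification channel. For ongoing alerting, rules evaluated by the monitoring platform itself are usually a better fit than polling from a script.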

Important: By default, the metrics data used to provide monitoring information is stored for only one day. To view monitoring data for longer time periods, such as 1 week or 1 month, modify the time period for metric retention.

Viewing the preconfigured dashboard

To get an overview of the cluster health, you can view a selection of metrics on the Event Streams Monitor dashboard.

  1. Log in to Event Streams as an administrator.
  2. Click the Monitor tab. A dashboard is displayed with overview charts for messages, partitions, and replicas.
  3. Click a chart to drill down into more detail.
  4. Click 1 hour, 1 day, 1 week, or 1 month to view data for different time periods.