After you set up geo-replication, you can monitor and manage it: for example, you can check the status of your geo-replicators, pause and resume geo-replication, and remove replicated topics from destination clusters.
From a destination cluster
You can check the status of your geo-replication and manage geo-replicators (such as pause and resume) on your destination cluster.
You can view the following information for geo-replication on a destination cluster:
- The total number of origin clusters that have topics being replicated to the destination cluster you are logged into.
- The total number of topics being geo-replicated to the destination cluster you are logged into.
- Information about each origin cluster that has geo-replication set up on the destination cluster you are logged into:
- The cluster name, which includes the release name.
- The health of the geo-replication for that origin cluster: Creating, Running, Updating, Paused, Stopping, Assigning, Offline, and Error.
- Number of topics replicated from each origin cluster.
Tip: As your cluster can be used as a destination for more than one origin cluster and their replicated topics, this information is useful to understand the status of all geo-replicators running on the cluster.
Using the UI
To view this information on the destination cluster by using the UI:
- Log in to your destination Event Streams cluster as an administrator.
- Click Topics in the primary navigation and then click Geo-replication.
- Click the Origin locations tab for details.
To manage geo-replication on the destination cluster by using the UI:
- Log in to your destination Event Streams cluster as an administrator.
- Click Topics in the primary navigation and then click Geo-replication.
- Click the Origin locations tab for details.
- Locate the name of the origin cluster for which you want to manage geo-replication, and choose one of the following options:
- Overflow menu > Pause running replicators: To pause geo-replication and suspend replication of data from the origin cluster.
- Overflow menu > Resume paused replicators: To resume geo-replication and restart replication of data from the origin cluster.
- Overflow menu > Restart failed replicators: To restart a geo-replicator that experienced problems.
- Overflow menu > Stop replication: To stop geo-replication from the origin cluster.
Important: Stopping replication also removes the origin cluster from the list.
Note: You cannot perform these actions on the destination cluster by using the CLI.
From an origin cluster
On the origin cluster, you can check the status of all of your destination clusters, and drill down into more detail about each destination.
You can also manage geo-replicators (such as pause and resume), remove entire destination clusters as a target for geo-replication, and add topics to geo-replicate.
You can view the following high-level information for geo-replication on an origin cluster:
- The name of each destination cluster.
- The total number of topics being geo-replicated to all destination clusters from the origin cluster you are logged into.
- The total number of workers running for the destination cluster you are geo-replicating topics to.
After the destination clusters are set up and running, you can view more detailed information about each one, such as:
- The topics that are being geo-replicated to the destination cluster.
- The health status of the geo-replication on each destination cluster: Awaiting creation, Pending, Running, Resume, Resuming, Pausing, Paused, Removing, and Error. When the status is Error, the cause of the problem is also provided to aid resolution.
Using the UI
To view this information on the origin cluster by using the UI:
- Log in to your origin Event Streams cluster as an administrator.
- Click Topics in the primary navigation and then click Geo-replication.
- Click the Destination locations tab for details.
To manage geo-replication on the origin cluster by using the UI:
- Log in to your origin Event Streams cluster as an administrator.
- Click Topics in the primary navigation and then click Geo-replication.
- Click the name of the destination cluster for which you want to manage geo-replication.
- Choose from one of the following options using the top right Overflow menu:
- Overflow menu > Pause running geo-replicator: To pause the geo-replicator for this destination and suspend replication of data to the destination cluster for all topics.
- Overflow menu > Resume paused geo-replicator: To resume the paused geo-replicator for this destination and resume replication of data to the destination cluster for all topics.
- Overflow menu > Restart failed geo-replicator: To restart a geo-replicator that experienced problems.
- Overflow menu > Remove cluster as destination: To remove the cluster as a destination for geo-replication.
To stop an individual topic from being replicated and remove it from the geo-replicator, select Overflow menu > Stop replicating topic.
Using the CLI
To view this information on the origin cluster by using the CLI:
- Go to your origin cluster. Log in to your Kubernetes cluster as a cluster administrator by setting your kubectl context.
- Initialize the Event Streams CLI by following the instructions in logging in.
- Retrieve destination cluster IDs by using the following command:
kubectl es geo-clusters
- Retrieve information about a destination cluster by running the following command with the required destination cluster ID from the previous step:
kubectl es geo-cluster --destination <destination-cluster-id>
For example:
kubectl es geo-cluster --destination destination_byl6x
The command returns the following information:
Details of destination cluster destination_byl6x
Cluster ID          Cluster name   REST API URL                                                                    Skip SSL validation?
destination_byl6x   destination    https://destination-ibm-es-admapi-external-myproject.apps.geodest.ibm.com:443   false

Geo-replicator details
Geo-replicator name                   Status    Origin bootstrap servers
origin_es->destination-mm2connector   RUNNING   origin_es-kafka-bootstrap-myproject.apps.geosource.ibm.com:443

Geo-replicated topics
Geo-replicator name                   Origin topic   Destination topic
origin_es->destination-mm2connector   topic1         origin_es.topic1
origin_es->destination-mm2connector   topic2         origin_es.topic2
Each geo-replicator creates a MirrorSource connector and a MirrorCheckpoint connector. The MirrorSource connector replicates data from the origin to the destination cluster. You can use the MirrorCheckpoint connector during failover from the origin to the destination cluster.
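When a destination has several geo-replicators, it can help to filter the command output for the ones that need attention. The following is a minimal sketch, not part of the Event Streams CLI: the `non_running_replicators` function name is hypothetical, and the parsing assumes the output is laid out as in the example above (a `Geo-replicator details` title line, one header row, then whitespace-separated columns with the status in the second column).

```shell
# Print "<geo-replicator name> <status>" for every geo-replicator whose status is not RUNNING.
# Reads the text output of `kubectl es geo-cluster` on stdin.
# Assumption: the layout matches the documented example output.
non_running_replicators() {
  awk '
    /^Geo-replicator details/ { in_section = 1; header = 1; next }   # section title
    /^Geo-replicated topics/  { in_section = 0 }                     # next section starts
    in_section && header && NF > 0 { header = 0; next }              # skip the column header row
    in_section && NF >= 2 && $2 != "RUNNING" { print $1, $2 }
  '
}
```

For example, `kubectl es geo-cluster --destination destination_byl6x | non_running_replicators` prints nothing when every geo-replicator for that destination reports RUNNING.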
To manage geo-replication on the origin cluster by using the CLI:
- Go to your origin cluster. Log in to your Kubernetes cluster as a cluster administrator by setting your kubectl context.
- Initialize the Event Streams CLI by following the instructions in logging in.
- Run the following commands as required:
- kubectl es geo-replicator-pause --destination <destination-cluster-id> --name "<replicator-name>"
For example:
kubectl es geo-replicator-pause --destination destination_byl6x --name "origin_es->destination-mm2connector"
This will pause both the MirrorSource connector and the MirrorCheckpoint connector for this geo-replicator. Geo-replication for all topics that are part of this geo-replicator will be paused.
- kubectl es geo-replicator-resume --destination <destination-cluster-id> --name "<replicator-name>"
For example:
kubectl es geo-replicator-resume --destination destination_byl6x --name "origin_es->destination-mm2connector"
This will resume both the MirrorSource connector and the MirrorCheckpoint connector for this geo-replicator after they have been paused. Geo-replication for all topics that are part of this geo-replicator will be resumed.
- kubectl es geo-replicator-restart --destination <destination-cluster-id> --name "<replicator-name>" --connector <connector-name>
For example:
kubectl es geo-replicator-restart --destination destination_byl6x --name "origin_es->destination-mm2connector" --connector MirrorSourceConnector
This will restart a failed geo-replicator MirrorSource connector.
- kubectl es geo-replicator-topics-remove --destination <destination-cluster-id> --name "<replicator-name>" --topics <comma-separated-topic-list>
For example:
kubectl es geo-replicator-topics-remove --destination destination_byl6x --name "origin_es->destination-mm2connector" --topics topic1,topic2
This will remove the listed topics from this geo-replicator.
- kubectl es geo-replicator-delete --destination <destination-cluster-id> --name "<replicator-name>"
For example:
kubectl es geo-replicator-delete --destination destination_byl6x --name "origin_es->destination-mm2connector"
This will remove all MirrorSource and MirrorCheckpoint connectors for this geo-replicator.
- kubectl es geo-cluster-remove --destination <destination-cluster-id>
For example:
kubectl es geo-cluster-remove --destination destination_byl6x
This will permanently remove a destination cluster.
Note: If you are unable to remove a destination cluster due to technical issues, you can use the --force option with the geo-cluster-remove command to remove the cluster.
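Geo-replicator names contain `->`, so the `--name` value must stay quoted; unquoted, the shell would treat `>` as an output redirect. The following is a minimal sketch of a wrapper for the pause and resume commands listed above. The `geo_replicator` function and the `GEO_RUN` variable are hypothetical names introduced for illustration. By default the wrapper only prints the command so that you can review it first.

```shell
# Build (and optionally run) a geo-replicator pause or resume command.
# Usage: geo_replicator <pause|resume> <destination-cluster-id> <replicator-name>
# Hypothetical helper: set GEO_RUN=1 to execute the command instead of printing it.
geo_replicator() {
  gr_action=$1
  gr_destination=$2
  gr_name=$3
  case "$gr_action" in
    pause|resume) ;;
    *) echo "unsupported action: $gr_action" >&2; return 2 ;;
  esac
  if [ "${GEO_RUN:-0}" = "1" ]; then
    kubectl es "geo-replicator-$gr_action" --destination "$gr_destination" --name "$gr_name"
  else
    # Print the name in double quotes because it contains ">".
    printf 'kubectl es geo-replicator-%s --destination %s --name "%s"\n' \
      "$gr_action" "$gr_destination" "$gr_name"
  fi
}
```

For example, `geo_replicator pause destination_byl6x 'origin_es->destination-mm2connector'` prints the pause command for that geo-replicator without running it.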
Restarting a geo-replicator with Error status
Running geo-replicators constantly consume from origin clusters and produce to destination clusters. If the geo-replicator receives an unexpected error from Kafka, it might stop replicating and report a status of Error.
Monitor your geo-replication cluster to confirm that your geo-replicator is replicating data.
To restart a geo-replicator that has an Error status from the UI:
- Log in to your origin Event Streams cluster as an administrator.
- Click Topics in the primary navigation and then click Geo-replication.
- Locate the name of the destination cluster for the geo-replicator that has an Error status.
- Locate the reason for the Error status under the entry for the geo-replicator.
- Either fix the reported problem with the system or verify that the problem is no longer present.
- Select Overflow menu > Restart failed replicator to restart the geo-replicator.
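If you prefer the command line, the `geo-replicator-restart` command described under Using the CLI can be used after the underlying problem is fixed. The following is a minimal dry-run sketch that prints one restart command per geo-replicator name. The `restart_commands` function name is hypothetical, and the connector defaults to `MirrorSourceConnector` only because that is the value used in the CLI example; adjust it to the connector that failed.

```shell
# Print one restart command per geo-replicator name read from stdin (dry run only).
# Usage: restart_commands <destination-cluster-id> [connector-name]
# Assumption: the connector defaults to MirrorSourceConnector, as in the CLI example.
restart_commands() {
  rc_destination=$1
  rc_connector=${2:-MirrorSourceConnector}
  while IFS= read -r rc_name; do
    [ -n "$rc_name" ] || continue   # ignore blank lines
    printf 'kubectl es geo-replicator-restart --destination %s --name "%s" --connector %s\n' \
      "$rc_destination" "$rc_name" "$rc_connector"
  done
}
```

For example, `echo 'origin_es->destination-mm2connector' | restart_commands destination_byl6x` prints the restart command for that geo-replicator so that you can review it before running it.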
Using Grafana dashboards to monitor geo-replication
Metrics are useful indicators of the health of geo-replication. They can warn of potential problems and provide data that can be used to alert on outages. Monitor the health of your geo-replicator by using the available metrics to ensure that replication continues.
Configure your Event Streams geo-replicator to export metrics, and then view them using the example Grafana dashboard.
Configuring metrics
Enable export of metrics in Event Streams geo-replication by editing the associated KafkaMirrorMaker2 custom resource.
Using the OpenShift Container Platform web console
- Go to where your destination cluster is installed. Log in to the OpenShift Container Platform web console using your login credentials.
- From the navigation menu, click Operators > Installed Operators.
- In the Projects dropdown list, select the project that contains the destination Event Streams instance.
- Select the Event Streams operator in the list of installed operators.
- Click the Kafka Mirror Maker 2 tab to see the list of KafkaMirrorMaker2 instances.
- Click the KafkaMirrorMaker2 instance with the name of the instance that you are adding metrics to.
- Click the YAML tab.
- Add the spec.metrics property. For example:
# ...
spec:
  metrics: {}
# ...
- Click Save.
Using the CLI
To enable the export of metrics by using the CLI:
- Go to where your destination cluster is installed. Log in to your Kubernetes cluster as a cluster administrator by setting your kubectl context.
- Run the following command to select the namespace that contains the existing destination cluster:
kubectl config set-context --current --namespace=<namespace-name>
- Run the following command to list your KafkaMirrorMaker2 instances:
kubectl get kafkamirrormaker2s
- Run the following command to edit the custom resource for your KafkaMirrorMaker2 instance:
kubectl edit kafkamirrormaker2 <instance-name>
- Add the spec.metrics property. For example:
spec:
  metrics: {}
- Save your changes and close the editor.
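As a non-interactive alternative to editing the resource in an editor, a JSON merge patch can add the same empty `spec.metrics` property. This is a sketch under stated assumptions and is not taken from the procedure above: the `metrics_patch_command` function name is hypothetical, and the function only prints the `kubectl patch` command so that you can review it before running it.

```shell
# Print a kubectl command that adds an empty spec.metrics property
# to a KafkaMirrorMaker2 instance by using a JSON merge patch.
# Usage: metrics_patch_command <instance-name>
metrics_patch_command() {
  mp_instance=$1
  printf "kubectl patch kafkamirrormaker2 %s --type merge -p '%s'\n" \
    "$mp_instance" '{"spec":{"metrics":{}}}'
}
```

For example, `metrics_patch_command my-mm2` prints the patch command for an instance named `my-mm2` (a placeholder name). A merge patch changes only the listed property and leaves the rest of the resource as it is.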