Configuring disaster recovery topologies

You can use MirrorMaker 2.0 to mirror topics from one Event Streams cluster to another.

You can configure Event Streams for disaster recovery (DR) by using multiple Event Streams instances in different locations. Two key topologies to consider as starting points for planning a DR solution suitable for your business are:

  • Active-Passive
  • Active-Active

Active-Passive topology

In the Active-Passive topology, there are two Kafka clusters in two different locations: one active and one passive. The active cluster is the primary cluster where the data is processed, while the passive cluster serves as a backup for disaster recovery purposes.

MirrorMaker 2.0 provides unidirectional data replication between the two Kafka clusters. This is one way to mitigate the risks that are associated with any outages that might cause data loss and interrupt ongoing business activities.

Active - passive topology diagram.

This approach is typically used when there is a need for data recovery or business continuity because the secondary Event Streams cluster provides a backup to the primary cluster if there is a disaster or outage. Unidirectional replication ensures that the secondary cluster has a copy of the data, but this Event Streams instance is not directly accessed by producer and consumer applications.

If there is a failure in the active cluster, the passive cluster can be activated to take over processing of data. To implement an Active-Passive topology, configure MirrorMaker 2.0 on the Red Hat OpenShift Container Platform cluster that hosts the passive cluster, so that data from the primary (active) cluster is copied to the secondary cluster.

In an Active-Passive topology, the primary Event Streams cluster, Cluster-1, is hosting Topic-A so all producer and consumer applications for Topic-A are connected to Cluster-1.

Cluster-2 maintains a backup of the data from Cluster-1. This backup is ready so that, if there is an outage of the active cluster, the system can be restarted with little data loss. The backup cluster is not accessed directly by any of the producer or consumer applications.

Active passive with application.

If there is a business need to always have access to the data, some applications can be connected to the secondary cluster during an outage to access the backed-up data while the primary cluster might be inaccessible.

The number of Event Streams MirrorMaker 2.0 instances that are needed depends mainly on the requirements of your DR solution. For example:

  • How much data are you putting through your primary cluster?
  • How critical is the access to each topic?
  • How much tolerance is there for the replication lag across your system?
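The throughput and replication lag questions above can be turned into rough numbers. The following is a minimal sketch in Python, not a sizing tool: the function names and all the rates are hypothetical, and it assumes a steady produce rate and a steady mirroring rate.

```python
from typing import Optional


def records_at_risk(produce_rate_per_s: float, replication_lag_s: float) -> float:
    """Approximate number of records not yet mirrored when the active cluster fails.

    Assumption: producers write at a steady rate and the passive cluster
    trails the active cluster by a constant replication lag.
    """
    return produce_rate_per_s * replication_lag_s


def catch_up_seconds(
    backlog_records: float,
    mirror_rate_per_s: float,
    produce_rate_per_s: float,
) -> Optional[float]:
    """Time needed to clear a backlog while producers keep writing.

    Returns None when mirroring is not faster than producing, because
    the backlog then never shrinks.
    """
    headroom = mirror_rate_per_s - produce_rate_per_s
    if headroom <= 0:
        return None
    return backlog_records / headroom


if __name__ == "__main__":
    # Hypothetical figures: 2000 records/s produced, 5 s of replication lag.
    backlog = records_at_risk(2000, 5)
    print("records at risk:", backlog)
    print("catch-up time (s):", catch_up_seconds(backlog, 3000, 2000))
```

If the estimated catch-up time or the number of records at risk exceeds what the business can tolerate, that is one indication that more mirroring capacity might be needed for the affected topics.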

For more information about how to set up MirrorMaker 2.0 on your cluster, see the article Demystifying Kafka MirrorMaker 2: Use cases and architecture.

Active-Active topology

In the Active-Active topology, there are two Event Streams clusters that are both active, and applications produce to and consume from both clusters. The Active-Active topology can be useful for scenarios where high availability across different geographical regions and having a cluster always active are critical for business continuity.

MirrorMaker 2.0 is configured in both directions between the two active clusters. Each cluster processes data independently for the applications that are directly linked to it, while also serving as a backup for the other cluster. This helps to ensure that a failure in one of the clusters does not completely stop business processes. Furthermore, historical data from the disrupted cluster can still be used by applications that are connected to the remaining active cluster.

Active - active topology diagram.

In the previous diagram, Topic-A on Cluster-1 is replicated to Cluster-2 as a copy called Cluster-1.Topic-A, and Topic-A on Cluster-2 is replicated to Cluster-1 in the same way.
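The naming of the mirrored copies can be sketched in a few lines of Python. This is an illustration only: the helper names are hypothetical, the cluster aliases are the ones used in this example, and the "." separator is assumed to match the prefixing behavior described above.

```python
from typing import Iterable


def remote_topic_name(source_alias: str, topic: str, separator: str = ".") -> str:
    """Name of the mirrored copy of `topic` on the target cluster.

    Assumption: the copy is prefixed with the alias of the source
    cluster, for example Cluster-1.Topic-A.
    """
    return f"{source_alias}{separator}{topic}"


def is_remote_topic(topic: str, known_aliases: Iterable[str], separator: str = ".") -> bool:
    """True if `topic` looks like a mirrored copy rather than a local topic."""
    return any(topic.startswith(alias + separator) for alias in known_aliases)


if __name__ == "__main__":
    aliases = ["Cluster-1", "Cluster-2"]
    # Topic-A on Cluster-1 appears on Cluster-2 under this name.
    print(remote_topic_name("Cluster-1", "Topic-A"))
    # Topic-A on Cluster-2 appears on Cluster-1 under this name.
    print(remote_topic_name("Cluster-2", "Topic-A"))
    print(is_remote_topic("Cluster-1.Topic-A", aliases))
    print(is_remote_topic("Topic-A", aliases))
```

Because the copy carries the name of its source cluster, an application can tell where a record was originally produced, and a mirrored topic is not confused with the local topic of the same base name.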

With the Active-Active topology, you can achieve:

  • Availability of data across different geographies
  • Lower latency by connecting clients to a more local Event Streams cluster
  • Disruption isolation by distributing data processing across multiple clusters in different availability zones (for more information, see considerations for multizone deployments).

Further to the DR capabilities offered by the Active-Active topology, a consumer application that is connected to Cluster-1 can be configured to read data that is produced to different clusters as if it were coming from one topic. You can achieve this by setting the topics list to use a name with a wildcard character like *Topic-1. This means that the client reads from both Topic-1 and Cluster-2.Topic-1 and all messages are processed regardless of where they were originally produced.
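Kafka consumer clients generally express this kind of subscription as a regular expression, so the wildcard is usually written in a form such as .*Topic-1. The following Python sketch only shows which topic names two candidate patterns would select. It does not connect to a cluster, the extra topic names are hypothetical, and it assumes that the client matches the pattern against the whole topic name.

```python
import re
from typing import Iterable, List, Pattern

# Loose pattern: any topic name that ends in "Topic-1".
LOOSE = re.compile(r".*Topic-1")

# Stricter pattern: only the local topic and its copy mirrored from Cluster-2.
STRICT = re.compile(r"(Cluster-2\.)?Topic-1")


def matching_topics(pattern: Pattern[str], topics: Iterable[str]) -> List[str]:
    """Return the topics whose full name matches the pattern."""
    return [t for t in topics if pattern.fullmatch(t)]


if __name__ == "__main__":
    # Hypothetical topic list on Cluster-1.
    topics = ["Topic-1", "Cluster-2.Topic-1", "AuditTopic-1", "Topic-10"]
    print("loose: ", matching_topics(LOOSE, topics))
    print("strict:", matching_topics(STRICT, topics))
```

The loose pattern also selects unrelated topics whose names happen to end in Topic-1, so a stricter pattern that names the source cluster alias might be preferable when topic names overlap.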

Single topic across 2 instances diagram.

In the previous diagram, consumers that are connected to Cluster-1 can access the data from Cluster-2 without having to connect directly to Cluster-2.

Active-Active topology can be useful for:

  • Clusters that are in different geographic locations
  • Applications that care about the location where the data comes from
  • Applications that are independent of location and prefer a broader view of the data across different Event Streams clusters.
