Automating HADR failover

IBM Tivoli® System Automation for Multiplatforms (SA MP) is the recommended cluster manager for automating HADR failover. On Linux/UNIX platforms, SA MP is integrated in DB2. The "DB2 high availability instance configuration utility" (db2haicu) is the user interface for managing the integrated cluster manager. On Windows, SA MP is not integrated in DB2 and db2haicu is not available. You can still configure SA MP on Windows using scripts, subject to the version restriction described in the "HADR auto failover on Windows" section below.
Starting in Db2 version 11.5.5, Pacemaker is available as an alternative cluster manager for automated failover to an HADR standby on Linux, for on-premises and non-containerized cloud deployments. See the IBM Documentation topic "Pacemaker (Linux)" for prerequisites and usage instructions.

The DB2 Info Center provides a complete reference for using the integrated cluster manager. In the Info Center, navigate from the page "Configuring a clustered environment for high availability", or just search for "db2haicu".

If you would like to use another cluster manager, such as

  • IBM® PowerHA® SystemMirror for AIX® (formerly known as "High Availability Cluster Multi-Processing" for AIX, or HACMP),
  • Microsoft Cluster Server (MSCS), for Windows operating systems, or
  • Sun Cluster or VERITAS Cluster Server, for the Solaris operating system,

see the page "Supported cluster management software" in the Info Center.

See also "Configure multiple HADR databases in a DB2 instance for automated failover using Tivoli System Automation for Multiplatforms" in IBM Community.

Automated failover is often used with automated client reroute. See Client Reroute.

Red Book

http://www.redbooks.ibm.com/redbooks/pdfs/sg247363.pdf

Title: High Availability and Disaster Recovery Options for DB2 for Linux, UNIX, and Windows

White Papers

Using db2haicu with HADR

Automated cluster controlled HADR configuration setup using the IBM DB2 high availability instance configuration utility (db2haicu)
White paper, 58 pages, 2009, Authors: Steve Raspudic, Saurabh Sehgal, Malaravan Ponnuthurai
General usage of db2haicu with HADR.

DB2 system topology and configuration for automated multi-site HA and DR
White paper, 77 pages, 2010, Authors: Aruna De Silva, Steve Raspudic, Sunil Kamath
Focuses on multi-site (remote site for DR purpose) setup.

Recommendations for a TSAMP controlled DB2 HA/HADR environment
Tech note on Tivoli System Automation for Multiplatforms (TSAMP) configuration in a DB2 HA/HADR environment.

Using DB2 High Availability Disaster Recovery with Tivoli Systems Automation and Reliable Scalable Cluster Technology
Tutorial, 2010, Authors: Dipali Kapadia, Michelle Chiu
Focuses on troubleshooting.

Automating HADR on DB2 10.1 for Linux, UNIX and Windows Failover Solution Using Tivoli System Automation for Multiplatforms, Dec 2012, by Steve Raspudic et al. This paper gives two examples of setting up HADR with automation: the first via db2haicu in interactive mode, the second via db2haicu with an XML input file. 2014 revision: http://public.dhe.ibm.com/software/dw/im/dm-0907hadrdb2haicu/db2-10-hadr-tsa.pdf

Using virtual IP addresses with Reads on Standby feature

Enabling continuous access to read on standby databases using Virtual IP addresses
White paper, 12 pages, 2011, Authors: Steve Raspudic, Aruna De Silva, Louise McNicoll.

This paper describes how to set up two VIP addresses with HADR, one for read/write workload on primary database and one for read-only workload on the standby database. VIP is recommended for rerouting clients when reads on standby is enabled. Automatic client reroute should not be used when reads on standby is enabled.

Using db2haicu for shared disk local HA

Automated instance failover using the IBM DB2 High Availability Instance Configuration Utility (db2haicu)
White paper, 69 pages, 2009, Authors: Steve Raspudic, Selvaprabhu Arumuggharaj

db2haicu can be used with or without HADR. This paper covers the shared disk solution for local HA (without HADR).

Note that starting from DB2 V10.1, HADR supports multiple standbys. You can use a local standby for HA and one or more remote standbys for DR. In older releases, you can set up a shared disk solution for local HA and use HADR for DR (standby at a remote site).

HADR auto failover on Windows

On Windows, it used to be possible to implement automated failover using an older version of TSA, but that is no longer possible starting in TSA 4.1, which dropped support for Windows.

Customers looking at automating HADR failover on Windows would likely need to select a different cluster manager product. See the following article on using HADR on Windows with Microsoft Cluster Server.

Set up DB2 for Linux, UNIX, and Windows for high ... - IBM

Setting up TSA on Two Data Centers Using HADR with Multiple Standbys

A common usage scenario has the following environment:

  • Datacenter1 consists of 2 hosts A and B, providing high availability within the data center.
  • Datacenter2 consists of 2 hosts C and D, providing high availability within the data center.
  • Datacenter1 and Datacenter2 are typically some distance apart.  Datacenter2 is for disaster recovery.
  • Initially, the database service is provided on host A in Datacenter1.
  • HADR is initially set up with the primary on A, the principal standby on B, and auxiliary standbys on C and D.


The following procedure sets up TSA in both data centers to handle automated failover. Details of db2haicu usage can be found in the white paper "Automated cluster controlled HADR configuration setup using the IBM DB2 high availability instance configuration utility (db2haicu)".

  1. Run db2haicu on host B to create a domain containing A and B, and to define the tie breaker device, the network and the DB2 instance resources.
  2. Run db2haicu on host A to create the second DB2 instance resource and the HADR resource (after this step, HADR failover automation is enabled in Datacenter1).
  3. Issue a manual graceful takeover on C (after this step, C is the primary, D is its principal standby, and A and B are the auxiliary standbys).
  4. Run db2haicu on D to create a domain containing C and D, and to define the tie breaker device, the network and the DB2 instance resources.
  5. Run db2haicu on C to create the second DB2 instance resource and the HADR resource (after this step, HADR failover automation is enabled in Datacenter2).
  6. Run db2haicu -disable on B.
  7. Run db2haicu -disable on A.
  8. Issue a manual graceful takeover on A (after this step, A is the primary, B is its principal standby, and C and D are the auxiliary standbys).
  9. Run db2haicu on B to re-enable automation.
  10. Run db2haicu on A to re-enable automation.
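
The role changes noted in steps 3 and 8 follow from each database's HADR target list: the first entry in the new primary's list becomes its principal standby, and the remaining entries become auxiliary standbys. The sketch below models only that rule. The target lists are an assumption (each host lists its same-site peer first, host names only, without ports); they are not taken from an actual configuration.

```python
# Assumed hadr_target_list ordering per host (hypothetical values):
# each host names its same-site peer first, then the two remote hosts.
TARGET_LISTS = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "D"],
    "C": ["D", "A", "B"],
    "D": ["C", "A", "B"],
}


def roles_after_takeover(new_primary, target_lists=TARGET_LISTS):
    """Return {host: role} once new_primary has completed a takeover.

    The first host in the new primary's target list is its principal
    standby; every other listed host is an auxiliary standby.
    """
    roles = {new_primary: "primary"}
    for position, host in enumerate(target_lists[new_primary]):
        roles[host] = "principal standby" if position == 0 else "auxiliary standby"
    return roles


if __name__ == "__main__":
    # Initial state, then the takeovers in step 3 (on C) and step 8 (on A).
    for label, host in (("initial", "A"), ("after step 3", "C"), ("after step 8", "A")):
        print(label, roles_after_takeover(host))
```

With these assumed lists, a takeover on C yields D as principal standby and A and B as auxiliary standbys, which matches the state described after step 3.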


Note that the above setup creates two disjoint TSA domains, one in Datacenter1 and the other in Datacenter2.  After the above setup, the HADR system is now enabled with the following:

  • Automated failover in Datacenter1: should there be an outage of A, B automatically becomes the new primary. This results in B as the primary, A as the principal standby, and C and D still as the auxiliary standbys.
  • Automated restart of the DB2 service on all standbys (B, C and D), should there be an outage due to either hardware or software failure. The DB2START command and the ACTIVATE DATABASE command are executed automatically. This brings the standby service up as quickly as possible so that it can form a connection with the primary database.


Procedure for performing maintenance (the key point is that nothing special is needed; the procedure is the same as without TSA):

  • First, apply the change on the auxiliary standbys, which are C and D in Datacenter2.
    • Issue DEACTIVATE DATABASE and DB2STOP.
    • Perform the change (such as a new DB2 fix pack, a new OS, or a hardware change).
    • Issue DB2START and ACTIVATE DATABASE.
  • Apply the change on the principal standby, which is B in Datacenter1.
    • Issue DEACTIVATE DATABASE and DB2STOP.
    • Perform the change (such as a new DB2 fix pack, a new OS, or a hardware change).
    • Issue DB2START and ACTIVATE DATABASE.
  • Issue a graceful takeover on the principal standby, which becomes the new primary.
  • Apply the change on the old primary.
    • Issue DEACTIVATE DATABASE and DB2STOP.
    • Perform the change (such as a new DB2 fix pack, a new OS, or a hardware change).
    • Issue DB2START and ACTIVATE DATABASE.
  • Optionally, issue a graceful takeover to bring the primary back to the original host.
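
The ordering in the maintenance procedure above can be written out as a plan generator. This is a minimal sketch, not a tested runbook: the database name SAMPLE, the function name and the "<apply change>" placeholder are hypothetical, and the command strings are shown for illustration only, they are not executed.

```python
def rolling_maintenance_plan(primary, principal, auxiliaries, db="SAMPLE", fail_back=True):
    """Build the ordered list of (host, action) pairs for a rolling change.

    Order follows the procedure: auxiliary standbys, then the principal
    standby, then a graceful takeover, then the old primary.
    """
    def change(host):
        # Stop the database and instance, apply the change, bring it back.
        return [
            (host, f"db2 deactivate database {db}"),
            (host, "db2stop"),
            (host, "<apply change>"),  # placeholder for fix pack, OS or hardware work
            (host, "db2start"),
            (host, f"db2 activate database {db}"),
        ]

    takeover = f"db2 takeover hadr on database {db}"
    plan = []
    for host in auxiliaries:
        plan += change(host)
    plan += change(principal)
    plan.append((principal, takeover))  # principal standby becomes the new primary
    plan += change(primary)             # old primary is now a standby
    if fail_back:
        plan.append((primary, takeover))  # optional: return primary to original host
    return plan


if __name__ == "__main__":
    for host, action in rolling_maintenance_plan("A", "B", ["C", "D"]):
        print(f"{host}: {action}")
```

Keeping the plan as data makes the ordering easy to review before anything is run on a host.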


If there is a need to perform a graceful takeover from an auxiliary standby (say C in Datacenter2), use the following procedure. Note that steps 1 and 3 are additions to the usual takeover procedure.

  1. First run db2haicu -disable on the auxiliary standbys, which are C and D in Datacenter2.
  2. Issue graceful takeover on the auxiliary standby C.  C becomes primary, with D as principal standby, A and B as auxiliary standbys.
  3. Run db2haicu on C and D to re-enable automation.
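
The difference between a takeover on the principal standby and one on an auxiliary standby can be sketched as a small step builder. It assumes, based on the procedures in this section, that a takeover on the principal standby needs no extra steps, while a takeover on an auxiliary standby is wrapped in db2haicu -disable and a db2haicu re-enable on the hosts of that standby's domain. The function name, the site_peers mapping and the database name SAMPLE are hypothetical.

```python
def graceful_takeover_steps(target, roles, site_peers, db="SAMPLE"):
    """Return ordered (host, command) pairs for a graceful takeover on target.

    roles maps host -> current HADR role; site_peers maps a host to the
    other hosts in its TSA domain.
    """
    role = roles[target]
    takeover = (target, f"db2 takeover hadr on database {db}")
    if role == "principal standby":
        return [takeover]
    if role != "auxiliary standby":
        raise ValueError(f"{target} is {role}; takeover must be issued on a standby")
    domain_hosts = [target] + list(site_peers[target])
    steps = [(host, "db2haicu -disable") for host in domain_hosts]  # step 1
    steps.append(takeover)                                           # step 2
    steps += [(host, "db2haicu") for host in domain_hosts]           # step 3: re-enable
    return steps


if __name__ == "__main__":
    current = {
        "A": "primary",
        "B": "principal standby",
        "C": "auxiliary standby",
        "D": "auxiliary standby",
    }
    peers = {"A": ["B"], "B": ["A"], "C": ["D"], "D": ["C"]}
    for host, command in graceful_takeover_steps("C", current, peers):
        print(f"{host}: {command}")
```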


In the event of a disaster that takes out all of Datacenter1, use the following procedure for manual failover to Datacenter2:

  • Run db2haicu -disable on C and D.
  • Run db2pd -hadr on C and D to see which has more logs.
  • Issue a forced takeover from the one with more logs. Say it is C. This makes C the primary, with D as the principal standby.
  • Run db2haicu on C and D to re-enable automation.
    • At this point, automated failover between C and D is enabled.
  • Later on, when Datacenter1 comes back, A and B will likely need to be reinitialized.
    • In a full disaster, everything is lost, including the TSA domain. Even if the TSA domain is not lost (e.g. when only the network to Datacenter1 is lost and everything within Datacenter1 is fine), use db2haicu to destroy the old TSA domain between A and B. It will be recreated later.
    • Take a new online database backup on C, and restore this backup on A and B.
    • Issue START HADR AS STANDBY on A and B.
    • Issue a manual graceful takeover on A (after this step, A is the primary, B is its principal standby, and C and D are the auxiliary standbys).
    • Run db2haicu on B to create a domain containing A and B, and to define the tie breaker device, the network and the DB2 instance resources.
    • Run db2haicu on A to create the second DB2 instance resource and the HADR resource (after this step, HADR failover automation is enabled in Datacenter1).
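
The "which has more logs" comparison in the failover procedure above can be scripted once the db2pd -hadr output from C and D has been captured. The sketch below assumes the output contains a line of the form "STANDBY_LOG_FILE,PAGE,POS = <file>, <page>, <pos>" and compares the POS value; that field layout and the sample text are assumptions, so check them against the output of your DB2 release before relying on this.

```python
import re

# Assumed layout of the relevant db2pd -hadr line (verify on your release):
#   STANDBY_LOG_FILE,PAGE,POS = S0000012.LOG, 40, 52000000
POS_LINE = re.compile(r"STANDBY_LOG_FILE,PAGE,POS\s*=\s*(\S+),\s*(\d+),\s*(\d+)")


def standby_log_pos(db2pd_output):
    """Extract the standby log position (POS) from captured db2pd text."""
    match = POS_LINE.search(db2pd_output)
    if match is None:
        raise ValueError("no STANDBY_LOG_FILE,PAGE,POS line found")
    return int(match.group(3))


def pick_takeover_host(outputs):
    """Given {host: captured db2pd text}, return the host with the most logs."""
    return max(outputs, key=lambda host: standby_log_pos(outputs[host]))


if __name__ == "__main__":
    # Hypothetical captured output fragments from hosts C and D.
    captured = {
        "C": "STANDBY_LOG_FILE,PAGE,POS = S0000012.LOG, 40, 52000000\n",
        "D": "STANDBY_LOG_FILE,PAGE,POS = S0000012.LOG, 12, 51000000\n",
    }
    print(pick_takeover_host(captured))
```

If both standbys report the same position, max() returns the first host in the mapping, so either choice is equivalent in that case.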

Automation Setup on two Data Centers in an Environment having DB2 HADR with Multiple Standbys

Deploying a two-sites multiple standby cluster with same-site failover automation

DB2 Auxiliary HADR and TSA