HADR FAQ

HADR Setup

Is redirected restore supported when initializing the HADR standby?

Redirected restore is not supported. "Redirected" here means redirecting tablespace containers, as in a "restore ... redirect" command.

Database directory ("restore ... to ...") and log directory ("restore ... newlogpath") changes are supported. When the database directory changes, table space containers created with relative paths are restored to paths relative to the new database directory.

The "restore ... on ... [dbpath on ...]" command is also supported, even though it can be considered "implicit" redirection.

Should I start the primary first or the standby first?

From Info Center, Managing HADR, initializing HADR:

Usually, the standby database is started first. If you start the primary database first, this startup procedure will fail if the standby database is not started within the time period specified by the HADR_TIMEOUT database configuration parameter.
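
For example, with a database named HADRDB (illustrative), the usual sequence is:

        On the standby machine:  db2 start hadr on db HADRDB as standby
        On the primary machine:  db2 start hadr on db HADRDB as primary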

How do I set up automated failover when there are multiple databases in an instance?

To configure two databases for automated failover, you need four separate XML files. Here is the sequence in which you would run db2haicu to set this up:

Setting up for database 1:
1) Run db2haicu on the standby (XML file A)
2) Run db2haicu on the primary (XML file B)

Setting up for database 2:
3) Run db2haicu on the standby (XML file C)
4) Run db2haicu on the primary (XML file D)
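
Each step is simply an invocation of db2haicu with the corresponding XML input file, run as the instance owner on the indicated host (the file names below are illustrative):

        db2haicu -f db1_standby.xml    (step 1, XML file A, on the standby)
        db2haicu -f db1_primary.xml    (step 2, XML file B, on the primary)
        db2haicu -f db2_standby.xml    (step 3, XML file C, on the standby)
        db2haicu -f db2_primary.xml    (step 4, XML file D, on the primary)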

HADR Configuration

See also HADR config on HADR syncmode and other parameters.

Which HADR synchronization mode shall I use?

See HADR sync mode

Are HADR database configuration parameters dynamic?

See Updating HADR Configuration

Are HADR registry variables dynamic?

No. All HADR-related registry variables require a DB2 instance shutdown and restart to take effect. The HADR registry variables have the DB2_HADR_ prefix. Examples: DB2_HADR_PEER_WAIT_LIMIT, DB2_HADR_SOSNDBUF, and DB2_HADR_SORCVBUF.
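
For example, changing DB2_HADR_PEER_WAIT_LIMIT (the value below is illustrative) only takes effect after the instance is restarted:

        db2set DB2_HADR_PEER_WAIT_LIMIT=120
        db2stop
        db2start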

Should I set DB2_HADR_NO_IP_CHECK on one or both databases?

DB2_HADR_NO_IP_CHECK should be set on both primary and standby instances, even if only one of them is under NAT (Network Address Translation) control.
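
For example, run the following on each instance (primary and standby) and restart the instance for the change to take effect:

        db2set DB2_HADR_NO_IP_CHECK=ON
        db2stop
        db2start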

Log File Management

How does the primary manage its log files?

Log file management on the primary database is identical to that of a standard (non-HADR) database.

How does the standby manage its log files?

See Log File Management on Standby

Can I use log retain on HADR databases?

Yes. See also Log File Management on Standby

If archiving is not enabled on the primary (that is, it is using logretain), then the standby will not reclaim any log files. All files will be kept on both the primary and the standby, and users are responsible for managing them. Generally, files older than the "First active log file" reported by "get db cfg" can be removed. The first active log file should be read and used for file management on the primary and the standby separately; the standby's first active log file can be older than the primary's because replay on the standby can fall behind the primary's transaction processing.
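
For example (the database name is illustrative), the first active log file can be checked on the primary and on the standby with:

        db2 get db cfg for HADRDB | grep "First active log file"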

The standby never archives log files, so using logretain on the standby has no impact on HADR log file operations on the standby.

What are the pros and cons of using shared log archive between the primary and standby?

See LOGARCHMETH1 and LOGARCHMETH2 Log Archive Device

What is the recommendation on using log archive compression with HADR?

The DB2 log archive compression feature (enabled via the database configuration parameter logarchcompr1 or logarchcompr2) compresses log files when they are archived. Compression happens when a log file is sent to the log archive. Log files in the active log path are not compressed, and neither is the log data shipped from the HADR primary to the standby. So normally, compression has little effect on HADR.

Log archive compression can affect HADR when the standby falls significantly behind the primary (e.g., due to a planned or unplanned outage of the standby). When the standby is reactivated, it needs to replay some relatively old log files to catch up, and those files are likely to be in the archive. Unless a shared archive is configured, the primary will need to retrieve, uncompress, and send the files to the standby. This adds CPU cost on the primary and causes some delay in HADR catchup. Even if the standby can retrieve the files directly via a shared archive, it will still need to uncompress them, at CPU cost on the standby.

The same log archive compression configuration is recommended on the HADR primary and standby, although it can be configured independently on each. A database (primary, standby, or standard) can always uncompress log files, regardless of its current compression setting, so a standby has no problem reading compressed files from a shared archive.
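
For example (the database name is illustrative), to enable compression for the primary archive method, run the following on both the primary and the standby:

        db2 update db cfg for HADRDB using LOGARCHCOMPR1 ON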

The decision to use log archive compression should be made based on benefit versus cost, as if HADR were not being used. The benefit is space savings in the archive; the cost is CPU time to compress and uncompress the files. The HADR complication of the standby falling significantly behind is relatively rare, and it is troublesome with or without log archive compression. When it happens, you may consider reinitializing the standby from a recent backup of the primary, rather than going through a long catchup. If you choose to go through a long catchup, you may retrieve the needed log files to the overflow log path in advance to speed up the catchup.

Replicating Data

What is replicated, what is not?

See What's replicated and what's not

How is LOAD handled by HADR?

HADR replicates load with certain restrictions. See the DB2 Info Center topic "Load operations and HADR".

See also Replaying Load

What is the difference and impact to HADR when using IMPORT, INGEST, and LOAD?

There is a significant difference between using IMPORT or INGEST and using LOAD:

  • IMPORT and INGEST log all data into the transaction logs, so the data is always replicated to the standby by HADR.
  • LOAD does not log the data into the transaction logs.
    • Non-recoverable load should not be used in an HADR environment, unless you can conveniently rerun the load to populate the table after a failover.
    • Recoverable load generates a load copy image. To retrieve the data for replay, the HADR standby needs to be able to access this load copy image.
      • A shared device such as TSM or NFS is recommended for the load copy (see the example after this list).
      • Without a shared device, the procedure is not really practical: you will need to shut down the standby, load on the primary, transfer the load copy image to the standby, then start the standby again to replay the load.
      • Even with a shared device, the standby becomes dependent on the NFS or TSM server:
        • If the service is not available at replay time, the standby will not be able to access the image.
        • Standby replay will not wait; it will just mark the tablespace as roll-forward pending and continue replay.
        • The user will have to detect this and repair the tablespace (by taking a new backup on the primary and restoring it onto the standby).
        • The impact of a replay failure can be quite significant.
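
As an illustration of the recommended approach (the table name, file name, and paths below are hypothetical), a recoverable load writes its load copy image to a location that the standby can also reach, such as an NFS mount visible to both machines:

        db2 load from data.del of del insert into mytable copy yes to /sharednfs/loadcopy

During replay, the standby retrieves the image from the same path; if it cannot, the affected tablespace becomes unusable as described above.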



Thus if the performance of INGEST is adequate, it is preferred. Things to consider when using IMPORT or INGEST:

  • They can produce a much higher logging rate on the primary (more so for INGEST).
  • This could mean the primary logging rate is higher than the HADR log shipping speed or the standby log replay speed.
    • This could potentially cause congestion that affects primary transaction processing.
    • If congestion is avoided by using a large standby receive buffer or log spooling, standby log replay could still fall behind the primary, at least temporarily.
      • This would lead to a long takeover time, should there be a primary outage while standby replay is behind.
  • You should experiment to see what kind of logging rate you get when running INGEST (see the example below).
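
As a point of comparison (the table and file names are hypothetical), the fully logged alternatives look like this; both write all row data to the transaction logs and are therefore replicated by HADR:

        db2 import from data.del of del insert into mytable
        db2 ingest from file data.del format delimited insert into mytable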

How is Blob/Clob/DBClob handled in HADR?

The DB2 LOB types include Blob, Clob, and DBClob. A LOB column can be declared as logged or not logged. For HADR, a LOB column is replicated if and only if it is logged. For a not logged LOB, the standby will still allocate space for it; the LOB will have the right size, but the content will be binary zeros.
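
For example (the table and column names are hypothetical), the logging attribute is chosen per LOB column at table creation; in the table below, only the summary column would be replicated by HADR, while body would be allocated on the standby with binary-zero content:

        db2 "create table docs (id int not null, summary clob(1m) logged, body clob(100m) not logged)"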

What happens if the standby encounters a replay error?

For fatal errors, the standby database will shut itself down. The DBA should identify the cause (read db2diag.log, etc.), fix the problem, then reactivate the standby.

Replay errors on a tablespace may take the tablespace offline but leave the database online. Subsequent replay will skip this tablespace. At the database level, the standby may look healthy, say, in peer state with a small log gap, yet it can have one or more tablespaces offline. These tablespaces will not be accessible upon failover.

Therefore it is very important to also monitor tablespace state on the standby. You can use the "db2pd -tablespaces" option to monitor the standby.

Using HADR commands

See also HADR commands

Do HADR commands use db name or db alias to address databases? ("loopback" catalog not supported)

HADR commands (start hadr, stop hadr, takeover hadr) use db aliases to address the database. Therefore, you can issue the commands from the database server machine or from a client machine, as long as the alias points to the right database.

However, there is a limitation for the "start hadr ... as standby" command. On the server machine hosting the standby database, the database catalog must have an entry with the same db alias and db name. If you do "db2 list db directory", you should see something like:

         Database alias                       = FOO
         Database name                        = FOO
         Local database directory             = /work1/databases
         Database release level               = a.00
         Comment                              =
         Directory entry type                 = Indirect
         Catalog database partition number    = 0
         Alternate server hostname            =
         Alternate server port number         =
        

Users do not usually run into this limitation. Such an entry is created automatically by "create db" or "restore db". It is missing only in unusual cases such as a "loopback" catalog.

What database aliases shall I use for HADR commands when automatic client reroute is enabled?

Do not use ACR enabled database aliases in HADR admin commands. See HADR Admin Commands

How come the standby fails to start, but the "Start HADR as standby" command still succeeds?

When the "start HADR on database ... as standby" command is issued, the command returns as soon as the relevant EDUs are successfully started. It does not wait for connection to primary (In contrast, with the exception of "start HADR on db ... as primary by force ", the primary database is not considered started until it connects to a standby database). If the standby database encounters an error such as connection rejected by the primary, the "start" command may still return success. Many replay errors will also bring down the standby database. Thus the status of the standby should be continuously monitored.

The same behavior applies to the "activate database" command on the standby.

This behavior applies to primary reintegration too: a successful return from the "start ... as standby" command does not indicate that reintegration succeeded. See also Primary Reintegration
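
Because a successful return does not guarantee a healthy standby, check the standby after starting it, for example with (the database name is illustrative):

        db2pd -db HADRDB -hadr

Fields such as HADR_CONNECT_STATUS and HADR_STATE show whether the standby has connected to the primary and which HADR state it has reached.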

Monitoring HADR

See also HADR monitoring

How do I tell the role of a database when it is offline?

Check the database configuration parameter "HADR database role". It is one of PRIMARY, STANDBY, or STANDARD (not an HADR database). When the database is online, you can also find the role in the "HADR_ROLE" field of the monitoring interfaces.
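
For example (the database name is illustrative), the role of an offline database can be checked without a connection:

        db2 get db cfg for HADRDB | grep "HADR database role"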

Why is the HADR state reported in db2diag.log different from the state reported in monitoring interfaces?

See HADR State Change

Why do I see the "This operation is not allowed on a standby database" message in db2diag.log?

This message is usually caused by a client (or the client reroute mechanism) trying to connect to the standby database. This message and the subsequent "sqleserl, probe:77" message can usually be ignored.

How do I monitor tablespace status on the standby?

The DB2 CLP command "list tablespaces" requires a database connection, but the standby database does not accept client connections unless reads on standby is enabled. You can use the db2pd command to monitor tablespaces on the standby: "db2pd -tablespaces -db <dbName>". See also Standby Tablespace Status

How can I determine if there are any unavailable tables on HADR standby database?

It is routine practice to check whether any database objects, such as tables, are unavailable on the HADR standby prior to issuing a takeover. For example, you might check if there are any tables that are unavailable because they are in a drop-pending state. Tables in an unavailable state can be caused by not logged operations on the primary, such as NLI (not logged initially) operations or a non-recoverable load.

To determine if there are any tables in drop-pending state on the standby, you need to use both the db2dart and db2pd commands on the standby. You need both commands because each provides a potentially incomplete picture of the table state: db2dart reads only changes that are on disk, and db2pd can report only information about tables that are loaded into memory.

See this technote for detailed steps to check for unavailable tables in the database on HADR standby.

Automated Failover

How do I configure automated failover with PowerHA?

Please refer to section 7.5 "Automating HADR takeover with PowerHA" of the following redbook:
http://www.redbooks.ibm.com/redbooks/pdfs/sg247363.pdf

Miscellaneous

How do I find out which HADR features were added in which DB2 release?

See feature history

Does HADR support compression in log shipping?

No. HADR sends log data over TCP as is. If compression is needed, an external hardware or software solution can be used. Note that if row compression or columnar compression is enabled, the data embedded in the log stream is already compressed, so additional compression at or below the TCP level may not result in a high compression ratio.

DB2 "log archive compression" feature only compresses a log file when it is archived. It has no effect on HADR log shipping. See also What is the recommendation on using log archive compression with HADR?

Does HADR support encryption in log shipping? What if I have "native database encryption" enabled?

No. HADR sends log data over TCP as is. The HADR primary and standby are usually connected via a private network, for security and performance reasons. If encryption is needed, an external method such as a VPN can be used.

DB2 native database encryption encrypts only data at rest (i.e., on disk). Data in memory and in transit (such as client/server communication and HADR primary/standby communication) is not encrypted. Search for "Encrypt a database in an HADR environment" in the IBM Knowledge Center for an example of managing HADR in an encrypted database environment.

Primary and standby encryption configurations are independent. You can have both unencrypted, both encrypted, or one encrypted and the other not. In most business environments, you will want them to have the same encryption policy. HADR prints a warning message to the DB2 admin log when it detects that the primary and standby have different policies.