The db2diag.log file is an important DB2 diagnostic tool. It records the history of events in DB2. The db2diag tool can be used to parse and filter the db2diag.log file. This page describes the db2diag.log messages relevant to HADR.
One or more threads in the database engine work on HADR tasks. A thread is also known as an EDU (Engine Dispatching Unit). You can use the "db2pd -edus" command to list threads. The HADR threads in a primary database are named "db2hadrp". Those in a standby database are named "db2hadrs". Upon a role change ("takeover" command), the threads are renamed to match the new role.
Log Stream id and Standby id
In V10.1 and later, the log stream id and standby id are appended to the HADR EDU name to support pureScale and multiple standbys. The standby id is always zero for db2hadrs EDUs because the standby id is assigned by the primary and does not propagate to the standby (similarly, the STANDBY_ID monitor field is always zero when the monitor query is issued on a standby). For example, the V10.1 sample messages on this page were written by db2hadrp.0.1 on the primary and db2hadrs.0.0 on the standby.
The "EDUNAME" field in a db2diag.log message shows the EDU that wrote the message. The "DB" field shows the database from which the message comes. The DB name is also embedded in the EDUNAME field in parentheses. When looking at messages, be sure not to confuse messages from different databases, standby ids, and stream ids. The db2diag tool can be used to parse and filter the db2diag.log file.
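To keep messages from different databases, streams, and standbys apart, the EDUNAME value can be split into its parts. The following is a minimal Python sketch, not an IBM tool. It assumes the suffix order is stream id first, then standby id (as the sentence order above suggests), and the names `HadrEdu` and `parse_hadr_eduname` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class HadrEdu(NamedTuple):
    role: str                  # "primary" (db2hadrp) or "standby" (db2hadrs)
    stream_id: Optional[int]   # None for names without ids (V9.7 style)
    standby_id: Optional[int]  # None for names without ids (V9.7 style)
    db_name: str


# Assumed layout: db2hadrp or db2hadrs, an optional ".<stream id>.<standby id>"
# suffix, then the database name in parentheses.
_EDU_RE = re.compile(r"^db2hadr([ps])(?:\.(\d+)\.(\d+))?\s*\((\w+)\)$")


def parse_hadr_eduname(eduname: str) -> Optional[HadrEdu]:
    """Return the parts of an HADR EDUNAME value, or None for other EDUs."""
    m = _EDU_RE.match(eduname.strip())
    if m is None:
        return None
    role = "primary" if m.group(1) == "p" else "standby"
    stream_id = int(m.group(2)) if m.group(2) is not None else None
    standby_id = int(m.group(3)) if m.group(3) is not None else None
    return HadrEdu(role, stream_id, standby_id, m.group(4))
```

Messages written by other EDUs, such as db2agent, return None and need to be matched by the "DB" field instead.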
The HADR role is changed by an HADR command (start/stop/takeover HADR; see HADR commands). In db2diag.log, look for the keyword hdrSetDbRole in V9.7 and earlier, or hdrSetDbRoleAndDbType in V10.1 and later. Sample messages:
V9.7
The "start HADR as primary" command results in a "standard" to "primary" role change:
2013-10-07-14.40.35.450719-420 E68393E386 LEVEL: Event
PID : 4737 TID : 46914980538688 PROC : db2sysc
INSTANCE: zhuge NODE : 000 DB : HADRDB
EDUID : 524 EDUNAME: db2hadrp (HADRDB)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrSetDbRole, probe:10010
CHANGE : HADR role set to Primary (was Standard)
V10.1
A role switch results in a "primary" to "standby" change (shown below) and a "standby" to "primary" change (not shown):
2013-10-07-14.26.41.918203-420 E132007E1255 LEVEL: Event
PID : 1562 TID : 46916419184960 PROC : db2sysc
INSTANCE: zhuge NODE : 000 DB : HADRDB
APPHDL : 0-101 APPID: *LOCAL.DB2.131007212657
HOSTNAME: zeeman
EDUID : 468 EDUNAME: db2agent (HADRDB)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrSetDbRoleAndDbType, probe:10020
CHANGE : HADR DATABASE ROLE/TYPE -
HADR database role set to STANDBY (was PRIMARY).
HADR database type set to PHYSICAL (was PHYSICAL).
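The keyword search described above can be scripted. The following Python sketch is an illustration under stated assumptions, not part of DB2: it assumes every record starts with a timestamp line like the samples above, and the function names `split_records` and `role_changes` are hypothetical.

```python
import re

# Assumption: each db2diag.log record begins with a timestamp line such as
# "2013-10-07-14.40.35.450719-420 E68393E386 LEVEL: Event".
_TS = re.compile(r"^\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}")
# Function names as they appear in the FUNCTION field (trailing comma included
# so that hdrSetDbRole does not match longer names by accident).
_ROLE_FUNCS = ("hdrSetDbRole,", "hdrSetDbRoleAndDbType,")
_ROLE_RE = re.compile(r"role set to (\w+) \(was (\w+)\)")


def split_records(text):
    """Group non-empty log lines into records, one per timestamp line."""
    records, current = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if _TS.match(line) and current:
            records.append(current)
            current = []
        current.append(line)
    if current:
        records.append(current)
    return records


def role_changes(text):
    """Return (timestamp, new role, old role) for each role change record."""
    found = []
    for rec in split_records(text):
        is_role_record = any(
            line.startswith("FUNCTION") and any(f in line for f in _ROLE_FUNCS)
            for line in rec
        )
        if not is_role_record:
            continue
        for line in rec:
            m = _ROLE_RE.search(line)
            if m:
                found.append((rec[0].split()[0], m.group(1), m.group(2)))
    return found
```

Because the pattern looks for "role set to", the V10.1 "type set to" line in the same record is ignored.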
Internal vs. External HADR States
The HADR states shown in db2diag.log are internal states. They have finer granularity than the external states reported by the monitor interfaces. The internal state indicates what the local database (primary or standby) is doing, while the external state indicates what the HADR pair is doing. Internal states have a P_ or S_ prefix. The internal states are:
For more information on HADR states, see HADR log shipping.
Sample Messages from V9.7
Entering Peer state:
2013-10-07-14.40.36.401196-420 E76255E392 LEVEL: Event
PID : 4737 TID : 46914980538688 PROC : db2sysc
INSTANCE: zhuge NODE : 000 DB : HADRDB
EDUID : 524 EDUNAME: db2hadrp (HADRDB)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrSetHdrState, probe:10000
CHANGE : HADR state set to P-Peer (was P-NearlyPeer)
During a role switch, the old primary changes from P-Peer to S-Peer state. There is a matching primary to standby role change message (not shown).
2013-10-07-14.40.50.329996-420 E81365E448 LEVEL: Event
PID : 4737 TID : 46914993121600 PROC : db2sysc
INSTANCE: zhuge NODE : 000 DB : HADRDB
APPHDL : 0-113 APPID: *LOCAL.DB2.131007214120
EDUID : 553 EDUNAME: db2agent (HADRDB)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrSetHdrState, probe:10000
CHANGE : HADR state set to S-Peer (was P-Peer)
Sample Messages from V10.1
Entering Peer state:
2013-10-07-14.26.24.033020-420 E126131E433 LEVEL: Event
PID : 1562 TID : 46916289161536 PROC : db2sysc
INSTANCE: zhuge NODE : 000 DB : HADRDB
HOSTNAME: zeeman
EDUID : 453 EDUNAME: db2hadrp.0.1 (HADRDB)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrSetHdrState, probe:10000
CHANGE : HADR state set to HDR_P_PEER (was HDR_P_NPEER), connId=1
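The state change lines in the V9.7 and V10.1 samples share the same "set to X (was Y)" shape, so one pattern can read both. The following Python sketch assumes only the formats shown in the samples above. The names `StateChange`, `parse_state_change`, and `side` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class StateChange(NamedTuple):
    new_state: str
    old_state: str
    conn_id: Optional[int]  # only present in V10.1 and later messages


_STATE_RE = re.compile(
    r"HADR state set to (\S+) \(was ([^)]+)\)(?:, connId=(\d+))?"
)


def parse_state_change(line: str) -> Optional[StateChange]:
    """Parse a CHANGE line written by hdrSetHdrState, or return None."""
    m = _STATE_RE.search(line)
    if m is None:
        return None
    conn_id = int(m.group(3)) if m.group(3) is not None else None
    return StateChange(m.group(1), m.group(2), conn_id)


def side(state: str) -> Optional[str]:
    """Return "P" or "S" from the state prefix, e.g. P-Peer or HDR_P_PEER."""
    name = state[4:] if state.startswith("HDR_") else state
    if len(name) > 1 and name[0] in "PS" and name[1] in "-_":
        return name[0]
    return None
```

A change in `side` between the old and new state, as in the S-Peer (was P-Peer) sample, marks a role switch rather than an ordinary state transition.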
In V10.1 and later, each time the primary and standby establish a connection, a connection id (connId) is assigned. The connection id is shared by the primary and standby (they negotiate a common id for each connection). The connId is included in all HADR state change messages. You can use this id to match messages on the primary and standby. For example, the entering peer messages on the primary and standby that share the same connId are from the same peer session. Previously, you could only use timestamps to match primary and standby messages, which is not reliable, especially when the primary and standby hosts have clock skew. Sample messages:
Primary side:
2013-10-07-14.26.23.545742-420 I119733E421 LEVEL: Info
PID : 1562 TID : 46916289161536 PROC : db2sysc
INSTANCE: zhuge NODE : 000 DB : HADRDB
HOSTNAME: zeeman
EDUID : 453 EDUNAME: db2hadrp.0.1 (HADRDB)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrHandleHsAck, probe:30440
DATA #1 : <preformatted>
Connection succeeded, connId=1
Standby side message for the same round of handshake (connId is also 1):
2013-10-07-14.26.23.551195-420 I48195E422 LEVEL: Info
PID : 2973 TID : 46914393336128 PROC : db2sysc
INSTANCE: zhuge NODE : 000 DB : HADRDB
HOSTNAME: zernike
EDUID : 315 EDUNAME: db2hadrs.0.0 (HADRDB)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrHandleHsAck, probe:30440
DATA #1 : <preformatted>
Connection succeeded, connId=1
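Matching by connId instead of timestamp can be sketched in Python as follows. This is an illustration only: it assumes each record starts with a timestamp line as in the samples, it keys on connId alone (whether that is sufficient with multiple standbys is not stated on this page), and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: record header lines start with a timestamp plus time zone offset,
# e.g. "2013-10-07-14.26.23.545742-420".
_TS = re.compile(r"^(\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}[+-]\d+)")
_CONN = re.compile(r"connId=(\d+)")


def messages_by_conn_id(text, label):
    """Map connId -> [(label, record timestamp, line)] for lines with a connId."""
    out = defaultdict(list)
    timestamp = None
    for raw in text.splitlines():
        line = raw.strip()
        ts_match = _TS.match(line)
        if ts_match:
            timestamp = ts_match.group(1)  # remember the current record header
        conn_match = _CONN.search(line)
        if conn_match:
            out[int(conn_match.group(1))].append((label, timestamp, line))
    return out


def pair_sessions(primary_text, standby_text):
    """Merge primary and standby messages that share the same connId."""
    merged = defaultdict(list)
    for text, label in ((primary_text, "primary"), (standby_text, "standby")):
        for conn_id, msgs in messages_by_conn_id(text, label).items():
            merged[conn_id].extend(msgs)
    return dict(merged)
```

Passing the contents of the primary and standby db2diag.log files to `pair_sessions` groups messages from the same connection together, regardless of clock skew between the hosts.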
You can find the history of start/stop/takeover HADR command invocations in db2diag.log.
In V10.1 and later, the following messages are recorded:
In V9.7 and earlier, the following message is recorded for all variations of this command:
You may use role and state information to determine which variation of the command was issued.
The following message is recorded for this command:
The following message is recorded for this command on the standby's db2diag.log:
On the standby, a message following this message indicates which variation of the takeover command was issued:
On the primary, you can find the following messages: