DB2 HADR Performance Tuning using
DB2 Log Scanner, HADR Simulator, and HADR Calculator
This article provides a step-by-step guide to HADR performance tuning. We will use DB2 Log Scanner to analyze the database workload, HADR Simulator to measure network and disk speed, and HADR Calculator to estimate the performance of the various HADR sync modes. Use this easy-to-follow four-step procedure to tune your network to its full potential and choose the optimal sync mode for HADR.
For an overall discussion of HADR performance, including performance monitoring and diagnostics, see HADR perf.
Step 1. Scan log files with DB2 Log Scanner
Use DB2 Log Scanner to scan the log files generated in one workload cycle. For example, if your workload cycles daily, scan log files from a 24 hour period. If you have special processing windows such as weekly reporting or a monthly table reorg, scan logs from those windows too.
Ideally, you want to do this with HADR disabled, to establish a baseline for the performance study. But if you already have HADR enabled and your business requires that it stay enabled, you can still scan the logs and explore the various sync modes and parameters.
Scanner command example:
db2logscan_linux S0003158.LOG S0003159.LOG ... S0003214.LOG > daily.scan
You just need to provide the log files as arguments. A single scanner binary can scan files from any DB2 version and any endianness; the scanner auto-detects the endianness and the DB2 version that generated the files. The scanner writes its output to stdout, so we redirect stdout to a file.
To scan a large number of files, you can create a file listing the file names, one per line, then feed it to the scanner via the -f option. If some files need to be retrieved from the archive, use the -retrieve option.
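For example (the list file name "logfiles.txt" is arbitrary):
ls S*.LOG > logfiles.txt
db2logscan_linux -f logfiles.txt > daily.scan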
You need not scan all files in one command. As long as you save the output files, you can scan using multiple invocations of the scanner. However, it is recommended that consecutive files be scanned in one command so that the scanner can keep its context across file boundaries (processing records that span file boundaries and treating the files as one continuous stream) and generate the distribution data for the logging metrics as one set.
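For example, a week of logs could be scanned one day at a time, with each day's consecutive files kept in a single command (the file numbers here are illustrative):
db2logscan_linux S0003158.LOG ... S0003214.LOG > day1.scan
db2logscan_linux S0003215.LOG ... S0003271.LOG > day2.scan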
Please save all output files from the scanner. We will need them in later steps.
Step 2. Measure disk speed with HADR Simulator
Use DB2 HADR Simulator to measure the performance of your logging disks. Disk speed is modeled as:
write_time = per_write_overhead + data_amount / transfer_rate
On a traditional hard drive, the majority of per_write_overhead is spent moving the disk head to a given sector on a given track (seek time). Once the head reaches the sector, data can be read or written at a (more or less) constant transfer rate. On SSD (solid state drive) devices, per_write_overhead is very small or negligible.
We will use a small write and a big write to get the per_write_overhead and transfer rate of a disk. <logPath> in the command is the actual log path of the database. The file "testFile" will be generated by the simulator. The simulator does not automatically remove the file (so that you can examine its content or sector layout on disk after the test run if needed). When you are done, remove this file.
Command for small write (single page write):
simhadr_linux -write <logPath>/testFile -flushSize 1
Sample output:
Total 9409 writes in 4.000160 seconds, 0.000425 sec/write, 1 pages/write
Total 38.539264 MBytes written in 4.000160 seconds. 9.634431 MBytes/sec
This means per_write_overhead is 0.000425 second (the "sec/write" field).
Command for big write (1000 pages per write):
simhadr_linux -write <logPath>/testFile -flushSize 1000
Sample output:
Total 174 writes in 4.014913 seconds, 0.023074 sec/write, 1000 pages/write
Total 712.704000 MBytes written in 4.014913 seconds. 177.514183 MBytes/sec
This means the transfer rate is 177.514183 MBytes/sec (the "MBytes/sec" field).
In step 4, you will use "-disk 177.5 0.000425" as the disk speed option for the HADR calculator, with the transfer rate taken from the 1000-page test and the per_write_overhead from the single-page test.
Theoretically, you could feed the numbers from the two runs into an equation system (linear equation y = a*x + b) to solve for the two unknowns, but the result will be very close to that of the simplified method shown above. The error margin of the input (run-to-run variation) is greater than the additional accuracy gained from the formal method, so there is no need for more complicated math.
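To make the arithmetic concrete, here is the simplified method applied to the sample outputs above. A DB2 log page is 4 KB (consistent with the 1-page run: 38.539264 / 9409 = 0.004096 MBytes per write), so a 1000-page flush is about 4.1 MBytes:
per_write_overhead = 0.000425 sec (sec/write from the 1-page run; the 4 KB transfer itself adds only about 0.00002 sec)
transfer_rate = 4.096 MBytes / 0.023074 sec = 177.5 MBytes/sec (from the 1000-page run)
Plugging these back into the model predicts 0.000425 + 4.096 / 177.5 = 0.0235 sec for a 1000-page write, within about 2% of the measured 0.023074 sec/write, which is why the formal equation system is not worth the extra effort.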
Disk read and write speeds are usually close, and we are mostly concerned with write speed, so only the write test is needed.
Run the test only when the database is offline or when the logging load is zero or very low, as the test stresses the logging disk and will slow down database logging.
Do the test on both the HADR primary and standby hosts. It is recommended that primary and standby have the same hardware. The HADR calculator assumes the same disk speed on primary and standby and accepts only one set of disk speed numbers as an argument; the test on the primary and standby disks is intended to verify that the disks do indeed have the same speed. If different disks are used, give the slower speed to the HADR calculator.
Step 3. Measure network speed with HADR Simulator
In this step, you will measure the network speed between the HADR primary and standby. Network speed is specified by two numbers: send rate and round trip time. Round trip time can easily be measured with the "ping" command. Send rate may be sensitive to the socket buffer size, so a special procedure is used for the measurement; see Using HADR simulator to determine optimal TCP window size for detailed instructions. This step only gives a tentative socket buffer size based on TCP requirements. The final socket buffer size also depends on the workload and flush size, and will be determined in step 4. Remember to configure DB2 with the chosen socket buffer size; otherwise, DB2 won't get the send rate measured in the simulator test. See TCP Tuning for more info.
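For example, the round trip time can be sampled with a plain ping from the primary host to the standby host (the hostname below is a placeholder; use the average round trip time reported):
ping -c 20 standby_host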
In step 4, use "-network <send_rate> <round_trip_time>" as the network speed option for the HADR calculator.
Step 4. Estimate sync mode performance with HADR Calculator
Now feed the scanner output from step 1 into the HADR calculator. Example:
hadrCalculator.pl -disk 200 0.001 -network 10 0.1 <scanOutFile1> ... <scanOutFileN> > daily.scan.calc
The calculator writes its output to stdout, so we redirect stdout to a file ("daily.scan.calc" in the example).
Now look at the calculator output and search for question marks ("?"). The calculator uses ?, ??, and ??? to indicate small, medium, and heavy impact to applications when HADR is enabled. If no question mark is found, all is good (expect no impact to applications at all). You may also experiment with hypothetical disk or network speeds to see the effect of a disk or network upgrade, as shown below.
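For example, to estimate the effect of a network upgrade, rerun the calculator on the same scan files with a hypothetical send rate (the 100 MBytes/sec figure and the output file name are made up for illustration):
hadrCalculator.pl -disk 200 0.001 -network 100 0.1 <scanOutFile1> ... <scanOutFileN> > daily.scan.calc.fastnet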
Once you have chosen a sync mode, find the max flush size section at the end of the calculator output:
SYNC Max flush size: predicted 360 pages, workload max 1126 pages
NEARSYNC Max flush size: predicted 360 pages, workload max 1126 pages
ASYNC Max flush size: predicted 17 pages, workload max 1126 pages
The HADR socket buffer size should be at least the "predicted" size for your chosen sync mode; this is the expected flush size for that sync mode. Due to variations in the workload, a larger size (such as the workload max, which occurs when log writing is slower than predicted) is preferred. Both the predicted and workload max flush sizes are computed from the average transaction size of the log rate sample intervals, and the actual workload may have peaks above the average. Thus a number even higher than the workload max is preferred, as long as the size is reasonable (no more than 32 MB). In the above example, a 2000-page (8 MB) socket buffer size is recommended.
Some systems may not perform well with a large (over 32 MB) socket buffer, so it is recommended to cap the size at 32 MB. Most databases won't hit this cap; 32 MB (8000 pages) is a very large flush size.
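Once you have settled on a size, configure DB2 with it. A minimal sketch, assuming the 2000-page (roughly 8 MB) size from the example above and the DB2_HADR_SOSNDBUF / DB2_HADR_SORCVBUF registry variables (values are in bytes; verify the variable names and units for your DB2 version, and note that registry changes normally take effect at the next instance restart):
db2set DB2_HADR_SOSNDBUF=8192000
db2set DB2_HADR_SORCVBUF=8192000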
For verification, run the HADR simulator with the chosen sync mode, socket buffer size, disk speed, and predicted flush size to confirm that the system will perform as expected (compare the throughput reported by the simulator to the line with the same flush size in the calculator output), as sketched below. Even though this is a simulation, the network is stressed with real data. A simulator run with the final parameters is strongly recommended before applying the parameters to the database.
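A sketch of such a verification run, assuming the NEARSYNC numbers from the example above (predicted flush size 360 pages, 8 MB socket buffers, the measured disk speed, and placeholder hostnames and port). The flag names -role, -lhost, -lport, -rhost, -rport, -syncmode, -disk, -sockSndBuf, and -sockRcvBuf are assumptions about the simulator's network-test options; check the usage text printed by simhadr itself for the exact syntax:
On the standby host: simhadr_linux -role standby -lhost standby_host -lport 4000 -rhost primary_host -rport 4000 -sockRcvBuf 8192000
On the primary host: simhadr_linux -role primary -lhost primary_host -lport 4000 -rhost standby_host -rport 4000 -syncmode NEARSYNC -flushSize 360 -disk 177.5 0.000425 -sockSndBuf 8192000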