Simpana v9 SP7 DDB Baseline

Goal

Determine the maximum throughput of the DDB using our current hardware when doing auxiliary copies.

This will also serve as a baseline to evaluate possible performance loss as the DDB grows over time.

Methodology

While no other jobs are running, a number of already copied jobs will be set to “Re-copy”. Since no data will actually be copied and only hash comparisons will be performed, this ensures the network is not a bottleneck and puts as much stress on the DDB as possible.

The auxiliary copy will be done using 50 streams, which is the maximum number of streams recommended by Commvault for the DDB. Prior testing with 25 and 75 streams on the same type of hardware confirmed that 50 streams yields better results than either.

Hardware used

Source server
- CPU: 2 * quad core @ 1866 MHz, hyper-threading disabled
- RAM: 16 GB
- LAN: 1 Gbit/s
- Maglib: 8 arrays of 12 * 500 GB 7200 rpm drives

Destination server
- CPU: 2 * quad core @ 2533 MHz, hyper-threading enabled
- RAM: 16 GB
- LAN: 1 Gbit/s
- Maglib: 2 arrays of 12 * 2 TB 7200 rpm drives
- DDB: Fusion IO card

The network link between both sites is a 1 Gbit/s dark fiber, throttled within Commvault at 500 Mbit/s.

Software configuration

Source media agent
- Optimized for concurrent LAN backups is enabled
- DataMoverUseLookAheadLinkReader is enabled

Destination media agent
- Optimized for concurrent LAN backups is NOT enabled
- Network throttling set to 500 Mbit/s in each direction with the source media agent

Storage policy copy
- Secondary copy points to a global deduplication policy hosted by the destination media agent
- Secondary copy is set to use DASH copy, disk read optimized

Auxiliary copy results summary

Total data size: 14 TB
Data over network: 132 GB
Job duration: 3 hours 42 minutes
Job throughput: 3.8 TB/hr
Max throughput*: 8.2 TB/hr

*Maximum throughput was estimated by looking at the streams tab and manually adding the throughput of each stream; I am not sure how precise the numbers in that tab are. Job throughput (the average) is much lower because the maximum is only reached when close to 50 streams are active: a lot of speed is lost during job initialization and again near the end, when only a few streams are left. Also, I only started checking once 46 streams remained to copy, so the maximum throughput might have been slightly higher at some point.
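As a sanity check on these figures, here is a small back-of-the-envelope calculation. It is only a sketch: it assumes decimal units, and the per-stream values used to illustrate the summation are hypothetical placeholders, not the actual readings from the streams tab.

```python
# Back-of-the-envelope check of the summary figures above.
# Assumption: decimal units (1 TB = 1000 GB), durations as reported.

total_data_tb = 14.0              # total data size of the copied jobs
job_duration_h = 3 + 42 / 60      # 3 hours 42 minutes

avg_throughput = total_data_tb / job_duration_h
print(f"Average job throughput: {avg_throughput:.2f} TB/hr")      # ~3.78 TB/hr

# The max throughput figure was obtained by summing the per-stream values
# shown in the streams tab. The numbers below are hypothetical placeholders
# that only illustrate the calculation, not the actual readings.
per_stream_gb_per_hr = [170.0] * 48
max_throughput = sum(per_stream_gb_per_hr) / 1000                 # in TB/hr
print(f"Summed stream throughput: {max_throughput:.1f} TB/hr")    # ~8.2 TB/hr
```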

Extra comments

1. On a server with two quad core CPUs, processor utilisation stayed between 70% and 90% for the entire duration of the copy. To sustain this kind of throughput, we would clearly need either a dedicated DDB server or a server with more CPUs, as there weren't many resources left for the MA to do its regular tasks.

2. Given the write IOPS spikes observed on the Fusion IO card, it is clear that the DDB should be hosted on SSDs.

3. The 14 TB copy generated about 132 GB of data transferred over the network (which should be only hash comparison traffic). Assuming a constant transfer rate, that works out to roughly 80 Mbit/s needed to sustain this throughput. If the WAN link is slower than this, it might be worth testing the same copy with the network optimized copy option. However, looking at the destination MA network stats (see the details section below), the average throughput of received data is actually 170 Mbit/s. Commvault has already confirmed that there is overhead not included in the counter above:

“There is meta data that is transferred that is not included in the calculations. Roughly this could account for up to 10% of the total data transferred.”

Obviously, the difference above is far more than 10%. I am still trying to figure out whether this issue is isolated to us. It could be Windows and TCP/IP overhead.
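For reference, the arithmetic behind the ~80 Mbit/s figure and the gap with the observed counter looks like this (a sketch assuming decimal units and a constant transfer rate over the whole job):

```python
# Bandwidth needed to move the reported 132 GB over the 3 h 42 min job,
# versus the average rate observed on the destination MA network counter.
# Assumption: decimal units (1 GB = 10^9 bytes, 1 Mbit = 10^6 bits).

data_over_network_gb = 132
job_duration_s = (3 * 60 + 42) * 60      # 13,320 seconds

required_mbit_s = data_over_network_gb * 8 * 1000 / job_duration_s
print(f"Required rate for 132 GB: {required_mbit_s:.0f} Mbit/s")   # ~79 Mbit/s

observed_mbit_s = 170                    # destination MA "bytes received" average
overhead = observed_mbit_s / required_mbit_s - 1
print(f"Overhead vs. reported traffic: {overhead:.0%}")            # ~114%, far above 10%
```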

Auxiliary copy details and resource usage statistics

Charts (titles only; images not reproduced here):
- Destination global deduplication database
- Source MA overall CPU usage
- Source MA processes CPU usage
- Source MA Windows disk performance
- Destination MA overall CPU usage
- Destination MA processes CPU usage
- Destination MA Windows disk performance
- Destination MA Fusion IO stats (MB/s)
- Destination MA Fusion IO stats (IOPS)
- Destination MA Network bytes received stats