Redpaper

Tim Bohn
Karen Lawrence

Advantages of IBM eX5 for Database Workloads

Executive summary

Methods for optimizing database workloads vary, depending on the type of database and the purpose of the workload. Online transaction processing (OLTP) workloads, for example, generate thousands of I/O operations per second (IOPS). We describe ways to optimize response time to the user for this type of workload. Business Analytics (BA) workloads, by contrast, often run multiple, sometimes complex, queries that can take a long time, even hours. For this type of workload, optimized I/O throughput is the goal.

IT departments cannot acquire the hardware and software to meet all needs for all projects. To address this requirement, select the equipment that meets as many project needs as possible, realizing that there are trade-offs. For database workloads, a delicate balance exists between CPU processing power and I/O throughput, so the trade-off can be considerable.

When analyzing the differences between OLTP and BA workloads, you must realize that it is impossible to have the best processing power and the highest IOPS from the same hardware and software. The tuning for these workloads differs. For example, one component of OLTP workloads is database locking. Efficient locking is required because updates are going against the data, and inefficient locking degrades performance as the number of users grows.

In contrast, for BA, becomes much more important. Tools, such as Materialized Query Tables and Multidimensional Clustering Tables, help to speed up complex queries where the data might not change (is precomputed) but tends to be larger (just like the data in BA). Another factor, which becomes more prevalent, is query monitoring and management. Proper monitoring ensures that queries use only a certain amount of resources before being either stopped or moved down to a lower priority to limit the impact on the system. Additional effort is spent on both denormalization and partitioning to try to limit I/O effects.

This IBM® Redpaper™ publication describes IBM products that address requirements specific to OLTP and BA workloads. These products are the result of experience with clients and IBM Business Partners over many years. They convey the importance of database workload efficiency in multiple workload environments.

© Copyright IBM Corp. 2012. All rights reserved. ibm.com/redbooks

This paper details the features of IBM eX5 servers. IBM designed these servers to increase efficiency and lower operating costs, thereby maximizing workload optimization for OLTP and BA workloads. Additionally, this paper describes other solutions that work in concert with the eX5 family of servers. This paper is based on in-house studies conducted by the IBM Strategy and Testing Laboratory (STL). For these studies, production-quality IBM System x3690 X5 servers were used.

This publication is directed to IT professionals and decision-makers, such as CEOs, CIOs, CFOs, clients, Business Partners, information architects, business intelligence administrators, and database administrators.

The balanced system design objective

For database systems, a delicate balance exists between CPU processing power and the I/O throughput needed from the disk. This balance involves other factors, such as memory. When you add more CPU processing power, you become I/O limited: the processor spends more time waiting for data from the I/O system. Conversely, as you add more capacity to your I/O system, the system becomes CPU starved.

In a balanced system design1, the most efficient, optimized systems are not starved for I/O or CPU. The systems are considered balanced so that the CPU and the storage do not become a bottleneck. Figure 1 describes this concept.

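[Figure 1 plot: y-axis CPU Processing Capacity; x-axis Input/Output Operations per Second (IOPS). The upper-left region is labeled "CPU Rich, I/O Starved" and the lower-right region is labeled "CPU Starved, I/O Rich". The diagonal between them is labeled "Balanced Design", with an arrow labeled "more performance".]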

Figure 1 Efficient database OLTP performance requires a balance of optimizations

1 For more information about balanced system design, see The Benefits of Optimizing OLTP Using IBM eXFlash Solid-State Drives, REDP-4849, April 2012, available at this website: http://www.redbooks.ibm.com/abstracts/redp4849.html

Characteristics of data processing workloads

Each type of workload includes certain characteristics or tasks that define the workload. Figure 2 lists characteristics that are applicable to OLTP and BA workloads. Notice the heavy use of system resources by each type of workload.

On Line Transaction Programs (OLTP)                  Business Analytics

Transactions against operational data                Complex queries against a data warehouse
Reads and writes                                     Read only
Normalized schema design to eliminate redundancy     Star schema design often used to speed up queries
Multiple user throughput operation                   Different modes of operation

Figure 2 Data processing workloads have different characteristics

We examine the details of the tasks to help you fully understand how each characteristic is an important factor in workload optimization.

Characteristics of OLTP workloads

OLTP workloads are characterized by many transactions from many parallel users, with those transactions going against the operational data. A mixture of reads and writes exists. The database needs to be normalized to eliminate data redundancy and to update efficiently.

Characteristics of BA workloads

Three possible methods for running BA workloads are defined as serial execution, concurrent execution, and fixed execution. The following definitions describe these types of workloads:
• Serial execution: A single user uses the system to run successive reports.
• Concurrent execution: Multiple users run reports in parallel; you want to know how many reports per hour can be generated.
• Fixed execution: Similar to concurrent execution in that multiple users run reports in parallel, but you want to know how quickly a fixed number of reports can be issued.

Optimizing database workloads for the best performance and price

OLTP and BA workloads have different, often conflicting, characteristics. Therefore, what is the best way to optimize your systems to supply the best performance for these workloads and at the best price? The IBM STL group tested both the OLTP and BA scenarios to determine how the need for the best performance at the best price can be met.

Optimizing OLTP workloads

For OLTP workloads, we conducted a study that reviewed the Transaction Processing Performance Council (TPC-C)2 standard benchmarks for systems that are considered balanced in terms of CPU processing power and IOPS. This balance means that neither the CPU nor the I/O subsystem is a bottleneck.

We looked for the system setups that provide the most transactions per CPU and per hard disk drive used. We used the following calculations in our testing configurations:
1. We took the better half (top 50%) of the systems with the largest transaction per CPU Performance (CP) rate. We then calculated the CPU better half average of transactions per minute C (tpmC) per CP, written tpmC/CP.
2. We took the better half (top 50%) of the systems with the largest IOPS rate. We then calculated the hard disk drive better half tpmC/HD average.
3. We found the top 30% of balanced systems with the tpmC/CP and tpmC/HD rates closest to the better half averages.
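The three selection steps can be sketched in code. The following Python is a minimal illustration under stated assumptions, not the tooling that was used in the study: the `Result` fields, the relative-distance measure used for "closest", and the rounding of the 30% cutoff are all hypothetical choices.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Result:
    """One published benchmark result (all field names are hypothetical)."""
    name: str
    tpmc: float      # transactions per minute C (tpmC)
    cp: float        # CPU performance units (CP)
    hdd_count: int   # number of hard disk drives used
    iops: float      # I/O operations per second

    @property
    def tpmc_per_cp(self) -> float:
        return self.tpmc / self.cp

    @property
    def tpmc_per_hd(self) -> float:
        return self.tpmc / self.hdd_count


def better_half(results, key):
    """Return the top 50% of the results ranked by key (at least one)."""
    ranked = sorted(results, key=key, reverse=True)
    return ranked[: max(1, len(ranked) // 2)]


def balanced_systems(results, fraction=0.30):
    """Pick the results whose ratios are closest to both better-half averages."""
    # Step 1: better half by transactions per CP, then the tpmC/CP average.
    cp_avg = mean(r.tpmc_per_cp
                  for r in better_half(results, lambda r: r.tpmc_per_cp))
    # Step 2: better half by IOPS, then the tpmC/HD average.
    hd_avg = mean(r.tpmc_per_hd
                  for r in better_half(results, lambda r: r.iops))

    # Step 3: rank by relative distance from both averages (assumed metric).
    def distance(r):
        return abs(r.tpmc_per_cp / cp_avg - 1) + abs(r.tpmc_per_hd / hd_avg - 1)

    ranked = sorted(results, key=distance)
    return ranked[: max(1, round(len(ranked) * fraction))]
```

Any other distance measure (for example, Euclidean distance on normalized ratios) could be substituted in step 3; the source does not say which one was used.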

Figure 3 shows our results, where CP units are a standardized set of common performance units, created with multiple benchmarks across the different processors.

[Figure 3 plot: y-axis CPU Performance (CP), 0 to 140,000; x-axis Input/Output Operations per Second (IOPS), 0 to 3,500,000.]

Figure 3 Best-optimized, hard disk drive (HDD) results for OLTP

2 The TPC-C is a standard testing tool used in the industry. This tool simulates the “execution of multiple transaction types that span a breadth of complexity.” For more information, see this website: http://www.tpc.org/default.asp

Optimizing BA workloads

BA workloads have different modes of operation from which to compare performance. For BA workloads, the IBM STL group tested the I/O throughput made available in three modes of operation. The following scenarios were used to test the I/O throughput:
• Serial execution: A single user ran each of nine reports, consecutively, using 100% of the system resources. Serial execution shows how well a system issues the BA reports in isolation.
• Concurrent execution: Eighty users ran reports concurrently. We calculated an average number of reports completed per hour. We tested a concurrent mix of three different types of reports that were categorized as simple, intermediate, and complex. Each user repeatedly ran a particular report as many times as possible within a specified time. At the end of the allotted time, no more reports were started. When the running queries finished, the numbers for each type of report were tallied and the averages calculated.
• Fixed execution: A fixed number of reports were issued. Eighty users ran reports in parallel until all were finished. The total elapsed time was then measured.
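Each mode of operation yields a different metric. As a rough sketch of how the three metrics could be computed from raw timings, consider the following Python. The function names and input shapes are hypothetical, and the reports-per-hour figure is assumed to be normalized by the length of the allotted window; the source does not describe the measurement harness.

```python
from collections import Counter


def serial_elapsed(report_seconds):
    """Serial execution: one user runs the reports back to back."""
    return sum(report_seconds)


def concurrent_reports_per_hour(completed_types, window_seconds):
    """Concurrent execution: completed reports per hour, by report type.

    completed_types holds one label ("simple", "intermediate", "complex")
    for every report that was started inside the window and ran to completion.
    """
    hours = window_seconds / 3600
    return {kind: n / hours for kind, n in Counter(completed_types).items()}


def fixed_elapsed(start_times, finish_times):
    """Fixed execution: time from the first start to the last finish."""
    return max(finish_times) - min(start_times)
```

For example, `concurrent_reports_per_hour(["simple"] * 4 + ["complex"], 1800)` reports 8 simple and 2 complex reports per hour for a 30-minute window.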

Results of this test are described further in “Workload optimization: BA”.

Workload optimization: OLTP

In this section, we describe IBM hardware that was designed for use in workload optimization. We also cover specific technologies, in-house testing conducted by the STL group, and how test results can be useful for your OLTP and BA workloads.

IBM System x and IBM DB2

Together, IBM System x® servers and IBM DB2® are designed to increase performance and efficiency. They can provide what you need to maintain a balanced system as your IT needs grow and change.

There are powerful simultaneous multithreading (SMT) systems that can scale up to the most cores and threads. Or, if the systems are used in conjunction with IBM DB2 pureScale, they can scale out with almost linear scaling, while staying transparent to applications.

If your system needs more I/O throughput, eX5 servers can add eXFlash drives to boost onboard throughput. The IBM Storwize® V7000 can also be added for even more IOPS. Figure 4 shows the performance and efficiency data for these two products. In the figure, CP units are a standardized set of common performance units, created with multiple benchmarks across the different processors.

[Figure 4 plot, titled "Balanced Systems - Top HDD-Based TPC-C CPU and IOPS used": y-axis CPU Performance (CP), 0 to 140,000; x-axis Input/Output Operations per Second (IOPS), 0 to 3,500,000. Annotations along the CPU axis: "DB2 PureScale Clusters" and "Exploit SMT Threads". Annotations along the IOPS axis: "eXFlash" and "V7000 SSD with Easy Tier".]

Figure 4 System x, eXFlash, and Storwize V7000 are designed to improve efficiency

System x with DB2 or eXFlash

The IBM eX5 has the CPU processing power to meet the performance requirements of OLTP workloads. The eX5 servers cover a wide range of needs, from a two-socket, 20-core blade or rack mount server to a four-socket, 40-core System x3850 X5 that can scale to eight sockets and 80 cores. DB2 helps by scaling efficiently to enable the use of the largest System x server (in terms of processing power). Figure 5 outlines more about the processing power of System x servers.

[Figure 5 content:
• BladeCenter HX5: 2 socket, 20 cores
• System x3690 X5: 2 socket, 20 cores
• System x3850 X5: 4 socket, 40 cores
Chart "Unbalanced System": percent of CPU time (0 to 100) over clock time, split into I/O Wait %, Sys %, and App %, and annotated "Applications waiting for I/O".
DB2 scales efficiently to leverage the largest System x server. But CPU power growth has far exceeded growth in I/O rates. As you add more CPU, you need more I/O rate capacity.]

Figure 5 System x delivers processing power for your database workloads

As you add more CPU power, you find that there is not enough I/O capacity. This shortfall becomes a bottleneck, leaving the CPU idle. We wanted to determine the performance impact of HDDs compared with eXFlash for OLTP workloads3. One solution to the CPU idle time is to add eXFlash solid-state drives (SSDs). With eXFlash technology, you can put two drives in the space of one hard disk drive. The System x3850 X5 holds up to 16 eXFlash SSDs, and the System x3690 X5 holds up to 24 eXFlash SSDs. Figure 6 shows the specifics of eXFlash with SSDs. More details about System x with eXFlash are listed in Figure 7.

[Figure 6 content: Solid State Disk Option for eX5 rack mount servers (pictured: IBM System x3850 X5 with eXFlash)
• Combination backplane shuttle/disks, compatible controller: supports RAID 5/6 configurations for greater data availability
• 1.8" form factor, 50/200 GB capacities: twice the density (2) in a 2.5" bay compared to the competition
• Max IOPS: 240K (read-only); OLTP (4K) IOPS: 197K (66% R, 33% W)
• Up to 8 drives per module]

Figure 6 eXFlash adds SSD to System x

3 The study is briefly described in this paper. The details about the full study are available at this website: http://www.redbooks.ibm.com/abstracts/redp4849.html

For the database server, we used the System x3690 X5 with two Intel X7560 processors with eight cores for each processor. We compared the relative performance and cost of the following configurations:
• eXFlash SSD: 200 GB, 30,000 read IOPS per drive, eight drives per eXFlash (240K IOPS per eXFlash)
• HDD: 15K rpm serial-attached SCSI/Serial Advanced Technology Attachment (SAS/SATA), 3 Gbps link speed

Figure 7 further describes our test configuration.

[Figure 7 content: System x3690 X5 with HDD Only vs. System x3690 X5 with eXFlash

Both configurations: System x3690 X5, Intel x7560 2.27 GHz, 2 sockets, 16 cores, 128 GB RAM, DB2 9.7.2 (64 bit), Windows 2008 R2 (64 bit)

Storage                  HDD only                      With eXFlash
OS, Apps, DB logs        Two 146 GB 15K SAS HDD        Two 146 GB 15K SAS HDD
Database tables          Four 146 GB 15K SAS HDD       One 200 GB eXFlash SSD]

Figure 7 In this study, we assessed the performance of HDDs versus eXFlash for OLTP workloads

The testing focused on a 100 GB OLTP database to determine how many HDDs were needed to match the performance of a single eXFlash SSD. This scenario was entirely focused on the required performance levels. The extra storage capacity of the HDDs was not considered. Figure 8 describes our calculations and shows that DB2 can use eX5 with eXFlash to drive increases in performance.

The values in Figure 8 indicate the list prices that were published on the IBM website (http://www.ibm.com) when the analysis was conducted in mid-2011. They might not be representative of current list prices. You need to discuss your current environment with your IBM representative to determine how much IBM can save you at the current prices.

[Figure 8 content: Less than 1% cost increase, yields 4X performance improvement (Price/Performance of eXFlash vs. HDDs on DB2)

                                IBM System x3690 X5      IBM System x3690 X5
                                with 4 HDD               with 1 SSD eXFlash
System                          $24,107                  $24,107
Added internal storage          $1,756                   $3,394
Total HW                        $25,863                  $27,501
DB2 license (PVU=70)            $453,600                 $453,600
3 yr Maintenance                $181,440                 $181,440
Total SW                        $635,040                 $635,040
Total Platform                  $660,903                 $662,541
Throughput Performance (TPS)    325                      1,366
Dollars per TPS                 $2,033.55                $485.02

Chart: Transactions per Second (TPS, left axis, 0 to 1,600) and Cost of Total Platform (right axis, $600,000 to $1,000,000) by internal storage option. Throughput rises from 325 TPS to 1,366 TPS (4.2X) while the single server cost (HW+SW) rises from $660,903 to $662,541.]

Figure 8 DB2 can use eX5 with eXFlash to drive higher performance at a lower cost
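The headline numbers in Figure 8 follow directly from the platform totals and the measured throughput. The short Python check below reproduces them; the variable and function names are ours, and the inputs are the mid-2011 list prices from the figure.

```python
def cost_per_tps(total_platform_cost, tps):
    """Dollars of total platform cost (HW + SW) per transaction per second."""
    return total_platform_cost / tps


# Totals and throughput as reported in Figure 8.
hdd = {"cost": 660_903, "tps": 325}      # x3690 X5 with 4 HDDs
ssd = {"cost": 662_541, "tps": 1_366}    # x3690 X5 with 1 eXFlash SSD

cost_increase = (ssd["cost"] - hdd["cost"]) / hdd["cost"]
speedup = ssd["tps"] / hdd["tps"]

print(f"{cost_per_tps(hdd['cost'], hdd['tps']):.2f}")  # 2033.55
print(f"{cost_per_tps(ssd['cost'], ssd['tps']):.2f}")  # 485.02
print(f"{cost_increase:.2%}")                          # 0.25%
print(f"{speedup:.1f}")                                # 4.2
```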

In another study, we compared IBM hardware and software with the similar hardware and software of a competitor. We compared the System x3850 X5, four socket (Intel X7560 2.26 GHz, eight core), with SSDs and HDDs, to a four-socket comparison machine with an identical Intel processor.

The software in each configuration used the same OLTP driver, with DB2 on the IBM configuration and a comparable database on the comparison machine. Figure 9 shows more about the configuration details.

[Figure 9 content:
OLTP workload: 65% read / 35% write, 100 users

                 IBM configuration                          Comparison configuration
Database         IBM DB2 9.7 Enterprise Server Edition      Competitor Database
Operating system Windows Server 2008 R2 64 bit              Windows Server 2008 R2 64 bit
Threads          32                                         32
Server           IBM System x3850 X5                        Competitor server
Processor        Intel X7560, 4 sockets, 32 cores           Intel X7560, 4 sockets, 32 cores
Memory           128 GB                                     128 GB
Storage          (3) eXFlash 200 GB SSD, (2) HDD            (3) 50 GB SSD, (2) HDD]

Figure 9 System x3850 X5 with eXFlash and DB2 compared with a similar system

Results shown in Figure 10 indicate that the IBM hardware and software stack outperformed the comparison stack by 54%. The hardware price difference was primarily because the IBM eXFlash SSDs were more expensive than the competitor SSDs. The IBM eXFlash SSDs also outperformed the SSDs of the comparison machine. Values in Figure 10 indicate the list prices that were published on the IBM website (http://www.ibm.com) when the analysis was conducted in mid-2011. They might not be representative of the current list prices. You need to discuss your current environment with your IBM representative to determine how much IBM can save you at the current prices.

[Figure 10 content:
                          IBM configuration                          Comparison configuration
Database                  IBM DB2 9.7 Enterprise Server Edition      Competitor Database
Operating system          Windows Server 2008 R2 64 bit              Windows Server 2008 R2 64 bit
Threads                   32                                         32
Server                    System x3850 X5                            Competitor server
Processor and memory      Intel X7560, 4 sockets, 32 cores, 128 GB   Intel X7560, 4 sockets, 32 cores, 128 GB
Storage                   (3) eXFlash SSD, (2) HDD                   (3) Competitor SSD, (2) HDD
Transactions/sec          2,490 (54% faster)                         1,618
Cost per transaction/sec  $748                                       $804]

Figure 10 DB2 on System x increases performance by 54% over a comparable system

IBM DB2 Enterprise Edition on System x

In our test, we compared the cost of the DB2 Enterprise Edition to the equivalent comparison software. Results show that, when the cost is normalized per transaction per second (TPS), the DB2 Enterprise Edition came out slightly ahead. See Table 1 for the comparison.

List prices: Values in Table 1 and Table 2 are the list prices that were published on the IBM website when the analysis was conducted in mid-2011. They might not be representative of current list prices. You need to discuss your current environment with an IBM representative to determine how much IBM can save you at the current prices.

Table 1 IBM DB2 Enterprise Edition and the comparison software

                              IBM DB2 Enterprise Edition    Comparison software
Price for each PVU (a)        $405 (b)                      .5 (Core Factor Table) x 32 Cores: $47,500 (c)
100 PVUs (System x3850 X5)    $40,500                       $760,000
Cores                         32                            32
Support, year 1               Included                      22% of original cost of each year: $167,200
Year one total price          $1,296,000                    $927,200
Support, years 2 and 3        $518,400                      22% of original cost for each year: $167,200 for each year or $334,400
Three-year total price        $1,814,400                    $1,261,600

a. Processor value unit
b. Values indicate the list prices that were published on the IBM website (http://www.ibm.com) when the analysis was performed in mid-2011. They might not be representative of current list prices. You need to discuss your current environment with an IBM representative to determine how much you can save at the current prices.
c. List prices and other information about products that are not IBM were obtained from a supplier of these products, published announcement materials, vendor web pages, or other publicly available sources. Information does not constitute an endorsement of such products by IBM. Questions on the capability of products that are not IBM need to be addressed to the supplier of those products.
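The totals in Table 1 can be reproduced from the per-unit prices. The Python below is a worked check, not a pricing tool: the function names are ours, and reading the "100 PVUs" row as 100 PVUs per core and the $47,500 figure as a per-processor list price scaled by the 0.5 core factor are our interpretation of the table.

```python
def db2_license(price_per_pvu, pvus_per_core, cores):
    """DB2 license price, assuming PVUs are charged per core."""
    return price_per_pvu * pvus_per_core * cores


def competitor_license(list_price, core_factor, cores):
    """Comparison software license: list price x core factor x cores."""
    return list_price * core_factor * cores


db2_year_one = db2_license(405, 100, 32)            # 1,296,000
db2_three_year = db2_year_one + 518_400             # support, years 2 and 3

comp_license = competitor_license(47_500, 0.5, 32)  # 760,000
comp_support = comp_license * 22 / 100              # 22% per year: 167,200
comp_year_one = comp_license + comp_support
comp_three_year = comp_license + 3 * comp_support
```

Dividing the three-year totals by the measured throughput in Figure 10 (2,490 TPS and 1,618 TPS) gives roughly $729 and $780 per TPS for software alone, which is consistent with DB2 coming out slightly ahead on normalized cost.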

By comparison, DB2 is typically sold as the Advanced Enterprise Server Edition (AESE). This version includes the options that clients normally need at a reduced cost, compared with adding the options separately. For the comparison software, several of those options are available only at an added cost. When this cost comparison was prepared, the IBM stack came out at 46% less cost per TPS, as shown in Table 2.

Table 2 Factor in the cost of frequently used features to extend the IBM total cost of ownership

                        IBM DB2 9.7 Advanced Enterprise Server Edition (AESE)    Competitor database
Hardware                $48,923 (a)                                              $39,752 (b)
Core functionality      $2,016,000                                               $1,261,600
Advanced security       Included                                                 $305,440
Compression             Included                                                 $305,440
Partitioning            Included                                                 $305,440
Disaster recovery       Included                                                 $265,600
Total cost (3 years)    $2,064,923                                               $2,483,272
Total TPS               2,490                                                    1,618
Cost per TPS            $829/tps                                                 $1,535/tps

a. Values indicate the list prices that were published on the IBM website (http://www.ibm.com) when the analysis was performed in mid-2011. They might not be representative of current list prices. You need to discuss your current environment with an IBM representative to determine how much you can save at the current prices.
b. List prices and other information about products that are not IBM were obtained from a supplier of these products, published announcement materials, or other publicly available sources. Information does not constitute endorsements of such products by IBM. Questions on product capabilities that are not IBM need to be addressed to the supplier.
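The 46% figure can be verified by summing the line items in Table 2 and dividing by the measured throughput. The sketch below uses the table values as given; the dictionary keys and the helper name are ours.

```python
def total_cost(line_items):
    """Sum the priced line items; features marked Included add nothing."""
    return sum(line_items.values())


ibm_items = {"hardware": 48_923, "core functionality": 2_016_000}
competitor_items = {
    "hardware": 39_752,
    "core functionality": 1_261_600,
    "advanced security": 305_440,
    "compression": 305_440,
    "partitioning": 305_440,
    "disaster recovery": 265_600,
}

ibm_per_tps = total_cost(ibm_items) / 2_490                # TPS from Figure 10
competitor_per_tps = total_cost(competitor_items) / 1_618
saving = 1 - ibm_per_tps / competitor_per_tps

print(f"{ibm_per_tps:.0f}")         # 829
print(f"{competitor_per_tps:.0f}")  # 1535
print(f"{saving:.0%}")              # 46%
```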

IBM Storwize V7000 and IBM Easy Tier

Next, we describe Storwize V7000 and IBM Easy Tier.

Enterprise-class workloads require SSDs with more storage capabilities

For larger storage needs, you can add the Storwize V7000. A Storwize V7000 expansion enclosure can hold twenty-four 6.35 cm (2.5 in.) drives. The drives can be a combination of HDDs and SSDs.

The control enclosure supports multiple expansion enclosures, for up to 240 drives per control enclosure, and is designed to move data from the HDD to the SSD. This movement is based on real-time usage patterns and runs automatically in the background.

Figure 11 shows how enterprise-class workloads can benefit from an SSD that provides more storage capabilities.

[Figure 11 content: IBM Storwize V7000 with SSD, shown alongside eXFlash
• More SSD capacity
• Virtualization of HDD and SSD
• Automatic non-disruptive optimum assignment of SSD
• Clustered systems
• Operational capabilities:
  – Copy services
  – Thin provisioning]

Figure 11 Enterprise-class workloads require SSDs with more storage capabilities

Storwize V7000 also has the thin provisioning feature, so that volumes appear to be larger than what is allocated. More space is allocated on demand for less wasted disk space. For more information about Storwize V7000, see this website: http://www.ibm.com/systems/storage/disk/storwize_v7000/index.html

Easy Tier

Your SSD performance can be maximized further, while minimizing costs. Built into the storage controller, Easy Tier dynamically moves data to SSD based on hotspots, and migrates data between storage devices. Performance improvements of up to 300% are observed by moving some of the data from HDD to SSD. Easy Tier is shown to improve response time by 78% by moving only 5% of the data from HDD to SSD. All storage vendors have SSD technology, but Easy Tier is available from IBM only. Figure 12 shows more of the optimization characteristics of Easy Tier.
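The idea of hotspot-based placement can be illustrated with a small sketch. This is not the Easy Tier algorithm, which is internal to the storage controller and is not described in this paper. It is a simplified, hypothetical policy that keeps the most frequently accessed extents on SSD; the function name, the inputs, and the use of a plain I/O count per monitoring window are all assumptions.

```python
from collections import Counter


def plan_tiering(io_counts, ssd_extents, current_ssd):
    """Return (promote, demote) sets of extent IDs for one monitoring window.

    io_counts:   mapping of extent ID to I/O count in the latest window
    ssd_extents: number of extents that the SSD tier can hold
    current_ssd: set of extent IDs that currently reside on SSD
    """
    # The hottest extents, up to the capacity of the SSD tier.
    hottest = {ext for ext, _ in Counter(io_counts).most_common(ssd_extents)}
    promote = hottest - current_ssd   # hot data migrates up to SSD
    demote = current_ssd - hottest    # cold data migrates down to HDD
    return promote, demote
```

For example, with I/O counts `{"a": 900, "b": 5, "c": 700, "d": 20}`, room for two extents on SSD, and extents `b` and `c` currently on SSD, the plan promotes `a` and demotes `b`.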

[Figure 12 content:
• Migrates data extents between SSD and HDD in same pool
  – Automatic hotspot detection
• Virtualized SSD is shared across all workloads using the pool
• More cost effective use of SSD versus ad hoc dedicated assignment
• Transparent to applications, no code changes required
Diagram: hot data migrates up to the SSD array and cold data migrates down to the HDD array.
Chart (example: complex database transactional workload): Overall Transactions per second, 0.0 to 3000.0, over x-axis ticks 0 to 13, annotated "3.8X Improvement".]

Figure 12 Easy Tier® in the Storwize V7000 automatically optimizes the use of SSDs

Figure 13 shows how SSD use becomes optimized over time, automatically. The left side of the figure shows the data that is in high demand and how it moves from HDD to SSD. The right side shows the TPS, which start at zero, and become optimized over time as a result of data movement to SSD.

[Figure 13 charts: Left panel, "Over time, Easy Tier seamlessly moves the data in high demand from HDD to SSD": Solid State Disk Utilization in GB, 0 to 600. Right panel, "Optimal use of SSD in achieving highest performance": Transactions per Second, 0.0 to 3000.0.]

Figure 13 More results: Easy Tier in the Storwize V7000 automatically optimizes the use of SSDs

We compared performance by using an IBM BladeCenter® HX5 with a Storwize V7000 that was populated only with HDDs. This configuration was compared to the same setup, but we replaced four of the HDDs with SSDs. Figure 14 reports these results.

Values in the figure show list prices that were published on the IBM website (http://www.ibm.com) when the analysis was conducted in mid-2011. They might not be representative of current list prices. You need to discuss your current environment with an IBM representative to determine how much IBM can save you at the current prices.

[Figure 14 content: A 6% cost increase yields 3.5X performance improvement

                                      IBM BladeCenter HX5 +     IBM BladeCenter HX5 +
                                      V7000 HDDs                V7000 Easy Tier
Throughput performance (TPS)          698                       2,469 (3.5X)
Server cost (HW+SW)                   $1,050,936                $1,110,932

Chart axes: Transactions per Second (TPS), 0 to 3,000; Cost of Total Platform, $850,000 to $2,850,000.]

Figure 14 Storwize V7000 automatic data placement with Easy Tier is cost effective

The total platform cost in this comparison increased by 6% because of the added SSDs. However, we also saw a 3.5-times boost in performance because of the SSDs. IBM Easy Tier is available at no extra charge.

Workload optimization: BA

In the study described in “Optimizing BA workloads”, we tested the I/O throughput that is available in the three modes of operation (serial, concurrent, and fixed execution). IBM offers different approaches to optimizing BA workloads, ranging from single-workload to multiple-workload systems. The following approaches to optimizing workloads are offered:
• Optimized by design, for example, DB2 on an eX5
• Flexible integrated, for example, an IBM Smart Analytics system
• Tightly integrated appliances, for example, IBM Netezza® 1000 or IBM DataPower®

These approaches are listed in decreasing order for flexibility, but in increasing order for time-to-value.

Flexible systems require more product knowledge and can be used for multiple workloads. The tightly integrated appliances require little product knowledge, because they are pre-configured to a specific workload.

The IBM STL group studied the process for optimizing systems for BA workloads, with a particular focus on key differentiation and how workloads provide economic value. Our philosophy is that one size does not fit all, because each project workload differs based on many factors. Figure 15 identifies some of these factors.

[Figure 15 content, ordered from flexible to focused:

Client-built with Optimized Components (DB2 on IBM System x)
• Optimization type: Components optimized to work together by design
• Characteristics: Highly flexible; client-tuned to exact needs; support broadest range of workloads and services

Integrated Optimized Systems (IBM Smart Analytics System)
• Optimization type: Complete factory-optimized systems for multiple services
• Characteristics: Focused on selected workloads tuned at the factory; flexible workload choice; extensible and scalable

Appliances (IBM Netezza 1000)
• Optimization type: Appliances focused on a single purpose or service
• Characteristics: Single-purpose focused for most simplicity; factory-tuned to a specific task]

Figure 15 IBM has multiple BA solutions to meet your requirements

Several different configurations of optimized systems exist and focus particularly on key differentiations and how they can provide economic value.

The following options are provided with the IBM eX5 portfolio:
• IBM System x3850 X5: Versatile four-processor, 4U rack-optimized scalable enterprise server with optional memory expansion unit (MAX5) and workload-optimized models. This option provides a flexible platform to facilitate maximum utilization, reliability, and performance of compute-intensive and memory-intensive workloads.
• IBM BladeCenter HX5: Scalable blade server that enables standardization on the same platform for two-socket and four-socket server needs for faster time-to-value. This standardization occurs while delivering peak performance and productivity in high-density environments.
• IBM System x3690 X5: High-end, two-processor, 2U scalable server offers up to five times the memory capacity of the two-socket servers of today. This platform offers double the processing cores for superior performance and memory capacity.

The eX5 servers are highly scalable and flexible and provide the MAX5 option that allows for an additional 32 dual inline memory modules (DIMMs) of memory, decoupled from the processor. In addition, new eXFlash SSDs can be added to increase I/O throughput.

A flexible integrated system: IBM Smart Analytics System 5600

The key attributes of the IBM Smart Analytics System 5600 include the following features:
• Pre-packaged system
  Includes optimal I/O subsystem, large I/O bandwidth, and modular storage scaling for analytics
• DB2 and IBM Cognos® capabilities
• A sound investment:
  – Enables you to start with the correct size and scale
  – Deploys in days, rather than months

A tightly integrated appliance: IBM Netezza 1000

Netezza appliances integrate database, server, and storage, delivering high performance and competitive pricing. Key attributes of the Netezza 1000 include the following benefits:
• Designed for rapid analysis of data scaling to petabytes.
• Delivers 10 - 100 times the performance improvements of traditional database vendors, at one-third the cost of options.
• Integrates business intelligence and advanced analytics.
• Delivers fast time-to-value; meaningful proof-of-concept can be available in as little as two weeks.
• Offers performance with its default settings, without requiring tuning, indexing, aggregations, and so on.
• Integrates with other products, such as Cognos and IBM Tivoli®; data assets are managed for their entire lifecycle.

Various BA workload solutions

IBM designed various solutions to optimize your BA workloads. Several systems on the market are designed to support the “one-size-fits-all” concept. In contrast, IBM offers custom-built, integrated, and appliance solutions, which this section reviews. Figure 16 defines these workload-optimized systems and associated products.

[Figure 16 content: IBM Workload Optimized Systems
• Custom Built: IBM eX5 with MAX5 memory
• Integrated: IBM Smart Analytics
• Appliances: IBM Netezza]

Figure 16 IBM offers custom-built, integrated, and appliance solutions

Custom-built solution: MAX5

One of the primary features of the IBM eX5 family of servers is the memory expansion unit for the eX5, known as MAX54. This optional unit, compatible with the eX5 family of servers, is designed to considerably increase server memory capacity. With MAX5, you can nearly double your memory capacity without adding processors.

The MAX5 provides the following features and benefits:
• Basic functions of MAX5:
  – Provides memory needed for database and virtualization workloads.
  – Allows higher memory capacity to be reached with less expensive DIMMs, providing more economical high-end implementations.
• With MAX5, you can achieve these benefits:
  – Expand memory capacity (up to 1 TB of additional memory).
  – Add up to 32 memory DIMMs, which is up to double the number that is possible with comparable machines. This increase equates to up to 512 GB of extra memory for the Intel Xeon 6500 and 7500 family of processors (by using 16 GB DIMMs). And it equates to up to 1 TB of extra memory for the Intel Xeon E7 family of processors (by using 32 GB DIMMs).
  – Add memory without adding processor cores, reducing software costs.
  – Increase database buffer size.
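The extra capacity figures follow from the number of DIMM slots and the DIMM size. A brief worked check in Python, where the constant and function names are ours:

```python
MAX5_DIMM_SLOTS = 32  # additional DIMMs provided by MAX5, per the text


def max5_extra_memory_gb(dimm_size_gb, slots=MAX5_DIMM_SLOTS):
    """Extra memory in GB when every MAX5 slot holds a DIMM of the given size."""
    return dimm_size_gb * slots


print(max5_extra_memory_gb(16))  # 512, for Xeon 6500 and 7500 with 16 GB DIMMs
print(max5_extra_memory_gb(32))  # 1024 (1 TB), for Xeon E7 with 32 GB DIMMs
```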

The STL group conducted a series of tests to determine the impact that adding memory, by using MAX5, had on the performance of a BA workload. In this test, several different types of reports were generated in succession. The performance and costs for the reports were calculated both before and after adding MAX5. For these tests, we used a production-quality IBM System x3690 X5 server and IBM DB2 to test the effects of MAX5 on a 1.3 TB data warehouse in a BA environment. Figure 17 depicts the process used.

4 For product specifications, see this website: http://www.ibm.com/systems/x/options/memory/max5/specs.html

[Figure 17 content: Single-user serial execution test. One user, on a single connection, executes complex reports (Reports 1 and 3), then intermediate reports (Reports 9, 10, and 11), then simple reports (Reports 2, 4, 5, and 6). Reports are generated by Cognos against the data server. Each report executes one or more queries, and each query gets all system resources.]

Figure 17 Process for testing serial execution of reports in isolation (single user)

As Figure 17 shows, this test was for a single user who requested various kinds of reports. The reports ranged from complex, covering the entire data set, to simple, covering a small amount of data. The intent was to show how quickly the system responds with finished reports, and specifically to isolate the effect of the added random access memory (RAM). The entire data set did not fit in the 512 GB of RAM in the system without MAX5. When the RAM was doubled by adding MAX5, the entire data set was able to reside in memory. This test is significant because MAX5 allows us to double the RAM, at a time when no other vendor offers this feature. Figure 18 describes the specifications.

[Figure 18 content: Test configuration.
Memory: No MAX5 (512 GB total memory) versus with MAX5 (1 TB total memory)
Workload: 1.3 TB Business Analytics data warehouse (BI Day); complex queries used to generate various reports
Server: IBM System x3690 X5, 2S, 16 cores, Intel Xeon 6500/7500 family of processors, 2.27 GHz
Storage: IBM DS5100
Software: DB2 9.7FP4, RHEL 5.6]

Figure 18 In this study, we assessed the impact MAX5 can have in BA workloads

With compression, a 1.3 TB data set nearly filled the buffer pool when using 1 TB RAM. However, the data set was larger than the buffer pool when using the base 512 GB RAM. Figure 19 on page 19 shows the results. In this example, lower is better.

As shown in Figure 19, with MAX5 attached, average response times were 1.5 to 2.8 times faster, depending on the report type.

Results in Figure 19 are based on IBM internal tests of the System x3690 X5 with the MAX5 solution. This configuration is compared to the System x3690 X5 without MAX5, running an analytics workload in a controlled laboratory environment. The three-year total cost of acquisition includes expected hardware, software, service, and support, and is based on US list prices.

The cost calculation compares the average cost per report for the different report types, if the specified types of reports are run serially for the three-year life of the machine. The test compares the performance of running different reports of varying complexity. Results might not be typical and vary based on actual configuration, applications, specific queries, and other variables in a production environment. Users of this document need to verify the applicable data for their specific environment.

[Figure 19 content: average report response time in seconds, serial execution test, 1.3 TB database size]

Report type    Std memory (512 GB)   MAX5 (1 TB)   Result with MAX5
Complex        2088                  1400          1.5x faster, 30% less cost/report
Intermediate   33.41                 19.29         1.7x faster, 40% less cost/report
Simple         0.56                  0.2           2.8x faster, 63% less cost/report

Figure 19 Comparison of report response times: MAX5 (1 TB) versus standard memory (512 GB)
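The "times faster" values in Figure 19 can be checked from the response times shown in the chart. The following sketch divides the standard-memory response time by the MAX5 response time; the dictionary and function names are illustrative, and the input values are the ones printed in the figure.

```python
# Average response times in seconds, as printed in Figure 19:
# (standard memory 512 GB, MAX5 1 TB)
FIGURE_19_SECONDS = {
    "complex": (2088.0, 1400.0),
    "intermediate": (33.41, 19.29),
    "simple": (0.56, 0.2),
}


def speedup(baseline_s: float, improved_s: float) -> float:
    """Return how many times faster the improved run is than the baseline."""
    return baseline_s / improved_s


for report_type, (std_s, max5_s) in FIGURE_19_SECONDS.items():
    print(f"{report_type}: {speedup(std_s, max5_s):.1f}x faster with MAX5")
```

Rounded to one decimal place, the ratios are 1.5, 1.7, and 2.8, in agreement with the labels in the figure.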

The STL group ran the same test using the System x3690 X5 with MAX5 and comparison hardware and software. Figure 20 on page 20 shows the suggested configurations for these machines. Results in Figure 20 are based on a three-year total cost of acquisition (the competitor configuration is a previous version that is no longer available). Total cost also includes the hardware, software, service, and support that are used to run OLTP workloads. Calculations were derived in January 2012, based on US list prices during that time, from http://www.ibm.com. Prices vary by country. Users of this document need to verify the applicable data for their specific environment.

[Figure 20 content: System configurations.
Competitor database system:
• 2 database nodes: 2x8 cores x86, 2x72 GB memory
• High speed switch
• 3 storage cells: 3x8 cores x86, 3x24 GB memory, 3x384 GB SSD cache, 3x12x600 GB SAS2
• 3 YR TCA = $2,876,561
IBM System x3690 X5:
• System x3690 X5 with 16 cores and 512 GB memory
• MAX5 with 512 GB memory
• DS5100 enclosure (32 disks)
• 3 YR TCA (MAX5) = $910,879]

Figure 20 System comparison of competitor versus IBM System x3690 X5

The test in Figure 20 was run with a 1 TB data set, instead of the 1.3 TB data set used in Figure 19 on page 19.

The results in Figure 21 on page 21 are based on IBM internal tests of the System x3690 X5 with a MAX5 solution running a BA workload. This outcome is compared to the results of testing a competitor configuration (previous version; no longer available) with the same workload, both in a controlled laboratory environment. The prices are based on a three-year total cost of acquisition (based on US list prices).

The cost calculation compares the average cost for each report for complex and intermediate report types. The calculation is based on the assumption that the specific types of reports run serially for the three-year life of the machine. The three-year total cost of acquisition includes the expected hardware, software, service, and support. Results might not be typical over a three-year period based on actual configuration, applications, specific queries, and other variables in a production environment. Users of this document need to verify the applicable data for their specific environment.

[Figure 21 content: average report response time in seconds, serial execution test, 1 TB database size]

Report type    Competitor   x3690+MAX5   Result with x3690+MAX5
Complex        1291         1061         1.2x faster, 74% cheaper/report
Intermediate   39.1         14.8         2.6x faster, 82% cheaper/report
Simple         1.49         0.22         6.9x faster, 86% cheaper/report

Figure 21 DB2 on System x3690 X5 with MAX5 runs serial BA queries faster than a comparable system

Results confirmed that the System x3690 X5 with MAX5 was faster for each report type (complex, intermediate, and simple). The cost per report, calculated over the three-year period, was also lower for each report type.

Integrated solutions: IBM Smart Analytics 5600S R2

Next, we look at a BA System x solution that is a pre-integrated, modular system optimized for analytics workloads. The solution is a System x server known as the IBM Smart Analytics 5600S R2 (see footnote 5). This server uses Intel technology. It is optimized for analytics workloads by using IBM warehousing and business intelligence software to transform information into actionable insight. Figure 22 describes Smart Analytics 5600S R2.

5 For product specifications, see this website: http://www-01.ibm.com/software/data/infosphere/smart-analytics-system/5600/

[Figure 22 content: A complete, workload optimized, data warehouse system
• Includes IBM software, System x hardware, and IBM storage with SSD
• System is pre-installed, tuned, and "Data Load Ready"

Components:
• Analytics: Cognos 8 Business Intelligence, Cubing Services, IBM SPSS 19 statistical package, Text Analytics and Data Mining
• Data Warehouse Software: InfoSphere Warehouse, Advanced Workload Management, Tivoli System Automation
• Hardware and Storage: IBM System x (System x3500 M3, System x3850 X5, System x3650 M3), System Storage DS3500, SSD standard in all configurations

Benefits:
• Minimize installation, integration, configuration, and tuning requirements
• Faster time to value: deploy in days instead of months; focus on creating business value, not installing hardware and software
• Costs less than building it yourself: less staff and expertise required to implement, tune, and maintain
• Unique offering: modular for flexibility and growth; no competitor offers such an integrated analytics solution]

Figure 22 Overview of the IBM Smart Analytics system

Additional features of the IBM Smart Analytics 5600S R2 provide the following benefits:
• Built on a foundation of IBM data warehouse management software, storage, and the System x server platform
• Supports the Linux operating system
• Offers a wide range of BA capabilities that include business intelligence reporting, analysis, dashboarding, data mining, cubing services, and text analytics
• Ensures an affordable upgrade path with a flexible architecture that is designed to address the future growth of data volume, concurrent users, and analytics capabilities
• Incorporates FusionIO SSD technology in optional modules to improve response time and boost efficiency for complex workloads
• Accelerates system deployment with the simplicity of an appliance, while retaining the flexibility of a custom integration approach
• Allows for coordinated updates and maintenance on the control console
• Includes setup services and a single point of support for convenience

Figure 23 depicts the process that we used for the next test, conducted in 2011: running concurrent operational reports on the IBM Smart Analytics 5600S R2. The workload consisted of 80 users, each running one type of report, all submitting their queries to run in parallel. As each report completed, it was submitted again. The test ended after four hours.

[Figure 23 content: Concurrent execution test with 80 users executing reports concurrently against the data server.
• Complex reports: 4 users on 4 connections (Report 1: 2 users; Report 3: 2 users)
• Intermediate reports: 20 users on 20 connections (Report 9: 6 users; Report 10: 8 users; Report 11: 6 users)
• Simple reports: 56 users on 56 connections (Reports 2, 4, 5, and 6: 14 users each)
Each report executes one or more queries. The test measures the maximum reports per hour.]

Figure 23 Process for running concurrent operational reports (multiple users)
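The measurement method in Figure 23 is a closed loop: each user resubmits a report as soon as the previous one completes, and only reports that finish inside the four-hour window count toward reports per hour. The following sketch illustrates that bookkeeping. The user counts come from Figure 23, but the per-report durations are hypothetical placeholders (not measured values), and the model assumes a fixed duration per report, which ignores the contention that a real concurrent run would show.

```python
from dataclasses import dataclass

TEST_WINDOW_S = 4 * 3600  # the test ended after four hours


@dataclass
class UserGroup:
    report_type: str
    users: int                 # concurrent connections, from Figure 23
    seconds_per_report: float  # HYPOTHETICAL duration, for illustration only


def reports_per_hour(group: UserGroup, window_s: int = TEST_WINDOW_S) -> float:
    """Count reports completed by a closed-loop group, normalized per hour."""
    completed_per_user = int(window_s // group.seconds_per_report)
    return group.users * completed_per_user / (window_s / 3600)


groups = [
    UserGroup("complex", 4, 7200.0),
    UserGroup("intermediate", 20, 180.0),
    UserGroup("simple", 56, 2.0),
]

assert sum(g.users for g in groups) == 80  # matches the 80-user workload

for g in groups:
    print(f"{g.report_type}: {reports_per_hour(g):.1f} reports/hour")
```

Because unfinished reports are not counted, a group whose reports take longer than the window would contribute zero, which is one reason the complex-report throughput in this kind of test is a small number.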

For each type of report, we calculated the reports generated per hour. Figure 24 on page 24 shows that the Smart Analytics 5600S R2 completed more reports and has a lower three-year total cost of acquisition. The results in Figure 24 are based on IBM internal tests of the Smart Analytics 5600S R2 solution. This outcome is compared to the results of testing a competitor configuration (previous version; no longer available) with the same workload, both in a controlled laboratory environment. The test compares the performance of running different reports of varying complexity, concurrently. The prices are based on a three-year total cost of acquisition (based on US list prices).

Results might not be typical and vary based on actual configuration, applications, specific queries, and other variables in a production environment. Users of this document need to verify the applicable data for their specific environment.

[Figure 24 content: concurrent workload results, 2 TB]

Metric                                             Competitor   5600S R2   Result for 5600S R2
Complex reports per hour (higher is better)        0.22         1.77       705% more
Intermediate reports per hour (higher is better)   23.75        360.5      1418% more
3YR TCA, system list price (lower is better)       $2.86 M      $2.25 M    21% lower TCA

Figure 24 Throughput of complex reports during concurrent workloads
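The percentages in Figure 24 are relative differences between the two systems. The following sketch recomputes them from the values printed in the chart; the helper names are illustrative, and the TCA inputs are the rounded chart labels ($2.86 M and $2.25 M), not the exact list prices.

```python
def percent_more(new: float, old: float) -> float:
    """Relative increase of new over old, in percent."""
    return (new - old) / old * 100


def percent_lower(new: float, old: float) -> float:
    """Relative decrease of new below old, in percent."""
    return (old - new) / old * 100


complex_gain = percent_more(1.77, 0.22)          # complex reports per hour
intermediate_gain = percent_more(360.5, 23.75)   # intermediate reports per hour
tca_saving = percent_lower(2.25, 2.86)           # 3YR TCA in millions of USD

print(f"Complex reports per hour: {complex_gain:.0f}% more")
print(f"Intermediate reports per hour: {intermediate_gain:.0f}% more")
print(f"Three-year TCA: {tca_saving:.0f}% lower")
```

Note that "705% more" corresponds to roughly eight times the competitor's throughput, because a relative increase of 705% means 8.05 times the baseline.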

We conducted another performance comparison with the Smart Analytics 5600S R2, this time as a fixed report study. We ran a specific number of each report, according to the specifications listed in Figure 25. The reports were run by 80 parallel users until all were completed.

[Figure 25 content: Fixed execution test with 80 reports running concurrently against the data server, 161,166 total reports. Complex reports (Reports 1 and 3) use 4 connections, intermediate reports (Reports 9, 10, and 11) use 20 connections, and simple reports (Reports 2, 4, 5, and 6) use 56 connections. Execution counts per report, as extracted: 4 3 21 17999 29,078 84,854 19,600 27,328. Each report executes one or more queries. The test measures elapsed time until all reports are complete.]

Figure 25 Process for testing parallel execution of reports (fixed execution)

Figure 26 shows the results. The Smart Analytics 5600S R2 took 76% less time to complete the reports than the comparison server did, at a 21% lower three-year total cost of acquisition. Values in Figure 26 are based on IBM internal tests of the Smart Analytics 5600S R2. The Smart Analytics solution is compared to the testing of a competitor configuration (previous version; no longer available) with the same workload, both in a controlled laboratory environment.

The three-year total cost of acquisition includes hardware, software, service, and support. The values are based on US list prices. The test compares the performance of running different reports of varying complexity, concurrently. Results might not be typical and vary based on actual configuration, applications, specific queries, and other variables in a production environment. Users of this document need to verify the applicable data for their specific environment.

[Figure 26 content: Time to completion for fixed-size concurrent workload, 1 TB (lower is better). This test measures the amount of time to complete a fixed number of 161,166 concurrently executing reports.]

System                         Time to complete workload (minutes)   3YR TCA
Competitor                     327                                   $2,876,561
IBM Smart Analytics 5600S R2   77                                    $2,256,124

Result for the IBM Smart Analytics 5600S R2: 76% less time for 21% lower cost

Figure 26 In this study, we assessed the time to run a fixed set of concurrent reports
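Because the workload in this test is fixed, the elapsed times in Figure 26 translate directly into sustained throughput. The sketch below derives reports per minute and the elapsed-time ratio from the published values; the variable and function names are illustrative.

```python
TOTAL_REPORTS = 161_166  # fixed number of reports, from Figure 26

ELAPSED_MINUTES = {
    "Competitor": 327,
    "IBM Smart Analytics 5600S R2": 77,
}


def reports_per_minute(total_reports: int, minutes: float) -> float:
    """Average throughput over the whole run."""
    return total_reports / minutes


for system, minutes in ELAPSED_MINUTES.items():
    rate = reports_per_minute(TOTAL_REPORTS, minutes)
    print(f"{system}: {rate:.0f} reports/minute")

ratio = ELAPSED_MINUTES["Competitor"] / ELAPSED_MINUTES["IBM Smart Analytics 5600S R2"]
time_saved_pct = (1 - 1 / ratio) * 100
print(f"Elapsed-time ratio: {ratio:.1f}x, or {time_saved_pct:.0f}% less time")
```

The ratio of about 4.2 and the 76% reduction in elapsed time describe the same result in two different ways.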

Appliance solutions: Netezza

Figure 27 provides a brief tour of the Netezza 1000 and the features that set it apart.

[Figure 27 content: Netezza 1000 hardware: dedicated high performance disk storage, clustered SMP hosts, and blades with custom hardware query accelerators.
• Optimized hardware and software: workload optimized system for high performance analytics
• Simplicity: minimal administration and tuning
• Speed: 10-100x faster than traditional systems
• Scalability: peta-scale user data capacity
• Smart: high-performance advanced analytics
• Meaningful proof of concept demonstration in 2 weeks]

Figure 27 Overview of the IBM Netezza 1000 data warehouse appliance

The following additional benefits are unique to the IBM Netezza 1000:
• The hardware and software are optimized and fine-tuned to process analytic queries; therefore, the appliance requires zero to minimal tuning for performance.
• The software is designed to push the vast majority of processing to the Massively Parallel Processing (MPP) network.
• The architecture maximizes processing efficiency by ensuring there are no bottlenecks along the I/O path.
• The unique feature of the Netezza 1000 is the hardware-based query acceleration, which filters out the 95 - 98% of the data that is not needed for query processing. As a result, the query acceleration manages the deluge of data that is typical of data warehousing deployments.
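To give a sense of scale for the filtering claim above, the following sketch estimates how much data would remain for the processors after hardware filtering. It assumes that the 95 - 98% figure applies to the volume of data scanned from disk and that 1 TB equals 1000 GB; both are simplifying assumptions for illustration, not statements about how the appliance accounts for data.

```python
def data_after_filter_gb(scanned_gb: float, filtered_fraction: float) -> float:
    """Volume left for the processors after a fraction is filtered out."""
    if not 0.0 <= filtered_fraction <= 1.0:
        raise ValueError("filtered_fraction must be between 0 and 1")
    return scanned_gb * (1.0 - filtered_fraction)


SCANNED_GB = 1000.0  # assumed scan of 1 TB, expressed in decimal GB

for fraction in (0.95, 0.98):
    remaining = data_after_filter_gb(SCANNED_GB, fraction)
    print(f"{fraction:.0%} filtered: about {remaining:.0f} GB reaches the processors")
```

Under these assumptions, a 1 TB scan leaves roughly 20 - 50 GB for downstream query processing.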

The Netezza 1000 is suitable for serial queries for the following reasons:
• Purpose-built architecture for BA: Shared-nothing architecture for MPP
• Data streaming and hardware-based acceleration, based on Field Programmable Gate Array (FPGA), speed up queries in the following ways:
  - Data streams from disk through an FPGA chip to the processors, without network latency and unnecessary processor usage.
  - Each MPP node is self-contained and has minimal data movement.
• Faster specialized code path:
  - Purpose-built optimizer is tailored for complex analytics with higher predictability.
  - The absence of traditional database structures and OLTP features results in a simpler code path and faster performance.

For the next test, we measured the performance of the Netezza 1000 against a comparison machine by using the serial execution test on the 1 TB data set. Product specifications and costs for the two machines are noted in Figure 28.

Values in Figure 28 are the list prices that were published on the IBM website (http://www.ibm.com) when the analysis was conducted in mid-2011. Values are based on a three-year total cost of acquisition of a competitor configuration (previous version; no longer available). Total cost of acquisition includes the hardware, software, service, and support that are used to run analytic workloads. Prices are based on US list prices and vary by country. Users of this document need to verify the applicable data for their specific environment.

[Figure 28 content: System configurations.
Netezza 1000-12:
• Netezza 1000-12 host: memory (48 GB), (4) 14 GB SAS disk for RHEL 5 64-bit OS, 10 GbE
• (12) Netezza S-Blade with custom hardware accelerators, Linux 64-bit: 2 quad-core CPU 2.5 GHz, 8 FPGAs 125 MHz, memory (8 GB)
• (8) disk enclosures, (12) 1 TB, RAID 1 mirroring
• 3 YR TCA = $2,140,600
Competitor database system:
• 2 database nodes: 2x8 cores x86, 2x72 GB memory
• High speed switch
• 3 storage cells: 3x8 cores x86, 3x24 GB memory, 3x384 GB SSD cache, 3x12x600 GB SAS2
• 3 YR TCA = $2,876,561]

Figure 28 System comparison of IBM Netezza 1000-12 versus competitor

Additional product specifications of the IBM Netezza 1000 include the following benefits:
• High-performance analytics with its default settings: Requires no indexing or tuning, leading to shorter deployment cycles and faster time-to-value
• A powerful platform, designed specifically for unifying business intelligence and advanced analytics on large volumes of data
• Appliance simplicity: Easy to deploy and manage, and dramatically simplifies your data warehouse and analytic infrastructure
• Powerful scaling and low total cost of ownership: Scales from 1 TB to 1.5 petabytes, delivering 10 - 100 times the performance improvements over traditional architectures
• Unique asymmetric massively parallel processing (AMPP) architecture that combines open, IBM blade servers, and disk storage with the IBM patented data filtering FPGAs

Figure 29 shows the results of the response times for running the analytic queries. Not only was the Netezza considerably less expensive (as shown in Figure 28 on page 27), it was also much faster than the comparison machine. Values in Figure 29 are based on IBM internal tests of the Netezza 1000-12 solution. The Netezza solution is compared to the testing of a competitor configuration (previous version; no longer available), running analytics workloads in a controlled laboratory environment.

Three-year total cost of acquisition includes hardware, software, service, and support. Values are based on US list prices. The test compares the performance of running different reports of varying complexity. Results might not be typical and vary based on actual configuration, applications, specific queries, and other variables in a production environment. Users of this document need to verify the applicable data for their specific environment.

[Figure 29 content: average report response time in seconds, serial execution test, 1 TB (lower is better)]

Report type    Competitor   Netezza 1000-12   Result for Netezza
Complex        1,291        123               10x faster
Intermediate   39.10        12.47             3x faster

Figure 29 Response times for running analytic queries

Figure 30 shows the cost per report over a three-year period for the Netezza 1000 and the comparison machine. As shown in Figure 30, Netezza costs considerably less per report. Values in Figure 30 are based on IBM internal tests of the Netezza 1000-12 solution, conducted in 2011. The Netezza solution is compared to the testing of a competitor configuration (previous version; no longer available), running analytics workloads in a controlled laboratory environment.

Three-year total cost of acquisition includes hardware, software, service, and support. Values are based on US list prices. The test compares the performance of running different reports of varying complexity, concurrently. Results might not be typical and vary based on actual configuration, applications, specific queries, and other variables in a production environment. Users of this document need to verify the applicable data for their specific environment.

[Figure 30 content: average cost per report, serial execution test, 1 TB (lower is better)]

Report type    Competitor   Netezza 1000-12   Result for Netezza
Complex        $39.00       $2.78             93% less cost per complex report
Intermediate   $1.18        $0.28             76% less cost per intermediate report

Figure 30 Comparison of cost per report
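The cost-per-report values can be approximated from the published three-year TCA and the average response times, under the stated assumption that reports of one type run serially for the three-year life of the machine. The sketch below is one plausible reading of that calculation, not the exact method used in the study: it assumes 365-day years and continuous operation, and the function name is illustrative.

```python
SECONDS_IN_THREE_YEARS = 3 * 365 * 24 * 3600  # assumption: 365-day years, no downtime


def cost_per_report(tca_usd: float, seconds_per_report: float) -> float:
    """Three-year TCA divided by the reports that fit serially in three years."""
    reports_in_three_years = SECONDS_IN_THREE_YEARS / seconds_per_report
    return tca_usd / reports_in_three_years


NETEZZA_TCA = 2_140_600     # USD, from Figure 28
COMPETITOR_TCA = 2_876_561  # USD, from Figure 28

# Average response times in seconds, from Figure 29: (competitor, Netezza)
for report_type, (comp_s, ntz_s) in {"complex": (1291, 123),
                                     "intermediate": (39.10, 12.47)}.items():
    comp_cost = cost_per_report(COMPETITOR_TCA, comp_s)
    ntz_cost = cost_per_report(NETEZZA_TCA, ntz_s)
    saving = (1 - ntz_cost / comp_cost) * 100
    print(f"{report_type}: competitor ${comp_cost:.2f}, "
          f"Netezza ${ntz_cost:.2f}, {saving:.0f}% less")
```

Under these assumptions the Netezza values come out at about $2.78 and $0.28, and the savings at about 93% and 76%, in line with Figure 30. The competitor values come out slightly higher than the chart labels (about $39.25 and $1.19 versus $39.00 and $1.18), so the study probably used slightly different rounding or time assumptions.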

The entire study, shown in Figure 30, was completed in a span of four hours for Netezza. However, it took about four days of reading manuals and tuning (ASM rebalance, partitioning, and parallelizing hints) for the comparison database to obtain optimal performance.

Figure 31 on page 30 describes the performance of Netezza 1000 (with no tuning) and a comparison machine (after four days of tuning).

[Figure 31 content: SPSS Modeler Client on an IBM x3650 M2 runs an analytic model* against each system; 102 M rows, 18 GB of data]

Step                                                Netezza 1000-12       Competitor            Competitor       Result for
                                                    (compressed,          (Compressed High,     (after tuning)   Netezza
                                                    out-of-box)           out-of-box)
Load data                                           23 seconds            15 minutes            164 seconds      7x faster
Run regression model (customer churn prediction)    9 seconds             59 minutes            178 seconds      20x faster

* Created 10 Telco Churn Models using Multinomial Logistic Regression, and scored a compressed table with 102 M rows using SQL Pushback

Figure 31 Right from the beginning, Netezza 1000 requires no tuning and outperforms a comparable system

A description of the test method, shown in Figure 31, follows.

Data Load Phase

The data was from a Telco study that used a table with 102 million rows.

Compression was used for both databases. Netezza supports only compressed data, and we used both Compress Query Low and Compress Query High for the comparison database. On the comparison database, the data load with Compress Query High is slower than with Compress Query Low. We used the better number to give the comparison system the benefit of the doubt (for scoring performance, the number does not change between Compress Query Low and Compress Query High).

The out-of-the-box tests were conducted with the same SQL queries on both systems (that is, there was no tuning on either system). Netezza did not require tuning, and its data definition language (DDL) included random partitioning. For the comparison database with default settings, the data load took 15 minutes (Compress Query High) and five minutes (uncompressed). We then tuned the comparison database manually: we created four table spaces (each for input file S100M and output file OP100M), used 64 partitions and 64 parallel threads (degree), and modified the INSERT SQL with parallelization hints (such as /*+ APPEND */).

Table 3 on page 31 shows the data load times with compression time (seconds) for this test.

Table 3 Data load times and compression time (seconds)

Step                              Netezza   Comparison database    Comparison database
                                            (Normal compression)   (High compression)
Create table S100M                0.246     0.48                   0.33
Load a 100,000-record data file   3.315     10.47                  18.01
Fold (ten times)                  19.809    102.63                 145.55
Total time                        23.37     113.58                 163.89

Table 3 shows the results of this test in terms of total time. Netezza loaded the data about seven times faster than the comparison database with high compression (23.37 versus 163.89 seconds) and about five times faster than with normal compression (113.58 seconds). Netezza was tested with its default settings, whereas the comparison machine was configured over a period of four days to be ready for the test.
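The totals and ratios in Table 3 can be recomputed from the per-step times. The sketch below does so, and also checks one possible interpretation of the "Fold (ten times)" step: if each fold doubles the table (an assumption that the paper does not state), ten folds of a 100,000-record file give about 102 million rows, which is consistent with the table size quoted for this test.

```python
# Per-step times in seconds from Table 3: (create table, load file, fold ten times)
TABLE_3_SECONDS = {
    "Netezza": (0.246, 3.315, 19.809),
    "Comparison (normal compression)": (0.48, 10.47, 102.63),
    "Comparison (high compression)": (0.33, 18.01, 145.55),
}

totals = {name: round(sum(steps), 2) for name, steps in TABLE_3_SECONDS.items()}

for name, total in totals.items():
    ratio = total / totals["Netezza"]
    print(f"{name}: {total:.2f} s total, {ratio:.1f}x the Netezza time")

# ASSUMPTION: each fold doubles the row count (not stated in the paper).
rows_after_folds = 100_000 * 2 ** 10
print(f"Rows after ten doubling folds: {rows_after_folds:,}")
```

The recomputed totals match the "Total time" row, and the ratios are about 7.0 for high compression and about 4.9 for normal compression.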

Figure 32 describes how comparison systems are sometimes designed to support the “one-size-fits-all” concept. This Redpaper publication shows that this approach is typically not the best solution. Our workload-optimized solutions can offer more efficiency and better price performance.

[Figure 32 content: IBM workload-optimized solutions for Business Analytics compared with a competitor "one-size-fits-all" system (competitor inefficiency; storage cell software wasted).
• Custom built: serial analytic queries; DB2 + eX5 + MAX5
• Integrated: concurrently executing analytic queries; Smart Analytics 5600 beats the competition
• Appliances: serial analytic queries; Netezza beats the competition]

Figure 32 Database workloads can benefit from one of the IBM workload-optimized solutions

IBM software and System x are best on Intel for data workloads

For OLTP and BA workloads, this paper describes how IBM software and System x can meet your needs, without regard to complexity. One size does not fit all; IBM is aware of that fact. Part of the success of the IBM software and System x is the powerful, expansive Intel processor. Together, IBM software, System x, and Intel are designed to address your database workload requirements. Figure 33 on page 32 defines the primary factors that make IBM and Intel a good combination for your OLTP and BA workloads.

[Figure 33 content:
OLTP workloads and the competition:
• DB2 and System x with eXFlash provided 54% more throughput at up to 46% less cost
• DB2 and System x with Storwize V7000 Easy Tier provided 1.25x more throughput at 60% less cost
Business Analytics workloads and the competition:
• DB2 and System x with MAX5 won for serial reports in performance and cost
• IBM Smart Analytics System 5600 R2 won for concurrent workloads in performance and cost
• Netezza won for serial reports in performance and cost with an appliance solution]

Figure 33 IBM software and System x are best on Intel for OLTP and BA workloads

Summary

Two specific database workloads are discussed in this Redpaper document: OLTP and BA workloads.

For OLTP workloads, the eX5 family of servers offers high-performance processors for CPU-intensive processing. When I/O throughput becomes a bottleneck, System x offers eXFlash internal SSDs to boost I/O. When more boost is needed, the Storwize V7000 with Easy Tier provides it.

For BA workloads, choose from the most configurable high-end systems with the best memory, or choose pre-configured systems (with several options) that are tuned for BA. You can also select an appliance that does what you need and works quickly, with its initial default settings.

For both workload types, IBM offers the hardware and software to optimize your database workloads.

For more information

For more information about the products described in this publication, see these websites:
• IBM System x overview
  http://ibm.com/systems/x/
• IBM eX5 enterprise systems overview
  http://www.ibm.com/systems/info/x86servers/ex5/
• IBM System x3690 X5 server overview
  http://www.ibm.com/systems/x/hardware/enterprise/x3690x5/
• IBM eX5 Portfolio Overview: IBM System x3850 X5, x3950 X5, x3690 X5, and IBM BladeCenter HX5, REDP-4650, December 2011.
  http://www.redbooks.ibm.com/abstracts/redp4650.html
• IBM System x3850 X5 overview
  http://www.ibm.com/systems/x/hardware/enterprise/x3850x5/
• IBM BladeCenter HX5 overview
  http://www.ibm.com/systems/info/x86servers/ex5/blade/
• IBM Easy Tier overview
  http://www.ibm.com/systems/storage/disk/storwize_v7000/
• IBM eXFlash storage overview
  http://www.ibm.com/systems/info/x86servers/optimized/database/index.html
• IBM MAX5:
  - Add Memory, Improve Performance, and Lower Costs with IBM MAX5 Technology, 2012, REDP-4846.
    http://www.redbooks.ibm.com/abstracts/redp4846.html
  - The MAX5 Advantage: How IBM System x MAX5 Benefits Microsoft SQL Server Data Warehouse Workloads, REDP-4751, 2011.
    http://www.redbooks.ibm.com/abstracts/redp4751.html
  - IBM Unveils Industry’s First Systems that Rewrite Economics of ‘Industry-Standard’ Computing, 2010
    http://www.ibm.com/press/us/en/pressrelease/29570.wss
• IBM ProtecTIER® overview
  http://www.ibm.com/systems/storage/tape/ts7650g/
• IBM Systems overview
  http://www.ibm.com/systems/

The team who wrote this paper

This paper was produced by a team of specialists from around the world, working at the International Technical Support Organization, Poughkeepsie Center.

Tim Bohn is an IBM Senior Certified IT Specialist and Open Group Master Certified IT Specialist with the IBM Software Group, Strategy and Technology Lab (STL). His current focus is the IBM System x platform hardware and software. Prior to joining STL in 2010, Tim was the Global Community of Practice leader for the IBM Rational® Systems Development Community, where he helped clients with systems and software development practices. Prior to joining IBM Rational Software, Tim worked as a software and systems engineer for 15 years. Tim holds Bachelor and Master degrees in Computer Science from USC. He is a yearly guest lecturer at USC and UCLA.

Karen Lawrence is a Technical Writer with the IBM ITSO team in North Carolina, US. She has 25 years of experience in IT, with expertise in application design, change management, and the software development lifecycle (SDLC). She has worked with SMEs on leading technologies in global, regulated enterprises, in the areas of storage systems, security, and disaster recovery, and the applications to support these. Karen holds a Bachelor's degree in Business Administration from Centenary College in New Jersey.

Thanks to the following people for their contributions to this project:

Many thanks to Randall W. (Randy) Lundin, IBM Product Marketing - High End System x, for his contributions and sponsorship of this project.

Thanks also to Linda Robinson, IBM Graphics Specialist; and to the IBM ITSO editing team for their contributions to this project.

Now you can become a published author, too!

Here’s an opportunity to spotlight your skills, grow your career, and become a published author—all at the same time! Join an ITSO residency project and help write a book in your area of expertise, while honing your experience using leading-edge technologies. Your efforts will help to increase product acceptance and client satisfaction, as you expand your network of technical contacts and relationships. Residencies run from two to six weeks in length, and you can participate either in person or as a remote resident working from your home base.

Find out more about the residency program, browse the residency index, and apply online at: ibm.com/redbooks/residencies.html

Stay connected to IBM Redbooks

- Find us on Facebook: http://www.facebook.com/IBMRedbooks
- Follow us on Twitter: http://twitter.com/ibmredbooks
- Look for us on LinkedIn: http://www.linkedin.com/groups?home=&gid=2130806
- Explore new IBM Redbooks® publications, residencies, and workshops with the IBM Redbooks weekly newsletter: https://www.redbooks.ibm.com/Redbooks.nsf/subscribe?OpenForm
- Stay current on recent Redbooks publications with RSS Feeds: http://www.redbooks.ibm.com/rss.html

Notices

This information was developed for products and services offered in the U.S.A.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM websites are provided for convenience only and do not in any manner serve as an endorsement of those websites. The materials at those websites are not part of the materials for this IBM product and use of those websites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

COPYRIGHT LICENSE:

This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.

© Copyright International Business Machines Corporation 2012. All rights reserved.
Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

This document REDP-4848-00 was created or updated on August 31, 2012.

Send us your comments in one of the following ways:

- Use the online Contact us review Redbooks form found at: ibm.com/redbooks
- Send your comments in an email to: [email protected]
- Mail your comments to: IBM Corporation, International Technical Support Organization, Dept. HYTD Mail Station P099, 2455 South Road, Poughkeepsie, NY 12601-5400 U.S.A.

Trademarks

IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. These and other IBM trademarked terms are marked on their first occurrence in this information with the appropriate symbol (® or ™), indicating US registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtml

The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:

BladeCenter®
Cognos®
DataPower®
DB2®
Easy Tier®
IBM®
ProtecTIER®
Rational®
Redbooks®
Redpaper™
Redbooks (logo)®
Storwize®
System x®
Tivoli®

The following terms are trademarks of other companies:

Netezza, and N logo are trademarks or registered trademarks of IBM International Group B.V., an IBM Company.

Intel Xeon, Intel, Intel logo, Intel Inside logo, and Intel Centrino logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

Linux is a trademark of Linus Torvalds in the United States, other countries, or both.

Microsoft, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.

Other company, product, or service names may be trademarks or service marks of others.
