Data management

Optimizing data warehouse configurations for peak efficiency

By Satheesh Iyer

Reference architectures and best practices enable enterprises to capitalize on their information resources. Building on Microsoft guidelines, Dell has engineered end-to-end data warehousing solutions utilizing Microsoft® SQL Server® database software.

Preprinted from Dell Power Solutions, 2012 Issue 3. Copyright © 2012 Dell Inc. All rights reserved.

The proliferation of data from diverse sources has a significant impact on data center operations. An efficient, highly scalable data warehousing strategy helps enterprises manage the volume, velocity, and variety of data running into and across the organization, enabling them to tap the immense potential of big data for a distinct competitive advantage.

Adopting an effective data warehousing strategy enables enterprises to mine vast collections of data and historical information quickly and efficiently, allowing workers to derive meaningful insights that advance business and organizational outcomes. For this reason, data warehouses are rapidly becoming essential elements in the IT infrastructure, as organizations harness data to identify trends, provide business intelligence reporting, and perform predictive analyses (see Figure 1).

To deliver on this vision, organizations need to strike the optimal balance of capacity and performance within their data warehouse systems. And none too soon. Counterproductive increases in response time are resulting from a convergence of factors such as escalating data volumes and their related loading demands, mounting complexity of online analytical processing (OLAP) queries, and a rising number of end users.

Best practices and guidelines jointly developed by Dell and Microsoft can assist IT staff in designing and implementing a balanced configuration for Microsoft SQL Server 2012 data warehouse workloads. These recommendations enable organizations to quickly and cost-effectively achieve scalable performance while helping to reduce management complexity.

Figure 1. Deriving valuable insights from resources
[Diagram labels: data sources (logistics and inventory, financial forecast, Web portal, customer relationship management, call record); organized data; decision support (timely reports and outputs with real data, trend analysis, accurate and timely decisions)]

Overcoming obstacles to data optimization

Despite the high business value of accurate and timely information, obstacles may hinder organizations from gathering data and making sense of it in a cost-effective way. Data continues to grow, both internally and externally. Disparate data sources with diverse formats and duplicate information add complexity. And OLAP is heightening pressure on data management systems as online business initiatives continue to gain momentum.

Many IT groups lack the internal resources to design and implement a data warehouse approach that can effectively address these challenges, primarily because of IT staff size and budget constraints. Ensuring an optimal balance of I/O, storage, memory, and processing power is essential in designing a database configuration so that no single element becomes a bottleneck. Without the requisite expertise in database architecture and administration, organizations may overprovision and experience costly inefficiencies during the design and implementation process.

Using external resources to piece together software, servers, and storage also has its drawbacks, often resulting in complex systems that offer no single point of accountability. Other alternatives, such as traditional warehouse systems based on proprietary technology, are often costly to acquire and may require expensive, ongoing contracts to optimize and maintain these systems.

Implementing advanced data warehouse capabilities

To help overcome these challenges, Microsoft has developed a framework designed to balance the hardware and software capabilities of the data warehouse system. The Microsoft SQL Server Fast Track Data Warehouse (FTDW) reference architecture implements data warehousing that is optimized for the large sequential scans and reads characteristic of many OLAP systems today. This methodology is designed to yield outstanding performance compared with traditional data warehousing implementations.

Dell has built on SQL Server FTDW to develop best practices and reference guidelines that help organizations implement SQL Server FTDW on Dell™ hardware. The reference configurations integrate next-generation Dell PowerEdge™ servers and Dell PowerVault™ storage arrays with the data warehouse capabilities of Microsoft SQL Server. The result is an optimized, balanced, end-to-end data warehouse solution.

Utilizing this Dell approach helps IT organizations reverse the conventional notion that a data warehouse solution often requires a complex, proprietary, and one-size-fits-all approach that can be costly. Dell helps organizations avoid the intricacies and learning curves required to architect a complex data warehouse solution, which often results in overprovisioning of hardware, cost overruns, and time delays.

Several configuration alternatives are available, enabling organizations to start at the size and stage of infrastructure they currently have and scale their database system using standards-based, off-the-shelf hardware. The reference configurations are applicable to a range of industries, segments in public and private sectors, and organization sizes.

Moreover, Dell and Microsoft continually refresh SQL Server FTDW reference architecture offerings with technology advancements. For example, Microsoft SQL Server 2012 and 12th-generation PowerEdge servers powered by the Intel® Xeon® processor E5 product family have recently been added. Dell is committed to continuously refreshing data warehousing solutions as technology evolves. (See the sidebar, “Take the fast track to data warehousing.”)

Based on SQL Server FTDW 4.0, the refreshed configurations are built on small, 5 TB; medium, 10 TB; and large, 20 TB platforms (see Figure 2). The 5 TB platform is based on the PowerEdge R720xd server with internal storage for small and medium businesses (SMBs). The 10 TB platform is based on the PowerEdge R720 server and a PowerVault MD3620i Internet SCSI (iSCSI) storage array for medium data warehouse requirements. The 20 TB platform is based on the PowerEdge R720 server and a PowerVault MD3620f Fibre Channel storage array for large data warehouse requirements. For high availability, Dell provides tested and validated solutions for configuring database clustering using multiple PowerEdge R720 servers.
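As a rough illustration of how the three reference platforms map to capacity needs, the following Python sketch encodes the 5 TB, 10 TB, and 20 TB configurations described in the article and picks the smallest one that covers a stated requirement. The `Platform` class and the `smallest_fitting_platform` helper are hypothetical names introduced here for illustration. Rated capacity is only one sizing input, so treat this as a starting point rather than a selection procedure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Platform:
    name: str
    rated_capacity_tb: int
    server: str
    storage: str


# Servers, storage, and rated capacities as described in the article.
PLATFORMS = [
    Platform("small", 5, "PowerEdge R720xd", "internal storage"),
    Platform("medium", 10, "PowerEdge R720", "PowerVault MD3620i (iSCSI)"),
    Platform("large", 20, "PowerEdge R720", "PowerVault MD3620f (Fibre Channel)"),
]


def smallest_fitting_platform(required_tb: float) -> Optional[Platform]:
    """Return the smallest platform whose rated capacity covers required_tb.

    Hypothetical helper: it compares capacity only and ignores I/O, memory,
    and processing balance, which the article treats as equally important.
    """
    for platform in sorted(PLATFORMS, key=lambda p: p.rated_capacity_tb):
        if required_tb <= platform.rated_capacity_tb:
            return platform
    return None  # Requirement exceeds every listed reference configuration.
```

For example, `smallest_fitting_platform(8)` returns the medium platform, while a 25 TB requirement returns `None` because it exceeds the largest listed configuration.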


Delivering solutions, not just a set of instructions

Organizations utilizing the combination of next-generation Dell hardware, Dell Services offerings, optional integration of Dell Boomi™ AtomSphere™ Integration Cloud™ software, and the reference configurations based on Microsoft SQL Server FTDW can capitalize on a tested, certified, production-ready approach for their environments. The data warehouse system can be up and running quickly, with minimal impact on an organization’s operational systems.

Dell Data Warehouse solutions are based on Dell PowerEdge servers that provide robust expandability of server memory, connectivity, and internal storage, while maintaining a level of power efficiency that is critical for dense environments. PowerEdge servers offer high-availability features such as on-board RAID controllers and redundant fans and power supplies that help ensure maximum uptime for the consolidated environment. Maintaining the data warehouse infrastructure can be even easier using innovative management features that PowerEdge servers provide. These features include the Integrated Dell Remote Access Controller 7 (iDRAC7) with Lifecycle Controller and an integrated out-of-band network interface to provide hands-free deployment and maintenance of PowerEdge servers.

Integrated Dell PowerVault storage helps improve operational efficiencies, performance, and utilization; minimizes cost and complexity; supports backup and recovery; and can scale to meet capacity needs. These arrays also offer nondisruptive firmware upgrades and automated I/O path protection with host-based, multipath failover drivers. Other features include dual-active controllers that incorporate mirrored cache to protect data in the event of controller failure while the I/O continues processing without interruption. Data protection features such as snapshots, virtual disk copy, and remote replication help protect data effectively.

In addition, Dell Services supports data warehouse solutions with comprehensive knowledge and end-to-end services offerings. In particular, Dell Services can assist with assessment and implementation of a comprehensive data warehouse solution based on the SQL Server FTDW framework to help simplify the procurement, configuration, and deployment experience for organizations. AtomSphere Integration Cloud is also easily integrated into SQL Server FTDW deployments to streamline data integration by facilitating the connection of cloud computing and on-premises data sources more rapidly and cost-effectively than conventional integration solutions.

From April through June 2012, testing was conducted by Dell engineers at Dell Labs and validated and certified by Microsoft engineers. The testing validated SQL Server FTDW performance by measuring the maximum consumption rate (MCR) of the processor and the benchmark consumption rate (BCR); see footnote 1. Testing of the entire database stack was also performed for robustness and availability, as well as to determine the best practices for building a solid and balanced SQL Server FTDW system. Dell engineers worked closely with Microsoft, and the tremendous amount of information they shared helps strengthen the foundation for these Dell Data Warehouse solutions and ongoing development of the SQL Server FTDW framework.

Each layer of hardware and software is comprehensively tuned to help optimize performance of the data warehouse stack. Dell began by taking query response time

1 For more information on the test configurations, results, and best practices, see “Dell SMB reference configuration for Microsoft SQL Server 2012 Fast Track Data Warehouse on PowerEdge R720xd,” by Anthony Fernandez and Mayura Deshmukh, Dell Database Solutions Engineering, May 2012, qrs.ly/nt23dko; “Microsoft SQL Server 2012 Fast Track reference configuration using PowerEdge R720 and PowerVault MD3620i,” by Jisha J, Dell Database Solutions Engineering, July 2012, qrs.ly/ml242yi; and “Microsoft SQL Server 2012 Fast Track reference configuration using PowerEdge R720 and PowerVault MD3620f,” by Narasimha Reddy Gopu and Jisha J, Dell Database Solutions Engineering, May 2012, qrs.ly/tw23dl9.

Specification                        Small solution*             Medium solution*            Large solution*
Rated capacity                       5 TB                        10 TB                       20 TB
Processors                           One: quad-core Intel Xeon   Two: quad-core Intel Xeon   Two: eight-core Intel Xeon
                                     processor E5 product        processor E5 product        processor E5 product
                                     family                      family                      family
RAM                                  96 GB                       128 GB                      160 GB
Storage                              300 GB and 600 GB Serial    Dell PowerVault MD3620i     Dell PowerVault MD3620f
                                     Attached SCSI (SAS) drives
Switching                            None                        One iSCSI switch            One Fibre Channel switch
Single-server configuration ID**     2209618                     2405036                     2319046
Highly available configuration ID**  None                        2405110                     2316699

*All platforms take advantage of the columnstore index capability available with SQL Server 2012. In the SQL Server database engine, a columnstore index stores data in a specified table one column at a time and, combined with query optimization features, helps facilitate fast query performance for data warehouses. For more information, visit msdn.microsoft.com/en-us/library/gg492088.aspx.

**For more information on a specific platform, provide the appropriate configuration ID when contacting a Dell representative.

Figure 2. Configuring data warehouse platforms based on SQL Server FTDW 4.0
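The Figure 2 note says a columnstore index stores table data one column at a time. The toy Python sketch below illustrates only that layout idea and is not how the SQL Server database engine implements columnstore indexes. The sample rows, column names, and the `to_columns` helper are invented for illustration.

```python
from typing import Dict, List

# Hypothetical sample rows; column names and values are invented.
rows = [
    {"order_id": 1, "region": "East", "amount": 120.0},
    {"order_id": 2, "region": "West", "amount": 75.5},
    {"order_id": 3, "region": "East", "amount": 42.0},
]


def to_columns(records: List[dict]) -> Dict[str, list]:
    """Pivot row-oriented records into one list of values per column."""
    columns: Dict[str, list] = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: each whole record is fetched just to read one of its fields.
row_total = sum(record["amount"] for record in rows)

# Column layout: only the "amount" values are read.
column_total = sum(columns["amount"])
```

Both totals are identical. The difference is how much data must be touched to compute them, which is why a column-at-a-time layout suits the large scans and aggregations typical of data warehouse queries.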

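The tuning approach this article describes (start from query response time requirements, derive the processor cores needed for that throughput, then match the I/O channel and storage arrays to the processors) can be sketched as simple arithmetic. The Python sketch below is a simplification under stated assumptions, not the procedure Dell or Microsoft used. The function names, the 10 percent tolerance, and every input value are hypothetical, and `per_core_rate_mb_s` stands in for a measured per-core consumption rate.

```python
import math


def required_throughput_mb_s(data_scanned_mb: float,
                             concurrent_queries: int,
                             target_response_s: float) -> float:
    """Aggregate scan rate needed to answer the queries within the target time."""
    return data_scanned_mb * concurrent_queries / target_response_s


def cores_needed(throughput_mb_s: float, per_core_rate_mb_s: float) -> int:
    """Smallest whole number of cores whose combined rate meets the throughput."""
    return math.ceil(throughput_mb_s / per_core_rate_mb_s)


def is_balanced(cores: int, per_core_rate_mb_s: float,
                io_bandwidth_mb_s: float, tolerance: float = 0.10) -> bool:
    """True when I/O bandwidth is within `tolerance` of what the cores consume."""
    cpu_rate = cores * per_core_rate_mb_s
    return abs(io_bandwidth_mb_s - cpu_rate) <= tolerance * cpu_rate


# Hypothetical inputs, not measurements from the Dell tests.
throughput = required_throughput_mb_s(data_scanned_mb=48_000,
                                      concurrent_queries=4,
                                      target_response_s=120)
cores = cores_needed(throughput, per_core_rate_mb_s=200)
balanced = is_balanced(cores, per_core_rate_mb_s=200, io_bandwidth_mb_s=1_600)
```

With these assumed inputs the target is 1,600 MB/s, which eight cores at 200 MB/s each can consume. The `is_balanced` check reflects the stated goal of neither underutilizing nor overburdening any one component: storage far slower than the cores starves the processors, and storage far faster is overprovisioned.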

requirements and calculating the number of processor cores necessary to achieve that throughput. The I/O channel and storage arrays were then matched to maximize the processor throughput. The goal was to take advantage of all the hardware components in equilibrium, without underutilizing or overburdening any one component.

Engineering data warehousing for specific needs

Dell offers a variety of data warehouse solutions to meet specific organizational requirements. In addition to the Microsoft SQL Server FTDW reference architecture for Dell hardware, Dell also provides engineered departmental and midsized data warehouses quickly on Dell hardware, the Dell Quickstart Data Warehouse Appliance 1000, and enterprise-wide parallel data warehouses on Dell hardware. The Quickstart Data Warehouse Appliance 1000, which is validated on SQL Server FTDW reference architecture guidelines, offers a comprehensive approach that includes the hardware, software, and implementation services necessary to quickly deploy a powerful, robust data warehouse.

Other technologies that help organizations achieve their data warehouse objectives include AtomSphere Integration Cloud software for integrating cloud computing. Placing an organization’s data into the data warehouse is vital to the success of a data analysis system. But in many cases, integrating multiple data sources efficiently, easily, and accurately can be a challenging and time-intensive task. Dell helps simplify this critical step by offering the AtomSphere Integration Cloud software.

Organizations can also take advantage of Dell Services offerings available at each stage of the data life cycle. These services range from assessment and design to operation and maintenance. Additionally, global software and hardware support is available for Dell-based data warehouse environments. Organizations can opt for the level of service and support that best suits their needs, from do-it-yourself approaches, to build-and-transfer approaches, to data warehousing as a service.

Harnessing vital information in a data-driven world

Escalating growth in data from myriad, disparate sources provides many organizations with a compelling opportunity to implement data warehousing. Data organized into useful information helps organizations gain a competitive advantage in an increasingly data-driven world. Dell Data Warehouse solutions running on 12th-generation Dell PowerEdge servers and Dell PowerVault storage arrays enable organizations to leverage their own data to greatly enhance their understanding of their business operations and adapt to ever-changing demands.

The Dell and Microsoft-certified reference architecture provides server and storage deployment best practices and guidance for various data warehouse workloads, helping organizations identify highly efficient hardware for their needs. It also helps them save time and avoid the potential ramp-up and overprovisioning costs associated with their own technology procurement processes.

Sidebar: Take the fast track to data warehousing

Building a data warehouse using the Microsoft SQL Server Fast Track Data Warehouse (FTDW) 4.0 architecture and 12th-generation Dell PowerEdge servers is designed to provide several key benefits for organizations:

• A comprehensive solution pre-engineered at Dell Labs and certified by Microsoft
• High availability using multiple servers for redundancy
• A balanced and optimized system at all levels of the hardware and software stack
• Predictable, out-of-the-box performance to maximize SQL Server database capabilities
• Avoidance of design pitfalls such as overprovisioning hardware resources
• Rapid deployment and accelerated data warehouse projects that help reduce costs for planning and enhance productivity

Organizations can also potentially reduce future support costs because the SQL Server FTDW reference architecture is designed to limit the need for changes resulting from scalability challenges as the data warehouse grows. It is designed to simplify the procurement, configuration, and deployment process.

Author

Satheesh Iyer is a senior product marketing manager in the Enterprise Solutions Group at Dell and is focused on database solutions.

Learn more

Microsoft SQL Server solutions on Dell hardware: dell.com/sql
Dell-supported configurations for Microsoft SQL Server: qrs.ly/ky23dnd
Dell SQL Server Advisor tool: bit.ly/yMKCLi
Dell high availability cluster servers: qrs.ly/jx23dlo
Microsoft SQL Server data warehousing: microsoft.com/fasttrack
