Optimizing Data Warehouse Configurations for Peak Efficiency
Data management
By Satheesh Iyer

Reference architectures and best practices enable enterprises to capitalize on their information resources. Building on Microsoft guidelines, Dell has engineered end-to-end data warehousing solutions utilizing Microsoft® SQL Server® database software.

Preprinted from Dell Power Solutions, 2012 Issue 3. Copyright © 2012 Dell Inc. All rights reserved.

The proliferation of data from diverse sources has a significant impact on data center operations. An efficient, highly scalable data warehousing strategy helps enterprises manage the volume, velocity, and variety of data running into and across the organization, enabling them to tap the immense potential of big data for a distinct competitive advantage.

Adopting an effective data warehousing strategy enables enterprises to mine vast collections of data and historical information quickly and efficiently, allowing workers to derive meaningful insights that advance business and organizational outcomes. For this reason, data warehouses are rapidly becoming essential elements in the IT infrastructure, as organizations harness data to identify trends, provide business intelligence reporting, and perform predictive analyses (see Figure 1).

To deliver on this vision, organizations need to strike the optimal balance of capacity and performance within their data warehouse systems. And none too soon. Counterproductive increases in response time are resulting from a convergence of factors such as escalating data volumes and their related loading demands, mounting complexity of online analytical processing (OLAP) queries, and a rising number of end users.

Best practices and guidelines jointly developed by Dell and Microsoft can assist IT staff in designing and implementing a
balanced configuration for Microsoft SQL Server 2012 data warehouse workloads. These recommendations enable organizations to quickly and cost-effectively achieve scalable performance while helping to reduce management complexity.

[Figure 1. Deriving valuable insights from vast data resources. Diagram labels: data sources; logistics and inventory; financial forecast; Web portal; customer relationship management; call record; organized data; decision support; timely reports and outputs with real data; trend analysis; accurate and timely decisions.]

Overcoming obstacles to data optimization

Despite the high business value of accurate and timely information, obstacles may hinder organizations from gathering data and making sense of it in a cost-effective way. Data continues to grow, both internally and externally. Disparate data sources with diverse formats and duplicate information add complexity. And OLAP is heightening pressure on data management systems as online business initiatives continue to gain momentum.

Many IT groups lack the internal resources to design and implement a data warehouse approach that can effectively address these challenges, primarily because of IT staff size and budget constraints. Ensuring an optimal balance of I/O, storage, memory, and processing power is essential in designing a database configuration so that no single element becomes a bottleneck. Without the requisite expertise in database architecture and administration, organizations may overprovision and experience costly inefficiencies during the design and implementation process.

Using external resources to piece together software, servers, and storage also has its drawbacks, often resulting in complex systems that offer no single point of accountability. Other alternatives, such as traditional warehouse systems based on proprietary technology, are often costly to acquire and may require expensive, ongoing contracts to optimize and maintain these systems.

Implementing advanced data warehouse capabilities

To help overcome these challenges, Microsoft has developed a framework designed to balance the hardware and software capabilities of the data warehouse system. The Microsoft SQL Server Fast Track Data Warehouse (FTDW) reference architecture implements data warehousing that is optimized for the large sequential scans and reads characteristic of many OLAP systems today. This methodology is designed to yield outstanding performance compared with traditional data warehousing implementations.

Dell has built on SQL Server FTDW to develop best practices and reference guidelines that help organizations implement SQL Server FTDW on Dell™ hardware. The reference configurations integrate next-generation Dell PowerEdge™ servers and Dell PowerVault™ storage arrays with the data warehouse capabilities of Microsoft SQL Server. The result is an optimized, balanced, end-to-end data warehouse solution.

Utilizing this Dell approach helps IT organizations reverse the conventional notion that a data warehouse solution often requires a complex, proprietary, and one-size-fits-all approach that can be costly. Dell helps organizations avoid the intricacies and learning curves required to architect a complex data warehouse solution, which often results in overprovisioning of hardware, cost overruns, and time delays.

Several configuration alternatives are available, enabling organizations to start at the size and stage of infrastructure they currently have and scale their database system using standards-based, off-the-shelf hardware. The reference configurations are applicable to a range of industries, segments in public and private sectors, and organization sizes.

Moreover, Dell and Microsoft continually refresh SQL Server FTDW reference architecture offerings with technology advancements. For example, Microsoft SQL Server 2012 and 12th-generation PowerEdge servers powered by the Intel® Xeon® processor E5 product family have recently been added. Dell is committed to continuously refreshing data warehousing solutions as technology evolves. (See the sidebar, “Take the fast track to data warehousing.”)

Based on SQL Server FTDW 4.0, the refreshed configurations are built on small, 5 TB; medium, 10 TB; and large, 20 TB platforms (see Figure 2). The 5 TB platform is based on the PowerEdge R720xd server with internal storage for small and medium businesses (SMBs). The 10 TB platform is based on the PowerEdge R720 server and a PowerVault MD3620i Internet SCSI (iSCSI) storage array for medium data warehouse requirements. The 20 TB platform is based on the PowerEdge R720 server and a PowerVault MD3620f Fibre Channel storage array for large data warehouse requirements. For high availability, Dell provides tested and validated solutions for configuring database clustering using multiple PowerEdge R720 servers.

Delivering solutions, not just a set of instructions

Organizations utilizing the combination of next-generation Dell hardware, Dell Services offerings, optional integration of Dell Boomi™ AtomSphere™ Integration Cloud™ software, and the reference configurations based on Microsoft SQL Server FTDW can capitalize on a tested, certified, production-ready approach for their environments. The data warehouse system can be up and running quickly, with minimal impact on an organization’s operational systems.

Dell Data Warehouse solutions are based on Dell PowerEdge servers that provide robust expandability of server memory, connectivity, and internal storage, while maintaining a level of power efficiency that is critical for dense environments. PowerEdge servers offer high-availability features such as on-board RAID controllers and redundant fans and power

and an integrated out-of-band network interface to provide hands-free deployment and maintenance of PowerEdge servers.

Integrated Dell PowerVault storage helps improve operational efficiencies, performance, and utilization; minimizes cost and complexity; supports backup and recovery; and can scale to meet capacity needs. These arrays also offer nondisruptive firmware upgrades and automated I/O path protection with host-based, multipath failover drivers. Other features include dual-active controllers that incorporate mirrored cache to protect data in the event of controller failure while the I/O continues processing without interruption. Data protection features such as snapshots, virtual disk copy, and remote replication help protect data effectively.

In addition, Dell Services supports data warehouse solutions with comprehensive knowledge

Cloud is also easily integrated into SQL Server FTDW deployments to streamline data integration by facilitating the connection of cloud computing and on-premises data sources more rapidly and cost-effectively than conventional integration solutions.

From April through June 2012, testing was conducted by Dell engineers at Dell Labs and validated and certified by Microsoft engineers. The testing validated SQL Server FTDW performance by measuring the maximum consumption rate (MCR) of the processor and the benchmark consumption rate (BCR).¹ Testing of the entire database stack was also performed for robustness and availability, as well as to determine the best practices for building a solid and balanced SQL Server FTDW system. Dell engineers worked closely with Microsoft, and the tremendous amount of information they