Get more from your data.™

Business Intelligence: How to Unleash the Real Power of Microsoft’s Fast Track and PDW Appliances

September 2013
Author: Andy Isley, BI Practice Leader, Scalability Experts

Executive Summary

If you are pursuing a business intelligence (BI) solution using Microsoft SQL Server, you have more than likely been introduced to the Fast Track reference architectures and PDW appliances available in numerous sizes, shapes, and capabilities. Each reference architecture is designed to take the guesswork out of properly sizing a relational data warehouse for your environment while reducing costs and shortening deployment times. Kudos to Microsoft for developing this program, as it takes numerous industry best practices and wraps them in an easily consumable package for a SQL Server data warehouse project.

Microsoft’s Fast Track program and PDW appliances are a great advancement; however, the fact remains that if you do not have the right BI system in place to fully optimize and unleash the power of the appliance, it is just another piece of hardware. This paper provides information, advice and insight on how to get the greatest advantage and performance from a Fast Track or PDW appliance. The benefits are significant, and if the appliance is implemented correctly as part of an overall BI solution, it can give you a real competitive advantage.


Contents

Introduction
What is Fast Track?
What is PDW?
Appliance Scenarios
    Scan-Intensive
    Nonvolatile
    Index-Light
    Partition-Aligned
Appliance Scaling
BI System/Foundation Design Work
    Business Needs
    ETL
    Cubes
    Reporting
Recent Case Study
Things to Consider
How Can SE Help
Conclusion
Biography
About Scalability Experts


Introduction

“Knowledge is power,” Sir Francis Bacon, Religious Meditations, Of Heresies, 1597.

It has been more than 400 years since Sir Francis Bacon wrote those words, yet they are as true today as they were then. With knowledge, an organization can make educated decisions that will improve its place in the market. For many organizations, knowledge is found only after their data has been analyzed by their BI systems. For these organizations, BI systems have given them a strategic advantage and are now considered mission critical, just like their OLTP systems. BI systems today require high-availability servers and software and efficient storage processes.

But not all organizations are running an efficient BI system. Some have implemented BI systems only to find that each BI project created a new copy of the same data for each department. Others simply replicate the data from their OLTP systems to other servers for reporting; because OLTP systems are designed for small, quick transactions rather than advanced analytics, this costs the company unnecessary hardware and system resources. Still others perform data extracts to pull the data into Access or Excel files for analysis. All of these methods cost companies time, money and resources. In today’s market, IT executives are looking for ways to squeeze more value from their IT budgets and do more with less. Creating a central data warehouse can save time and resources while providing the information necessary for critical business decisions.

OLTP systems have been around since the 1960s, and large volumes of material may be found on the methodologies, architectures and best practices for creating them. BI systems, however, have been in existence for only the past 20 years. Dozens of methodologies, architectures and best practices can be found, and many disagree with one another.

In 1991 Bill Inmon published Building the Data Warehouse and is considered the father of data warehousing. In 1996 Ralph Kimball published The Data Warehouse Toolkit and is also considered one of the forefathers of the data warehousing movement. Which of the two methodologies is the better approach is widely debated; this paper will not enter that debate. While both are a good place to start, they cover only the data warehouse structure and data flow, not the hardware architecture or the best practices for loading the data warehouse (DW).


Enter Microsoft’s Fast Track program and PDW. This program was developed to:

• Offer a reference architecture to help jump start new data warehouse projects

• Aid organizations with existing data warehouses in coping with the explosion of data that has occurred since their implementation

• Outline the best practices for loading data into the DW

• Supply a hardware architecture that can be scaled up and out

While Microsoft’s Fast Track reference architectures and PDW appliances provide many advantages, one needs to understand the best ways to unleash the real power and potential of the solution. The purpose of this white paper is to provide you with tips and considerations for planning a DW implementation.

What is Fast Track?

Microsoft’s Fast Track reference architectures and appliance solutions offer a turnkey approach that provides a single view of information across the enterprise and can add from tens to hundreds of terabytes of DW capacity for less than one third of the cost of traditional systems. As a customer’s data volume and user base grow, it may not always be feasible to re-architect the system in order to add the needed performance and scalability. Microsoft’s high-performing SQL Server-based data warehouse solution is an integrated, all-in-one hardware and software solution that is easier to deploy, operate, and manage than traditional, proprietary data warehouse offerings. Appliances using Microsoft’s SQL Server 2008 R2 Enterprise and SQL Server 2012 scalable reference architectures, configured for HP, Dell, Bull, EMC and IBM hardware, provide an easy and cost-effective way to scale up.

There are significant benefits to implementing a Fast Track solution:

• Reduces hardware testing and tuning time by leveraging the out-of-the-box performance that is built in

• Scales from 4 up to 96 terabytes using SQL Server 2008 R2 Enterprise compression capabilities

• Lowers costs through better price/performance, rapid deployment and the use of industry-standard hardware

• Offers the right performance, scalability and pricing options to suit your own needs


What is PDW?

For customers who need something bigger, faster and more scalable, the Parallel Data Warehouse (PDW) appliance solution is 10 times more scalable and responds to queries up to 100 times faster than traditional SQL Server DW deployments. The solution uses the massively parallel processing (MPP) architecture of Microsoft SQL Server 2008 R2 Parallel Data Warehouse to create a massively scalable, easy-to-manage appliance that can grow from 48 to more than 500 terabytes of information without a negative impact on performance. PDW is a pre-configured MPP solution that has been optimized with DW and SQL Server configuration best practices. The appliance can be further configured, based on storage and processing requirements, as a Half or Full Rack solution, and additional Full Rack Compute/Storage Node racks can be added during the initial configuration. Both the Half and Full Rack configurations contain the same amount of control hardware. The Control Rack is the user interface to the backend Compute Nodes and controls all access to the PDW solution.


The Control Rack is made up of the following:

1 Backup server – optimized to back up the PDW solution

2 Control Nodes – clustered access point for all queries; each query is parsed and passed to the Compute Nodes for processing.

2 Management Servers – access point for all management activities

1 Landing Zone – ETL and data loading point, optimized to load data across all Compute Nodes in parallel.

The Compute Rack is made up of an equal number of Compute Nodes and Storage Nodes, and its configuration resembles multiple Fast Track solutions. Additional Compute Node racks can be added should a customer need more processing and storage capacity. Each Compute Node utilizes its own CPU, memory and disk, resulting in a shared-nothing architecture.

With either a Full or Half Rack solution, queries are processed in the same manner. The query is received and parsed by the Control Node, which then passes the parsed query to each of the Compute Nodes. Each Compute Node retrieves its portion of the data from its storage system and performs whatever calculations it can. The results are then returned to the Control Node, which assembles the result set from each Compute Node and passes the data back to the client. This method of parallel processing gives clients an improved way to store and access data, as well as a replacement option for older, proprietary data warehouse appliances (Teradata, Greenplum) that have reached their capacity limits. With faster deployment and tight integration with the industry’s most popular tools for analysis and data access, it solves complex requirements for strategic business decisions. It also improves data quality, enabling clients to meet stringent governance requirements.
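
To make the shared-nothing layout concrete, the sketch below shows PDW-style DDL for a hash-distributed fact table and a replicated dimension table. The table and column names are hypothetical and the exact options vary by PDW version, so treat this as an illustration of the distribution concept rather than a definitive script.

    -- Fact rows are hashed on ProductKey, so each Compute Node owns a slice of the data.
    CREATE TABLE dbo.FactSales
    (
        DateKey      int           NOT NULL,
        StoreKey     int           NOT NULL,
        ProductKey   int           NOT NULL,
        SalesAmount  decimal(18,2) NOT NULL
    )
    WITH ( DISTRIBUTION = HASH(ProductKey) );

    -- Small dimension tables can be replicated to every Compute Node so that
    -- joins to the fact table rarely need to move data between nodes.
    CREATE TABLE dbo.DimStore
    (
        StoreKey   int          NOT NULL,
        StoreName  varchar(100) NOT NULL
    )
    WITH ( DISTRIBUTION = REPLICATE );

Choosing a distribution column with many distinct values keeps the hash buckets evenly sized; a heavily skewed column would leave a few Compute Nodes doing most of the work.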

Appliance Scenarios

Fast Track and PDW are designed for data warehouses, not OLTP systems. Even if you have created a DW, these solutions still might not be a good fit for your BI solution. Before considering an appliance or reference architecture, fundamental criteria need to be evaluated to see whether Fast Track or a PDW appliance fits your data warehouse.

Scan-Intensive

Whether the DW is being used to load an Analysis Services cube, generate reports or return results from ad hoc queries, most activity on a data warehouse requires scanning through large amounts of data. These appliances have been developed to optimize sequential read operations.


Nonvolatile

Data warehouses by nature should be based on nonvolatile data. Update and delete operations introduce fragmentation to the underlying data structure, causing slow query response. To prevent this from occurring, data should be written to a table and seldom changed. This does not mean you cannot have volatile data in your BI solution; in fact, the Fast Track data load methodologies cover techniques for loading volatile data so that fragmentation is kept to a minimum, and the same loading strategies can be used for both appliances. Should you require data to change on a frequent basis, it will be important to understand this need. It may be necessary to partition the data to separate the volatile from the static data.

Index-Light

Non-clustered indexes add overhead in the form of additional maintenance tasks to correct the fragmentation introduced during the load process. These indexes are usually added to a DW to aid specific lookup operations. While this can be beneficial for smaller tables, adding unnecessary indexes to large tables can introduce random disk I/O, which can be up to 10 times slower than sequential reads.
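
One practical way to weigh that trade-off on a Fast Track (SMP SQL Server) system is to measure how fragmented the existing non-clustered indexes actually become between loads. Below is a minimal sketch using the standard sys.dm_db_index_physical_stats DMV; the 30 percent threshold is an illustrative cutoff, not a Fast Track requirement.

    -- List non-clustered indexes in the current database whose fragmentation
    -- exceeds an illustrative 30 percent threshold.
    SELECT  OBJECT_NAME(ips.object_id)       AS table_name,
            i.name                           AS index_name,
            ips.avg_fragmentation_in_percent
    FROM    sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
    JOIN    sys.indexes AS i
            ON  i.object_id = ips.object_id
            AND i.index_id  = ips.index_id
    WHERE   i.type_desc = 'NONCLUSTERED'
      AND   ips.avg_fragmentation_in_percent > 30
    ORDER BY ips.avg_fragmentation_in_percent DESC;

If the same indexes keep appearing after every load, that is a signal they may cost more in maintenance and random I/O than they return in lookup performance.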

Partition-Aligned

In the Nonvolatile section above, we indicated that volatile data can have a place in a data warehouse. Partitions in SQL Server allow volatile data to be physically separated from the static data. This separation prevents the static data structure from becoming fragmented as the volatile data is updated during the load process. As the data transitions to static data, the partitions can be integrated without introducing performance issues.

Additionally, partitions allow queries to take advantage of data ranges without the need to introduce additional indexes.
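
As a sketch of how that separation might look, the following SQL Server DDL partitions a volatile fact table by month and then switches a partition of now-static rows into the main fact table. All object names and boundary values are hypothetical, and ALTER TABLE ... SWITCH requires the source and target tables to have identical structures on the same filegroup.

    -- Monthly, range-right partitioning on the sales date (boundaries illustrative).
    CREATE PARTITION FUNCTION pfSalesDate (date)
    AS RANGE RIGHT FOR VALUES ('2013-01-01', '2013-02-01', '2013-03-01');

    CREATE PARTITION SCHEME psSalesDate
    AS PARTITION pfSalesDate ALL TO ([PRIMARY]);

    -- Volatile, current-month rows live in their own table on the same scheme.
    CREATE TABLE dbo.FactSales_Volatile
    (
        SalesDate    date          NOT NULL,
        ProductKey   int           NOT NULL,
        SalesAmount  decimal(18,2) NOT NULL
    ) ON psSalesDate (SalesDate);

    -- When the month closes and its data becomes static, the partition is moved
    -- into the main fact table as a metadata-only operation, with no data copy.
    ALTER TABLE dbo.FactSales_Volatile
        SWITCH PARTITION 2 TO dbo.FactSales PARTITION 2;

Because the switch only updates metadata, the static table never fragments, and queries that filter on SalesDate benefit from partition elimination without any additional indexes.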

Appliance Scaling

The Fast Track reference architecture is based on a balanced design to make sure it can scale. Most DW solutions start out as a standard SQL Server installation set up the same way as any OLTP solution, which is often not appropriate for a DW; over time these solutions develop several bottlenecks along the I/O path. When determining the correct appliance solution, I/O, memory and CPU need to be balanced so that the I/O path can handle the large data volumes.

In order to identify the correct appliance solution, your DW needs to be analyzed to better understand its workload. A complete analysis should look at the overall load on the solution; based on that load, a reference architecture can be determined.
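
As a rough illustration of what "balanced" means, the Fast Track methodology sizes the I/O path against CPU consumption: the Maximum CPU Core Consumption Rate (MCR) measures how many MB/s of data a single core can consume, and the storage must be able to feed every core at that rate. The figures below are for illustration only; the actual MCR is benchmarked per platform during an assessment.

    Required I/O throughput ≈ MCR per core × number of cores
    e.g., 200 MB/s per core × 16 cores = 3,200 MB/s of sustained sequential scan

If the storage subsystem cannot sustain that rate, the cores sit idle waiting on disk; if it delivers far more, the CPUs become the bottleneck and the extra spindles are wasted spend.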


Figure 1: Scope of Work Load Analysis

The typical Fast Track appliance is said to scale according to Figure 2, but this is not always the case. A system with high I/O per query yet low data volumes may require a PDW solution, while a solution with low I/O per query yet high data volumes may require only a Fast Track appliance. A full analysis of the workload will determine the appropriate solution.

Figure 2: Typical scaling model

BI System/Foundation Design Work

This paper is not focused on the design merits of an appliance, but on the next steps and, ultimately, the foundational design work that needs to be done, inclusive of the backend solution. If we look at the figure below, Fast Track is the component contained in the dotted-line box, but we cannot forget the components before and after the warehouse solution. We can design the most efficient Fast Track solution on the planet, but if it fails to deliver value through the reporting and data analysis requirements of the business, it can take extensive and potentially costly rework to bring the solution up to par, not to mention the loss of confidence from the business that the solution will ever work.

On a recent customer engagement we were asked to evaluate the Fast Track solution as a potential replacement for the customer’s current DW. The customer was surprised by our line of questioning about workloads, analytics, and current and future reporting needs, and that we were not immediately concerned with the volume of data and server configuration. While the appliance solutions offer detailed documentation on calculating consumption rates to balance the loads across CPU, SAN, memory, HBA cards, etc., there is also a distinct need to understand what data you are going to store in the data warehouse. By design, this element is missing from the reference architectures; it is where the IT and analyst teams must determine the reporting needs and the data behind them to build out the warehouse. But what if you are not sure of the workload and future reporting needs? What if current workloads are on a different platform? What if the system currently in place has been there so long that the business is completely unaware of the data analysis capabilities possible today? These are some of the numerous questions that need to go into the ultimate design of a BI solution.

There is a very systematic approach to designing a BI solution, and each step needs to be followed. It is generally good to start with the business data model and reporting needs, because these will drive the ultimate data repository needs. Once these key elements have been defined, prioritize them based upon business value and implementation achievability; these ranked priorities will drive the design and architecture of the BI solution. Either reference architecture fits nicely into the end-to-end solution and has the potential to deliver a lower TCO and quicker ROI in the value chain.


Once in place, the possibilities of integrating line-of-business data marts, cubes, and ad hoc reporting tools into the solution become infinitely easier. Additionally, the ETL (Extract/Transform/Load) from the source systems becomes streamlined and efficient, as the detailed upfront analysis provides a systematic approach to data loads.

As part of integrating an LOB solution utilizing the reference architecture, HP offers several business appliances preconfigured with SharePoint 2010 and SQL Server 2008 R2. These appliances are excellent accelerators for further implementing and enabling the Microsoft technology stack. The key thing to remember is the continued need for solid data design aligned with business needs; the goal is to avoid deploying technology for technology’s sake, because the ROI must be realized. A solid technology and software platform builds confidence as an organization matures in its data reporting and analytics capabilities, but how the data is used for business purposes depends on the reporting tools accessing the data directly, the cubes we hang off of it, and the analysis applications digging into the data. These are the critical path items.

Business Needs

Reference architectures are a great prescriptive guide for building a scalable DW solution on SMP servers and are designed to take advantage of key SQL Server features and physical server capabilities. They do not, however, remove the need to design proper data models aligned with business needs. Ultimately, Fast Track is a guide to building a scalable data store; it does not remove the need for the business-facing tools on top of it. How do you facilitate that need? Consider the following business needs:

• LOB – Identifying and prioritizing business reporting and analysis needs based upon value and achievability is critical to early and long-term ROI. It is also extremely important to accelerate user feedback so as not to design in a bubble and fail to deliver what is expected by the end-user community.

• Users – The most important element of designing any BI solution is meeting the business users’ needs. Those needs will drive the data model, the ETL from source systems, workload determination, and data retention policies. There are generally several classifications of users:

–– Power analysts – require direct access to cubes, DWs, and source systems. Their usage patterns are grounded in ad hoc queries, so using SQL Server Resource Governor to give them a min/max amount of server capacity will be key.

–– Executive users – generally care about high-level balanced scorecard types of data, which can usually be derived from the cube or Analysis Services data stores. It is key to ensure timely and accurate delivery of scorecard information so the executive team has the right information to drive corporate direction on solid metrics.


–– Report consumers – the LOB managers and other company employees who rely on daily reports to accurately perform and deliver their job functions. These reports may be generated from the source system they are attached to (ERP, CRM, SCM, etc.) or the role (HR, Sales, Finance, IT, etc.) they are fulfilling. Reporting needs may also cross data sources, such as sales and inventory together, which may be found in the DW if it is modeled correctly.

• Concurrency – how many users, what type, and in what time zones are these report users/consumers?

ETL

Where does ETL live? Will it have its own server, or will it be configured on the appliance? Depending on data volume, source location, and DML/transformation requirements, it may make sense to move the ETL process to its own server; if ETL is light, it can be configured on the Fast Track server. Is it SSIS, DataStage, Data Integrator, Ab Initio, Informatica, scripts, PowerShell, etc.? The type of ETL tool will play a critical role in where it is installed.

Where are the source systems located: the same data center, the cloud, or a remote data center with low or high latency relative to the Fast Track solution? Where the source systems reside affects the overall design of the ETL process. Should the source systems and the destination DW not be in the same data center, ETL processing may need to be accomplished in both locations.

Will the Data Warehouse need to be replicated throughout multiple geographic regions? As data volumes grow, replicating changes throughout a distributed DW environment requires careful planning.

Before we get to the design patterns Fast Track has approved for loading data, consider data retention policies, which are often overlooked. Most organizations’ first data warehouse project is completed with little or no thought as to how much data will be kept and for how long. At first this seems a simple problem, until a deeper analysis of the retention periods and archival requirements is completed: different subject areas have different retention periods, some requiring 2, 5, 7 or 15 years of data. Bill Inmon introduces data retention practices and processes in his DW 2.0 methodology. Archiving data is not as simple as moving transactional/fact data to another database. One of our customers was doing just that until they found some of the supporting tables were growing quite large and causing performance issues. Their largest supporting table (Product) contained millions of records, 90% of which were for old products that referenced fact records already archived. They were in the process of evaluating new hardware and SAN storage when this was found; by removing the excess records, the system was restored to its original performance.
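
A hedged sketch of the kind of cleanup that resolved that situation appears below: it archives and then removes dimension rows that no remaining fact row references. The table and column names are hypothetical, the archive table is assumed to already exist with the same structure, and in practice the delete would be batched for very large tables.

    -- Archive, then remove, product rows that no remaining fact row references.
    BEGIN TRANSACTION;

    INSERT INTO archive.DimProduct
    SELECT dp.*
    FROM   dbo.DimProduct AS dp
    WHERE  NOT EXISTS (SELECT 1
                       FROM   dbo.FactSales AS f
                       WHERE  f.ProductKey = dp.ProductKey);

    DELETE dp
    FROM   dbo.DimProduct AS dp
    WHERE  NOT EXISTS (SELECT 1
                       FROM   dbo.FactSales AS f
                       WHERE  f.ProductKey = dp.ProductKey);

    COMMIT TRANSACTION;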

Also, with the PDW solution it may make sense to change from an ETL process to an ELT (Extract/Load/Transform) process. This is due to the high-speed data loader, DWLOADER, which efficiently transfers data from the Landing Zone node of the PDW appliance to each Compute Node’s associated storage.


Once the data is on the PDW appliance, a CTAS statement (CREATE TABLE AS SELECT, similar to a SELECT INTO statement but optimized for the PDW architecture) can be leveraged to transform the data into the desired format. Even if it means transferring data from one node to another, the high-speed internal InfiniBand network makes any movement very fast.
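
A minimal sketch of the ELT pattern follows, assuming DWLOADER has already landed the raw extract in a staging table. The object names, distribution column and cleansing rule are hypothetical; the point is that the transformation runs in parallel across all Compute Nodes.

    -- Transform staged rows into the target fact table in one parallel pass.
    CREATE TABLE dbo.FactSales
    WITH ( DISTRIBUTION = HASH(ProductKey) )
    AS
    SELECT  CAST(s.sale_date AS date) AS SalesDate,
            s.product_key             AS ProductKey,
            s.qty * s.unit_price      AS SalesAmount
    FROM    dbo.StageSales AS s
    WHERE   s.qty > 0;   -- illustrative cleansing rule applied during the load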

Archiving data is a complex topic. Where do you move the data? Will moving it to slower disk be adequate? Should it remain in the existing database or be moved to another server? If it is moved to another database or server, will new schema changes be implemented, or should the schema remain as it was when the data was archived? Will you need to access the data for audit reasons? Will existing reports need to access the archived data? These questions need to be addressed to ensure your BI solution continues to function and meet its SLAs.

Cubes

A reference architecture does not address the need for cubes within the infrastructure. Whether cubes/OLAP belong in a BI architecture is often debated; the bottom line is that they are often needed and offer advantages to end users and IT staff:

• OLAP provides the ability to pre-aggregate data for faster retrieval times across all hierarchies and attributes

• Cubes define natural hierarchies and drill paths that are meaningful to the organization

• They provide a drill through path from aggregated data to the relational DW

• Complex calculations are stored and usable by all reporting and ad hoc tools without the need to recreate them

• Complex calculations are available across all hierarchies and attributes and are stored in a central location

• Scales to thousands of users

• Scales both up and out

• Available to be used by Reporting Services, Excel, and many other 3rd party products

• Allows end users to query the data directly without the need to understand complex table relationships, calculations, or SQL syntax

In the past year PowerPivot has emerged as a new tool for end users to query data that is available in their existing DW. PowerPivot combines disparate data sources and provides ad hoc analysis comparable to using Analysis Services. Currently PowerPivot models are created and accessed through Excel. If an organization is using SharePoint, PowerPivot workbooks can be published to the server and stored in a PowerPivot version of Analysis Services.


With SQL Server 2012, PowerPivot models can now be deployed directly to Analysis Services and no longer require SharePoint. This leap forward allows existing PowerPivot in-memory models to be pulled into BIDS, edited by IT, and then distributed throughout the enterprise architecture.

PowerPivot utilizes a columnar store to make queries very fast over hundreds of millions of records. SQL Server 2012 has taken a lesson from PowerPivot and now offers columnstore indexes, which can deliver 5x to 100x improvements in query times.
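
As a sketch, the statement below adds a nonclustered columnstore index to a hypothetical fact table on SQL Server 2012. Note that in the 2012 release a table becomes read-only while such an index exists, which fits naturally with the nonvolatile, partition-aligned loading patterns discussed earlier.

    -- Columnstore index over the columns that large scans and aggregations touch.
    -- In SQL Server 2012 the base table is read-only while this index exists;
    -- new data is typically brought in by switching partitions or rebuilding.
    CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_FactSales
    ON dbo.FactSales (SalesDate, StoreKey, ProductKey, SalesAmount);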

Reporting

Reporting is the make-or-break component in the BI stack. Correctly implementing the reporting layer will make the difference between the BI solution being seen as a strategic enterprise application or a complete failure. Reporting is the interface into the DW: an appliance can be implemented according to all the best practices, but if the users are unable to access the data in an easy-to-use manner, the implementation will be a waste of time.

BI solutions should start with reporting and work backwards. If the correct elements are not captured during the DW design and build process, the solution will not be used. Our suggestion is to start with your existing operational reports and any Excel or Access databases the user community has created; if the users have created reports in Excel and Access, then making the same data available from your BI solution will be seen as a success.

Microsoft has many reporting tools to choose from. Not to take anything away from the other tools, but hands down the most important reporting interface is Excel. When creating the DW and cubes, make sure the data shows up in Excel according to the users’ needs.

Dashboards and static reports are another area of focus. PerformancePoint Services in SharePoint offers a great interface for executives and end users to share and access the BI solution. By utilizing the features in Analysis Services, you can expose drill-through reporting from the dashboards to detail reports, to operational reports, and on to the operational systems.


Recent Case Study

One of our customers decided it was time to implement a BI solution that would give them a single version of the truth. The goals of the solution were:

• Provide a Central DW that consolidated the operational and financial data into a single source

• Provide BI capabilities across the organization from Top to Bottom

• Provide remote users with the same data and views as corporate office users

• Base KPIs on thresholds set for each operational role and department

• Provide remote access via an Internet portal

• Display KPIs on operational dashboards based on location

• Reuse any existing investments in reports or source systems if possible

• Decrease the workload on IT and end users

• Reduce the reporting workload on the operational system

By utilizing the Microsoft BI stack, we were able to satisfy all of the customer’s goals and more. This project did not start by looking at an appliance. Instead, we used our BI methodology to gain an understanding of the customer’s needs through interviews and workshops. The workshops pulled together resources from each department and from different roles within each department; these resources identified the KPIs and calculations that were to be stored in the BI solution and mapped to user roles.

The outcome of these sessions provided Scalability Experts with the functional specifications needed to work with the IT group to identify the source systems and the data volumes that would need to be integrated into a central DW. Next, we analyzed the types of user queries necessary to generate the dashboards and reports, focusing on data volume and query complexity.

One of the items uncovered during the workshops was that end users failed to utilize existing system reports due to the complexity of accessing the systems; users would call IT or other users to have them run a report or send an Excel file with the extracted data. When we analyzed the source systems, we found several reports had already been written using Reporting Services. In addition to the operational solution, an imaging solution was used to look up paid invoices for a reconciliation process, but it was not being utilized by everyone.


Solution:

• Integrate all the source systems’ data into a central DW

• Utilize the existing operational system to integrate the new dashboards and reports into a single solution

• Corporate users would have access to the solution through either the operations system or from the SharePoint portal

• Utilize SSAS and Reporting Services to reduce the workload on the operational systems

The central data warehouse utilized the Fast Track reference architecture and data loading strategies. This process ensured fragmentation was kept to a minimum and performance optimized.

After the solution was implemented, remote users would start each morning by reviewing their dashboards to identify any critical issues needing attention. Should an issue be found, the user could use drilldown reporting to identify the root cause, walking from the dashboard to a DW detail report, to operational reports, and under the right scenario down to the invoice image. By using the Microsoft BI stack, we were able to meet and exceed the users’ expectations.

Another benefit of using the Microsoft BI stack along with the Fast Track reference architecture is that the central DW can be moved to a PDW appliance if the current solution becomes I/O, memory or processor bound. The migration will require configuration changes and an analysis of how the data should be hashed across the Compute Nodes of the PDW, but it will not require application or ETL changes.

Things to Consider

1. Implementing the Fast Track methodology can save you money. Customers often experience performance issues with their existing DWs because they have not implemented best practices for ETL and data layout. The step that seems logical to them is to upgrade the hardware, which they see as the most cost-efficient approach, when restructuring the DW and implementing best practices could save money instead. We have seen BI solutions distributed across 10-12 servers just to keep up with user demand; after a full analysis was completed, the answer was to implement Fast Track best practices and reduce the number of servers by 80%. The return on investment for this approach was only 7 months when all costs (server maintenance, software licenses and power consumption) were factored in.

2. First-time BI projects go through many revisions before they are correct. Your first BI project will be a learning experience. For many companies, their first implementation goes through


2-4 revisions and still might not be scalable. Save yourself some time and heartache by looking for assistance from a BI expert. At the very least, bring an expert in for guidance and mentoring.

3. BI implementations that are not customer focused may not succeed. In today’s IT environment, you need to balance cost with business value, and the best way to accomplish this with a BI solution is to make sure you give your users what they need. Talk to your users on a regular basis: BI solutions are not static, and new ways of looking at the data need to evolve with the business. Work with the users to understand their needs. We often see IT shops creating a BI solution with little or no input from the user community, for reasons ranging from “IT knows the users need it, but it is not a funded engagement” to “IT was told to implement a BI solution.” No matter what the reason, get the users involved. If you are trying to prove out BI for the business, offer a short POC that can be completed in 1-2 weeks using their data.

4. A DW is just a bunch of data if it cannot be used. Identify the reporting requirements up front. Last year we worked with a group that created a DW without pulling together the reporting requirements. When the solution was completed and they started to generate reports, they quickly realized they did not have the necessary data elements for 60% of their reporting needs.

5. Operational data replication can help jump-start BI but is not a BI solution. Many organizations utilize SQL replication to copy their existing operational systems to a new server for reporting purposes. While this is a valid solution for real-time reporting, near-real-time (5-15 minute delay) BI may be more suitable. By using a combination of an Operational Data Store (ODS) and a DW, an organization can jump-start the ETL process by moving data to the ODS before consolidating it into the DW.

6. The Fast Track reference architecture will only scale so far; then comes a PDW appliance. Suppose you have implemented the Fast Track methodology and migrated to the largest Fast Track architecture. Your next step might be to implement a hub-and-spoke model using PDW as the hub. The good news is that the investment you have made in the ETL process can be utilized to load PDW; the better news is that the investment in Fast Track hardware can be reused as a spoke in the overall solution. PDW becomes the central DW, and new ETL processes move data from PDW to departmental/subject-area data marts. This greatly reduces the amount of data users need to query in each data mart while still leaving the data available in the central PDW.


How Can SE Help

As one of the first partners trained by Microsoft on the Fast Track and PDW appliance initiative, Scalability Experts has developed and optimized two new services to assess a customer’s environment and then deploy the Fast Track solution that best fits its needs.

SE’s Architecture Design Review (ADR) service identifies an appropriate Fast Track or Parallel Data Warehouse architecture for the customer’s environment to provide the foundation for predictable performance, scalability, and availability. The ADR provides the customer with a strategy to re-architect the data warehouse infrastructure, a road map, and recommendations on how to meet future growth requirements. The session includes benchmarking the customer’s environment to determine the Maximum CPU Core Consumption Rate (MCR), Benchmark Consumption Rate (BCR) and Required User Data Capacity (UDC). The service also covers knowledge transfer, design, and best practices and methodologies for supporting an enterprise business intelligence solution.

SE’s ScaleFastSM Data Warehouse Deployment solution is an end-to-end implementation offering that uses industry-proven Fast Track methodologies to quickly design and implement an appliance solution fine-tuned for your particular environment. Based on SE’s expertise and unique processes, an appliance can be up and running significantly faster than the deployment times offered by other service providers. Let us unleash the power of the appliance and optimize its features and functionality to deliver the highest level of performance. SE’s ScaleFast deployment solution includes a deep-dive workload evaluation and environment assessment, specific hardware recommendations, and turnkey implementation services.

Conclusion

Fast Track is a great reference architecture and methodology for configuring and implementing a Microsoft data warehouse. However, Fast Track and PDW appliances are only part of the overall BI solution. Designing a BI solution requires taking an end-to-end look at all the components that make up BI and implementing them to meet the needs of your organization. Fast Track takes the confusion out of selecting the right hardware, basing the choice on your actual needs rather than what you think you need. Fast Track, along with PDW, gives an organization the peace of mind of having chosen a platform that provides high availability and scalability. However, the fact remains that if you do not have the right BI system in place to fully optimize and unleash the power of the appliance, it is just another piece of hardware. What makes these advancements sizzle is designing the right BI solution to meet your business and users’ needs today and far into the future.


Biography

Author: Andy Isley

Andy Isley is the BI Practice Manager and a Senior BI Architect for Scalability Experts and a subject matter expert in designing and implementing end-to-end BI solutions. Andy has over 15 years of experience in architecture, software architecture, data warehouse design and the software configuration management life cycle. He has worked in various industries including manufacturing, consumer packaged goods, utilities, food services, banking and professional services. Before entering the IT industry, Andy worked as an accountant for a major textile manufacturer, where he managed 21 cost centers.

About Scalability Experts

We are an award-winning global leader in data management and business intelligence solutions. Our services help you get more value from your data and increase the performance and scalability of your computing environment. With 10 years of industry experience and a deep understanding of every facet of the data lifecycle, we can optimize the performance of your database operations, make your systems more responsive and provide the business insight critical to gaining a competitive advantage. The world’s leading software and hardware companies, such as Microsoft and HP, rely on Scalability Experts to help their customers. Let us help you: contact us at 469-635-6200 or visit our website at www.scalabilityexperts.com.
