Modernizing the Data Warehouse
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Netezza EOL: IBM IAS Is a Flawed Upgrade Path Netezza EOL: IBM IAS Is a Flawed Upgrade Path
Author: Marc Staimer WHITE PAPER Netezza EOL: What You’re Not Being Told About IBM IAS Is a Flawed Upgrade Path Database as a Service (DBaaS) WHITE PAPER • Netezza EOL: IBM IAS Is a Flawed Upgrade Path Netezza EOL: IBM IAS Is a Flawed Upgrade Path Executive Summary It is a common technology vendor fallacy to compare their systems with their competitors by focusing on the features, functions, and specifications they have, but the other guy doesn’t. Vendors ignore the opposite while touting hardware specs that mean little. It doesn’t matter if those features, functions, and specifications provide anything meaningfully empirical to the business applications that rely on the system. It only matters if it demonstrates an advantage on paper. This is called specsmanship. It’s similar to starting an argument with proof points. The specs, features, and functions are proof points that the system can solve specific operational problems. They are the “what” that solves the problem or problems. They mean nothing by themselves. Specsmanship is defined by Wikipedia as: “inappropriate use of specifications or measurement results to establish supposed superiority of one entity over another, generally when no such superiority exists.” It’s the frequent ineffective sales process utilized by IBM. A textbook example of this is IBM’s attempt to move their Netezza users to the IBM Integrated Analytics System (IIAS). IBM is compelling their users to move away from Netezza. In the fall of 2017, IBM announced to the Enzee community (Netezza users) that they can no longer purchase or upgrade PureData System for Analytics (the most recent IBM name for its Netezza appliances), and it will end-of-life all support next year. -
EMC Greenplum Data Computing Appliance: High Performance for Data Warehousing and Business Intelligence an Architectural Overview
EMC Greenplum Data Computing Appliance: High Performance for Data Warehousing and Business Intelligence An Architectural Overview EMC Information Infrastructure Solutions Abstract This white paper provides readers with an overall understanding of the EMC® Greenplum® Data Computing Appliance (DCA) architecture and its performance levels. It also describes how the DCA provides a scalable end-to- end data warehouse solution packaged within a manageable, self-contained data warehouse appliance that can be easily integrated into an existing data center. October 2010 Copyright © 2010 EMC Corporation. All rights reserved. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com All other trademarks used herein are the property of their respective owners. Part number: H8039 EMC Greenplum Data Computing Appliance: High Performance for Data Warehousing and Business Intelligence—An Architectural Overview 2 Table of Contents Executive summary ................................................................................................. -
Microsoft's SQL Server Parallel Data Warehouse Provides
Microsoft’s SQL Server Parallel Data Warehouse Provides High Performance and Great Value Published by: Value Prism Consulting Sponsored by: Microsoft Corporation Publish date: March 2013 Abstract: Data Warehouse appliances may be difficult to compare and contrast, given that the different preconfigured solutions come with a variety of available storage and other specifications. Value Prism Consulting, a management consulting firm, was engaged by Microsoft® Corporation to review several data warehouse solutions, to compare and contrast several offerings from a variety of vendors. The firm compared each appliance based on publicly-available price and specification data, and on a price-to-performance scale, Microsoft’s SQL Server 2012 Parallel Data Warehouse is the most cost-effective appliance. Disclaimer Every organization has unique considerations for economic analysis, and significant business investments should undergo a rigorous economic justification to comprehensively identify the full business impact of those investments. This analysis report is for informational purposes only. VALUE PRISM CONSULTING MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT. ©2013 Value Prism Consulting, LLC. All rights reserved. Product names, logos, brands, and other trademarks featured or referred to within this report are the property of their respective trademark holders in the United States and/or other countries. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this report may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation. -
Dell Quickstart Data Warehouse Appliance 2000
Dell Quickstart Data Warehouse Appliance 2000 Reference Architecture Dell Quickstart Data Warehouse Appliance 2000 THIS DESIGN GUIDE IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND. © 2012 Dell Inc. All rights reserved. Reproduction of this material in any manner whatsoever without the express written permission of Dell Inc. is strictly forbidden. For more information, contact Dell. Dell , the DELL logo, the DELL badge, and PowerEdge are trademarks of Dell Inc . Microsoft , Windows Server , and SQL Server are either trademarks or registered trademarks of Microsoft Corporation in the United States and/or other countries. Intel and Xeon are either trademarks or registered trademarks of Intel Corporation in the United States and/or other countries. Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products. Dell Inc. disclaims any proprietary interest in trademarks and trade names other than its own. December 2012 Page ii Dell Quickstart Data Warehouse Appliance 2000 Contents Introduction ............................................................................................................ 2 Audience ............................................................................................................. 2 Reference Architecture .............................................................................................. -
The Teradata Data Warehouse Appliance Technical Note on Teradata Data Warehouse Appliance Vs
The Teradata Data Warehouse Appliance Technical Note on Teradata Data Warehouse Appliance vs. Oracle Exadata By Krish Krishnan Sixth Sense Advisors Inc. November 2013 Sponsored by: Table of Contents Background ..................................................................3 Loading Data ............................................................3 Data Availability .........................................................3 Data Volumes ............................................................3 Storage Performance .....................................................4 Operational Costs ........................................................4 The Data Warehouse Appliance Selection .....................................5 Under the Hood ..........................................................5 Oracle Exadata. .5 Oracle Exadata Intelligent Storage ........................................6 Hybrid Columnar Compression ...........................................7 Mixed Workload Capability ...............................................8 Management ............................................................9 Scalability ................................................................9 Consolidation ............................................................9 Supportability ...........................................................9 Teradata Data Warehouse Appliance .........................................10 Hardware Improvements ................................................10 Teradata’s Intelligent Memory (TIM) vs. Exdata Flach Cache. -
The Roles of ETL, ESB, and Data Virtualization Technologies in Integration Landscape Chapter 1 Chapter 5 Data Integration Data Integration Strategies
The Roles of ETL, ESB, and Data Virtualization Technologies in Integration Landscape Chapter 1 Chapter 5 Data Integration Data Integration Strategies 3-4 Compared 11 Chapter 2 Chapter 6 Extract, Transform, Load - Integrating Data Integration ETL Strategies 5 12 Chapter 3 Case Studies Enterprise Service Bus - ESB 7-8 13-15 Chapter 4 Data Virtualization - DV 9-10 The Roles of ETL, ESB, and Data Virtualization Technologies in Integration Landscape 2 Chapter 1 Data Integration Data Silo Syndrome The problem of data silos, which are data sources that are unable to easily share data from one to the other, has plagued the IT landscape for many years, and continues to do so today, despite the advents of broadband Internet, gigabit networking, and cloud-based storage. Data silos exists for a variety of reasons: • Old systems have trouble talking with modern systems. • On-premises systems have difficulty talking with cloud-based systems. • Some systems only work with specific applications. • Some systems are configured to be accessed by specific individuals or groups. • Companies acquire other companies, taking on differently-configured systems. The Roles of ETL, ESB, and Data Virtualization Technologies in Integration Landscape 3 Chapter 1 Bringing the Data Together The problem with data silos is that no one can run a query across them; they must be queried separately, and the separate results need to be added together manually, which is costly, time-consuming, and inefficient. To bring the data together, companies use one or more of the following data integration strategies: 1. Extract, Transform, and Load (ETL) Processes, which copy the data from the silos and move it to a central location, usually a data warehouse 2. -
State of the Data Warehouse Appliance Krish Krishnan
Thema Datawarehouse Appliances The future looks bright considering all the innovation State of the Data Warehouse Appliance Krish Krishnan Krish Krishnan is a recognized expert worldwide in the strategy, architec- ture and implementation of high performance data warehousing solutions. He is a visionary data warehouse thought leader and an independent analyst, writing and speaking at industry leading conferences, user groups and trade publications. His conclusion: The future looks bright for the data warehouse appliance. We have been seeing in the past 18 months a steady influx of runs for 10 hours in a 1,000,000 records database”; data warehouse appliances, and the offerings include the esta- - “My users keep reporting that the ETL processes failed due to blished players like Oracle and Teradata. There are a few facts large data volumes in the DW”; that are clear from this rush of options and offerings: - “My CFO wants to know why we need another $500,000 for - The data warehouse appliance is now an accepted solution and infrastructure when we invested a similar amount just six is being sponsored by the CxO within organizations; months ago”; - The need to be nimble and competitive has driven the quick - “My DBA team is exhausted trying to keep the tuning process maturation of data warehouse appliances, and the market is online for the databases to achieve speed.” embracing the offerings where applicable; - Some of the world’s largest data warehouses have augmented The consistent theme throughout the marketplace focuses on the the data warehouse appliance, including: New York Stock need to keep performance and response time up without incur- Exchange has multiple enterprise data warehouses (EDWs) on ring extra spending due to higher costs. -
Gartner Magic Quadrant for Data Management Solutions for Analytics
16/09/2019 Gartner Reprint Licensed for Distribution Magic Quadrant for Data Management Solutions for Analytics Published 21 January 2019 - ID G00353775 - 74 min read By Analysts Adam Ronthal, Roxane Edjlali, Rick Greenwald Disruption slows as cloud and nonrelational technology take their place beside traditional approaches, the leaders extend their lead, and distributed data approaches solidify their place as a best practice for DMSA. We help data and analytics leaders evaluate DMSAs in an increasingly split market. Market Definition/Description Gartner defines a data management solution for analytics (DMSA) as a complete software system that supports and manages data in one or many file management systems, most commonly a database or multiple databases. These management systems include specific optimization strategies designed for supporting analytical processing — including, but not limited to, relational processing, nonrelational processing (such as graph processing), and machine learning or programming languages such as Python or R. Data is not necessarily stored in a relational structure, and can use multiple data models — relational, XML, JavaScript Object Notation (JSON), key-value, graph, geospatial and others. Our definition also states that: ■ A DMSA is a system for storing, accessing, processing and delivering data intended for one or more of the four primary use cases Gartner identifies that support analytics (see Note 1). ■ A DMSA is not a specific class or type of technology; it is a use case. ■ A DMSA may consist of many different technologies in combination. However, any offering or combination of offerings must, at its core, exhibit the capability of providing access to the data under management by open-access tools. -
DATA VIRTUALIZATION: the Modern Data Integration Solution
Whitepaper DATA VIRTUALIZATION: The Modern Data Integration Solution DATA VIRTUALIZATION LAYER Contents Abstract 3 Introduction 4 Traditional Integration Techniques 5 Data Virtualization 6 The Five Tiers of Data Virtualization Products 8 Summary: 10 Things to Know about Data Virtualization 9 Denodo Platform: The Modern Data Virtualization Platform 10 Abstract Organizations continue to struggle with integrating data quickly enough to support the needs of business stakeholders, who need integrated data faster and faster with each passing day. Traditional data integration technologies have not been able to solve the fundamental problem, as they deliver data in scheduled batches, and cannot support many of today’s rich and complex data types. Data virtualization is a modern data integration approach that is already meeting today’s data integration challenges, providing the foundation for data integration in the future. This paper covers the fundamental challenge, explains why traditional solutions fall short, and introduces data virtualization as the core solution. Introduction We have been living through an unprecedented explosion in the volume, variety, and velocity of incoming data. This is not new, but emerging technologies, such as the cloud and big data systems, which have brought large volumes of disparate data, have only compounded the problem. Different sources of data are still stored in functional silos, separate from other sources of data. Today, even data lakes contain multiple data silos. Business stakeholders need immediate information for real-time decisions, but this is challenging when the information they need is scattered across multiple sources. Similarly, initiatives like cloud first, app modernization, and big data analytics cannot move forward until data from key sources is brought together as a unified source. -
Migrating to Virtual Data Marts Using Data Virtualization Simplifying Business Intelligence Systems
Migrating to Virtual Data Marts using Data Virtualization Simplifying Business Intelligence Systems A Technical Whitepaper Rick F. van der Lans Independent Business Intelligence Analyst R20/Consultancy January 2015 Sponsored by Copyright © 2015 R20/Consultancy. All rights reserved. Cisco and the Cisco logo are trademarks or registered trademarks of Cisco and/or its affiliates in the U.S. or there countries. To view a list of Cisco trademarks, go to this URL: www.cisco.com/go/trademarks. Trademarks of companies referenced in this document are the sole property of their respective owners. Table of Contents 1 Introduction 1 2 Data Marts 1 3 The Costs of Data Marts 3 4 The Alternative: Virtual Data Marts using Data Virtualization 5 5 Developing Virtual Data Marts 6 Step 1: Recreating Physical Data Marts as Virtual Data Marts 6 Step 2: Improving Query Performance on Virtual Data Marts 12 Step 3: Identifying Common Specifications Among Virtual Data Marts 14 Step 4: Redirecting Reports to Access Virtual Data Marts 16 Step 5: Extracting Definitions from the Reporting Tools 16 Step 6: Defining Security Rules 16 Step 7: Adding External Data to Virtual Data Marts 17 6 Getting Started 17 About the Author Rick F. van der Lans 19 About Cisco Systems, Inc. 19 Copyright © 2015 R20/Consultancy, all rights reserved. Migrating to Virtual Data Marts using Data Virtualization 1 1 Introduction Almost every BI system is made up of many data marts. These data marts are commonly developed to improve query performance, to deliver to users the data with the right data structure and the right aggregation level, to minimize network delay for geographically dispersed users, to allow the use of specific database technologies, and to give the users more control over their data. -
Next-Generation Data Virtualization Fast and Direct Data Access, More Reuse, and Better Agility and Data Governance for BI, MDM, and SOA
WHITE PAPER Next-Generation Data Virtualization Fast and Direct Data Access, More Reuse, and Better Agility and Data Governance for BI, MDM, and SOA Executive Summary It’s 9:00 a.m. and the CEO of a leading consumer electronics manufacturing company is meeting with the leadership team to discuss disturbing news. Significant errors in judgment resulted in the loss of millions of dollars in the last quarter. Overproduction substantially increased the costs of inventory. The sales team is not selling where market demands exist. Customer shipping errors are on the rise. With competitive threats at an all-time high, these are indeed disturbing developments for the company. All of these problems can be traced to one thing: the lack of a common view of all enterprise information assets across the company, on demand, when key decision makers need it. Some lines of business have stale or outdated data. Many define common entities, such as customer, product, sales, and inventory, differently. Others simply have incomplete and inaccurate data, or they can’t view the data in the proper context. A few years ago, a couple of business lines embarked on business intelligence (BI) projects to identify better cross-sell and up-sell opportunities. Their IT teams found that creating a common view of customer, product, sales, and inventory was no easy task. Data wasn’t just in the data warehouse—it was everywhere, and it was growing with each passing day. Various tools and integration approaches—such as hand coding, ETL, EAI, ESB, EII—were applied to access, integrate, and deliver all this data to different consuming applications. -
Teradata Data Warehouse Appliance 2800
Teradata Data Warehouse Appliance 2850 07.16 EB9388 DATA WAREHOUSING The Best Performing Appliance Effortless Scalability Unmatched in its scalability, the Data Warehouse Appliance in the Marketplace accommodates future business growth by expanding incrementally from two to 2,048 nodes. It also accommodates Teradata Corporation makes taking your first step toward an user data space from 16TB to 34PB of uncompressed integrated data warehouse an easy one with the Teradata® user data. Featuring a massively parallel processing (MPP) Data Warehouse Appliance 2850. With this enterprise-ready architecture, the platform supports scalable growth, not only platform from Teradata, you can start building your inte- in data capacity, but in all dimensions including performance, grated data warehouse, and grow it as your needs expand. users, and applications. Ready to Run The Teradata BYNET® system interconnect for high-speed, fault tolerant messaging between nodes is a key ingredient Delivered ready to run, the Teradata Data Warehouse to achieving scalability. The BYNET is based on a robust, Appliance is a fully-integrated system purpose built for powerful protocol with innovative database messaging data warehousing. The appliance features the industry- features that optimize the use of a dual Mellanox® leading Teradata Database, a robust Teradata hardware InfiniBand™ fabric for the Teradata MPP architecture. platform with dual Intel® Xeon® 18-core processors, up to 12TB of memory in a single cabinet, SUSE® Linux Mission-Critical Availability operating system, and enterprise-class storage—all The Teradata Data Warehouse Appliance achieves preinstalled into a power-efficient unit. That means the availability and performance continuity through Teradata’s system is up and running live in just a few hours for rapid unique clique architecture where nodes are connected to time to value.