Greenplum Data Warehouse Technical


Data In. Decisions Out.

Bart Sjerps
Advisory Technology Consultant
Oracle SME - EMEA
[email protected]
+31-6-27058830
Blog: http://bartsjerps.wordpress.com
5/20/2011

“If I’d asked my customers what they wanted they’d have said a faster horse.” Henry Ford

• I’m pretty sure that if Ford had asked his customers what they wanted, they’d have said something like faster horses and the reason is fairly simple: they couldn’t imagine anything else.
• In fact, they didn’t want faster horses, they wanted a faster personal transportation method. It’s as simple as that. For Henry Ford, achieving this goal was absolutely impossible with a horse, so he came up with the idea of building a car that everybody could afford. Nobody knew they needed a car before they saw the Model-T (and knew they could afford it).

T-Mobile
Rob Strickland, CIO

Database sizes over the years:
- 1996: 11 TB, in a Teradata Database
- 1999: 130 TB, in a Teradata Database
- May 2008: 2 PB, Yahoo

It’s Time for a Change
Yesterday’s Data Warehouse and Analytic Infrastructure    The Greenplum Future
Proprietary                                               Commodity
Expensive                                                 Cost-Effective
Centralized, Monolithic                                   Distributed
Process-Heavy                                             Self-Service
Batch                                                     Real-Time
Summarized                                                Deep
Slow                                                      Agile

Greenplum – True market disruption

Capacity           20 Terabytes      70 Terabytes      100 Terabytes
Power and racks    20 kW, 8 Racks    20 kW, 6 Racks    12 kW, 2 Racks
Cost               $20M              $7M               $1.8M

Market Momentum
• 170+ global enterprise customers
• 100%+ year-to-year growth in 2009
• Acquired by EMC in July 2010
• Growing more quickly than Netezza and Teradata
• +$250 million saved by customers choosing GP over Teradata
• +5 billion shares analyzed daily by financial markets using GP
• +20 trillion rows being mined for business value
• +1 billion consumers receiving more secure and personalized services from GP customers

Industry Recognition: 2009 Gartner Magic Quadrant

Gartner:
• Strengths
  – Scale
  – Mixed workloads
  – Cloud ready
  – Self service
  – Low cost
• Concerns
  – Company size
    • Fixed by EMC
  – R&D budget
    • Fixed by EMC

* 2007 was our first year on the MQ
Source: Gartner (January 2010)

Customers by Industry
• Financial Services
• Telco
• Media & Internet
• Retail
• Gov’t & Health/Ins.

Greenplum Database: Data in. Decisions out.
Data in: Fastest Data Loading
Scatter/Gather Streaming™ for the world’s fastest data loading
• Eliminate data load bottlenecks
• Clean and integrate new data
• Several loading options ranging from bulk load updates to micro-batching for near real-time processing

In Database Analytics
Optimized for fast query execution and linear scalability
• Move processing closer to data
• Shared nothing MPP scale-out architecture
• Computing is automatically optimized and distributed across resources
• Provides the best concurrent multi-workload performance

Decisions out: Advanced analytics
Unified data access for greater insight and value from data
• Enable parallel analysis across the enterprise
• Open platform with broad language support
• Certified enterprise connectivity and integration with most BI, ETL and management products

Greenplum Database
Architecture Overview

Data Computing Division Product Portfolio
• Greenplum Database: Industry’s most scalable MPP database platform
• Greenplum Community Edition: Free entry level analytic database
• Greenplum Data Computing Appliance: World’s most powerful purpose-built database system
• Greenplum Chorus: Enterprise Data Cloud platform. Virtualized, self-service analytic infrastructure

Deployment models
• Greenplum Community Edition
  – Free downloadable
  – Limited to 2 segment servers
  – All software is enabled
• Greenplum Software Only
  – I.e. run on vSphere / Vblock
  – Or on standard (Intel) servers
• Greenplum DCD Appliance
  – Pre-configured, tested, supported, plug & play
  – Huge bandwidth
• DCD Appliance hybrid DAS / SAN

Architecture of Greenplum DCA
[Diagram, top to bottom]
• SQL and MapReduce: flexible framework for processing large datasets. Process large datasets with support for both SQL and MapReduce
• UDFs: R, Java, C, Python, Perl
• Client access: ODBC, JDBC, OLEDB etc.; BI/ETL tools
• Master servers optimize queries for the most efficient query execution
• Interconnect for continuous pipelining of data processing
• Segment servers process queries close to the data in parallel
• MPP Scatter/Gather streaming for fast loading of data

Architecture
• Based on PostgreSQL (open source) database
  – 15+ years of development
  – Feature-rich, mission critical-ready
• Greenplum adds features on top of PostgreSQL
  – Very low development cost (compared to traditional RDBMS vendors)
• Linear Scale-out
• Parallel loading
• Not depending on classic (OLTP) RDBMS tricks
  – Special indexes, materialized views, …

Greenplum Database: Technical Stack

CLIENT ACCESS & TOOLS
• Client access: ODBC, JDBC, OLEDB, etc.
• 3rd party tools: BI Tools, ETL Tools, Data Mining, etc.
• Admin tools: GP Performance Monitor, pgAdmin3 for GPDB

PRODUCT FEATURES
• Loading & ext. access: Petabyte-Scale Loading, Trickle Micro-Batching, Anywhere Data Access
• Storage & data access: Hybrid Storage & Execution (Row- & Column-Oriented), In-Database Compression, Multi-Level Partitioning, Indexes – Btree, Bitmap, etc.
• Language support: Comprehensive SQL, Native MapReduce, SQL 2003 OLAP Extensions, Programmable Analytics

GPDB ADAPTIVE SERVICES
• Multi-Level Fault Tolerance
• Online System Expansion
• Workload Management

CORE MPP ARCHITECTURE
• Shared-Nothing MPP
• Parallel Dataflow Engine
• Parallel Query Optimizer
• gNet™ Software Interconnect
• Polymorphic Data Storage™
• MPP Scatter/Gather Streaming™

What is MPP & Shared Nothing?
MPP = Massively Parallel Processing
• Two or more Servers (with own CPU/RAM/Disk) working on the same task
• Multiple units of parallelism working together
• Parallel Database Operations
• Parallel CPU Processing
• Segments = Greenplum Units of Parallelism (one Postgres database)

‘Shared Nothing’ Architecture
• Each Segment is a separate Postgres Database
• Segments only operate on their portion of the data
• Segments are self-sufficient
• Dedicated CPU Processes
• Dedicated storage that is only accessible by the Segment

Shared-Nothing Architecture: Massively Parallel Processing (MPP)
• Most scalable database architecture
  – Optimized for BI and analytics
• Provides automatic parallelization
  – No need for manual partitioning or tuning
  – Just load and query like any database
• Tables are distributed across segments
  – Each has a subset of the rows
• Extremely scalable and I/O optimized
  – All nodes can scan and process in parallel
  – No I/O contention between segments
• Linear scalability by adding nodes
  – Each adds storage, query performance and loading performance
[Diagram labels: Interconnect, Loading]

Greenplum Database Master Node
• Stores no user data
• Manages global system catalog
• Provides single view of multiple, independent Postgres databases
• Performs user authentication, query parsing/optimizing, error messaging, returns result sets to the Client
• Most importantly: Creates MPP-optimized query plan for broadcast to GP cluster

Anatomy of a Segment Node
Four Postgres Databases Running Within One Segment Host server
• Four segment databases, each running open source Postgres
• Operating system: Red Hat / SuSE / CentOS Linux or Solaris
• Hardware labels in the diagram: Intel G6 (two), cores 1A, 2A, 1B, 2B, Gig/E (two), RAM 48GB, RAID 5 sets

Group   Primaries        Mirrors          Storage
A       A1 A2 A3 A4      A4 A1 A2 A3      6 SAS/SATA Drives
B       B1 B2 B3 B4      B4 B1 B2 B3      6 SAS/SATA Drives
C       C1 C2 C3 C4      C4 C1 C2 C3      6 SAS/SATA Drives
D       D1 D2 D3 D4      D4 D1 D2 D3      6 SAS/SATA Drives

Greenplum Database
How a distributed database works

Data Distribution: The Key to Parallelism
Strategy: spread data evenly across as many nodes (and disks) as possible

Order #   Order Date     Customer ID
43        Oct 20 2005    12
64        Oct 20 2005    111
45        Oct 20 2005    42
46        Oct 20 2005    64
77        Oct 20 2005    32
48        Oct 20 2005    12
50        Oct 20 2005    34
56        Oct 20 2005    213
63        Oct 20 2005    15
44        Oct 20 2005    102
53        Oct 20 2005    42
55        Oct 20 2005    55

Distribution Policies
• Hash Distribution
  – CREATE TABLE … DISTRIBUTED BY (column [,…])
  – Keys of the same value always sent to the same segment
• Round-Robin Distribution
  – CREATE TABLE … DISTRIBUTED RANDOMLY
  – Rows with columns of the same value not necessarily on the same segment

Planning & Dispatching a Query
• Master = Query Dispatch (QD)
• Segment = Query Execution (QE)

Further Improve Scan Times

    SELECT COUNT(*)
    FROM orders
    WHERE order_date >= 'Oct 20 2005'
      AND order_date < 'Oct 27 2005'

[Figure: two grids of segments 1A through 3D, comparing Hash Partition vs. Multi-Level Partition]

Greenplum Database
Key Features and Differentiators

Greenplum Database: Core Architecture & Dynamic Services

GPDB DYNAMIC SERVICES
• Self-Healing Fault Tolerance
• Online System Expansion
• Dynamic Workload Management

CORE MPP
• Shared-Nothing MPP
• Parallel Dataflow Engine
• Parallel Query Optimizer
• gNet™ Software
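
Illustration: hash versus round-robin distribution

The two policies on the Distribution Policies slide can be illustrated outside the database with a small simulation. This is only a sketch under stated assumptions: it is not Greenplum's hash function or dispatcher, CRC32 merely stands in for whatever hash the database applies, and `segment_for_key`, `distribute_hash`, `distribute_round_robin` and `segments_holding` are hypothetical helper names. The rows are the order table from the Data Distribution slide.

```python
import zlib
from collections import defaultdict
from itertools import cycle

# (order number, order date, customer ID) rows from the Data Distribution slide
ORDERS = [
    (43, "Oct 20 2005", 12), (64, "Oct 20 2005", 111), (45, "Oct 20 2005", 42),
    (46, "Oct 20 2005", 64), (77, "Oct 20 2005", 32), (48, "Oct 20 2005", 12),
    (50, "Oct 20 2005", 34), (56, "Oct 20 2005", 213), (63, "Oct 20 2005", 15),
    (44, "Oct 20 2005", 102), (53, "Oct 20 2005", 42), (55, "Oct 20 2005", 55),
]


def segment_for_key(key, num_segments):
    """Map a distribution key to a segment index (CRC32 is a stand-in hash)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_segments


def distribute_hash(rows, key_index, num_segments):
    """Hash distribution: equal key values always land on the same segment."""
    segments = defaultdict(list)
    for row in rows:
        segments[segment_for_key(row[key_index], num_segments)].append(row)
    return segments


def distribute_round_robin(rows, num_segments):
    """Round-robin distribution: rows are dealt out in turn, ignoring values."""
    segments = defaultdict(list)
    for seg, row in zip(cycle(range(num_segments)), rows):
        segments[seg].append(row)
    return segments


def segments_holding(segments, column_index, value):
    """Return the set of segment indexes that hold a row with the given value."""
    return {
        seg
        for seg, rows in segments.items()
        if any(row[column_index] == value for row in rows)
    }


if __name__ == "__main__":
    by_customer = distribute_hash(ORDERS, key_index=2, num_segments=4)
    dealt = distribute_round_robin(ORDERS, num_segments=4)
    print("hash, customer 12 on segments:", segments_holding(by_customer, 2, 12))
    print("round-robin, customer 12 on segments:", segments_holding(dealt, 2, 12))
    print("round-robin rows per segment:", {s: len(r) for s, r in dealt.items()})
```

In this sketch, hashing on the customer ID keeps both orders of customer 12 on one segment, while round-robin deals them to different segments but gives every segment the same number of rows. That matches the trade-off the slide describes: co-location of equal keys versus even spread without regard to values.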