Choosing a Database-as-a-Service: An Overview of Offerings by Major Public Cloud Service Providers

Warner Chaves
Principal Consultant, Microsoft Certified Master, Microsoft MVP

With Contributors
Danil Zburivsky, Director of Big Data and Data Science
Vladimir Stoyak, Principal Consultant for Big Data, Certified Google Cloud Platform Qualified Developer
Derek Downey, Practice Advocate, OpenSource Databases
Manoj Kukreja, Big Data and IT Security Specialist, CISSP, CCAH and OCP

When it comes to running your data in the public cloud, there is a range of Database-as-a-Service (DBaaS) offerings from all three major public cloud providers. Knowing which is best for your use case can be challenging. This paper provides a high-level overview of the main DBaaS offerings from Amazon, Microsoft, and Google.

After reading this white paper, you’ll have a high-level understanding of the most popular data repositories and data analytics service offerings from each vendor, you’ll know the key differences among the offers, and which ones are best for each use case. With this information, you can direct your more detailed research to a manageable number of options.

www.pythian.com | White Paper

This white paper does not discuss private cloud providers or colocation environments, streaming, data orchestration, or Infrastructure-as-a-Service (IaaS) offerings. This paper is targeted to IT professionals with a good understanding of databases and also business people who want an overview of data platforms in the cloud.

WHAT IS A DBAAS OFFERING?

A DBaaS is a database running in the public cloud. Three things define a DBaaS:
• The service provider installs and maintains the database software, including backups and other common database administration tasks. The service provider also owns and manages the operating system, hypervisors, and bare metal hardware.
• Application owners pay according to their usage of the service.
• Usage of the service must be flexible: users can scale up or down on demand and also create and destroy environments on demand. These operations should be possible through code with no provider intervention.

FOUR CATEGORIES OF DBAAS OFFERINGS

To keep things simple, we’ve created four categories of DBaaS offerings. Your vehicles of choice are:
• The Corollas: These are the classic RDBMS services in the cloud: Amazon Relational Database Service (RDS), Microsoft Azure SQL Database, and Google Cloud SQL.
• The Formula One offerings: These special-purpose offerings ingest and query data very quickly but might not offer all the amenities of the Corollas. Options include Amazon DynamoDB, Microsoft Azure DocumentDB, Google Cloud Datastore, and Google Cloud Bigtable.
• The 18-wheelers: These data warehouses of structured data in the cloud include Amazon Redshift, Microsoft Azure SQL Data Warehouse, and Google BigQuery.
• The container ships: These Hadoop-based big-data systems can carry anything, and include Amazon Elastic MapReduce (EMR), Microsoft Azure HDInsight, and Google Cloud Dataproc. This category also includes the further automated offering of Azure Data Lake.

The rest of this white paper discusses each category and the Amazon, Microsoft, and Google offerings within each category. We describe each offering, explain what it is well suited for, provide expert tips or additional relevant information, and provide high-level pricing information.

COROLLAS

With the Corollas, just like with the car, you know what you’re getting, and you know what to expect. This type of classic RDBMS service gets you from point A to point B reliably. It’s not the flashiest or newest thing on the block, but it gets the job done.

AMAZON RDS

Amazon Relational Database Service (RDS) is the granddaddy of DBaaS offerings available on the Internet.
RDS is an automation layer that Amazon has built on top of MySQL, MariaDB, Oracle, PostgreSQL, and SQL Server. Amazon has also developed its own MySQL fork called Amazon Aurora, which also lives inside RDS.

RDS is an easy way to transition into DBaaS because the service mimics the on-premises experience very closely. You simply need to provision an RDS instance, which maps very closely to the virtual machine models that Amazon offers. Amazon then installs bits, manages patches and backups, and can also manage the high availability, so you do not need to plan and execute these tasks yourself.

RDS is very good for lift-and-shift types of cloud migrations. It makes it easy for existing staff to take advantage of the service because it mimics the on-premises experience, be it physical or virtual.

EXPERT TIP

The storage is very flexible: this is both a pro and a con. The pro is that you have a lot of control over storage. The con is that there are so many storage options, you need the knowledge to choose the best one for your use case. Amazon has general storage, provisioned IOPS (input/output operations per second), and two categories of magnetic storage. The storage method you choose will depend on your particular use cases.

You need to be aware that Amazon does not make every patch version of all products available on RDS. Instead, Amazon makes only some major service packs or Oracle patch levels available. As a result, the exact patch level that you have on premises might not map to a patch level on RDS. In this situation, do not move to a patch level that is below the patch level you have because that may result in product regressions. Instead, wait until Amazon has deployed a patch level higher than what you have. At this point, it should be fairly safe to start testing if you want to migrate to RDS.

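The patch-level rule in the tip above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: versions are treated as plain dotted numbers (real Oracle, SQL Server, and MySQL patch naming differs), and the names `parse_version` and `pick_target_version`, along with the sample version strings, are hypothetical and not part of any Amazon API.

```python
from typing import Iterable, Optional, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Turn a dotted version string such as '5.7.21' into a comparable tuple.

    Assumption: versions are purely numeric and dot-separated.
    """
    return tuple(int(part) for part in text.split("."))


def pick_target_version(on_prem: str, offered: Iterable[str]) -> Optional[str]:
    """Return the lowest offered version that is not below the on-premises one.

    Returns None when every offered version is older than the on-premises
    version, which corresponds to "wait until a higher patch level is deployed".
    """
    floor = parse_version(on_prem)
    candidates = [v for v in offered if parse_version(v) >= floor]
    if not candidates:
        return None
    return min(candidates, key=parse_version)


if __name__ == "__main__":
    # Hypothetical list of versions offered by the service.
    offered_versions = ["5.7.19", "5.7.23", "5.7.26"]
    print(pick_target_version("5.7.21", offered_versions))
    print(pick_target_version("5.7.30", offered_versions))
```

The sketch compares versions only numerically; it does not stop a candidate from crossing a major release boundary, so a real check would probably also restrict candidates to the same major version before testing a migration.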
HOW IT’S PRICED

The hourly rate for RDS depends on:
• whether you have your own license or if Amazon is leasing you the license;
• how much compute power you choose: the number of cores, and amount of memory and temporary disk you want on this instance;
• the storage you require; and
• whether you pre-purchased with Reserved Instances.

MICROSOFT AZURE SQL DATABASE

Microsoft Azure SQL Database is a “cloud-first” SQL Server fork. The term “cloud-first” means that Microsoft now tests and deploys their code continuously with Azure SQL Database, and the code and lessons learned are implemented in the retail SQL Server product, whether the product is on premises or on a virtual machine.

Even if you don’t have any investment in SQL Server, Azure SQL Database is an excellent DBaaS platform because of the investments made to support elastic capabilities and easy horizontal scaling. As you need more capacity, you just add more databases. It’s also easy to manage the databases by pooling resources, performing elastic queries, and performing elastic job executions. You could deploy your own code to do something similar in Amazon RDS, but in Azure SQL Database, Microsoft has already built it for you.

In addition, Azure SQL Database makes it easy to build an elastic application on a relational service. This capability supports the Software-as-a-Service (SaaS) model, wherein you have many clients and each has a database. The SaaS provider has a data layer that is easier to manage and scale than if they were running on their own infrastructure.

Unlike Amazon RDS, Azure SQL Database does not exactly map to a type of retail database, such as Oracle, SQL Server, or open-source MySQL. It is closely related to SQL Server but it’s not licensed or sold in a similar way. As a result, Azure SQL Database does not have any licensing component. At the same time, Azure SQL Database does not give you a lot of control over the hardware.
With Amazon RDS, you need to select CPUs, memory, and your storage layout. Azure SQL Database does all this for you. With Azure SQL Database the only thing that you need to choose is the service tier. Your choice determines how much power your database has. There are three service tiers: basic, standard, and premium. Each of these also has some sub-tiers to increase or decrease performance. If you have many databases in Azure SQL Database, you can also choose the elastic database pool pricing option to increase your savings by sharing resources.

Azure SQL Database is a good choice if you already have Transact-SQL (T-SQL) skills in-house. If you have a large investment in SQL Server, Azure SQL Database is the most natural way to take advantage of DBaaS offerings in the cloud. It’s also a very good web-scale relational service in its own right because of all the investments made to support the SaaS model.

EXPERT TIP

You do need to ensure that you do the proper SQL tuning to be able to choose the right service tier for your needs. In the past, it was more difficult to scale up because all equipment was on premises. Now, it’s very easy to increase the power of the service and therefore pay more money. However, just because scaling up is easy does not mean it’s always what you need to do. If you perform the proper SQL tuning, you will not need to pay more for raw power.

HOW IT’S PRICED

Azure SQL Database has a simple pricing model. You pay an hourly rate for the service tier your database is running on: Basic, Standard, or Premium. Each has a different size limit for the database and provides more performance as you go up in the tier.

GOOGLE CLOUD SQL

Google Cloud SQL is a MySQL managed database service that is very similar to Amazon RDS for MySQL and Amazon Aurora.