Choosing a Database-as-a-Service: An Overview of Offerings by Major Public Cloud Service Providers
Warner Chaves
Principal Consultant, Microsoft Certified Master, Microsoft MVP

With Contributors
Danil Zburivsky, Director of Big Data and Data Science
Vladimir Stoyak, Principal Consultant for Big Data, Certified Google Cloud Platform Qualified Developer
Derek Downey, Practice Advocate, OpenSource Databases
Manoj Kukreja, Big Data and IT Security Specialist, CISSP, CCAH and OCP

When it comes to running your data in the public cloud, there is a range of Database-as-a-Service (DBaaS) offerings from all three major public cloud providers. Knowing which is best for your use case can be challenging. This paper provides a high-level overview of the main DBaaS offerings from Amazon, Microsoft, and Google.

After reading this white paper, you’ll have a high-level understanding of the most popular data repositories and data analytics service offerings from each vendor, you’ll know the key differences among the offerings, and you’ll know which ones are best for each use case. With this information, you can direct your more detailed research to a manageable number of options.

www.pythian.com | White Paper 1

This white paper does not discuss private cloud providers or colocation environments, streaming, data orchestration, or Infrastructure-as-a-Service (IaaS) offerings. This paper is targeted to IT professionals with a good understanding of databases and also to business people who want an overview of data platforms in the cloud.

WHAT IS A DBAAS OFFERING?

A DBaaS is a database running in the public cloud. Three things define a DBaaS:

• The service provider installs and maintains the database software, including backups and other common database administration tasks. The service provider also owns and manages the operating system, hypervisors, and bare metal hardware.
• Application owners pay according to their usage of the service.
• Usage of the service must be flexible: users can scale up or down on demand and also create and destroy environments on demand. These operations should be possible through code with no provider intervention.

FOUR CATEGORIES OF DBAAS OFFERINGS

To keep things simple, we’ve created four categories of DBaaS offerings. Your vehicles of choice are:

• The Corollas: These are the classic RDBMS services in the cloud: Amazon Relational Database Service (RDS), Microsoft Azure SQL Database, and Google Cloud SQL.
• The Formula One offerings: These special-purpose offerings ingest and query data very quickly but might not offer all the amenities of the Corollas. Options include Amazon DynamoDB, Microsoft Azure DocumentDB, Google Cloud Datastore, and Google Cloud Bigtable.
• The 18-wheelers: These data warehouses of structured data in the cloud include Amazon Redshift, Microsoft Azure SQL Data Warehouse, and Google BigQuery.
• The container ships: These Hadoop-based big-data systems can carry anything, and include Amazon Elastic MapReduce (EMR), Microsoft Azure HDInsight, and Google Cloud Dataproc. This category also includes the further automated offering of Azure Data Lake.

The rest of this white paper discusses each category and the Amazon, Microsoft, and Google offerings within each category. We describe each offering, explain what it is well suited for, provide expert tips or additional relevant information, and provide high-level pricing information.

COROLLAS

With the Corollas, just like with the car, you know what you’re getting, and you know what to expect. This type of classic RDBMS service gets you from point A to point B reliably. It’s not the flashiest or newest thing on the block, but it gets the job done.

AMAZON RDS

Amazon Relational Database Service (RDS) is the granddaddy of DBaaS offerings available on the Internet. RDS is an automation layer that Amazon has built on top of MySQL, MariaDB, Oracle, PostgreSQL, and SQL Server.
Amazon has also developed its own MySQL fork called Amazon Aurora, which also lives inside RDS.

RDS is an easy way to transition into DBaaS because the service mimics the on-premises experience very closely. You simply need to provision an RDS instance, which maps very closely to the virtual machine models that Amazon offers. Amazon then installs the software, manages patches and backups, and can also manage high availability, so you do not need to plan and execute these tasks yourself.

RDS is very good for lift-and-shift types of cloud migrations. It makes it easy for existing staff to take advantage of the service because it mimics the on-premises experience, be it physical or virtual.

EXPERT TIP

The storage is very flexible: this is both a pro and a con. The pro is that you have a lot of control over storage. The con is that there are so many storage options that you need the knowledge to choose the best one for your use case. Amazon has general storage, provisioned IOPS (input/output operations per second), and two categories of magnetic storage. The storage method you choose will depend on your particular use cases.

You need to be aware that Amazon does not make every patch version of all products available on RDS. Instead, Amazon makes only some major service packs or Oracle patch levels available. As a result, the exact patch level that you have on premises might not map to a patch level on RDS. In this situation, do not move to a patch level that is below the patch level you have, because that may result in product regressions. Instead, wait until Amazon has deployed a patch level higher than what you have. At that point, it should be fairly safe to start testing if you want to migrate to RDS.
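As a rough illustration of the patch-level advice above, the following Python sketch picks the lowest offered version that is not below the on-premises version and signals "wait" when there is none. It is only a sketch under stated assumptions: the function names and the version list are hypothetical, versions are assumed to be plain dotted numbers, and an exact match is treated as acceptable. Real engines (Oracle patch sets, SQL Server service packs) use their own numbering schemes, and the list of versions actually available has to come from the provider.

```python
from typing import Iterable, Optional, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Turn a dotted version string such as '5.7.21' into (5, 7, 21)."""
    return tuple(int(part) for part in text.split("."))


def pick_target_version(on_prem: str, offered: Iterable[str]) -> Optional[str]:
    """Return the lowest offered version that is not below the on-premises one.

    Returns None when every offered version is older than the on-premises
    version, which corresponds to "wait for a newer patch level".
    """
    current = parse_version(on_prem)
    # Keep only versions at or above the current patch level (no downgrade).
    candidates = [v for v in offered if parse_version(v) >= current]
    if not candidates:
        return None
    # The closest version at or above the current one is the smallest jump.
    return min(candidates, key=parse_version)


# Hypothetical values for illustration only, not a real RDS version list.
offered_by_provider = ["5.6.34", "5.7.17", "5.7.23"]
print(pick_target_version("5.7.21", offered_by_provider))  # 5.7.23
print(pick_target_version("5.7.25", offered_by_provider))  # None
```

The comparison is purely numeric, so it would also accept a jump to a newer major release if that were the only candidate. A real migration plan would likely restrict candidates to the same release line and test the target version before cutting over.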
HOW IT’S PRICED

The hourly rate for RDS depends on:

• whether you have your own license or Amazon is leasing you the license;
• how much compute power you choose: the number of cores, and the amount of memory and temporary disk you want on the instance;
• the storage you require; and
• whether you pre-purchased with Reserved Instances.

MICROSOFT AZURE SQL DATABASE

Microsoft Azure SQL Database is a “cloud-first” SQL Server fork. The term “cloud-first” means that Microsoft now tests and deploys its code continuously with Azure SQL Database, and the code and lessons learned are then implemented in the retail SQL Server product, whether that product runs on premises or on a virtual machine.

Even if you don’t have any investment in SQL Server, Azure SQL Database is an excellent DBaaS platform because of the investments made to support elastic capabilities and the ease of scaling horizontally. As you need more capacity, you just add more databases. It’s also easy to manage the databases by pooling resources, performing elastic queries, and performing elastic job executions. You could deploy your own code to do something similar in Amazon RDS, but in Azure SQL Database, Microsoft has already built it for you.

In addition, Azure SQL Database makes it easy to build an elastic application on a relational service. This capability supports the Software-as-a-Service (SaaS) model, wherein you have many clients and each has a database. The SaaS provider has a data layer that is easier to manage and scale than if they were running on their own infrastructure.

Unlike Amazon RDS, Azure SQL Database does not exactly map to a type of retail database, such as Oracle, SQL Server, or open-source MySQL. It is closely related to SQL Server, but it’s not licensed or sold in a similar way. As a result, Azure SQL Database does not have any licensing component. At the same time, Azure SQL Database does not give you a lot of control over the hardware.
With Amazon RDS, you need to select CPUs, memory, and your storage layout. Azure SQL Database does all this for you. With Azure SQL Database, the only thing that you need to choose is the service tier. Your choice determines how much power your database has. There are three service tiers: Basic, Standard, and Premium. Each of these also has some sub-tiers to increase or decrease performance. If you have many databases in Azure SQL Database, you can also choose the elastic database pool pricing option to increase your savings by sharing resources.

Azure SQL Database is a good choice if you already have Transact-SQL (T-SQL) skills in-house. If you have a large investment in SQL Server, Azure SQL Database is the most natural way to take advantage of DBaaS offerings in the cloud. It’s also a very good web-scale relational service in its own right because of all the investments made to support the SaaS model.

EXPERT TIP

You need to do the proper SQL tuning to be able to choose the right service tier for your needs. In the past, it was more difficult to scale up because all equipment was on premises. Now, it’s very easy to increase the power of the service and therefore pay more money. However, just because scaling up is easy does not mean it’s always what you need to do. If you perform the proper SQL tuning, you may not need to pay more for raw power.

HOW IT’S PRICED

Azure SQL Database has a simple pricing model. You pay an hourly rate for the service tier your database is running on: Basic, Standard, or Premium. Each tier has a different size limit for the database, and performance increases as you go up the tiers.

GOOGLE CLOUD SQL

Google Cloud SQL is a MySQL managed database service that is very similar to Amazon RDS for MySQL and Amazon Aurora.