WHITEPAPER Why Choose MySQL?

MySQL is one of the most widely used relational databases, powering nine out of ten websites around the world. With this level of adoption, it must be a good fit for your needs, right? Not so fast. Even though it is used by many top websites, it’s not a good fit for everyone. A 90% usage rate is impressive, but it also means that 10% of websites do not use MySQL. Why not?

MySQL’s ease of deployment, open source knowledge, scalability, and transactional capability make it a great choice in many, but not all, scenarios.

In this whitepaper, we’ll look at where MySQL makes sense, and some places where it is a less ideal choice. MySQL is based on commonly known and documented ANSI SQL 99 standards. It was designed by MySQL AB and is available in both open source and enterprise editions.

What is MySQL?

MySQL is a relational database management system. It stores data in a table format and supports queries written using Structured Query Language (SQL). SQL has been around for a long time and is well known and documented. This provides a solid knowledge base for the use of MySQL.

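MySQL excels at processing transactions and can be configured to be fully ACID-compliant. ACID refers to Atomicity, Consistency, Isolation and Durability. These properties describe how reliably a database processes transactions: each transaction either completes in full or not at all, leaves the data in a valid state, does not interfere with concurrent transactions, and is not lost once it has been committed.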
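As a rough illustration of atomicity, the first of the ACID properties, the sketch below moves money between two accounts as a single transaction. It is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in so that it runs without a MySQL server, and the accounts table and the make_db, transfer, and balances helpers are hypothetical names invented for this example.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " id INTEGER PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Credit dst and debit src as one unit; return True only if both succeed."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # This statement violates the CHECK constraint when src lacks funds.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return a dict mapping account id to balance."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))
```

When the debit fails, the credit that already ran inside the same transaction is rolled back as well, so the balances are left exactly as they were before the transfer was attempted.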

MySQL was designed to support very large databases and can be scaled to meet high data storage and user demand. As a relational database, it stores its data in tables, using a field/value model. It relies on SQL for queries, so it has an extensive knowledge base.

Sometimes, MySQL is perceived as being too rigid in its data structures. This is due to the ability to create and enforce primary key and foreign key constraints and the requirement that the data being loaded fit the current schema. Primary and foreign key constraints would determine, for example, that you could not load a financial transaction if the account number recorded in the transaction did not exist in another table. The table recording the transaction would have account_number as a foreign key and the table recording account owners would have account_number as a primary key. The inherent properties of a primary key also mean that you could not have a duplicate value, nor could the field be null, in the account owner table. These keys make future searches easier since the key values are stored in a sorted format, much like locating a file in a filing cabinet is made easier since the contents are usually sorted alphabetically.
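The account_number example can be sketched in a few lines. This is a minimal illustration, assuming Python's built-in sqlite3 module as a stand-in for a MySQL server; the table layouts, the sample owner, and the try_insert helper are hypothetical. The PRAGMA line is specific to SQLite, which leaves foreign key checks off unless asked.

```python
import sqlite3


def make_db():
    """Create the two hypothetical tables and one known account owner."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
    conn.execute(
        "CREATE TABLE account_owners ("
        " account_number INTEGER PRIMARY KEY,"
        " owner_name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE transactions ("
        " transaction_id INTEGER PRIMARY KEY,"
        " account_number INTEGER NOT NULL"
        "   REFERENCES account_owners(account_number),"
        " amount INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO account_owners VALUES (1001, 'A. Owner')")
    conn.commit()
    return conn


def try_insert(conn, sql, params):
    """Run one INSERT; report whether the constraints accepted it."""
    try:
        with conn:
            conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


NEW_TRANSACTION = "INSERT INTO transactions (account_number, amount) VALUES (?, ?)"
NEW_OWNER = "INSERT INTO account_owners (account_number, owner_name) VALUES (?, ?)"
```

With these definitions, a transaction for account 1001 is accepted, a transaction for an unknown account such as 9999 is rejected by the foreign key, and a second owner row for 1001 is rejected by the primary key.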

Additionally, MySQL requires that the database schema be designed prior to writing or reading data. The schema outlines the various fields that are expected in the file(s) to be loaded and defines an expected data type for each field. For example, you may have a transaction_time field which is defined as storing date/time information. Thus, a value of 09:22:44 is acceptable, but a value of “2 PM” is not. If a file being loaded contains field names that are not recognized by MySQL, that data is not loaded. In our example, if the field is named transaction_time in the database, but the data file names that field xaction_time, the data cannot be loaded.
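The checks described above can be approximated as follows. This is a sketch, not MySQL's actual loader: it assumes Python's built-in sqlite3 module as a stand-in, and because SQLite does not enforce a time type, the format check is done in Python with datetime.strptime. The transactions table and the table_columns and load_record helpers are hypothetical.

```python
import sqlite3
from datetime import datetime


def make_db():
    """Create a hypothetical table whose schema is fixed before any load."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions ("
        " account_number INTEGER NOT NULL,"
        " transaction_time TEXT NOT NULL)"
    )
    return conn


def table_columns(conn, table):
    """Return the set of column names defined for a table."""
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}


def load_record(conn, record):
    """Try to load one record (dict of field name to value); return the outcome."""
    unknown = set(record) - table_columns(conn, "transactions")
    if unknown:
        return "rejected: unknown field(s) " + ", ".join(sorted(unknown))
    try:
        # Stand-in for the type check a time column would apply.
        datetime.strptime(record["transaction_time"], "%H:%M:%S")
    except (KeyError, TypeError, ValueError):
        return "rejected: invalid transaction_time"
    columns = ", ".join(record)
    placeholders = ", ".join("?" for _ in record)
    with conn:
        conn.execute(
            f"INSERT INTO transactions ({columns}) VALUES ({placeholders})",
            tuple(record.values()),
        )
    return "loaded"
```

A record with transaction_time set to 09:22:44 loads, one with “2 PM” is rejected for its value, and one that names the field xaction_time is rejected for its field name.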

MySQL ingests data from a variety of sources through data loading. These files can come from different places but the contents must be standardized to meet the requirements of the existing database design.

MySQL can replicate data to other nodes in either a master/slave relationship or through the implementation of a clustered environment, where multiple servers are kept up to date through the application.

When data needs to be read from the database, a MySQL query is used. This query, which conforms to the ANSI SQL 99 language standards, is run against the database. Again, if tables or fields are referenced in the query that do not exist in the current schema, the query fails. Otherwise, the query returns the requested information to the application.

In MySQL (and other open source databases), as the database grows in size, applications can take longer and longer to search for and find data. This hurts performance as the number of application requests on the database scales up. Primary and foreign keys act as a shortcut of sorts to finding the relevant data for a particular application request. This speeds up responses from the database to application requests and improves overall application performance.
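One way to see the shortcut is to ask the database how it plans to run a query. The sketch below assumes Python's built-in sqlite3 module as a stand-in and uses SQLite's EXPLAIN QUERY PLAN statement (MySQL has its own EXPLAIN, with different output). The interactions table, the index name, and the helper functions are hypothetical.

```python
import sqlite3


def make_db(rows=1000):
    """Create a hypothetical interactions table with some sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE interactions ("
        " interaction_id INTEGER PRIMARY KEY,"
        " user_id INTEGER NOT NULL,"
        " note TEXT)"
    )
    conn.executemany(
        "INSERT INTO interactions (user_id, note) VALUES (?, ?)",
        [(i % 100, "n") for i in range(rows)],
    )
    conn.commit()
    return conn


def plan(conn, sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text


def add_user_index(conn):
    """Index the user_id column so lookups no longer read every row."""
    conn.execute("CREATE INDEX idx_interactions_user_id ON interactions (user_id)")
    conn.commit()
```

Before add_user_index is called, the plan for a lookup by user_id is a SCAN of the whole table; afterwards it becomes a SEARCH that uses the index. A lookup by the primary key is a SEARCH from the start.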

If MySQL receives a data feed that contains unknown or incorrectly named values, the load is rejected. For some purposes, like financial markets, this can be helpful, but it is a limitation on flexibility.

If you know the structure of your data and can define it solidly, MySQL can be a great fit. This is because data being loaded into MySQL must meet the current schema design. The schema is matched on load and then used during querying to determine what data to find and where to look for it. This concept is referred to as “schema on write”. When you are looking for schema flexibility, where structure is applied only when the data is read (“schema on read”), MySQL may not be ideal.

MySQL supports a high availability (HA) environment, where each piece of data is stored redundantly on another node in a clustered environment. This ensures that the data is still accessible, even in the case of a node failure. MySQL functions on a quorum model for high availability and will stay up and running so long as a majority of the nodes in a cluster are up and running.
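The majority rule itself is simple arithmetic. The sketch below is a generic illustration of a quorum check, not MySQL's implementation; the function names are hypothetical.

```python
def has_quorum(nodes_up: int, cluster_size: int) -> bool:
    """Return True when a strict majority of the cluster's nodes are up."""
    if cluster_size <= 0 or not 0 <= nodes_up <= cluster_size:
        raise ValueError("nodes_up must be between 0 and cluster_size")
    return nodes_up > cluster_size // 2


def tolerated_failures(cluster_size: int) -> int:
    """Return how many nodes can fail while a majority still remains."""
    if cluster_size <= 0:
        raise ValueError("cluster_size must be positive")
    return (cluster_size - 1) // 2
```

Under this rule a three-node cluster survives the loss of one node, and a five-node cluster survives the loss of two. A four-node cluster still survives only one failure, which is why clusters of this kind are usually sized with an odd number of nodes.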

Customer Relationship Management (CRM)

Let’s say that you run a CRM site, used to record and track all of your customer interactions. It is important that the information retrieved when a customer contacts you is up to date and contains all of the relevant information needed to manage their account. You also need to be able to scale up to handle data from a large number of customers and a high volume of read and write transactions.

In this case, MySQL is a good choice. One of the benefits of storing this type of data is that the interaction information is known prior to an interaction occurring. There will be identifying information, dates and timestamps of interactions, and reports on what occurred during the interaction. Given that the data being stored is predictable, it fits well into MySQL’s controlled schema. If a new interaction data point is needed, the schema can be altered to support the new field(s), but this is a non-trivial task and should be undertaken only when needed.
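As a small illustration of altering the schema for a new data point, the sketch below adds a field to an existing table. It assumes Python's built-in sqlite3 module as a stand-in; MySQL accepts the same ALTER TABLE ... ADD COLUMN form, though on a large production table the change deserves planning. The interactions table, its sample row, and the channel field are hypothetical.

```python
import sqlite3


def make_db():
    """Create a hypothetical interactions table holding one existing row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE interactions ("
        " interaction_id INTEGER PRIMARY KEY,"
        " user_id INTEGER NOT NULL,"
        " occurred_at TEXT NOT NULL,"
        " summary TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT INTO interactions (user_id, occurred_at, summary)"
        " VALUES (1, '2024-01-15 09:22:44', 'billing question')"
    )
    conn.commit()
    return conn


def column_names(conn, table):
    """Return the table's column names in definition order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def add_channel_column(conn):
    """Extend the schema with a hypothetical 'channel' field."""
    conn.execute("ALTER TABLE interactions ADD COLUMN channel TEXT")
    conn.commit()
```

Rows that existed before the change hold NULL in the new column until they are updated, so any application code that reads the field has to allow for that.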

The relational aspect of MySQL enables it to use storage efficiently. With each interaction, there is some user information that needs to be recorded. Rather than recording the user’s name, email, phone number, and so on for each interaction, we can use a primary/foreign key relationship to manage this data more efficiently. There will be a users table which holds a primary key field called user_id. By defining this field as a primary key, it is guaranteed to always have a unique value and cannot be left empty. This prevents us from tracking information that does not come from our user community.

Next, that user_id field is added to an interaction table as a foreign key. This ensures that each interaction is associated with a known user and prevents us from reporting on an interaction for someone who is not already a known user in our database. The referential integrity constraints keep the data current and allow us to record only the smallest value to identify the user for each interaction. Additionally, if a user needs to change some aspect of their account data, for example, if they moved to a new address, the change need only be made in the users table to be reflected in all user interactions.
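The users and interactions layout might look like the sketch below, which also shows the address change being made once and appearing in every interaction. It assumes Python's built-in sqlite3 module as a stand-in for MySQL; the column names beyond user_id, the sample data, and the helper functions are hypothetical.

```python
import sqlite3


def make_db():
    """Create hypothetical users and interactions tables with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
    conn.execute(
        "CREATE TABLE users ("
        " user_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " address TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE interactions ("
        " interaction_id INTEGER PRIMARY KEY,"
        " user_id INTEGER NOT NULL REFERENCES users(user_id),"
        " summary TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO users VALUES (1, 'Pat Example', '1 Old Street')")
    conn.executemany(
        "INSERT INTO interactions (user_id, summary) VALUES (?, ?)",
        [(1, "billing question"), (1, "password reset")],
    )
    conn.commit()
    return conn


def interaction_report(conn):
    """Join each interaction to its user's current details."""
    return conn.execute(
        "SELECT i.summary, u.name, u.address"
        " FROM interactions AS i"
        " JOIN users AS u ON u.user_id = i.user_id"
        " ORDER BY i.interaction_id"
    ).fetchall()


def change_address(conn, user_id, new_address):
    """Update the address in the one place it is stored."""
    with conn:
        conn.execute(
            "UPDATE users SET address = ? WHERE user_id = ?",
            (new_address, user_id),
        )
```

Each interaction row stores only the user_id, so a single UPDATE on the users table is enough for every line of the report to show the new address.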

The other areas where MySQL excels in this instance are scalability and responsiveness during periods of high usage. MySQL is designed to scale as needed to accommodate the data it needs to consume. Many of the largest databases in existence are managed using MySQL. It can also handle the high rate of read and write requests in an environment like this. HA is also important to an environment like this, since you need to know that you can expect to always have access to relevant customer data.

The benefits of a known schema, ability to scale, and responsiveness to high user demands make MySQL a good choice in an environment like this. If the data being managed is more variable, as would be the case when a company is receiving data feeds from multiple disparate sources, MySQL may be too constrained to manage this traffic.

Social Media

Many of the largest social media applications rely on MySQL for their database. This is another case where the data that is reported on each user is well-defined. There is also a need to process a high number of transactions for a large number of concurrent users. When a user logs into the social media site, they are immediately presented with the most current information available.

The known schema suits MySQL, since there is a controlled feed of data into the database and all of the fields included in the input file are predefined. MySQL can scale to manage the number of users and still retrieve the relevant data in a timely manner. Since there are likely to be times where there is high user activity in social media (like when the Super Bowl or World Cup games are underway), the high concurrency capability of MySQL is able to handle these spikes with ease.

Many social media sites also provide recommendations for new contacts, interesting articles, and other items. MySQL can process these queries quickly and display relevant recommendations for each user.

Social media is a regular user of MySQL due to the ability to process large volumes of data efficiently. Both read and write requests are responded to quickly and with a high degree of confidence. This is another area where HA is normally required. No one wants to be the one to tell all the users that their data is unavailable, so setting up an environment that is highly available is important.

Retail

For many retailers, MySQL is a database of choice. In these situations, the well-defined schema, HA capabilities, and transaction support match their needs well.

If a company sells products that are well known and defined, working within the confines of the MySQL schema can be fine. This also prevents unknown or unexpected products from being offered. If a new product is introduced that requires schema changes, those changes can still be supported, but must be implemented before data on the new product is loaded. The relational nature of MySQL ensures that, for example, all products have a description and price associated with them.
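The guarantee that every product has a description and a price comes from constraints declared in the schema. The sketch below is one way to express it, assuming Python's built-in sqlite3 module as a stand-in for MySQL; the products table, its column names, and the add_product helper are hypothetical.

```python
import sqlite3


def make_db():
    """Create a hypothetical products table that requires description and price."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products ("
        " sku TEXT NOT NULL PRIMARY KEY,"
        " description TEXT NOT NULL,"
        " price_cents INTEGER NOT NULL CHECK (price_cents >= 0))"
    )
    return conn


def add_product(conn, sku, description, price_cents):
    """Insert one product; return False if the schema's constraints reject it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO products (sku, description, price_cents)"
                " VALUES (?, ?, ?)",
                (sku, description, price_cents),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

A product with a missing description or a missing price is rejected at insert time, so incomplete products never reach the catalog.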

For a larger retailer, it is important that their sales environment is always available. HA is an important aspect of their database requirements, and MySQL supports this need. It can also scale to meet the needs of a growing retailer.

Support for transactions is key to the success of any retailer, whether online or brick and mortar. For an online retailer, having accurate stock numbers is important so that you can inform a customer of current status when an order is placed. For brick and mortar environments, knowing stock and purchase and exchange data is helpful to properly set customer expectations.

Financial Services

In the world of finance, small delays in processing requests can have massive financial implications. If you are running a financial tracking application, it is important to know both that your requests are being handled in a timely manner and exactly when they were handled (sometimes down to the millisecond).

When making a financial transaction, for example, submitting a sale request on an owned stock, you expect that the transaction will complete quickly and accurately, and that its result will be returned by a query run against any node. MySQL handles this with ease. This is discussed in detail in a Percona webinar.

When writing to a clustered environment, that is, an environment that has more than one node servicing the same database requests, you need to know that the data that is written is being updated on all nodes. This provides a sense of comfort when the database is queried since you know that you will return the most current data. MySQL can handle the initial write request quickly and ensure that the new data is replicated to all other nodes in the cluster.

Start Up Company

You’ve come up with a great idea for the next hot application. Of course, you’ll need a database as the backend. With that database, you’ll store all of the information on your users and the various exchanges that they have with your application. You also need to analyze traffic and usage patterns to improve the application over time.

Due to a low Total Cost of Ownership (TCO), high existing skill set, and ease of deployment, MySQL is the likely choice. In fact, running a MySQL cloud variant, like Amazon RDS, Amazon Aurora, Microsoft Azure, or Google Cloud could be an ideal fit.

The low TCO is due to the fact that MySQL is an open source product, so you don’t need to worry about license fees, nor do you need to worry about violating any license agreements. It runs on commodity hardware, so you can run it on lots of different hardware. If you’re running one of the cloud offerings, the database is tuned for basic needs and provides for easy HA and backups to ensure your business success.

As you grow, MySQL can grow along with you. Running in the cloud provides the easiest path to a more robust environment at reasonable cost. Even with some of the limitations regarding schema changes, MySQL is a common choice for many startups.

What about MariaDB?

MariaDB is a fork of MySQL Community Edition. Some of the original developers of MySQL created MariaDB in response to Oracle’s acquisition of MySQL.

While MariaDB is intended as a drop-in replacement for MySQL, it diverges enough from the original code to warrant categorization as a unique database product. It expands many capabilities of MySQL to include, for example, the MariaDB ColumnStore storage engine and the Cassandra storage engine. These additions bring capabilities that are not available in MySQL.

The features and capabilities of MariaDB are guided by the MariaDB Foundation, a consortium of users and developers. The Foundation’s role is to oversee the development and growth of MariaDB.

MariaDB is based on MySQL and it carries forward much of the current feature set of MySQL. There are some differences that are worth noting. One example is that, as of MariaDB 10.1, the storage format for JavaScript Object Notation (JSON) data is different than the format for MySQL 5.7. If a user wants to replicate JSON data between MariaDB and MySQL, they must convert it to a different format or run statement-based replication jobs via SQL.

MariaDB is provided as open source, as is the MariaDB ColumnStore storage engine, which is designed for use in big data applications. MariaDB also offers a database proxy technology called MaxScale that lets querying be split across multiple MariaDB servers. MaxScale is available under a Business Source License that charges a price for deployments with more than three servers.

MariaDB is one of six database options offered by Amazon Web Services (AWS) Relational Database Service (RDS). The others are Amazon Aurora, MySQL, PostgreSQL, Microsoft SQL Server, and Oracle Database.

Companies that choose to use MariaDB make the decision based on several criteria. Some prefer the idea of working with an organization other than Oracle, and MariaDB offers a viable alternative. MariaDB is often viewed as being more open with its development plans and quicker to respond to changes. MariaDB offers additional storage engines, including the option to use MariaDB ColumnStore within a “standard” database, and that can be appealing to some. MariaDB uses the Galera model to establish its clustering, which has been viewed as being easier to set up and maintain than Oracle’s model. However, with the advent of MySQL group replication, some of those distinctions are becoming blurred. Additionally, with the alternative storage engines, users of MariaDB often see improved performance in their database.

Use Cases

The use cases for MariaDB are very similar to those for MySQL. For companies that are looking to use one or more of the additional storage engines supplied with MariaDB, it may be a better choice. Both databases benefit from a deep knowledge base and a large array of connectors, plug-ins, and accessory tools. One of the primary reasons for creating MariaDB was a dissatisfaction with a large company (Oracle) taking on an open source tool. As MariaDB grows, this becomes less of an issue for them.

MariaDB is the default database in the LAMP stack supplied by SUSE, among others, and in the cloud stacks offered by Pivotal Cloud Foundry and Rackspace, making it an easy choice for some.

Wrap-up

MariaDB and MySQL are two variants on a theme. As MariaDB moves forward, it is likely that it will be less of a drop-in replacement for MySQL due to incompatibilities like the JSON storage concept noted above. Additionally, with the introduction of the Business Source Licensing model for MaxScale, some are concerned that a more restrictive license model may be planned moving forward.

One company may choose MariaDB and another MySQL for many of the same reasons that some people choose to buy a Ford and others a Chevrolet. Both do many of the same things, but there may be a feature that is only available on one and not the other, like the use of the ColumnStore engine in MariaDB, that tips the scales. Corporate preference also can play a role in this decision since Oracle is a much larger organization than MariaDB and MariaDB maintains a sole focus on this product.

MySQL has numerous features that make it a good choice for many needs. The ease of deployment and scalability, plus the extensive knowledge pool, make it a great option for many companies. If you are primarily doing transactional work, it fills that need at a low cost. From a data safety and security standpoint, MySQL can be configured to create a robust and protected data environment; in fact, many large companies rely on it due to its security capabilities. Additionally, MySQL enables an HA environment to be built at a reasonable cost and with a practical amount of effort.

However, MySQL does have its detractors. Developers often find the rigidity of its schema too confining. If you are loading information that contains varying levels of detail into a database, a database with a flexible schema, such as MongoDB, might be a better fit. Additionally, if you work regularly with data that requires a more flexible storage format, such as support for non-standard data types, MySQL may not be ideal.

One of the strongest points of MySQL is its longevity and recognition within the market. There are lots of people who know enough SQL to use this database, and that is both a positive and negative. It is helpful that many people know the language, but you must be careful to work with people who are well versed in the application. This is similar to traveling to a foreign country with only a rudimentary knowledge of the language; you may find that you can order a coffee easily, but you may be perplexed if you have a medical emergency. If you choose to go with MySQL, it is important to choose a team that is well versed in the capabilities and limitations of this database.

Percona Can Help

Managing your organization’s database operations, on-premises or in the cloud, requires in-depth knowledge of potential issues plus diligent, dedicated practice. The type of open source database you use can affect your application performance. When it comes to choosing an open source database, you need to know which one will best serve your applications and services, your environment, and your customers.

Percona Support services are accessible 24x7x365 online or by phone to ensure that your database installation is running optimally. We can also provide onsite or remote Percona Consulting for current or planned projects, or in emergency situations. Every engagement is unique and we will work with you to create the most effective solution for your business. Percona Managed Services can fully manage your existing database infrastructure, whether it is hosted on premise or at a colocation facility, or if you purchase services from a cloud provider or database-as-a-service provider.

To learn more about how Percona can help, and for pricing information, please contact us at +1-888-316-9775 (USA), +44 203 608 6727 (Europe), or email us at [email protected].

Contact Us

Percona can help you maximize your database performance with open source database support, managed services, and consulting services. For more information, contact us at +1-888-316-9775 (USA), +44 203 608 6727 (Europe) or have us reach out to you.