International Conference on Applied Information and Communication Technologies (AICT2013), 25.-26. April, 2013, Jelgava, Latvia

Different solutions of MySQL in the cloud – security and possibilities

Zigmunds Bulins, Vjaceslavs Sitikovs

Institute of Applied Software, Riga Technical University, Meza 1/3, Riga, LV-1048, Latvia [email protected], [email protected]

Abstract: Cloud computing is a good way to raise the productivity of an offered service without investing in new infrastructure, personnel training or software acquisition. This technology expands the potential of existing information systems. In recent years cloud computing has grown from a good business concept into one of the most in-demand industries in information technology. The paper contains a short review of different providers that use MySQL as a basis. Technical nuances, potential problems and risks related to the migration of existing MySQL databases to the new environment are reviewed. We try to assess the actual possibilities of the new platform and compare cloud DBaaS (database as a service) solutions that are implemented with the MySQL database management system, which is widely used on the web.

Keywords: MySQL, DBaaS, Google Cloud SQL, ClearDB.

Introduction

Cloud computing is a dynamic method of increasing the productivity or capabilities of a service without investing in new infrastructure, personnel training or software licensing. This expands the possibilities of existing information systems. In recent years cloud computing has grown from a good business concept into one of the most quickly developing industries in information technology (Chandra, Mondal, 2011). In the last two years there has been a lot of activity around cloud databases: several companies, among them Google and Xeround (Xeround, 2012a), announced DBaaS offerings based on the MySQL database. We will review three of them: Google Cloud SQL (Google, 2012b), Xeround and ClearDB (ClearDB, 2012).

The Google Cloud SQL

Google Cloud SQL is a web service that allows creating, configuring and using relational databases with App Engine (Google, 2012a) applications. It is a fully managed service that maintains and administers the databases, allowing developers to concentrate on implementing applications and the necessary services. By offering the functionality of the MySQL database, the service makes it easy to move data, applications and services into and out of the cloud. This increases data mobility and shortens time to market, because an existing database can be scaled quickly. To guarantee availability for critical applications and services, Google Cloud SQL replicates data across different geographical areas.
Main features of the Google Cloud SQL service are (Google, 2012b):
• Ease of use: a rich graphical user interface allows creating, configuring, managing and monitoring the database instances;
• Fully managed: there is no need to worry about replication, patch management or other database management chores, since all these tasks are handled by the "cloud";
• Highly available: to meet the critical availability needs of today's applications and services, features such as replication across multiple geographic regions are built in, so the service remains available even if a datacenter becomes unavailable;
• Integrated with Google App Engine and other Google services: this makes it possible to work across multiple products easily, get more value from the data, move the data into and out of the "cloud", and get better performance.

If we compare Google Cloud SQL with other similar services on the market, such as Amazon EC2 (Amazon, 2013), Windows Azure (Microsoft, 2013) and Xeround, two models of the DBaaS service can be distinguished:
• Virtual images of configured database instances that run on virtualized hardware;
• A distributed and automatically managed database that is not linked to a specific location in the cloud.
With Google Cloud SQL we get the second option.

Limitation of the Google Cloud SQL

Google Cloud SQL is the MySQL DBMS placed in a "cloud". It provides the functions offered by the MySQL DBMS, but with several limitations. The main restrictions of the cloud service are (Google, 2012b):
• The size of a single database instance is limited to 10 gigabytes;
• User-defined functions (UDF) are not supported;
• Replication cannot be configured or set up by the user;
• File-based statements and functions are blocked (such as LOAD DATA INFILE, LOAD_FILE() etc.).

Because the size of an instance is limited, the service is suitable mainly for applications of small and medium-sized businesses (as stated in the Google Cloud SQL service description). The restriction particularly affects multimedia and similar applications that store binary data in the database and therefore, as a rule, use a lot of disk space. The absence of user-configurable replication is a minor shortcoming, as Google Cloud SQL itself implements this mechanism and monitors its correct operation.

One more restriction, which is not documented, is the absence of federated table support. If a migrated application already uses this mechanism, its architecture has to be altered before it is transferred to the Google Cloud SQL platform. Most likely the new platform automatically solves the problem for which remote tables were used. If not, another solution has to be found within the new environment.

A further technical restriction of the cloud DBMS is that the platform is intended for applications with a low to medium intensity of data writes (Google, 2012b). Write-intensive applications will not work efficiently in the Google Cloud SQL environment, because the replication engine is involved in processing the changes.

Access control for a DBMS instance has a new layer, the Google API project layer. Knowing the DBMS credentials is not enough to gain access to a database; it is also necessary to have access to the project within which the DBMS instance was created. The project and access to it are managed through the Google API Console (Google, 2012d) in a Google account.

Available tools for the service interaction

Unlike a stand-alone MySQL DBMS, Google Cloud SQL does not allow connecting to the database directly from an arbitrary computer. Connection is carried out either by means of a web browser (Fig. 1) or by using the special Command Line Tool program (Google, 2012c) (Fig. 2). In addition, it is possible to use the SQuirreL SQL application (Universal, 2012), which actually uses the aforementioned command line tool to execute commands on Google Cloud SQL. In order to connect to a Google Cloud SQL database from a given computer, it is necessary to generate an access key in the settings of the service account and submit it to the tool on first use. This mechanism allows the service customer to control access to database instances.

In this context the threats relate to the service provider rather than to the Google Cloud SQL user, but that does not free the user from potential risks. The question of trust arises because the data resides in an environment that belongs to and is supervised by third parties, whose actions cannot be tracked. This is not acceptable for public government institutions or for systems with confidential data: such systems would have to change the way data is stored in the DBMS in order to use the Google Cloud SQL environment.

Fig. 1. Web browser access.

Fig. 2. Command line tool.
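
The restrictions listed above for Google Cloud SQL can be checked against an existing application before a migration is attempted. The following is a minimal sketch of such a pre-check, not a tool provided by Google: the function name `check_migration`, the regular expressions and the sample statements are our own illustrative assumptions, and the 10 gigabyte limit is interpreted here as 10 * 1024^3 bytes.

```python
import re

# Size limit of a single instance as stated in the text (10 gigabytes);
# reading it as 10 * 1024**3 bytes is an assumption of this sketch.
MAX_INSTANCE_BYTES = 10 * 1024**3

# Patterns for the MySQL features named in the text as unavailable.
# They are illustrative only and do not form a complete SQL parser.
UNSUPPORTED_PATTERNS = {
    "file-based statement (LOAD DATA INFILE)": re.compile(
        r"\bLOAD\s+DATA\b.*\bINFILE\b", re.I | re.S
    ),
    "file-based function (LOAD_FILE)": re.compile(r"\bLOAD_FILE\s*\(", re.I),
    "user-defined function (CREATE FUNCTION ... SONAME)": re.compile(
        r"\bCREATE\s+(AGGREGATE\s+)?FUNCTION\b.*\bSONAME\b", re.I | re.S
    ),
    "federated table (ENGINE=FEDERATED)": re.compile(
        r"\bENGINE\s*=\s*FEDERATED\b", re.I
    ),
}


def check_migration(statements: list[str], data_size_bytes: int) -> list[str]:
    """Return human-readable findings that may block a migration."""
    findings = []
    if data_size_bytes > MAX_INSTANCE_BYTES:
        findings.append("database size exceeds the 10 GB instance limit")
    for number, statement in enumerate(statements, start=1):
        for label, pattern in UNSUPPORTED_PATTERNS.items():
            if pattern.search(statement):
                findings.append(f"statement {number}: {label}")
    return findings


# Hypothetical statements collected from an existing application.
sample = [
    "SELECT id, name FROM customers",
    "LOAD DATA INFILE '/tmp/import.csv' INTO TABLE customers",
    "CREATE TABLE remote_orders (id INT) ENGINE=FEDERATED",
]
findings = check_migration(sample, data_size_bytes=12 * 1024**3)
for finding in findings:
    print(finding)
```

For the sample input the check reports the size violation, the file-based statement and the federated table. In practice the statements could come from a schema dump and from the application's query log, which is why the function accepts a plain list of strings.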


Xeround

In contrast to Google Cloud SQL, which uses an almost unchanged MySQL version 5.5 (no modifications were made; some features were merely disabled), Xeround is built on the MySQL storage engine architecture and acts as a pluggable storage engine (Fig. 3). Relying on this architecture and on support for the MySQL query language, Xeround's patented storage engine can seamlessly replace an existing MySQL database (Xeround, 2012b).

Fig. 3. Xeround structure (Xeround, 2012b).

Xeround's two-tier architecture comprises Access Nodes and Data Nodes. Data Nodes are responsible for storing the data, while Access Nodes receive application requests, communicate with Data Nodes, perform computations and deliver request results. Xeround stores data in virtual partitions that are not bound to the underlying hardware infrastructure. Each partition is replicated to different Data Nodes located on separate servers, providing high availability and full resiliency (Xeround, 2012b).

The Xeround cloud database service is based on MySQL server version 5.1 (as of January 2013, observed with our test account). This is not widely advertised on the company's website, but for some customers it can be an important issue, as MySQL 5.5 brings many improvements in SQL syntax, the optimizer and other parts of the server (MySQL, 2012). The main advantage over Google Cloud SQL is that the Xeround database is not vendor locked: it can run on any cloud platform and any stack (Xeround, 2012b). It is possible to connect directly to the Xeround cloud database using any tool available on the market, so, in contrast to Google Cloud SQL with its own tools, there is nothing specific to describe.

ClearDB

Another available cloud database as a service is ClearDB, which can be used on different cloud platforms such as Heroku (Heroku, 2012) or Amazon (Amazon, 2013). ClearDB is similar to the Xeround solution; however, it does not introduce a new storage engine but instead adds scaling and durability to unchanged MySQL functionality. ClearDB creates "multi-master" and "multi-master with multi-replica" MySQL configurations in the geographical regions that are important to the customer, providing applications with a fully redundant solution that can survive outages, network failures and even natural disasters (ClearDB, 2012).
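
In a multi-master configuration every master must generate auto-increment keys that cannot collide with keys generated on another master. One well-known way to achieve this is offset seeding. The sketch below is a toy model of that idea which follows the semantics of MySQL's `auto_increment_increment` and `auto_increment_offset` system variables; the class `Master` and the two-master setup are hypothetical and do not describe how ClearDB actually configures its servers.

```python
from itertools import count


class Master:
    """Toy model of one MySQL master that hands out auto-increment keys.

    increment: step between keys, normally the number of masters
               (corresponds to auto_increment_increment)
    offset:    this master's starting point, 1-based
               (corresponds to auto_increment_offset)
    """

    def __init__(self, increment: int, offset: int) -> None:
        if not 1 <= offset <= increment:
            raise ValueError("offset must be between 1 and increment")
        # Keys form the series offset, offset + increment, offset + 2 * increment, ...
        self._keys = count(start=offset, step=increment)

    def next_key(self) -> int:
        return next(self._keys)


# Hypothetical two-master setup: master A takes odd keys, master B even keys.
master_a = Master(increment=2, offset=1)
master_b = Master(increment=2, offset=2)

keys_a = [master_a.next_key() for _ in range(4)]
keys_b = [master_b.next_key() for _ in range(4)]
print(keys_a)  # [1, 3, 5, 7]
print(keys_b)  # [2, 4, 6, 8]
```

Because the two series never intersect, rows inserted on either master can be replicated to the other one without primary key conflicts.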
ClearDB uses a combination of advanced replication techniques, advanced cluster technology and layered web services to provide a MySQL database that is "smarter" than usual. It also uses mixed binary replication logging and auto-increment offset seeding, so that it is possible to continue using MySQL's non-deterministic and time-based functions such as UUID() and NOW(), as well as auto-increment keys in the tables (ClearDB, 2012). As with Xeround, it is possible to connect to a ClearDB database using any available tool. There are also a few reports on the internet of users who switched from Xeround DBaaS to ClearDB because of query execution issues (Stack Overflow, 2012), which can be a positive sign for ClearDB.

Technological risks in the cloud

With the growing number of users and companies that store data in "clouds", questions about security come up more and more frequently (Subashini, Kavitha, 2011). Despite all the activity around cloud computing, business customers are still reluctant to place their systems in cloud environments. Security is the main reason holding back the rapid development of the cloud computing market (Marston et al., 2011). Questions of data privacy and data protection also continue to influence the cloud computing market (Subashini, Kavitha, 2011; Mansfield-Devine, 2008). A recent IDC survey shows that 74% of technical directors and IT managers named security as the main challenge that keeps them from moving existing systems to the cloud service model (Subashini, Kavitha, 2011). When a database is migrated to the Google Cloud SQL environment, users face new security (Fig. 4) and trust challenges. The security challenges are generally related to the new elements of the infrastructure and the platform.

Fig. 4. Threats by location.

Cloud computing threats can be grouped into 5 main classes (Fig. 5), as described below.

Functional Threats of Cloud Components

This type of attack is associated with the multiple layers of the "cloud". The main security principle is that the overall level of security is determined by the security of the weakest element (Subashini, Kavitha, 2011). Thus, a denial-of-service (DoS) attack on a proxy server placed in front of the cloud will block access to the whole "cloud", even though everything works smoothly inside it. Similarly, an SQL injection that occurs on the application server will provide access to the data storage systems, regardless of the access rules in the data storage layer (Korzhov, 2010).
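
The SQL injection case can be illustrated with a short sketch. It uses Python's built-in `sqlite3` module with an in-memory database purely as a stand-in for a cloud MySQL instance, which is an assumption made to keep the example self-contained; the table, the data and the function names are hypothetical. The point is that the flaw sits in the application layer, so the access rules of the storage layer do not help.

```python
import sqlite3

# In-memory SQLite stands in for the cloud database (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def find_user_unsafe(name: str) -> list:
    # Vulnerable: the input is pasted into the SQL text and can change the query.
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(name: str) -> list:
    # Parameterized: the driver passes the input strictly as a value.
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()


malicious = "nobody' OR '1'='1"
leaked = find_user_unsafe(malicious)  # injected condition matches every row
blocked = find_user_safe(malicious)   # no user has this literal name
print(len(leaked), len(blocked))      # 2 0
```

The same principle applies to MySQL client libraries, although the placeholder syntax differs between drivers (many Python MySQL drivers use `%s` instead of `?`).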

Fig. 5. Simple threats classification.

Attacks on a Client

These types of attacks were developed in the web environment, but they are just as relevant in cloud environments, because users connect to the cloud through a web browser. They include Cross Site Scripting (XSS), DoS attacks, interception of web sessions, password theft, "man in the middle" attacks and others. The traditional protection against these attacks is strong authentication over an encrypted connection with mutual authentication. However, not all creators of "clouds" can afford such expensive and often inconvenient means of protection (Chonka et al., 2011).

Virtualization Threats and Attacks on a Hypervisor

In IT, a hypervisor is also called a "virtual machine manager". Since the platform for the cloud elements is usually a virtual environment, an attack on virtualization threatens the entire cloud (Lombardi, Di Pietro, 2011). This type of attack is unique to cloud computing (Rosenthal et al., 2010). At the moment only a few real attacks on hypervisors are known, but the number of such attacks may rise in the future (Rutkowska, Tereshkin, 2008).

Threat of a "Cloud" Complexity

Monitoring and managing events in the "cloud" is also a security issue. How can we ensure that all resources are accounted for, and that there is no rogue virtual machine that runs third-party processes or interferes with the mutual configuration of the layers and elements of the "cloud" (Kritsonis, 2011)? This type of threat is associated with handling the cloud as a whole and with searching for fraud and other irregularities in the "cloud" structure, which can lead to unnecessary expenditure on maintaining the "health" of the information system (Korzhov, 2010). The level of this type of threat is the highest, and it is assumed that no universal remedy can protect against it: an individual protection system must be built for each "cloud".
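
The question of whether all resources are accounted for can be illustrated with a simple inventory reconciliation. The sketch below is only an illustration under assumed inputs: the function `reconcile`, the identifiers and the two sets (what the management system believes it has created, and what is observed running) are hypothetical, and a real "cloud" would obtain them from its own management and monitoring interfaces.

```python
def reconcile(expected: set[str], observed: set[str]) -> dict[str, set[str]]:
    """Compare the registered VM inventory with what is actually running."""
    return {
        # Running but never registered: candidates for rogue machines.
        "rogue": observed - expected,
        # Registered but not running: possibly blocked or lost machines.
        "missing": expected - observed,
    }


# Hypothetical inventory and monitoring snapshot.
expected_vms = {"vm-db-1", "vm-db-2", "vm-app-1"}
observed_vms = {"vm-db-1", "vm-app-1", "vm-unknown-7"}

report = reconcile(expected_vms, observed_vms)
print(sorted(report["rogue"]))    # ['vm-unknown-7']
print(sorted(report["missing"]))  # ['vm-db-2']
```

Such a comparison only detects discrepancies between two views of the system; it does not replace the individual protection system that each "cloud" requires.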


Attacks on Management Systems

The large number of virtual machines used in "clouds", especially in public clouds, requires a management system that can reliably control the creation, transfer and disposal of virtual machines. Interference with the management system can lead to ghost virtual machines, the blocking of some machines, and the substitution of rogue elements or layers in the cloud. All this allows an attacker to gain access to the data in the "cloud" or to gain control over part of the "cloud" or the whole of it (Paquette et al., 2010).

The comparison

To summarize the short reviews of the DBaaS solutions, we present a comparison table containing several criteria (Table 1).

Table 1
Simple comparison of the DBaaS.

Solution                                          Google Cloud SQL                          Xeround  ClearDB
Base MySQL version                                5.5                                       5.1      5.5
Uses replication                                  Yes                                       No       No
Implemented as a storage engine                   No                                        Yes      No
SSL support                                       Yes (only platform internal connections)  Yes      Yes
Direct external connections                       No                                        Yes      Yes
Is tied to a certain platform                     Yes, Google                               No       No
Can be easily moved to another provider/platform  No                                        Yes      No

Conclusion and future work

With the release of the new Google Cloud SQL service, which provides a cloud version of the MySQL 5.5 DBMS, and of other DBaaS offerings such as Xeround and ClearDB, many new possibilities appear for migrating existing systems to the new architecture with rather small losses during the transition. When analysing the possibility of moving to a cloud database model, it is necessary to consider the risk factors described in this article and to ensure that the existing system fits within the restrictions of the service and will not exceed them soon. Moving to a cloud database gives the system high scalability in terms of accessibility, as the service provider distributes the data between many geographical locations, which accelerates data access for end users.
There are different cloud databases implemented on the basis of the MySQL server distribution, but each of them uses a different technique to support the cloud platform. With Google, we are tied to a certain platform and a single provider, and face some technical limitations (described in this article). For those searching for a simple solution, which means just moving the data to a cloud database without any changes, ClearDB is the right choice, as it provides the original and latest MySQL 5.5 version in the cloud. The Xeround solution differs technically from ClearDB (as described in the sections above), but it offers the possibility of moving data from one cloud provider to another, which can matter if the geographical location of the data is important for the company or application (it is host agnostic, so the data can easily be moved at any time).

Acknowledgements

This work was funded by the ERDF (ERAF) project No. 2011/008/2DP/2.1.1.1.0/10/APIA/VIAA/018 "Development of Insurance Distributed Software Based on Intelligent Agents, Modelling, and Web Technologies", and we would like to acknowledge that support.

References

Amazon, 2013. Amazon Elastic Compute Cloud (Amazon EC2). Available: http://aws.amazon.com/ec2/. Last accessed 20 Jan 2013.
Chandra, S.M., Mondal, A., 2011. Identification of a company's suitability for the adoption of cloud computing and modeling its corresponding Return on Investment. In Mathematical and Computer Modeling. 53 (3-4), 504–521.
Chonka, A., Xiang, Y., Zhou, W., Bonti, A., 2011. Cloud security defence to protect cloud computing against HTTP-DoS and XML-DoS attacks. In Journal of Network and Computer Applications. 34 (4), 1097–1107.
ClearDB, 2012. ClearDB – The Safer, Reliable MySQL Cloud Database For Your Applications. Available: http://www.cleardb.com/better.view. Last accessed 31 Jan 2013.
Google, 2012a. Google App Engine. Available: https://developers.google.com/appengine/. Last accessed 20 Jan 2013.

Google, 2012b. Google Cloud SQL. Available: https://developers.google.com/cloud-sql/. Last accessed 20 Jan 2013.
Google, 2012c. Command Line Tool. Available: https://developers.google.com/cloud-sql/docs/commandline. Last accessed 20 Jan 2013.
Google, 2012d. Dashboard. Available: https://code.google.com/apis/console. Last accessed 20 Jan 2012.
Heroku, 2012. Heroku - Cloud Application Platform. Available: http://www.heroku.com/. Last accessed 31 Jan 2013.
Korzhov, V., 2010. Clouds: Legends and Myths (in Russian). Available: http://www.anti-malware.ru/node/2333. Last accessed 20 Jan 2013.
Kritsonis, T., 2011. Security Risks in the Cloud – Reality, or A Broken Record? In Infosecurity. 8 (1), 20–23.
Lombardi, F., Di Pietro, R., 2011. Secure virtualization for cloud computing. In Journal of Network and Computer Applications. 34 (4), 1113–1122.
Mansfield-Devine, S., 2008. Cloud Security: Danger in the clouds. In Network Security. 2008 (12), 9–11.
Marston, S., Li, Z., Bandyopadhyay, S., Zhang, J., Ghalsasi, A., 2011. Cloud computing – The business perspective. In Decision Support Systems. 51 (1), 176–189.
Microsoft, 2013. A rock-solid cloud platform for blue-sky thinking. Available: http://www.windowsazure.com/en-us/. Last accessed 20 Jan 2013.
MySQL, 2012. MySQL 5.5 Reference Manual: What Is New in MySQL 5.5. Available: http://dev.mysql.com/doc/refman/5.5/en/mysql-nutshell.html. Last accessed 20 Jan 2013.
Paquette, S., Jaeger, P.T., Wilson, S.C., 2010. Identifying the security risks associated with governmental use of cloud computing. In Government Information Quarterly. 27 (3), 245–253.
Rosenthal, A., Mork, P., Hao Li, M., Stanford, J., Koester, D., Reynolds, P., 2010. Cloud computing: A new business paradigm for biomedical information sharing. In Journal of Biomedical Informatics. 43 (1), 342–353.
Rutkowska, J., Tereshkin, A., 2008. Bluepilling the Xen Hypervisor. Available: http://invisiblethingslab.com/resources/bh08/part3.pdf. Last accessed 20 Jan 2013.
Stack Overflow, 2012. heroku – Xeround vs ClearDB hosted MySQL pros and cons?. Available: http://stackoverflow.com/questions/10099239/xeround-vs-cleardb-hosted-mysql-pros-and-cons. Last accessed 31 Jan 2013.
Subashini, S., Kavitha, V., 2011. A survey on security issues in service delivery models of cloud computing. In Journal of Network and Computer Applications. 34 (1), 1–11.
Universal, 2012. Squirrel SQL Client Home. Available: http://squirrel-sql.sourceforge.net/. Last accessed 20 Jan 2013.
Xeround, 2012a. DBaaS – 'Worry-Free' Cloud Database for MySQL Applications. Available: http://xeround.com/mysql-cloud-db-overview/dbaas/. Last accessed 20 Jan 2013.
Xeround, 2012b. MySQL scalability and high availability - cloud DB from Xeround. Available: http://xeround.com/mysql-cloud-db-overview/. Last accessed 20 Jan 2013.
