
School of Computing and Communications

A Modeling Language for Multi-tenant Data Architecture Evolution in Cloud Applications

Assylbek Sagitzhanuly Jumagaliyev
M.Sc., International Information Technologies University, 2014
B.Sc., Suleyman Demirel University, 2011

Thesis submitted for the degree of Doctor of Philosophy
29 April 2019

Abstract

Multi-tenancy enables efficient resource utilization by sharing application resources across multiple customers (i.e., tenants). Hence, applications built using this pattern can be offered at a lower price and reduce maintenance effort, as fewer application instances and supporting cloud resources must be maintained. These properties encourage cloud application providers to adopt multi-tenancy in their existing applications, yet introducing this pattern requires significant changes in the application structure to address multi-tenancy requirements such as isolation of tenants, extensibility of the application, and scalability of the solution. In cloud applications, the data layer is often the prime candidate for multi-tenancy, and it usually comprises a combination of different cloud storage solutions such as blob storage, relational and non-relational databases. These storage types are conceptually and tangibly divergent, each requiring its own partitioning schemes to meet multi-tenancy requirements. Currently, multi-tenant data architectures are implemented using manual coding methods, at times following guidance and patterns offered by cloud providers. However, such a manual implementation approach tends to be time consuming and error prone. Several modeling methods based on Model-Driven Engineering (MDE) and Software Product Line Engineering (SPLE) have been proposed to capture multi-tenancy in cloud applications. These methods mainly generate cloud deployment configurations from an application model, though they do not automate implementation or evolution of applications.
This thesis aims to facilitate development of multi-tenant cloud data architectures using model-driven engineering techniques. This is achieved by designing and implementing a novel modeling language, CadaML, that provides concepts and notations to model multi-tenant cloud data architectures in an abstract way. CadaML also provides a set of tools to validate the data architecture and automatically produce corresponding data access layer code. The thesis demonstrates the feasibility of the modeling language in a practical setting, and the adequacy of the multi-tenancy implementation in the generated code, on an industrial business process analysis application. Moreover, the modeling language is empirically compared against manual implementation methods to inspect its effect on developer productivity, development effort, reliability of the application code, and usability of the language. These outcomes provide a strong argument that the CadaML modeling language effectively mitigates the high overhead of manual implementation of multi-tenant cloud data layers, significantly reducing the required development complexity and time.

Declaration

I declare that the work in this thesis is my own work and has not been submitted either in whole or in part for the award of a higher degree elsewhere. Any sections of the thesis which have been published are clearly identified.

Thesis originally submitted on 29 April 2019
Recommended corrections submitted on 19 June 2019

Assylbek Sagitzhanuly Jumagaliyev

©Copyright by Assylbek Sagitzhanuly Jumagaliyev. All rights reserved.

Acknowledgements

I would like to express my sincere gratitude to Yehia Elkhatib for his great patience, profound expertise, inspirational encouragement, generous professional support, and perceptive advice throughout my PhD. He has successfully guided me through the long and sometimes hard roads of PhD life.
I have learnt so much from him and I feel lucky to have him as my advisor, mentor and friend. Besides my supervisor, I would like to thank my examiners, Francois Taiani and Sumi Helal, for their objective feedback, interesting suggestions, and the constructive discussions we had during my viva.

My sincere thanks also go to my former supervisor, Jon Whittle, for his support and help at the beginning of my PhD, and for his contribution to my initial development as a researcher.

I thank my wonderful colleagues and fellow officemates who have been helpful and supportive. Specifically, a big thank you to Abdessalam Elhabbash for reading my thesis and providing constructive feedback on some chapters. In addition, I am grateful to Zheng Wang, Rotsos Charalampos, Amit Chopra, Pete Sawyer, Gerald Kotonya, Jaejoon Lee, Ethem Bagci, Faiza Samreen, William Simm, Debbie Stubbs, Claire Oulton, Alex Powers, and Umar Armaya’U.

Special thanks go to my friends in Lancaster who have made my life enjoyable and unforgettable. I would like to thank Shahin, Afaf, Huyem, Aiman, Pang, Pear, Boyeun, Victoria, Peng, Hamed and Hunter.

Last but not least, I would like to thank my parents, Sholpan and Sagitzhan, for their encouragement and prayers. I would also like to thank my brothers, Askhat and Bakytzhan, and sisters, Shynar, Gaukhar, and Raushan, for their support and presence.

To my parents, Sholpan and Sagitzhan, along with brothers and sisters who are always supportive in all my endeavors.

Contents

Abstract
Declaration
Acknowledgments
Table of Contents
List of Acronyms
List of Figures
List of Tables
1 Introduction
1.1 Cloud Computing
1.1.1 Cloud Service Models
1.1.2 Multi-tenancy in SaaS
1.1.3 Domain-Specific Languages (DSLs)
1.2 Motivation
1.3 Challenges
1.4 Problem Statement
1.5 Research Aim & Objectives
1.6 Research Methodology
1.7 Contributions
1.8 Thesis Organization
1.9 Publications
2 Related Work
2.1 Overview
2.2 Manual Approaches
2.2.1 Schema-extension Techniques
2.2.2 Metadata-Driven Techniques
2.2.3 Multi-tenancy Enablement Layer
2.3 MDE Based Approaches
2.3.1 Unified Modeling Language Extensions
2.3.2 Domain-Specific Languages
2.4 SPLE Based Approaches
2.4.1 Feature Modeling Based Techniques
2.4.2 Orthogonal Variability Modeling Language (OVM) Based Techniques
2.5 Hybrid & Other Modeling Approaches
2.6 Summary
3 Proposed Solution
3.1 Requirements
3.1.1 Concepts and Terminology Requirements
3.1.2 Meta-modeling Language Requirements
3.2 Methodology
3.3 Domain Analysis
3.3.1 Cloud Data Storage Types
3.3.2 Data Architecture Partitioning Schemes
3.3.3 Current Cloud Application Modeling Languages
3.3.4 Domain Analysis Output
3.4 Design
3.4.1 Language Exploitation versus Language Invention
3.4.2 Design Specification
3.4.3 CadaML Meta-model
3.5 Implementation
3.5.1 Graphical Domain-Specific Language (DSL) Implementation Frameworks and Tools
3.5.2 Comparing Graphical DSL Frameworks
3.5.3 Graphical Editor for CadaML
3.5.4 Validation Rules and Constraints
3.5.5 Code Generation
3.6 Reflection on Requirements
3.7 Summary
4 Application & Qualitative Evaluation of CadaML
4.1 Evolving from Single- to Multi-tenancy
4.1.1 Manual Evolution
4.1.2 Modeling using CadaML
4.2 Industrial Case Study: Background
4.2.1 Overview
4.2.2 Evolution Motivation & Challenges
4.3 Industrial Case Study: Implementation
4.3.1 Comparing the Data Partitioning Schemes
4.3.2 Evolving the Data Architecture
4.3.3 Evolving the Application Architecture
4.4 Qualitative Evaluation
4.4.1 Evaluation Methodology
4.4.2 Evaluation of the Feasibility
4.4.3 Evaluation of the Adequacy of Multi-tenancy Implementation
4.5 Discussion
4.5.1 Reflection on Challenges
4.5.2 Limitations
4.6 Summary
5 Experimental Evaluation
5.1 Evolving the Application
5.2 Experimental Design
5.2.1 Experiment Procedure
5.2.2 Participant Recruitment
5.2.3 Experimental Task
5.2.4 Exit Interview Questions
5.3 Participant Expertise
5.4 Participation Allocation
5.5 Modeling in CadaML
5.6 Evaluation of Productivity
5.7 Error Analysis
5.8 Exit Interview Results
5.8.1 Productivity
5.8.2 Quality of the Generated Code
5.8.3 Usability of CadaML
5.9 Discussion
5.9.1 Ease of Exploitation
5.9.2 Productivity of Developers
5.9.3 Code Generator
5.10 Threats to Validity
5.10.1 Construct Validity
5.10.2 Internal Validity
5.10.3 External Validity
5.10.4 Conclusion Validity
5.11 Summary
6 Discussions & Conclusions
6.1 Thesis Summary
6.2 Contributions
6.3 Reflection on Research Objectives
6.4 Limitations
6.5 Discussion
6.6 Future Work
Bibliography
A Comparing Data Storage Services of Cloud Providers
B Comparing Data Storage APIs of Cloud Providers
B.1 Relational Databases
B.1.1 Alibaba ApsaraDB
B.1.2 Amazon Relational Database Service (RDS)
B.1.3 Azure Structured Query Language (SQL) Databases
B.1.4 Creating connection and interacting with a database using Java Database Connectivity (JDBC) Application