Migrating Web Applications from SQL to NoSQL Databases
MIGRATING WEB APPLICATIONS FROM SQL TO NOSQL DATABASES

by

RAHMA S. AL MAHRUQI

A thesis submitted to the School of Computing in conformity with the requirements for the degree of Doctor of Philosophy

Queen’s University
Kingston, Ontario, Canada
January 2020

Copyright © Rahma S. Al Mahruqi, 2020

Abstract

In the Big Data era, the volume of data is growing dramatically and the structure of data is becoming increasingly flexible. Non-relational databases, such as the NoSQL class of databases, play a major role as an enabling technology for manipulating such data. Non-relational databases have overcome many of the limitations of relational databases, especially those related to enabling systems to scale up to serve more customers producing terabytes of data. More businesses are willing to migrate their legacy relational database systems to ones that use NoSQL databases.

In this thesis, we present a semi-automated approach to migrate highly dynamic SQL-based web applications to ones that use document-oriented NoSQL databases such as MongoDB, using source analysis and transformation techniques. We outline a set of source transformation steps that can be used to migrate an existing web application's database from an SQL database to a document-oriented NoSQL database. We demonstrate our semi-automated framework on the analysis and migration of three existing web applications: it extracts, classifies, translates, migrates and optimizes the queries, and migrates the PHP code to interact with the migrated database.

There are two parts to this approach: the migration of schema and data, and the migration of the actual application code with embedded queries. Our approach provides contributions to the second part, migrating and optimizing the embedded SQL queries to interact with the new database system and changing the application code to use the translated queries.

Co-Authorship

The published paper resulting from this thesis has been co-authored with my supervisors, Dr. Thomas R. Dean and Dr. Manar H. Alalfi. I am the primary author of the paper.

Rahma S. Al Mahruqi, Manar H. Alalfi, and Thomas R. Dean, "A semi-automated framework for migrating web applications from SQL to document oriented NoSQL database," CASCON ’19: Proceedings of the 29th Annual International Conference on Computer Science and Software Engineering, Toronto, November 2019, Pages 44–53.

Acknowledgments

First and foremost, I thank God (Allah) the Most Gracious and Merciful for blessing me with guidance, strength, patience and perseverance to complete this work.

I would like to express my gratitude to all those who made it possible for me to complete this thesis. I want to thank my supervisors, Prof. Thomas R. Dean and Prof. Manar H. Alalfi, for their advice, support, and encouragement. They always provided valuable direction and feedback, and gave me an important lesson on academic research. It was a great pleasure for me to conduct this thesis under their supervision.

I would also like to thank the members of my oral defense committee, Dr. Ettore Merlo, Dr. James Cordy, Dr. Michael Greenspan, and Dr. Hossam Hassanein, for their time and insightful questions and comments.

My appreciation and gratitude go to all my colleagues and to the friendly members of the School of Computing who have helped in one way or another along the way. In particular, I would like to thank Marwa Afifi, Mariam AlMazour and Ayesha Baber for their sincere friendship.

As well, I would like to acknowledge the School of Computing at Queen’s University for providing me with the opportunity and environment for pursuing my research and studies. Special appreciation is due to Mrs. Debby Robertson, our graduate assistant. Thank you for always being there and for your endless encouragement. Special thanks go to Dr. Wendy Powley for advocating for and inspiring women in computing.
To the soul of my mother and father, who did everything they could for me and helped me to achieve my dreams; I am sure that if they were here, they would have been so proud of their daughter.

I further wish to thank my family and friends for their support and encouragement. I also appreciate the support and encouragement my husband, Omar, has given me over the past several years. He was instrumental in my taking this leap in the beginning and for the past 20 years has done nothing but encourage me in whatever goals I set for myself.

Finally, I would like to thank my lovely kids, Hafsah, Sara, Abdullah and Maeen, my source of enthusiasm, for always encouraging me and telling me that it is never too late for learning. Thank you for your understanding and patience when I was away from you. I thank my brother Ahmed, my role model; my youngest brother Mohamed; my sisters Azza, Nasra, Fatma and Raheema; and my father-in-law and mother-in-law, uncles, aunts, and cousins for their continuous prayers and support.

I thank my home country, Oman, and Sultan Qaboos University and the Ministry of Higher Education for supporting me in pursuing my degree and for funding my studies.

Statement of Originality

I, Rahma S. Al Mahruqi, certify that the research work presented in this thesis is my own and was conducted under the supervision of Dr. Thomas R. Dean and Dr. Manar H. Alalfi. All references to the work of other people are properly cited.

Contents

Abstract
Co-Authorship
Acknowledgments
Statement of Originality
Contents
List of Tables
List of Figures

Chapter 1: Introduction
  1.1 Problem Statement
  1.2 Motivation
  1.3 Research Statement
  1.4 Objectives
  1.5 Contributions
  1.6 Organization of the Thesis

Chapter 2: Background
  2.1 Introduction
  2.2 Databases
  2.3 Database Management Systems
  2.4 Data Storage Evolution in Web Applications
  2.5 The Relational Model
  2.6 NoSQL Databases
    2.6.1 Classification of NoSQL Systems
    2.6.2 Characteristics of NoSQL Systems
    2.6.3 NoSQL Use Cases
    2.6.4 Challenges for NoSQL Adoption
    2.6.5 MongoDB
    2.6.6 NoSQL Usage Statistics
    2.6.7 Web Application Migration to NoSQL
  2.7 The Need for Automation
    2.7.1 Source Transformation
  2.8 Query and Application Migration Challenges
  2.9 Related Work
    2.9.1 Schema and Data Migration
  2.10 Query and Application Migration
    2.10.1 Research-based Tools
    2.10.2 Industry-based Products
  2.11 Conclusion

Chapter 3: Proposed Migration Approach
  3.1 Introduction
  3.2 Proposed Migration Approach
  3.3 Conclusion

Chapter 4: Schema and Data Migration
  4.1 Introduction
  4.2 Data Model Design
  4.3 Designing The Schema
    4.3.1 Schema Conversion
    4.3.2 Data Migration
  4.4 Migration Strategy
  4.5 Data Migration
    4.5.1 Data Migration Tools Overview
  4.6 Schema and Data Migration Experiments
    4.6.1 PHPBB3 Schema and Data Migration
  4.7 WordPress Schema and Data Migration
    4.7.1 WordPress Schema Conversion
  4.8 Data Migration Considerations
  4.9 Data Migration Tools
    4.9.1 Data Migration Tools Evaluation
    4.9.2 Data Migration Tools Comparison
  4.10 Conclusion

Chapter 5: Manual Migration
  5.1 Introduction
  5.2 PHPBB Code Migration
  5.3 SCARF Manual Migration
  5.4 Generic Code Migration Cases
  5.5 Manual Migration Caveats
  5.6 Conclusion

Chapter 6: Query Migration and Optimization
  6.1 Introduction
  6.2 Query Extraction Phase
  6.3 Query Classification Phase
  6.4 Query Translation and Migration Phase
    6.4.1 Naive Queries Translation
    6.4.2 Migration Cases Examples
    6.4.3 Dual Queries Translation
    6.4.4 SCARF Migration Issues
    6.4.5 PHPBB Migration Issues
  6.5 Query Migration Evaluation
    6.5.1 Query Evaluation Before Indexing
    6.5.2 Query Evaluation After Indexing
  6.6 Query Optimization Phase
    6.6.1 Query Optimization Techniques
    6.6.2 Filter Option Optimization
    6.6.3 Table Order Optimization
    6.6.4 Query Optimization Evaluations
  6.7 Conclusion

Chapter 7: Application Migration
  7.1 Introduction
  7.2 Application Migration Overview
    7.2.1 Application Migration Approach
    7.2.2 Application Migration Steps
  7.3 Phase 1 - Backward Tracing Search
    7.3.1 Step 1: Identify and change calls to MySQL
    7.3.2 Step 2: Pre-process source files to gather global variables
    7.3.3 Step 3: Find functions used to launch queries
    7.3.4 Step 4: Search SQL statements using backward tracing from calls to MySQL
    7.3.5 Step 5: Identify the prototype of SQL statements
  7.4 Phase 2 - Migration of SQL sentences
    7.4.1 Step 1: Query classification
    7.4.2 Step 2: Migrate query by pattern matching
  7.5 Simple Query Translation Example
  7.6 Backward Tracing Translation
    7.6.1 Backward Tracing Overview
    7.6.2 SCARF Use Cases Examples
    7.6.3 PHPBB3 Backward Tracing Examples