2.4 Database Replication
Master’s Thesis

Optimizing team collaboration web applications through local storage and synchronization

Dennis Andersen & Andreas Nilsson

Department of Computer Science
Faculty of Engineering LTH
Lund University, 2013

ISSN 1650-2884
LU-CS-EX: 2013-34

September 9, 2013

Master's thesis work carried out at RefinedWiki AB.

Supervisor: Björn Johnsson
Examiner: Per Andersson

Abstract

Team collaboration software is a great way for people to share their work. But often the need arises for accessing the work while on the go using limited network connections. The ongoing development in the area of HTML5 and web applications has opened up many new possibilities, especially for storing data locally.

In this thesis we extensively test and compare these new solutions to find the best alternative. Along with testing and comparing techniques for database replication and synchronization we hope to combine the two, enhancing the user experience by optimizing the team collaboration web application.

Keywords: Database, replication, HTML5, local storage, optimization

Acknowledgements

We would like to thank the people over at RefinedWiki for providing us with a great working environment, along with feedback and support. We would also like to thank our supervisor Björn Johnsson for proofreading this report and giving us feedback.

Contents

1 Introduction
    1.1 Goals
    1.2 Outline
    1.3 Division of labor
2 Background
    2.1 Team collaboration
        2.1.1 Web clients
        2.1.2 Atlassian Confluence
    2.2 HTML5 overview
        2.2.1 The HTML5 specification
        2.2.2 HTML5 performance features
    2.3 Client-side storage
        2.3.1 Before HTML5
        2.3.2 HTML5 Storage solutions
        2.3.3 Web Storage
        2.3.4 Web SQL Database
        2.3.5 Indexed Database API
        2.3.6 FileSystem API
    2.4 Database replication
        2.4.1 Database Replication Terms
        2.4.2 Master-Slave and Group
        2.4.3 Lazy and Eager replication
        2.4.4 Partial and Full replication
        2.4.5 Conflicts
        2.4.6 Combining characteristics
        2.4.7 Two-tier replication
        2.4.8 Transaction-Level Result-Set Propagation
        2.4.9 Three-tier replication
3 Defining the model
    3.1 Client-side storage
        3.1.1 Our requirements
        3.1.2 Method
        3.1.3 Implementation and results
    3.2 Database Replication
        3.2.1 Our requirements
        3.2.2 Method
        3.2.3 Results
4 Implementation
    4.1 Overview of functionality
    4.2 Confluence Architecture
    4.3 Implementation Design
        4.3.1 Server
        4.3.2 REST-API
        4.3.3 Client
5 Result
    5.1 Use cases
        5.1.1 Use case 1
        5.1.2 Use case 2
        5.1.3 Use case 3
        5.1.4 Use case 4
    5.2 Result of use cases
6 Conclusion & Future work
    6.1 Future Work
Appendix A Test resources
    A.1 Reading values of increasing size
    A.2 Writing values with increasing size
    A.3 Fetching entries from storage
    A.4 Loading a page with storage of increasing size

Chapter 1

Introduction

Team collaboration software is an important tool for companies and organizations to manage large projects. Market reports show that the team collaboration and web conference software market is growing each year [1]. Such software often provides a web interface that allows project members to collaborate from anywhere in the world and at any time through their computers or mobile phones [2, 5]. However, since the software requires an internet connection to work, performance suffers immediately from response times and from the fact that all the relevant data is stored on a remote server. With the new HTML5 standard being developed along with new Application Programming Interface specifications for web applications, there are new ways to store persistent data inside the browser. The use of team collaboration software in large companies can be essential, as shown by the many thousands of companies using solutions such as Atlassian Confluence [4].

The problem with implementing a local storage solution in team collaboration software, as opposed to other types of software, is that the data is shared among many users, meaning the data can quickly become outdated.
Compare this to software for microblogging, bulletin boards or mail, where each user is responsible for their own data. There it is far easier to synchronize, edit locally and upload changes, since the changes mostly affect only the user making them. In a team collaboration environment, the data you are editing might be edited simultaneously by several other users.

1.1 Goals

The purpose of this thesis is to combine state-of-the-art browser technologies for storing data locally with an efficient database replication algorithm, modified for our needs. In doing so, we hope to construct a powerful web application capable of efficiently using a web-based team collaboration system with minimal load on the server. Our goals are to achieve the following:

• Response times should be reduced to half of their original values.
• Several users should be able to use and edit the same content, which requires investigating and implementing conflict handling.
• Users should be able to synchronize and store a large number of data entries (at least 200) from the software.

To achieve this, extensive testing should be conducted to find the best solution. When evaluating the client-side storage, the following aspects are of interest:

Performance: The overall performance should be good. Reading, writing and searching the local data should be done efficiently.
Size capability: Supporting the storage of large amounts of data is important, as a large team collaboration instance can contain thousands of entries.
Browser support: The solution should be able to run on as many browsers and versions as possible in order to cover as many users as possible.

With regard to synchronizing and maintaining the database, the following aspects are of importance:

Partial database: The solution should be able to synchronize specified parts of the database as opposed to the entire database.
Conflicts: When synchronizing, the solution should be able to detect and handle conflicts.
Synchronization: Since performance is important, the number of synchronizations should be as low as possible while still meeting the two criteria above.
Extra data: The additional amount of data needed locally in order to maintain the database and its synchronization should be kept as low as possible, prioritizing the collaboration data.

1.2 Outline

The thesis is structured according to the following outline:

Chapter 2 serves as an introduction to the relevant theory and technology explored and tested. The chapter is meant to offer the information needed to understand the problems and solutions presented in this thesis.
Chapter 3 studies the relevant solutions in greater detail by testing them according to our requirements and specifications. They are measured against each other and compared, in order to derive the best approach to take when implementing a prototype solution.
Chapter 4 presents the implementation and structure of the prototype. An overview of the functionality and design is presented and the different components are described.
Chapter 5 presents several use cases that are used to test the implemented prototype. The results are presented and compared to the original instance.
Chapter 6 evaluates the results presented previously. The implementation is analyzed against the goals set up and how well it performed. Furthermore, the implementation is evaluated for remaining flaws or deficiencies and how it could be improved upon in the future.

1.3 Division of labor

Both authors have worked alongside each other at RefinedWiki, and as such both have been involved in all aspects of the thesis. Coding