Rethinking the Architecture of the Web
Total Page:16
File Type:pdf, Size:1020Kb
Rethinking the Architecture of the Web A Dissertation Presented by Liang Zhang to The College of Computer and Information Science in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science Northeastern University Boston, Massachusetts July 2016 To my family i Contents List of Figures v List of Tables viii Acknowledgments ix Abstract of the Dissertation x 1 Introduction 1 1.1 Contributions . .5 1.2 Outline . .7 2 Background 8 2.1 Web browsers . .8 2.1.1 Dynamic web pages . .9 2.1.2 Browser plug-ins . 10 2.1.3 HTML5 . 11 2.1.4 Mobile web browsers . 12 2.2 JavaScript . 13 2.2.1 The web and beyond . 13 2.3 Web servers . 14 2.3.1 Web application servers . 15 2.3.2 Privacy . 15 2.4 Cloud services . 16 2.4.1 Content distribution networks (CDNs) . 17 3 Maygh: Rethinking web content distribution 19 3.1 Motivation . 19 3.2 Maygh potential . 21 3.3 Maygh design . 22 3.3.1 Web browser building blocks . 22 3.3.2 Model, interaction, and protocol . 23 3.3.3 Maygh client . 25 3.3.4 Maygh coordinator . 28 ii 3.3.5 Multiple coordinators . 28 3.4 Security, privacy, and impact on users . 30 3.4.1 Security . 30 3.4.2 Privacy . 31 3.4.3 Impact on users . 32 3.4.4 Mobile users . 32 3.5 Implementation . 33 3.6 Evaluation . 33 3.6.1 Client-side microbenchmarks . 34 3.6.2 Coordinator scalability . 36 3.6.3 Trace-based simulation . 38 3.6.4 Small-scale deployment . 43 3.7 Summary . 44 4 Priv.io: Rethinking web applications 45 4.1 Motivation . 45 4.2 Overview . 47 4.2.1 Cost study . 48 4.3 Design . 51 4.3.1 Assumptions . 51 4.3.2 Attribute-based encryption . 52 4.3.3 Priv.io building blocks . 53 4.3.4 Priv.io operations . 54 4.4 Third-party applications . 57 4.4.1 Application API . 57 4.4.2 Managing applications . 59 4.4.3 Security and privacy . 59 4.4.4 Limitations . 60 4.4.5 Demonstration applications . 60 4.5 Discussion . 62 4.6 Evaluation . 64 4.6.1 Microbenchmarks . 66 4.6.2 User-perceived performance . 67 4.6.3 Small-scale deployment . 69 4.7 Summary . 69 5 Picocenter: Rethinking computation 71 5.1 Motivation . 72 5.2 Long-lived mostly-idle applications . 74 5.3 Picocenter Architecture . 75 5.3.1 Interface to cloud tenants . 75 5.3.2 Challenges . 76 5.3.3 Architecture overview . 76 5.3.4 Hub . 77 5.3.5 Worker . 79 iii 5.4 Quickly Swapping in Processes . 80 5.4.1 Swapping out applications . 81 5.4.2 Swapping in applications . 81 5.5 Implementation and Discussion . 84 5.5.1 Implementation . 84 5.5.2 Swapping implementation . 85 5.5.3 Deployment issues . 86 5.6 Evaluation . 88 5.6.1 Evaluation setup . 89 5.6.2 Microbenchmarks . 89 5.6.3 Swapping performance . 90 5.6.4 ActiveSet . 93 5.6.5 Cost . 96 5.7 Summary . 97 6 Related Work 98 6.1 Content distribution . 98 6.1.1 Optimizing CDNs . 98 6.1.2 Non-CDN approaches . 99 6.2 Privacy . 100 6.2.1 Enhancing web browsers . 100 6.2.2 Web services . 100 6.2.3 Decentralized network systems . 102 6.3 Computation . 102 6.3.1 Hardware virtualization . 102 6.3.2 Software environments . 103 6.3.3 Pre-paging and migration . 104 6.3.4 Checkpoint and restore . 104 6.3.5 Code offloading . 104 7 Conclusion 106 7.1 Summary . 106 7.2 Extensions . 108 A Supporting ActiveSet in CRIU 110 Bibliography 112 iv List of Figures 3.1 Overview of how content is delivered in Maygh, implemented in JavaScript (JS). A client requesting content is connected to another client storing the content with the coordinator’s help. The content is transferred directly between clients. 24 3.2 Maygh messages sent when fetching an object in Maygh between two clients (peer- ids pid1 and pid2). pid1 requests a peer storing content-hash obj-hash1, and is given pid2. The two clients then connect directly (with the coordinator’s assistance, using STUN if needed) to transfer the object. 26 3.3 Overview of how multiple coordinators work together on Maygh. The mapping from objects to lists of peers is distributed using consistent hashing, and the coordinator each peer is attached to is also stored in this list. The maximum number of lookups a coordinator must do to satisfy a request is two: one to determine a peer storing the requested item, and another to contact the coordinator that peer is attached to. 29 3.4 Average response time versus transaction rate for a single coordinator. The coordina- tor can support 454 transactions per second with under 15ms latency. ..