Formal Foundations for Provably Safe Web Components
By: Michael Stefan Herzberg
A thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy
The University of Sheffield
Faculty of Engineering
Department of Computer Science
Submission Date: 11th December, 2019

Abstract
One of the cornerstones of modern software development that enables the creation of sophisticated software systems is the concept of reusable software components. The fast-paced and business-driven web ecosystem in particular needs a robust and safe way of reusing components. As it stands, however, the concepts and functions needed to create web components are spread out, immature, and not clearly defined, leaving much room for misunderstandings.
To improve the situation, we need to look at the core of web browsers: the Document Object Model (DOM). It represents the state of a website with which users and client-side code (JavaScript) interact. This central position makes the DOM the most critical part of a web browser with respect to safety and security, so we need to understand exactly what it does and which guarantees it provides. A well-established approach for this kind of highly critical system is to apply formal methods to mathematically prove certain properties.
In this thesis, we provide a formal analysis of web components based on shadow roots, highlight their shortcomings by proving them unsafe in many circumstances, and make suggestions that provably improve their safety. In more detail, we build a formalisation of the Core DOM in Isabelle/HOL into which we introduce shadow roots. Then, we extract novel properties and invariants that improve on the often implicit assumptions of the standard. We show that the model complies with the standard by symbolically evaluating all relevant test cases from the official compliance suite successfully on our model.
We introduce novel definitions of web components and their safety and classify the most important DOM API methods accordingly, thereby uncovering surprising behavior and shortcomings. Finally, we propose changes to the DOM standard by altering our model and proving that the safety of many DOM API methods improves while leading to a less ambiguous API.

Declaration
I, the author, confirm that the Thesis is my own work. I am aware of the University’s Guidance on the Use of Unfair Means (www.sheffield.ac.uk/ssid/unfair-means). This work has not previously been presented for an award at this, or any other, university.

Publications arising from this thesis:
1. Achim D. Brucker and Michael Herzberg. ‘A Formal Semantics of the Core DOM in Isabelle/HOL’. In: Companion Proceedings of the The Web Conference 2018. WWW ’18. Lyon, France: International World Wide Web Conferences Steering Committee, 2018, pp. 741–749. isbn: 978-1-4503-5640-4. doi: 10.1145/3184558.3185980
2. Achim D. Brucker and Michael Herzberg. ‘Formalizing (Web) Standards - An Application of Test and Proof’. In: Tests and Proofs - 12th International Conference, TAP 2018, Held as Part of STAF 2018, Toulouse, France, June 27-29, 2018, Proceedings. Ed. by Catherine Dubois and Burkhart Wolff. Vol. 10889. Lecture Notes in Computer Science. Springer, 2018, pp. 159–166. doi: 10.1007/978-3-319-92994-1_9
3. Achim D. Brucker and Michael Herzberg. ‘A Formally Verified Model of Web Components’. In: Formal Aspects of Component Software - 16th International Conference on Formal Aspects of Component Software, 23-25 October 2019, Amsterdam, Proceedings. 2019
4. Achim D. Brucker and Michael Herzberg. ‘The Core DOM’. In: Archive of Formal Proofs (2018). https://www.isa-afp.org/entries/Core_DOM.html, Formal proof development. A snapshot taken at thesis submission time can be found under [21]. issn: 2150-914x
5. Achim D. Brucker and Michael Herzberg. ‘Shadow DOM: A Formal Model of the Document Object Model with Shadow Roots’.
In: Archive of Formal Proofs (28th Sept. 2020). Formal proof development. Submitted. A snapshot taken at thesis submission time can be found under [21]. issn: 2150-914x
6. Achim D. Brucker and Michael Herzberg. ‘A Formalization of Web Components’. In: Archive of Formal Proofs (28th Sept. 2020). Formal proof development. Submitted. A snapshot taken at thesis submission time can be found under [21]. issn: 2150-914x
7. Achim D. Brucker and Michael Herzberg. ‘The Safely Composable DOM’. In: Archive of Formal Proofs (28th Sept. 2020). Formal proof development. Submitted. A snapshot taken at thesis submission time can be found under [21]. issn: 2150-914x
8. Achim D. Brucker and Michael Herzberg. ‘Shadow SC DOM: A Formal Model of the Safely Composable Document Object Model with Shadow Roots’. In: Archive of Formal Proofs (28th Sept. 2020). Formal proof development. Submitted. A snapshot taken at thesis submission time can be found under [21]. issn: 2150-914x
9. Achim D. Brucker and Michael Herzberg. ‘A Formalization of Safely Composable Web Components’. In: Archive of Formal Proofs (28th Sept. 2020). Formal proof development. Submitted. A snapshot taken at thesis submission time can be found under [21]. issn: 2150-914x

Acknowledgements
I would like to express my sincere gratitude to:
• My supervisor Achim Brucker, who spent countless hours introducing me to Isabelle/HOL and the world of academia, and who never grew tired of yet another round of revising drafts.
• My thesis panel, Anthony Simons and Eleni Vasilaki, who gave valuable suggestions throughout my PhD and helped keep me on track.
• The Isabelle group at the University of Sheffield, who always gave valuable technical and moral support and broadened my understanding of the many uses of the tool.
• The Department of Computer Science of the University of Sheffield, whose scholarship supported me throughout my research and enabled me to travel to conferences.
• All the anonymous reviewers of my conference papers, who gave helpful feedback.
• All my friends and family, who provided all kinds of support and encouragement.

Contents
1 Introduction
  1.1 Motivation
  1.2 Contributions
  1.3 Thesis Structure
  1.4 Externally Reviewed Parts
2 Background
  2.1 Isabelle/HOL
    2.1.1 Basic Definitions
    2.1.2 Extending Logics
    2.1.3 Monads
    2.1.4 Locales
    2.1.5 Code Generator
  2.2 Web Standards
    2.2.1 Document Object Model (DOM)
    2.2.2 Hypertext Markup Language (HTML) and JavaScript
3 A Formal Core DOM: fDOM
  3.1 The Core DOM Data Model
    3.1.1 Classes as Datatypes
    3.1.2 Pointer datatypes
  3.2 Defining Operations and Queries on Node-Trees
  3.3 Well-Formedness of the DOM Heap
    3.3.1 Node Sharing
    3.3.2 The Owner Document
    3.3.3 Restricting DOMs to Trees
    3.3.4 Pointer Validity
    3.3.5 Heaps are Strongly Typed
    3.3.6 Well-Formed Heaps
  3.4 Reasoning over the DOM
    3.4.1 Locality of Heap Modifications
    3.4.2 Exceptions and Non-Termination
    3.4.3 Functional Correctness
    3.4.4 Well-Formedness of the Heap Methods
4 A DOM with Shadow Roots
  4.1 Motivating Example
  4.2 Data Model and Basic Accessors
  4.3 Graph Relations
  4.4 Heap Invariants
  4.5 Interfacing Shadow Trees: Slotting
  4.6 Flattening Shadow Trees
5 Compliance to the Standard
  5.1 Formal Software Standards
  5.2 Selecting Test Cases
  5.3 Translating Test Cases
  5.4 Beyond Compliance: Enhancing the Specification
6 A New Notion of Web Components
  6.1 A Formal Definition of Web Components
    6.1.1 Definition
    6.1.2 Different Kinds of DOM-Components
    6.1.3 Properties of DOM-Components
  6.2 Definition of DOM-Component Safety
    6.2.1 Strong DOM-Component Safety
    6.2.2 Weak DOM-Component Safety
  6.3 Component Safety Classification of the DOM Methods
    6.3.1 Getters and Simple Recursive DOM Methods
    6.3.2 Constructors
    6.3.3 Shadow Root Methods
    6.3.4 Heap-Modifying and Global Methods
7 Beyond the Standard: Safe Web Components
  7.1 Making the DOM Safely Composable
    7.1.1 Data Model
    7.1.2 Heap Invariants
    7.1.3 Methods
  7.2 A New Kind of Component
  7.3 General Properties of SCDOM-Components
  7.4 Updated DOM Method Classification
8 Related Work
  8.1 Formal Models of Software Standards
  8.2 Program Verification
  8.3 Formal Approaches to Object-Orientation
  8.4 Iframes and Shadow Roots
  8.5 Components in General
9 Conclusion and Future Work
Bibliography

List of Figures
2.1 A Simple DOM Instance
3.1 Alternative, Purely Functional Data Model
3.2 Core DOM IDL Specification
3.3 Type Hierarchy
3.4 Data Model Attribute Example
3.5 Example of a Formal DOM Instance
3.6 Formal Types of the DOM API
3.7 Description of Pre-Insertion Validity
3.8 InsertBefore Example
3.9 Disconnected DOM Trees
3.10 Formalisation of Ensuring Pre-Insertion Validity
4.1 Running Example: Fancy Tab
4.2 Two Variants of Fancy Tab
4.3 Shadow Roots Restrict Node Iterators
4.4 Internal DOM Structure of Shadow Roots
4.5 Updated Data Model with Shadow Roots
4.6 Visualisation of Slotting
5.1 Test and Proof for Software Standards
5.2 Translating a sub-test from the W3C test suite into HOL.