"Automated Security Analysis of Web Application Technologies"
Total Page:16
File Type:pdf, Size:1020Kb
Saarland University Faculty of Mathematics and Computer Science Department of Computer Science Automated Security Analysis of Web Application Technologies Malte Horst Arthur Skoruppa Dissertation zur Erlangung des Grades des Doktors der Ingenieurwissenschaften der Fakultät für Mathematik und Informatik der Universität des Saarlandes Saarbrücken, 2017 ii Tag des Kolloquiums: 14. Dezember 2017 Dekan: Prof. Dr. Frank-Olaf Schreyer Prüfungsausschuss Vorsitzender: Prof. Dr. Christian Rossow Berichterstattende: Prof. Dr. Michael Backes Prof. Dr. Andreas Zeller Akademischer Mitarbeiter: Dr. Robert Künnemann iii Zusammenfassung Das Web hat sich zu einem komplexen Netz aus hochinteraktiven Seiten und Anwendungen entwickelt, welches wir täglich zu kommerziellen und sozialen Zwecken einsetzen. Dementsprechend ist die Sicherheit von Webanwendungen von höchster Relevanz. Das automatisierte Auffinden von Sicherheitslücken ist ein anspruchsvolles, aber wichtiges Forschungsgebiet mit dem Ziel, Entwickler zu unterstützen und das Web sicherer zu machen. In dieser Arbeit nutzen wir statische Analysemethoden, um automatisiert Lücken in JavaScript- und PHP-Programmen zu entdecken. JavaScript ist clientseitig die wichtigste Sprache des Webs, während PHP auf der Serverseite am weitesten verbreitet ist. Im ersten Teil nutzen wir eine Reihe von Programmtransformationen und Informationsflussanalyse, um den JavaScript Helios Wahl-Client zu untersuchen. Helios ist ein modernes Wahlsystem, welches auf konzeptueller Ebene eingehend analysiert wurde und dessen Implementierung als sehr sicher gilt. Wir enthüllen zwei schwere und bis dato unentdeckte Sicherheitslücken. Im zweiten Teil präsentieren wir ein Framework, das es Entwicklern er- möglicht, PHP Code auf frei modellierbare Schwachstellen zu untersuchen. Zu diesem Zweck konstruieren wir sogenannte Code-Property-Graphen und im- portieren diese anschließend in eine Graphdatenbank. Schwachstellen können nun als geeignete Datenbankanfragen formuliert werden. Wir zeigen, wie wir herkömmliche Schwachstellen modellieren können und evaluieren unser Frame- work in einer groß angelegten Studie, in der wir hunderte Sicherheitslücken identifizieren. v Abstract The Web today is a complex universe of pages and applications teeming with interactive content that we use for commercial and social purposes. Accordingly, the security of Web applications has become a concern of utmost importance. Devising automated methods to help developers to spot security flaws and thereby make the Web safer is a challenging but vital area of research. In this thesis, we leverage static analysis methods to automatically discover vulnerabilities in programs written in JavaScript or PHP. While JavaScript is the number one language fueling the client-side logic of virtually every Web application, PHP is the most widespread language on the server side. In the first part, we use a series of program transformations and information flow analysis to examine the JavaScript Helios voting client. Helios is a state- of-the-art voting system that has been exhaustively analyzed by the security community on a conceptual level and whose implementation is claimed to be highly secure. We expose two severe and so far undiscovered vulnerabilities. In the second part, we present a framework allowing developers to analyze PHP code for vulnerabilities that can be freely modeled. To do so, we build so- called code property graphs for PHP and import them into a graph database. Vulnerabilities can then be modeled as appropriate database queries. We show how to model common vulnerabilities and evaluate our framework in a large-scale study, spotting hundreds of vulnerabilities. vii Acknowledgments First of all, I wish to express my deep gratitude to my advisor Michael Backes. He has created this amazing working place and given me the opportunity to work in the midst of this warm and welcoming group, which has been continu- ously flourishing over the past years. Despite the growth of the group, Michael never lost sight of any of his students and always tried to bring out the best of each and everyone of us. It was a pleasure learning from him, on a scientific level and beyond. Thank you Michael. I would also like to thank all my former and current colleagues for creating the enjoyable and inspiring working atmosphere inherent to this group, which I feel very fortunate to have had the chance to be a part of. Particular thanks go to Matthias Berg who introduced me to the group. Had it not been for the work with Matthias during my studies, I would probably not stand here. Certainly, Bettina Balthasar also deserves a special thank you for her long- lasting dedication and for taking care of all the bits and pieces vital to the proper workings of the group, always in her delightfully cheerful manner. During my time as a PhD student, I have had the chance to work with brilliant people and I would like to thank all of my co-authors for our fruitful collaborations. In particular, I thank David Pfaff, Ben Stock, and Fabian Yamaguchi for numerous captivating discussions during my research that have been of tremendous help for the successful elaboration of our papers. Additionally, I would like to thank Andreas Zeller, Christian Rossow, and Robert Künnemann for agreeing to be part of my thesis committee. Andreas, as my second referee; Christian, as the committee chair; and Robert, in his words, as my protocol droid. Thank you all! I also wish to extend my gratitude to my friends for their refreshing company and for helping to take my mind off work. Last, but not least, I deeply thank my wonderful family, who has stood behind me throughout all these years with their unconditional love and faith in me. Your support is invaluable. Thank you for everything. Contents 1 Introduction1 2 Web Application Vulnerabilities5 2.1 General Overview.........................6 2.2 Vulnerabilities Threatening the Server.............8 2.2.1 SQL Injections......................9 2.2.2 Command Injections................... 10 2.2.3 Code Injections...................... 12 2.2.4 Arbitrary File Accesses.................. 13 2.3 Vulnerabilities Threatening the Client.............. 15 2.3.1 Cross-Site Scripting (XSS)................ 16 2.3.2 Session Fixation...................... 18 2.3.3 Sensitive Data Exposure................. 20 2.4 Characteristics of Vulnerable Code............... 22 3 Program Representations 25 3.1 Syntactical Representations................... 27 3.1.1 Parse Trees........................ 27 3.1.2 Abstract Syntax Trees.................. 29 3.2 Modeling Control Flow...................... 33 3.2.1 Control Flow Graphs................... 33 3.2.2 Dominator and Post-dominator Trees.......... 36 3.3 Statement Dependencies for Intraprocedural Analysis..... 39 3.3.1 Control Dependencies................... 39 3.3.2 Data Dependencies.................... 40 3.3.3 Program Dependence Graphs.............. 42 3.4 Call Graphs for Interprocedural Analysis............ 43 3.5 Discussion............................. 44 4 Information Flow Analysis of JavaScript 47 4.1 Challenges in the Analysis of the JavaScript Helios Client... 51 4.1.1 Reification of security properties............. 51 4.1.2 JavaScript is highly dynamic............... 52 4.1.3 HTML DOM and browser API............. 52 4.1.4 Included libraries..................... 53 4.1.5 Problem summary.................... 54 4.2 The Helios Voting System.................... 54 x Contents 4.2.1 A short review of the Helios protocol.......... 55 4.2.2 The Helios voting booth................. 56 4.3 Implementation-level Analysis.................. 59 4.3.1 Code transformations................... 60 4.3.2 A unified model for HTML and JavaScript components 65 4.3.3 Intermediate representation............... 67 4.3.4 System dependence graphs................ 68 4.3.5 Slicing........................... 69 4.3.6 Confidentiality and integrity analysis using information flow............................ 70 4.4 Vulnerabilities........................... 72 4.4.1 Arbitrary script execution................ 73 4.4.2 Leaking the vote..................... 76 4.5 Discussion and Takeaways.................... 77 4.6 Related Work........................... 79 4.6.1 Conceptual Attacks on Helios.............. 79 4.6.2 Practical Attacks on Helios............... 81 4.6.3 Static Analysis of JavaScript............... 81 4.7 Summary............................. 83 5 A Framework for Static PHP Code Analysis 85 5.1 Code Property Graphs...................... 89 5.2 Methodology........................... 92 5.2.1 Conceptual Overview................... 92 5.2.2 Graph Traversals..................... 96 5.2.3 Modeling Vulnerabilities................. 98 5.3 Implementation.......................... 105 5.4 Evaluation............................. 106 5.4.1 Dataset.......................... 106 5.4.2 Findings.......................... 108 5.5 Discussion and Limitations.................... 114 5.6 Related Work........................... 116 5.6.1 Discovery of Vulnerabilities in PHP Code........ 116 5.6.2 Flaw Detection Using Query Languages and Graphs.. 118 5.7 Summary............................. 119 6 Conclusion 121 A Refactoring JavaScript Libraries for the Helios Voting Booth125 A.1 Canonical Refactorings...................... 125 A.2 Modeling jQuery Ajax Functions................ 127 Contents xi A.2.1 $.get............................ 127 A.2.2 $.getJSON......................... 128 A.2.3 $.post........................... 129 A.3 Query Object Plugin....................... 130 A.4 Modeling the