Differential Fuzzing the WebAssembly
Master’s Programme in Security and Cloud Computing
Differential Fuzzing the WebAssembly
(French title: Fuzzing Différentiel le WebAssembly)
Master’s Thesis
Gilang Mentari Hamidy
Aalto University - EURECOM, 2020

This thesis is a public document and does not contain any confidential information.

Thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Technology.
Antibes, 27 July 2020
Supervisor: Prof. Davide Balzarotti, EURECOM
Co-Supervisor: Prof. Jan-Erik Ekberg, Aalto University

Copyright © 2020 Gilang Mentari Hamidy

Abstract

Author: Gilang Mentari Hamidy
Title: Differential Fuzzing the WebAssembly
School: School of Science
Degree programme: Master of Science
Major: Security and Cloud Computing (SECCLO)
Code: SCI3084
Supervisors: Prof. Davide Balzarotti, EURECOM; Prof. Jan-Erik Ekberg, Aalto University
Level: Master’s thesis
Date: 27 July 2020
Pages: 133
Language: English

WebAssembly, colloquially known as Wasm, is a specification for an intermediate representation that is suitable for the web environment, particularly on the client side. It provides a machine abstraction and a hardware-agnostic instruction set, so that a high-level programming language can be compiled to Wasm instead of to a specific hardware architecture. The JavaScript engine implements the Wasm specification and recompiles Wasm instructions into the instructions of the machine on which the program is executed. Technically, Wasm is similar to popular virtual machine bytecode formats, such as those of the Java Virtual Machine (JVM) or the Microsoft Intermediate Language (MSIL). There are two major implementations of Wasm, associated with the two most popular web browsers on the market.
These are the V8 engine from the Chromium project and the SpiderMonkey engine from Mozilla. Wasm does not mandate a specific implementation of its specification. Therefore, the two engines may employ different mechanisms to implement it. These differences raise a research question: do both engines implement the Wasm specification equivalently?

In this thesis, we explore the internal implementation of the JavaScript engines with regard to the Wasm specification. We experimented using a differential fuzzing technique, in which we test two JavaScript engines with a randomly generated Wasm program and compare their behavior. We ran the experiment to identify anomalous behavior, which we then analyzed to identify the root cause of each difference.

This thesis covers the WebAssembly specification extensively. It discusses several foundational aspects of the specification for which references are currently lacking. This thesis also presents the instrumentation added to the JavaScript engines to perform the experiment, which can serve as a foundation for similar experiments. Finally, this thesis analyzes the anomaly found in the experiment through reverse engineering techniques, such as static and dynamic analysis, combined with white-box analysis of the JavaScript engine source code.

In this experiment, we discovered a behavioral difference between the JavaScript engines that is observable from the perspective of a Wasm program. We created a proof of concept that demonstrates this difference and can be executed in recent web browsers as of the writing of this thesis. This experiment also evaluated how both JavaScript engines implement the Wasm specification and concluded that both engines implement it faithfully.
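The differential fuzzing loop described in this abstract can be illustrated with a minimal sketch. It is not the instrumentation used in the thesis: the two "engines" are toy interpreters for a made-up accumulator machine rather than V8 and SpiderMonkey, the programs are not real Wasm, and every name (`generate_program`, `make_engine`, `differential_fuzz`) is hypothetical. One engine is given a deliberately planted difference so that the comparison has something to find.

```python
import random
from typing import Callable, List, Tuple

Instr = Tuple[str, int]
Program = List[Instr]
Engine = Callable[[Program], str]
MASK = 0xFFFFFFFF  # keep the accumulator within 32 bits


def generate_program(rng: random.Random, length: int = 6) -> Program:
    """Generate a random program for a toy accumulator machine (not real Wasm)."""
    return [(rng.choice(["add", "mul", "div"]), rng.randint(0, 3))
            for _ in range(length)]


def make_engine(trap_on_div_zero: bool) -> Engine:
    """Build a toy engine; the flag plants a deliberate behavioral difference."""
    def run(program: Program) -> str:
        acc = 1
        for op, arg in program:
            if op == "add":
                acc = (acc + arg) & MASK
            elif op == "mul":
                acc = (acc * arg) & MASK
            else:  # op == "div"
                if arg != 0:
                    acc //= arg
                elif trap_on_div_zero:
                    return "trap"  # this engine aborts on division by zero
                else:
                    acc = 0        # the other engine silently yields zero
        return str(acc)
    return run


def differential_fuzz(engine_a: Engine, engine_b: Engine,
                      iterations: int, seed: int) -> List[Tuple[Program, str, str]]:
    """Run the same random programs on both engines and collect disagreements."""
    rng = random.Random(seed)
    anomalies = []
    for _ in range(iterations):
        program = generate_program(rng)
        out_a, out_b = engine_a(program), engine_b(program)
        if out_a != out_b:
            anomalies.append((program, out_a, out_b))
    return anomalies
```

Each collected anomaly pairs the offending program with both observed outputs, which is the starting point for the kind of root-cause analysis the abstract mentions. In this toy setting, every anomaly necessarily contains a division by zero, because that is the only planted difference.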
Keywords: webassembly, fuzzing, c++, compiler, testing, programming language, static analysis, dynamic analysis

Abstract

Author: Gilang Mentari Hamidy
Title: Fuzzing Différentiel le WebAssembly
Study programme: Double Master’s Degree
Track: Security and Cloud Computing (SECCLO)
Academic supervisors: Prof. Davide Balzarotti, EURECOM; Prof. Jan-Erik Ekberg, Aalto University
Category: Master’s thesis
Date: 27 July 2020
Pages: 133
Language: English

WebAssembly, colloquially known as Wasm, is a specification for an intermediate representation that is suited to the web environment, particularly on the client side. It provides a machine abstraction and a hardware-independent instruction set, so that a high-level programming language can be compiled to Wasm instead of to a specific hardware architecture. The JavaScript engine implements the Wasm specification and recompiles Wasm instructions into the instructions of the target machine on which the program is executed. Technically, Wasm is similar to popular virtual machine bytecode formats, such as those of the Java Virtual Machine (JVM) or the Microsoft Intermediate Language (MSIL). There are two major implementations of Wasm, associated with the two most popular web browsers. These are V8 from Chromium and SpiderMonkey from Mozilla. Wasm does not require a specific implementation of its specification. Consequently, the two engines may use different mechanisms to implement the specification. These differing implementations raise a research question: do both engines implement the Wasm specification equivalently? In this thesis, we explore the internal implementation of the JavaScript engines with regard to the Wasm specification.
We experimented using a differential fuzzing technique, in which we test two JavaScript engines with a randomly generated Wasm program and compare their behavior. We ran the experiment to identify any anomalous behavior, then analyzed it and identified the root cause of the differing behaviors. This thesis covers the WebAssembly specification extensively. It addresses several foundational aspects of the specification for which references are currently lacking. This thesis also presents the instrumentation added to the JavaScript engines to carry out the experiment, which can serve as a basis for a similar experiment. Finally, this thesis analyzes the anomaly identified in the experiment through reverse engineering techniques, such as static and dynamic analysis, combined with white-box analysis of the JavaScript engine source code. In this experiment, we discovered a behavioral difference between the JavaScript engines that is observable from the point of view of a Wasm program. We created a proof of concept that demonstrates this difference and can be executed in recent web browsers as of the writing of this thesis. This experiment also evaluated how both JavaScript engines implement the Wasm specification and concluded that both engines implement it faithfully.

Keywords: webassembly, fuzzing, c++, compiler, testing, programming language, dynamic analysis, static analysis

I dedicate this best work of mine to my beloved mother and my dear father, to Naini, Rosma, Ema, Anizar, Abuzar, Adit, Faris, Aziz, Tawfiq, Dimas, Joan, Icha, Ina, Neyga, Rima, Azza, Ahda, Iid, Levin, Dinda, Ajeng, Leo, Upiek, and to all my family.

And also dedicated to my best friend Enrico Budianto, with whom I always wished I could still have long, endless discussions about our shared interests.
Let this be a prayer that I can continue what he left. And to all my teachers.

Acknowledgement

The education of the author was supported by the European Union through the Erasmus Mundus Joint Master’s Degree (EMJMD) Master’s programme in Security and Cloud Computing (SECCLO) for the intake year 2018 (Project Reference: 586541-EPP-1-2017-1-FI-EPPKA1-JMD-MOB). The education of the author was conducted at Aalto University, Finland, and EURECOM, France. This thesis work was supported through an internship in the Software and System Security (S3) Group at EURECOM, France, from February 2020 to July 2020.

Foreword

I give all praise to Allah that I could complete this master’s thesis successfully. The last two years in Finland and France have been a marvelous journey. Before, I did not imagine that I would study computer security. I always thought that this field was challenging, and I did not dare to work in it. Yet, getting admitted to the Security and Cloud Computing (SECCLO) program was a sign that people are confident in my ability. I learned many new things in this program, from the mathematical foundations of cryptography to binary program exploitation. Without enrolling in SECCLO, I would not have been able to learn all of that by myself.

I met people who share the same interests as me. We did many things together. We met every week to study cryptography together. We traveled together for our summer school. And we got to enjoy a small farewell dinner before our paths diverged. I want to thank the whole SECCLO 2018 batch for the time we shared over the last two years. I extend my gratitude to my classmate Christian Yudhistira for being a very nice roommate in France. I also thank Eljon Harlicaj, a SECCLO 2019 student, for the many random conversations we had over the past year.

I got a fantastic opportunity to work on a research assignment at Aalto University.
I finally got my chance to get my hands dirty in working with compilers, and I mastered the compiler more than I could learn in