Lehrstuhl Für Sicherheit in Der Informatik Data-Only Malware
Department of Informatics, Technische Universität München
Lehrstuhl für Sicherheit in der Informatik (Chair for IT Security)

Data-only Malware

Sebastian Wolfgang Vogl

Complete copy of the dissertation approved by the Department of Informatics of the Technische Universität München for the award of the academic degree of Doktor der Naturwissenschaften (Dr. rer. nat.).

Chairman: Univ.-Prof. Dr. Uwe Baumgarten
Examiners of the dissertation:
1. Univ.-Prof. Dr. Claudia Eckert
2. Univ.-Prof. Dr. Thorsten Holz, Ruhr-Universität Bochum

The dissertation was submitted to the Technische Universität München on 09.02.2015 and accepted by the Department of Informatics on 02.07.2015.

Acknowledgements

Over the past years, I have received support and encouragement from many smart and amazing people. I want to seize this opportunity to express my sincere appreciation and gratitude to all of them.

First and foremost, I would like to extend my thanks to my advisor and supervisor, Prof. Dr. Claudia Eckert, for providing me with the opportunity to write this thesis and for her outstanding mentoring during this time. Her unwavering support, continuous encouragement, and guidance greatly helped my research and this dissertation. Similarly, I want to thank my second advisor, Prof. Dr. Thorsten Holz, for his assistance, valuable advice, and crucial contribution to my research, which advanced and improved my thesis substantially. Additionally, I am very grateful to Prof. Dr. Michael Gerndt and Prof. Dr. Jonathon Giffin for giving me the opportunity to study at Georgia Tech and for their support, encouragement, and guidance throughout the process.

Next, I would like to thank my former and current colleagues at the IT security research groups in Munich and Bochum for the interesting discussions, the collaboration, the support of my work, the excellent atmosphere, and the pleasant evenings: Dr. Christian Schneider, Dr. Jonas Pfoh, Thomas Kittel, George Webster, Tamas Lengyel, Fatih Kilic, Julian Kirsch, Robert Gawlik, and Behrad Garmany.

I am also grateful to the great people I met during my studies in Munich and in Atlanta for making studying an unforgettable experience and expanding my horizons, most importantly Tobias Röhm, Sepp Tremmel, Markus Graßl, Ferdinand Beyer, Felix Weninger, Bulli Bertolotti, Paolo Manenti, and Peter Ligeiro. My thanks also go to the extraordinary students who contributed to my projects: Lorenz Panny, Christian von Pentz, and Jonas Jelten. Thanks, too, to my closest friends for always being there for me, for their general awesomeness, and for helping me to keep my sanity: Felix Römisch, Thomas Zirngibl, Philip Lembcke, Alexander Lehmann, Melanie Lehmann, Felix Abele, and Dominik Zaun.

Last but not least, I would like to thank my family. My parents, who opened up this path for me with their unlimited support, encouragement, and love. My sister, for always looking out for her little brother. My grandparents, for enriching my life, introducing me to Bud Spencer, and helping me to put things in perspective. Ronnie, for his meticulous help with grammar and text comprehension. Family Meier, for their support, understanding, and encouragement. Anna and Markus, for becoming part of my life. And my love, Elisabeth, for being you, which is more than I ever dreamed of.

Abstract

Protecting the integrity of code is generally considered one of the most effective approaches to counteract malicious software (malware). However, the fundamental problem with code-based detection approaches is that they rely on the false assumption that all malware consists of executable instructions. This makes them vulnerable to data-only malware, which, in contrast to traditional malware, does not introduce any additional instructions into the infected system.
Instead, this malware form relies solely on the instructions that existed before its presence to perform malicious computations. For this purpose, data-only malware employs code reuse techniques such as return-oriented programming to combine existing instructions into a new malicious program. Due to this approach, the malware itself consists solely of control data, enabling it to evade all existing code-based detection mechanisms.

Despite this astonishing capability and the obvious risks associated with it, data-only malware has not been studied in detail to date. For this reason, the extent of the danger posed by this potential future threat remains unknown. To remedy this shortcoming, this work provides the first comprehensive study of data-only malware. We begin by conducting a detailed analysis of data-only malware to determine the capabilities and limitations of this new malware form. In the process, we show that data-only malware is not only on a par with traditional malware, but even surpasses it in its level of stealth and its ability to evade detection. To demonstrate this, we present detailed proof-of-concept implementations of sophisticated data-only malware that are capable of infecting current systems in spite of the numerous protection mechanisms that they currently employ.

Having shown that data-only malware is a serious and realistic threat, we evaluate the effectiveness of existing defense mechanisms against data-only malware in the second part of this thesis. The goal of our analysis is to determine whether effective countermeasures against data-only malware already exist or whether this new malware form poses an immediate danger to current systems due to the lack of such countermeasures. In the course of our analysis, we identify hook-based detection mechanisms as the only potentially effective existing countermeasure against data-only malware.
To validate this hypothesis, we follow our initial analysis with a detailed study of current hook-based detection mechanisms. In the process, we discover that hook-based detection mechanisms rely on the false assumption that an attacker can only modify persistent control data in order to install hooks. This oversight enables data-only malware to evade existing mechanisms by targeting transient control data such as return addresses instead. To illustrate this, we present a new hooking concept that we refer to as dynamic hooking. Instead of changing control data directly, the key idea behind this concept is to manipulate non-control data in such a way that it triggers a vulnerability at runtime, which then overwrites transient control data, resulting in the invocation of the hook. Due to this approach, dynamic hooks are hidden within non-control data, which makes them significantly more difficult to detect and enables them to evade all existing hook-based detection mechanisms.

Since our analysis of existing malware defense mechanisms showed that even hook-based defense mechanisms are unable to detect data-only malware, we address countermeasures against this malware form in the third and final part of the thesis. For this purpose, we first introduce a virtual machine introspection-based framework for malware detection and removal called X-TIER. X-TIER enables security applications to inject kernel modules from the hypervisor into a running virtual machine and to execute them securely within the guest. In the process, the modules can access any kernel function and any kernel data structure without loss of security. In addition, the modules can transfer arbitrary information to the hypervisor. Consequently, X-TIER effectively enables hypervisor-based security applications to circumvent the semantic gap, which constitutes the key problem that all security applications on the hypervisor level face.
By combining strong security guarantees with full access to the state of the virtual machine, our framework can provide an excellent basis for countermeasures against data-only malware.

Based on our framework, we finally present three concrete detection mechanisms for data-only malware. Each of these mechanisms turns one of the inherent dependencies of data-only malware, which we identified during our initial analysis of this malware form, against the malware itself. This results in effective countermeasures that can, particularly when used in combination, provide strong initial defenses against data-only malware.

Zusammenfassung (Summary)

Protecting the integrity of system code is generally regarded as one of the most effective methods of preventing malware infections. The fundamental problem with such code-based detection methods, however, is that they rest on the false assumption that all malware consists of executable machine instructions. This makes such detection mechanisms vulnerable to data-only malware, which, unlike traditional malware, does not introduce any additional instructions into the system. Instead, this type of malware executes using only instructions that were already present on the system before the infection. To this end, data-only malware uses so-called code reuse techniques such as return-oriented programming to combine existing instructions into a new malicious program. The resulting malware consists solely of control data, which enables it to evade all existing code-based detection techniques.

Despite this astonishing capability and the risk associated with it, data-only malware has so far received insufficient attention in research. For this reason, it is currently entirely unclear,