Generic Detection of Code Injection Attacks Using Network-Level Emulation
Michalis Polychronakis

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate Division of the University of Crete

Heraklion, October 2009

The dissertation of Michalis Polychronakis is approved:

Committee:
Evangelos P. Markatos, Professor, University of Crete (Thesis Advisor)
Angelos Bilas, Associate Professor, University of Crete
Vasilios A. Siris, Assistant Professor, Athens Univ. of Economics and Business
Angelos Keromytis, Associate Professor, Columbia University
Maria Papadopouli, Assistant Professor, University of Crete
Athanasios Mouchtaris, Assistant Professor, University of Crete
Sotiris Ioannidis, Associate Researcher, FORTH-ICS

Department:
Dimitris Plexousakis, Professor, University of Crete (Chairman of the Department)

Abstract

Code injection attacks against server and client applications have become the primary method of malware spreading. A promising approach for the detection of previously unknown code injection attacks at the network level, irrespective of the particular exploitation method used or the vulnerability being exploited, is to identify the malicious code that is part of the attack vector, also known as shellcode. Initial implementations of this approach attempt to identify the presence of shellcode in network inputs using detection algorithms based on static code analysis. However, static analysis cannot effectively handle malicious code that employs advanced obfuscation methods such as anti-disassembly tricks or self-modifying code, and thus these detection methods can be easily evaded.

In this dissertation we present network-level emulation, a generic code injection attack detection method based on dynamic code analysis using emulation. Our prototype attack detection system, called Nemu, uses a CPU emulator to dynamically analyze valid instruction sequences in the inspected traffic. Based on runtime behavioral heuristics, the system identifies inherent patterns exhibited during the execution of the shellcode, and thus can detect the presence of malicious code in arbitrary inputs. We have developed heuristics that cover the most widely used shellcode types, including self-decrypting and non-self-contained polymorphic shellcode, plain or metamorphic shellcode, and memory-scanning shellcode. Network-level emulation does not rely on any exploit-specific or vulnerability-specific signatures, which allows the detection of previously unknown attacks. At the same time, the actual execution of the attack code on a CPU emulator makes the detector robust to evasion techniques like indirect jumps and self-modifications. Furthermore, each input is inspected autonomously, which makes the approach effective against targeted attacks.

Our experimental evaluation with publicly available shellcode construction engines, attack toolkits, and real attacks captured in the wild shows that Nemu is more robust to obfuscation techniques than previous proposals, while it can effectively detect a broad range of different shellcode implementations without any prior exploit-specific information. At the same time, extensive testing using generated and real benign data did not produce any false positives.

To assess the effectiveness of our approach under realistic conditions, we deployed Nemu in several production networks. Over the course of more than one year of continuous operation, Nemu detected more than 1.2 million attacks against real systems.
We provide a thorough analysis of the captured attacks, focusing on the structure and operation of the shellcode, as well as the overall attack activity in relation to the different targeted services. The large and diverse set of the detected attacks, combined with the zero false positive rate over the whole monitoring period, demonstrates the effectiveness and practicality of our approach.

Finally, we identify challenges faced by existing network trace anonymization schemes for safely sharing attack traces that contain self-decrypting shellcode. To alleviate this problem, we present an anonymization method that identifies and properly sanitizes sensitive information contained in the encrypted part of the shellcode that is otherwise not exposed on the wire.

Thesis Advisor: Prof. Evangelos Markatos

Abstract (translated from the Greek original)

Malicious code attacks against network applications (code injection attacks) are now the primary method of spreading malicious software (malware). Identifying the malicious code (shellcode) contained in an attack is a promising approach for detecting previously unseen attacks on the Internet. Initial implementations of this method are based on static code analysis. However, these techniques are not effective at identifying malicious code that uses advanced concealment techniques such as self-modifying code.

As part of the effort to find an effective method for detecting previously unseen attacks, we propose code emulation at the network level (network-level emulation). The technique is based on dynamic analysis of low-level code using a CPU emulator. Our implementation of an attack detection system that uses this method, which we call Nemu, dynamically analyzes valid instruction sequences contained in the network data under inspection. Using heuristics that detect characteristic behaviors exhibited during the execution of malicious code, the system can detect the presence of different types of malicious code in network data. The heuristics we have developed accurately detect the most widespread attack types, such as polymorphic and metamorphic attacks.

The technique does not rely on signatures, so it can detect attacks that were not previously known. At the same time, the actual execution of the attack's malicious code on the emulator makes the method robust to advanced code concealment techniques. In addition, each input is inspected autonomously, which makes the method effective at detecting targeted attacks.

The experimental evaluation of the method with a wide range of real attack samples showed that Nemu is more robust to advanced obfuscation techniques than previous methods. Extensive tests with real and synthetic data showed that the proposed method does not produce false detections.

To assess the effectiveness of our approach under real conditions, the system was installed in the networks of organizations, where it inspected real network data. After more than one year of continuous operation, Nemu detected more than 1.2 million attacks against real computers in those networks. We present an extensive analysis of the detected attacks, focusing on the structure and operation of the attack's malicious code, as well as on the overall activity in relation to the network services that were attacked.

Supervisor: Professor Evangelos Markatos

Acknowledgments

I want to thank many people who in one way or another have contributed to this work by sharing time, ideas, knowledge, experience, enthusiasm, drinks, and love. Without their help, this thesis simply would never have been finished. I am grateful to my advisor Prof.
Evangelos Markatos for being a great mentor and a real teacher. Since the days I began working at FORTH as an undergraduate, his endless energy and positive attitude always gave me the strength to go on. I am also indebted to Kostas Anagnostakis for his invaluable advice and 24/7 support. A huge thanks to them for providing me with such a great research experience. Above all, I am really lucky to have made two true friends.

The members of my committee (Angelos Bilas, Vasilis Siris, Angelos Keromytis, Sotiris Ioannidis, Maria Papadopouli, and Athanasios Mouchtaris) have provided valuable suggestions and feedback. I thank them for the time they devoted to reviewing my thesis and for agreeing to serve on the committee on very short notice.

The years at CSD and the DCS Lab at FORTH-ICS are unforgettable. The fun I had with Manos Moschous, Giorgos Dimitriou, Spiros Antonatos, Dimitris Koukis, Elias Athanasopoulos, Dimitris Antoniadis, Christos Papachristos, Periklis Akritidis, Manolis Stamatogiannakis, Antonis Papadogiannakis, Iasonas Polakis, Manos Athanatos, Vasilis Papas, Alexandros Kapravelos, Giorgos Vasiliadis, Nikos Nikiforakis, Michalis Foukarakis, and all the other colleagues at the lab was unprecedented. Thank you guys!

I am particularly grateful to Niels Provos, who encouraged me to pursue an internship at Google and has ever since been providing invaluable knowledge and wise guidance. I would also like to thank Panayiotis Mavrommatis, Therese Pasquesi, and all my friends and colleagues in Mountain View.

A big shout out to my friends Chrisa Farsari, Nikos Spernovasilis, Kristi Plousaki, Antonis Fouskis, Eva Syntichaki, Giorgos Lyronis, Lena Sarri, Theodoros Tziatzios, Nikos Thanos, Eleni Milaki, and Chara Chrisoulaki. I am grateful to my parents,