ROBUST AND EFFICIENT MALWARE ANALYSIS AND HOST-BASED MONITORING

A Thesis Presented to The Academic Faculty

by

Monirul I. Sharif

In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the School of Computer Science

Georgia Institute of Technology
December 2010

Approved by:

Wenke Lee, Advisor
School of Computer Science
Georgia Institute of Technology

Santosh Pande
School of Computer Science
Georgia Institute of Technology

Jonathon Giffin, Co-advisor
School of Computer Science
Georgia Institute of Technology

Douglas Blough
School of Electrical and Computer Engineering
Georgia Institute of Technology

Mustaque Ahamad
School of Computer Science
Georgia Institute of Technology

Date Approved: September 2010

To my parents, and to those who dedicate their lives

for the betterment of others, expecting little to nothing in return.

ACKNOWLEDGEMENTS

It is a pleasure to thank those without whose help and support this thesis would not have been possible. I owe my deepest gratitude to my advisor Wenke Lee for his incomparable guidance and support over the years. Because of his thirst for excellence, steady patience and belief in my abilities, I had the freedom to choose and focus on challenging yet impactful problems that I felt deeply passionate about. I am grateful to my co-advisor, Jonathon Giffin, who helped me build several skills necessary to be a good researcher and with whom I had the pleasure of working on several interesting research problems. I would also like to thank my other committee members, Mustaque Ahamad, Santosh Pande and Douglas Blough, for their valuable comments and support regarding the thesis and my research. I would like to express my gratitude and best regards to Weidong Cui, whose support, collaboration and guidance were essential for my research and this thesis.

I am indebted to many of my colleagues and friends in my research group at Georgia Tech. It was my pleasure to work with my friend and colleague Andrea Lanzi, whose time at Georgia Tech resulted in a period of great teamwork and collaboration on many interesting projects. I would also like to thank Paul Royal, with whom I had the pleasure of working on many problems. I had the opportunity to collaborate with many of my friends in the group, including Kapil Singh, Bryan Payne and Martim Carbone. Besides them, I am thankful to all the other friends in my research group for their support and for providing a wonderful environment that made work and research enjoyable.

I would not have succeeded in finishing my PhD the way I have without the support of my community and friends. Among many, I would like to especially thank Nova, Arshad, Moin, Farzana, Lopa, Shajib, Abir, Towhid, and Shefaet. My life in Atlanta was always full of joy and happiness because of all my friends.

I am deeply grateful to my family for their love and support. Without any doubt, the unwavering and unconditional love and inspiration of my parents throughout my life has led me to where I am. My father, a constant source of inspiration and confidence, was the role model I followed throughout my life. My mother's affection and outlook on life made me see the world from a different perspective and gave me the courage to get through the difficult periods of my PhD. Words are not sufficient to express how grateful I am to my brother Roni and my sister Lipa. Even though they were far away during my PhD, I always felt them close, there for me at the moments I needed them most.

I would like to thank my wife, Farhana Aleen, whose support, inspiration and love were absolutely essential to the success of the work in this dissertation. Our shared dreams in life helped me fight through the most challenging moments of my PhD.

Above all, I thank the One to Whom I am ever grateful for bestowing guidance, intellect, knowledge, patience, wisdom, and essentially everything virtuous that might exist in me. Lastly, I offer my regards and blessings to all those I have not mentioned but who supported me in any respect during my PhD student life and helped me produce this dissertation.

TABLE OF CONTENTS

DEDICATION ...... iii

ACKNOWLEDGEMENTS ...... iv

LIST OF TABLES ...... xi

LIST OF FIGURES ...... xii

SUMMARY ...... xiii

I INTRODUCTION ...... 1
  1.1 Motivation and Goals ...... 1
    1.1.1 Malware Detection Process ...... 3
    1.1.2 Limitations of Current Malware Analysis Approaches ...... 4
    1.1.3 Limitations in System Monitoring Techniques ...... 6
  1.2 Thesis Overview ...... 6

II BACKGROUND AND RELATED WORK ...... 10
  2.1 Malware and Evolution of Their Defenses ...... 10
    2.1.1 Earlier Attacks on Malware Detection ...... 11
    2.1.2 Recent Forms of Attacks ...... 12
    2.1.3 Code Obfuscation ...... 13
    2.1.4 Analysis Evasion ...... 15
  2.2 Malware Analysis ...... 15
  2.3 Host-Monitoring and Anti-malware Security Tools ...... 17
    2.3.1 Traditional Active Monitoring Inside the Host ...... 19
    2.3.2 Passive External Monitoring Approaches ...... 21
    2.3.3 Active External Monitoring ...... 22
    2.3.4 Hypervisor-based Active Monitoring Inside Host ...... 23

III ENABLING STATIC MALWARE ANALYSIS ...... 25
  3.1 Motivation ...... 25
  3.2 The Eureka Framework ...... 27
  3.3 Previous Work ...... 27
  3.4 Coarse-grained Execution-based Unpacking ...... 29
    3.4.1 Heuristics-based unpacking ...... 30
    3.4.2 Statistics-based unpacking ...... 32
  3.5 API Resolution Techniques ...... 35
    3.5.1 Background: standard API resolution ...... 36
    3.5.2 Resolving obfuscated APIs without the import tables and IAT ...... 36
  3.6 Evaluation Metrics ...... 41
  3.7 Experimental Results ...... 43
    3.7.1 Benign dataset evaluation: Goat test ...... 44
    3.7.2 Malicious data set evaluation ...... 45
  3.8 Summary ...... 47

IV ROBUST DYNAMIC MALWARE ANALYSIS ...... 50
  4.1 Motivation ...... 50
  4.2 Previous Work ...... 51
  4.3 A Formal Framework ...... 52
    4.3.1 Abstract Model of Program Execution ...... 53
    4.3.2 Transparent Malware Analysis ...... 53
    4.3.3 Requirements for Transparency ...... 56
    4.3.4 Fulfilling the Requirements ...... 59
  4.4 Implementation ...... 62
    4.4.1 Environment ...... 62
    4.4.2 Analyzer Architecture ...... 63
    4.4.3 Using Intel VT Extensions for Malware Analysis ...... 64
    4.4.4 Maintaining Transparency ...... 67
    4.4.5 Potential Attacks ...... 68
    4.4.6 Architectural Limitations ...... 70
  4.5 Summary ...... 71

V CONDITIONAL CODE OBFUSCATION ...... 72
  5.1 Motivation ...... 72
  5.2 Previous Work ...... 74
  5.3 Conditional Code Obfuscation ...... 77
    5.3.1 Overview ...... 78
    5.3.2 General Mechanism ...... 80
    5.3.3 Automation using Static Analysis ...... 81
    5.3.4 Consequences to Existing Analyzers ...... 85
    5.3.5 Brute Force and Dictionary Attacks ...... 86
  5.4 Implementation Approach ...... 87
    5.4.1 Analysis and Transformation Phase ...... 89
    5.4.2 Encryption Phase ...... 91
    5.4.3 Run-time Decryption Process ...... 92
  5.5 Experimental Evaluation ...... 92
  5.6 Discussion ...... 95
    5.6.1 Strengths ...... 96
    5.6.2 Weaknesses ...... 97
  5.7 Summary ...... 99

VI REVERSING EMULATION-BASED OBFUSCATION ...... 100
  6.1 Motivation ...... 100
  6.2 Previous Work ...... 103
    6.2.1 Malware Obfuscation ...... 103
    6.2.2 Reverse Engineering Known Languages ...... 104
    6.2.3 Reverse Engineering Inputs and Protocols ...... 104
  6.3 Background ...... 104
    6.3.1 Using Emulation for Obfuscation ...... 105
    6.3.2 Emulation Techniques ...... 105
  6.4 Reverse Engineering of Emulation ...... 109
    6.4.1 Abstract Variable Binding ...... 110
    6.4.2 Identifying Candidate VPCs ...... 118
    6.4.3 Identifying Emulation Behavior ...... 119
    6.4.4 Extracting Syntax and Semantics ...... 121
  6.5 Implementation ...... 122
    6.5.1 Dynamic Tracing ...... 123
    6.5.2 Clustering ...... 124
    6.5.3 Behavioral Analysis ...... 124
  6.6 Evaluation ...... 125
    6.6.1 Synthetic Tests ...... 125
    6.6.2 Real (Unpacked) Programs ...... 129
    6.6.3 Emulated Malware ...... 131
  6.7 Discussion ...... 134
  6.8 Summary ...... 136

VII ROBUST AND EFFICIENT MONITORING ...... 138
  7.1 Motivation ...... 138
  7.2 Previous Work ...... 140
    7.2.1 Out-of-VM Approaches ...... 140
    7.2.2 Hardware Virtualization ...... 141
    7.2.3 In-lined Monitoring Approaches ...... 142
  7.3 Efficiency and Security Requirements ...... 143
  7.4 Secure In-VM Monitoring ...... 146
    7.4.1 Overall Design ...... 147
    7.4.2 Security Monitor Functionality ...... 156
    7.4.3 Security Analysis ...... 158
  7.5 Implementation ...... 159
    7.5.1 Initialization Phase ...... 160
    7.5.2 Run-time Memory Protection ...... 163
  7.6 Experimental Evaluation ...... 165
    7.6.1 Monitor Invocation Overhead ...... 165
    7.6.2 Security Application Case Studies ...... 166
  7.7 Summary ...... 170

VIII CONCLUSION AND FUTURE WORK ...... 172
  8.1 Summary ...... 172
  8.2 Future Work ...... 175
  8.3 Closing Remarks ...... 177

REFERENCES ...... 179

LIST OF TABLES

1  Design space of unpackers. Evasions: (1) multiple packing, (2) partial code revealing multi-layered packing, (3) VM detection, (4) emulator detection ...... 28
2  Occurrence summary of bigrams ...... 34
3  Evaluation of Eureka, PolyUnpack and Renovo: √ = unpacked; ⊗ = partially unpacked; × = unpack failed ...... 44
4  Eureka performance by packer distribution on the spam malware corpus ...... 46
5  Eureka performance by malware family distribution on the spam malware corpus ...... 46
6  Eureka performance by packer distribution on the honeynet malware corpus minus Themida ...... 47
7  Eureka performance by malware family on the honeynet malware corpus minus Themida ...... 47
8  Evaluation of our obfuscation scheme on automatically concealing malicious triggers ...... 93
9  Description of synthetic test programs ...... 126
10 Results for synthetic programs obfuscated with Code Virtualizer ...... 126
11 Results for synthetic programs obfuscated with VMProtect ...... 126
12 TR/Killav.PS obfuscated by VMProtect ...... 129
13 Tests on CMD.EXE obfuscated by VMProtect ...... 130
14 Malware packed with Themida ...... 133
15 Malware packed with VMProtect ...... 133
16 Malware packed with Code Virtualizer ...... 133
17 Monitor Invocation Overhead Comparison ...... 166
18 Process creation monitor performance results ...... 168
19 System call tracing macrobenchmarks ...... 169

LIST OF FIGURES

1  The malware detection process ...... 2
2  The Eureka Malware Binary Deobfuscation Framework ...... 26
3  Example of the standard linking mechanism of PE executables in Windows ...... 36
4  Illustration of Static Analysis Approaches used to Identify API Targets ...... 40
5  Bigram counts during execution of goat file packed with Aspack (left), Molebox (center), Armadillo (right) ...... 46
6  Two conditional code snippets ...... 77
7  Obfuscated example snippet ...... 79
8  General obfuscation mechanism ...... 80
9  Duplicating conditional code ...... 82
10 Compound condition simplification ...... 83
11 The architecture of our automated conditional code obfuscation system ...... 84
12 Analysis phase (performed on IR) ...... 88
13 Encryption Phase (performed on binary) ...... 88
14 Using emulation for obfuscation ...... 102
15 An example of a simple-interpreter (decode-dispatch) based emulator ...... 106
16 Analysis process overview ...... 121
17 Comparison of x86 and bytecode CFGs of the synth3 test program ...... 127
18 The partial dynamic CFGs for the x86 code and the extracted bytecode of an obfuscated function in CMD.EXE ...... 131
19 In-VM and Out-of-VM monitoring ...... 141
20 High-level overview of the Secure In-VM Monitoring approach ...... 147
21 Virtual Memory Mapping of SIM approach ...... 149
22 Switching between the untrusted and trusted address space without hypervisor intervention ...... 151
23 Entry and exit gates ...... 153
24 Run-time memory protection flowchart ...... 163

SUMMARY

Over the last few years, the rate at which new malware appears on the Internet has increased tremendously. Today, organized cybercriminals use malware as their primary vehicle for carrying out cyberattacks on computers for huge financial gains. On the defense side, host-based malware detection approaches such as antivirus programs are severely lagging: industry-average detection rates range from 18% for zero-day malware to 60% for samples that are one month old.

The overall effectiveness of malware detection depends on two important aspects: the success of extracting information from malware through malware analysis to generate signatures, and the success of utilizing these signatures on target hosts with appropriate system monitoring techniques. Today's malware employs a vast array of anti-analysis and anti-monitoring techniques to deter analysis and to neutralize antivirus programs, reducing the overall success of malware detection.

In this dissertation, we present a set of practical approaches for robust and efficient malware analysis and system monitoring that can help make malware detection on hosts more effective. The contributions are summarized below:

(1) Efficient Methods for Enabling Static Malware Analysis: Static malware analysis suffers greatly from its susceptibility to obfuscations employed by malware. However, it can provide insight complementary to dynamic analysis in those cases where the obfuscations can be sufficiently overcome. We present Eureka, a framework that efficiently deobfuscates single-pass and multi-pass packed binaries and restores obfuscated API calls, providing a basis for extracting comprehensive information from the malware using further static analysis.

(2) Making Dynamic Analysis Approaches More Robust: While dynamic analysis techniques provide better resilience to malware obfuscations than static analysis, they are susceptible to evasion attacks that detect the presence of a run-time analysis environment. We present a formal framework of transparent malware analysis and Ether, a dynamic malware analysis environment based on this framework that provides transparent fine-grained (single instruction) and coarse-grained (system call) tracing.

(3) Anticipating Obfuscations that Hide Trigger-based Behavior: Multipath-exploring dynamic analysis overcomes a limitation of straightforward dynamic analysis, which may miss trigger-based behavior because of its limited view of execution paths. We introduce an input-based obfuscation technique that hides trigger-based behavior from any input-oblivious analyzer. We analyze the strengths and weaknesses of this obfuscation and explain how such a technique can impact the efficiency and effectiveness of malware analysis.

(4) Reversing Emulation-based Obfuscation: Recently, malware authors have adopted emulation as a form of fine-grained obfuscation that can affect the robustness of both white-box and gray-box analysis techniques. We present an approach that automatically reverse-engineers the emulator and extracts the syntax and semantics of the bytecode language, which helps construct control-flow graphs of the bytecode program and enables further analysis of the malicious code.

(5) Robust and Efficient System Monitoring Techniques: Antivirus programs must monitor the target host for code and events that can be matched against the signatures or behavior models they employ, while withstanding the disabling attacks used by malware. We present Secure In-VM Monitoring, an approach for efficiently monitoring a target host while remaining robust against unknown malware that may attempt to neutralize security tools.

CHAPTER I

INTRODUCTION

1.1 Motivation and Goals

As computers and the Internet become a core necessity for individuals and businesses throughout the world, cybercriminals are continuously finding new opportunities to breach computer security for huge financial gains. Malicious programs or scripts, in general malware, are now used as platforms for launching cyberattacks on computers to realize these goals. From the viruses that were once written to show off competency and technical prowess, malware has evolved and taken on many forms, including worms, bots, spyware, adware, rootkits, etc., each with a different approach to exploiting users and business systems to make money for the now highly organized underworld. Recent events show that the masses are not the only targets: skilled malware authors even mount targeted attacks on organizations for goals such as corporate espionage [170, 8, 121]. Conservative estimates [12] put reported annual losses at around US $14 billion per year in the United States alone due to cyberattacks from malware. Without any doubt, we are in an era when the need for defense against malware is greater than ever before.

Since any action or event caused by a running program can be visible at the host level, host-based security tools are used as the primary method of defense against malware. Among all host-based security approaches, malware detection is still seen as the most viable and practical, and tools such as antivirus programs serve as the primary defense against malware. At present, the vast majority of host-based security tools, including antivirus software, still use signature-based detection approaches. A malware instance is successfully detected on an end-host if the code, behavior or combination of actions observed by the tool matches a specific signature in the signature database. Even behavior-based detection requires updates to the behavior model that is used for detection [148].
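To make the signature-matching idea concrete, the following toy sketch (ours, not any vendor's engine; the family names and byte patterns are purely hypothetical) scans a file for known byte sequences:

```python
# Toy illustration of classic signature matching. A "signature" here is just a
# byte pattern assumed to occur in a malware family; real engines use far
# richer representations (wildcards, semantics, behavior models).
SIGNATURES = {
    "Example.Virus.A": bytes.fromhex("e80000000058"),   # hypothetical pattern
    "Example.Worm.B":  b"\x60\xbe\x00\x10\x40\x00",     # hypothetical pattern
}

def scan_file(path: str) -> list[str]:
    """Return the names of all signatures whose pattern occurs in the file."""
    with open(path, "rb") as f:
        data = f.read()
    return [name for name, pattern in SIGNATURES.items() if pattern in data]
```

Everything that follows affects one of the two inputs to this loop: how good the patterns are (analysis) and how trustworthy the scanned data is (monitoring).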

Over the last few years, security tools and antivirus software have continued to lag in their malware detection capabilities. The current average detection rate for the leading antiviruses starts at 18% for zero-day malware and rises only to about 60% after 30 days [2]. With tens of thousands of new malware samples appearing every day [13], antivirus companies take long periods to make signatures available as updates for deployed security tools. This means that a large fraction of malware can perform its malicious activities undetected on infected systems.

Figure 1: The malware detection process.

While advances can be made in research on how malware code or behavior can be more accurately represented by signatures or models, there are also opportunities to improve how accurately information is extracted from malware to generate signatures and how the signatures are utilized by antivirus programs to detect malware. By improving both of these aspects, the overall effectiveness of malware detection can be improved in general. This thesis explores such opportunities.

1.1.1 Malware Detection Process

We call the entire process starting from the time a malware instance is received by an antivirus company to the time when the malware is successfully detected on an end-host the malware detection process. Figure 1 shows a general view of this process. There are three important aspects of the process that determine the success of detection.

First, when the malware sample is received, a malware signature is generated using information extracted from the malware through the very complex task of malware analysis. The analysis is usually a form of reverse engineering, with the goal of understanding what the malware does and how, or of extracting useful code or behavior that can be used to identify it. This process involves defeating the various defenses and protections that malware employ to prevent or impede analysis. Its effectiveness is determined by how accurately the information required for the signature can be extracted from the malware.

The second aspect is how the signature is utilized by the antivirus program while running on the host it is intended to protect. This process is a form of system monitoring applied to the host to intercept events or code that yield information to be matched against the signature. Its effectiveness depends on how accurately the antivirus program can intercept events occurring on the target host, and how well the antivirus program itself is protected.

The third aspect is the formation of the signatures themselves, which greatly affects the malware detection approach. A signature's effectiveness depends on how accurately it captures the characteristics of the malware instance along with its derivatives or variants. While simple static code sequences were sufficient in the past, today complex semantic and behavioral signatures [43, 79] are used to represent the properties of malware.

In order to improve malware detection, research is pushing to identify problems and challenges on all three of these fronts. However, attacks on malware analysis and system monitoring techniques can render malware detection unsuccessful even if highly accurate signature schemes are invented over the years. Therefore, identifying limitations in malware analysis and system monitoring techniques and improving upon them can benefit malware detection irrespective of the signature generation or modeling scheme.

1.1.2 Limitations of Current Malware Analysis Approaches

Unfortunately, malware developers are well aware of the efforts to reverse engineer their binaries, and they employ a wide range of binary obfuscation and evasion techniques to deter analysis and reverse engineering. There is therefore a huge challenge in overcoming these obstacles and analyzing malware. Multiply this challenge by the millions of new strains and repurposed malware variants that appear on the Internet monthly [94, 105], and the need for automated tools to extract and analyze all facets of malware binary logic becomes clear. Based on the method of analyzing programs, automated malware analysis can be broadly divided into three classes, each of which has its advantages and disadvantages.

Static Analysis or White-box Approaches: Historically, the first malware analysis approaches used static analysis, which attempts to analyze code without executing it. Static analysis has a well-investigated research background and is attractive because it is amenable to a vast range of useful techniques that can relate parts of code across the entire visible program to form a higher-level understanding of semantics and behavior. However, malware authors exploit the fact that static program analysis must deal with many undecidable problems, such as pointer analysis, and that accurately disassembling x86 binary code is hard due to variable-length x86 instructions. Moreover, because of the various code obfuscations that have appeared over the years, directly applying static analysis to today's malware is of little to no use. Nevertheless, many signature schemes that rely on semantics [41] or higher-level behavior may benefit from static analysis.
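The disassembly ambiguity mentioned above is easy to demonstrate. The sketch below (our illustration, using the open-source Capstone disassembler) places a one-byte jump over a junk byte: a linear sweep decodes the junk byte as the start of a five-byte mov that swallows the real instructions, while decoding from the actual jump target recovers them:

```python
# Demonstrates why linear disassembly of variable-length x86 code is
# unreliable (requires the Capstone bindings: pip install capstone).
from capstone import Cs, CS_ARCH_X86, CS_MODE_32

# jmp +1; <junk byte 0xb8>; dec eax; xor eax, eax; ret
code = bytes.fromhex("eb01b84831c0c3")
md = Cs(CS_ARCH_X86, CS_MODE_32)

print("linear sweep from offset 0:")
for insn in md.disasm(code, 0):
    print(f"  {insn.address:#04x}: {insn.mnemonic} {insn.op_str}")
# The junk byte 0xb8 decodes as 'mov eax, 0xc3c03148', hiding 3 instructions.

print("actual execution path (jump target at offset 3):")
for insn in md.disasm(code[3:], 3):
    print(f"  {insn.address:#04x}: {insn.mnemonic} {insn.op_str}")
```

A recursive-descent disassembler would follow the jump and decode this example correctly, which is exactly why obfuscations such as the opaque predicates discussed in Chapter 2 target control flow itself.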

Pure Dynamic Analysis or Black-box Approaches: Since these approaches directly execute the malware and observe its external actions, any obfuscation targeted at impeding static analysis is overcome. However, one of the main drawbacks of these approaches is that only the effects of a single execution path may be seen during execution of the malware. In other words, any trigger-based behavior may be missed. These approaches are useful when external activities, such as system calls, network communication, changes in the file system, or modifications to the OS environment, need to be analyzed. Virtual machines are widely used as the analysis environments for these black-box approaches. A large fraction of malware today employs various run-time checks to detect artifacts in the execution environment that indicate well-known analyzers or virtual machines.

Hybrid or Grey-box Approaches: Malware analysis approaches that combine static and dynamic analysis fall into this category. These approaches have the ability to analyze the malware program code while executing it. Most grey-box approaches observe the execution of the malware at a fine-grained instruction level, providing a way to reconstruct the program code, follow data flow, or even build a view of the executed path amenable to static analysis. Research has also enabled exploring multiple paths during dynamic analysis, providing most of the benefits of static analysis without its drawbacks. However, instruction-level tracing introduces high overhead and also produces many side effects that malware can use to detect the analysis environment and avoid analysis.

Each method has distinct advantages, and a malware signature generation approach will pick one of them depending on the information required from the malware. However, if malware authors focus on exploiting weaknesses in order to impede an analysis process, the malware detection methods based on that analysis can be severely impacted. Therefore, it is imperative that the weaknesses of these approaches be investigated and that focus be placed on building techniques that can overcome possible threats.

1.1.3 Limitations in System Monitoring Techniques

Today's antivirus programs reside in the same operating system environment as the one they protect. To protect antivirus code and data from malware that can infect the system, the antivirus program has to be isolated, and its probes or hooks need to be protected. Moreover, the antivirus program should be able to intercept the host events that are used in its signatures. Although kernel-based antivirus modules are isolated from user-level malware, the existence of rootkits means this approach does not guarantee isolation from malware. Moreover, probes placed by the antivirus program need to be protected at a higher privilege level. Many host-based security systems use virtual machines to achieve robustness by keeping security tools in a separate OS environment and using introspection to read data from the target host. While such passive inspection is not sufficient for most signatures used in malware detection, recent approaches to active monitoring [110] can intercept various events in the target host from the outside. However, such a system loses the efficiency achievable by placing the security tool inside the host, which in effect may prevent monitoring the events required by a signature. Both efficiency and robustness need to be achieved to improve the state of anti-malware.

1.2 Thesis Overview

This thesis explores a set of practical approaches for robust and efficient malware analysis and system monitoring in order to improve malware detection on hosts. Given the various strategies for automated malware analysis, the first goal of this work is to make analysis more robust. This is vital for accurately extracting information about a malware sample for use in signatures by defeating as many of the anti-analysis tricks employed to thwart analysis as possible. By making malware analysis more efficient, the overall time required to produce signatures can be reduced, improving the effectiveness of detection by releasing signature updates as early as possible. Robust and efficient system monitoring approaches can protect the antivirus program and its probes, ensuring that known malware is properly detected and that unknown malware cannot neutralize the program. The techniques presented in this thesis are orthogonal to the signature-generation or behavior-model approaches used for detection; therefore, they increase the likelihood that a malware sample will be successfully detected by any signature-based malware detection approach. The thesis first focuses on research problems in malware analysis and then moves on to improving system monitoring.

Chapter 2 presents background and related work in the areas of malware detection, malware analysis, and system monitoring. The chapter contains a classification of the various anti-analysis and anti-monitoring defenses that malware employ or that research has investigated. The thesis then presents five research topics on malware analysis and system monitoring in Chapters 3 through 7, whose contents are highlighted below:

Chapter 3 investigates efficient methods to enable static malware analysis. Although a large number of obfuscations can impede static analysis, we show how the classes of single-pass and multi-pass code encryption, or packing obfuscations, can be efficiently thwarted to reveal code suitable for successful static analysis. We also present how system call invocation points hidden by common attacks can be reconstructed with a method we call API resolution. The motivation behind this work is to allow efficient processing of malware samples, so that easy obfuscations can be defeated quickly. The work is based on Eureka, which we presented in [131].

Chapter 4 investigates how dynamic analysis can be made robust against evasion attacks by malware that detects the analysis environment and alters its behavior. We present the foundations of transparent malware analysis, applicable to both pure dynamic and grey-box approaches. The formal requirements of such a framework are first identified, and a summary of Ether [51], a system based on this approach, is then presented.

While most obfuscations that encrypt code keep the key inside the code, Chapter 5 presents a new attack called Conditional Code Obfuscation [127], which derives the decryption key from the input. This obfuscation technique can hide any trigger-dependent code from all classes of malware analysis, including hybrid approaches, provided the analyzer has no prior information about the possible inputs to the malware. The work shows how such an attack can be carried out on existing code automatically using a compiler framework. Possible weaknesses and methods to tackle such an obfuscation are then discussed.

Chapter 6 discusses an obfuscation attack that uses emulation to hide the original code of the malware, and presents an automated way to reverse-engineer such an obfuscation [128]. The attack is applicable to all types of malware analysis, including grey-box methods. However, we show that our solution is able to automatically determine the syntax and semantics of the bytecode language and generate CFGs of the bytecode.

Finally, in Chapter 7, we present a method that makes security monitoring robust and efficient enough that a security tool can reside in the same OS environment as the target host while remaining as secure as if it were placed in an isolated, separate virtual machine. This technique, called Secure In-VM Monitoring [130], combines the security benefits of out-of-VM monitoring with the efficiency of in-VM monitoring.

We conclude the thesis and present a few open-ended problems in Chapter 8.

CHAPTER II

BACKGROUND AND RELATED WORK

2.1 Malware and Evolution of Their Defenses

It has been several decades since the first malware was written [161]. Over the years, the intention and motivation behind writing malware have changed immensely.

The first generation of malware, which came in the form of computer viruses, was primarily written to show off skill and, in many cases, for sheer self-satisfaction. The payloads of these viruses were mostly informative, announcing that a system had been infected, and in some cases destructive in nature [159], resulting in loss of data or damage to the BIOS. Moreover, the way viruses propagated and multiplied had the side effect of modifying valuable programs or user data. Even then, malware authors found it important to keep their creations alive and working as long as possible. To defeat detection systems and human analysis, they used encryption [169], polymorphism [139], and various other obfuscation schemes [33].

Security tools and antivirus programs continued to evolve to become more resilient, more adaptive and more accurate in detecting malware [143].

The biggest shift in the way malware is written came when malware authors found a new motivating factor for creating and deploying it. Driven by the powerful motivating force of monetary gain, malware authors became highly organized, and several inventive forms of malware started to appear with the intention of violating security and exploiting modern methods of electronic commerce. Large worm outbreaks [19, 99, 98] showed that malware can reach thousands to millions of systems worldwide in a small amount of time. The ILOVEYOU worm was reported to have reached almost 50 million infections and caused an estimated $5.5 billion in damages worldwide. With the advent of bots, malware authors can have vast armies of computers at their disposal for achieving numerous malicious objectives. Similarly, malware appears in the form of spyware, adware, rootkits, etc., each with its own malicious goals.

The number of malware programs released every year has grown exponentially. An antivirus company [13] reported that the number of new malware samples released in 2008 was almost equal to that of the previous 17 years combined, and that it received more than 22,000 new samples daily during that year. We are now amidst an era of malware where defense is mostly playing catch-up, and researchers and industry are looking for automated ways to tackle the thousands of malware programs appearing every day. Even as researchers and the security industry fight to improve malware defense, malware authors continuously introduce new ways to attack offline analysis and new run-time checks to evade detection.

2.1.1 Earlier Attacks on Malware Detection

One of the earliest attacks on malware detection aimed to deter signature-based schemes. Signatures of malware were created by taking a sequence of bytes from the body of the malware, mostly revealed using static analysis. To defeat these signatures, oligomorphic viruses [143] encrypted or encoded the body of the virus with a random key at every infection and attached a small decryption routine that decrypts the code at run-time. Polymorphic viruses take this one step further and vary the decryption code itself to make finding signatures harder. However, since all the encrypted code needs to be decrypted before execution, anti-malware tools employed dynamic analysis or emulation to help extract the code.

Metamorphic viruses addressed this issue by morphing the code itself with various tricks like register renaming, code reordering, equivalent code replacement, etc. However, a metamorphic engine requires compiler-like machinery to be carried with the malware, making such viruses highly error-prone and impractical. These earlier principles have evolved into the more recent code packing techniques used by packers [108, 132], which we discuss later.

To identify possible signatures, human experts used a combination of run-time execution and debuggers in an effort to overcome code encryption. Malware programs started to detect the presence of debuggers [143] by looking for breakpoints, debugging API usage, and the existence of processes or drivers related to debuggers. Once the presence of these tools is detected, malware can evade analysis by changing its behavior or terminating execution. While debuggers have recently been replaced by automated analysis tools, the fundamental goal of the newer, evolved attacks, detecting analyzers at run-time, remains the same.

2.1.2 Recent Forms of Attacks

To reduce the time required to introduce new malware, malware authors started using commercially available software protectors, or packers, to protect their programs [132]. These tools take a compiled binary and add several layers of protection to make malware detection hard. The commercial tools mostly attempt to make analysis of the protected programs difficult; however, malware authors add further defenses against detection that specifically target anti-malware tools and antiviruses. Attacks against detection can be classified into anti-analysis and anti-monitoring categories, targeting malware analysis and system monitoring, respectively. Anti-analysis techniques can be further divided into code obfuscation and analysis evasion techniques, depending on whether they target static or run-time analysis.

12 2.1.3 Code Obfuscation

A large number of attacks fall into the category of program obfuscation, which aims to make static analysis of a program hard. The first group requires no additional transformation at run-time to make the program executable; the modified program is executable code that is simply harder to analyze than the original. On the research side, several papers have proposed techniques that can deter static analysis. [33] provides a very early compilation of the obfuscating program transformations proposed up to that point. Since the variable-length instructions of the x86 processor make it hard to differentiate between code and data, various attacks exploit this to make disassembly inaccurate. For example, most recent disassemblers use recursive-descent methods to follow control flow and identify code more accurately, so hiding control flow is a proven technique for making code identification harder. This is exploited in [35], where junk bytes inserted during compilation make disassembly inaccurate. Opaque predicates [34] add conditions to a program that are hard to solve statically; using such predicates, the control flow of a program can be hidden, making analysis inaccurate. Similar techniques have been proposed that use signals [116] to disrupt identification of the flow of execution.
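As a concrete illustration of an opaque predicate (a Python sketch of ours rather than x86; the function names are hypothetical), the condition below is true for every integer input, yet an analyzer that cannot prove this must treat the decoy branch as reachable:

```python
# x*(x+1) is the product of two consecutive integers, so it is always even:
# the "true" branch always executes, but proving that statically is hard in
# general, so the bogus branch inflates the recovered control-flow graph.
def opaquely_guarded(x: int) -> int:
    if (x * (x + 1)) % 2 == 0:       # opaquely true for every integer x
        return real_computation(x)   # the only path that ever executes
    else:
        return decoy_computation(x)  # dead code that misleads static analysis

def real_computation(x: int) -> int:
    return x + 1

def decoy_computation(x: int) -> int:
    return -x
```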

Following the polymorphism and metamorphism techniques of the early viruses, packers employ complete code encryption with a small decryptor that decodes the code into executable form at run-time. The simplest form uses transformations such as XOR, and more complex forms use encryption with a specific key. These packing techniques, however, mostly include the obfuscation key as part of the program. Several unpacking techniques have been proposed that use dynamic analysis, running the malware to a certain extent to reveal the packed code [122, 76, 96].
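The single-pass case can be reduced to a few lines. The sketch below (illustrative only; a real packer embeds the key and an unpacking stub in the binary, and the stub jumps to the recovered code) shows both the packing transformation and the work the run-time stub performs:

```python
import os

def pack(code: bytes) -> tuple[bytes, int]:
    """Single-pass "packing": XOR every byte with a random one-byte key."""
    key = os.urandom(1)[0] or 1          # avoid the degenerate key 0
    return bytes(b ^ key for b in code), key

def unpack(packed: bytes, key: int) -> bytes:
    """What the packer's run-time stub does before transferring control."""
    return bytes(b ^ key for b in packed)

payload = b"\x90\x90\xcc"                # placeholder "code"
packed, key = pack(payload)
assert packed != payload                 # byte signatures no longer match
assert unpack(packed, key) == payload    # but execution still sees the original
```

Note that the key travels with the program, which is precisely what generic unpackers exploit and what the input-based obfuscation discussed below avoids.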

Packing techniques vary in granularity. In the simplest form, a packer uses single-pass packing, where the entire malware program is encoded or encrypted with one key and, at run-time, the entire code is unpacked before execution of the original program starts.

Multi-level or multi-pass packing techniques, employed by some packers, use several layers of encoding and encryption and require several phases of decoding when executed. The simplest packers in this category apply multiple layers of single-pass encoding, so that all layers of unpacking happen before the actual malware code executes.

Page-level unpacking, initially employed by a few packers [132], unpacks code at the granularity of pages during execution. Code is therefore unpacked and revealed on demand, depending on the execution path of the program. Such techniques make it hard for unpackers to recover the entire code: as execution proceeds, code is revealed only for the paths that were executed. This makes such packing a problem not only for static analysis but also for dynamic analysis, because only a single execution path may be revealed.

The smaller the granularity of packing, the less code is revealed during execution, which in turn limits how much code can be recovered by unpacking techniques. The finest possible granularity of obfuscation is the instruction level. One can theoretically conceive of a packer that unpacks each instruction at execution time, but such a technique would be impractical, primarily due to the overhead incurred during unpacking. However, techniques similar to dynamic translation [135] could enable packing at smaller granularity.

A technique that can be considered a practical counterpart to instruction-level packing is emulation-based packing. Recent malware has started using emulators as a form of obfuscation, converting programs into a randomly generated bytecode language and shipping an emulator that executes the bytecode on real hardware. Such a technique never reveals the original instructions during execution; instead, the emulator executes equivalent instructions with the same semantics. In Chapter 6 of this dissertation, we present a new approach for defeating emulation-based packing.
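The decode-dispatch structure that such emulation-based packers share (cf. Figure 15) is captured by the toy interpreter below (our sketch; the opcode values are arbitrary, whereas real protectors randomize the opcode encoding per protected instance):

```python
# A minimal decode-dispatch emulator for a two-register toy machine. The
# original instructions never appear in memory; an analyzer sees only this
# loop executing over and over with a changing virtual program counter (VPC).
def run(bytecode: bytes) -> int:
    regs = [0, 0]
    vpc = 0                                   # virtual program counter
    while vpc < len(bytecode):
        op = bytecode[vpc]                    # decode
        if op == 0x17:                        # LOAD_IMM reg, imm8
            regs[bytecode[vpc + 1]] = bytecode[vpc + 2]
            vpc += 3
        elif op == 0x2c:                      # ADD: reg0 += reg1
            regs[0] += regs[1]
            vpc += 1
        elif op == 0x9a:                      # HALT: result in reg0
            return regs[0]
        else:
            raise ValueError(f"bad opcode {op:#x}")
    return regs[0]

# LOAD_IMM r0,2; LOAD_IMM r1,3; ADD; HALT  ->  5
assert run(bytes([0x17, 0, 2, 0x17, 1, 3, 0x2c, 0x9a])) == 5
```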

Although all these packing techniques use encryption keys that remain in the code, we anticipate attacks in which code is encrypted with keys that are not part of the program but instead come from inputs. We call such techniques input-based obfuscation. To help analysis get ahead of this threat, we propose an input-based obfuscation technique that can hide trigger-based behavior. This attack, called conditional code obfuscation, is presented in Chapter 5.

2.1.4 Analysis Evasion

Since the malware is executed during dynamic analysis, it is possible to write code that performs run-time checks to detect an analyzer and evade analysis by exhibiting random behavior or terminating execution. Earlier attacks that detected debuggers or system call tracers have now evolved to detect specific analyzer environments. For example, attacks detect QEMU [75, 95] because it is used by a wide range of fine-grained or grey-box malware analysis techniques. Since virtual machines are also very useful for performing malware analysis, much malware employs checks to evade these environments by testing for virtual machine monitors such as VMware, Virtual PC, etc.
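The sketch below gives the flavor of such checks (a Linux-flavored illustration of ours; Windows samples typically query the registry, CPUID results, or device names instead). The vendor strings and MAC prefixes are well-known virtualization defaults, but real checks are far more varied:

```python
# Toy environment check of the kind malware performs before misbehaving.
import uuid
from pathlib import Path

VM_VENDORS = ("VMware", "QEMU", "innotek", "Xen")           # DMI vendor strings
VM_MAC_PREFIXES = ("00:05:69", "00:0c:29", "00:50:56",      # VMware OUIs
                   "52:54:00")                              # QEMU/KVM default

def looks_like_vm() -> bool:
    dmi = Path("/sys/class/dmi/id/sys_vendor")
    if dmi.exists() and dmi.read_text().strip().startswith(VM_VENDORS):
        return True
    mac = "{:012x}".format(uuid.getnode())
    mac = ":".join(mac[i:i + 2] for i in range(0, 12, 2))
    return mac.startswith(VM_MAC_PREFIXES)

if looks_like_vm():
    pass  # a sample might terminate here or fall back to benign behavior
```

Chapter 4 addresses exactly this class of evasion by making the analysis environment transparent, rather than trying to hide each artifact individually.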

2.2 Malware Analysis

Over the years, various malware analysis techniques have evolved, driven by the needs of malware detection and by the continuous improvement of the anti-reverse-engineering methods used by malware authors. Although the fundamental traits of malware analysis come from program analysis research, the existence of an adversary who actively tries to defeat the analysis makes the goals and challenges differ greatly.

A number of techniques have utilized static analysis to automate the identification of malware or the generation of signatures. Techniques for generating signatures using static analysis are presented in [40]. Since that approach depends mostly on code patterns, it does not work against transformations such as metamorphism. Semantics-aware detection provided a technique for detecting malware based on the semantics of the program rather than its syntax, making it resilient to a few classes of metamorphic transformations. In [84], static analysis was used to detect rootkit-like patterns in drivers. Finally, [79] uses static analysis to identify spyware by looking for the behavior of leaking private information through browser helper objects (BHOs).

Several automated malware analysis techniques use dynamic analysis [24, 45, 11, 146, 36]. Although earlier systems used system-call tracing as the basis of analysis [36], the need for finer-grained instruction-level analysis made emulators such as QEMU [26] an attractive run-time analysis platform. Recent versions of CWSandbox and VMScope [30] use emulation to perform run-time dynamic analysis for understanding malware behavior. Instruction-level analysis techniques generally use emulators, which allow monitoring the effect of every executed instruction in a system and identifying data modification patterns, while at the same time protecting the analyzer, which resides outside the malware execution environment. Methods such as dynamic tainting [103] and dynamic slicing require instruction-level tracing. Fine-grained instruction-level analysis is especially important for understanding and detecting kernel-level attacks, mainly because kernel-level code does not use a fixed interface such as the system calls used by user-level code, and it can manipulate any data residing in the kernel.

HookFinder [167] performs fine-grained impact analysis using taint propagation to identify how a rootkit places hooks in the kernel execution path. HookMap [157] identifies all potential hooking locations in kernel execution paths. Both systems help in understanding how known rootkits modify legitimate kernel-level behavior and introduce their malicious activities. While HookFinder and HookMap focus on extracting control-flow modifications, K-Tracer [87] can identify both control-flow and data modifications related to a particular system call, using a unique combination of forward and backward slicing techniques to precisely identify modifications to the incoming arguments or outputs of a system call.

Besides kernel malware, fine-grained monitoring and analysis has been used in a large number of user-level malware analysis approaches [168, 52, 122, 76]. Panorama [168] uses full-system taint tracking to analyze privacy-breaching malware. Unpackers such as PolyUnpack [122] and Renovo [76] look for the execution of instructions newly introduced by the malware to identify unpacked code.

Dynamic analysis techniques inherently overcome all anti-static-analysis obfuscations, but they observe only a single execution path. Malware can exploit this limitation by employing trigger-based behaviors, such as time bombs, logic bombs, bot-command inputs, and tests for the presence of analyzers, to hide its intended behavior. The multipath-exploring dynamic analysis technique [100] explores multiple paths in a program by saving the program state at a specific condition, continuing execution on one branch, then restoring the state, solving the memory constraints for the other branch, and continuing execution along that branch.

2.3 Host-Monitoring and Anti-malware Security Tools

An essential component of a host-based malware detection system is the host-based security tool, or antivirus, that actually attempts to detect malware on a given host. While antivirus programs and security tools have evolved over the years to counteract the sophisticated tricks that malware authors have introduced, the general design has not changed. Because the anti-malware tool uses a knowledge base (such as a signature database) containing information that can be used to detect malware, it must gather comparable information from the host being protected in order to find a potential match. This process of gathering information is what we call host-based monitoring.

In the earlier era of antivirus programs [160], such programs were mostly designed to detect viruses. Since the first viruses infected executable files, these antivirus programs would scan memory and the files on disk to detect the existence of viruses. Detection was performed by searching for signature matches, a signature being primarily a sequence of bytes that uniquely identifies a virus. Since the scanning actions in these approaches are not required to coincide with any action of the virus, they can be considered passive in nature.

As antivirus tools became more sophisticated and the need for early detection to prevent widespread damage grew, the tools began to intercept specific events on the host and use information gathered from those events, or available at the time of the event, to detect malware. An example is checking an executable file for changes every time it is opened, to see whether a virus has modified it. Antivirus programs now need to intercept several important events occurring on the host to detect malware, especially because some employ behavior-based heuristics [143]. This monitoring of events can be called active because the monitoring must coincide with a particular event occurring in the system.

Regardless of whether passive or active monitoring is used, the information viewed by the tools needs to be trustworthy, and the tools themselves need their functionality to remain unaltered in order for detection to succeed. Earlier viruses that could manipulate the view of the system seen by an antivirus were called Cobra viruses [146]. These viruses, which existed in the era of MS-DOS, would hook the software interrupts that service application programs and modify the results or behavior of these services. MS-DOS was not designed to restrict such modifications to the system by application programs, so they were easy to make. Today, rootkits [68] achieve similar goals by installing themselves as kernel modules and changing system behavior to hide malicious activities. To succeed, such malware needs to be at least at the same privilege level as the antivirus program in order to modify the probes it has inserted in the system or alter its behavior to neutralize its detection capabilities. Research has therefore moved toward techniques for gaining higher privilege than the kernel using virtualization technology. In the following, earlier approaches in which the security tools reside in the host are discussed first, and then earlier work that uses virtualization for securing and monitoring the host is presented.

2.3.1 Traditional Active Monitoring Inside the Host

A large volume of work that relies on monitoring from within the same operating system as the host falls in the area of host-based intrusion detection. The goal of these security approaches is to protect application programs from external attacks. They therefore involve monitoring events generated by the applications, at granularities ranging from coarse-grained events such as system calls to fine-grained events such as control flow. An important distinction between these approaches and malware detection, however, is that they assume the malicious code arrives as an input to a legitimate application that needs to be protected, and that the operating system is trusted. For malware detection, the entire system must be assumed untrusted, because it could be compromised by malware.

2.3.1.1 System Call based Monitors

Numerous prior systems detect application-level attacks by observing process execution at the granularity of system calls [54, 63, 64, 81, 89, 151, 40, 65, 60, 125]. These approaches build a model of the legitimate behavior of an application and then monitor the application at run-time to detect anomalies. Rather than directly detecting the execution of malicious code injected by an attack that exploits a vulnerability in an application, these tools attempt to detect attacks through the secondary effects of the malicious code or inputs on the system call sequences executed by the process. By allowing attack code to execute, these secondary detectors give attackers opportunities to evade detection. Mimicry attacks [152, 144, 66] succeed by appearing normal to a system-call-sequence-based detector. System call models have grown in complexity to address mimicry attacks, but they remain vulnerable because they allow invalid control flows to execute [83].
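A minimal sketch of this style of detector (in the spirit of the sequence models in the literature; windowing and thresholds are simplified, and real systems add call-site context or automata):

```python
# Record every length-n window of system calls seen during training; at run
# time, flag any window that was never observed as anomalous.
def train(traces: list[list[str]], n: int = 3) -> set[tuple[str, ...]]:
    normal: set[tuple[str, ...]] = set()
    for trace in traces:
        for i in range(len(trace) - n + 1):
            normal.add(tuple(trace[i:i + n]))
    return normal

def anomalies(trace: list[str], normal: set[tuple[str, ...]], n: int = 3):
    return [tuple(trace[i:i + n])
            for i in range(len(trace) - n + 1)
            if tuple(trace[i:i + n]) not in normal]

model = train([["open", "read", "write", "close"]])
print(anomalies(["open", "read", "mmap", "close"], model))  # two unseen windows
```

A mimicry attack, in these terms, is malicious activity arranged so that every window it produces is already in the trained set.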

The general approach of these systems differs from malware detection, which primarily looks for malicious behavior rather than for deviations from legitimate behavior.

2.3.1.2 Inlined Reference Monitoring

Inlined reference monitoring (IRM) has been widely adopted as a faster method of enforcing safety properties in programs by placing the monitor inside the program it is monitoring. This is far more efficient than a reference monitor that resides in the kernel for user-space programs, or in the hypervisor for kernel-space programs. The traditional approach to ensuring the integrity of the monitor itself in IRM techniques has been to enforce specific data-flow and control-flow safety properties throughout the program. For example, SFI (Software Fault Isolation) [153] is a method for letting untrusted programs share the same address space while providing isolation. This is achieved by rewriting specific store operations at compile time so that the target address is masked in a way that prevents writes outside a designated address region. SFI is well suited to application programs that can be modified at compile time; achieving it for the kernel, which may have varying policies, is a hard problem.
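Reduced to its essential address arithmetic, SFI-style store sandboxing looks like the sketch below (illustrative Python with made-up region constants; real SFI rewrites machine-level store instructions at compile time):

```python
# Every rewritten store first forces its target address into the sandbox data
# region, so no untrusted write can escape the region whatever the input is.
REGION_BASE = 0x2000_0000          # sandbox data region (illustrative)
REGION_MASK = 0x000F_FFFF          # region size 1 MiB, a power of two

def sandboxed_store(memory: bytearray, addr: int, value: int) -> None:
    safe = REGION_BASE | (addr & REGION_MASK)   # the inserted masking step
    memory[safe - REGION_BASE] = value          # store lands inside the region

mem = bytearray(REGION_MASK + 1)
sandboxed_store(mem, 0xDEAD_BEEF, 0x41)         # a wild pointer is tamed
```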

Control Flow Integrity (CFI) [14] instruments control-flow instructions with checks, and their possible targets with labels, at compile time, so that at run-time the checks constrain control flow to the static CFG of the program. Since CFI covers all control-flow instructions, it also prevents circumvention of any of its checks. XFI [53] is an extensible fault isolation framework that provides fine-grained, byte-level memory access control; these features, along with protection of the XFI monitoring code, are achieved by combining SFI and CFI. Finally, WIT (Write Integrity Testing) [16] protects against memory corruption attacks by verifying that the targets of write operations are valid, comparing them against a statically and dynamically defined color table. Since write operations also encompass control data, WIT provides integrity of the monitoring code as well as protection from control-flow attacks without requiring CFI.
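The CFI check itself can be reduced to the following sketch (illustrative Python, with function objects standing in for code addresses and an attribute standing in for the label inlined at each valid target):

```python
# Indirect calls are rewritten so that control transfers only to targets
# carrying the expected label, approximating the static CFG at run time.
def label(tag: str):
    def mark(fn):
        fn.cfi_label = tag
        return fn
    return mark

@label("file_handler")
def read_file(path: str) -> None:
    ...

def indirect_call(target, expected: str, *args):
    if getattr(target, "cfi_label", None) != expected:   # the inserted check
        raise RuntimeError("CFI violation: unexpected control transfer")
    return target(*args)

indirect_call(read_file, "file_handler", "/tmp/x")       # allowed transfer
```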

Most of these approaches insert the monitoring code into the application at compile time. The monitoring code can run at the same privilege level as the attack code and still work, because the idea of these approaches is to prevent the malicious code from executing in the first place. Although inlined methods are extremely fast, they are not adequate in a threat model where the entire OS can be compromised and the monitor code needs higher privilege than the malicious code.

2.3.2 Passive External Monitoring Approaches

Virtualization technology has played an important role in systems security. The security benefits gained by using virtualization were first studied in [78, 92]. Several approaches have since been proposed that use virtual machines to provide security in production systems [61, 62, 82, 115]. In general, these approaches exploit the isolation from the guest VM and monitor properties of the guest system to provide security. Passive approaches that use hypervisors and introspection detect attacks after they have occurred, since the monitoring is not initiated by any particular event inside the monitored host.

System Virginity Verifier [124] is a rootkit detection approach that validates kernel code and examines kernel data (including hooks) known to be targets of current rootkits. Copilot [114] uses a separate hardware-based solution in which a trusted add-in PCI card observes the run-time OS memory image and infers possible rootkit presence by detecting kernel code integrity violations. This work was extended in [113] to examine other types of violations, such as kernel data semantic integrity. SecVisor [126] is a tiny hypervisor that uses hardware support to enforce kernel code integrity; its goal is to ensure that only approved code can ever execute at the kernel level, so that code injection attacks in the kernel do not work. State-based control-flow integrity (SBCFI) [115] passively checks control flows in the guest VM at regular intervals to identify whether persistent rootkits have been installed. Since monitoring the entire control flow of the kernel requires noticeable execution time, SBCFI's monitor is invoked only at regular intervals to reduce overhead on a production system. This approach may therefore detect persistent control-flow changes well after they occur.

2.3.3 Active External Monitoring

While passive monitoring has been widely used in the past, Lares [110] recently proposed a method of actively monitoring events in a guest VM. Lares provides a framework for inserting hooks inside the guest OS that invoke a security application residing in another VM when a particular event occurs. The design of Lares enables large, complex security tools, such as antivirus programs [143] or intrusion detection systems, to run in the security VM. The first benefit is isolation from the operating system environment being protected; no malware residing in that environment can access or corrupt the security tool. Second, the invocation of the tool can be controlled, and the probes can be protected by the hypervisor. All this is achieved while retaining the flexibility of active monitoring.

However, the cost of communicating an event notification from one VM to another via the hypervisor makes this approach inappropriate for fine-grained active monitoring. Traditional antivirus programs and security tools need to monitor and intercept a wide range of events on the host. Such requirements can severely affect performance, and the incurred overhead may make the system practically unusable. Our approach discussed in Chapter 7 overcomes these limitations.

2.3.4 Hypervisor-based Active Monitoring Inside Host

Some approaches use hypervisors to provide active monitoring while the protecting code resides in the host. However, they provide only a limited form of functionality that lacks the extensibility and dynamic nature required by complete antivirus programs. Patagonix [91] provides hypervisor support to detect covertly executing binaries. The system can ensure that only memory pages containing allowed code can execute, while unwanted code is neutralized. Similarly, NICKLE [119] mandates that only verified kernel code is fetched for execution in kernel space. However, these systems do not protect kernel hooks from being modified to compromise kernel control flow.

Hooksafe [158] provides a method of fine-grained control-flow checking inside the kernel. Unlike SBCFI, the checks are active in nature, allowing attacks to be detected before they occur. However, like SBCFI, Hooksafe targets control flows inside the kernel that are persistent in nature. The approach achieves a low overhead of around 6% by relocating the protected addresses to a separate page that the hypervisor can guard, and instrumenting the kernel code to check against these addresses.

Although Hooksafe is a good example of an efficient approach that uses a hypervisor to actively detect attacks, it is limited to control-flow checks. An antivirus program can have arbitrary data belonging to it that may require protection from unwanted modification. Our approach, called SIM and presented in Chapter 7, achieves this. However, the techniques of Hooksafe can be used to protect hooks, and the execution paths leading to hooks, that activate an external or a secure internal monitor, providing additional security to both secure active out-of-VM and in-VM approaches.

CHAPTER III

ENABLING STATIC MALWARE ANALYSIS

3.1 Motivation

Dynamic analysis based approaches have arguably offered a better track record, and greater mind share, among researchers working on malware binary analysis. Part of that success is attributable to the challenges of overcoming the formidable obfuscation techniques [139, 143] and packers [132] that are widely utilized by contemporary malware authors. These obfuscation techniques, including function and API call obfuscation and control flow obfuscation, along with a gamut of other protections proposed by the research community [34, 102], have been shown to deter static analyses. While defeating these obfuscations is a prerequisite to conducting meaningful static analyses, they can largely be overcome by those conducting dynamic analyses. Although dynamic analysis provides only a partial, “effects oriented” profile of the full potential of a given malware binary, multipath-exploring dynamic analysis [100] has the potential to improve traditional dynamic analysis by executing code paths for unsatisfied trigger conditions. However, the approach cannot guarantee completeness. If successful, static analysis can provide comparable completeness at a fraction of the cost.

Static program analysis can provide complementary insights to dynamic analyses on those occasions where binary obfuscations can be sufficiently overcome. Static program analysis offers the potential for a more comprehensive assessment and correlation of the code and data of the program. For example, by analyzing the sequence of invoked system calls and APIs, performing control flow analysis, and tracking data segment references, it is possible to infer logical code bombs, temporal triggers, and other malicious system interactions, and from these form higher-level semantics about malicious behavior. Features such as the presence of network communication logic, registry and OS manipulations, and object creations (e.g., files, processes, inter-process communication) can be detected, whether these capabilities are exercised at runtime or not. Static analysis, when presented with a deobfuscated binary, can complement and even inform dynamic program analyses with a more comprehensive picture of the program logic.

Figure 2: The Eureka Malware Binary Deobfuscation Framework

While a large number of obfuscations continue to reduce the effectiveness of directly applying static analysis, techniques such as unpacking can successfully remove some of these obfuscations. There has been substantial prior work in automated unpacking techniques [76, 122, 96]. However, these techniques require fine-grained, instruction-level execution, which gives us an opportunity to explore more efficient techniques that significantly reduce the time taken to unpack a malware instance and subject it to further analysis. In this chapter, we introduce a malware binary deobfuscation framework referred to as Eureka, designed to efficiently facilitate static code analysis. Eureka's coarse-grained, execution-based unpacking method supports both single-pass and multi-pass unpacking, and Eureka also helps identify system call (or Win32 API call) points in the malware by performing API resolution.

3.2 The Eureka Framework

Figure 2 presents an overview of the modules and logical work flow that compose the Eureka framework. The Eureka workflow begins with the packed subject malware binary, which is executed in a VM managed by Eureka. After interrogating the local environment for evidence of tracing or debugging, the malware process enters a phase of unpacking and eventually spawns its core malware payload logic, while in parallel a Eureka kernel driver tracks the execution of the malware binary, periodically evaluating the process for signs that it has unpacked its image. In Section 3.4, we present Eureka's coarse-grained execution tracking algorithm and introduce a novel binary n-gram statistical trigger for evaluating when the unpacked process image has reached a stable state. Once the execution tracker triggers a process image dump, Eureka employs the IDA Pro disassembler [5] to disassemble the image, and then proceeds to conduct API resolution and prepare the code image for static analysis. In Section 3.5, we discuss Eureka's API map recovery module, which provides several automated deobfuscation procedures to recover hidden API invocations that are commonly used to thwart static analysis. Once API resolution is completed, the code image is processed by Eureka's analyzability metrics generation module, which compares several attributes to decide whether static analysis of the unpacked image would yield useful results. Following the presentation of the Eureka framework, we present a corpus evaluation (Section 3.7) to illustrate the usage and effectiveness of Eureka.

3.3 Previous Work

The problem of obfuscated malware has confounded analysts for decades [143]. The first obfuscation techniques exhibited by malware in the wild included viral metamorphism [143] and polymorphism [139]. Several obfuscation approaches have since been presented in the literature, including opaque predicates [34] and, more recently, opaque constants [102]. Packers and executable protectors [132] are often used to automatically add several layers of protection to malware executables. Recent packers and protectors also incorporate API obfuscations that make it hard for analyzers to identify system calls or calls to Windows APIs.

Table 1: Design space of unpackers. Evasions: (1) multiple packing, (2) partial code-revealing multi-layered packing, (3) VM detection, (4) emulator detection.

System      Monitoring    Monitoring    Trigger      Child Process  Output  Execution  Potential
            Environment   Granularity   Types        Monitoring     Layers  Speed      Evasions
PolyUnpack  Inside VM     Instruction   Model-based  No             1       Slow       1,2,3
Renovo      Emulator      Instruction   Heuristic    Yes            many    Slow       2,4
OmniUnpack  Inside VM     Page          Heuristic    No             many    Fast       2,3
Eureka      Inside VM     System Call   Statistical  Yes            1,many  Fast       2,3

Automated unpacking: There have been several recent attempts at building automated and generic tools for unpacking malware, most notably PolyUnpack [122], Renovo [76], and OmniUnpack [96]. Table 1 summarizes the design space of automated unpackers, illustrating their strengths, differences, and common weaknesses.

PolyUnpack, which was the first automated unpacking technique, builds a static model of the program and uses fine-grained execution tracking to detect when an instruction outside of the model is executed. PolyUnpack uses the Windows debugging API to single-step through the process execution. Like PolyUnpack, Renovo uses a fine-grained execution monitoring approach to track unpacking progress, and considers the execution of newly written code as an indicator of unpack completion. Renovo is implemented using the QEMU emulator, which resides outside the execution environment of the malware, and supports multiple layers of unpacking.

OmniUnpack is most similar to Eureka in that it uses a coarse-grained execution tracking approach. However, their granularities differ: OmniUnpack tracks execution at the page level, while Eureka tracks execution at the system call level. OmniUnpack uses the page-level protection mechanisms available in hardware to identify when code is executed from a page that was newly modified.

Static and dynamic malware analysis: Previous work in malware analysis that uses static analysis has primarily focused on malware detection approaches. Known malicious patterns are identified in [40]. The approach of using semantic behavior to thwart specific code obfuscations was presented in [41]. Rootkit behavior detection was presented in [84], and [79] uses a static analysis approach to identify spyware behavior in Browser Helper Objects. Traditional program analysis techniques [104] have been investigated for binary programs in general and malware in particular. Dataflow techniques such as Value Set Analysis [21] aim at recovering the set of possible values that each data object can hold at each program point.

CWSandbox [36] and TTAnalyze [24] are dynamic analysis systems that execute programs in a restricted environment and observe their sequences of system interactions (using system calls). Panorama [168] uses system-wide taint propagation to analyze information flow, which it uses for detecting malware. Bitscope [31] incorporates symbolic execution-based static analysis to analyze malicious behavior.

Statistical analysis: Fileprint analysis [140] studies statistical binary content analysis as a means to identify malicious content embedded in files, finding that n-gram analysis is a useful means to detect anomalous file segments. A further finding is that normal system files and malware can be well classified using 1-gram and 2-gram analysis. While our methodology is similar, the problem differs in that we use bigrams to model unpacked code, independent of whether the code is malicious. N-gram analysis has also been used in other contexts, including anomalous packet detection in network intrusion detection systems such as PAYL [155] and Anagram [154].

3.4 Coarse-grained Execution-based Unpacking

In general, all of the current methods for binary unpacking start with some form of dynamic analysis. Unpacking systems begin their processing by executing the malware binary, allowing it to self-decrypt its malicious payload logic and to then fork control to this newly revealed program logic. One primary way in which unpacking systems distinguish themselves is the approach each takes to monitor the progression of the packed binary's self-decryption process. When the unpacker determines that the process has sufficiently revealed the malicious payload logic, it dumps the malicious process image for use in static analysis.

Much of the variability in unpacking strategies comes from the granularity of monitoring that is used to track the self-decryption progress of the packed binary. Some techniques rely on tracking the progress of the packed process on a per-instruction basis; we refer to this instruction-level monitoring as fine-grained monitoring. Other strategies use more coarse-grained monitoring, such as OmniUnpack, which checkpoints the self-decryption progress of the malicious binary by intercepting interrupts from the page-level protection mechanisms. Eureka, like OmniUnpack, tracks the execution progress of the packed binary image via coarse-grained checkpointing. However, rather than using page interrupts, Eureka tracks the malicious process via the system call interface. Eureka's coarse-grained execution tracker operates as a kernel driver that dumps the malicious process image for disassembly when it believes that the malicious payload logic has been sufficiently revealed. In the following, we present two different methods for deciding when to dump the malicious process image: a heuristic-based method, which works for most contemporary malware, and a statistical n-gram analysis method, which is more robust.

3.4.1 Heuristics-based unpacking

Eureka's principal method of unpacking is to follow the execution of the malware program by tracking its progress at the system call level. Among the advantages of this approach, the progression of the self-decrypting process image can be tracked with very little overhead. Each system call indicates that a particular interesting event is occurring in the executing malware. Eureka employs a Windows-driver-based unpacker that hooks the Windows SSDT (System Service Dispatch Table). The driver executes a callback routine when a system call is invoked from a user-level program. We use a filtering approach based on the process ID (PID) of the process invoking the system call. A user-level program initiates the execution of the malware and informs the Eureka driver of the malware's PID.

The heuristics-based unpacking approach of Eureka exploits a simple strategy: it uses the event of program exit to trigger a snapshot of the malware's virtual memory address space. That is, the system call NtTerminateProcess is used to trigger the dumping of the malware process image, under the assumption that the use of this API implies that the unpacked malicious payload has been successfully decrypted, spawned, and is now ending. Another noticeable behavior we found in a large number of malware programs is that the malware spawns its own executable as another process. We believe this is a widely used technique to detach from debuggers or system call tracers that trace only the initial malware process. Thus, Eureka also employs a simple heuristic that dumps the malware image during the execution of the NtCreateProcess system call; with these heuristics, we found that a large fraction of current malware programs were successfully unpacked.

There are several weaknesses in this simple heuristics-based approach. Not all malware programs exit; some keep an executing version resident in memory. Moreover, although the above two heuristics may work for a large fraction of malware today, the same may not hold for future malware. With knowledge of these heuristics, packers may incorporate process creation as part of the unpacking process itself, which would mean that unpacking may not have completed when the NtCreateProcess system call is intercepted. Malware authors can also simply avoid exiting the malware process, avoiding the use of the NtTerminateProcess system call. Nevertheless, these basic and efficient heuristics demonstrate that simple and straightforward mechanisms can be effective in unpacking a significant fraction of today's malware (as much as 80% of the malware analyzed in our corpus experiments, Section 3.7). Where these heuristics fail, our statistical n-gram strategy provides a more than sufficient complement to unpack the remaining malware.
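To make the two dump triggers concrete, the following minimal Python sketch mirrors the decision logic described above. It is illustrative only: the names (on_syscall, dump_process_image, MONITORED_PIDS) are hypothetical, and Eureka's real implementation is a Windows kernel driver whose SSDT hook callback performs this check.

    # Illustrative sketch of Eureka's heuristic dump triggers; the real logic
    # runs inside a Windows kernel driver hooking the SSDT (names hypothetical).

    MONITORED_PIDS = set()   # the malware's PID (and any children), set by
                             # the user-level program that launches the sample

    def on_syscall(pid, syscall_name, dump_process_image):
        """Called for every intercepted system call."""
        if pid not in MONITORED_PIDS:
            return  # PID filter: ignore processes unrelated to the sample

        if syscall_name == "NtCreateProcess":
            # The sample is respawning itself: dump before control moves on.
            dump_process_image(pid)
        elif syscall_name == "NtTerminateProcess":
            # Program exit implies the unpacked payload has already run.
            dump_process_image(pid)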

3.4.2 Statistics-based unpacking

As an alternative to its system-call heuristics, Eureka also tracks the statistical distribution of executable memory regions. In developing this approach, we are motivated by the simple premise that unpacked executables have fundamentally different statistical properties that can be exploited to determine when a malware program has fully unpacked itself. A Windows PE (portable executable) is composed of several different types of regions. These include file headers and data directories, code sections (typically labeled .text), and data sections (typically labeled .data). Intuitively, as the malware unpacks itself, we expect the code-to-data ratio to increase, so tracking the volume of code and data in the executable should provide a measure of the progress of unpacking. However, several potential complications could arise that must be considered:

• Code and data are often interleaved, especially in malicious executables.

• Data directory regions, such as import tables, that have statistically similar properties to data sections (i.e., ASCII data) are embedded within code sections.

• Properties of data sections holding packed code might vary greatly based on the packer and differ significantly from data sections in benign executables.

To address these issues, we develop an approach that models statistical properties of unpacked code. Our approach is based on two observations. First, code has certain intrinsic properties that tend to be invariant across executables (e.g., certain opcodes, registers, and instruction sequences are more prevalent than others). These statistical properties may be used to measure relative changes in the volume of unpacked code. Second, we expect the volume of unpacked code to be strictly increasing as a packed malware executes and unravels itself. We find that both assertions hold for the vast majority of malware and across most packers.

Mining statistical patterns in x86 code: As a means to study typical and frequently occurring patterns in x86 code, we began by looking at a small collection of benign PE executables. A natural way to search for such patterns is to use a simple n-gram analysis. Specifically, we were interested in using n-gram analysis to build models of the sections of these executables that contained x86 instructions. Our first approach was to simply extract the entire sections that the PE header labeled as code. However, we found that large portions of these sections also contained long sequences of ASCII data that were not x86 instructions, e.g., data directories or DLL names, which biased our analysis. To alleviate this bias, we used the IDA Pro disassembler to extract the regions of these executables that were marked as functions, by looking for arguments to the MakeFunction calls in the IDC file. We then performed bigram analysis on this data. We chose bigrams because x86 opcodes tend to be either one or two bytes long. By looking at frequently occurring bigrams, we are looking at the most common opcode pairs or 2-byte opcodes. Once we developed a list of the most common bigrams for the benign executables, we used objdump output to evaluate whether the bigrams occur in opcodes or operands (addresses, registers). Intuitively, one expects the former to be more reliable than the latter. We provide a summary in Table 2. Based on this analysis, we selected FF 15 (call) and FF 75 (push) as two candidate bigrams that are prevalent in x86 code. We also looked for spaced bigrams (byte pairs separated by one or more bytes). We found that the call instruction with a one-byte opcode (E8) takes a relative offset, and the last byte of this offset invariably ends up being 00 or FF, depending on whether the offset is positive or negative. Thus, high frequencies of E8 - - - 00 and E8 - - - FF are also indicative of x86 code.
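As one way to see how these bigram statistics can be gathered, the sketch below counts the two candidate code bigrams and the two spaced call bigrams over a raw image; the function name and dictionary keys are ours, not Eureka's.

    # Count candidate x86 bigrams in a raw memory image: FF 15 (call),
    # FF 75 (push), and the spaced bigrams E8 xx xx xx 00 / E8 xx xx xx FF
    # (E8 is a call whose 4-byte relative offset ends in 00 or FF).

    def count_code_bigrams(image: bytes) -> dict:
        counts = {"FF 15": 0, "FF 75": 0, "E8 - - - 00": 0, "E8 - - - FF": 0}
        for i in range(len(image) - 1):
            if image[i] == 0xFF and image[i + 1] == 0x15:
                counts["FF 15"] += 1
            elif image[i] == 0xFF and image[i + 1] == 0x75:
                counts["FF 75"] += 1
            if image[i] == 0xE8 and i + 4 < len(image):
                if image[i + 4] == 0x00:
                    counts["E8 - - - 00"] += 1
                elif image[i + 4] == 0xFF:
                    counts["E8 - - - FF"] += 1
        return counts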

To evaluate the feasibility of this approach, we examined bigram distributions on a corpus of 1291 malware instances. We first unpacked each of these instances using our heuristic-based unpacker and then evaluated the quality of unpacking by examining the code-to-data ratio in an IDA Pro disassembly. We found that the heuristic-based unpacker did not produce a useful unpacking in 201 instances (small amount of code and low code-to-data ratio in the IDA disassembly). Of the remaining 1090 binaries, we labeled 125 binaries as originally unpacked (significant amount of code and high code-to-data ratio in both packed and unpacked disassemblies) and 965 as successfully unpacked (significant amount of code and high code-to-data ratio only in the disassembly of the unpacked executable). Using counts of the aforementioned bigrams, we were able to produce output consistent with that of the IDA disassembly evaluation. We correctly identified all 201 instances of still-packed binaries, all 125 instances of originally unpacked binaries, and 922 (out of 965) instances of the successfully unpacked binaries. In summary, this simple bigram counting approach had over a 95% success rate in distinguishing between packed and unpacked malware instances.

Table 2: Occurrence summary of bigrams

Bigrams               calc      explorer   ipconfig  lpr      mshearts  notepad  ping     shutdown
                      (117 KB)  (1010 KB)  (59 KB)   (11 KB)  (131 KB)  (72 KB)  (21 KB)  (23 KB)
FF 15 (call)          246       3045       184       24       192       415      58       132
FF 75 (push)          235       2494       272       33       274       254      41       63
E8 - - - 0xff (call)  1583      2201       181       19       369       180      87       49
E8 - - - 0x00 (call)  746       1091       152       62       641       108      57       66

STOP – Statistical Test for Online unPacking: Inspired by the results from the offline bigram counting experiments, Eureka incorporates STOP, an online algorithm for determining the terminating (or dumping) condition. We pose the problem as a simple hypothesis test that checks for an increase in the mean value of the bigram counts. Our null hypothesis is that the mean value of x86 instruction bigrams has not increased. We would like to conclude that the mean value has increased when we see a consistent and significant shift in the bigram counts. Let us assume that we have the prior mean (µ0) for the candidate x86 instruction bigrams, and that we have a sample of N recent bigram counts. We assume that this sample is normally distributed with mean value µ1 and standard deviation σ1. We compute z0 = (µ1 − µ0)/σ1. If z0 > 1.645, then we reject the null hypothesis (with a confidence level of 0.95 for a normal distribution). We have integrated the STOP algorithm into our Eureka execution tracking module. STOP parameters include the ability to compute the mean value of particular bigrams at each system call, every n system calls for a given value of n, or only when certain anomalous system calls are invoked.
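The following sketch restates the STOP test in Python under the same assumptions as the text (a known prior mean and a normally distributed window of recent counts); the function name and window handling are illustrative.

    # STOP's hypothesis test: reject the null ("bigram mean has not increased")
    # when z0 = (mu1 - mu0) / sigma1 exceeds the 0.95 one-sided critical value.

    from statistics import mean, stdev

    Z_CRITICAL = 1.645  # 0.95 confidence level for a normal distribution

    def stop_should_dump(mu0: float, recent_counts: list) -> bool:
        if len(recent_counts) < 2:
            return False          # need a sample to estimate the deviation
        mu1 = mean(recent_counts)
        sigma1 = stdev(recent_counts)
        if sigma1 == 0:
            return mu1 > mu0      # degenerate window: no variation at all
        z0 = (mu1 - mu0) / sigma1
        return z0 > Z_CRITICAL    # significant rise in unpacked-code bigrams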

3.5 API Resolution Techniques

User-level malware programs require the invocation of system calls to interact with the OS and perform malicious actions. Therefore, analyzing and extracting malicious behaviors from these programs requires the identification of invoked system calls. Besides the predefined mechanism of system calls that requires trapping to the kernel, application programs may interact with the operating system via higher-level shared helper modules. For example, in Windows, the Win32 API is a collection of services provided by helper DLLs that reside in user space, while the native APIs are services provided by the kernel. In such a design, the user-level API allows a higher-level understanding of behavior, because most of the semantic information is lost at the native level. Therefore, an in-depth binary static analysis requires the identification of all Windows API calls, and call sequences, made within the program.

Obfuscations that impede analysis by hiding API calls have become prevalent in malware. Analyzers such as IDA Pro [5] and OllyDbg [6] support the standard loading and linking method of binaries with DLLs, which modern packers bypass. Rather, packers employ a variety of nonstandard techniques to link or connect call sites with the intended API function residing in a DLL. We refer to the task of deobfuscating or identifying Windows API function targets from the image of a previously packed malware binary, no matter how they are referenced, as obfuscated API resolution. In this section, we first provide background on how normal API resolution occurs in Windows, and then contrast this with how Eureka handles the problems of obfuscated API resolution. These analyses are performed on IDA Pro's disassembly of the unpacked binary, as produced by Eureka's automated unpacker.

Figure 3: Example of the standard linking mechanism of PE executables in Windows

3.5.1 Background: standard API resolution

Understanding the challenges of obfuscated API resolution first requires an understanding of how packers typically avoid the standard methods of linking API functions that reside in user-level DLLs. The Windows process loader and linker are responsible for linking DLLs with a PE (Portable Executable) binary. Figure 3 illustrates a high-level view of the mechanism. Each executable contains an import table directory, which consists of entries corresponding to each DLL it imports. The entries point to tables containing names or ordinals for the functions that need to be imported from a specific DLL. When the binary is loaded, the required DLLs are mapped into the memory address space of the application, and the export table in each DLL is used to determine the virtual addresses of the functions that need to be linked. A table called the Import Address Table (IAT) is filled in by the loader and linker with the virtual addresses of each imported function. This table is referred to by indirect control flow instructions in the program to call the functions in the linked DLLs.

3.5.2 Resolving obfuscated APIs without the import tables and IAT

Packers avoid using the standard linking mechanism by removing entries from the import directory of the packed binaries. For the program to function as before after unpacking, the logic of loading the DLLs and linking the program with the API functions is incorporated into the program itself. Among other methods, this may include explicit invocations of the GetProcAddress and LoadLibrary API calls.1 The LoadLibrary API provides a method of mapping a DLL into a process's address space during execution, and the GetProcAddress API returns the virtual address of an API function in a loaded DLL.

1 In most cases, at least these two API functions are kept in the import table, or their addresses are hard-coded in the program.

Let us assume that the IAT defined in a malware executable's header is incomplete, corrupt, or not used at all. Let us further assume that the unpacking routine may include entries in the IAT that are planted to mislead naive analysis attempts. Moreover, the malware executable has the power to recreate a similar table at any memory location of its choosing, or to use methods that do not require table-like data structures at all. The objective of Eureka's API resolution module is to resolve APIs in such cases in order to facilitate the static analysis of the executable. In the following, we outline the strategies used by the Eureka API resolution module to accomplish these deobfuscations, presented in increasing order of complexity.

3.5.2.1 Handling DLL Obfuscations

DLLs loaded at standard virtual addresses: By default, DLLs are loaded at the virtual address specified as the image base address in the DLL's PE header. The specified bases of the standard Windows Win32 DLLs do not clash with each other. Therefore, absent intervention, the loader and linker load all of these DLLs at their specified base virtual addresses. By assuming this is the case, a table of probable virtual addresses for each exported API function of these DLLs can be built. This simple method has been found to work for many unpacked binary malware images. For example, on Windows XP Service Pack 2, KERNEL32.DLL has a default image base address of 0x7C800000. The RVA (relative virtual address) of the API GetProcessId is 0x60C75, making its default virtual address 0x7C860C75. In such cases, Eureka's analysis proceeds as follows to reconstruct API associations. For each Win32 DLL Di, let Bi be the default base address. Also, let there be ki exported API functions, where each function Fi,j has the RVA Ri,j. Eureka builds a database of virtual addresses Vi,j = Bi + Ri,j and their corresponding API functions. Whenever Eureka finds a call site c with resolved target address A(c), it searches all Vi,j to identify the API function target. We find that this method works as long as the DLLs are loaded at their default base addresses.

DLLs loaded at arbitrary virtual addresses: To make identification of an API harder, there may be cases where a DLL is loaded at a nonstandard base address through system calls that explicitly map it into a different address space. As a result, an address found during analysis of the unpacked binary may not appear in the computed virtual address set. In this case, we can utilize some of the dynamic information captured while running the malware (in many cases, this information can be harvested during Eureka's unpacking phase). The idea is to use runtime information from the native system calls that are used to map DLLs and modules into the virtual address space of an application. Since our unpacker traces native system calls, we can look for specific calls to NtOpenSection and NtMapViewOfSection. The former system call identifies the DLL name, and the latter provides the base address where it is loaded. Eureka correlates these two calls using the handle returned by the first system call.
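The handle-based correlation can be sketched as below; the trace record format is hypothetical, standing in for the events Eureka's driver logs during unpacking.

    # Correlate NtOpenSection (names the DLL) with NtMapViewOfSection
    # (reveals the actual base address) via the returned section handle.

    def actual_dll_bases(syscall_trace):
        handle_to_dll = {}
        dll_bases = {}
        for ev in syscall_trace:
            if ev["syscall"] == "NtOpenSection":
                handle_to_dll[ev["handle"]] = ev["section_name"]
            elif ev["syscall"] == "NtMapViewOfSection":
                dll = handle_to_dll.get(ev["handle"])
                if dll is not None:
                    dll_bases[dll] = ev["base_address"]
        return dll_bases  # substitute these bases for the defaults in V_ij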

3.5.2.2 API resolution for statically identifiable targets

One way to identify an invocation of an API function without relying on the import directory of the unpacked image is by testing the targets of call sites to see whether they point to specific API functions. We assume that a call site may use an indirect call or a jump instruction. Such instructions may involve a pointer directly, or may use a register that is loaded with an address by an earlier instruction. To identify targets in a generic manner, Eureka uses static analysis on the unpacked disassembly.

Eureka starts by performing control flow analysis on the program. The use of IDA Pro disassembly simplifies analysis by marking subroutine boundaries and inter-procedural control flows. Furthermore, control flow instructions whose statically identified targets reside within the program are also resolved. In addition, IDA Pro identifies any valid API calls through the import directory and the IAT. Eureka's analysis task, then, is to resolve unknown static or statically resolvable target addresses in control flow instructions. These are potential calls to API functions residing in DLLs. Our algorithm proceeds as follows. First, Eureka identifies functions in the disassembly (marked as subroutines using the SUB markers). For each function, the control flow graph is built by identifying basic blocks as nodes and the static intra-procedural control flow instructions that connect them as edges. Eureka then models inter-procedural control flow by observing CALL or JMP instructions to subroutines that IDA already identifies. It selects any remaining such instructions with an unrecognized target as potential API call sites. For these instructions, Eureka uses static analysis to identify the absolute memory address to which they will transfer control.

We now use a simple notation to express the x86 instructions that Eureka analyzes. Let the set of all instructions be I. For any instruction i ∈ I, we use the notation S(i) for the source operand, if one exists, and T(i) for the target operand. The operands may be immediate values, memory pointer indirections, or registers. Suppose the set of potential API call instructions is C ⊆ I. Our goal is to find the target address of a potential API call instruction c, which we express as A(c). For instructions with immediate addresses, A(c) can be found directly from the instruction. For indirect control transfers using a pointer, such as CALL [X], Eureka considers the static value stored at address X as a target. Since Eureka uses the disassembly generated by IDA, the static value at address X is included as a data definition with the name dword_X. For register-based control transfers, Eureka needs to identify the value loaded into the register at the point of initiating the transfer. Some previous instruction can load the register with a value read from memory. A generic way to identify the target is to extract a sequence of instructions that initially loads a value from a specific memory address into a register, from which it is subsequently moved to the register used in the control-transfer instruction. Eureka resorts to dataflow analysis to solve these cases. Using standard dataflow analysis at the intra-procedural level, Eureka identifies def-use instruction pairs. A def-use pair (d, u) is a pair of instructions in which the latter instruction u uses an operand that is defined in d, and there is a control flow path between the two instructions with no other definition of that operand in between. For example, a MOV ESI, EAX followed by a CALL ESI instruction, with no other redefinition of ESI, forms a def-use pair for the register ESI. To find the value loaded in the register at the call site, Eureka starts from a potential call site instruction and identifies a chain of def-use pairs that ends at this instruction and involves only register operands. The first pair in the chain therefore contains a def that loads into a register a value from memory or an immediate value, which is subsequently propagated to the call site. Figure 4(a) illustrates these cases.

Figure 4: Illustration of Static Analysis Approaches used to Identify API Targets
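A simplified Python rendering of this backward def-use walk is shown below. It operates on instruction triples of our own devising and ignores branching control flow, so it is a sketch of the idea rather than Eureka's IDA-based dataflow analysis.

    # Walk a def-use chain backward from a register-based call site until a
    # def loads from memory or an immediate; instructions are simplified
    # (mnemonic, dst, src) triples, registers are strings, addresses are ints.

    def resolve_register_target(instrs, call_idx):
        wanted = instrs[call_idx][1]       # register used by CALL/JMP
        for i in range(call_idx - 1, -1, -1):
            mnem, dst, src = instrs[i]
            if mnem == "MOV" and dst == wanted:
                if isinstance(src, str):   # register-to-register copy:
                    wanted = src           # continue up the def-use chain
                else:
                    return src             # memory address or immediate
        return None                        # not statically resolvable

    # Example mirroring the text: a load of EAX from dword_403000 followed
    # by MOV ESI, EAX ; CALL ESI:
    #   code = [("MOV", "EAX", 0x403000), ("MOV", "ESI", "EAX"),
    #           ("CALL", "ESI", None)]
    #   resolve_register_target(code, 2)  -> 0x403000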

The next phase is to determine whether the address A(c) for a call site c is indeed an API function; if so, Eureka resolves its API name.

3.5.2.3 API resolution for dynamically computed addresses

In some cases, the resolved target address A(c) may be uninitialized. This can happen if the snapshot is taken at a point during the execution when the resolution of the API address has not yet taken place in the malware code. It may also be the case that the address is supposed to be returned by a system call such as GetProcAddress and thus is not contained in the unpacked memory image. In such cases, Eureka attempts to analyze the malware code and extract the portion of code that is supposed to update this address, by identifying instructions that write to the memory location that contained A(c). For each of these instructions, Eureka constructs def-use chains and identifies where they are initiated. If there is a call to GetProcAddress in the control flow path, Eureka identifies the arguments pushed onto the stack before the call. Since the API name is one of the arguments, Eureka can directly identify the name of the API whose address is returned and stored in the pointer. Figure 4(b) illustrates a sample code template and how our analysis propagates the results of GetProcAddress to call sites.
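The GetProcAddress propagation can likewise be sketched over the same simplified instruction triples; the pattern matching below (a PUSH of the API-name string, the call, then a store of EAX) is only an illustration of Figure 4(b), not the full analysis.

    # Bind the pointer a call site reads to the API name whose address
    # GetProcAddress returned: find the name PUSHed before the call and
    # the location the returned EAX is stored into.

    def bind_getprocaddress(instrs):
        bindings = {}                  # storage location -> API name
        for i, (mnem, dst, _src) in enumerate(instrs):
            if mnem == "CALL" and dst == "GetProcAddress":
                # The API name is the string argument pushed before the call
                # (the other argument, the module handle, is numeric here).
                name = next((s for m, _d, s in reversed(instrs[:i])
                             if m == "PUSH" and isinstance(s, str)), None)
                if name is None:
                    continue
                # The return value in EAX is stored to the pointer that the
                # obfuscated call site later reads.
                for m2, d2, s2 in instrs[i + 1:]:
                    if m2 == "MOV" and s2 == "EAX":
                        bindings[d2] = name
                        break
        return bindings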

3.6 Evaluation Metrics

We consider the problems of measuring and improving analyzability after API resolution. Although a manual inspection can determine the quality of the output and its suitability for applying static analysis, in a large corpus of thousands of malware programs, automated methods for performing this step are essential. Technically, without knowledge of the original malware code, it is impossible to conclude precisely how successfully the obfuscations applied to the code have been removed. Nevertheless, several heuristics can aid malware analysts and other post-unpacking static analysis tools in deciding which unpacked binaries can be analyzed successfully and which require further attempts at deobfuscation. Poor analyzability metrics could further help detect when previously successful malware deobfuscation strategies are no longer effective, possibly due to new countermeasures employed by malware developers to thwart the unpacking logic. Here we present the heuristics that we have incorporated in Eureka to express the quality of the disassembled process image and its potential analyzability in subsequent static analyses.

Code-to-data ratio: An observable difference between packed code and unpacked code is the amount of identifiable code and data found in the binary. Although differentiating between code and data in x86 variable-length instruction streams is a known hard problem, in practice state-of-the-art disassemblers and analyzers such as IDA Pro are quite capable of identifying code by recursively traversing it and by taking into account specific valid code sequences. These methods tend to err on the side of treating code as data, rather than the other way around. Therefore, if IDA Pro identifies something as code, it can be taken with confidence that it is actual code. The amount of code identified in the output of an unpacker can thus be used as a reasonable indication of how completely the binary was unpacked.

Since there is no ground truth on the amount of code in the original malware binary prior to its packing, we have no absolute measure against which to compare the quality of the unpacked results. However, empirically, we find that the ratio of code to data found in the unpacked binary is a useful analyzability metric. Usually, any sequence of bytes that is not identified as code is treated as data by IDA Pro. In the disassembled code, these data are represented using the data definition assembler mnemonics db, dw, or dd. We use the ratio of code to data identified by IDA Pro as an indication of unpacking quality. The challenge with this measurement is in identifying the threshold above which we can conclude that unpacking was successful. We used an empirical approach to determine a suitable threshold for this purpose. When experimenting with packed and unpacked binaries of benign programs, we observed that the amount of identified code is very low for almost all packer-generated binaries. There were slight variations depending on the unpacking code inserted by the packer; still, we found the ratio to be well below 3% in all cases. Although the ratio of code to data increased significantly after unpacking, it was not equal to that of the original benign program prior to packing, because the unpacked image still contained the packed data in memory, which appeared as data definitions in the disassembly. We found that most of the successfully unpacked disassemblies had code-to-data ratios well above 50%. Eureka uses this 50% threshold as the indicator of valid unpacking.

API resolution success: When attempting to conduct a meaningful static analysis of an unpacked binary, one of the most important requirements is the proper identification of control flow, whether it relates to Windows APIs or to the malware's internal functions. Incomplete control flow can adversely affect all aspects of static analysis. One of the main obstacles to control flow analysis is the existence of indirect control flow instructions whose targets are not statically identifiable and can be derived only by dynamic means. Our API resolution method, presented in Section 3.5, tries to identify the targets of call sites that were not identified by IDA Pro. If a target is not resolvable, it may be a call to an API function that was successfully obfuscated beyond the reversal techniques used by Eureka, or it may be a dynamically computed call to an internal function. In both cases, we lose information about the control flow behavior from that point in the program. By taking success and failure scenarios into account, we can compute the ratio of resolved APIs and treat it as an indication of the quality of subsequent static analysis. Our API resolution quality p is expressed as the percentage of all potential API call sites (indirect or register-based calls with unresolved targets) that were successfully resolved. A higher value of p indicates that the resulting deobfuscated Eureka binary will be suitable for supporting static analyses that enable more in-depth behavioral characterization.

3.7 Experimental Results

We now evaluate the effectiveness of Eureka using three different datasets. First, we measure how Eureka and other unpackers handle various common packers, using a dataset of packed benign executables. Next, we evaluate how Eureka performs on two recent malware collections: a corpus of 481 malicious executables obtained from spam traps and a corpus of 435 malicious executables obtained from our honeynet.

Table 3: Evaluation of Eureka, PolyUnpack and Renovo: √ = unpacked; ⊗ = partially unpacked; × = unpack failed.

Packer           PolyUnpack   Renovo      Eureka      Eureka API
                 Unpacking    Unpacking   Unpacking   Resolution
Armadillo        ×            ⊗           √           64%
Aspack 2.12      ⊗            √           √           99%
Asprotect 1.35   ⊗            √           ×           –
ExeCryptor       ⊗            √           √           2%
ExeStealth 2     ×            √           √           97%
FSG 2.0          √            √           √           0%
MEW 1.1          √            √           √           97%
MoleBoxPro       ×            √           √           98%
Morphine 1.2     ⊗            √           √           0%
Obsidium         ×            ×           √           99%
PeCompact 2      ×            √           √           99%
Themida          ×            ⊗           ⊗           –
UPX 3.02         √            √           √           99%
WinUPack 3.99    ⊗            √           √           99%
Yoda 3.53        ⊗            ⊗           √           97%

3.7.1 Benign dataset evaluation: Goat test

We evaluate Eureka using a dataset of packed benign executables. Specifically, we used several common packers to pack an instance of the popular executable notepad.exe. An advantage of testing with a dataset of custom-packed benign executables is that we have ground truth for what each binary is packed with, and we know exactly what to expect after unpacking. This makes it easier to evaluate the quality of the unpacking results. We compare the unpacking capability of Eureka to that of PolyUnpack (using a limited-distribution version obtained from the author) and Renovo (by submitting samples to the BitBlaze malware analysis service [28]). We were unable to acquire OmniUnpack for our tests.

These results are summarized in Table 3. In cases where an output was produced, we used Eureka's code-to-data ratio heuristic to determine whether it was successfully unpacked, and we also manually verified the results of the heuristic. For Renovo, we compare against the last layer produced in the case of multiple unpacked layers. The results show that Eureka performs well compared to the other unpacking solutions. Eureka was successful in all cases except Asprotect, which interfered with Eureka's driver, and Themida, where the output was an altered unpacking with API calls emulated. In Figure 5, we illustrate how the bigram counts change as Eureka executes, for three of the packers. We find that in most cases the bigram counts change synchronously and very sharply (as with ASPack), making it easy to determine appropriate points for snapshotting execution images. We find that Eureka is also robust to packers that naively employ multiple layers, such as MoleBox, and to some incremental packers, such as Armadillo.

In this comparison study, PolyUnpack failed in many instances, including cases where it unveiled just a single layer of packing while the output still remained packed. We suspect that aggressively implemented anti-debugging features might be impairing its current success. Renovo, on the other hand, provided several unpacked layers in all cases except Obsidium. Further analysis of the output, however, revealed that in some cases the binary was not completely unpacked. Finally, our results show that Eureka's API resolution technique was able to determine almost all APIs for most packers but failed considerably for some others. In particular, we found that ExeCryptor and FSG use a large amount of code rewriting to obfuscate API calls, including the use of arbitrary combinations of complex instruction sequences to dynamically compute the targets.

3.7.2 Malicious dataset evaluation

Spam corpus evaluation: We begin by evaluating how Eureka performs on a corpus of 481 malicious executables obtained from spam traps. The results are very encouraging: Eureka was able to successfully unpack 470 of the 481 executables. Of these 470 executables, 401 were successfully unpacked using only the heuristic-based unpacker; the remainder could be unpacked only by using Eureka's bigram statistical hypothesis test. We summarize Eureka's results in Tables 4 and 5. Table 4 illustrates the various packers used (as classified by PEiD) and describes how the effectiveness of Eureka varies across packers. Table 5 classifies the dataset based on antivirus (AV) labels obtained from Virus-Total [149], illustrating how Eureka's effectiveness varies across malware families and validating the quality of Eureka's unpacking.

45 500 2000

2000 400 FF 15 FF 15 FF 15 FF 75 FF 75 FF 75 E8 - - - FF E8 - - - FF E8 - - - FF E8 - - - 00 E8 - - - 00 E8 - - - 00 300

1000

Bigram Count 1000 Bigram Count 200 Bigram Count

100

0 100 200 300 400 500 600 0 300 600 900 1200 1500 1800 0 100 200 300 400 500 600 System Call Count System Call Count System Call Count

Figure 5: Bigram counts during execution of goat file packed with Aspack(left), Molebox(center), Armadillo(right).

Table 4: Eureka performance by packer distribution on the spam malware corpus.

Packer          Count   Eureka       Eureka API
                        Unpacking    Resolution
Unknown         186     184          85%
UPX             134     132          78%
Warning:Virus   79      79           79%
PEX             18      18           58%
MEW             12      11           70%
Rest (10)       52      46           83%

Table 5: Eureka performance by malware family distribution on the spam malware corpus.

Malware Family   Count   Eureka       Eureka API
                         Unpacking    Resolution
TRSmall          98      98           93%
TRDldr           63      61           48%
                 67      67           84%
                 45      44           99%
Klez             77      77           78%
Rest (39)        131     123          78%


Honeynet corpus evaluation: Next, we evaluate how our system performs on a corpus of 435 malicious executables obtained from our honeynet deployment. We found that 178 were packed with Themida; in these cases, Eureka is only able to obtain an altered execution image.2 These results highlight the importance of building better analysis tools that can deal with this important problem. Of the remaining 257 binaries, 20 did not execute on Windows XP (either because they were corrupted or because we could not determine the right execution environments). Eureka is able to successfully unpack 228 of the 237 remaining binaries and produce successful API resolutions in most cases. We summarize the results of analyzing these 237 binaries in Tables 6 and 7. Table 6 illustrates the distribution of the various packers used in this dataset (as classified by PEiD) and describes how the effectiveness of Eureka varies across the packers. Table 7 classifies the dataset based on AV labels obtained from Virus-Total and illustrates how the effectiveness of Eureka varies across malware families.

2 As we see from Table 3, this class of packers also poses a problem for the other unpackers.

Table 6: Eureka performance by packer distribution on the honeynet malware corpus minus Themida.

Packer     Count   Eureka       Eureka API
                   Unpacking    Resolution
PolyEne    109     109          97%
FSG        36      35           94%
Unknown    33      29           67%
ASPack     23      22           93%
tElock     9       9            91%
Rest (9)   27      24           62%

Table 7: Eureka performance by malware family on the honeynet malware corpus minus Themida.

Malware Family   Count   Eureka       Eureka API
                         Unpacking    Resolution
Korgo            70      70           86%
Virut            24      24           90%
Padobot          21      21           82%
Sality           17      17           96%
Parite           15      15           96%
Rest (19)        90      81           90%

3.8 Summary

We have presented the Eureka malware deobfuscation framework, which assists in the automated preparation of malware binaries for static analysis. Eureka distinguishes itself from existing unpacking systems in several important ways. First, it introduces a new methodology for automated malware unpacking that uses coarse-grained NTDLL system call monitoring. The system provides support for both statistical and heuristic-based unpacking triggers and allows child process monitoring. Second, Eureka includes an API resolution system that is capable of overcoming several contemporary malware API address obfuscation strategies. Finally, Eureka includes an analyzability assessment module, which simplifies graph structure and automatically generates and annotates nodes in the call graph with ontology labels based on API calls and data references. While the post-unpacking analyses are novel to our system, they are complementary and could be integrated into other unpacking tools.

Our results demonstrate that Eureka successfully unpacks the majority of packers (13 of 15) and that its performance is comparable to other automated unpackers. Furthermore, Eureka is able to resolve most API references and produce binaries that result in analyzable disassemblies. We evaluated Eureka on two collections of malware: a spam malware corpus and a honeynet malware corpus. We find that Eureka is highly successful in unpacking the spam corpus (470 of 481 executables), reasonably successful in unpacking the honeynet corpus (complete dumps for 228 of 435 executables and altered dumps for 178 of 435 executables), and produces useful API resolutions. Finally, our runtime performance results validate that the Eureka workflow is highly streamlined and efficient, capable of unpacking more than 90 binaries per hour. Eureka is now available as a free Internet service at http://eureka.cyber-ta.org.

Although the unpacking system is flexible and fast, it uses a kernel module to aid in tracking system calls. To improve the robustness of the presented implementation, the techniques of the next chapter for making dynamic analysis robust can be utilized to thwart the detection attacks that might expose Eureka's unpacker. Overall, Eureka shows how specific classes of malware obfuscations can be removed efficiently to enable static analysis. There are a large number of obfuscations that it cannot remove, including anti-disassembly tricks, opaque predicates, page-by-page packing, etc. As malware authors continue to adopt more sophisticated obfuscations, the effectiveness of Eureka may decline over time. However, because it is an efficient method, it can serve as a first layer of analysis that quickly selects, from the entire pool, those malware samples that can be deobfuscated quickly; the rest of the malware can then be sent to other, more comprehensive analysis schemes.

CHAPTER IV

ROBUST DYNAMIC MALWARE ANALYSIS

4.1 Motivation

Recent advances in malware analysis [85, 42, 20, 43] show promise in understanding modern malware, but before these and other approaches can be used to determine what a malware instance does or might do, the runtime behavior of that instance and/or an unobstructed view of its code must be obtained. However, malware authors are incentivized to complicate attempts at understanding the internal workings of their creations. Therefore, modern malware contains a myriad of anti-debugging, anti-instrumentation, and anti-VM techniques to stymie attempts at runtime observation [143, 18]. Similarly, techniques that use a malware instance's static code model are challenged by runtime-generated code, which often requires execution to discover.

In the obfuscation/deobfuscation game played between attackers and defenders, numerous anti-evasion techniques have been applied in the creation of robust in-guest API call tracers and automated deobfuscation tools [147, 123, 163, 97]. More recent frameworks [1, 146] and their discrete components [24] attempt to offer or mimic a level of transparency analogous to that of a non-instrumented OS running on physical hardware. However, given that nearly all of these approaches reside in or emulate part of the guest OS or its underlying hardware, little effort is required by a knowledgeable adversary to detect their existence and evade them [56, 118].

In this chapter, we present a transparent, external approach to malware analysis. Our approach is motivated by the intuition that for a malware analyzer to be transparent, it must not induce any side effects that are unconditionally detectable by its observation target. In formalizing this intuition, we model the structural properties and execution semantics of modern programs to derive the requirements for transparent malware analysis. An analyzer that satisfies these transparency requirements can obtain an execution trace of a program identical to the one produced when it is run in an environment with no analyzer present. Approaches unable to fulfill these requirements are vulnerable to one or more detection attacks: categorical, formal abstractions of the detection techniques employed by modern malware.

Creating a transparent malware analyzer required us to diverge from existing approaches that employ in-guest components, API virtualization, or partial or full system emulation, because none of these implementations satisfies all the transparency requirements. Based on a novel application of hardware virtualization extensions such as Intel VT [145], our analyzer, called Ether, resides completely outside of the target OS environment; there are no in-guest software components vulnerable to detection or attack. Additionally, in contrast to other external approaches, the hardware-assisted nature of our approach implicitly avoids many shortcomings that arise from incomplete or inaccurate system emulation.

4.2 Previous Work

Traditionally, anti-virus scanners have used simple emulation and API virtualization in their scanning engines [143]. More recent malware analysis efforts make heavy use of virtualized and emulated environments for their operation. Examples include systems derived from the BitBlaze project (e.g., Polyglot [37] and Panorama [168]), Siren [29], and others. Honeypot-based projects also employ virtual environments for trapping and investigating malware [73, 156, 117].

Virtualization- or emulation-based approaches can be used to construct malware processing systems that provide a level of isolation between the guest and the host operating systems. In these systems, modifications made to the guest by an instance of malware can be quickly discarded, ensuring that each instance runs in the same sterile environment. Approaches that employ these ideas to obtain scalability include malware analysis services such as Norman Sandbox [11], CWSandbox [163], and Anubis [1].

Previous frameworks for fine-grained tracing of programs include VAMPiRE [147], BitBlaze, and Cobra [146]. Among these, VAMPiRE is in-guest, BitBlaze uses whole-system emulation, and Cobra traces malware at the same privilege level as itself. None of these frameworks uses hardware virtualization extensions for its analysis capabilities or information gathering. Automated unpackers, a common application of fine-grained analysis, include PolyUnpack [123], Renovo [77], and OmniUnpack [97]. Of these, only Renovo takes an out-of-guest approach, utilizing whole-system emulation as a core part of its unpacking engine. PolyUnpack and OmniUnpack use in-guest approaches for their unpacking and hence run at the same privilege level as the malware they analyze.

Frameworks that could be used for system call or Windows API tracing include Detours [69] and DynInst [3]. System call tracing using out-of-guest environments has previously been implemented in VMScope [72] and TTAnalyze [24], both of which are based on QEMU [26]. There are many in-guest approaches for tracing Windows API functions, including older tools such as FileMon [4] and RegMon [7], and, more recently, sandboxing environments such as CWSandbox and Norman Sandbox. These approaches use a combination of API hooking and/or API virtualization, which is detectable by malware running at the same privilege level [57].

4.3 A Formal Framework

In this section, we analyze the requirements for building a transparent dynamic malware analysis system (i.e., one that cannot be detected and evaded by the malware being analyzed). These requirements serve as the guiding principles for the design and implementation of Ether (Section 4.4). We start with a simple and abstract model of program execution and present the basic definition and theorem of transparent malware analysis. We then extend our formal framework to consider virtual memory, privilege levels, system calls, exception handling, and execution timing, which are part of realistic program execution environments.

4.3.1 Abstract Model of Program Execution

We model a program’s execution at the machine instruction level. Since low-level instructions can access memory and the CPU directly, we consider a system state to be the collection of the contents of memory and CPU registers. Let M be the set of all possible memory states, C be the set of all possible CPU states, and I be the set of all possible instructions. We model any program P as a tuple (IP, DP) of code and data, where IP ⊆ I is the set of all instructions belonging to the program, including any dynamically generated instructions, and DP is the set of static data used by the code. The surrounding runtime environment E during the execution of P contains the operating system, the underlying hardware, external inputs, etc. In order to obtain an execution trace of P in E, we need the subsequence of all the instructions executed in E that belong to IP. For this reason, we define a transition function

δE,IP : IP × M × C → IP × M × C that transitions among instructions belonging to IP. A program is initiated by loading the static code and data into the memory state m0 ∈ M. Assume that the program starts executing from a specified initial instruction i0 ∈ IP with the initial CPU state c0 ∈ C. The trace of the program P in E is T(P,E), an ordered sequence defined as T(P,E) = (i0, i1, i2, ..., il), where δE,IP(ik, mk, ck) = (ik+1, mk+1, ck+1) for 0 ≤ k < l.

4.3.2 Transparent Malware Analysis

Suppose P is a malware program. Consider a dynamic malware analyzer PA whose goal is to learn about P’s activities (e.g., its execution traces). PA, or at least some of its components, needs to reside in the environment E. The malware program P, on

the other hand, will try to detect the presence of PA and change or hide its intended activities. Thus, we want PA to be transparent, or undetectable by P.

Since PA and P share at least some resources in the run-time environment (e.g., the

CPU), a covert channel can exist that leaks information about the presence of PA to the malware P. To prevent such leakage, the principle of non-interference [27] dictates that the execution of PA shall not interfere with the execution of P. Intuitively, if non-interference is achieved, P has the same execution (for the same given input) regardless of PA.

We model the attempt by P to detect the presence of PA in E as dP(E). That is, since the operating system generally isolates one program execution from another, P must perform its own dP instructions (included as part of its code IP, i.e., dP ⊂ IP) to get or infer information about PA. For example, dP can include an instruction to query the debugging flag in E. Without loss of generality, we assume that P will alter its execution path when it detects PA, i.e., if dP(E) = 1, because the malware author would try to hide the behavior of the malware from an analyzer. That is, P’s execution becomes δE,IP(ik, mk, ck) = (i′k+1, m′k+1, c′k+1) if dP(E) = 1, i.e., P executes a different next instruction i′k+1 instead of the intended ik+1 when PA is detected.
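To make dP concrete, the following is a minimal sketch of one detection check a malware could embed: a timing probe built on RDTSC and CPUID, written in C with GCC-style inline assembly for x86. The cycle threshold and the choice of CPUID as a commonly intercepted instruction are illustrative assumptions, not details taken from any particular sample.

    /* Hedged sketch of a detection function dP: time an instruction
       that analyzers commonly intercept and flag the environment if
       it takes suspiciously long. The threshold is arbitrary. */
    #include <stdint.h>

    static inline uint64_t rdtsc(void)
    {
        uint32_t lo, hi;
        __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
        return ((uint64_t)hi << 32) | lo;
    }

    int dP(void)                          /* 1 = analyzer suspected */
    {
        uint32_t eax = 0, ebx, ecx, edx;
        uint64_t start = rdtsc();
        __asm__ __volatile__("cpuid"
                             : "+a"(eax), "=b"(ebx), "=c"(ecx), "=d"(edx));
        return (rdtsc() - start) > 10000; /* arbitrary cycle budget */
    }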

Before PA is enabled in E, we have dP(E) = 0. Denote the environment with PA running as A. The transparency goal is to achieve dP(A) = dP(E) = 0, so that even when PA is analyzing P’s behavior, P still executes the same instructions. Thus, we have the following definition:

Definition 1. Assume E is a runtime environment, and A is E with the malware analyzer PA running. PA is transparent if, for any malware P, dP(A) = dP(E) = 0.

The above definition provides a starting point for analyzing the requirements for a transparent malware analyzer. Ultimately, the goal of a transparent malware analyzer is to extract the same execution traces from malware as if the analyzer were not present.

Thus, we have the following malware analysis theorem:

Theorem 1. If E is a runtime environment and A is the same environment with the

addition of a malware analyzer PA, then PA is transparent if and only if T (P,E) =

T (P,A) for any malware program P (IP ,DP ).

Proof. In both environments E and A the same external inputs are provided to the

program P since they are modeled as part of the environments. Let mE,0 and cE,0 be the initial memory and CPU states for initiating the execution of P in E. Let

mA,0 and cA,0 be the same for the environment A containing the analyzer. Therefore, the traces of the program P in these two environments are defined as T(P,E) = (iE,0, iE,1, ..., iE,lE), where δE,IP(iE,t, mE,t, cE,t) = (iE,t+1, mE,t+1, cE,t+1) for 0 ≤ t < lE, and T(P,A) = (iA,0, iA,1, ..., iA,lA), where δA,IP(iA,t, mA,t, cA,t) = (iA,t+1, mA,t+1, cA,t+1) for 0 ≤ t < lA.

For the “only if” part of the proof, assume that PA is transparent; we prove by induction that T(P,E) = T(P,A). The base case is trivial because the program starts execution at the same instruction i0, so iE,0 = iA,0 = i0. For the induction step, assume that iE,t = iA,t; we need to prove that iE,t+1 = iA,t+1. Since PA is transparent, according to Definition 1 we have dP(A) = dP(E) = 0, and P will not try to change its intended execution path. Further, dP(A) = 0 implies that the data and values in mE,t, cE,t, mA,t, and cA,t that are visible to P, and hence relevant to the execution of P, must be the same (otherwise, dP(A) = 1). That is, from P’s point of view, the execution semantics in A and E are equivalent. Therefore, the next instruction in IP executed in A has to be iE,t+1, as in E, i.e., iA,t+1 = iE,t+1. By induction, we have T(P,E) = T(P,A).

For the “if” part of the proof, we assume that T(P,E) = T(P,A) for any malware P, and we need to prove that PA is transparent. We prove by contradiction. Assume that PA is not transparent. According to Definition 1, this means that dP(A) = 1. Therefore, without loss of generality, P will alter its execution in A, which leads to the contradiction T(P,E) ≠ T(P,A). Therefore, PA is transparent.

4.3.3 Requirements for Transparency

We use Definition 1 and Theorem 1 as guidelines to formulate the requirements on the

design of a transparent malware analyzer. Our discussion here uses a generalization

of several common hardware and operating system features such as privilege levels,

virtual memory, and exception handling, which also covers other protection features

provided by hardware virtualization. We first describe these features as parts of an

extension to the basic program execution semantics introduced in Section 4.3.1, and

then discuss the requirements for transparent malware analysis.

In order to achieve transparency (i.e., dP (A) = 0 and hence T (P,E) = T (P,A)), the memory and CPU states visible to P and the instruction execution semantics need to be identical in both E and A. However, the presence of PA and its analysis activities introduce changes to these entities.

Similar to information flow models in multi-level security systems [27], the notion

of privilege is essential for reasoning about how to hide these changes from P. Suppose that there are n rings of privilege, where 0 is the most privileged (or “highest”) level and n is the least. Let the highest privilege level gained by the program P during execution in E be denoted by ΠE(P). In order to represent virtual memory, suppose V denotes all possible virtual memory states viewed by instructions executed at a specific privilege level. The entire memory state M can then be defined as M = V^n, where each member m ∈ M is an n-tuple of memory states, and m[k] ∈ V is the state of the virtual memory at ring k. Virtual memory mapping can be expressed by functions µ^−_{E,r→k}, which maps memory of ring r to ring k, and µ^+_{E,k→r}, which maps memory of ring k to ring r, for r > k. By assuming that µ^−_{E,r→k} and µ^+_{E,k→r} are in m[k], we can express how code at a higher privilege ring k can control how a lower privilege ring r views its memory.

We use exceptions and exception handling to represent a broad range of system features such as all privileged instructions (e.g., system calls), I/O, memory content

or CPU register protection and access violations, and program and system faults.

An exception occurs when an instruction execution requires services or data at a

higher privilege level k than the current level r. A function φE,r→k specifies the first instruction of the exception handler at ring k that handles a particular exception occurring at ring r.

The instruction execution semantics δE introduced in Section 4.3.1 can be extended to include two parts. The first is δ, the low-level or basic instruction execution semantics that does not involve exceptions and only deals with access to virtual memory and CPU registers (note that we treat I/O as exceptions). The second is δE,φ, the semantics that deals with exceptions (e.g., control transfers to and from exception handlers residing in privileged levels).

We now formulate the requirements of transparency.

1. Higher Privilege: We require that the analyzer PA have higher privilege than the maximum privilege the malware P can gain. If the maximum privilege gained by P is π = ΠA(P), then the analyzer should reside in privilege levels k < π. For any memory state m ∈ M, besides the code and data of PA, the memory mapping functions µ^−_{A,π→k} and µ^+_{A,k→π} as well as the exception handler function φA,π→k should also reside in m[k]. Proper isolation and protection can be achieved by ensuring that µ^+_{A,k→π} does not map any of these components into the virtual memory state m[π].

2. No Non-privileged Side Effects: This requirement states that if PA induces side effects, access to them should be privileged and mediated through exception handlers at a higher privilege level than P’s. This ensures that any access to the changes in memory, CPU registers, etc., can be intercepted using an exception handler that can hide these side effects from P. Similarly, since PA can have timing side effects, instructions that can access any notion of time should be privileged as well.

3. Identical Basic Instruction Execution Semantics: Recall that the basic execution semantics do not involve any exceptions. This requirement states that the basic

instruction execution semantics in A should be identical to those in E. From the second

requirement above, the basic semantics do not involve any side-effects introduced by

PA (which is privileged and requires exception handling). Thus, the identical basic semantics, plus transparent exception handling (see requirement 4 below), guarantee

that the same instruction has the same execution (and will lead to the same next

instruction) in both A and E.

4. Transparent Exception Handling: Suppose that when the tth instruction

is executed (in ring π), an exception occurs and control is transferred to φA,π→k in ring k < π. First, consider the case where there was no equivalent exception handler in environment E for the same instruction iE,t (i.e., iE,t was a basic instruction in E). In this case, the handler code must first guarantee the third requirement above by

executing iA,t with the same semantics as for iE,t. Then, it has to guarantee that

execution is returned to iA,t+1 (the same as iE,t+1) at the end of exception handling.

In addition, the changes in mA,t+1 and cA,t+1 by the exception handler should only be privileged side-effects to fulfill the second requirement above. Second, if this exception

handler replaces an original exception handler φE,π→k in E, it needs to have identical

changes made to mA,t+1 and cA,t+1 as φE,π→k would make to mE,t+1 and cE,t+1 (e.g., results of system calls remain the same). Cases where the handlers involve timing

measurements are addressed separately, in the fifth requirement below.

5. Identical Measurement of Time: This requirement states that the timing

information received by the t’th instruction iA,t in T(P,A) is the same as it would be in T(P,E). This is necessary to fulfill the transparency theorem because any change in

timing can be used to alter execution. For the malware P to view a continuous false

view of time, A must maintain a privileged logical clock that is adjusted

when exceptions (which include any access to the clock) are handled. Although

the requirement of having identical measurement of time is very difficult to fulfill in

practice, we can break it down into smaller requirements that may be easier to satisfy in many cases. Suppose that the time spent in E to move from the tth instruction to the (t+1)th instruction is ∆E,t. We can define ∆E,t = ∆E,δ,t + ∆E,φ,t, where ∆E,δ,t is the time for basic instruction execution and ∆E,φ,t is the exception handling time. Similarly, ∆A,t = ∆A,δ,t + ∆A,φ,t. A needs to ensure ∆A,t = ∆E,t by making some adjustment ∆′A,t, where ∆E,t = ∆A,t − ∆′A,t. Therefore, we have ∆′A,t = ∆A,δ,t + ∆A,φ,t − ∆E,δ,t − ∆E,φ,t. If basic instruction execution requires the same amount of time in both A and E, we have ∆′A,t = ∆A,φ,t − ∆E,φ,t. Thus, when no exceptions occur in either E or A (i.e., both ∆A,φ,t and ∆E,φ,t are 0), no adjustment in time is required. When an exception occurs in A but not in E (i.e., ∆E,φ,t = 0), which is usually due to privileged side effects, ∆A,φ,t, the time spent by the exception handler φA, has to be determined and negated. When both environments have exceptions, if φA is essentially φE plus some extra activities, then the extra time (i.e., ∆′A,t) can be measured and negated, because the activities belonging to φE are executed and can be timed. However, if φA replaces φE (i.e., implements different activities), then it is very difficult to measure ∆′A,t because φE is not executed.

4.3.4 Fulfilling the Requirements

We will now use the requirements presented in Section 4.3.3 to analyze the transparency achievable by various malware analysis approaches. In particular, we describe which transparency requirements software virtualization and full system emulation based approaches cannot satisfy, and discuss how hardware virtualization extensions in the x86 architecture can overcome these limitations. The design and implementation of Ether are presented in Section 4.4.

Previous malware analysis approaches employ user-level or kernel-level counterparts residing in the host in which malware is analyzed; these include VAMPiRE,

CWSandbox, and OmniUnpack. Since malware that uses rootkit components can gain kernel-level privileges, these approaches cannot satisfy the first requirement of

transparency.

Reduced privilege guest-based virtualization approaches (e.g., VMware [10] and

VirtualPC [9] for x86) can fulfill the first requirement by emulating a few sensitive instructions in order to gain higher privilege over the OS kernel in the virtual machine. The second requirement is only partially satisfied by these approaches: they can remove certain memory and CPU side effects by providing a virtual view of memory. However, these approaches are not designed with transparency in mind, and the communication medium between the guest and host operating systems introduces unprivileged side effects. Moreover, instructions that can access time are not privileged, making these side effects visible in such systems through time measurement. In contrast, full system emulators (e.g., QEMU) emulate the entire low-level instruction execution semantics δ to gain privilege over the guest OS. These approaches have privilege over all executed instructions, thereby fulfilling the second requirement.

An analyzer based on hardware virtualization extensions can likewise satisfy the

first and second requirements. The first requirement is satisfied because the analyzer can reside in a domain more privileged than the guest. This privilege is enforced in hardware by the analyzer residing in ring -1, which has higher privilege than rings

0 to 3. In addition, the contents of the analyzer’s domain are completely isolated through the use of shadow page tables. Hardware virtualization extensions not only enable basic memory protections, but also offer privileged access to sensitive CPU registers and instructions, including instructions that access time, such as RDTSC. A malware analyzer based on these extensions can therefore intercept and hide these side effects from malware.

Neither reduced privilege guest-based approaches nor full system emulators can guarantee the third requirement. To elaborate, emulation-based approaches use a low-level instruction execution semantics function δ′ to simulate the entire low-level execution semantics δ. For reduced privilege guest-based approaches, δ′ partially simulates δ. The low-level instruction execution semantics δ can easily be shown to be Turing complete; likewise, δ′ is also Turing complete. Consequently, determining whether δ′ is equivalent to δ requires determining whether all programs exhibit the same behavior under δ′ and δ.

In automata theory, the above problem is formally represented as the problem of determining whether the languages of two Turing machines (L and L′) are equal, otherwise known as the undecidable problem EQ_TM [134]. In practice, there have been attacks that detect full system emulators and reduced privilege guest-based approaches by exploiting incomplete emulation [57]. There is no way to guarantee the absence of such attacks, in the same way that software testing can show only the presence (and not the absence) of bugs. In contrast, hardware virtualization extensions rely on the same hardware execution semantics δ, thereby guaranteeing that the third requirement is satisfied.

The fourth requirement is an analyzer design issue and can be satisfied by all approaches; it requires only careful design.

Finally, although emulators can have privileged access over instructions that can access the notion of time, it is non-trivial to provide a notion of time that is equivalent to the environment E. To elaborate, in emulators almost all instructions have exception handlers managing their execution, and the identification of ∆E,t is hard without having a real system execute these instructions in parallel. Moreover, the determination of ∆A,t requires a cycle-count accurate execution simulator, which keeps track of the number of cycles required to execute an instruction in a real system.

In contrast, for hardware virtualization extension-based approaches, the side effect on time is privileged because the instructions that access time (e.g., RDTSC) are privileged. As we describe next in Section 4.4, hardware virtualization extensions

maintain a separate execution cycle count in the hypervisor, which allows an analyzer

to adjust a cycle value before it is given to the guest. As such, while there still exist

complex situations that are hard to satisfy, hardware virtualization extension-based approaches go a long way in satisfying the fifth requirement.

4.4 Implementation

In this section we describe Ether’s architecture, the low-level details of how it performs tracing, and the efforts necessary to ensure transparency in accordance with the

Malware Analysis Theorem. In addition, we describe implementation challenges and current architectural limitations preventing maximal transparency.

4.4.1 Environment

To create Ether, we needed an analysis mechanism that was readily available to researchers and would allow for maximum transparency per the Malware Analysis

Theorem. We deemed hardware virtualization extensions the most appropriate, as they do not interfere with the original instruction stream IP, the CPU registers C, the memory state M, the original exception handlers, or the original CPU transition function δ. In addition to this transparency, processors with such extensions are inexpensive and widely available.

Among available software that can utilize hardware virtualization extensions, we chose the Xen hypervisor version 3.1.0 as the base for implementing Ether. Xen was chosen because it is a mature product, it is open source, and it has existing communication mechanisms that could be leveraged in Ether’s implementation. Finally, our work could also be incorporated into the numerous research projects currently supporting Xen.

Among hardware virtualization platforms, we selected Intel VT due to the available documentation, our familiarity with Intel processors, and the availability of Intel-based hardware. Finally, as the target operating system must represent the software malware actually encounters in the wild, we chose Windows XP

(Service Pack 2). Windows XP is the most common PC operating system in use

today and therefore a preferred target of modern malware.

4.4.1.1 A brief overview of Intel VT Hardware Virtualization Extensions

Intel VT hardware virtualization extensions are a set of instructions added to Intel processors to facilitate the virtualization of the x86 instruction set. These instructions enable two new processor modes, called VMX root mode and VMX non-root mode. The

Xen hypervisor, and hence Ether, runs in VMX root mode. The domUs or guests, which we refer to as the analysis targets, run in VMX non-root mode.

Events called VM transitions change operation between the two modes. There are two different transitions, VMEntry and VMExit. A VMExit will transition from VMX non-root mode to VMX root mode, and a VMEntry will transition from VMX root mode to VMX non-root mode. Certain events in VMX non-root mode automatically cause a VMExit; these include certain exceptions, changes to the page directory entry pointer, and page faults, among others.

Ether obtains control from the analysis target on VMExits and performs a VMEntry when it chooses to resume execution of the analysis target. After a VMExit, but before the next VMEntry, the guest is in a completely dormant state.
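The following hedged C sketch shows how a VMExit-driven analyzer of this kind might dispatch the events just described. The basic exit-reason numbers follow the Intel SDM, while struct vcpu and the handler names are hypothetical stand-ins for the mechanisms detailed in the next subsections, not Ether’s actual Xen code.

    /* Hedged sketch of a VMExit dispatch loop. */
    #include <stdint.h>

    #define EXIT_REASON_EXCEPTION_NMI 0   /* #DB and #PF arrive here    */
    #define EXIT_REASON_CR_ACCESS     28  /* CR3 writes: context switch */
    #define EXIT_REASON_MSR_WRITE     32  /* e.g., SYSENTER_EIP_MSR     */

    struct vcpu;                          /* opaque guest VCPU handle   */
    void handle_exception(struct vcpu *v);
    void handle_cr_access(struct vcpu *v);
    void handle_msr_write(struct vcpu *v);
    void vmentry_resume(struct vcpu *v);

    void vmexit_dispatch(struct vcpu *v, uint32_t exit_reason)
    {
        switch (exit_reason) {
        case EXIT_REASON_EXCEPTION_NMI: handle_exception(v); break;
        case EXIT_REASON_CR_ACCESS:     handle_cr_access(v); break;
        case EXIT_REASON_MSR_WRITE:     handle_msr_write(v); break;
        default:                        break;  /* uninteresting exits */
        }
        vmentry_resume(v);  /* guest stays dormant until this VMEntry */
    }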

4.4.2 Analyzer Architecture

The architecture of Ether consists of a hypervisor component and a userspace component running in domain 0. Ether’s hypervisor component is responsible for detecting events in the analysis target. Currently, such events include system call execution, instruction execution, memory writes, and context switches.

Ether’s userspace component acts as a controller regulating which processes and events in the guest should be monitored. This component also contains logic to derive semantic information from analyzed events, such as translating a system call number into a system call name and displaying system call argument content based on argument data type.

The analysis target consists of a Xen domU running Windows XP Service Pack 2.

The only change we made to the default installation of Windows XP was disabling

PAE and large memory pages. These modifications exist solely to make memory write detection easier in the initial Ether implementation and are not a limitation of our approach.

4.4.3 Using Intel VT Extensions for Malware Analysis

To present Ether as a full-featured malware analyzer, we required that it be able to monitor the instructions executed by a guest process, any memory writes a guest process performs, and any system calls a guest process makes. We chose these low- and high-level operations due to their usefulness in malware analysis and their ability to demonstrate the efficacy of Ether in performing both coarse- and fine-grained tracing.

The challenges to successful implementation included using a mechanism that was not intended for malware analysis (i.e., Intel VT) while maintaining the original level of transparency provided by hardware virtualization extensions. Given that the Intel VT extensions do not provide explicit support for any of these monitoring activities, we performed an in-depth investigation of Intel VT to devise novel methods that fulfill our monitoring requirements. Our methods are described below.

4.4.3.1 Monitoring Instruction Execution

Instruction execution monitoring relies on Ether’s privilege over the analysis target in handling debug exceptions, and on guaranteeing that a debug exception occurs after the execution of every instruction. Ether guarantees the occurrence of a debug exception after every instruction by setting a flag called the trap flag in the analysis target.

Upon handling a debug exception caused by the forced trap flag, Ether once again sets the trap flag for the next instruction, thereby inducing a debug exception after every instruction. Ultimate control of which exceptions reach the analyzed environment rests with Ether, so all induced debug exceptions are hidden from the

analysis target. In this manner, Ether executes the target process one instruction at a time while preventing it from detecting Ether’s presence. Of note, this form of instruction stepping via the trap flag was first implemented in VAMPiRE, although the VAMPiRE approach is in-guest and hence vulnerable to in-guest detection attacks.
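A minimal sketch of this stepping loop follows, assuming hypothetical helper routines rather than Ether’s actual Xen code.

    /* Hedged sketch: trap-flag single-stepping inside the hypervisor's
       debug-exception handler. struct vcpu and all helpers are
       hypothetical stand-ins. */
    #define X86_EFLAGS_TF 0x100UL

    struct vcpu;
    unsigned long vcpu_get_rflags(struct vcpu *v);
    void vcpu_set_rflags(struct vcpu *v, unsigned long flags);
    int  target_process_active(struct vcpu *v);
    void notify_userspace_instruction(struct vcpu *v);
    void inject_guest_exception(struct vcpu *v, int vector);
    void vmentry_resume(struct vcpu *v);

    void handle_debug_exception(struct vcpu *v)
    {
        if (target_process_active(v)) {
            /* Report the instruction that just ran, then re-arm the
               trap flag so the next instruction also faults. */
            notify_userspace_instruction(v);
            vcpu_set_rflags(v, vcpu_get_rflags(v) | X86_EFLAGS_TF);
            /* The induced #DB is swallowed here, never delivered to
               the guest, so the stepping stays invisible. */
        } else {
            inject_guest_exception(v, 1);  /* natural #DB: forward it */
        }
        vmentry_resume(v);
    }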

4.4.3.2 Monitoring Memory Writes

Ether monitors memory writes by using shadow page tables and its privilege over the guest in handling page faults. Ether induces a page fault at every attempted guest memory write, traps the fault, and prevents it from reaching the analyzed environment. Faults occurring due to natural guest operation are forwarded to the analyzed environment. In this manner, all memory write attempts in the analyzed environment are transparently intercepted.

Shadow page tables refer to the actual page tables the hardware uses for address translation, as the analyzed environment is never permitted to do its own translation.

The hypervisor is responsible for synchronizing shadow page tables with the analyzed environment’s page table contents, as well as ensuring the analyzed environment can only map memory explicitly allocated for it.

Ether causes page faults on write attempts by removing writable permissions from shadow page table entries. When Ether detects a fault caused by itself, the

Ether userspace component is notified of an attempted memory write, and the fault is then hidden from the analyzed environment. Faults resulting from normal guest operation are passed along. After notification, the faulting page is set to writable in the shadow page table and the faulting instruction is re-executed. Upon completion of the instruction, all pages are once again marked read-only to detect any writes caused by the next instruction.
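The core of this write-interception cycle can be sketched as follows; the page-table types and helpers are hypothetical simplifications of what a shadow paging implementation would provide, not Ether’s code.

    /* Hedged sketch of write interception via shadow page tables. */
    #include <stddef.h>
    #include <stdint.h>
    #include <stdbool.h>

    #define PTE_RW (1ULL << 1)            /* x86 PTE writable bit */

    typedef uint64_t pte_t;

    /* Revoke write permission in every shadow mapping so the next
       guest write attempt raises an interceptable page fault. */
    void shadow_write_protect_all(pte_t *shadow_ptes, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            shadow_ptes[i] &= ~PTE_RW;
    }

    /* A fault is ours (and must be hidden from the guest) when the
       guest's own PTE allows the write but the shadow PTE does not. */
    bool is_induced_write_fault(pte_t guest_pte, pte_t shadow_pte,
                                bool is_write)
    {
        return is_write && (guest_pte & PTE_RW) && !(shadow_pte & PTE_RW);
    }

On an induced fault, the handler would notify the userspace component, temporarily restore PTE_RW on the faulting page, re-execute the faulting instruction, and then call shadow_write_protect_all() again.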

4.4.3.3 Monitoring System Call Execution

Ether’s novel system call execution monitoring exploits features of the x86 fast system

call entry mechanism to inform Ether of system calls executed by the analysis target.

The system call interception mechanism uses a special register present on all modern

x86 processors to cause a page fault at a chosen address during system call invocation.

A fault at this address will then signal to Ether that a system call was executed in

the analysis target.

To properly describe Ether’s system call tracing technique, some background on

the fast system call entry mechanism of x86 processors is required. This method is

used for system calls on all Windows versions starting with XP, and on Linux kernels

starting with 2.6. The fast system call entry mechanism uses the SYSENTER instruction to raise privilege and jump to a pre-defined address in the kernel. This address is

stored in a special register called SYSENTER_EIP_MSR, access to which is only permitted from kernel mode. Whenever a userspace application requires system services, it

specifies the service number and parameters in an implementation-dependent manner,

and then executes SYSENTER. SYSENTER changes the privilege mode to kernel mode, the stack pointer to the kernel-mode stack, and the instruction pointer to the value in SYSENTER_EIP_MSR.

Ether sets the value of the analysis target’s SYSENTER_EIP_MSR to a chosen value on a page guaranteed to be not present, and stores the original value in Ether’s

memory. Whenever the analyzed environment attempts system call execution, a

fetch page fault will occur at the chosen address. When Ether encounters such a fault

in the analyzed environment, it indicates a system call was attempted. The userspace

component of Ether is notified of system call execution, and the instruction pointer

of the analyzed environment is reset to the expected value of SYSENTER_EIP_MSR.

Execution of the analyzed environment resumes as if SYSENTER jumped directly to the expected address, instead of our chosen address.
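A hedged sketch of the interception path: the MSR index is the architectural IA32_SYSENTER_EIP, while the chosen not-present address, the VMCS accessor, and the notification hook are hypothetical placeholders.

    /* Hedged sketch of SYSENTER-based system call interception. */
    #include <stdint.h>

    #define MSR_IA32_SYSENTER_EIP   0x176
    #define CHOSEN_NOT_PRESENT_ADDR 0xFFFF1000UL  /* page kept not-present */

    struct vcpu;
    void vmcs_write_guest_rip(struct vcpu *v, uint64_t rip);
    void notify_userspace_syscall(struct vcpu *v);

    static uint64_t guest_expected_sysenter_eip;  /* value the guest wrote */

    /* Called on a fetch page fault. Returns 1 if the fault was our
       induced system-call trap and has been handled transparently. */
    int handle_fetch_fault(struct vcpu *v, uint64_t fault_addr)
    {
        if (fault_addr != CHOSEN_NOT_PRESENT_ADDR)
            return 0;                   /* natural fault: forward to guest */

        notify_userspace_syscall(v);    /* log the attempted system call */
        /* Resume as if SYSENTER had jumped straight to the real handler. */
        vmcs_write_guest_rip(v, guest_expected_sysenter_eip);
        return 1;
    }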

4.4.3.4 Limiting Scope to a Chosen Process

Ether gains control on every context switch by leveraging a feature of Intel VT that

causes a VMExit every time the page directory entry pointer is accessed in the guest

OS. As the page directory pointer must be updated every context switch to change

address spaces, it is guaranteed that Ether will detect all guest context switches.
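A hedged sketch of how this exit can bound tracing to a single process, assuming the analyzer has already learned the target’s page directory base (CR3); the helper state is hypothetical.

    /* Hedged sketch: scoping fine-grained tracing by CR3 value. */
    #include <stdint.h>

    static uint64_t target_cr3;     /* page directory base of the target */
    static int tracing_enabled;

    /* Invoked on the VMExit taken whenever the guest writes CR3. */
    void handle_cr3_write(uint64_t new_cr3)
    {
        /* A switch into the target's address space turns on the trap
           flag and write protection; any other switch turns them off. */
        tracing_enabled = (new_cr3 == target_cr3);
    }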

4.4.4 Maintaining Transparency

Despite making several modifications to the guest, Ether maintains transparency by

ensuring such changes are undetectable. Other changes made by Ether, such as those

to shadow page tables, are changes outside of the analyzed environment and therefore

transparent.

4.4.4.1 Hiding the Trap Flag

Ether is able to conceal that it has set the trap flag in the guest by intercepting the few

instructions that can read this flag, and altering their behavior to hide the flag’s

presence. The only instruction that can directly read the state of the trap

flag is PUSHF. Ether intercepts this instruction and changes its result to provide the environment-expected state of the flag. Besides PUSHF, the INT instruction reads the value of the flag indirectly. We monitor this instruction as well, and fix the flag’s value

pushed on the stack to match the value expected by the analysis target.

At times the analysis target has set the trap flag itself and expects to receive debug exceptions. We detect the setting of the trap flag by the POPF instruction, and when the analysis target expects the trap flag to be set, we forward debug exceptions along

as appropriate. Any debug exceptions not caused by the trap flag are automatically

forwarded to the analysis target as they are not caused by Ether.
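A hedged sketch of the flag scrubbing for PUSHF, assuming hypothetical guest-memory accessors and per-VCPU bookkeeping of whether the guest itself requested the trap flag:

    /* Hedged sketch: hide the analyzer-forced trap flag from PUSHF. */
    #define X86_EFLAGS_TF 0x100UL

    struct vcpu;
    unsigned long guest_read_stack_top(struct vcpu *v);
    void guest_write_stack_top(struct vcpu *v, unsigned long value);
    int  guest_set_tf_itself(struct vcpu *v);  /* tracked at POPF time */

    /* Called after observing a PUSHF in the analysis target. */
    void fixup_pushed_flags(struct vcpu *v)
    {
        unsigned long flags = guest_read_stack_top(v);

        if (!guest_set_tf_itself(v))
            flags &= ~X86_EFLAGS_TF;    /* the TF is ours: scrub it */
        guest_write_stack_top(v, flags);
    }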

4.4.4.2 Page Table Modifications

The page table modifications made by Ether are to shadow page tables, and not the

page tables stored in the guest. Therefore, the guest is not aware that the shadow

page tables exist, and hence cannot detect their modification or presence. Ether can

determine which faults were caused by normal guest operation and which it

purposely created. Any faults caused by normal operation are forwarded to the

guest. Conversely, any faults caused by Ether are handled by Ether and never passed

along.

4.4.4.3 SYSENTER_EIP_MSR

Ether mediates all access to the SYSENTER_EIP_MSR register and can therefore conceal any modification of the register from the analyzed environment. Complete mediation

is achieved because any access of the register automatically causes a VMExit. Ether

saves any value the analysis target attempts to write into SYSENTER_EIP_MSR, and uses this value any time the guest-expected value of this register is required. Although

technically modified, the CPU state C, as observed from the analyzed environment, is

completely unchanged.

4.4.5 Potential Attacks

While theoretically resilient against in-guest detection attacks, current architectural

restrictions make some of these attacks possible, and Ether is also vulnerable to a

class of timing attacks using external timing sources.

4.4.5.1 Attack Classes

Classic in-guest detection attacks cannot detect Ether because it resides completely outside of the analyzed environment. Also, none of the few modifications Ether makes

to the guest environment are unconditionally detectable. Ether is, however, vulnerable to a certain class of timing attacks and, in the current implementation, to memory

hierarchy attacks. Detection methods and our mitigations of them are outlined below.

In-memory presence Traditional detection attacks which rely on detecting the

presence of an analyzer in memory will always fail against Ether, as it has no

memory presence in the guest.

CPU Registers Ether hides the few changes it makes in CPU state from the

analysis target so that it is unable to detect deviation from a native hardware envi-

ronment.

Memory Protection Ether modifies only the shadow page tables, which are

inaccessible to the analysis target. That is, the analysis target is unable to detect

changes to shadow memory permissions. However, in the current implementation,

Ether does indirectly modify the memory hierarchy (the cache and the TLB). This is

due to an architectural limitation; further details appear in Section 4.4.6.

Privileged Instruction Handling Ether uses built-in hardware mechanisms to intercept only certain privileged instructions and exceptions and, as necessary, forwards these exceptions to the guest. From the viewpoint of the guest, no handler is ever modified, and privileged instructions have the same effects as in a native environment.

Instruction Emulation Ether does not emulate instructions; all instructions are executed on the actual processor. Therefore, Ether does not suffer from emulation inaccuracies inherent in full system x86 emulators; the transition function δ remains

unmodified.

Timing Attacks As described in Section 4.3.3, there are two fundamental issues

with avoiding timing attacks: controlling queries about time, and answering those

queries with the expected time value. Ether controls the in-guest view of the RDTSC instruction, the APIC timer, the 8254 timer chip, as well as any periodic time-based

interrupts and other guest time queries. Section 4.3.3 also outlines the requirements

for a correct reply to a guest time query. This correct reply, ∆E,t, is the wall-clock time when Ether is present, ∆A,t, reduced by ∆′A,t, the amount of overhead added by Ether’s presence. ∆′A,t is equal to the time spent in the analyzer minus the time spent in the guest exception handler (∆′A,t = ∆A,φ,t − ∆E,φ,t). As Ether always calls the native exception handlers, if any, for any exceptions the guest must process, ∆E,φ,t is always zero. The remaining term, ∆A,φ,t, consists of the time spent switching privilege to Ether, the time spent in Ether’s handler, and the time to switch to the original guest handler. The time spent in the handler is easily measured using the wall clock. The times for privilege changes can be calculated empirically. Therefore, Ether is able to adjust the time returned to the guest by the amount of overhead and remain untraceable by timing attacks that rely on host-based time sources.

Additionally, the Intel VT architecture has a special field, called the TSC_OFFSET, which the processor will automatically add to any query of the time stamp counter.

This value must be set according to the calculated value of ∆′A,t. Any other clock sources, all of which are under the control of Ether, can be adjusted accordingly.
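A hedged sketch of this compensation, using the architectural VMCS encoding of the TSC_OFFSET field from the Intel SDM; the overhead bookkeeping and vmcs_write64() are simplified stand-ins for real hypervisor plumbing.

    /* Hedged sketch: hiding analyzer overhead via the TSC offset. */
    #include <stdint.h>

    #define VMCS_TSC_OFFSET 0x2010ULL   /* VMCS field encoding (SDM) */

    void vmcs_write64(uint64_t field, uint64_t value);

    static uint64_t overhead_cycles;    /* running total of the overhead */

    /* Called just before each VMEntry with the cycles the analyzer
       consumed, so the guest's RDTSC never sees the gap. */
    void apply_tsc_compensation(uint64_t cycles_in_analyzer)
    {
        overhead_cycles += cycles_in_analyzer;
        vmcs_write64(VMCS_TSC_OFFSET,
                     (uint64_t)-(int64_t)overhead_cycles);
    }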

As Ether can only mitigate in-guest timing detection attacks, a class of detection

attacks utilizing time from external inputs is still possible. However, preventing such

attacks would require detecting covert channels over a shared medium, which is known

to be undecidable [47].

4.4.6 Architectural Limitations

Intel VT suffers from some architectural limitations which may allow Ether to be

detected under certain circumstances. Other hardware virtualization extensions exist which

do not suffer from such limitations.

Intel VT suffers from two main flaws which allow the current implementation to be

detected by observing implicit changes to the memory hierarchy. The first flaw is that

Intel VT induces a TLB flush on every VMExit. A detection method such as the one

proposed by researchers at Matasano can detect this implicit change. A hypervisor could run

in non-paged mode with caches off in an effort to avoid using the extended memory hierarchy at all. Intel VT’s second flaw is that paging mode must be turned on before entering VMX root mode. AMD’s competing virtualization solution, AMD-V, does not suffer from either of these limitations. A hypervisor which did not use paging and ran with caches off would be undetectable by such memory hierarchy detection methods.

4.5 Summary

The Ether approach of using hardware virtualization for malware analysis provides a transparent multi-grained malware analysis framework. The tools built on Ether,

EtherUnpack and EtherTrace, are more effective than previous approaches at unpacking and system call tracing, respectively. Ether also achieves a level of transparency not matched by emulation-based or in-guest analysis frameworks. Because it uses hardware-assisted virtualization, Ether avoids an in-guest presence as well as the emulation inaccuracies inherent to system emulators.

Modern malware is obfuscated with packers which increasingly target virtualization- and emulation-based environments. Samples detecting VMware and QEMU have been found in the wild, even among such common malware as Storm. Packers such as Themida now provide specific options to detect virtualized environments and halt execution. Emulators can never be guaranteed to perfectly emulate a physical processor; malware can always employ detection based on imperfect emulation. As such obfuscated and emulation-resistant malware becomes more prevalent, Ether will become increasingly valuable in targeting this evolving threat.

CHAPTER V

CONDITIONAL CODE OBFUSCATION

5.1 Motivation

While static analysis is vulnerable to a vast array of obfuscations, dynamic analysis techniques can provide better tolerance against these attacks.

In order to defend against run-time anti-analysis tricks that target dynamic analysis techniques, such as debugger detection, virtual machine detection, and emulator detection, we proposed the transparent malware analysis model and the Ether framework in the previous chapter. The problem with straightforward dynamic analysis techniques is that only a single execution path is seen during each execution, and several branches of execution may not be explored. Malware authors exploit this drawback by employing trigger-based behavior: malicious activities that are activated only when a specific condition is met. Unless these conditions are satisfied, the malicious activities remain hidden from the analysis.

Recent malware analyzers provide a powerful way to discover trigger-based malicious behavior in arbitrary malicious programs. Moser et al. proposed a scheme [101] that explores multiple paths during the execution of a malware sample. After exploring a branch, their technique resumes execution from a previously saved state and takes the alternate branch by inverting the condition and solving constraints to modify related memory variables in a consistent manner. Other recently proposed approaches can make informed path selections [31], discover inputs that take a specific path [32, 38, 45], or force execution along different paths [162]. We call all such approaches input-oblivious

analyzers because they do not utilize any source of information about inputs other

than the program being analyzed.

Our goal is to anticipate attacks against state-of-the-art malware analyzers in order to develop more effective analysis techniques. We present a simple, automated, and transparent obfuscation against powerful input-oblivious analyzers. We show that it is possible to automatically conceal the trigger-based malicious behavior of existing malware from any static or dynamic input-oblivious analyzer using an automatically applicable obfuscation scheme based on static analysis. Our attack falls into the input-based obfuscation class, which we mentioned in Chapter 2.

Our scheme, which we call conditional code obfuscation, relies on the principles of secure triggers [59]. First, we identify and transform specific branch conditions that rely on inputs by incorporating one-way hash functions in such a way that it is hard to identify the values of variables for which the conditions are satisfied.

Second, the conditional code, which is the code executed when these conditions are satisfied, is identified and encrypted with a key derived from the value that satisfies the condition. As a result, input-oblivious analyzers can no longer feasibly determine the values that satisfy the condition, and consequently cannot determine the key needed to unveil the conditional code. Our approach utilizes several static analysis techniques, including control dependence analysis, and incorporates both source-code and binary analysis to automate the entire process of transforming malware source programs into their obfuscated binary forms.

In order to show that conditional code obfuscation is a realistic threat, we have developed a compiler-level tool that applies the obfuscation to malware programs written in C/C++. Our prototype implementation generates obfuscated compiled

ELF binaries for Linux. Since the malware authors will be the ones applying this technique, the assumption of having the source code available is realistic. We have tested our system by applying it to several real malware programs and then evaluated its effectiveness in concealing trigger-based malicious code. In our experiments on 7 different malware programs containing 92 malicious triggers, our tool successfully

obfuscated and concealed the entire code that implemented 87 of them.

We analyze the strengths and weaknesses of our obfuscation. Although the keys are effectively removed from the program, the encryption is still susceptible to brute-force and dictionary attacks. We provide a method to measure the strength of particular applications of the obfuscation against such attacks. To understand the possible threats our proposed obfuscation scheme poses, we discuss how malware authors may manually modify their code in different ways to take advantage of the obfuscation technique. Finally, we provide insight into possible ways of defeating the obfuscation scheme, including more informed key-search attacks and the incorporation of input-domain information into existing analyzers.

The goal of our obfuscation is to hide malicious behavior from malware analyzers that extract behavior. For these analyzers, the usual assumption is that the program being analyzed is already suspicious. Nevertheless, malware authors wish to develop code that is not easily detected. Naïve usage of this obfuscation may actually improve malware detection because of the particular way in which hash functions and decryption routines are used. However, since attackers can add existing polymorphic or metamorphic obfuscation techniques on top of our technique, a detector should, at best, be able to detect such malware with the same efficacy as polymorphic malware detection.

5.2 Previous Work

The problem of discovering trigger-based malicious behaviors has been addressed by recent research. Some techniques have used symbolic execution to derive predicates leading to specific execution paths. In order to identify time-bombs in code, [45] varies time in a virtual machine and uses symbolic execution to identify predicates or conditions in a program that depend on time. Another symbolic-execution-based method of detecting trigger-based behavior is presented in [32]. Brumley et al.’s

Bitscope [31] uses static analysis and symbolic execution to understand the behavior of malware binaries and is capable of identifying trigger-based behavior as well. Our obfuscation makes it hard to solve symbolic constraints that contain cryptographic one-way hash functions.

Approaches have been proposed that identify trigger-based behavior by exploring paths during execution. Moser et al.’s multi-path exploration approach [101] was the

first such approach. The technique can comprehensively discover almost all conditional code in a malware sample, given sufficient execution time. The system uses QEMU and dynamic tainting to identify conditions and construct path constraints that depend on inputs coming from interesting system calls. Once a conditional branch is reached, the approach attempts execution on both branches after consistently changing memory variables by solving the constraints. Another approach, presented in [162], forces execution along different paths while disregarding consistent memory updates.

The approach has been shown to be useful for rootkits written as Windows kernel drivers.

Techniques that can impede analyzers capable of identifying trigger-based behavior need to conceal the conditions and the code blocks that are used to implement these behaviors. An early example of obfuscated conditional branches in malware appeared in the early 90s in the Cheeba [67] virus, which searched for a specific file by comparing the hash of each file’s name with a hard-coded hash of the file name being sought.

In the literature, the idea of using environment-generated keys for encryption was introduced in [120]. The work on secure triggers [59] considers a whitehat scenario and presents the principles of constructing protocols that allow software developers to have inputs coming into a program decrypt parts of the code. In the malware scenario, the idea of using environment-generated keys for encryption was presented as the Bradley virus [58]. However, such techniques have not become a practical threat

because identification of such keys and incorporation of the encryption technique require the malware to be manually designed and implemented around this obfuscation.

Our work shows that encrypting parts of the malware code using keys generated from inputs can be applied automatically to existing malware without any human effort, demonstrating its efficacy as a widespread threat.

The use of polymorphic engines in viruses is one of the earliest [136] obfuscation techniques used in malware to evade detection. Over the years, various polymorphic and metamorphic techniques have appeared in the wild [143]. In order to detect polymorphic malware, antivirus software uses emulation or creates signatures to detect the polymorphic decryptors. In [40], obfuscation techniques such as garbage insertion, code transposition, register reassignment, and instruction substitution were shown to successfully disrupt detection by several commercial antivirus tools.

Besides malware detection approaches [41, 84, 79], recent research has focused on creating techniques that automate malware analysis. These systems automatically provide comprehensive information about the behavior, run-time actions, capabilities, and controlling mechanisms of malware samples with little human effort. A variety of obfuscation techniques [33, 34, 35, 166] have been presented that can impede analyzers based on static analysis. Executable protectors and packers [132] are widely used by malware authors to make reverse engineering or analysis of their code very hard. As with polymorphic code, packers obfuscate the code by encrypting or compressing the binary and adding an unpacking routine, which reverses the operation during execution. Tools [122, 76, 96] have been presented that are able to unpack a large fraction of such programs to aid static analysis. However, by applying our obfuscation before packing, malware authors can conceal code implementing triggered behavior from static analyzers even after unpacking is performed.

Figure 6: Two conditional code snippets.

The dynamic analysis approach has been more attractive for automated malware behavior analyzers. This is because analyzing code as it executes overcomes obfuscations that impede static analysis, including packed code. Since most dynamic analysis of malware involves debuggers [49], safe virtual machine execution environments, or emulators, malware programs use various anti-debugging [39] and anti-analysis techniques to detect side effects in the execution environment and evade analysis. There has been research on stealth analysis frameworks such as Cobra [146], which places stealth hooks in the code to aid analysis while remaining hidden from the executing malware. In order to automate analysis, dynamic tools such as CWSandbox [36], TTAnalyze [24], and the Norman Sandbox [11] automatically record the actions performed by an executing malware sample. However, since such tools can only view a single execution path, trigger-based behavior may be missed. These tools have been superseded by the recent approaches, presented earlier in this section, that can identify and extract trigger-based behavior.

5.3 Conditional Code Obfuscation

Malware programs routinely employ various kinds of trigger-based events. The most common examples are bots [48], which wait for commands from the botmaster via a command and control mechanism. Some keyloggers [141] log keys only from application windows containing certain keywords. Timebombs [98] are malicious code executed at a specific time. Various anti-debugging [39] or anti-analysis tricks detect side effects in the executing environment caused by analyzers and divert program execution when such side effects are present. The problem for the malware writer is that the checks inside the program that

are performed on the inputs give away information about what values are expected.

For example, the commands a bot supports are usually contained inside the program as strings. More generally, for any trigger-based behavior, the conditions recognizing the trigger reveal information about the inputs required to activate the behavior.

The basis of our obfuscation scheme is intuitive. By replacing input-checking conditions with equivalent ones that recognize the inputs without revealing information about them, the inputs can become secrets that an input-oblivious analyzer can no longer discover. Such secrets can then be used as keys to encrypt code. Since the modified conditions are satisfied only when the triggering inputs are sent to the program, the code blocks that are conditionally executed can be encrypted. In other words, our scheme encrypts conditional code with a key that is removed from the program but is evident when the modified condition is satisfied. Automatically carrying out this transformation as a general obfuscation scheme involves several subtle challenges.

We provide a high-level overview of our obfuscation with program examples in Section 5.3.1. The general mechanism is defined in Section 5.3.2. The program analysis algorithms and transformations required are described in Section 5.3.3. Section 5.3.4 describes the consequences of our scheme for existing malware analysis approaches.

Section 5.3.5 discusses possible brute-force attacks on our obfuscation technique.

5.3.1 Overview

Figure 6 shows snippets of two programs that have conditionally executed code. The

first program snippet calls a function that starts logging keystrokes after receiving the command “startkeylogger”. The second example starts an attack only if the day of the month is between the 11th and the 19th. In both programs, the expected input can easily be discovered by analyzing the code.

We use cryptographic hash functions to hide information. For the first example, we can modify the condition to compare the computed hash of the

string in cmd with the hard-coded hash of the string “startkeylogger” (Figure 7). The command string “startkeylogger” becomes a secret that an input-oblivious analyzer cannot know. This secret can be used as the key to encrypt the conditional code block and the entire function log_keys(). Notice that when the expected command is received, execution enters the if block and the encrypted block is correctly decrypted and executed.

Figure 7: Obfuscated example snippet.
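Since Figure 7 itself is not reproduced here, the following is a hedged C sketch of what the transformed check could look like, with OpenSSL’s SHA256() standing in for Hash; the hard-coded digest, the encrypted blob, and decrypt_and_run() are hypothetical placeholders for the obfuscation tool’s output.

    /* Hedged sketch of the obfuscated command check (cf. Figure 7).
       Compile with -lcrypto. */
    #include <string.h>
    #include <openssl/sha.h>

    /* Hard-coded at obfuscation time: Hash("startkeylogger"). */
    static const unsigned char CMD_HASH[SHA256_DIGEST_LENGTH] =
        { 0 /* 32 digest bytes elided */ };

    extern unsigned char encrypted_block[];   /* Encr(log_keys body, key) */
    extern unsigned int  encrypted_block_len;
    void decrypt_and_run(unsigned char *blob, unsigned int len,
                         const char *key);    /* Decr, then execute */

    void process_command(const char *cmd)
    {
        unsigned char digest[SHA256_DIGEST_LENGTH];

        SHA256((const unsigned char *)cmd, strlen(cmd), digest);
        if (memcmp(digest, CMD_HASH, sizeof digest) == 0) {
            /* cmd must equal the secret trigger, so it doubles as
               the decryption key for the conditional code block. */
            decrypt_and_run(encrypted_block, encrypted_block_len, cmd);
        }
    }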

In the second example of Figure 6, cryptographic hashes of the operands do not

provide a condition equivalent to the original. Moreover, since several values of the

variable n satisfy the condition, it is problematic to use them as keys for encryption

and decryption.

We define candidate conditions as those suitable for our obfuscation. A candidate

needs three properties. First, the ordering relation between the pre-images of the

hash function must be maintained in the images. Second, there should be a unique

key derived from the condition when it is satisfied. Third, the condition must contain

an operand that has a statically determinable constant value.

Given these requirements above, operators that check equality of two data values

are suitable candidates. Hence, conditions having ‘==’, strcmp, strncmp, memcmp, and similar operators can be obfuscated with our mechanism.

5.3.2 General Mechanism

We now formally define the general method of our conditional code obfuscation

scheme. Without loss of generality, we assume that any candidate condition is equiv-

alent to the simple condition “X == c” where the operand c has a statically de-

terminable constant value and X is a variable. Also, suppose that a code block B

is executed when this condition is satisfied. Figure 8 shows the program transfor-

mation required for the obfuscation. The cryptographic hash function is denoted by

Hash and the symmetric encryption and decryption routines are Encr and Decr, respectively.

Figure 8: General obfuscation mechanism.

The obfuscated condition is “Hash(X) == Hc” where Hc = Hash(c). The pre-image resistance property of the function Hash implies that it is infeasible to find c given Hc. This ensures that it is hard to reverse the hash function. In addition, because of the second pre-image resistance property, it is hard to find another c′ for which Hash(c′) = Hc. Although this property does not strengthen the obfuscation, it is required to make the obfuscated condition semantically equivalent to the original, ensuring the correctness of the program.

The block B is encrypted with c as the key. Let BE be the encrypted block, where BE = Encr(B, c). Code is inserted immediately before BE to decrypt it with the key contained in the variable X. Since Hash(X) = Hc implies X = c, when the obfuscated condition is satisfied, the original code block is recovered, i.e., B = Decr(BE, c), and the program execution is equivalent to the original. However, a malware analyzer

can recover the conditional code B only by watching for the attacker to trigger this behavior, by guessing the correct input, or by cracking the cryptographic operations.

5.3.3 Automation using Static Analysis

In order to apply the general mechanism presented in the previous section automatically to a program, we utilized several known static analysis algorithms. In this section, we describe how these program analysis techniques were used.

5.3.3.1 Finding Conditional Code

In order to identify conditional code suitable for obfuscation, we first identify candidate conditions in a program. Let F be the set of all functions or procedures and B be the set of all basic blocks in the program being analyzed. For each function Fi ∈ F, we construct a control flow graph (CFG) Gi = (Vi, Ei), where Vi ⊆ B is the set of basic blocks in Fi and Ei is the set of edges in the CFG representing control flow between basic blocks. We then identify basic blocks having conditional branches, which have two outgoing edges. Since we are not interested in conditions used in loops, we employ loop analysis to identify such conditions and discard them. From the remaining conditional branches, we select as candidate conditions the ones containing equality operators, as described in Section 5.3.1. Let Ci ⊆ Vi be the set of blocks containing candidate conditions for each function Fi. After candidate conditions are identified in a program, the next step is to find the corresponding conditional code blocks. As described earlier, conditional code is the code that is executed when a condition is satisfied. It may include some basic blocks from the function the condition resides in, as well as other functions. Since a basic block can contain at most one conditional branch instruction, by a condition we refer to the basic block that contains it. We use the mapping CCode : B → (B ∪ F)* to represent the conditional code of any condition.

Figure 9: Duplicating conditional code.

In order to determine conditional code, we first use control dependence analysis [15, 55] at the intra-procedural level. A block Q is control dependent on another block P

if the outcome of P determines the reachability of Q. More precisely, if one outcome

of P always executes Q, but the other outcome may not necessarily reach Q, then

Q is control dependent on P . Using the standard algorithm for identifying control

dependence, we build a control dependence graph (CDG) for each function, where each

condition block and their outcome has edges to blocks that are control dependent on

it. For each candidate condition, we find the set of blocks that are control dependent

on its true outcome. Therefore, if a true outcome of a candidate condition C ∈ Ci of

function Fi has an edge to a block B ∈ Vi, then B ∈ CCode(C). Conditional code blocks may call other functions. To take them into account, we

determine reachability in the inter-procedural CFG. If there exists a call to a function

F from a conditional code block B ∈ CCode(C) of some candidate condition C, we

consider F ∈ CCode(C) which means that we consider all the code in the function

F as conditional code of C as well. Now, for every block in F ∈ CCode(C), we

find calls to other functions and include them in CCode(C). This step is performed

repeatedly until all reachable functions are included. We used this approach instead

of inter-procedural control-dependence analysis [133] because it allows us to obfuscate

functions that are not control-dependent on a candidate condition but can be reached

only from that condition.
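A minimal sketch of this closure, under illustrative types (the real analysis operates on LLVM IR, and these structures are not our pass's actual API):

    #include <set>
    #include <vector>

    struct Function;
    struct Block    { std::vector<Function*> callees; };  // functions called from this block
    struct Function { std::vector<Block*> blocks; };

    // Extend CCode(C) with every function reachable from its blocks.
    void close_over_calls(const std::set<Block*> &ccode_blocks,
                          std::set<Function*> &ccode_funcs) {
        std::vector<Function*> worklist;
        for (Block *b : ccode_blocks)                 // seed with direct callees
            for (Function *f : b->callees)
                if (ccode_funcs.insert(f).second)
                    worklist.push_back(f);
        while (!worklist.empty()) {                   // iterate to a fixpoint
            Function *f = worklist.back();
            worklist.pop_back();
            for (Block *b : f->blocks)
                for (Function *g : b->callees)
                    if (ccode_funcs.insert(g).second)
                        worklist.push_back(g);
        }
    }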

A candidate condition may be contained in a block that is conditional code of another candidate condition. The next step is to eliminate these cases by making a function or block conditional code of only the closest candidate condition that can reach it. For any block B ∈ CCode(C), if B ∈ CCode(C′) where C ≠ C′ and C ∈ CCode(C′), then we remove B from CCode(C′). We perform the same operation for functions.

Blocks and functions can be obfuscated only when they are conditional code of candidate conditions alone. If they are also reachable from non-candidate conditions, we cannot obfuscate them. When obfuscations are applied, they are applied in an iterative manner, starting with the candidate conditions that have no other candidate conditions depending on them. The basic blocks and functions that are conditional code of these conditions are obfuscated first. In the next iteration, candidate conditions with no unobfuscated candidate conditions depending on them are obfuscated. The iterative process continues until all candidate conditions are obfuscated.

Figure 10: Compound condition simplification.

We use a conservative approach to identify conditional code when the CFG is incomplete. If code pointers are used whose targets are not statically resolvable, we do not encrypt code blocks that are potential targets, nor any other code blocks that are reachable from them; otherwise, the encryption could crash the program.

Fortunately, type information frequently allows us to limit the set of functions or blocks that are probable targets of a code pointer.

5.3.3.2 Handling Common Conditional Code

A single block or function may be conditional code of more than one candidate condition, where those conditions are not conditional code of each other. For example, a bot program may contain a common function that is called after receiving several different commands.

Figure 11: The architecture of our automated conditional code obfuscation system.

If a block B is conditional code of two candidate conditions P and Q, where P ∉ CCode(Q) and Q ∉ CCode(P), then B can be reached via two different candidate conditions and cannot be encrypted with the same key. As shown in Figure 9, we solve this problem by duplicating the code and encrypting it separately for each candidate condition.

5.3.3.3 Simplifying Compound Constructs

Logical operators such as && or || combine more than one simple condition. To apply our obfuscation, compound conditions must first be broken into semantically equivalent but simplified conditions. However, some parts of a compound condition may be candidates for obfuscation while others are not, which can make the compound condition as a whole unsuitable for obfuscation.

Logical and operators (&&) can be written as nested if statements containing the operand conditions and the conditional block in the innermost block (Figure 10(a)).

Since both the simple conditions must be satisfied to execute conditional code, the code can be obfuscated if at least one of them is a candidate condition.

Logical or operators (||) can be obfuscated in two ways. Since either of the conditions may execute conditional code, the conditional code may be encrypted with a single key and placed in two blocks that are individually obfuscated with the simple conditions. Another simple way is to duplicate the conditional code and use if...else if constructs (Figure 10(b)). Note that if either one of the two conditions is not a candidate for obfuscation, then the conditional code will remain revealed.

Concealing the other copy gains no protection: although it is not possible to determine that the concealed code is equivalent to the revealed copy, the revealed copy gives away the behavior that was intended to be hidden from an analyzer.

To introduce more candidate conditions in C/C++ programs, we convert switch...case constructs into several if blocks, each containing a condition that uses an equality operator. Every case except the default becomes a candidate for obfuscation. Complications arise when the code block under one switch case falls through into another. In such cases, the code of the block into which control falls through can be duplicated into the earlier case's block, after which the standard approach described above can be applied (see the sketch below).
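As a source-level illustration (our tool performs the equivalent rewrites on the LLVM IR; the helper functions here are placeholders), the simplifications of Figure 10 look as follows:

    /* (a) && becomes nested ifs; the code is obfuscatable if at least
       one operand condition is a candidate (an equality test). */
    if (x == 0x1337 && ready) { attack(); }
    /* rewritten as: */
    if (x == 0x1337) {
        if (ready) { attack(); }
    }

    /* (b) || duplicates the conditional code under each simple condition;
       if one condition is not a candidate, its copy remains revealed. */
    if (strcmp(cmd, "ddos") == 0 || flag) { attack(); }
    /* rewritten as: */
    if (strcmp(cmd, "ddos") == 0) { attack(); }
    else if (flag) { attack(); }

    /* switch...case becomes equality tests, each a candidate. */
    switch (op) { case 1: a(); break; case 2: b(); break; default: c(); }
    /* rewritten as: */
    if (op == 1) { a(); }
    else if (op == 2) { b(); }
    else { c(); }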

5.3.4 Consequences to Existing Analyzers

Our obfuscation can thwart different classes of techniques used by existing malware analyzers. The analysis below refers to the notation presented in Section 5.3.2.

1. Path exploration and input discovery: Various analysis techniques have been proposed that can explore paths in a program to identify trigger-based behavior. Moser et al.'s dynamic analysis based approach [101] explores multiple paths during execution by repeatedly restoring earlier saved program states and solving constructed path constraints in order to find a consistent set of values of in-memory variables that satisfy conditions leading to different paths. They use dynamic taint analysis on inputs from system calls to construct linear constraints representing dependencies among memory variables. After our obfuscation is applied, the constraint added to the system is “Hash(X) == Hc”, which is a non-linear function. Therefore, our obfuscation makes it infeasible for such a multi-path exploring approach to find value assignments to variables in the program's memory that proceed down the obfuscated path.

2. Discovering inputs: A similar effect applies to approaches that discover the inputs that drive a program along a specific path. EXE [38] uses mixed symbolic and concrete execution to create constraints that relate inputs to variables in memory. It uses its own efficient constraint solver called STP, which supports all arithmetic, logical, bitwise, and relational operators found in C (including non-linear operations). Cryptographic hash functions are designed to be computationally infeasible to reverse. Even with a powerful solver like STP, it is infeasible to generate and solve the constraints that represent the complete set of operations required to reverse such a function.

3. Forcing execution: A malware analyzer may force execution along a specific path without finding a consistent set of values for all variables in memory [162], hoping to see some malicious behavior before the program crashes. Suppose that the analyzer forces the malware program to follow the obfuscated branch that would originally execute B without finding the key c. Assuming X has a value c′ where c′ ≠ c, the decrypted block then becomes B′ = Decr(BE, c′) ≠ B. In practical situations, subsequent execution of B′ should cause the program to eventually crash without revealing any behavior of the original block B.

4. Static analysis: The utilization of hash functions alone cannot impede approaches utilizing pure static analysis or hybrid methods [31], because behavior can be extracted by analyzing the code without requiring constraint solving. However, our utilization of encryption conceals the behavior in the encrypted blocks BE; they can be decrypted only with the key c, which is no longer present in the program.

5.3.5 Brute Force and Dictionary Attacks

Although existing input-oblivious techniques can be thwarted by our obfuscation, analyzers that are aware of the obfuscation may attempt brute-force attacks on the keys used for encryption. First, the hash function Hash() used in the malware must be extracted by analyzing the code. Then, a malware analyst may try to identify c by computing Hash(X) for all suitable values of X and searching for a value satisfying Hash(X) = Hc. The strength of the obfuscation applied to the condition can therefore be measured by the size of the minimal set of suitable values of X.

Let Domain(X) denote the set of all possible values that X may take during execution. If τ is the time taken to test a single value of X (essentially the hash computation time), then the brute-force attempt takes |Domain(X)|τ time. Finding the set Domain(X) is not straightforward; in most cases, only the size of X may be known. If X is n bits in length, then the brute-force attack requires 2^n τ time. The possibility of using a pre-computed hash table to reduce search time can be thwarted by hashing a nonce together with the data. Moreover, different nonce values for different conditions make the computed hashes for one condition useless for another.

Existing program analysis techniques such as data-flow analysis or symbolic execution may yield a smaller Domain(X) in some cases, enabling a dictionary attack. Section 5.6 discusses attacks on our obfuscation in more detail, including automated techniques that can be incorporated into current analyzers.

5.4 Implementation Approach

In this section, we present the design choices and implementation of the compiler-level tool that we developed to demonstrate automated conditional code obfuscation on malware programs. Our primary design challenge was to select the appropriate level of code on which to work. Performing encryption at a level higher than binary code would cause unexecutable code to be generated after decryption at run-time. On the other hand, essential high-level information such as data types requires analysis at a higher level. Our system, therefore, works at both the intermediate code and the binary level.

Figure 12: Analysis phase (performed on IR).
Figure 13: Encryption phase (performed on binary).

We use the LLVM [88] compiler infrastructure and the DynInst [3] binary analysis and instrumentation system. The LLVM framework is an extensible program optimization platform providing an API to analyze and modify its own RISC-like intermediate code representation; it then emits binary code for various processor families. Most of the heavy-duty analysis and code instrumentation is done with the help of LLVM. DynInst is a C/C++ based binary analysis and instrumentation framework, which we use for binary rewriting.

We implemented our prototype for the x86 architecture and the Linux OS. We targeted the Linux platform primarily to create a working system to showcase the capability of the proposed obfuscation scheme without making it a widely applicable tool for malware authors. However, the architecture of our obfuscation tool is general enough to be portable to the Windows OS.

The architecture of our system is presented in Figure 11. Our system takes as input a malware source program written in C/C++ and generates an obfuscated binary in the Linux ELF format. The transformation is done in four phases: (1) the Frontend Code Parsing Phase, (2) the Analysis/Transformation Phase, (3) the Code Generation Phase, and (4) the Encryption Phase.

In the first phase, LLVM's GCC-based parser converts C/C++ source programs to LLVM's intermediate representation. Next, in the analysis and transformation phase, the bulk of the obfuscation, except for the actual encryption, is carried out on the intermediate code. In this phase, candidate conditions are identified and obfuscated, conditional code blocks are instrumented with decipher and marker routines, and the key required for encryption is stored as part of the code. Section 5.4.1 describes these steps in detail. In the third phase, we use the static compilation and linking back end of LLVM to convert the LLVM intermediate representation (IR) to x86 assembly, from which we generate an x86 ELF binary. In the final phase, which is described in Section 5.4.2, our DynInst-based binary modification tool encrypts marked code blocks to complete the obfuscation process. We describe in Section 5.4.3 how the decryption of code takes place at run-time.

5.4.1 Analysis and Transformation Phase

The analysis and transformation phase was implemented as an LLVM plugin, which is loaded by the LLVM optimization module. The steps taken are illustrated in Figure 12 and described below.

5.4.1.1 Candidate Condition Replacement

We followed the method described in Section 5.3.3 to identify candidate conditions and their conditional code. As shown in Figure 12, the transformed code calls the hash function with the variable used in the condition as the argument. We use SHA-256 as the hash function in our implementation. The replaced condition compares the result with the hard-coded hash value of the constant. In other words, the condition “X == c” is replaced with “Hash(X) == Hc”, where Hc = Hash(c). Depending on the data type of X, calls are placed to different versions of the hash function. For length-constrained string functions, the prefixes of the strings are taken. A special wrapper function is used for strstr, which computes the hash of every sub-string of X and compares it with Hc.
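A sketch of such a wrapper, assuming the same hypothetical hash_sha256 helper; note that only the length m of the hidden constant is retained, not the constant itself:

    #include <string.h>

    extern void hash_sha256(const void *data, size_t len, unsigned char out[32]);

    /* Semantically replaces (strstr(X, c) != NULL): returns 1 iff some
       length-m substring of X hashes to Hc, where m = strlen(c). */
    int strstr_hashed(const char *X, size_t m, const unsigned char Hc[32])
    {
        size_t n = strlen(X);
        unsigned char digest[32];
        if (m > n)
            return 0;
        for (size_t i = 0; i + m <= n; i++) {
            hash_sha256(X + i, m, digest);       /* hash each substring */
            if (memcmp(digest, Hc, 32) == 0)
                return 1;
        }
        return 0;
    }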

5.4.1.2 Decipher Routine

In our implementation, we selected AES with 256-bit keys as the encryption algorithm. Constants in the conditions are not directly used as keys because of their varying type and length. A key generation function Key(X) is used to produce 256-bit keys. We describe the function in more detail in the next section.

We use the LLVM API to place a call immediately before the original conditional code to a separate decipher function that can be dynamically or statically linked to the malware program. Figure 12 illustrates this for a basic block. For a function, the call to the decipher routine is inserted as the first statement in the function body. The Decipher routine takes two arguments. The first is a dynamically computed key Key(X), which is based on the variable X in the condition. The second is the length of the code to be decrypted by the decipher routine. When calling a function that is to be obfuscated, these two arguments are added to the list of arguments for that function. This allows them to be passed not only to the decipher routine called in that function body, but also to other obfuscated functions that the function may call. At this stage in the obfuscation scheme, the length is not known because the size of the final generated code will differ from that of the intermediate code. We keep a placeholder so that the actual value can be inserted during binary analysis.

5.4.1.3 Decryption Key and Markers

The key generation function Key uses a SHA-256 cryptographic hash to generate a fixed-length key from varying-length data. However, the system would break if we used the same hash function as used in the condition: if the key Key(c) = Hash(c) = Hc, then the stored hash Hc in the condition could be used to decrypt the code blocks. Therefore, we use Key(X) = Hash(X|N), where N is a nonce. This ensures that the encryption key Kc = Key(c) ≠ Hc, where c is the constant in the condition. At this stage, the code block to be encrypted is not yet modified. Immediately following this block, we place the encryption key Kc.

During intermediate code analysis, it is not possible to foresee the exact location of the corresponding code in the resulting binary file. Therefore, we place markers in the code, which are later identified using a binary analyzer in order to perform the encryption. The function call to Decipher works as a beginning marker, and we place a dummy function call End_marker() after the encryption key. We use function calls as markers because LLVM's optimization removes other unnecessary instructions from the instruction stream. This placement of the key and markers has no abnormal consequences on the execution of the program because both are identified during binary analysis and removed at the final stage of obfuscation.
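A minimal sketch of this derivation, with the same hypothetical hash_sha256 helper; the nonce layout and buffer bound are illustrative:

    #include <string.h>

    extern void hash_sha256(const void *data, size_t len, unsigned char out[32]);

    /* Key(X) = Hash(X|N): hash the condition's data concatenated with a
       per-condition nonce N, so that Key(c) != Hash(c) = Hc and hashes
       pre-computed for one condition are useless for another. */
    void derive_key256(const void *x, size_t x_len,
                       const unsigned char *nonce, size_t nonce_len,
                       unsigned char key_out[32])
    {
        unsigned char buf[256];              /* assumes x_len + nonce_len <= 256 */
        memcpy(buf, x, x_len);
        memcpy(buf + x_len, nonce, nonce_len);
        hash_sha256(buf, x_len + nonce_len, key_out);
    }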

5.4.2 Encryption Phase

With the help of DynInst, our tool analyzes the ELF binary output by LLC. In order to improve code identification, we ensure that symbol information is intact in the analyzed binary.

Figure 13 illustrates the steps carried out in this phase. At this stage, our tool identifies code blocks needing encryption by searching for calls to the marker functions Decipher() and End_marker(). When such blocks are found, it extracts the encryption key Kc from the code and then removes the key and the call to the End_marker function by replacing them with x86 NOP instructions. It then calculates the size of the block to encrypt. Since AES is a block cipher, we make the size a multiple of 32 bytes. This can always be done because the space reserved for the key in the code leaves enough NOPs at the end of the code block needing encryption. We place the size as the argument to the call to Decipher, and then encrypt the block with the key Kc.

Nested conditional code blocks must be managed differently. We recursively search for the innermost nested block to encrypt, and perform encryption from the innermost block to the outermost one. Since our method of encrypting a code block does not require extra space beyond what is already reserved, our tool does not need to perform code movement in the binary.

5.4.3 Run-time Decryption Process

The Decipher function performs run-time decryption of the encrypted blocks. Notice that the location of the block that needs to be decrypted is not passed to this function. When the Decipher function is called, the return address pushed onto the stack is the start of the encrypted block, which immediately follows the call site. Using this return address, the key, and the block size, the function decrypts the encrypted block and overwrites it in place. Once the block has been decrypted, the call to the Decipher function is removed by overwriting it with NOP instructions. The decryption function requires the ability to modify code; therefore, write protection on the code pages is switched off before the modification and switched back on afterwards.
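A hedged sketch of this routine for Linux/x86, assuming GCC builtins, a hypothetical aes256_decrypt_inplace, and a 5-byte near call to Decipher; the real routine is part of our run-time support and may differ in detail:

    #include <stdint.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    extern void aes256_decrypt_inplace(unsigned char *buf, size_t len,
                                       const unsigned char key[32]);

    /* Called immediately before an encrypted block: the saved return
       address is the first byte of that block. */
    void Decipher(const unsigned char key[32], size_t len)
    {
        unsigned char *block = (unsigned char *)__builtin_return_address(0);
        unsigned char *start = block - 5;          /* the 5-byte call site */
        long pagesz = sysconf(_SC_PAGESIZE);
        unsigned char *page =
            (unsigned char *)((uintptr_t)start & ~(uintptr_t)(pagesz - 1));

        /* Temporarily disable write protection on the affected code pages. */
        mprotect(page, (size_t)(block + len - page),
                 PROT_READ | PROT_WRITE | PROT_EXEC);
        aes256_decrypt_inplace(block, len, key);   /* B = Decr(BE, Kc) */
        memset(start, 0x90, 5);                    /* NOP out the call */
        mprotect(page, (size_t)(block + len - page), PROT_READ | PROT_EXEC);
    }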

5.5 Experimental Evaluation

We used our obfuscation tool on several malicious programs in order to evaluate its ability to hide trigger-based behavior. Although we had to select from a very limited number of available malicious programs written for Linux, we chose programs that are representative of different classes of malware for which triggers are useful.

We evaluated our system by determining how many manually identified trigger-based malicious behaviors were automatically and completely obfuscated as conditional code sections by our system. In order to evaluate the resistance to brute-force attacks on each obfuscated trigger, we defined three levels of strength depending on the type of the data used in the condition. An obfuscation was considered strong, medium, or weak if its condition incorporated strings, integers, or boolean flags, respectively. Table 8 shows the results of our evaluation on various programs.

Table 8: Evaluation of our obfuscation scheme on automatically concealing malicious triggers.

Malware                        Malicious triggers  Strong  Medium  Weak  None
Slapper worm (P2P Engine)      28                  -       28      -     -
Slapper worm (Backdoor)        1                   1       -       -     -
BotNET (an IRC botnet server)  52                  52      -       -     -
passwd rootkit                 2                   2       -       -     -
login rootkit                  3                   2       -       -     1
top rootkit                    2                   -       -       -     2
chsh rootkit                   4                   2       -       2     -

Notice that almost all trigger-based malicious code was successfully obfuscated using our tool. However, for a few specific instances our tool was either able to provide only weak obfuscation or not able to provide any obfuscation at all. We consider both of these cases as failures of our obfuscation to provide protection. We investigated the reasons behind these cases and describe how the malware programs could be modified to take advantage of our obfuscation.

We first tested our tool on the Slapper worm [142]. Although it is a worm, it contains a large set of trigger-based behaviors found in bots and backdoor programs. When the worm spreads, it creates a UDP-based peer-to-peer network among the infected machines. This entire network of victims can be controlled by sending commands to carry out Distributed Denial of Service attacks. In addition, it installs a backdoor program on the infected machine that provides shell access to the victim machine.

The Slapper worm has two components. The first (unlock.c) contains the worm infection vector and the code necessary to maintain the peer-to-peer network and receive commands. We manually identified 28 malicious triggers in the program that perform various malicious actions depending on the received control commands. These triggers were implemented using a large switch construct. Our tool was able to completely obfuscate all malicious actions of these triggers. However, the obfuscations had medium-level strength because the conditions were based on integers received in the UDP packets. The second part of the worm is a backdoor program (update.c) that opens a Linux shell when the correct password is provided. The program contained only one malicious trigger, which uses the strcmp function to check whether the provided password equals the hard-coded string “aion1981”. Our tool successfully obfuscated the entire code of this trigger (which included a call to execve) and removed the password string from the program. In this case, our tool provided strong obfuscation because the condition was based on a string.

We next tested our obfuscation on BotNET, a generic open source bot program for Linux with minimal command and control support built into it. Since bots typically initiate different malicious activities after receiving commands, we identified the sections of the program that receive commands and considered them malicious triggers. We manually found 52 triggers in the program, and after our obfuscation was applied, the code conditionally executed for all 52 of them was strongly obfuscated. This result represents what we can expect when obfuscating a typical bot program: IRC-based bots send and receive text-based commands, making the triggers suitable for strong obfuscation.

We found several rootkit programs [109] for Linux that install user-level tools, similar to trojan horse programs, containing specific logic bombs that provide malicious advantages to attackers, including privileged access to the system. First, we tested the passwd rootkit program, which had two manually identifiable malicious triggers inserted into the original program. The first trigger enables a user to spawn a privileged shell when the predefined magic string “satori” is entered as the new password. The second trigger works in a similar manner and activates when the old password equals the magic string. Our obfuscation successfully concealed both of these trigger-based code sections with strong obfuscation.

We next tested a rootkit version of login. The source code login.c contained three manually identified malicious triggers. Our obfuscation was able to conceal two of them with strong obfuscation; both were strcmp checks on the user name or password entered by the user. Our obfuscator was not able to protect the third trigger. The reason is that both of the strongly obfuscated code sections increase the value of an integer variable elite that is used by the third trigger, placed elsewhere in the program. The third trigger uses the != operator, making it unsuitable for our obfuscation.

Our next test was on the top rootkit. This program contained two malicious triggers, which hide specific program names from the list that a user can view. Although these triggers are implemented using strcmp, the actual names of the processes to be hidden are read from files. Our obfuscator therefore could not conceal either of these malicious triggers. A malware author could take a different approach to overcome situations like this: by adding a trigger that checks that the file containing the process names starts with a hard-coded value, all the other triggers can be contained in a strongly obfuscated code section.

Finally, we tested a rootkit that is a modified version of the chsh shell. Two triggers were manually identified. The first, which checked for a specific username, was strongly obfuscated by our tool. The second trigger used a boolean flag in its condition and therefore was only weakly obfuscated. It is easy to overcome this difficulty: by manually modifying the flag to be a string or an integer, stronger obfuscation can be obtained using our tool.

5.6 Discussion

In this section, we discuss our obfuscation technique to help defenders better understand the possible threats it poses. We discuss how malware authors (attackers) may utilize this technique, and analyze its weaknesses to provide insight into how such a threat can be defeated.

5.6.1 Strengths

We discussed in Section 5.3.4 how our obfuscation impedes state-of-the-art malware analyzers. If the variable used in the condition has a larger set of possible reasonable values, the obfuscation is stronger against brute-force attacks. Since data type information is hard to determine at the binary level, brute-force attacks may have to search larger sets of values than necessary, giving attackers a further advantage. Equipped with this knowledge, a malware author may modify his programs to exploit these strengths rather than naively applying the obfuscation to existing programs.

First, a malware author can modify different parts of a program to introduce more candidate conditions. Rather than passing names of resources to system calls, he can query for resources and compare against the names. In addition, certain conditions that use other relational operators such as <, > or ≠, which are unsuitable for obfuscation, may be replaced by ==. For example, time-based triggers that use ranges of days in a month can be replaced with several equality checks on individual days. As another example, a program that exits if a certain resource is not found, using the != operator, can be modified to continue operation if the resource is found, using the == operator.

Second, a malware author can increase the size of the concealed code in the malware program by placing more portions of the code under candidate conditions for obfuscation. Bot authors can use an initial handshaking command that encapsulates the bulk of the rest of the bot's activity.

Third, malware authors can increase the length of inputs to make brute-force attacks harder. In our implementation, we use AES encryption with 256-bit keys. If any variable used to generate the key is less than 256 bits in size, the effective key length is reduced. Attackers may also avoid low-entropy inputs of a given length because such inputs reduce the search space for dictionary-based attacks. For example, a bot command that uses lower-case characters only in a buffer of 10 bytes requires a search space of size 26^10 (about 1.4 × 10^14), whereas using upper-case, lower-case, and numeric characters increases the search space to 62^10 (about 8.4 × 10^17).

Unlike strings, numeric values usually have a fixed size and may not be increased in length. Checking the hashes of all possible 32-bit integer values may not be a formidable task, but it is time-consuming, especially if a different nonce is used in each condition to render pre-computed hashes useless. An attacker can utilize some proprietary function F(x) in the code to map a 32-bit integer x to a longer integer value. However, such an approach does not fundamentally increase the search space. It merely increases the difficulty for defenders in automatically computing the hashes, because the equivalent computation of F has to be extracted from the code and applied to each possible 32-bit value before applying the hash function.

5.6.2 Weaknesses

In its current form, one of the main weaknesses of our obfuscation is the limited set of condition types to which it can be applied. Although the triggers found in the programs we experimented with were mostly equality tests, a malware program may contain many trigger conditions that check ranges of values. If the range is large, it may not be possible to represent it as several equality checks as described earlier, making the obfuscation inapplicable.

The encryption strength depends on the variable that is used. The full strength of having 2^256 possible AES keys is not utilized, particularly in the case of numeric data, which are 32-bit or 64-bit integers on current systems. Obfuscations involving string inputs are likely to be more resistant to analysis. Therefore, a subclass of malware, especially bots and backdoors, is likely to benefit the most from this approach because the inputs required for the malicious triggers can be selected by the malware authors.

Another weakness is that trigger-based behavior may depend not just on data that are input via system calls, but also on the status results returned from those calls. Most system calls have a small set of possible return values, usually indicating success or some form of error. As a result, the number of values to be checked by a brute-force attack may be reduced even further for such conditions.

Possible ways to defeat: If the proposed obfuscation is successfully applied, existing malware analysis techniques may not be able to extract the behavior that is concealed unless the conditions in the triggers are satisfied. Yet, we suggest several techniques to defeat our obfuscation.

First, analyzers may be equipped with decryptors that reduce the key search space by taking the input domain into account. Once an obfuscated condition is detected, the variable used in it may be traced back to its source using existing methods. If it is the result of, or an argument receiving data from, a system call, the corresponding specification may be used to find the reasonable set of values that it may take. For example, if the source is set by a system call that returns the current system date (such as gettimeofday), the set of all possible values representing valid dates may be used, significantly reducing the search space. If, however, the source is input data that cannot be characterized, brute forcing may remain infeasible.

Another approach is to move toward input-aware analysis. Rather than capturing binaries only, collection mechanisms should, where possible, capture the interaction of the binary with its environment. In the case of bots, having the related network traces can provide information about the inputs required to break the obfuscation. Existing honeypots already have the capability to capture network activity. Recording system interaction can provide further information about the inputs required by the binary.

Malware detection: Although the obfuscation may prove powerful against malware analyzers, its use may have an upside for malware detection. The existence of hash functions and encryption routines, together with a noticeable number of conditions utilizing them, may indicate that an unknown binary is malware. However, our proposed obfuscation allows additional layers of other general obfuscation schemes to be applied on top of it. For example, the binary resulting from our system may be packed with executable protectors [132], as a large fraction of malware already is today. The use of protector tools alone is not usually an indication that a program is malware, because these tools were created for protecting legitimate programs. Removing such obfuscation layers requires unpacking techniques [122, 76] that are mostly used prior to the analysis of suspicious code because of their run-time cost. The end result is that detecting such obfuscated malware is no easier than detecting existing malware.

5.7 Summary

We have designed an obfuscation scheme that can be automatically applied to malware programs in order to conceal trigger-based malicious behavior from state-of-the-art malware analyzers. We have shown that if a malware author uses our approach, various existing malware analysis approaches can be defeated. Furthermore, if properly used, this obfuscation can provide strong concealment of malicious activity from any possible analysis approach that is oblivious to inputs. We have implemented a Linux-based compiler-level tool that takes a source program and automatically produces an obfuscated binary. Using this tool, we have experimentally shown that our obfuscation scheme is capable of concealing a large fraction of the malicious triggers found in several unmodified malware source programs representing various classes of malware. Finally, we have provided insight into the strengths and weaknesses of our obfuscation technique and possible ways to defeat it.

CHAPTER VI

REVERSING EMULATION-BASED OBFUSCATION

6.1 Motivation

Malware authors often attempt to defeat state-of-the-art malware analysis with obfuscation techniques that are becoming increasingly sophisticated. Anti-analysis techniques have moved from simple code encryption, polymorphism, and metamorphism to multilayered encryption and page-by-page unpacking. One alarming new trend is the incorporation of emulation technology as a means to obfuscate malware [150, 107]. With emulation techniques maturing, we believe that the widespread use of emulation for malware obfuscation is imminent.

Emulation is the general approach of running a program written for one underlying hardware interface on another. An obfuscator that utilizes emulation would convert a binary program for a real instruction set architecture (ISA), such as x86, to a bytecode program written for a randomly generated virtual ISA, paired with an emulator that emulates this ISA on the real machine. Figure 14 shows an example of this obfuscation process. Code protection tools such as Code Virtualizer [107] and VMProtect [150] are real-world examples of this class of obfuscators.

Without knowledge of the source bytecode language, many existing malware analysis schemes are crippled in the face of malware obfuscated with emulation. At one end of the spectrum, emulators completely defeat any pure static analysis, including symbolic execution: the code analyzed by static analyzers is that of the emulator, while the true malware logic is encoded as bytecode contained in some memory buffer that the analysis treats as data. At the other extreme, pure dynamic analysis approaches that treat the emulated malware as a black box and simply observe external events are not affected. However, pure dynamic analysis schemes cannot perform fine-grained instruction-level analysis and can discover only a single execution path of the malware. More advanced analysis techniques employing dynamic tainting, information-flow analysis, or other instruction-level analysis fall in the middle of the spectrum. In the context of malware emulators, these techniques analyze the instructions and behaviors of the generic emulator and not of the target malware.

As an example, multi-path exploration [101] may explore all possible execution paths of the emulator. Unfortunately, these paths include all possible bytecode instruction semantics and all possible bytecode programs, rather than the paths encoded in the specific bytecode program of the malware instance. In short, we need new techniques to analyze emulated malware.

The key challenges in analyzing a malware emulator are the syntactic and semantic gaps between the observable (x86) instruction trace of the emulator and the non-observable (interpreted) bytecode trace. Theoretically, precisely and completely identifying an emulator's bytecode language is undecidable. Practically, the manner in which an emulator fetches, decodes, and executes a bytecode instruction may enable us to extract useful information about a bytecode program. By analyzing a malware emulator's (x86) trace, we can identify portions of the malware's bytecode program along with syntactic and semantic information about the bytecode language.

In this chapter, we show how automatic reverse engineering of unknown malware emulators is possible. Our goal is to extract the bytecode malware trace (program) and the syntax and semantics of the bytecode instructions to enable further malware analysis, such as multi-path exploration, across the bytecode program. We have developed an approach based on dynamic analysis: we execute the malware emulator and record the entire x86 instruction trace it generates. Applying dynamic data-flow and taint analysis techniques to these traces, we identify data regions containing the bytecode program and extract information about the bytecode instruction set.

Figure 14: Using emulation for obfuscation.

Our approach identifies the fundamental characteristic of decode-dispatch emulation: an iterative main loop that fetches bytecode based upon the current value of a virtual program counter, decodes opcodes from within the bytecode, and dispatches to bytecode handlers based upon opcodes. This analysis yields the data region containing the bytecode, syntactic information showing how bytecodes are parsed into opcodes and operands, and semantic information about control transfer instructions.

We have implemented a prototype called Rotalumé that uses a QEMU [26] based component to perform dynamic analysis. The analysis generates both an instruction trace in an intermediate representation (IR) and a dynamic control-flow graph (CFG) for offline analysis. Rotalumé reverse engineers the emulator using a collection of techniques: abstract variable binding analyzes memory access patterns; clustering finds associated memory reads, such as those fetching bytecode during the emulator's main loop; and dynamic tainting identifies the primary decode, dispatch, and execute operations of the emulator. The output of our system is the extracted syntax and semantics of the source bytecode language, suitable for subsequent analysis using traditional malware analyses. We have evaluated Rotalumé on legitimate programs and on malware emulated by VMProtect and Code Virtualizer. Our results show that Rotalumé accurately identified the bytecode buffers in the emulated malware and reconstructed syntactic and semantic information for the bytecode instructions.

6.2 Previous Work

6.2.1 Malware Obfuscation

Malware authors have developed obfuscation schemes designed to impede static analysis [116, 34, 35, 139]. Dynamic analysis approaches that treat malware as a black box can overcome these obfuscation schemes, but they are able to observe only a small number of execution paths. Several approaches have been proposed to address this limitation. Moser et al. proposed a scheme [101] that explores multiple paths during malware execution. Another approach [162] forces program execution along different paths but disregards consistent memory updates. For emulated malware, however, these solutions are unable to perform proper analysis because they explore execution paths of the emulator rather than those of the bytecode program.

Malware authors have broadly applied packing to impede and evade malware detection and analysis. Several approaches based on the general unpacking idea have been proposed [96, 76, 122]. For example, PolyUnpack performs universal unpacking based on a combination of static and dynamic binary analysis. Given a packed executable, PolyUnpack first constructs a static view of the code. If the executable tries to execute any code that is not present in the static view, PolyUnpack detects this as unpacked code.

Recently we have observed a new trend of using virtualizers or emulators such as Themida [108], Code Virtualizer [107], and VMProtect [150] to obfuscate malware. These emulators all use a basic interpretation model [137] and transform the x86 program instructions into their own bytecode in order to hide the syntax and semantics of the original code and thwart program analysis. Moreover, by using a randomized instruction set for the bytecode language together with a polymorphic emulator, the reverse engineering effort must be repeated for every new malware instance, making it very difficult to reuse the reverse-engineered information of one emulator for another. We argue that this trend will continue and that a large portion of malware in the near future will be emulation based. No existing technique can reliably counter emulation-based obfuscation.

6.2.2 Reverse Engineering Known Languages

There are research approaches for the analysis and reverse engineering of bytecode for high-level languages such as Java [44, 138]. However, these approaches assume that the syntax and semantics of the bytecode are public or already known. This assumption fails to hold for malware constructed using emulators such as Themida, Code Virtualizer, or VMProtect [108, 107, 150]. These emulators perform a random translation from bytecode to the destination ISA, so the connection between the bytecode and the final ISA is unknown.

6.2.3 Reverse Engineering Inputs and Protocols

In order to overcome these emulation-based obfuscation techniques, we need analyzers that are able to reverse engineer the emulator model and extract the bytecode syntax and semantics. This is a new research area. In a related area, protocol reverse engineering techniques [164, 90, 37] have been proposed to understand network protocol formats by automatically extracting the syntax of protocol messages. Tupni [46] automatically reverse engineers the formats of all general inputs to a program. The analysis techniques for extracting input or network message syntax assume that the data can be found at predefined locations in the program. In contrast, one of the main challenges in malware emulator analysis is finding where the bytecode program resides.

6.3 Background

The term emulation generally refers to supporting a binary instruction set architecture (ISA) different from the one provided by the computer system's hardware. This section describes the use of emulation in program obfuscation and the various possible emulation techniques.

6.3.1 Using Emulation for Obfuscation

Code authors, including malware authors, are now using emulation to obfuscate programs. In Figure 14, malicious software M consists of a program Px86 written in native x86 code and directly executable on x86 processors. Analyzers with knowledge of x86 can therefore perform various analyses on the malware. In order to impede analysis, an adversary can choose a new ISA L and translate Px86 to PL, which uses only instructions of L. In order to execute PL on the real x86 machine, the adversary introduces an emulator EM^L_x86 that emulates the ISA L on x86. The adversary can now spread a new malware instance M′ that is the combination of PL and EM^L_x86. To further impede possible analysis, a malware author can choose a new, randomly generated bytecode language for every instance of the malware and make tools to automatically generate a corresponding emulator. Therefore, results found about the bytecode language after reverse engineering one instance of the emulator will not be useful for another instance. Thus, automated reverse engineering of EM^L_x86 is essential to malware analysis, given that each malware instance has a completely unknown bytecode instruction set L and a previously unseen emulator instance.

6.3.2 Emulation Techniques

Various emulation techniques are widely used in software-based virtual machines, script interpretation, run-time abstract interfaces for high-level languages (e.g., the Java virtual machine (JVM)), and other environments. Although these can be very complex systems, they are all variations of the simple-interpreter method, also known as the decode-dispatch approach [137]. Decode-dispatch is used in environments where performance overhead is not an issue. The simple interpreter utilizes a main loop that iterates through three phases for every bytecode instruction: opcode decode, dispatch, and execute. The decode phase fetches the part of an instruction (the opcode) that represents the instruction type. The dispatch phase uses this information to invoke the appropriate handling routine. The execute phase, which is performed by the dispatched handling routine, performs additional fetches of operands and executes the semantics of the instruction. We first provide an illustration of how a decode-dispatch emulator works and then discuss the other broad variations in emulation methods.

Figure 15: An example of a simple-interpreter (decode-dispatch) based emulator.

We show the design of simple-interpreter based emulators using an illustrative running example. Figure 15 shows a fragment of a simple decode-dispatch based emulator [50] written in a pseudo-C like language; it executes programs written in a bytecode language for a hypothetical machine. For conciseness, we describe only the aspects of this machine relevant to the example. The machine supports variable-length instructions similar to x86. There are general-purpose registers named R1 to R24. A special register called RF maintains a flag that can be either 0 or 1 based on some previously executed instruction, and it is used for performing conditional jumps. We show three instructions supported by the machine: ADD, JUMP and CJMP. While the ADD instruction takes three operands, both jump instructions take an immediate target address. The conditional jump instruction CJMP jumps to the target if RF is 1; otherwise control flows to the next instruction.

In this example, the emulator fetches instructions from the emulated program stored in the buffer P. An emulator maintains a run-time context for the emulated program, which includes the necessary storage for virtual registers and scratch space. The emulator maintains execution context via a pointer to the next bytecode instruction to be executed, which we denote throughout this chapter as the virtual program counter, or VPC. For the example emulator, the VPC is an index into the buffer P. Here, decoding is performed by fetching the opcode from P[VPC], i.e., the first byte of the instruction. Dispatch uses a jump table resulting from switch-case constructs in C. Three execution routines for the three instructions are shown in the example. The execADD routine updates the register store by adding the relevant virtual register values. The execJUMP routine updates the VPC with an immediate address contained in the instruction. Finally, execCJMP shows how the conditional branch updates the VPC depending on the flag RF. It is interesting to note that the branch is emulated without using any conditional jump, but rather with a direct address computation. This shows how an emulator can eliminate identifiable conditional branches, making it hard for analysis approaches such as multi-path exploration to even discover any branch related to the emulated program.
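The following sketch models the main loop of Figure 15 (the opcode encodings and instruction lengths are invented for illustration); note that the CJMP case updates the VPC with a pure address computation, with no conditional jump on RF:

    #include <stdint.h>
    #include <string.h>

    enum { ADD = 0x01, JUMP = 0x02, CJMP = 0x03 };   /* invented opcodes */

    static uint32_t R[25];        /* virtual registers R1..R24 */
    static uint32_t RF;           /* flag register: 0 or 1     */
    static uint8_t  P[4096];      /* emulated bytecode program */
    static uint32_t VPC;          /* index of next instruction */

    static uint32_t imm32(uint32_t at)        /* fetch an immediate operand */
    {
        uint32_t v;
        memcpy(&v, &P[at], 4);
        return v;
    }

    void run(void)
    {
        for (;;) {
            switch (P[VPC]) {                 /* decode, then dispatch */
            case ADD:                         /* ADD Rd, Rs, Rt (4 bytes) */
                R[P[VPC + 1]] = R[P[VPC + 2]] + R[P[VPC + 3]];
                VPC += 4;
                break;
            case JUMP:                        /* JUMP imm32 (5 bytes) */
                VPC = imm32(VPC + 1);
                break;
            case CJMP:                        /* CJMP imm32: branchless update */
                VPC = RF * imm32(VPC + 1) + (1 - RF) * (VPC + 5);
                break;
            default:
                return;                       /* halt on unknown opcode */
            }
        }
    }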

More sophisticated emulation approaches often improve efficiency. The threaded approach [80] improves performance by removing the central decode-dispatch loop: the decoding and dispatching logic is appended to the execution semantics by adding a copy of that code to the end of each execution routine. This removes several branches and improves execution performance on processors that have branch prediction. With pre-decoding [93], the logic that decodes instructions into their opcodes and operands executes only once per unique instruction, and the program subsequently reuses the decoded results; hence, the opcode decoding phase is not executed for every executed bytecode instruction. The direct threading approach [25] removes jump table lookups by storing the function address that executes the instruction semantics together with the pre-decoded results. The dispatch of the next instruction's execution routine is therefore an indirect control transfer at the end of the previous bytecode instruction's routine. Finally, dynamic translation [135], one of the most efficient methods of emulation, converts blocks of the emulated program into executable instructions for the target machine and caches them for subsequent execution. These categories of emulators maintain a VPC that is updated after blocks of translated native instructions are executed.

Dynamic translation based emulators have very complex behavioral phases. They may seem attractive to malware authors because of their analysis and reverse engineering difficulty. However, like the page-level unpacking used in some packers [132], dynamic translation reveals large blocks of translated code as a program's execution proceeds. This reduces the advantage of using emulation as an obfuscation, because the heuristics used by automated unpackers can capture the fact that new code was generated and executed. In this chapter, we focus our methods on automatically reverse engineering the decode-dispatch class of emulators.

Several challenges complicate automatic reverse engineering of interpreter-based emulators. First, no information about the bytecode program, such as its location, is known beforehand. Second, nothing is known about the emulator's code corresponding to the decode, dispatch, and execution routines. Finally, we anticipate that emulator code varies in how it fetches opcodes and operands, maintains context related to the emulated program, dispatches code, executes semantics, and so on. An adversary may even intentionally attempt to complicate the identification of bytecode by storing the bytecode program in non-contiguous memory or by using multiple correlated variables to maintain the VPC.

Our current work, as a first step, advances the state-of-the-art and significantly challenges attackers. It also lays the foundation for reverse engineering of emulators that are based on other (more advanced and efficient) approaches.

6.4 Reverse Engineering of Emulation

In order to enable analysis of an emulated malware instance, it is necessary to understand the unknown bytecode language used by the instance. We have developed algorithms to systematically and automatically extract the syntax and semantics of unknown bytecode based upon the execution behavior of the decode-dispatch emulator within a malware instance. Our approach identifies fundamental characteristics of decode-dispatch emulation: an iterative main loop, bytecode fetches based upon the current value of a virtual program counter, and dispatch to bytecode handlers based upon opcodes within the bytecode. This analysis yields the data region within the malware containing the bytecode, syntactic information showing how bytecode instructions are parsed into opcodes and operands, and semantic information consisting of native instructions that carry out the actions of the bytecode instructions. We also identify control-flow semantics, from which structures such as a control-flow graph (CFG) can be generated. Together, these techniques provide the foundation for overcoming the emulation layer and performing subsequent malware analysis.

Our algorithms are based on dynamic analysis. We execute the emulated malware once in a protected environment and record the entire x86 instruction trace generated by the malware. From this trace, we extract syntactic and semantic information about the bytecode that the malware's emulator was interpreting during execution. Our approach thus offers the opportunity to reconstruct behavioral information about unknown bytecode interpreted by an unknown decode-dispatch emulator and to subsequently apply traditional malware analysis to the sample.

The algorithms operate as follows:

1. Identify variables within the raw memory of the emulator based upon the access patterns of reads and writes in an execution trace. We developed abstract variable binding, a forward and backward dynamic data-flow analysis, for this identification.

2. Identify the subset of those variables that are candidates for the emulator's virtual program counter (VPC). We find candidates by clustering the emulator's memory reads, some of which are bytecode fetches, based upon the abstract variable used to specify the accessed memory location (a sketch of this clustering appears after the list).

3. Identify the boundaries of the bytecode data within the x86 application, the decode-dispatch loop, and the emulator phases. For each cluster of reads through the same abstract variable, we determine whether the reads occurred during execution of a loop iteration with emulator-like operations.

4. Identify the syntax and the semantics of bytecode operations. We examine how bytecode is accessed by the emulation phases to identify its syntax. We discover bytecode handlers, which form the semantics of the bytecode instructions. Handlers that cause non-sequential updates to the VPC indicate that the bytecode opcode corresponds to a control transfer operation, allowing our system to construct a CFG for the bytecode.

The following sections describe these steps in detail.
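As a concrete illustration of step 2 above, the sketch below groups the trace's memory reads by the abstract variable bound to each read's address; the TraceRead layout is invented for illustration and is not Rotalumé's actual trace format:

    #include <cstdint>
    #include <map>
    #include <vector>

    struct TraceRead {
        uint64_t pc;         // address of the read instruction
        uint64_t addr;       // memory location that was read
        uint64_t bound_var;  // abstract variable (memory address) bound to the read
    };

    // Cluster reads by bound abstract variable; clusters whose reads sweep
    // through a contiguous region inside a loop are VPC candidates.
    std::map<uint64_t, std::vector<TraceRead>>
    cluster_reads(const std::vector<TraceRead> &trace)
    {
        std::map<uint64_t, std::vector<TraceRead>> clusters;
        for (const TraceRead &r : trace)
            clusters[r.bound_var].push_back(r);
        return clusters;
    }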

6.4.1 Abstract Variable Binding

A decode-dispatch emulator fetches bytecode instructions from addresses specified by a virtual program counter (VPC). Like a program counter or instruction pointer register in hardware, a VPC acts as a pointer to the currently executing bytecode instruction. Knowing the memory location of the VPC allows an analyzer to observe how the emulator accesses bytecode instructions and executes them, which reveals information about the bytecode instruction syntax and semantics. We locate an emulator’s VPC through a series of analyses, beginning with abstract variable binding.

Abstract variable binding identifies, for each memory read instruction of an execution trace, the program variable containing the address that specifies the location from which the data is read. Consider pseudo-code of an emulator that regularly fetches instructions pointed to by the VPC:

    instruction = bytecode[VPC]   or   instruction = *VPC        (1)

In these examples, the VPC is an index into an array of bytecode or a direct pointer into a buffer of bytecode. During its execution, the emulator will execute these bytecode fetches many times. Although each fetch may access a different memory location within the bytecode buffer, all fetches use the same VPC variable as the specifier of the location. Abstract variable binding attaches a program variable, such as VPC, to every memory read instruction in the execution trace that uses that variable to specify its access location.

Successful abstract variable binding helps our analyzer identify the VPC and the bytecode buffer used by the unknown emulator in a malware instance. Each bytecode fetch appears in the execution trace as a memory read instruction whose accessed location is bound to the VPC variable. The emulator likely executes many other memory reads unrelated to bytecode fetches, and these may have their own bindings to other variables in the program. Steps 2 and 3 of our algorithms, presented in Sections 6.4.2 and 6.4.3, whittle down the bindings to only those of the VPC.

Our analysis of x86 instruction traces rather than source code complicates abstract variable binding in fundamental ways. First, a binary program has no notion of high-level language variables. A compiler translating an emulator's high-level code into low-level x86 instructions will assign each variable to a memory location or register in a way unknown to our analysis. Second, the x86 architecture requires all memory read and write operations that access dynamically computed addresses to use register-indirect addressing. When a memory read uses an address stored in a variable, and that variable is assigned a memory location, the address value must first be loaded into a register rather than being used directly. For example, the pseudo-x86 translation of (1) may be the two-instruction sequence:

eax ← [VPC] (2)

instruction ← [eax] (3)

where VPC represents a pointer variable (assigned a particular memory address), eax is a register, and instruction is a register or memory location. The first instruction loads the value of the variable VPC into eax. This value is the address used in the second instruction, where the register eax can be considered temporary storage for the VPC. In other words, (2) binds eax to VPC, and this binding is propagated to the read operation in (3).

With a limited number of registers, the same register can be bound to different variables at different points of execution. When updating a variable with a non-immediate value, the new value must be loaded into a register using an instruction similar to (2) before transferring the value to the variable's assigned memory location. In this case, the register is already bound to a variable, and the binding is not changed when loading the value into the register. Without knowledge of how variables in the program are assigned to memory or registers, it is hard to determine whether a register load operation such as (2) is a new variable binding to the register or a new value assigned to an already bound variable. We draw three conclusions that impact the design of our abstract variable binding algorithm. First, we use absolute memory addresses as our description of a high-level language variable. Second, we must analyze data flows along an entire trace to determine variable binding information appropriately. Third, to be able to identify all abstract variables, we must

conservatively consider both scenarios described above for each instruction similar to (2).

Abstract variable binding propagates binding information using dynamic data-flow analysis across an execution trace of the emulated malware. We use both forward and backward propagation steps, corresponding to the two different ways in which a variable's value may be set. A variable's value may be an incremental update to its previous value; forward binding identifies the abstract variables (memory locations) from which a read operation's address is derived in this case. A variable's value may also be directly overwritten with a new value unrelated to its previous contents.

Backward binding determines the appropriate bindings in this case based on the variable’s future use. Our backward binding algorithm is conservative and introduces imprecision.

A malicious emulator may also attempt to complicate analysis by obfuscating its use of a VPC. For example, it may replace a single VPC with a collection of variables and switch among the collection during execution. However, the collection must together still follow an orderly progression and update sequence to ensure that the emulator correctly executes the bytecode. This fundamental need to maintain consistency will produce data flows between two elements of the VPC collection whenever the emulator switches from one element to another. Our data-flow analysis tracks these flows and remains robust to this type of emulator obfuscation.

We use the following notation in the presentation of our algorithms. Let M denote the memory address space of the emulator and R the set of registers available on the target hardware. We uniquely identify abstract variables by memory addresses using the set α ⊆ M. We represent instructions in an intermediate representation that expresses only simple move and compute operations on registers and memory and has one source and one destination. Let the loading of constant c into register r be denoted r ← c. A memory read operation is r ←v [m], which indicates that the value v is read from memory address m and loaded into register r. Here m can be a constant or a register holding an address. Memory writes are denoted [m] ←v r. A register-to-register move operation is r1 ← r2. If the right-hand side of any of these operations involves computation using the specified variable or address, then we denote the assignment as ⇐ rather than ←. We identify each instruction in an execution trace by a sequence number i ∈ N. Let the set ρ ⊆ N consist of all read operations of the form r ← [m] or r ⇐ [m]. Lastly, let the function Addr : ρ → M be defined so that Addr(i) is the address accessed by read operation i ∈ ρ.

6.4.1.1 Forward Binding

Forward binding identifies those variables supplying the address used by a read operation. The algorithm works on the execution trace and propagates information forward along the instructions in the trace. For each register, we track both the set of abstract variables bound to the register and the value stored in the register. A memory read instruction is bound to the same variable as that bound to the register specifying the address of the read. The set Bi(r) denotes the abstract variables bound to register r ∈ R at instruction i. Vi(r) represents the value in register r at i. Forward propagation updates bindings and values according to the following six rules based upon the instruction type. For convenience, we use the function fi to indicate the computation of instruction i.

F1 r ← c: Bi(r) = {} and Vi(r) = c


F2 r ←v [c]: Bi(r) = {c} and Vi(r) = v

F3 r1 ←v [r2]: Bi(r1) = {Vi−1(r2)} and Vi(r1) = v

F4 r ⇐ c: Bi(r) = Bi−1(r) and Vi(r) = fi(Vi−1(r), c)

F5 r ⇐v [c]: Bi(r) = Bi−1(r) ∪ {c} and Vi(r) = fi(Vi−1(r), v)

F6 r1 ⇐v [r2]: Bi(r1) = Bi−1(r1) ∪ {Vi−1(r2)} and Vi(r1) = fi(Vi−1(r1), v)

Rules F1–F3 apply new register value assignments where values are taken directly from a constant, register, or memory location. Bindings reset to those of the data source. Rules F4–F6 compute assigned values from the previous register value and other data; these correspond to incremental updates to a value and do not cause bindings to reset. Rules F3 and F6 correspond to reads using indirect addressing and are points where instruction bindings indicate possible VPC use.

All rules update B and V based on the current and immediately preceding instructions, so propagation operates from the first instruction to the last. The algorithm outputs the mapping FB : N → α∗ describing bindings from memory read operations to abstract variables. Algorithm 1 presents the pseudo-code for forward binding; l is the trace length.

Algorithm 1 Forward Binding
  Initialize: ∀r ∈ R : B0(r) = {} and V0(r) = NONE
  for i = 1 to l do
      ∀r ∈ R : Bi(r) = Bi−1(r) and Vi(r) = Vi−1(r)
      Update Bi and Vi using rules F1 to F6
      if i ∈ ρ and instruction is r1 ←v [r2] or r1 ⇐v [r2] then
          FB(i) = Bi(r2)
      end if
  end for
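For concreteness, the following C++ sketch is our reading of Algorithm 1 over a simplified one-source/one-destination IR; the Kind and Insn types are invented, and '+' stands in for the computation fi of rules F4 to F6.

#include <cstdint>
#include <map>
#include <set>
#include <vector>

using Addr = std::uint64_t;
using Reg  = int;

// Simplified one-source/one-destination IR, as in Section 6.4.1. The
// six kinds correspond to rules F1-F6.
enum class Kind { LoadConst, ReadConstAddr, ReadRegAddr,            // F1-F3
                  ComputeConst, ComputeConstAddr, ComputeRegAddr }; // F4-F6
struct Insn { Kind kind; Reg dst; Reg addrReg; Addr addr; std::uint64_t value; };

// Returns FB: read instruction index -> abstract variables (addresses)
// bound to the register that supplied the read's address.
std::map<std::size_t, std::set<Addr>>
forwardBinding(const std::vector<Insn>& trace) {
    std::map<Reg, std::set<Addr>> B;   // B_i(r): bindings per register
    std::map<Reg, std::uint64_t> V;    // V_i(r): tracked register values
    std::map<std::size_t, std::set<Addr>> FB;
    for (std::size_t i = 0; i < trace.size(); ++i) {
        const Insn& in = trace[i];
        switch (in.kind) {
        case Kind::LoadConst:                         // F1: reset bindings
            B[in.dst] = {};              V[in.dst] = in.value;         break;
        case Kind::ReadConstAddr:                     // F2: bind to address c
            B[in.dst] = {in.addr};       V[in.dst] = in.value;         break;
        case Kind::ReadRegAddr:                       // F3: bind to V(r2)
            B[in.dst] = {V[in.addrReg]}; V[in.dst] = in.value;         break;
        case Kind::ComputeConst:                      // F4: keep bindings
            V[in.dst] += in.value;                                     break;
        case Kind::ComputeConstAddr:                  // F5: accumulate c
            B[in.dst].insert(in.addr);       V[in.dst] += in.value;    break;
        case Kind::ComputeRegAddr:                    // F6: accumulate V(r2)
            B[in.dst].insert(V[in.addrReg]); V[in.dst] += in.value;    break;
        }
        // FB(i) = B_i(r2) for register-addressed reads in rho
        // (constant-address reads omitted, as in Algorithm 1).
        if (in.kind == Kind::ReadRegAddr || in.kind == Kind::ComputeRegAddr)
            FB[i] = B[in.addrReg];
    }
    return FB;
}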

6.4.1.2 Backward Binding

We expect realistic bytecode languages to provide one or more types of control transfer operations such as jumps and branches. Forward binding propagates abstract variable bindings when operations compute an incremental update to an already-bound variable, which does not reflect the semantics of a control transfer. Control transfers will instead cause the emulator to directly overwrite the VPC with the target address. The emulator will execute instructions matching rule F1 or F2, which reset the binding information. Backward binding, a second variable binding step that follows the forward algorithm, computes bindings in such cases by identifying bound variables when a value is written out to memory and then propagating the binding backward to all previous uses. This algorithm is conservative and may over-approximate the actual variable bindings.

Backward binding operates in the reverse order of forward binding. It first binds a variable to a register when that register’s value is stored to memory, and then propagates the binding backwards to other registers using the following rules:

B1 [r1] ← r2 : B′i(r2) = {Vi(r1)}

B2 [c] ← r : B′i(r) = {c}

B3 r1 ← r2 : B′i(r2) = B′i+1(r1)

The algorithm outputs the mapping for backward binding, BB : N → α∗. Algorithm 2 presents pseudo-code for backward binding.

Algorithm 2 Backward Binding
  Input: ∀r ∈ R and 1 ≤ i ≤ l : Vi(r) from Alg. 1
  Initialize: B′l+1(rj) = {σrj} for each rj ∈ R, 1 ≤ j ≤ k
  for i = l to 1 do
      ∀r ∈ R : B′i(r) = B′i+1(r)
      Update B′i using rules B1 to B3
      if i ∈ ρ and instruction is r1 ←v [r2] or r1 ⇐v [r2] then
          BB(i) = B′i(r2)
      end if
  end for
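A matching sketch of Algorithm 2 follows, again over an invented minimal IR covering only the write, move, and read forms that rules B1 to B3 and the BB recording need; the per-instruction register values Vi are assumed to come from the forward pass.

#include <cstdint>
#include <map>
#include <set>
#include <vector>

using Addr = std::uint64_t;
using Reg  = int;

// Minimal IR for the backward pass; kinds and fields are illustrative.
enum class WKind { WriteRegAddr,    // B1: [r1] <- r2
                   WriteConstAddr,  // B2: [c]  <- r1
                   RegMove,         // B3: r1   <- r2
                   ReadRegAddr };   // r1 <- [r2], a member of rho
struct WInsn {
    WKind kind; Reg r1, r2; Addr c;
    std::map<Reg, std::uint64_t> V;  // register values V_i from Algorithm 1
};

std::map<std::size_t, std::set<Addr>>
backwardBinding(const std::vector<WInsn>& trace) {
    std::map<Reg, std::set<Addr>> Bp;             // B', flows backward
    std::map<std::size_t, std::set<Addr>> BB;
    for (std::size_t i = trace.size(); i-- > 0; ) {
        const WInsn& in = trace[i];
        switch (in.kind) {
        case WKind::WriteRegAddr:                 // B1: bind the stored
            Bp[in.r2] = { in.V.at(in.r1) };       // register to V_i(r1)
            break;
        case WKind::WriteConstAddr:               // B2: bind to constant c
            Bp[in.r1] = { in.c };
            break;
        case WKind::RegMove:                      // B3: r2 inherits r1's
            Bp[in.r2] = Bp[in.r1];                // future binding
            break;
        case WKind::ReadRegAddr:                  // record BB(i) = B'_i(r2)
            BB[i] = Bp[in.r2];
            break;
        }
    }
    return BB;
}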

6.4.1.3 Identifying Dependent Abstract Variables

Variables used to contain memory read addresses may themselves be interdependent. For example, if the obfuscation technique of utilizing a collection of VPCs is used, the VPC-related variables have a relationship with each other. Likewise, memory reads that access operands may occur at short, fixed offsets from the VPC value. We identify dependencies among abstract variables by tracking the flow of data values from one abstract variable to another. As this is a forward data-flow, we incorporate dependent variable identification into our forward binding algorithm.

Let the mapping DV : α → α∗ denote abstract variable dependencies, where Y ∈ α is a dependent variable of X if Y ∈ DV (X). To populate this mapping, we add the following rule to Algorithm 1:

D1 [r1] ← r2 : DV (Vi(r1)) = DV (Vi(r1)) ∪ Bi(r2)

Dependencies are symmetric. After the completion of Algorithm 1, we therefore add the converse of each dependency identified by the algorithm: ∀X, Y ∈ M, if X ∈ DV (Y ) then add Y to DV (X). Identifying these abstract variable dependencies thwarts attacks that introduce extraneous store operations or copy operations from one variable to another before use.
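As a sketch, rule D1 and the symmetric closure can be maintained as follows; the helper names are ours:

#include <cstdint>
#include <map>
#include <set>

using Addr = std::uint64_t;

// DV: the abstract-variable dependence map populated by rule D1.
std::map<Addr, std::set<Addr>> DV;

// Rule D1: on a write [r1] <- r2, the variable at the written address
// (the value held in r1) gains a dependence on every variable currently
// bound to the source register r2.
void onMemoryWrite(Addr writtenAddr, const std::set<Addr>& srcRegBindings) {
    DV[writtenAddr].insert(srcRegBindings.begin(), srcRegBindings.end());
}

// After Algorithm 1 completes, add the converse of each dependency so
// that X in DV(Y) implies Y in DV(X).
void addConverseDependencies() {
    const auto snapshot = DV;                 // iterate over a stable copy
    for (const auto& [x, deps] : snapshot)
        for (Addr y : deps)
            DV[y].insert(x);
}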

6.4.1.4 Lifetime of Variables

Our x86 execution trace analysis introduces one final challenge that is often not present when using high-level language variables. We use absolute memory locations as our abstract variables, but the same memory location may be used for different variables at different points of execution. Although variables in the static data region have the lifetime of the entire execution, variables on the stack and heap have shorter lifetimes. The same address can be shared among multiple variables in different execution contexts, depending on the allocation and deallocation operations performed during execution.

We address the limited lifetime of stack variables by including stack semantics and analysis of the stack pointer register esp as part of our algorithms. Our set of abstract variables α is made of tuples α ⊆ M × N × N that use integers to denote the start and end of the variable's live range within the execution trace. At the first access to a memory address m ∈ M at instruction s, we add (m, s, ∞) to α. If the t-th instruction modifies the esp register such that an abstract variable's memory address has been deallocated from the stack, then the end of its lifetime is set to t. Any access to the same memory address after its lifetime has expired creates a new abstract variable.

For pedagogy, we presented Algorithms 1 and 2 without live ranges; updates to the algorithms to include lifetimes are straightforward.
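A minimal sketch of the live-range bookkeeping follows, assuming a downward-growing x86 stack and restricting attention to stack addresses; the names are ours, and the real system integrates this with Algorithms 1 and 2.

#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using Addr = std::uint64_t;

// An abstract variable is the tuple (m, start, end): an address plus its
// live range within the execution trace.
struct AbstractVar { Addr m; std::size_t start; std::size_t end; };

std::vector<AbstractVar> vars;
std::map<Addr, std::size_t> live;   // address -> index of the live variable

// First access to stack address m at instruction s creates (m, s, infinity).
void onStackAccess(Addr m, std::size_t s) {
    if (!live.count(m)) {
        live[m] = vars.size();
        vars.push_back({m, s, SIZE_MAX});     // SIZE_MAX stands for infinity
    }
}

// Instruction t raised esp; with a downward-growing stack, addresses below
// the new esp are deallocated, so their live ranges end at t. A later
// access to the same address then creates a fresh abstract variable.
void onEspUpdate(Addr newEsp, std::size_t t) {
    for (auto it = live.begin(); it != live.end(); ) {
        if (it->first < newEsp) {
            vars[it->second].end = t;
            it = live.erase(it);
        } else {
            ++it;
        }
    }
}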

In this work, we do not address lifetimes for variables allocated on the heap. Prior heap analysis research [22] often assumed that the analyzer understood the heap allocation and deallocation routines. We cannot make this assumption for malware binaries, which may be stripped of debugging information and deliberately obfuscate the heap routines. Further research is needed to address this open problem.

6.4.2 Identifying Candidate VPCs

We use the computed variable bindings to identify candidate variables that may be the malware emulator's virtual program counter. We first combine the bindings identified by the forward and backward algorithms to compute the complete abstract variable bindings for each memory read. Let the function Vars : N → α∗ be computed as the transitive application of the dependence function DV to FB(i) ∪ BB(i) for a read operation i ∈ ρ. We then cluster all read operations within the execution trace and group together those reads that are bound to common abstract variables. Our clustering uses a simple similarity metric that treats two reads i1, i2 ∈ ρ as similar if Vars(i1) ∩ Vars(i2) ≠ ∅, and dissimilar otherwise. The clustering algorithm outputs n clusters C1, . . . , Cn, where each cluster Ci is a set of read operations.

The malware bytecode should be fetched for execution exclusively by memory read operations contained within one of the n clusters. Abstract variable binding over-approximates actual bindings due to the backward algorithm, which results in two reads being clustered together if they may use the same abstract variable to specify the accessed address. The transitive closure of the dependencies among abstract variables ensures two reads will be similar even if the reads use two distinct abstract variables.

Therefore, the bytecode program will be completely contained within a cluster. Each cluster is then a candidate collection of bytecode instruction fetches, and the common abstract variables of each cluster are candidate VPCs.
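Because similarity is "shares at least one abstract variable", the clusters are exactly the connected components induced by variable sharing, which a union-find pass computes directly; a sketch with our own naming follows.

#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

using Addr = std::uint64_t;

// Union-find over read operations; reads sharing any abstract variable
// land in one cluster (the transitive closure of the similarity metric).
struct UnionFind {
    std::vector<std::size_t> parent;
    explicit UnionFind(std::size_t n) : parent(n) {
        for (std::size_t i = 0; i < n; ++i) parent[i] = i;
    }
    std::size_t find(std::size_t x) {
        while (parent[x] != x) x = parent[x] = parent[parent[x]];
        return x;
    }
    void unite(std::size_t a, std::size_t b) { parent[find(a)] = find(b); }
};

// readVars[i] = Vars(i) for the i-th read operation in the trace.
std::vector<std::set<std::size_t>>
clusterReads(const std::vector<std::set<Addr>>& readVars) {
    UnionFind uf(readVars.size());
    std::map<Addr, std::size_t> firstUse;   // variable -> first read using it
    for (std::size_t i = 0; i < readVars.size(); ++i)
        for (Addr v : readVars[i]) {
            auto [it, fresh] = firstUse.emplace(v, i);
            if (!fresh) uf.unite(i, it->second);
        }
    std::map<std::size_t, std::set<std::size_t>> byRoot;
    for (std::size_t i = 0; i < readVars.size(); ++i)
        byRoot[uf.find(i)].insert(i);
    std::vector<std::set<std::size_t>> clusters;
    for (auto& [root, members] : byRoot)
        clusters.push_back(std::move(members));
    return clusters;
}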

6.4.3 Identifying Emulation Behavior

We analyze each candidate cluster and VPC to find a cluster containing memory reads characteristic of emulation. Decode-dispatch emulators have fundamental execution properties: a main loop with a bytecode fetch through the VPC, decoding of the opcode within the bytecode, dispatch to an opcode handler, and a change to the VPC value. For each candidate cluster, we hypothesize that the memory region read by the cluster corresponds to bytecode and then test that hypothesis. We determine whether there exists an iterative pattern of bytecode fetches through the associated candidate VPC and updates to that possible VPC. To detect loops, we first create a partial dynamic control-flow graph (CFG) of the program in execution. We use the control-flow semantics of the executed instructions to create new basic blocks and split already created blocks. We use function call semantics to create separate CFGs for each function. Then, we use the standard loop detection methods used for static intra-procedural CFGs [15].
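As a simplified stand-in for the loop detection cited above [15], the cycle-detection core can be sketched as a depth-first search for back edges over the partial dynamic CFG; the representation below is ours, not Rotalumé's.

#include <cstdint>
#include <map>
#include <set>

using Addr = std::uint64_t;

// Partial dynamic CFG: basic-block start address -> successor starts.
std::map<Addr, std::set<Addr>> succs;

// DFS back-edge detection: an edge into a node still on the DFS stack
// closes a cycle, i.e., the trace re-entered an already executing path.
bool hasBackEdge(Addr n, std::set<Addr>& onStack, std::set<Addr>& done) {
    onStack.insert(n);
    bool found = false;
    for (Addr s : succs[n]) {
        if (onStack.count(s)) found = true;              // back edge: a loop
        else if (!done.count(s)) found |= hasBackEdge(s, onStack, done);
    }
    onStack.erase(n);
    done.insert(n);
    return found;
}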

To find decoding, dispatching, and execution of bytecode after the memory read fetches it from the bytecode buffer, we analyze how the read values are used by other instructions within the execution trace. We use multi-level dynamic tainting [168] to track the propagation of the data read from instructions in a candidate cluster through the emulator's code. In contrast to traditional taint analysis with 0/1 taint labels, we apply multiple labels to memory contents and registers at the byte level. Different labels track individual data read from the cluster and maintain state information about which phase (fetch, decode, dispatch, or execute) the emulator may be in for a particular read.

We use dynamic taint analysis as follows. For each candidate cluster, we taint the data bytes in the hypothesized bytecode buffer region of the cluster with the label ⟨opcode, id⟩, where id is a unique per-byte identifier. When a read operation in the execution trace accesses a tainted byte, we mark the instruction as an opcode fetch for the particular id in the label. If the instruction sequence number is i, then we also taint the forward-bound variables FB(i) and the register holding the address accessed by the read operation with the label ⟨vpc, id⟩, indicating that it is a VPC for the emulator. Execution continues until our analyzer detects opcode dispatch behavior.

We identify dispatch behavior by looking for control-flow transfer instructions executed by the emulator that are influenced by data read from the cluster's hypothesized bytecode buffer: these are transfers into handlers for specific bytecode opcodes. In the simplest scenario, an x86 instruction like jmp or call can target an address read from a tainted register or from a dispatch table accessed through a tainted register. More complex code patterns may include arbitrary data and control flows between a control-flow target lookup and the actual dispatch. Taint propagation ensures that taint labels transfer from an address to values read through that address, to copies of those values, and to the control-flow transfer. Once the analyzer detects dispatch-like behavior, it marks the dispatch instruction with the id of the taint label, and the analysis then tracks the target of the control transfer as a probable execute routine.

Each subsequent read in the candidate cluster may be accessing a new bytecode, operands for the current bytecode, or an unrelated memory value included in the cluster due to the imprecision of backward variable binding. We first identify new bytecode accesses by analyzing the dynamic CFG to see if execution looped since the previous bytecode fetch. If a loop is not detected, we then check whether the read is accessing a probable operand in the bytecode buffer. If the register used to perform the read operation is tainted as ⟨vpc, id⟩ with the id of the current iteration, and the computation of the accessed address added a small constant to the candidate VPC value, then the memory access is likely for an operand. We consider all other accesses to be spurious.

We consider every candidate cluster containing iterative memory reads in a loop that includes dispatch behavior to be exhibiting emulation. There must be at least two loop executions in the dynamic trace for our analysis to identify the loop.
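A heavily simplified sketch of the per-byte labeling is given below; real multi-level taint propagation must also follow copies and computations [168], and every type and function name here is ours.

#include <cstdint>
#include <map>
#include <optional>

using Addr = std::uint64_t;

// Per-byte taint labels: which hypothesized bytecode byte (id) a value
// derives from, and whether the value acts as an opcode or as a VPC.
enum class Tag { Opcode, Vpc };
struct Label { Tag tag; std::uint32_t id; };

std::map<Addr, Label> memTaint;   // shadow memory for the bytecode buffer
std::map<int, Label>  regTaint;   // shadow registers

// A read through register addrReg from address a inside the candidate
// buffer: mark an opcode fetch, taint the address register as a VPC, and
// let the destination register carry the opcode label onward.
void onRead(int dstReg, int addrReg, Addr a) {
    auto it = memTaint.find(a);
    if (it == memTaint.end()) return;
    regTaint[addrReg] = {Tag::Vpc, it->second.id};
    regTaint[dstReg]  = {Tag::Opcode, it->second.id};
}

// An indirect jmp/call through register r: if r carries an opcode label,
// this transfer is dispatch behavior for that bytecode id.
std::optional<std::uint32_t> onIndirectBranch(int r) {
    auto it = regTaint.find(r);
    if (it != regTaint.end() && it->second.tag == Tag::Opcode)
        return it->second.id;                    // dispatch detected
    return std::nullopt;
}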

Figure 16: Analysis process overview

6.4.4 Extracting Syntax and Semantics

Once the analyzer identifies the emulation behavior, it reverse engineers each iteration of the emulator loop to extract the syntax and semantics of the bytecode instruction executed on that iteration. The syntax of bytecode details how to parse the instruction: its length and the placement of its opcode and operands. Bytecode semantics describe the bytecode's effect upon execution of the malware instance. We are particularly interested in identifying bytecode instructions exhibiting control-flow transfer semantics, as these are the locations where malware analysis techniques such as multipath exploration [101] should be applied.

We identify the syntax of bytecode instructions by observing the memory reads made from the data regions containing bytecode, as determined in Section 6.4.3. To identify the opcode part of the instruction, we apply our taint analysis to determine which portion was used by the emulator's dispatch stage for selection of an execution handler. We can identify opcodes at the granularity of one or more bytes within a bytecode instruction, since our taint analysis works at the byte level. An emulator may dispatch several different opcodes to the same execution routine because their semantics may be similar. As a result, we count the number of instructions in the bytecode instruction set as the number of unique execution routines identified in our analysis.

The execution routine invoked by the emulator for the bytecode’s opcode encodes the semantics of the opcode. We find control flow transfers by analyzing the changes made by an execution routine upon the VPC of the emulator. Unconditional transfers, including fall-through instructions, will always set the VPC to the same value on every execution of that instruction. Commonly, fall-throughs simply advance the VPC to the next instruction in sequential order, a regular update pattern that can be readily identified. Conditional control-flow transfers and transfers to dynamically-computed targets will update the VPC in different ways upon repeated execution of the bytecode instruction.
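The classification described above reduces to bookkeeping over observed VPC updates; a sketch under our own naming (not the actual Rotalumé code) follows.

#include <cstdint>
#include <map>
#include <set>

using Vpc = std::uint64_t;

// For each bytecode instruction, keyed by its location (VPC), record
// every VPC value observed after its handler executed.
std::map<Vpc, std::set<Vpc>> observedTargets;
std::map<Vpc, std::uint64_t> insnLength;   // from the syntax analysis

void onHandlerDone(Vpc before, Vpc after) {
    observedTargets[before].insert(after);
}

enum class Semantics { FallThrough, UnconditionalTransfer, ConditionalOrComputed };

// Classify a bytecode instruction from its observed VPC updates: a fixed
// target that is not the next instruction is an unconditional transfer;
// differing targets across executions indicate a conditional or
// dynamically computed transfer.
Semantics classify(Vpc at) {
    const std::set<Vpc>& targets = observedTargets.at(at);
    if (targets.size() > 1) return Semantics::ConditionalOrComputed;
    if (*targets.begin() == at + insnLength.at(at)) return Semantics::FallThrough;
    return Semantics::UnconditionalTransfer;
}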

By determining how to parse the bytecode buffer and by locating control-flow transfer opcodes, we are then able to construct a control-flow graph (CFG) for the bytecode. The locations of the control-flow transfers and their target addresses within the bytecode stipulate how to divide the entire bytecode buffer into basic blocks.

The transfers then produce edges between the blocks corresponding to possible VPC changes during emulated execution. The CFG structure provides a foundation for subsequent malware analysis.
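Given the transfer locations and their targets, the CFG construction is the standard leaders-and-edges procedure; the sketch below assumes each transfer's fall-through and target addresses have already been recovered, and uses our own names.

#include <cstdint>
#include <map>
#include <set>
#include <vector>

using Vpc = std::uint64_t;

// A discovered control transfer: its location, the address following it,
// and the target addresses observed (or decoded) for it.
struct Transfer { Vpc at; Vpc fallThrough; std::set<Vpc> targets; };

// Every transfer target starts a basic block, as does the instruction
// following each transfer; transfers produce the edges between blocks.
void buildBytecodeCfg(const std::vector<Transfer>& xfers,
                      std::set<Vpc>& leaders,
                      std::map<Vpc, std::set<Vpc>>& edges) {
    for (const Transfer& t : xfers) {
        leaders.insert(t.fallThrough);           // block boundary after t
        for (Vpc tgt : t.targets) {
            leaders.insert(tgt);                 // block boundary at target
            edges[t.at].insert(tgt);             // edge for a possible VPC change
        }
    }
}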

6.5 Implementation

Our automatic reverse engineering occurs in three phases: dynamic tracing, clustering, and behavioral analysis. Figure 16 shows the different phases of our process and the interactions among the architectural components used by the different analysis steps. The dynamic tracing phase gathers run-time data related to a malware emulator's execution, and allows the clustering and behavioral analysis phases to extract the malware bytecode and the syntactic and semantic information for the bytecode instruction set.

There are two important requirements for the run-time environment of the dynamic tracing phase: instruction-level tracing, and isolation from malware and attacks. Since the analysis techniques in Rotalumé are orthogonal to the underlying run-time environment and our goal here is to develop and evaluate these techniques, we implemented our dynamic analysis techniques on top of QEMU [26], which emulates an x86 computer system. For a deployable version of Rotalumé, we suggest using a more transparent and robust environment, such as a hardware virtualization based system like Ether [51]. The components in the latter two phases were developed as an offline analyzer written in C++. In our current prototype implementation, each individual phase is activated manually using the result of the previous phase. However, our design can be completely automated to process large numbers of malware samples.

6.5.1 Dynamic Tracing

The first phase collects the dynamic instruction trace of the emulator program, which executes inside the QEMU guest operating system. We modified QEMU by inserting a callback function that invokes Rotalumé's Trace Extractor Engine (EE) for every instruction executed in QEMU. The EE component collects the necessary context information related to the executed instruction and stores the intermediate representation (IR) that is used in later phases of the system. Our IR is self-contained: we store the instruction representation as well as the values of the operands involved in the instruction. We log all information so that we may perform off-line analysis without requiring additional dynamic analysis. The output of this phase is the dynamic trace of the program in IR form.
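We sketch below what such a self-contained IR record might look like, consistent with the description above (a simplified operation plus the concrete operand values); the record layout and the callback name are ours, not QEMU's or Rotalumé's actual interfaces.

#include <cstdint>
#include <fstream>

// One self-contained IR record per executed guest instruction: the
// simplified move/compute operation together with the operand values
// observed at execution time, so later phases need no re-execution.
struct IRRecord {
    std::uint64_t seq;        // sequence number i within the trace
    std::uint8_t  op;         // simplified operation (move, compute, ...)
    std::uint8_t  dstReg;
    std::uint8_t  srcReg;
    std::uint64_t memAddr;    // effective address, if the insn touches memory
    std::uint64_t value;      // the value moved or computed
};

// Called once for every instruction the guest executes; appends the
// record to the trace file for offline analysis.
void onGuestInstruction(std::ofstream& traceFile, const IRRecord& rec) {
    traceFile.write(reinterpret_cast<const char*>(&rec), sizeof rec);
}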

6.5.2 Clustering

The second phase clusters the memory read operations visible in the trace. We group together the read operations performed by the program based on the common variable used to access each read memory location. This phase is performed by two main components: the Binding Engine (BE) and the Clustering Engine (CE). The BE component is a program that takes as input the IR dynamic program trace and applies the backward and forward abstract variable binding algorithms described in Section 6.4.1. For each algorithm, we store binding information differently. More specifically, for each instruction in forward binding, we store the following information: the instruction id (a unique identifier for each instruction present in the dynamic program trace IR), the destination register operand of the instruction, and the bound variables associated with the destination register according to the rules described in Section 6.4.1.1. For each instruction in backward binding, we store the instruction id and the bound variables associated with the registers or memory locations according to the rules defined in Section 6.4.1.2. The BE component provides the binding information to the CE. The CE component is a program that inputs the IR dynamic trace and the binding information, and applies the clustering algorithm. At a high level, the CE takes the union of the forward and backward binding information, applies the dependence function of Section 6.4.2, and produces the cluster information. The cluster information contains a vector of sets, where each set contains the addresses of the memory read instructions that are accessed through the same variable. At the end of this phase, the cluster information is saved to a file.

6.5.3 Behavioral Analysis

The behavioral analysis phase provides the final information output of Rotalumé. We implemented a behavioral analyzer composed of two sub-components: the Taint Engine and the Emulation Behavior Detector. The behavioral analyzer is a program that takes as input the IR dynamic trace and the clustering information, and analyzes one cluster at a time. For each cluster, the Taint Engine taints the memory addresses contained in the cluster and activates the Emulation Behavior Detector. This analyzer is a state machine that follows the tainted addresses and identifies the emulation behavior, as described in Sections 6.4.3 and 6.4.4. Whenever the analyzer recognizes an opcode, the system stores information about the opcode in a file. More specifically, for each executed opcode the analyzer stores its opcode value, the operands' values, and the x86 code in assembly format associated with the executed opcode.

6.6 Evaluation

We evaluated Rotalumé using both synthetic and real programs, including legitimate applications and real-world emulated malware. These programs were obfuscated with three commercially available packers that support emulation: Code Virtualizer [107], Themida [108], and VMProtect [150]. VMProtect and Code Virtualizer convert selected functions of a given binary program into a bytecode program with a randomly generated bytecode language. Themida, which is more widely used for malware, does not apply emulation to the given malicious binary program itself but rather to the unpacking routine and the code that invokes API calls.

6.6.1 Synthetic Tests

We first experimented with synthetic test programs. Our goal was to use the ground truth of the synthetic programs to evaluate the extracted bytecode program and the syntax and semantics of the virtual instruction set architecture identified by Rotalumé. We used Code Virtualizer and VMProtect because they can obfuscate any user-specified function in a program.

We wrote three simple synthetic test programs in C. Each test program contained a function with distinguishable control-flow characteristics that we wanted to obfuscate. We compiled these programs and converted them to x86 binaries. We analyzed the static characteristics of the compiled code using IDA Pro [5] and the dynamic characteristics by tracing the programs in our QEMU-based system. Table 9 lists information about the three functions of these test programs. For each function, the table shows the total numbers of x86 instructions ("Inst.") and control-flow instructions ("C-Flow") obtained from static analysis. The total numbers of x86 and control-flow instructions in an execution trace of the functions, obtained from dynamic analysis, are also shown. Program synth1 involves simple computation without any conditional branch or function call. Program synth2 contains nested if statements, and hence its execution trace contains only a part of its (static) program instructions. Finally, synth3 contains both if statements and a for loop. Its trace length was larger than the static x86 instruction count because of loops.

Table 9: Description of synthetic test programs

  Program   Description    x86 Program         x86 Trace
                           Inst.    C-Flow     Inst.    C-Flow
  synth1    No branch      24       1          24       1
  synth2    Nested if      61       11         21       7
  synth3    Loop and if    55       10         270      54

Table 10: Results for synthetic programs obfuscated with Code Virtualizer

  Program   Bytecode Trace      Bytecode Program    Virtual Instruction Set Architecture
            (inst. count)       (inst. count)
            All types  C-Flow   All types  C-Flow   All types  C-Flow (Cond.)  0 Opr  1 Opr  2 Opr
  synth1    277        1        277        1        23         1 (0)           3      6      15
  synth2    254        7        254        7        27         3 (1)           4      8      16
  synth3    3481       54       684        8        31         3 (1)           4      9      18

Table 11: Results for synthetic programs obfuscated with VMProtect

  Program   Bytecode Trace      Bytecode Program    Virtual Instruction Set Architecture
            (inst. count)       (inst. count)
            All types  C-Flow   All types  C-Flow   All types  C-Flow (Cond.)  0 Opr  1 Opr  2 Opr
  synth1    497        1        497        1        16         1 (0)           4      8      4
  synth2    442        7        442        7        18         2 (0)           4      9      5
  synth3    5709       54       785        8        18         2 (0)           4      9      5

We used VMProtect and Code Virtualizer to obfuscate the selected functions in our three test (binary) programs. We then applied Rotalumé to analyze them.

Figure 17: Comparison of x86 and bytecode CFGs of the synth3 test program: (a) original x86 CFG; (b) bytecode CFG (Code Virtualizer); (c) bytecode CFG (VMProtect).

Rotalumé was able to correctly identify emulation behavior in all of the test cases; Tables 10 and 11 summarize the results of reversing Code Virtualizer and VMProtect, respectively. The results show information for the bytecode instructions traced and identified at run-time in terms of the instruction counts (of all types, and of control-flow instructions) of the bytecode execution trace and of the program itself. The results also show the virtual instruction set architecture (ISA) discovered by Rotalumé in terms of the number of unique bytecode instructions, the syntax of the bytecode language in terms of the number of operands, and the semantics of conditional control-flow transfers.

For both VMProtect and Code Virtualizer, the bytecode trace of a program was significantly longer than the trace of its original x86 binary. For example, synth3 executed 3,481 bytecode instructions under Code Virtualizer and 5,709 under VMProtect, compared to just 270 x86 instructions in the original program. The results also show that for all test cases, Rotalumé accounted for the same number of control-flow instructions in the bytecode execution trace as in the original x86 execution trace. This shows that Rotalumé was able to extract the control-flow information of the original programs.

Table 11 shows that for both synth2 and synth3, the VMProtect virtual ISA extracted by Rotalumé has no conditional control-flow instructions, unlike the results from Code Virtualizer. We investigated this discrepancy by analyzing the x86 execution traces of the VMProtected software and then comparing them with the bytecode information provided by Rotalumé. We found that Rotalumé correctly identified the decode, dispatch, and execution routines of the emulator. We manually analyzed the traces of the execution routines and did not find any x86 conditional branch instruction. This means that there were no conditional jumps in the bytecode program traces. By carefully analyzing the semantics of the instructions before the control-transfer instruction, we confirmed that VMProtect emulates conditional branches by dynamically computing the target address and using a single jump instruction.

Figure 17 shows that the control-flow graphs extracted by Rotalumé for synth3 are very similar to that of the original x86 program. Figure 17(a) shows the original x86 program's CFG, which contains a loop with two conditional branches. The graph shows that basic block B11 was not executed during execution. Figure 17(b) shows the CFG of the Code Virtualizer bytecode program trace as extracted by Rotalumé. The two CFGs show identical control-flow semantics. Interestingly, we could also identify an unexplored path from basic block B7. This was possible because Code Virtualizer's bytecode language has a conditional branch instruction that was identified by Rotalumé even though it was not executed. This shows a key benefit of our approach: other analyses such as multipath exploration [101] can be selectively applied to explore such paths in the emulated malware bytecode rather than in the entire emulator.

The CFG in Figure 17(c) is for the VMProtect bytecode trace. Since we found that VMProtect's bytecode has no explicit conditional branches, we are unable to provide information about a possible path that was not executed in the trace. However, we can identify the dynamically computed control-flow instructions in the trace and mark where analysis of possible branch target addresses needs to be applied. Thus, we can still uncover the control-flow information of the bytecode program. The CFG shows the existence of the loop and the condition, but the number of basic blocks is fewer than in the original CFG. This likely occurs because VMProtect applies optimizations to the bytecode.

6.6.2 Real (Unpacked) Programs

We next tested a real program obfuscated with emulation by comparing the extracted bytecode information against the original x86 program. We selected a malware program that is not packed, because self-modifying code cannot be translated into bytecode. We randomly selected the Killav.PS malware, identified as a Trojan by the Avira AntiVir antivirus software [17]. We then applied VMProtect to the binary. We were unable to use Code Virtualizer on this real software because Code Virtualizer requires a .map file, which is usually generated at compile time and hence not available with malware. Table 12 shows the results of using Rotalumé with various levels of obfuscation applied to the binary.

Table 12: TR/Killav.PS obfuscated by VMProtect

  Description          Dyn. x86 CFG    Dyn. BC CFG
                       Inst. (BB)      Inst. (BB)
  Original             1528 (435)      ×
  1 function packed    3618 (738)      2617 (16)
  5 functions packed   4103 (801)      3830 (49)

We selected one large function in the malware and used VMProtect to convert it into bytecode. The table shows that the x86 code size grows after obfuscation because the new binary additionally contains the emulator's code. Rotalumé extracted the bytecode trace and the dynamic CFG of the obfuscated function. We compared the results with the original x86 version of the obfuscated function. Although the bytecode version had only 16 basic blocks compared to the 24 blocks of the original

(not shown here; the table instead shows the size of the whole binary including the emulator), the control-flow attributes were very similar. Due to space limitations, the CFGs are not included here but may be found in a more extensive technical report [129]. From the CFGs, it appears that some basic blocks may have been combined due to code optimizations performed on the bytecode by VMProtect, but similar loops and branches were identifiable. This shows that Rotalumé indeed correctly extracted the bytecode syntax and semantics. We tested another obfuscated version of the malware in which we selectively obfuscated four additional functions. In that case, the x86 code increased less substantially, and Rotalumé successfully extracted the bytecode syntax and semantics of those functions.

Finally, we tested Rotalumé after applying emulation to a legitimate program. Using CMD.EXE, we performed experiments similar to those for the unpacked malware. Table 13 presents the results. The bytecode CFG that we obtained contained 31 basic blocks instead of the 36 in the original function. Figure 18 shows two similar portions of the control-flow graphs of a large function of CMD.EXE. We show the original x86 code's partial CFG in Figure 18(a) and the bytecode version extracted by Rotalumé from the VMProtect-obfuscated sample in Figure 18(b). The complete CFGs of the function can be found in the technical report [129].

Table 13: Tests on CMD.EXE obfuscated by VMProtect

  Description          Dyn. x86 CFG    Dyn. BC CFG
                       Inst. (BB)      Inst. (BB)
  Original             8458 (1143)     ×
  1 function packed    10429 (1345)    3488 (31)
  5 functions packed   10512 (1394)    12345 (103)

Figure 18: The partial dynamic CFGs for the x86 code and the extracted bytecode of an obfuscated function in CMD.EXE: (a) original x86 CFG; (b) extracted bytecode CFG (VMProtect obfuscated).

We found that some parts of the graphs matched perfectly, with differences in other parts likely due to code transformation and optimization differences.

6.6.3 Emulated Malware

We next evaluated Rotalumé on real malware samples that use emulation-based packers. We selected samples that are packed with Themida, VMProtect, and Code Virtualizer, the three known commercial packers that use emulation. We have access to thousands of malware samples, from which we identified the ones packed using these three tools. We then applied Rotalumé to a randomly selected set of these malware samples.

Among the three obfuscators, Themida is the most widely used within our malware samples. Themida is known not to emulate the code of the original program but rather the unpacking routine. Nevertheless, we wanted to evaluate whether Rotalumé can reverse engineer the emulator. Table 14 shows Rotalumé's output on 8 randomly selected samples. We obtained the names of these samples by submitting them to VirusTotal [149] and selecting the name that was most common among the AV tools. For each sample, we gathered the execution trace while running it for 20 seconds. The table first shows the length of the x86 trace as well as the counts of instructions ("Inst.") and basic blocks ("BB") in the dynamically created CFG of the x86 code.

Our analyzer was able to detect the emulator in all cases: the table shows information for the extracted bytecode trace, and we built the control-flow graph using the extracted control-flow semantics of the bytecode language. We do not show the syntax and semantic information of the bytecode instruction set here because we found that the instruction sets consistently contain 31 instructions. However, the syntax of the instructions varied, showing that the instruction sets were highly randomized. We also manually analyzed the instruction set and observed that the semantics were very close to those of Code Virtualizer. This is not surprising given that both tools are from the same vendor [106]. In all samples, the x86 CFG is very large compared to the corresponding bytecode CFG. Again, this shows that Themida was not designed to obfuscate a program completely with emulation.

We then experimented with a group of randomly selected samples that use VMProtect. We present the results for 5 samples in Table 15. Rotalumé was able to detect the emulation process in each of the samples and produced the syntactic and semantic information of the bytecode language, the bytecode trace, and the CFG. Rotalumé identified 18 bytecode instructions in the instruction set in each case. This matched the output from the synthetic test samples synth2 and synth3 in Table 11. Interestingly, the syntax was also the same. Like the Themida samples, these samples had very small amounts of code emulated. We conjecture that the malware authors likely used the demo version of VMProtect, which only allows conversion of one

function of the binary into bytecode.

Table 14: Malware packed with Themida

  Description   x86 Trace   Dyn. x86 CFG    BC Trace   Dyn. BC CFG
                            Inst. (BB)                 Inst. (BB)
  Themida5      15.3M       9753 (2156)     15232      3421 (57)
  Themida1      1.4M        5961 (1156)     1339       1339 (6)
  Themida3      14.8M       10211 (2125)    2142       2142 (15)
  Themida7      21.4M       14205 (3529)    5171       3042 (28)
  Themida8      3.5M        6011 (2125)     1534       1534 (9)
  Themida11     11.1M       9021 (2925)     1784       1784 (10)
  Themida13     11.4M       10211 (3194)    19642      4142 (65)
  Themida14     17.3M       11492 (2877)    14219      3751 (75)

Table 15: Malware packed with VMProtect

  Description         x86 Trace   Dyn. x86 CFG    BC Trace   Dyn. BC CFG
                                  Inst. (BB)                 Inst. (BB)
  Win32.KGen.bxp      3.1M        2122 (591)      1112       1112 (9)
  Win32.KillAV.ahb    1.4M        4104 (1156)     1231       1231 (12)
  Graybird            131K        823 (275)       2926       1584 (18)
  Win32.Klone.af∗     5.0M        4263 (707)      1241       1241 (17)
  Win32.Klone.af∗     3.2M        4123 (484)      1149       1149 (14)

Table 16: Malware packed with Code Virtualizer

  Description         x86 Trace   Dyn. x86 CFG    BC Trace   Dyn. BC CFG
                                  Inst. (BB)                 Inst. (BB)
  Win32.Delf.Knz∗     7.0M        2249 (608)      114526     10054 (343)
  Win32.Delf.Knz∗     15.5M       2594 (720)      234012     25221 (742)
  Win32.Delf.Knz∗     14.5M       2531 (738)      215892     19850 (771)

We also experimented with recent malware samples that use Code Virtualizer.

Table 16 shows the results. All of the samples were identified with the same name in VirusTotal even though their program sizes and MD5 checksums varied. After analyzing these malware samples with Rotalumé, we found that unlike the samples we tested with Themida and VMProtect, these samples have large portions of their code converted into bytecode. The bytecode CFGs of these programs varied significantly, showing that they may be quite different programs even though they share the same name.

6.7 Discussion

In this section, we discuss three open problems and challenges: alternative emulator designs, incomplete bytecode reconstruction, and code analysis limitations.

First, our current work assumes a decode-dispatch emulation model; thus, malware authors may implement variations or alternative approaches to emulation [80, 93, 25, 135] to evade our system. For example, our loop identification strategies of Section 6.4.3 are not directly applicable to malware emulators using a threaded approach. However, the methods of identifying the candidate bytecode regions and VPCs are still applicable. As discussed in Section 6.3.2, our approach is likewise not applicable to dynamic translation based emulation. In dynamic translation, the emulator dynamically generates new code that the program subsequently executes; thus, we expect that heuristics used by unpackers to detect unpacked code will identify the translated instructions. From the translated code, a system could trace backward to find the translation routines, and it could then utilize our methods to identify bytecode regions and the VPC. More generally, we believe that our fundamental ideas and techniques are applicable to other emulation models: by analyzing an emulator's execution trace using a given emulation model, we can identify the bytecode region and discover the syntax and semantics of the bytecode instructions. The main challenge in future research is to identify observable and discernible run-time behavior exhibited by sophisticated emulation approaches.

Malware using decode-dispatch emulation may attempt to evade accurate analysis by targeting specific properties of our analysis. For example, since our approach expects each unique address in memory to hold only one abstract variable, an adversary may utilize the same location for different variables at different times to introduce imprecision into our analysis. Our system will put the memory reads performed using these variables into the same cluster due to the conservativeness of our analysis. If the additional data included in the cluster containing the bytecode program are used in decode- or dispatch-like behavior, they may be incorrectly identified as bytecode instructions.

The second open problem is how to reconstruct complete information about the bytecode instruction syntax and semantics, so that a system can extract the entire emulated malware bytecode program. Using dynamic analysis, we extracted execution paths in the bytecode program and the syntax and semantics of the bytecode instructions used in those paths. However, the paths may not have utilized all of the bytecode instructions supported by the emulator, even though they may be used in other execution paths of the program. A plausible approach would apply static analysis to the dispatch routine once our system has identified the emulation phases correctly. More specifically, once the dispatching method is identified, static analysis and symbolic execution may identify other execution routines and the opcodes of the bytecode instructions that invoke their dispatch. This provides the syntactic and semantic information of bytecode instructions even though they are not part of the executed bytecode.

A subsequent open problem is utilizing the discovered syntax and semantics to completely convert the bytecode to native instructions. A solution is possible only when all execution paths of the bytecode program can be explored. A potential solution is to use previous techniques employed for multi-path exploration [101] with the help of the control-flow semantics identified in the bytecode. However, emulators may be written so that specific control-flow semantics need not be supported in the bytecode language.

Such is the case for VMProtect, where we have only identified unconditional branches.

In such bytecode languages, the effect of a conditional branch is achieved by dynamically computing the target address based on the condition and then using an unconditional branch to the specific target (an example was provided in Figure 15). More research is required before multi-path exploration can be applied to programs written in such languages.

Another related problem is the use of recursive emulation, which converts the emulator itself to another bytecode language and introduces an additional emulator to emulate it. The recursive step can be performed any number of times by a malware author, with size and performance increases as the limiting factors. The solution is to first apply our reverse engineering method to the malware instance, use the discovered syntax and semantics to completely convert the bytecode program into native binary code, and then apply our method (recursively) on the converted program to identify any additional emulation-like behavior.

Third, as with all program analysis tasks, reverse engineering of emulators also faces the challenges of heap analysis imprecision, limitations of loop detection, and so on. The techniques to address these problems are orthogonal to our techniques in reverse engineering.

6.8 Summary

In this chapter, we presented a new approach for automatic reverse engineering of malware emulators. We described the algorithms and techniques to extract a bytecode trace and compute the syntax and semantics of the bytecode instructions by dynamically analyzing a decode-dispatch based emulator. We developed Rotalumé, a proof-of-concept system, and evaluated it on synthetic and real programs obfuscated with Code Virtualizer and VMProtect. The results showed that Rotalumé was able to extract bytecode traces along with syntax and semantic information. By automatically defeating emulator-based obfuscation, malware analysis techniques can be made more robust against newer malware that employs this obfuscation. Moreover, since we presented an automated approach, the time taken to analyze malware should be significantly reduced, reducing the time required to get signatures of new malware out to end users.

CHAPTER VII

ROBUST AND EFFICIENT MONITORING

7.1 Motivation

While robust and efficient malware analysis can lead to more accurate signatures, if the methods used by antivirus programs to utilize these signatures on target host computers remain vulnerable to attacks, the overall effectiveness of malware detection cannot improve. For example, much malware employs attacks that detect the presence of antivirus programs and either disable them completely or remove the probes they place in the system, making the tools blind to events taking place in the system.

Today's antivirus programs run within the same host they are supposed to secure. In order to set probes and protect themselves from user-level malware, these antivirus programs employ kernel-level components. However, kernel-level attacks or rootkits can run at the same privilege level as the operating system kernel. These attacks can modify kernel-level code or sensitive data to hide various malicious activities, change OS behavior, or essentially take complete control of the system. Therefore, kernel-level security tools and antivirus programs have no control over such attacks and cannot defend against being disabled.

Research has adopted virtualization as a means to better protect security tools. A large body of work has adopted virtual machine monitor (VMM) technology because the higher-privileged hypervisor can enforce memory protections and preemptively intercept events throughout the operating system environment. A major reason for adopting virtualization is to isolate security tools from an untrusted VM by moving them to a separate trusted secure VM, and then use introspection [62, 111] to monitor and protect the untrusted system. Approaches that passively monitor various security properties have been proposed [82, 115, 71, 74]. However, passive monitoring can only detect the remnants of an already successful attack, which is not sufficient for a fully fledged antivirus of today. Active monitoring from outside the untrusted VM, which has the advantage of detecting attacks earlier and preventing certain attacks from succeeding, was enabled by Lares [110]. This is achieved by placing secure hooks inside the guest VM that intercept various events and invoke the security tool residing in a separate secure VM. However, the large overhead of switching between the guest VM, the hypervisor, and the secure VM makes this approach suitable only for actively monitoring a few events that occur less frequently during system execution.

Many security approaches require the ability to monitor frequently occurring events: host-based intrusion detection systems (IDSs) intercept every system call throughout the system; LSM (Linux Security Module) [165] and SELinux hook into a large number of kernel events to enforce specific security policies; and several offline analysis approaches even use instruction-level monitoring [51]. Due to the overhead involved in out-of-VM monitoring, many such approaches either are not designed for production systems or are not created for VMs. While keeping a monitor inside the VM can be efficient, the key challenge is to ensure at least the same level of security achieved by an out-of-VM approach.

In this chapter, we present Secure In-VM Monitoring (SIM), a general-purpose framework based on hardware virtualization features that enables a security monitor residing in the same VM it is protecting to have the same level of security as one residing in a separate trusted or secured VM. A security monitor in our framework retains efficiency close to that of an in-VM monitor: no privilege transfers are required when switching to the monitor for an intercepted event, and the monitor can access the system address space at native speed. At the same time, isolation is achieved by putting the monitor code, along with its private data, in a separate hypervisor-protected guest address space that can only be entered and exited through specially constructed protected gates. In other words, our system is designed in such a way that normal operation of the monitor can continue without hypervisor intervention, but any attempt to breach the security of SIM is trapped and prevented by the hypervisor. Our system design leverages Intel VT [145] hardware virtualization extensions and the virtual memory protections available in standard Intel processors.

7.2 Previous Work

7.2.1 Out-of-VM Approaches

Virtualization technology has played an important role in systems security. The security benefits gained by using virtualization were first studied in [78, 92]. Several approaches have since been proposed that use virtual machines to provide security in production systems [110, 61, 62, 82, 115]. In general, these approaches take advantage of isolation from the guest VM and monitor properties of the guest system to provide security. While passive monitoring has been widely used in the past, Lares [110] recently proposed a method for actively monitoring events in a guest VM. Lares provides a framework for inserting hooks inside the guest

OS that can invoke a security application residing in another VM when a particular event occurs. The design of Lares enables complex and large security tools such as antivirus programs [160] or intrusion detection systems to run in the security VM. However, the cost of communicating an event notification from one VM to another via the hypervisor makes it inappropriate for fine-grained active monitoring. Our approach is similar to Lares in inserting hooks inside the kernel code to invoke a monitor, and it provides the same security benefits. However, in our approach the monitor invocation does not require the hypervisor to be involved, and the monitoring code executes with kernel-level privileges in the same guest VM. Moreover, we have the entire kernel address space visible to the monitor's address space, whereas Lares-like approaches would incur additional costs for requesting the hypervisor to map in memory belonging to another VM.

Figure 19: In-VM and Out-of-VM monitoring

7.2.2 Hardware Virtualization

Previously, most approaches used paravirtualization or software-based virtualization. The recently introduced hardware virtualization features have gained traction in the security research community. One recent approach that uses hardware virtualization technology is Ether [51]. The goal of Ether is to provide a transparent malware analysis environment by hiding the side-effects introduced by the dynamic analyzer that monitors run-time events at a fine-grained level. By using various features of Intel VT and carefully introducing only privileged side-effects into the guest VM, Ether can intercept accesses to these side-effects and hide them from the malware. While transparency is very important for offline malware analysis environments, to prevent malware from evading analysis, it is not so for production systems. As with any other security system installed in a production system, the kernel-level instrumentation and the entry and exit gates of our mechanism are visible to the malware. The threat model in this scenario is the neutralization of the security mechanism rather than evasion. Our security requirements are, therefore, targeted towards this threat.

141 7.2.3 In-lined Monitoring Approaches

Our SIM framework can be considered a method for enabling inlined reference monitoring (IRM) for the kernel, where the in-lined monitor is protected using hardware features. IRM has been widely adopted as a faster method of ensuring safety properties in programs by including the monitor inside the program it is monitoring. This is far more efficient than having a reference monitor that resides in the kernel for user-space programs, or in the hypervisor for kernel-space programs. The traditional approach to ensuring the integrity of the monitor itself in IRM techniques has been to enforce specific data-flow and control-flow safety properties throughout the program. For example, SFI [153] (Software Fault Isolation) is a method for having untrusted programs share the same address space while providing isolation. This is achieved by rewriting specific store operations at compile time so that the address is masked in a way that prevents writes outside a designated address region. SFI is well suited for application programs that can be modified at compile time. Achieving it for the kernel, which might have varying policies, is a hard problem. Control Flow Integrity (CFI) [14] instruments control-flow instructions with checks, and their possible targets with labels, at compile time so that at run-time the checks constrain control flow to the static CFG of the program. Since CFI covers all control-flow instructions, it also prevents circumvention of any of its checks. XFI [53] is an extensible fault isolation framework that provides fine-grained, byte-level memory access control. These features, along with the protection of the XFI monitoring code, are achieved by combining SFI and CFI. Finally, WIT [16] (Write Integrity Testing) provides protection from memory corruption attacks by verifying whether the targets of write operations are valid, comparing them against a statically and dynamically defined color table. Since write operations also encompass control data, WIT provides integrity of the monitoring code as well as protection from control-flow attacks without requiring CFI. Our SIM approach provides the same isolation of the monitoring code in the kernel without having to guarantee properties such as SFI or CFI for all kernel-level code.

7.3 Efficiency and Security Requirements

A large fraction of security problems today are caused by kernel-level attacks or malicious code, such as rootkits, that violate the security of the entire system.

Security tools such as anti-viruses, intrusion detection systems, and security reference monitors (e.g., SELinux) use various forms of event handling in a system in order to verify the system’s security. We use the term monitor to represent the class of all security tools that either actively intercept events or passively analyze a system for violations of security.

If a monitor resides inside the same operating system it protects, the monitor itself can be compromised by kernel-level attacks. This problem is addressed by using virtualization. In virtual machine monitors, the hypervisor mediates the execution of the guest VM to present a virtual view of the real hardware. This is performed by utilizing the higher privilege of the hypervisor over the kernel to intercept accesses to the underlying hardware. Earlier software-based virtualization utilizes a reduced-privilege guest kernel, so that a higher-privileged hypervisor can exist without any hardware support for virtualization. Recent processors include hardware virtualization [145] features to enable thin and light-weight virtual machine monitors. The privileged hypervisor allows various security approaches to protect access to hardware in an untrusted guest VM, and to isolate security tools from the untrusted system by placing them in a separate VM.

We illustrate the two approaches to security monitoring in Figure 19. Figure 19(a) shows the model in which the monitor resides in the same untrusted environment; we call this the In-VM approach. The isolated monitor approach in a separate VM is shown in Figure 19(b), which we call the Out-of-VM approach. At a high level, the In-VM approach provides performance and the Out-of-VM approach provides security. We define the performance requirements based on the In-VM approach and the security requirements based on the Out-of-VM approach. The goal of our work is to design a system that satisfies both the performance and security requirements.

We present a few formal notations for precision and clarity in the following discussion. Assume that a system P is being monitored by a security monitor M. The system contains code CP and data DP, and the monitor has its code CM along with its private data DM. For passive monitoring, the monitoring code CM usually needs to analyze the system code and data and use its own data to verify the security of the system. For active monitoring, since events need to be intercepted, a set of hooks K = {k1, k2, ..., kn} is placed in the monitored system; these hooks invoke corresponding handlers in the set H = {h1, h2, ..., hn} contained in the monitoring code CM. A hook can pass data DK related to the event that is gathered at the point of the hook. After the handler handles the hooked event, a response R can transfer control to any specific point in the system. It is obvious that the active monitoring model subsumes the passive monitoring case.
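As a concrete picture of this model, each hook can be thought of as one record pairing a hook site with its handler and return point. The following C sketch is purely illustrative; the type and field names are ours, not part of any implementation.

    /* Illustrative only: one record per hook k_i in the monitored system. */
    typedef struct {
        void *hook_addr;                /* location of hook k_i in C_P          */
        void (*handler)(void *event);   /* corresponding handler h_i in C_M     */
        void *resume_addr;              /* response R: where execution resumes  */
    } monitor_hook;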

The overhead in executing security tools outside the guest OS is primarily due to the change in privilege levels that occurs while switching back and forth between the kernel level and the hypervisor level. We set the performance requirements for SIM's design to be the same as those provided by In-VM approaches.

• (P1) Fast invocation: Invoking the monitor's handler hi for a hook ki should not involve any privilege level changes.

• (P2) Data read/write at native speed: The monitor code CM should be able to read and write any system data DP and its local data DM at native speed, i.e., without any hypervisor intervention.

In the case of in-VM monitoring, a direct control transfer from the hook to the handler code initiates the monitor. Moreover, the monitor can access all data and code because everything is contained in the same address space. The problem with out-of-VM approaches is that neither performance requirement (P1) nor (P2) can be satisfied. First, the hypervisor is invoked when a hook ki executes in order to transfer control to the handler residing in another VM. Second, the hypervisor usually needs to be invoked to partially map memory belonging to the untrusted VM into an address space in the trusted VM for the out-of-VM monitor.

To state the security requirements, we consider an adversarial program A residing in the same environment as the system P. In the threat model, A runs with the highest privilege in the guest VM and therefore can directly read from, write to, and execute from any memory location that is not protected by the hypervisor. To ensure the security of the monitor M, we state the security requirements:

• (S1) Isolation of the monitor's code CM and data DM: The integrity of the monitor's code and data must be protected from the adversary A. Out-of-VM approaches satisfy this requirement because A does not have any means to access another guest VM.

• (S2) Designated point for switching into CM: Execution should switch to the monitor only at one of the handlers in the set H. This requirement ensures that an attacker cannot invoke any code in CM other than at the designated points of entry. Since the hypervisor initiates entry into the monitor, out-of-VM approaches can ensure this requirement.

• (S3) A handler hi is called if and only if the corresponding hook ki executes: This requirement has two parts: (a) if a hook ki is reached in the monitored system, then the corresponding handler hi must be initiated by the system; (b) a handler hi is initiated only if the hook ki was executed. In out-of-VM approaches, the first part can be satisfied by the design of the handler dispatcher. The second part can be satisfied because the exact VMCalls that initiated the hypervisor execution can be identified and checked.

• (S4) The behavior of M is not maliciously alterable: The execution of the handlers H should not be maliciously alterable by the adversary A. First, the control flow should not depend on any control-data that is alterable by the attacker. Second, the handlers should not call any dependencies that are under the control of the adversary. Third, after a handler completes, execution should return to a point that is intended by the monitor. An out-of-VM monitor can satisfy these requirements by not using any control-data contained in DP.

None of the existing in-VM approaches can satisfy all of the security and performance requirements at the same time. First, the simple method of write-protecting the monitor's code CM can only work for stateless monitors, which do not have any private data. A second approach may be to write-protect the private data DM with help from the hypervisor. This, however, would require the hypervisor to trap every write to verify the instruction; the performance requirement (P2) is thus not satisfied. Finally, in-lined monitoring approaches such as CFI [14] and WIT [16] can instrument each control-flow or memory write operation, usually at compile time, so that integrity checks can be enforced at run time. Comprehensive coverage of all the required instructions is needed to guarantee that the security requirements are satisfied. Such a modification of all kernel-level code is overkill for achieving the performance requirements of a general-purpose monitoring framework that may be utilized for hooking different types of events occurring in an OS kernel.

Our SIM approach is designed with all the performance and security requirements in mind.

7.4 Secure In-VM Monitoring

The goal of our Secure In-VM Monitoring framework is to enable security monitors that meet all the performance and security requirements discussed in Section 7.3. In

this section, we describe the design of the SIM framework.

Figure 20: High-level overview of the Secure In-VM Monitoring approach

7.4.1 Overall Design

The overall design of SIM is shown in Figure 20. The key idea of SIM is to introduce a separate hypervisor-protected virtual address space in the guest VM, which we call the SIM virtual address space. This protected address space is used to place the security monitor. It exists in parallel to the virtual address spaces being utilized by the operating system. The virtual memory is mapped in such a way that it has a one-way view of the guest VM's original virtual address space: the security monitor can view the address space of the operating system, but no code executing in the operating system can view the security monitor's address space. A number of entry gates and exit gates are the only code that can transfer execution between the system address space and the security monitor's address space. Hooks are placed in the kernel before specific events and transfer control to corresponding gates. Each entry gate is paired with an invocation checker module that verifies where the entry gate was invoked from. Finally, the security monitor's code (e.g., handlers for each hook) and data are all contained in the SIM address space. Next, we describe how we construct the

SIM address space using paging-based virtual memory and hardware virtualization features in detail.

7.4.1.1 Protected Address Space Generation

Paging-based virtual memory is created through page tables that map virtual addresses to physical addresses. When an instruction is executed, the current page table is used by the hardware to perform address translation. An OS creates a separate page table for each process so that each process has its own virtual memory address space and the necessary isolation is achieved.

The memory mapping introduced by the SIM framework is shown in Figure 21.

In the figure, the process virtual address space at the left shows the virtual address space defined by the operating system for each executing process. The virtual address space created for SIM is shown at the right as the SIM virtual address space. The actual physical memory regions are shown in the middle. For now, physical memory can be considered to be guest physical memory; later, we describe how the address translations are carried out. We have only shown the relevant kernel-level addressable regions, leaving user space out of the picture. For each region in the virtual address spaces, the protection flags that are set on the relevant pages by the hypervisor are shown.

A high-level description of the contents of the process virtual address spaces is shown in Figure 21. Generally, the kernel is mapped into a fixed address range in each process's address space. We call this address range the system address space. Since we are primarily focusing on kernel-level monitoring, we illustrate the contents of this address range in the figure. We denote any code and data contained in the system address space as kernel code and kernel data. All pages containing kernel code have read and execute privileges, but we assume that the kernel code can be write-protected, especially the places where hooks reside. The data regions have all access rights. In general, dynamically loaded kernel-level code is maintained in the kernel data region. In our framework, we introduce the entry and exit gates into the system address space. As mentioned earlier, the gates are used

to perform transitions between the system address space and the SIM address space.

Figure 21: Virtual Memory Mapping of SIM approach

Since the gates include code, they are set with execute permissions, but they are made read-only so that they cannot be modified from within the guest VM.

The SIM address space includes the security monitor’s code (SIM code) and data

(SIM data). Besides the security monitor's own regions, the SIM address space maps in all the contents of the system address space, but with some of the permissions set differently. The kernel code and data regions do not have execute permissions. This means that while execution is within the SIM address space, no code mapped in from the system address space is executable. The invocation checking modules are also contained only in the SIM address space and have execution privileges.

Since the system address space contents are mapped into the SIM address space, an important requirement for the mapping to work is to ensure that the additional regions in the SIM address space (i.e., the SIM code, SIM data and the invocation checker regions) do not overlap with the mapped-in regions from the system address space. There are two methods to achieve this. First, the virtual address range that is used for user programs may be used for allocating the SIM regions. This approach is suitable for security monitors that will primarily be used to monitor kernel-level code. Second, we can use OS-provided functionality to allocate memory from the system address space. Once allocated, any legitimate code (including the OS itself) should not attempt to use this memory region in the system address space. Possible attacks may arise, which are discussed in Section 7.4.3.

Since the SIM address space contains all kernel code and data as well as the SIM data, the instructions that are part of the security monitor can access these regions at native speed. This satisfies the performance requirement (P2). The memory mapping method we have introduced satisfies the isolation security requirement (S1) by placing the SIM code and data regions in a separate SIM address space. Any kernel-level instruction executing in the guest OS utilizes the system address space, which does not include these regions. Although kernel-level code executing in the OS environment has full freedom to change the process virtual memory mappings, because they are mapped into the system address area, it cannot modify or alter the SIM address space. By design, the SIM page table is included in neither the system address space nor the SIM address space. In Section 7.4.1.2, we explain the reason behind this and how we achieve it.

7.4.1.2 Switching Address Spaces

In the Intel x86 processors, the CR3 register contains the physical address of the root of the current page table data structure. In the two-level paging mechanism supported by the IA-32 architecture, the root of the page table structure is called the page directory [70]. As part of the process context switching mechanism, the content of the CR3 register is updated by the kernel to point to the appropriate page table structures used by the current process. Although the kernel of the OS mainly

maintains the valid CR3 values to switch among processes, any code executing with kernel-level privilege can modify the CR3 register to point to a new page table.

Figure 22: Switching between the untrusted and trusted address space without hypervisor intervention

However, to ensure the correct operation of the operating system, kernel code needs to see its expected CR3 values.

In virtual machines, the page tables in the guest VM are not used directly for translating virtual addresses to physical addresses, because the physical memory to be translated to is on the host, which is maintained and shared among the various VMs by the hypervisor. The hypervisor takes complete control over the guest OS memory management by intercepting all accesses to the CR3 register. Guest physical memory then becomes only an abstraction utilized by the hypervisor for maintaining a correct mapping to host physical addresses. Shadow page tables are used by the hypervisor to map guest virtual to host physical memory [23, 86, 10], and different mechanisms are used in different VMM implementations to maintain consistency between the guest page tables and the shadow page tables. The hypervisor gives the guest OS the illusion that its designated page tables are being used.

Since our mechanism requires switching address spaces, we need to modify the CR3 register contents directly. However, modifications to the CR3 register by the guest VM are trapped by the hypervisor. The challenge is to bypass the hypervisor invocation so that the performance requirement (P1) can be satisfied. For this reason, we utilize a hardware virtualization feature available in Intel VT. By default, every access to the guest CR3 register by the guest VM causes a VMExit, which is a switch from the guest to the hypervisor. Intel VT provides a feature by which a VMExit is not triggered if CR3 is switched to one of the page table root addresses in a list (CR3_TARGET_LIST) maintained by the hypervisor [70]. The number of values this list can store varies from model to model, but the Core 2 Duo and Core 2 Quad processors support a maximum of 4 trusted CR3 values to be added to the CR3_TARGET_LIST.

The guest OS provides the addresses of guest page directories in the CR3 register,

and the correct execution of the guest VM is ensured by the hypervisor changing them to the appropriate shadow page directories instead. However, if we bypass the hypervisor while switching CR3 values, we need to switch directly between the shadow page directories. Figure 22 illustrates how the switching is performed by updating the CR3 register. Besides the hypervisor-maintained shadow page table structures, we introduce an additional specialized shadow page table, which we call the SIM shadow page table. This page table converts virtual addresses in the SIM address space to host physical addresses. Since it is maintained directly in the hypervisor and the security monitor need not manage its virtual memory, we eliminate the need for any guest-level page table for the SIM address space. The root of the SIM shadow page table structure is the SIM shadow page directory, which we designate as SIM_SHADOW. We also designate the physical address of the current shadow page directory maintained by the hypervisor as P_SHADOW. Switching between the process address space and the SIM address space requires directly loading the CR3 register with the value of SIM_SHADOW or P_SHADOW, both of which have already been added to the CR3_TARGET_LIST. This ensures the correct operation of the code in the guest VM when the hypervisor is not involved. The entry and exit gates described in Section 7.4.1.3 perform this switching, and the rest of the design of the SIM framework ensures that the switching is transparent to the guest OS.
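As a rough sketch of the hypervisor-side setup this requires, the two trusted page-directory roots can be written into the VMCS CR3-target fields. The field encodings below are from the Intel SDM; vmcs_writel() stands in for KVM's internal VMCS accessor, and the function name is ours.

    /* Register both trusted roots so that "mov cr3" between them
     * does not trigger a VMExit.                                  */
    #define CR3_TARGET_COUNT   0x0000400a  /* per the Intel SDM    */
    #define CR3_TARGET_VALUE0  0x00006008
    #define CR3_TARGET_VALUE1  0x0000600a

    void vmcs_writel(unsigned long field, unsigned long value);

    static void sim_set_cr3_targets(unsigned long p_shadow_pa,
                                    unsigned long sim_shadow_pa)
    {
        vmcs_writel(CR3_TARGET_VALUE0, p_shadow_pa);    /* P_SHADOW   */
        vmcs_writel(CR3_TARGET_VALUE1, sim_shadow_pa);  /* SIM_SHADOW */
        vmcs_writel(CR3_TARGET_COUNT, 2);
    }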

Figure 23: Entry and exit gates

7.4.1.3 Entry and Exit Gate Construction

The entry and exit gates are the only regions that are mapped into both the system address space and the SIM address space in pages having the executable privilege. This ensures that a transfer between the address spaces can only happen through code contained in these pages. Moreover, since these pages are write-protected by the hypervisor, their contents cannot be modified by any in-guest code. The contents of the entry and exit gates are shown in Figure 23.

As mentioned earlier, each hook and its handler have a pair of corresponding entry and exit gates. The task of an entry gate is to first set the CR3 register to the physical address of the SIM shadow page directory, SIM_SHADOW. This switches execution into the SIM address space. Since the CR3 register cannot be loaded directly with data, the value of SIM_SHADOW first needs to be moved to a general-purpose register. For this reason, we save all register values on the stack, so that the security monitor can access the register contents at the point when the hook was reached. Even though the register contents are saved on the stack in the system address space, since interrupts are already disabled by the entry gate, an attacker will not be able to regain execution and modify the values before entry into the SIM address space. Once in the SIM address space, the next task is to switch the stack to a region contained in the SIM data region by modifying the ESP register. The stack switching is necessary so that code executing in the SIM address space does not use a stack provided by the untrusted guest kernel-level code. Otherwise, an attacker could craft a stack pointer value that would cause parts of the SIM data region to be overwritten once in the SIM address space. Finally, control is transferred to the invocation checker routine to verify where the entry gate was invoked from (discussed in Section 7.4.1.4). Notice that the first instruction executed in the gate is CLI, which disables interrupts. This guarantees that execution is not diverted elsewhere by interrupts. The reason for executing the same CLI instruction again after entering the SIM address space is discussed in Section 7.4.3.
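Putting these steps together, an entry gate might look like the following 32-bit x86 sketch (AT&T syntax), wrapped in GCC file-scope assembly for concreteness. The SIM_SHADOW immediate and the final jump target are patched per gate by the hypervisor; all symbol names here are illustrative rather than taken from the prototype.

    unsigned long sim_saved_esp;    /* guest ESP saved across the switch (SIM data) */
    char sim_stack[8192];           /* monitor-private stack (SIM data)             */
    void invocation_checker(void);  /* SIM-side check routine (Section 7.4.1.4)     */

    __asm__ (
        "entry_gate:                   \n"
        "  cli                         \n"  /* 1. disable interrupts                */
        "  pushal                      \n"  /* 2. save registers for the monitor    */
        "  movl $0x0, %eax             \n"  /* 3. patched: SIM_SHADOW; CR3 cannot   */
                                            /*    be loaded with an immediate       */
        "  movl %eax, %cr3             \n"  /* 4. switch into the SIM address space */
        "  cli                         \n"  /* 5. re-issued in case an attacker     */
                                            /*    jumped past the first CLI (7.4.3) */
        "  movl %esp, sim_saved_esp    \n"  /* 6. remember the guest stack pointer  */
        "  movl $sim_stack+8192, %esp  \n"  /* 7. switch to the SIM-private stack   */
        "  jmp  invocation_checker     \n"  /* 8. verify who invoked this gate      */
    );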

The exit gate performs the transfer out of the SIM address space into the process address space. First, the stack is switched back to the stack address saved during entry. To make the address space switch, the CR3 register is loaded with the address in P_SHADOW, which is the physical address of the shadow page table root. The hypervisor may be using multiple process shadow page tables and switching between them as necessary. To ensure correct system state, the value of P_SHADOW must equal the address of the shadow page directory being used by the hypervisor prior to entering the SIM address space. Querying the hypervisor for the correct value during monitor invocation would violate the performance requirement (P1). We instead have the hypervisor update the value of P_SHADOW used in the exit gates whenever it switches from one process shadow page table to another. Having the value of P_SHADOW as an immediate operand in every exit gate would require the hypervisor to perform several memory updates. Instead, storing it as a variable in the SIM data region requires only one memory update by the hypervisor at the time of a shadow page table switch. At the end of the exit gate, the interrupt flag is set to enable interrupts again, and then execution is transferred to a designated point, usually immediately after the hooked location. The exit gates have write permissions in the SIM address space, enabling the security monitor to control where execution is transferred back.
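A corresponding exit gate sketch follows, under the same assumptions and illustrative names as the entry gate sketch above; note that P_SHADOW is read from a single variable in SIM data, matching the design decision just described.

    unsigned long p_shadow_cr3;     /* P_SHADOW: one word in SIM data, updated by
                                       the hypervisor at shadow page table switches */
    extern unsigned long sim_saved_esp;
    void resume_point(void);        /* patched per gate: just after the hooked site */

    __asm__ (
        "exit_gate:                  \n"
        "  movl sim_saved_esp, %esp  \n"  /* restore the guest stack pointer      */
        "  movl p_shadow_cr3, %eax   \n"  /* read P_SHADOW while still in SIM     */
        "  movl %eax, %cr3           \n"  /* switch back to the process space     */
        "  popal                     \n"  /* restore registers saved by the entry */
        "  sti                       \n"  /* set IF: interrupts enabled again     */
        "  jmp  resume_point         \n"  /* continue just after the hook         */
    );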

The entry gates are the only way to enter the SIM address space, and they first

transfer control to the corresponding invocation checking routine, which then calls a

handler routine. By doing so, we ensure the security requirement (S2). Moreover, the

“if” part of the requirement (S3a) is satisfied because, when a hook is executed, the

corresponding handler is invoked. Additional variations of attacks are also handled

by our design and are discussed in Section 7.4.3.

7.4.1.4 Checking Invocation Points

To satisfy the security requirement (S3b), once the SIM address space is entered

through one of the entry gates, the invocation of the gate needs to be checked to

ensure that it was from the only hook that is allowed to call the gate. The challenge

is that since the entry gate is visible to the guest OS’s system address space, a branch

instruction can jump to this location from anywhere within the system address space.

Moreover, we cannot rely on call instructions and checking the call stack because

they are within the system address space and as such the information cannot be

trusted. We utilize a hardware debugging feature available in Intel processors since the Pentium 4 to check the invocation points. This feature, called last branch recording (LBR) [70], stores the sources and targets of the most recently executed branch instructions in specific processor registers.

The last branch recording feature is activated by setting the LBR flag in the IA32_DEBUGCTL MSR. Once set, the processor records a running trace of a fixed number of the last branches executed in a circular queue. For each of these branches, the IP (instruction pointer) at the branch instruction and its target address are stored as a pair. The number of such pairs stored in the LBR queue varies among the x86 processor families. However, all families of processors since the Pentium 4 record information about a minimum of the four last branches taken. These values can be read from the MSR registers MSR_LASTBRANCH_k_FROM_IP and MSR_LASTBRANCH_k_TO_IP, where k is a number from 0 to 3.

We check the branch that transferred execution to the entry gate using the LBR information. In the invocation checking routine, the second most recent branch is the one that was used to invoke the entry gate. We check that the source of this branch corresponds to the hook that is supposed to call the entry gate. Although the target of the branch instruction is also available, we do not need to verify it if the source matches. Our design also mitigates possible attacks that jump into the middle of the entry gate and try to divert execution before the invocation checking routine is initiated. This is discussed in Section 7.4.3.
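The check itself can be sketched as follows, assuming the Core 2 LBR MSR addresses (FROM addresses at 0x40-0x43 and the top-of-stack index at 0x1C9, per the Intel SDM; the layout varies by processor family) and an illustrative rdmsr helper.

    #include <stdint.h>

    #define MSR_LASTBRANCH_TOS       0x1c9  /* index of the newest LBR entry   */
    #define MSR_LASTBRANCH_0_FROM_IP 0x40   /* FROM addresses: MSRs 0x40..0x43 */

    static inline uint64_t rdmsr64(uint32_t msr)
    {
        uint32_t lo, hi;
        __asm__ volatile ("rdmsr" : "=a"(lo), "=d"(hi) : "c"(msr));
        return ((uint64_t)hi << 32) | lo;
    }

    /* The newest recorded branch is the gate's jmp into this checker, so the
     * branch that entered the gate itself is the second newest entry.        */
    static int invocation_ok(uint64_t authorized_hook_ip)
    {
        uint32_t tos  = (uint32_t)rdmsr64(MSR_LASTBRANCH_TOS);
        uint32_t prev = (tos + 3) % 4;          /* second newest of 4 slots   */
        return rdmsr64(MSR_LASTBRANCH_0_FROM_IP + prev) == authorized_hook_ip;
    }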

A conceivable attack may be an attempt to modify these MSR registers in order to bypass the invocation checks. We need to stop malicious modifications to these MSRs, but at the same time ensure that the performance requirement is not violated. With Intel VT, read and write accesses to MSR registers can selectively cause VMExits by setting the MSR read bitmap and MSR write bitmap, respectively. Using this feature, we set the bitmasks in such a way that write attempts to the IA32_DEBUGCTL MSR and the LBR MSRs are intercepted by the hypervisor but read attempts are not. Since the invocation checking routine only needs to read the MSRs, performance is not affected.
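On the hypervisor side, this selective trapping might be configured as in the following sketch; the bitmap layout (write-exit bits for low MSRs at byte offset 2048 of the 4KB page) is per the Intel SDM, while the helper name is ours.

    #include <stdint.h>

    #define IA32_DEBUGCTL 0x1d9

    /* Set the write-exit bit for one low MSR; leave its read bit clear. */
    static void trap_msr_writes_only(uint8_t *msr_bitmap /* 4KB page */,
                                     uint32_t msr)
    {
        if (msr <= 0x1fff)
            msr_bitmap[2048 + msr / 8] |= (uint8_t)(1u << (msr % 8));
    }

    /* e.g.: trap_msr_writes_only(bitmap, IA32_DEBUGCTL);
     *       ...plus one call per MSR_LASTBRANCH_* register.
     * Reads remain exit-free, so the invocation checker's rdmsr is cheap. */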

7.4.2 Security Monitor Functionality

One of the important aspects of our design is to ensure that the security monitor code does not rely on any code from an untrusted region. The security monitor code therefore needs to be completely self-contained: all necessary library routines must be statically linked with the code, and the monitor cannot call any kernel functions. By design, mapping the kernel code and data with non-execute privileges ensures that even accidental execution of untrusted code cannot occur in the trusted address space (attempted execution from non-executable pages results in exceptions). Any exceptions occurring while in the SIM address space are handled by code residing in SIM. Moreover, the entry into and exit from the SIM address space can be considered an atomic execution from the perspective of the untrusted guest OS. While the hypervisor receives and handles interrupts on the guest OS's behalf, they are not delivered to the guest VM while interrupts are disabled in the guest. Disabling interrupts before entry and enabling them after exit ensures that interrupts do not divert the intended execution path of the security monitor, which guarantees the security requirement (S4). Even without using the code of the guest OS, the same functionality provided by an out-of-VM approach can be achieved in our design.

First, since kernel functions cannot be called, the security monitor needs to traverse and parse the data structures in the kernel address space in order to extract the information required for enforcing or verifying the security state of the untrusted region. However, this is the same semantic gap that exists when using introspection to analyze data structures of the untrusted guest VM from another trusted guest VM. The methods of identifying and parsing data structures used in existing out-of-VM approaches can therefore be ported to our in-VM approach with few modifications.

Second, a security monitor may need to access hardware or perform I/O for usability purposes besides handling the events in the untrusted guest OS. Theoretically, it may be possible to replicate the relevant guest OS functionality inside the SIM address space. However, accessing hardware directly may interfere with the guest OS. We take a different approach in our design. Since the SIM address space can be trusted, we allow a layer to be defined that communicates with the hypervisor for OS-like functionality through hypercalls. This layer, which we call the SIM API, can provide functionality such as memory management, disk access, file access, additional I/O, etc. This layer can be developed as a library that is statically or dynamically linked with the security monitoring code, depending on the implementation. The handling of the SIM API can be performed in the hypervisor, or it may be performed by another trusted guest VM. Since the security monitor can be designed to use such functionality less often than it handles events in the untrusted guest kernel (e.g., by buffering data), the cost of hypervisor invocation can be kept low even for fine-grained monitoring.

7.4.3 Security Analysis

We first summarize how our design meets all the security requirements stated in Section 7.3 without sacrificing any of the performance requirements. SIM satisfies the security requirement (S1) by using the hypervisor to prevent the monitor code and data from being mappable into any untrusted address space in the guest VM. The monitor remains in a trusted address space completely isolated from the attacker. The requirement (S2) is satisfied because, by design, the only method of entering the trusted address space from the untrusted one is via the entry gates. Since each hook invokes a corresponding entry gate, which eventually calls a corresponding handler, and each invoker of an entry gate is checked by the invocation checking routine, the requirement (S3) is satisfied. Finally, by not allowing any code from the untrusted domain to be executable in the trusted address space, and by not allowing the monitor to call into the untrusted kernel, we ensure that the security requirement (S4) is met. In the rest of this section, we discuss a few variations of attacks that are also handled by the design.

One important design consideration is to stop attacks that would divert the control flow of the security monitor by modifying control-data. Since any data in the untrusted region is completely in the hands of the adversary according to our threat model, it is important that, by design, the security monitor stores all control-data that it needs in the SIM data region. Ensuring this does not in any way reduce the functionality of the security monitor.

An attacker may directly call the entry gate and skip the first interrupt-disabling instruction, intending to keep interrupts enabled after entering the SIM address space and before the invocation routine is executed, causing undefined behavior if the interrupts are not handled properly. Having another CLI instruction at the beginning of the entry into SIM solves this issue.

Since the entry gate is visible to the guest OS kernel-level code, the value of SIM_SHADOW is revealed to the attacker. Although this value is known to the attacker, it cannot be used to switch into the SIM virtual address space from the attacker's kernel-level code. This is because, once the instruction that loads the CR3 register with the SIM shadow page directory's address is executed, the address space is switched immediately; the instruction immediately following the load is fetched from the new address space. Since the attacker's code does not have execute privilege in the SIM address space, an exception will be generated. One possible modification of the attack is to allocate a page in the system address space that immediately precedes a page that is executable inside the SIM address space. If the instruction that sets CR3 is the last instruction on that page, the next instruction executed will be at a valid address inside the SIM address space. To defeat this attack, we ensure that any page allocated in the system address space immediately before an executable SIM page does not contain code or have the executable privilege. This prevents such illegal entries into the SIM address space.

7.5 Implementation

We have implemented a prototype of the SIM framework. We used KVM (Kernel Virtual Machine) [86] for implementing the hypervisor component of the SIM framework on a Linux host with a 32-bit Ubuntu distribution. The implementation was done on a system with an Intel Core 2 Quad Q6600 processor, which has Intel VT support. We targeted Windows XP SP2 as the guest OS. To generate the SIM address space and load a security monitor into it, we rely on an initialization phase. Then, during system runtime, the hypervisor-based component provides memory protection, updates exit gates, and handles the hypercalls that are relevant to the SIM API. We describe the initialization phase in Section 7.5.1 and the execution phase in Section 7.5.2.

7.5.1 Initialization Phase

The initialization phase of our system is initiated by a guest VM component implemented as a Windows driver that is executed after a clean boot, when the guest OS can be considered to be in a trusted state. The primary tasks of the initialization driver are to allocate guest virtual address space for placing the entry and exit gates based on the hooks required, initiate the creation of the SIM virtual address space, initiate the loading of the security monitor into that address space, and finally initiate the creation of the entry gates, exit gates and invocation checking routines. The initialization driver communicates with the hypervisor counterpart of SIM using hypercalls. We use the VMCALL instruction of Intel VT for the hypercall and use the four general-purpose x86 registers to store arguments.

The first task is to reserve virtual address ranges in the system address space for use in entry and exit gate creation. Since we need to guarantee that the normal operation of the OS and legitimate applications do not attempt to utilize the reserved address ranges, we rely on the guest OS to allocate the virtual address space. The driver allocates contiguous kernel-level memory from the non-paged pool by using the MmAllocateContiguousMemory kernel function. The function returns the virtual address pointing to the start of the allocated memory region. Since the function allocates memory from the Windows non-paged pool, it is guaranteed by the OS never to be paged out. In other words, the pages are mapped to guest physical frames that are not reused until they are freed. Since the memory is already allocated, no legitimate application will try to utilize this address space. The allocated virtual address region is communicated to the hypervisor component using a predefined hypercall that provides the starting address and the size of the allocated region. During execution, our system checks for any malicious attempts to utilize this address space or change the memory mapping.
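This reservation step might look like the following sketch of the initialization driver code. MmAllocateContiguousMemory is the documented kernel routine named above; the hypercall number and the sim_hypercall wrapper are illustrative assumptions.

    #include <ntddk.h>

    #define SIM_HC_RESERVE_REGION 0x01   /* illustrative hypercall number */

    extern long sim_hypercall(long nr, long a1, long a2, long a3); /* VMCALL */

    NTSTATUS sim_reserve_gate_region(SIZE_T size, PVOID *region)
    {
        PHYSICAL_ADDRESS highest;
        highest.QuadPart = ~0ull;        /* any physical address is acceptable */

        /* Non-paged pool: the OS guarantees these pages are never paged out,
         * so the gate region keeps stable backing frames until freed.        */
        *region = MmAllocateContiguousMemory(size, highest);
        if (*region == NULL)
            return STATUS_INSUFFICIENT_RESOURCES;

        /* Inform the hypervisor which virtual range is now reserved. */
        sim_hypercall(SIM_HC_RESERVE_REGION,
                      (long)(ULONG_PTR)*region, (long)size, 0);
        return STATUS_SUCCESS;
    }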

The next step is the creation of the SIM virtual address space by the hypervisor component of the SIM framework. Once the hypervisor is informed about the memory allocation, the SIM shadow page table structure is created. In the 32-bit implementation of KVM, we noticed that KVM's own shadow page tables are implemented not as regular two-level 32-bit page tables, but as three-level 36-bit PAE (Physical Address Extensions) [70] page table structures. Intel processors support this type of page table structure to handle more than 4GB of physical memory. To keep the implementation elegant, one of our goals was not to make extensive modifications to KVM's MMU (Memory Management Unit) code, which mainly handles the shadow page table algorithms. Rather than utilizing the MMU code of KVM, we wrote our own code to create, maintain and update the SIM shadow page table. Since we need to switch between the same type of shadow page tables, we also implemented the SIM shadow page table as a 36-bit PAE page table structure. The use of PAE page tables also enabled us to set NX bits on pages, even though the 32-bit page tables used by the guest OS do not support this feature. During generation of the shadow page table, the system address space of the current process page table is traversed and mappings for all relevant entries are added to the SIM shadow page table with appropriate privileges.

In our current prototype implementation, the method we use for loading a security monitor application into the SIM address space is to load the application as part of the kernel driver in the system address space and then have the hypervisor place it into the SIM address space. The driver performs a hypercall with the starting address of the monitor code and its size. This hypercall enables the hypervisor component to allocate entries in the SIM shadow page table structure and map the virtual address range to newly allocated host physical memory that holds the monitor code as the SIM code region. We keep the same virtual address range so that no address relocation needs to be performed. The allocated host physical memory is intentionally not correlated with any guest physical memory, making it impossible for any guest page table to map these regions. In the same manner, the monitor's data region is mapped into the SIM address space as SIM data. In our current prototype, we only support static data regions. Supporting dynamic data regions requires additional engineering effort and is orthogonal to the current design of our framework, whose focus is performance and security.

The final task is to create the routines that perform switching into the SIM address space. The security application requires hooks in the kernel to invoke its handlers. Any form of hooking can be utilized, and the method of hooking is orthogonal to our framework. For each hook and handler, a hypercall is performed by the driver to inform the hypervisor of the hook instruction, the handler's address, and the address to which execution should return after the handler executes. For each received hypercall, the hypervisor component of our framework generates an entry gate, an invocation checking routine, and an exit gate. The invocation checking routine is specialized to verify that the invoking instruction address is the hook instruction address provided with the hypercall. A jump instruction is placed at the end of the invocation routine to jump to the provided handler. The exit gate code is likewise specialized to return execution to the specified address. The address of the entry gate is returned so that the driver can modify the hook to divert execution to the entry gate.

Figure 24: Run-time memory protection flowchart

7.5.2 Run-time Memory Protection

At run time, attacks may attempt to breach the security enforced by the SIM framework by changing the virtual memory mapping in the system address space or by introducing a new virtual memory mapping. Rather than checking the guest page table entries for security violations whenever the guest CR3 is switched to a new page directory address, we ensure the security of the SIM framework by verifying memory protection whenever a guest page table entry is propagated to the shadow page table maintained by the hypervisor. Figure 24 illustrates the memory protection in our prototype.

Whenever a page fault occurs for the shadow page tables in use, a VMExit occurs and is trapped by KVM. The guest virtual address (GVA) for which the page fault occurred is identified by reading the guest CR2 register. We first check whether a process shadow page table or the SIM shadow page table is active. If a process shadow page table is active, we check whether the virtual address is in the reserved region (the address space reserved for SIM). If it is, this indicates a malicious memory access because the region is already allocated by the guest OS. We simply skip the instruction without doing anything, emulating the effect as if the read or write succeeded. If the memory region is not reserved, we extract the guest page table entry (PTE) by traversing the current guest page table. If the PTE is not valid, we inject a page fault into the guest, causing the OS to handle a regular page fault. If it is valid, we then check whether the guest PTE is illegally attempting to map the faulting GVA to guest physical memory containing entry or exit gates. In case of an illegal mapping attempt, we skip the instruction without doing anything. In case of a valid mapping, we update the process shadow page table, ensuring that the appropriate memory protection bits, such as write protection, are enabled for the PTEs updated in the shadow page table.

If the page fault occurs while the SIM shadow page table is active, it must be due to an access to the kernel code, kernel data or user space regions in SIM, since the other SIM-specific code and data were already mapped in the initialization phase. Therefore, we traverse the system address space of the current guest page table to search for the page table entry (PTE) corresponding to the faulting address. If the PTE is valid, in other words, if the page is mapped to a valid guest frame number, we identify the host frame number and update the SIM shadow page table by inserting a corresponding PTE. However, we ensure that the mapped page does not have the execution privilege and has complete read-write access. If the PTE is invalid or marked not present, the guest OS has possibly paged out the region. We currently cannot handle this case because it requires the untrusted guest OS to handle the page fault. That would violate the security requirement (S4), because an attacker may change the behavior of the guest page fault handler or insert its own. Our approach is to skip the instruction as if it succeeded without any effect. This problem can be overcome by having the hooking routine access the required memory regions before invoking the entry gate, causing all potentially paged-out frames to be paged in before entering the SIM address space.
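The flow of Figure 24 can be summarized in the following sketch. All helper functions are illustrative stand-ins rather than KVM's actual MMU interfaces.

    #include <stdint.h>

    uint64_t guest_walk(unsigned long gva);     /* walk the current guest PT  */
    int      pte_valid(uint64_t gpte);
    int      in_reserved_region(unsigned long gva);
    int      maps_gate_frames(uint64_t gpte);
    void     spt_map(unsigned long gva, uint64_t gpte);     /* process SPT    */
    void     spt_map_sim(unsigned long gva, uint64_t gpte); /* rw, non-exec   */
    void     inject_pf(unsigned long gva);
    void     skip_instruction(void);

    void sim_handle_shadow_fault(unsigned long gva, int sim_table_active)
    {
        uint64_t gpte = guest_walk(gva);

        if (!sim_table_active) {             /* a process shadow PT is active */
            if (in_reserved_region(gva))
                skip_instruction();          /* malicious: region is reserved */
            else if (!pte_valid(gpte))
                inject_pf(gva);              /* ordinary guest page fault     */
            else if (maps_gate_frames(gpte))
                skip_instruction();          /* illegal remapping of gates    */
            else
                spt_map(gva, gpte);          /* propagate, with protection    */
        } else {                             /* the SIM shadow PT is active   */
            if (pte_valid(gpte))
                spt_map_sim(gva, gpte);      /* read-write, never executable  */
            else
                skip_instruction();          /* likely paged out; see text    */
        }
    }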

7.6 Experimental Evaluation

7.6.1 Monitor Invocation Overhead

We performed microbenchmarks to compare the overhead of invoking a SIM monitor with that of an out-of-VM one. We measured the time required to switch to the monitor code from a hook and then switch back. For this purpose, we implemented null event handlers that return immediately without performing any useful task.

For measuring our SIM framework, we implemented a security monitor whose handler only calls the corresponding exit gate. Hence, we could measure the monitoring overhead by invoking the entry gate, which caused a transition from the system address space to the SIM address space and back.

For measuring the out-of-VM overhead, we developed a simple inter-VM communication mechanism that enables a driver residing in the untrusted VM to invoke code of a driver placed in the trusted VM; we omit the implementation details. We measured the time taken from the hook being executed, through initiating the communication from the driver in the untrusted VM, to receiving execution back with a response from the trusted VM.

Table 17 shows the results of our microbenchmarks. Invoking the SIM monitor is almost 11 times faster than invoking the out-of-VM monitor. We acknowledge that the overall performance gain will also depend on the processing time required for each invocation. A monitor that is frequently invoked with a small amount of processing in each invocation (i.e., fine-grained monitoring) should benefit from the SIM framework. In cases where the monitor is invoked less frequently and a large amount of processing is performed in each invocation, our SIM approach may have less of a performance advantage. However, this experiment only shows the switching overhead between the monitored system and the security monitor, leaving out additional performance penalties out-of-VM approaches may face due to external introspection costs. The experimental results in Section 7.6.2.1 show a more practical comparison of performance overhead by taking into account both switching and introspection costs for a real-world security monitoring application.

Table 17: Monitor Invocation Overhead Comparison

    Monitor type          Avg. time (µsec)   Std. dev. (µsec)
    SIM approach          0.469              0.051
    Out-of-VM approach    5.055              0.132

7.6.2 Security Application Case Studies

We developed two security applications using the SIM framework to perform a more elaborate and practical evaluation of the performance advantages of the SIM approach over an out-of-VM approach. We first developed a process creation monitor using both our SIM framework and the out-of-VM approach. The monitor intercepts the creation of each process in the untrusted domain and allows execution only if the name of the process is contained in a whitelist. Microbenchmarks on this application take into account not only the monitor invocation overhead, but also any overhead introduced by the introspection performed by the out-of-VM monitor. Section 7.6.2.1 presents the performance evaluation using this application. Although microbenchmark tests allow comparison of the overhead introduced for monitoring individual events, they do not show the impact of monitoring on an overall system where several events are monitored continuously. Since system call events occur frequently, we developed a system call monitoring tool and performed macrobenchmarks on a number of representative applications running in the monitored guest VM. We provide an evaluation of SIM using this application in Section 7.6.2.2.

7.6.2.1 Process Creation Monitor

The process execution monitor that we developed works similarly to the application presented in Lares [110]. We hooked the NtCreateSection system call in Windows by modifying the System Service Descriptor Table (SSDT). This system call is always invoked when a new process is created. The seventh argument to the system call is the handle of the executable file, which allows the executable's name to be determined. Although identifying the executable's name from the handle is straightforward using Windows kernel functions, we had to traverse the kernel data structures directly to extract this information in both the SIM-based and the out-of-VM security monitors.

The security monitoring tool first extracts the ETHREAD structure of the current thread by traversing the Thread Information Block (TIB) pointed to by the FS segment selector. The ETHREAD structure is traversed to extract the EPROCESS structure of the currently executing process that performed the system call. We then traverse the object handle data structures to identify the object corresponding to the handle sent as an argument to the NtCreateSection system call. Once this object is identified as a file, the path name is extracted and compared with a list of allowed process images predefined in the tool. Depending on whether it is allowed or not, the original system call handler is called or skipped with a failure indication, respectively.
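The SSDT redirection can be sketched as follows; the service descriptor layout is the well-known but undocumented 32-bit Windows structure, and real code must also lift write protection (e.g., via an MDL) before patching, which the sketch omits.

    typedef struct {
        unsigned long *ServiceTable;      /* system call handler addresses */
        unsigned long *CounterTable;
        unsigned long  NumberOfServices;
        unsigned char *ArgumentTable;
    } SERVICE_DESCRIPTOR;

    extern SERVICE_DESCRIPTOR KeServiceDescriptorTable; /* ntoskrnl export */

    static unsigned long orig_NtCreateSection;

    void hook_nt_create_section(unsigned long entry_gate, unsigned int index)
    {
        /* index: NtCreateSection's service number for this Windows build */
        orig_NtCreateSection = KeServiceDescriptorTable.ServiceTable[index];
        KeServiceDescriptorTable.ServiceTable[index] = entry_gate; /* divert */
    }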

The SIM-based tool was written to directly access the system address space to traverse kernel data structures. However, the out-of-VM version requires introspection-like functionality. Since KVM does not come with an introspection API, we implemented a limited form of introspection functionality in KVM ourselves. A driver in a guest VM issues a hypercall to map a page in its current system address space to the physical frame corresponding to a virtual address in the system address space of another VM. We were able to support this functionality in the hypervisor by simply updating the shadow page table corresponding to the guest VM's current guest page table with the requested mapping.

The performance microbenchmark results of the process execution monitor are provided in Table 18. We compared the performance of the SIM-based and the out-of-VM monitors with a traditional monitor that resides in the guest VM without any isolation. The out-of-VM version has almost 700% overhead over the traditional one. Our analysis revealed that the additional overhead beyond the invocation cost was due to the additional hypercalls made for introspection purposes. With our SIM-based implementation, we only incurred an overhead of 13.7%, which is more than an order of magnitude lower than the out-of-VM approach. This overhead is primarily due to the cost of switching address spaces, which also invalidates the processor TLB, requiring additional memory accesses to the page table. The results show that, besides reducing switching costs (as shown in Section 7.6.1), our SIM approach provides significant performance gains in practical settings by providing the ability to access the monitored kernel data structures directly rather than through the costly introspection methods used by out-of-VM approaches.

Table 18: Process creation monitor performance results

    Monitor type          Average time (µsec)   Relative overhead (%)
    Traditional           3.487                 ×
    Out-of-VM approach    28.039                690.5%
    SIM approach          3.967                 13.7%

7.6.2.2 System Call Tracing

To evaluate the performance advantages of using SIM when frequent invocations are necessary, we developed a system call monitoring security tool and measured the run-time overhead on a system using a set of macrobenchmarks. To hook every system call occurring in the guest VM, we modified the SYSENTER_EIP MSR. In our prototype, we did not perform any specific system call inspection, but we read the stack to emulate the processing time a system call check would require.

For our SIM-based tool, we modified the MSR to point to a code snippet that disables interrupts, jumps to the entry gate for the system call handler inside the monitor, and, upon return from the exit gate, jumps to the original system call handler.

Table 19: System call tracing macrobenchmarks

    Benchmark         Bare           Out-of-VM approach   SIM approach
    Memory Latency    10.42 MAcc/s   84.58%               7.97%
    HTML Render       1.12 pg/s      52.42%               5.83%
    File Compress     3.4 MB/s       3.97%                0.59%
    File Encrypt      20.56 MB/s     7.85%                0.89%
    File Decrypt      78.21 MB/s     2.53%                0.45%
    HDD               15.29 MB/s     41.68%               3.74%
    Text Edit         82.73 pg/s     128.84%              9.64%
    Average           ×              46.10%               4.15%

The system call handler code in SIM extracts the userland stack address from the register values saved on the stack by the entry gate. The handler stores the system call number, together with a fixed number of arguments from the stack, in a circular queue statically allocated in the SIM data region. After every 10,000 system calls, a hypercall is made to output a trace of the system calls via the SIM API.
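The handler's buffering logic can be sketched as follows; the record format and queue length mirror the description above, while the names are illustrative.

    #include <stdint.h>

    extern long sim_hypercall(long nr, long a1, long a2, long a3);
    #define SIM_HC_WRITE_LOG 0x10        /* as in the SIM API sketch above   */

    #define QLEN  10000
    #define NARGS 4

    struct sc_record { uint32_t nr; uint32_t args[NARGS]; };

    static struct sc_record queue[QLEN]; /* statically allocated in SIM data */
    static unsigned int head;

    void sim_syscall_handler(uint32_t nr, const uint32_t *user_stack)
    {
        struct sc_record *r = &queue[head];
        r->nr = nr;
        for (int i = 0; i < NARGS; i++)
            r->args[i] = user_stack[i];  /* first few userland stack words   */

        if (++head == QLEN) {            /* flush the trace and wrap around  */
            head = 0;
            sim_hypercall(SIM_HC_WRITE_LOG, (long)queue,
                          (long)sizeof(queue), 0);
        }
    }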

In the out-of-VM approach, we pointed the SYSENTER_EIP MSR to a code snippet that performs a hypercall with the syscall number and the current userland stack address (which contains the arguments of the system call). The hypercall initiates communication with a driver in another VM that performs introspection on the userland stack to retrieve the arguments. The driver places the system call information in a circular queue, which is sent to a kernel log after every 10,000 calls.

For the macrobenchmarks, we chose PCMark 05 [112] as the benchmark suite. We did not include any graphics or audio tests because the drivers in KVM mostly emulate these hardware components. We first ran the benchmark in a guest VM without any system call monitoring tool installed. Then we performed the same tests with our SIM-based monitoring tool and the out-of-VM tool. The results are shown in Table 19. Across all applications, the SIM approach had an order of magnitude less overhead than the out-of-VM approach. The overhead varied significantly among the applications for both the SIM and the out-of-VM versions, primarily due to the varying rate of system call invocations. The average overhead introduced by SIM was 4.15%, compared to 46.10% for the out-of-VM version.

Since we did not parse the system call arguments based on their types, the amount of introspection required by the out-of-VM approach was fairly limited. Only one introspection VMCall was required per system call. In a full-fledged system call tracer that utilizes the argument type information, the overhead for introspection will have a significant impact on performance and our SIM approach should then show an even larger degree of performance advantage over out-of-VM approaches.

7.7 Summary

The general-purpose Secure In-VM Monitoring framework that we presented in this chapter provides the same security guarantees as out-of-VM monitoring, yet incurs low performance overhead similar to that of in-VM monitoring. We described the design of SIM and presented a comprehensive security analysis. We have implemented a SIM prototype on KVM and performed benchmarks on two representative security monitoring applications to compare the SIM approach with a typical out-of-VM one. Our microbenchmark results show that the SIM framework can reduce monitoring overhead by almost 11 times when only monitor invocation time is considered. The microbenchmarks on an introspection-heavy security application show that SIM introduces an overhead of only 13.7%, compared to 690.5% for an out-of-VM approach. In terms of the overall overhead on a system with a large number of event hooks, and hence frequent invocations of the monitor, SIM keeps the overall overhead below 10%, while an out-of-VM approach has overhead as high as 128%.

SIM achieves the goal of a system monitoring mechanism for an antivirus or security tool that is both robust and efficient. Since security equivalent to an out-of-VM approach is provided, unknown malware should not be able to modify the code and data of an antivirus program residing in SIM. Probes placed in the operating system environment can also be protected with the help of the hypervisor. Finally, although the efficiency of an unprotected kernel module residing in the same environment is not fully matched, the cost of switching to the antivirus is far less than in out-of-VM approaches, and should make it possible to develop more robust and better protected antivirus programs.

CHAPTER VIII

CONCLUSION AND FUTURE WORK

8.1 Summary

As computers become more ubiquitous, the potential threat of malware continues to grow. The current state of malware detection shows that many antivirus products and host-based security tools do not provide adequate security to protect a computer from being infected. In this dissertation, we have addressed this important issue and focused on several problems that can help antivirus and security tools be more successful in detecting today's malware. While there can be many different approaches to detecting malware, utilizing various signature-based or behavior-based models, they all can benefit from advances in two aspects of the malware detection life-cycle that are orthogonal to their design. One is the approach of extracting information from malware to build the detection model, which is called malware analysis. The other is the approach by which an antivirus utilizes the detection models while monitoring events or code on the host to detect malware. We have addressed problems in both of these areas with the goal of improving malware detection.

First, we provided methods that can make various malware analysis approaches more efficient and robust against malware. We focused on both static analysis and dynamic analysis to make them more robust against anti-analysis attacks. The contributions in this area are provided below:

• We introduced Eureka, a framework targeted at enabling static analysis of malware by defeating single-pass and multi-pass packing obfuscations, as well as API obfuscation attacks. At the time it was developed, our evaluation showed that it successfully unpacked 95% of real-world packed malware binaries occurring daily. We anticipate that this rate has since declined due to the growing popularity of newer obfuscations; nevertheless, our method efficiently generates unpacked malware for further static analysis.

• In order to improve the robustness of dynamic analysis, we presented a formal framework for describing program execution and analyzing the requirements for transparent malware analysis. We then presented Ether, an external, transparent malware analyzer that uses hardware virtualization extensions to offer both fine-grained (single instruction) and coarse-grained (system call) tracing.

The next part of the thesis focused on obfuscation techniques that can directly affect both static and dynamic analysis approaches. While most obfuscation techniques target static analysis, we focused on obfuscation techniques that can affect dynamic analysis as well. We first presented a new input-based obfuscation technique, called conditional code obfuscation, that encodes or encrypts parts of the program using a key that is effectively removed from the program. The contributions of this work are:

• We presented the principles of an automated obfuscation scheme that can conceal condition-dependent malicious behavior from existing and future input-oblivious malware analyzers.

• We developed a prototype compiler-level obfuscator for Linux and performed an experimental evaluation on several existing real-world malware programs, showing that a large fraction of trigger-based malicious behavior can be successfully hidden.

• We provided insight into the strengths and weaknesses of our obfuscation. We discussed how an attacker equipped with this knowledge can modify programs to increase the strength of the scheme. We also discussed the possibility of brute-force attacks as a weakness of our obfuscation and provided insight into how to develop more capable malware analyzers that incorporate input domain knowledge.
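The following minimal sketch makes the transformation concrete. For readability, the obfuscation-time and run-time halves are shown in one C program; FNV-1a and a repeating-key XOR stand in for the cryptographic hash and cipher a real obfuscator would use, and the trigger string and payload are hypothetical.

/*
 * Minimal sketch of conditional code obfuscation.  FNV-1a and a
 * repeating-key XOR stand in for the cryptographic hash and cipher a
 * real obfuscator would use; the trigger and payload are made up.
 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint32_t hash32(const char *s)            /* one-way check */
{
    uint32_t h = 2166136261u;                    /* FNV-1a */
    while (*s) { h ^= (uint8_t)*s++; h *= 16777619u; }
    return h;
}

static void xor_crypt(uint8_t *buf, size_t n, const char *key)
{
    size_t klen = strlen(key);
    for (size_t i = 0; i < n; i++)
        buf[i] ^= (uint8_t)key[i % klen];
}

int main(void)
{
    /* Obfuscation time: the secret trigger encrypts the payload and
     * is then discarded; only its hash survives in the binary. */
    const char *secret = "activate";
    uint8_t payload[] = "condition-dependent behavior";
    uint32_t trigger_hash = hash32(secret);
    xor_crypt(payload, sizeof payload - 1, secret);

    /* Run time: the program holds only trigger_hash and ciphertext. */
    char input[64];
    if (!fgets(input, sizeof input, stdin))
        return 1;
    input[strcspn(input, "\n")] = '\0';

    if (hash32(input) == trigger_hash) {             /* trigger matched */
        xor_crypt(payload, sizeof payload - 1, input);  /* input is the key */
        printf("payload: %s\n", payload);
    } else {
        puts("payload remains encrypted");
    }
    return 0;
}

An input-oblivious analyzer that never supplies the trigger observes only the hash value and the ciphertext; recovering the concealed behavior requires guessing the input, which is exactly the brute-force avenue discussed above.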

We then focused on a powerful obfuscation technique that recent malware has adopted to impede both white-box and gray-box analyzers. This obfuscation technique converts the binary x86 code of malware into a random bytecode language and uses emulation to execute the code on the real hardware. We presented a technique that automatically reverse engineers the emulator and defeats the obfuscation; a minimal example of the emulation model we target is sketched after the contributions below. The contributions are:

• We formulated the research problem of automatic reverse engineering of malware emulators. By analyzing an emulator's execution trace using a given emulation model of how a bytecode instruction is fetched and executed, we can identify the bytecode region and discover the syntax and semantics of the bytecode instructions.

• We developed a framework and working prototype system that includes: a novel method to identify candidate memory regions containing bytecode, a method for identifying dispatch and instruction execution blocks, and a method for discovering bytecode instruction syntax and semantics. The output of our system can be used by existing analysis tools to analyze and extract malware behavior; for example, the identified bytecode can be converted to x86 instructions for static and/or dynamic analysis.
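For concreteness, the sketch below shows a minimal decode-and-dispatch interpreter of the kind assumed by our emulation model, with a hypothetical three-opcode bytecode. The bytecode array corresponds to the memory region our first method identifies, the switch statement to the dispatch block, and each case to an instruction execution block that advances the virtual program counter.

/*
 * Minimal decode-and-dispatch interpreter illustrating the emulation
 * model the analysis assumes.  The three-opcode bytecode is made up.
 */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum { OP_PUSH = 0x01, OP_ADD = 0x02, OP_HALT = 0xFF };

int main(void)
{
    /* Bytecode region: push 2; push 3; add; halt. */
    uint8_t bytecode[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_HALT };
    int stack[16];
    int sp = 0;
    size_t vpc = 0;                           /* virtual program counter */

    for (;;) {
        uint8_t op = bytecode[vpc];           /* fetch */
        switch (op) {                         /* dispatch block */
        case OP_PUSH:                         /* execution blocks follow */
            stack[sp++] = bytecode[vpc + 1];  /* operand follows opcode */
            vpc += 2;
            break;
        case OP_ADD:
            sp--;
            stack[sp - 1] += stack[sp];
            vpc += 1;
            break;
        case OP_HALT:
            printf("result: %d\n", stack[sp - 1]);  /* prints 5 */
            return 0;
        default:
            return 1;                         /* unknown opcode */
        }
    }
}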

Finally, the dissertation presents a method to make host monitoring efficient and robust at the same time. This approach, which we call Secure In-VM Monitoring, has promise in protecting antivirus programs from unknown malware that may attempt to disable the antivirus. By utilizing this technique, malware residing in the host cannot modify the data or hooks belonging to the antivirus. Therefore, even if unknown malware resides in the system, it cannot disable the antivirus to allow other detectable malware to continue their attack. The contributions on making host monitoring more efficient and secure are:

• Leveraging hardware virtualization and memory protection features, we proposed the Secure In-VM Monitoring framework, in which a security monitor or antivirus tool can reside inside a guest VM but still enjoy the same security benefits as out-of-VM monitors (Section 7.3 and Section 7.4), bringing both efficiency and robustness.

• We implemented a prototype of the SIM framework based on KVM and a Windows guest OS (see Section 7.5). For demonstration, we developed two security monitoring applications using our framework, and we provide an experimental evaluation of the performance overhead (see Section 7.6).

• We provide a systematic security analysis of SIM against a number of possible threats throughout the dissertation, and show that SIM provides no weaker security guarantees than what can be achieved by out-of-VM monitors.

8.2 Future Work

While this thesis has addressed a number of problems in the areas of malware analysis and system monitoring, the work has revealed several other areas to explore and a few open problems that may be worth solving to reach the goal of improving the effectiveness of our defenses against malware. These problems and areas are described below:

• Decompiling Emulated Malware: We have presented in this thesis methods to automatically reverse engineer emulators present in malware and determine the syntax and semantics of the bytecode language instructions. However, this is only the first necessary step towards analyzing the malware in a useful manner. For example, in order to identify or extract signatures that can be used to detect this malware, parts of the malware may need to be decompiled to extract code that represents the malware's behavior. Our extracted syntax and semantics provide the necessary basis to perform this decompilation. However, an issue arises with the exploration of different paths in the code. While we can identify the control-flow instructions, our dynamic analysis method extracts only one path of execution. A multipath exploration technique will need to be combined with the control-flow semantics we extracted, solving the path constraints to find where an unexplored path starts. In some cases, the code for a single execution path might also be sufficient for a signature. These problems and issues need to be further investigated and explored.

• Fine-grained Monitoring of the Host: While our Secure In-VM Monitoring approach has provided a means to monitor the host at a finer grain than is possible with an out-of-VM approach, it is still prone to the switching overhead that results from page table transfers and TLB flushes. This kind of overhead is tolerable for monitoring events such as system calls or kernel functions. However, it is not sufficient for monitoring a system at the granularity of control flow or data flow. The need for monitoring at that granularity has been shown by many systems that identify illegal control flow [14] or data modifications [53, 16]. Although finding attacks on the basis of control flow and data flow throughout the whole system is challenging, there are certain areas in the OS that might have specific properties which may be checked exclusively by a secure antivirus to verify that the OS is behaving as it should. For example, the process scheduler, interrupt dispatcher, etc., may all be verified for integrity; a sketch of one such invariant check appears after this list. Monitoring at that level of granularity may be possible by combining run-time instrumentation using dynamic translation techniques with SIM.

• Monitoring Kernel Module Behavior: While processes have a very well-defined interface to the operating system that cannot be circumvented by the process alone, kernel modules can deliberately bypass the interface and directly modify data or execute code from anywhere else. This freedom makes it hard to monitor the behavior of a kernel module. Rootkits generally modify sensitive kernel data structures in order to hide malicious activities or change the operating system's behavior, and these events might occur at the level of individual instructions. By enabling the monitoring of kernel modules, antivirus programs can gain more visibility into the actions of kernel modules in order to detect malicious ones. The challenge in solving this problem is to find an approach that can restrict kernel module behavior so that legal behavior executes without much overhead while malicious behavior is intercepted.
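As an example of the kind of exclusive integrity property mentioned under fine-grained monitoring above, the sketch below checks that every handler in a kernel dispatch table still points into the immutable kernel text segment, an invariant that a SIM-protected monitor could verify on each invocation. All symbol names here are hypothetical placeholders rather than a real kernel API, and the kernel text segment is simulated with a static buffer so the sketch runs as an ordinary program.

/*
 * Hypothetical sketch of an integrity invariant a protected in-VM
 * monitor could check: every handler registered in a kernel dispatch
 * table must point into the immutable kernel text segment.  All names
 * are made up for illustration; none of this is a real kernel API.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define TABLE_SIZE 4

static uint8_t kernel_text[4096];          /* stands in for kernel .text */
static uintptr_t dispatch_table[TABLE_SIZE];

static bool in_kernel_text(uintptr_t p)
{
    uintptr_t start = (uintptr_t)kernel_text;
    return p >= start && p < start + sizeof kernel_text;
}

static bool dispatch_table_is_clean(void)
{
    for (size_t i = 0; i < TABLE_SIZE; i++)
        if (!in_kernel_text(dispatch_table[i]))
            return false;                  /* redirected, e.g., by a rootkit */
    return true;
}

int main(void)
{
    for (size_t i = 0; i < TABLE_SIZE; i++)
        dispatch_table[i] = (uintptr_t)kernel_text + 16 * i;  /* legitimate */
    printf("clean: %d\n", dispatch_table_is_clean());         /* prints 1 */

    static uint8_t rootkit_code[16];
    dispatch_table[2] = (uintptr_t)rootkit_code;              /* simulated hook */
    printf("clean: %d\n", dispatch_table_is_clean());         /* prints 0 */
    return 0;
}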

8.3 Closing Remarks

This thesis addresses several research problems in the area of malware analysis and system monitoring. The problems that we targeted are mostly ones whose solutions can impact a large number of approaches, both those in use today and those that may appear in the future. We targeted improvements in both static and dynamic malware analysis, and the obfuscations that we anticipated and defended against can affect large classes of analyzers. The system monitoring approach that we presented can also be directly used by antivirus programs today to improve their resilience against unknown malware in the target host, and in effect improve the overall detection ability of these tools. The problems that we tackled in both malware analysis and system monitoring should provide benefits that improve the effectiveness of malware detection systems employed in industry. More efficient and robust malware analysis can lead to more accurate and comprehensive information extracted from malware, improving the formation of signatures and reducing the time required to generate them. Moreover, robust and efficient system monitoring can enable the use of these signatures in an effective manner on the end hosts. Research on the open problems along this line can keep improving the effectiveness of malware detection as a whole and bring us closer and closer to our goal of defeating malware and gaining the upper hand over it once and for all.

REFERENCES

[1] “Anubis: Analyzing Unknown Binaries.” http://anubis.seclab.tuwien.ac.at.

[2] “Cyveillance testing finds AV vendors detect on average less than 19% of malware attacks.” http://www.cyveillance.com/web/news/press_rel/2010/2010-08-04.asp. Last accessed Sep. 16, 2010.

[3] “DYNINST API.” http://www.dyninst.org.

[4] “FileMon for Windows.” http://technet.microsoft.com/en-us/sysinternals/bb896642.aspx.

[5] “IDA Pro Disassembler.” http://www.datarescue.com/ida.htm. Last accessed Sep. 16, 2010.

[6] “Ollydbg.” http://www.ollydbg.de. Last accessed Sep. 16, 2010.

[7] “RegMon for Windows.” http://technet.microsoft.com/en-us/sysinternals/bb896652.aspx.

[8] “Siemens stuxnet attack sophisticated, targeted.” http://www.controlglobal.com/industrynews/2010/163.html. Last accessed Sep. 16, 2010.

[9] “VirtualPC.” http://www.microsoft.com/windows/products/winfamily/virtualpc/.

[10] “VMWare.” http://www.vmware.com.

[11] “Norman Sandbox Whitepaper.” http://www.norman.com/documents/wp_sandbox.pdf, 2003.

[12] “2007 malware report: The economic impact of viruses, spyware, adware, botnets, and other malicious code,” 2007.

[13] “Annual report pandalabs 2008,” 2008.

[14] Abadi, M., Budiu, M., Erlingsson, U., and Ligatti, J., “Control-flow integrity,” in Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2005.

[15] Aho, A., Lam, M., Sethi, R., and Ullman, J., Compilers—Principles, Techniques, & Tools. Addison Wesley, 2006.

[16] Akritidis, P., Cadar, C., Raiciu, C., Costa, M., and Castro, M., “Preventing memory error exploits with WIT,” in Proceedings of the IEEE Symposium on Security and Privacy, 2008.

[17] “Avira Antivirus.” http://www.free-av.com. Last accessed Mar. 6, 2009.

[18] Bacher, P., Holz, T., Kotter, M., and Wicherski, G., “Know your enemy: Tracking botnets.” http://www.honeynet.org/papers/bots, 2005.

[19] Bailey, M., Cooke, E., Jahanian, F., Watson, D., and Nazario, J., “The Blaster worm: Then and now,” IEEE Security and Privacy, vol. 3, pp. 26–31, 2005.

[20] Bailey, M., Oberheide, J., Andersen, J., Mao, Z. M., Jahanian, F., and Nazario, J., “Automated Classification and Analysis of Internet Malware,” in Proceedings of the 10th International Symposium on Recent Advances in Intrusion Detection (RAID’07), September 2007.

[21] Balakrishnan, G. and Reps, T., “Analyzing memory accesses in x86 executables,” in Proceedings of Compiler Construction (LNCS 2985), pp. 5–23, Springer Verlag, Apr. 2004.

[22] Balakrishnan, G. and Reps, T., “Recovery of variables and heap structure in x86 executables,” Tech. Rep. 1533, Computer Sciences Department, University of Wisconsin–Madison, 2005.

[23] Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T., Ho, A., Neugebauer, R., Pratt, I., and Warfield, A., “Xen and the art of virtualization,” in Proceedings of the Symposium on Operating System Principles, October 2003.

[24] Bayer, U., Kruegel, C., and Kirda, E., “TTAnalyze: A Tool for Analyzing Malware,” in Proceedings of the 15th Annual Conference of the European Institute for Computer Antivirus Research (EICAR), pp. 180–192, 2006.

[25] Bell, J. R., “Threaded code,” Communications of the ACM, vol. 16, June 1973.

[26] Bellard, F., “QEMU, a Fast and Portable Dynamic Translator,” in ATEC ’05: Proceedings of the Annual Conference on USENIX Annual Technical Conference, (Berkeley, CA, USA), pp. 41–41, USENIX Association, 2005.

[27] Bishop, M., Computer Security: Art and Science. Addison-Wesley Professional, 2003.

[28] “BitBlaze.” http://bitblaze.cs.berkeley.edu. Last accessed Sep. 16, 2010.

[29] Borders, K., Zhao, X., and Prakash, A., “Siren: Catching Evasive Malware (Short Paper),” in Proceedings of the IEEE Symposium on Security and Privacy, pp. 78–85, 2006.

[30] Breslau, L., Chase, C., Duffield, N., Fenner, B., Mao, Y., and Sen, S., “Vmscope: a virtual multicast vpn performance monitor,” in Proceedings of the 2006 SIGCOMM workshop on Internet network management, 2006.

[31] Brumley, D., Hartwig, C., Kang, M. G., Liang, Z., Newsome, J., Poosankam, P., Song, D., and Yin, H., “Bitscope: Automatically dissecting malicious binaries,” Tech. Rep. CMU-CS-07-133, Carnegie Mellon University, 2007.

[32] Brumley, D., Hartwig, C., Liang, Z., Newsome, J., Song, D., and Yin, H., “Towards automatically identifying trigger-based behavior in malware using symbolic execution and binary analysis,” Tech. Rep. CMU-CS-07-105, Carnegie Mellon University, 2007.

[33] Collberg, C., Thomborson, C., and Low, D., “A taxonomy of obfuscating transformations,” Tech. Rep. 148, The University of Auckland, July 1997.

[34] Collberg, C., Thomborson, C., and Low, D., “Manufacturing cheap, resilient, and stealthy opaque constructs,” in Proceedings of the ACM Symposium on Principles of Programming Languages (POPL ’98), January 1998.

[35] Linn, C. and Debray, S., “Obfuscation of executable code to improve resistance to static disassembly,” in Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2003.

[36] Willems, C., “CWSandbox: Automatic Behaviour Analysis of Malware.” http://www.cwsandbox.org/, 2006. Last accessed Sep. 16, 2010.

[37] Caballero, J., Yin, H., Liang, Z., and Song, D., “Polyglot: Automatic Extraction of Protocol Message Format using Dynamic Binary Analysis,” in Proceedings of ACM Conference on Computer and Communication Security, Oct. 2007.

[38] Cadar, C., Ganesh, V., Pawlowski, P., Dill, D., and Engler, D., “EXE: Automatically generating inputs of death,” in Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2006.

[39] Cerven, P., Crackproof Your Software: Protect Your Software Against Crackers. No Starch Press, 2002.

[40] Christodorescu, M. and Jha, S., “Static analysis of executables to detect malicious patterns,” in Proceedings of the USENIX Security Symposium, 2003.

[41] Christodorescu, M., Jha, S., Seshia, S., Song, D., and Bryant, R., “Semantics-aware malware detection,” in Proceedings of the IEEE Symposium on Security and Privacy, 2005.

[42] Christodorescu, M., Jha, S., Seshia, S. A., Song, D., and Bryant, R. E., “Semantics-Aware Malware Detection,” in Proceedings of the 2005 IEEE Symposium on Security and Privacy (Oakland 2005), (Oakland, CA, USA), pp. 32–46, ACM Press, May 2005.

[43] Christodorescu, M., Kruegel, C., and Jha, S., “Mining Specifications of Malicious Behavior,” in Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE’07), (New York, NY, USA), pp. 5–14, ACM Press, 2007.

[44] Cook, J. J., “Reverse execution of Java bytecode,” The Computer Journal, vol. 45, no. 6, 2002.

[45] Crandall, J., Wassermann, G., Oliveira, D., Su, Z., Wu, F., and Chong, F., “Temporal search: Detecting hidden malware timebombs with virtual machines,” in Proceedings of the Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2006.

[46] Cui, W., Peinado, M., Chen, K., Wang, H., and Irun-Briz, L., “Tupni: Automatic reverse engineering of input formats,” in Proceedings of the ACM Conference on Computer and Communications Security, 2008.

[47] da Wang, C. and Ju, S., “The dilemma of covert channels searching,” in ICISC, pp. 169–174, 2005.

[48] Dagon, D., “Botnet detection and response: The network is the infection,” in OARC Workshop, 2005.

[49] Data Rescue, “IDA Pro Disassembler and Debugger.” http://www.datarescue.com/idabase/index.htm. Last accessed Sep. 16, 2010.

[50] Debaere, E. H. and Campenhout, J. M. V., Interpretation and Instruction Path Coprocessing. Cambridge, MA: MIT Press, 1990.

[51] Dinaburg, A., Royal, P., Sharif, M., and Lee, W., “Ether: Malware analysis via hardware virtualization extensions,” in Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2008.

[52] Egele, M., Kruegel, C., Kirda, E., and Yin, H., “Dynamic spyware analysis,” in Proceedings of the 2007 Usenix Annual Conference (Usenix07), 2007.

[53] Erlingsson, U., Abadi, M., Vrable, M., Budiu, M., and Necula, G. C., “XFI: Software guards for system address spaces,” in Proceedings of the 7th Symposium on Operating Systems Design and Implementation, 2006.

[54] Feng, H., Giffin, J., Huang, Y., Jha, S., Lee, W., and Miller, B., “Formalizing sensitivity in static analysis for intrusion detection,” in Proceedings of the IEEE Symposium on Security and Privacy, (Oakland, California), May 2004.

[55] Ferrante, J., Ottenstein, K., and Warren, J. D., “The program dependence graph and its use in optimization,” ACM Transactions on Programming Languages and Systems, vol. 9, July 1987.

[56] Ferrie, P., “Attacks on Virtual Machine Emulators.” Symantec Advanced Threat Research, 2006.

[57] Ferrie, P., “Attacks on More Virtual Machines.” http://pferrie.tripod.com/papers/attacks2.pdf, 2007.

[58] Filiol, E., “Strong cryptography armoured computer viruses forbidding code analysis,” in Proceedings of the Annual Conference of the European Institute for Computer Antivirus Research (EICAR), 2005.

[59] Futoransky, A., Kargieman, E., Sarraute, C., and Waissbein, A., “Foundations and applications for secure triggers,” ACM Transactions on Information and System Security (TISSEC), vol. 9, Feb. 2006.

[60] Gao, D., Reiter, M. K., and Song, D., “On gray-box program tracking for anomaly detection,” in USENIX Security Symposium, (San Diego, California), Aug. 2004.

[61] Garfinkel, T., Pfaff, B., Chow, J., Rosenblum, M., and Boneh, D., “Terra: A virtual machine-based platform for trusted computing,” in Proceedings of the ACM Symposium on Operating Systems Principles, 2003.

[62] Garfinkel, T. and Rosenblum, M., “A virtual machine introspection based architecture for intrusion detection,” in Proceedings of the Network and Distributed Systems Security Symposium, 2003.

[63] Garvey, T. and Lunt, T., “Model-based intrusion detection,” in Proceedings of the 14th National Computer Security Conf. (NCSC), (Baltimore, Maryland), June 1991.

[64] Ghosh, A., Schwartzbard, A., and Schatz, M., “Learning program behavior profiles for intrusion detection,” in Proceedings of the 1st USENIX Workshop on Intrusion Detection and Network Monitoring, (Santa Clara, California), April 1999.

[65] Giffin, J., Jha, S., and Miller, B., “Detecting manipulated remote call streams,” in Proceedings of the 11th USENIX Security Symposium, (San Francisco, California), August 2002.

[66] Giffin, J. T., Jha, S., and Miller, B. P., “Automated discovery of mimicry attacks,” in 9th International Symposium on Recent Advances in Intrusion Detection (RAID), (Hamburg, Germany), Sept. 2006.

[67] Gryaznov, D., “An analysis of Cheeba,” in Proceedings of the Annual Conference of the European Institute for Computer Antivirus Research (EICAR), 1992.

[68] Hoglund, G., Rootkits: Subverting the Windows Kernel. Addison-Wesley Professional, 2005.

[69] Hunt, G. and Brubacher, D., “Detours: Binary Interception of Win32 Functions,” in WINSYM ’99: Proceedings of the 3rd Conference on USENIX Windows NT Symposium, (Berkeley, CA, USA), pp. 14–14, USENIX Association, 1999.

[70] Intel, IA-32 Intel Architecture Software Developer’s Manual Volume 3B: System Programming Guide, Part 1, January 2006. Order Number: 253668-018.

[71] Jiang, X., Xu, D., and Wang, X., “Stealthy malware detection through vmm-based out-of-the-box semantic view reconstruction,” in Proceedings of the ACM Conference on Computer and Communications Security, 2007.

[72] Jiang, X., Wang, X., and Xu, D., “Stealthy Malware Detection Through VMM-Based ”Out-of-the-Box” Semantic View Reconstruction,” in CCS ’07: Proceedings of the 14th ACM conference on Computer and Communications Security, (New York, NY, USA), pp. 128–138, ACM, 2007.

[73] Jiang, X., Xu, D., Wang, H. J., and Spafford, E. H., “Virtual Playgrounds for Worm Behavior Investigation,” in RAID, pp. 1–21, 2005.

[74] Jones, S. T., Arpaci-Dusseau, A. C., and Arpaci-Dusseau, R. H., “Antfarm: Tracking processes in a virtual machine environment,” in Proceedings of the USENIX Annual Technical Conference, 2006.

[75] Kang, M. G., Yin, H., Hanna, S., McCamant, S., and Song, D., “Emulating emulation-resistant malware,” in Proceedings of the Workshop on Virtual Machine Security (VMSec), 2009.

[76] Kang, M. G., Poosankam, P., and Yin, H., “Renovo: A hidden code extractor for packed executables,” in Proceedings of WORM 2007, 2007.

[77] Kang, M. G., Poosankam, P., and Yin, H., “Renovo: A Hidden Code Extractor for Packed Executables,” in Proceedings of the 5th ACM Workshop on Recurring Malcode (WORM ’07), October 2007.

[78] Kelem, N. L. and Feiertag, R. J., “A separation model for virtual machine monitors,” in Proceedings of the IEEE Symposium on Research in Security and Privacy, 1991.

[79] Kirda, E., Kruegel, C., Banks, G., Vigna, G., and Kemmerer, R., “Behavior-based spyware detection,” in Proceedings of the USENIX Security Symposium, 2006.

[80] Klint, P., “Interpretation techniques,” Software Practice and Experience, vol. 11, Sept. 1981.

[81] Ko, C., Fink, G., and Levitt, K., “Automated detection of vulnerabilities in privileged programs by execution monitoring,” in Proceedings of the 10th Annual Computer Security Applications Conference (ACSAC), (Orlando, Florida), December 1994.

[82] Kourai, K. and Chiba, S., “HyperSpector: Virtual distributed monitoring environments for secure intrusion detection,” in Proceedings of the ACM/USENIX International Conference on Virtual Execution Environments, 2005.

[83] Kruegel, C., Kirda, E., Mutz, D., Robertson, W., and Vigna, G., “Automating mimicry attacks using static binary analysis,” in Proceedings of the USENIX Security Symposium, (Baltimore, Maryland), August 2005.

[84] Kruegel, C., Robertson, W., and Vigna, G., “Detecting kernel-level rootkits through binary analysis,” in Proceedings of ACSAC, 2004.

[85] Kruegel, C., Robertson, W., and Vigna, G., “Detecting Kernel-Level Rootkits Through Binary Analysis,” in Proceedings of the Annual Computer Security Applications Conference (ACSAC), (Tucson, AZ), pp. 91–100, December 2004.

[86] “Kernel based virtual machine.” http://www.linux-kvm.org/page/Main_Page. Last accessed Sep. 16, 2010.

[87] Lanzi, A., Sharif, M., and Lee, W., “K-Tracer: A system for extracting kernel malware behavior,” in Proceedings of the 16th Annual Network and Distributed System Security Symposium (NDSS 2009), 2009.

[88] Lattner, C. and Adve, V., “LLVM: A compilation framework for lifelong program analysis & transformation,” in Proceedings of the International Symposium on Code Generation and Optimization, 2004.

[89] Lee, W., Stolfo, S., and Mok, K., “A data mining framework for building intrusion detection models,” in Proceedings of the IEEE Symposium on Security and Privacy, (Oakland, California), May 1999.

[90] Lin, Z., Xu, D., and Zhang, X., “Automatic protocol format reverse engineering through context-aware monitored execution,” in Proceedings of the Annual Network and Distributed System Security Symposium (NDSS), 2008.

[91] Litty, L., Lagar-Cavilla, H. A., and Lie, D., “Hypervisor support for identifying covertly executing binaries,” in Proceedings of the 17th USENIX Security Symposium, 2008.

[92] Madnick, S. E. and Donovan, J. J., “Application and analysis of the virtual machine approach to information system security and isolation,” in Proceedings of the Workshop on Virtual Computer Systems, 1973.

[93] Magnusson, P. S. and Samuelsson, D., “A compact intermediate format for SimICS,” tech. rep., Swedish Institute of Computer Science, 1994.

[94] “Malfease Malware Repository.” https://malfease.oarci.net. Last accessed Sep. 16, 2010.

[95] Martignoni, L., Paleari, R., Roglia, G. F., and Bruschi, D., “Testing CPU emulators,” in Proceedings of the International Symposium on Software Testing and Analysis (ISSTA), 2009.

[96] Martignoni, L., Christodorescu, M., and Jha, S., “Omniunpack: Fast, generic, and safe unpacking of malware,” in Proceedings of ACSAC, 2007.

[97] Martignoni, L., Christodorescu, M., and Jha, S., “OmniUnpack: Fast, Generic, and Safe Unpacking of Malware,” in ACSAC, pp. 431–441, 2007.

[98] Moore, D., Shannon, C., and Brown, J., “Code-Red: A case study on the spread and victims of an Internet worm,” in Proceedings of the ACM Internet Measurement Workshop, 2002.

[99] Moore, D., Paxson, V., Savage, S., Shannon, C., Staniford, S., and Weaver, N., “Inside the slammer worm,” IEEE Security and Privacy, vol. 1, no. 4, pp. 33–39, 2003.

[100] Moser, A., Kruegel, C., and Kirda, E., “Exploring multiple execution paths for malware analysis,” in Proceedings of the IEEE Symposium of Security and Privacy, 2007.

[101] Moser, A., Kruegel, C., and Kirda, E., “Exploring multiple execution paths for malware analysis,” in Proceedings of the IEEE Symposium of Security and Privacy, 2007.

[102] Moser, A., Kruegel, C., and Kirda, E., “Limits of static analysis for malware detection,” in Proceedings of ACSAC, 2007.

[103] Newsome, J. and Song, D., “Dynamic taint analysis for automatic detection, analysis and signature generation of exploits on commodity software,” in Proceedings of the Annual Network and Distributed System Security Symposium (NDSS), 2005.

[104] Nielson, F., Nielson, H. R., and Hankin, C., Principles of Program Analysis. Springer-Verlag, 1999.

[105] “Offensive Computing.” https://www.offensivecomputing.net. Last accessed Sep. 16, 2010.

[106] “Oreans Technologies.” http://www.oreans.com/. Last accessed Mar. 6, 2009.

[107] “Oreans Technologies: Code Virtualizer.” http://www.oreans.com/codevirtualizer.php. Last accessed Mar. 6, 2009.

[108] “Oreans Technologies: Themida.” http://www.oreans.com/. Last accessed Mar. 6, 2009.

[109] Packet Storm Security. http://www.packetstormsecurity.org/. Last ac- cessed Sep. 16, 2010.

[110] Payne, B., Carbone, M., Sharif, M., and Lee, W., “Lares: An architecture for secure active monitoring using virtualization,” in Proceedings of the IEEE Symposium on Security and Privacy, 2008.

[111] Payne, B. D., Carbone, M., and Lee, W., “Secure and flexible monitoring of virtual machines,” in Proceedings of the 23rd Annual Computer Security Applications Conference, pp. 385–397, December 2007.

[112] “Futuremark PCMark 05.” http://www.futuremark.com/products/pcmark05/. Last accessed Apr. 20, 2009.

[113] Petroni, N. L. and Fraser, T., “An architecture for specification-based detection of semantic integrity violations in kernel dynamic data,” in Proceedings of the 15th USENIX Security Symposium, 2006.

[114] Petroni, N. L., Jr., Fraser, T., Molina, J., and Arbaugh, W. A., “Copilot - a coprocessor-based kernel runtime integrity monitor,” in Proceedings of the 13th USENIX Security Symposium, 2004.

[115] Petroni, N. L. and Hicks, M., “Automated detection of persistent kernel control-flow attacks,” in Proceedings of the ACM conference on Computer and Communications Security, 2007.

[116] Popov, I. V., Debray, S. K., and Andrews, G. R., “Binary obfuscation using signals,” in Proceedings of the USENIX Security Symposium, 2007.

[117] Provos, N. and Holz, T., Virtual Honeypots: From Botnet Tracking to Intrusion Detection. Reading: Addison-Wesley Professional, 2007.

[118] Raffetseder, T., Kruegel, C., and Kirda, E., “Detecting system emulators,” in ISC, pp. 1–18, 2007.

[119] Riley, R., Jiang, X., and Xu, D., “Guest-transparent prevention of kernel rootkits with VMM-based memory shadowing,” in Proceedings of the 11th International Symposium on Recent Advances in Intrusion Detection, 2008.

[120] Riordan, S. and Schneier, B., “Environmental key generation towards clueless agents,” in Mobile Agents and Security, 1998.

[121] Roberts, P. F., “Targeted malware attacks: The new normal.” http://www.infoworld.com/t/hacking/targeted-malware-attacks-the-new-normal-159?page=0,1. Last accessed Sep. 16, 2010.

[122] Royal, P., Halpin, M., Dagon, D., Edmonds, R., and Lee, W., “PolyUnpack: Automating the hidden-code extraction of unpack-executing malware,” in Proceedings of ACSAC, 2006.

[123] Royal, P., Halpin, M., Dagon, D., Edmonds, R., and Lee, W., “PolyUnpack: Automating the Hidden-Code Extraction of Unpack-Executing Malware,” in ACSAC, pp. 289–300, 2006.

[124] Rutkowska, J., “System virginity verifier.” http://www.invisiblethings.org/papers/hitb05_virginity_verifier.ppt. Last accessed Sep. 16, 2010.

[125] Sekar, R., Bendre, M., Bollineni, P., and Dhurjati, D., “A fast automaton-based method for detecting anomalous program behaviors,” in Proceedings of the IEEE Symposium on Security and Privacy, (Oakland, California), May 2001.

[126] Seshadri, A., Luk, M., Qu, N., and Perrig, A., “SecVisor: A tiny hypervisor to provide lifetime kernel code integrity for commodity OSes,” in Proceedings of the ACM Symposium on Operating Systems Principles, 2007.

[127] Sharif, M., Lanzi, A., Giffin, J., and Lee, W., “Impeding malware analysis using conditional code obfuscation,” in Proceedings of the Annual Network and Distributed System Security Symposium (NDSS), 2008.

[128] Sharif, M., Lanzi, A., Giffin, J., and Lee, W., “Automatic reverse engineering of malware emulators,” in Proceedings of the IEEE Symposium on Security and Privacy, 2009.

[129] Sharif, M., Lanzi, A., Giffin, J., and Lee, W., “Rotalumé: A tool for automatically reverse engineering malware emulators,” Tech. Rep. GT-CS-09-05, School of Computer Science, Georgia Institute of Technology, 2009. http://www.cc.gatech.edu/research/reports/GT-CS-09-05.pdf.

[130] Sharif, M., Lee, W., Cui, W., and Lanzi, A., “Secure In-VM Monitoring Using Hardware Virtualization,” in Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2009.

[131] Sharif, M., Yegneswaran, V., Saidi, H., Porras, P., and Lee, W., “Eureka: A framework for enabling static malware analysis,” in Proceedings of the 13th European Symposium on Research in Computer Security (ESORICS), 2008.

[132] Silicon Realms, “Armadillo/software passport professional.” http://siliconrealms.com/. Last accessed Mar. 6, 2009.

[133] Sinha, S., Harrold, M. J., and Rothermel, G., “Interprocedural control dependence,” ACM Transactions on Software Engineering and Methodology, vol. 10, Apr. 2001.

[134] Sipser, M., Introduction to the Theory of Computation. International Thomson Publishing, 1996.

[135] Sites, R. L., Chernoff, A., Kirk, M. B., Marks, M. P., and Robinson, S. G., “Binary translation,” Communications of the ACM, vol. 36, Feb. 1993.

[136] Skulason, F., 1260-The Variable Virus. Virus Bulletin, 1990.

[137] Smith, J. E. and Nair, R., Virtual Machines: Versatile Platforms for Systems and Processes. Morgan Kaufmann, 2005.

[138] Stata, R. and Abadi, M., “A type system for Java bytecode subroutines,” ACM Transactions on Programming Languages and Systems (TOPLAS), vol. 21, Jan. 1999.

[139] Pearce, S., “Viral polymorphism.” VX Heavens, 2003.

[140] Stolfo, S. J., Wang, K., and Li, W.-J., “Fileprint analysis for malware detection,” in ACM CCS WORM, 2005.

[141] Symantec - Virus Database, “Keylogger.stawin.” http://www.symantec.com/security_response/writeup.jsp?docid=2004-012915-2315-99. Last accessed Sep. 16, 2010.

[142] Symantec - Virus Database, “Linux.slapper.worm.” http://securityresponse.symantec.com/avcenter/security/Content/2002.09.13.html. Last accessed Sep. 16, 2010.

[143] Szor, P., The Art of Computer Virus Research and Defense. Addison-Wesley Professional, 2005.

[144] Tan, K., Killourhy, K. S., and Maxion, R. A., “Undermining an anomaly-based intrusion detection system using common exploits,” in Recent Advances in Intrusion Detection (RAID), (Zurich, Switzerland), Oct. 2002.

[145] Uhlig, R., Neiger, G., Rodgers, D., Santoni, A. L., Martins, F. C., Anderson, A. V., Bennett, S. M., Kägi, A., Leung, F. H., and Smith, L., “Intel virtualization technology,” Computer, vol. 38, no. 5, pp. 48–56, 2005.

[146] Vasudevan, A. and Yerraballi, R., “Cobra: Fine-grained malware analysis using stealth localized-executions,” in Proceedings of the IEEE Symposium on Security and Privacy, 2006.

[147] Vasudevan, A. and Yerraballi, R., “Stealth Breakpoints,” in Proceedings of the 21st Annual Computer Security Applications Conference, (Washington, DC, USA), pp. 381–392, IEEE Computer Society, 2005.

[148] Veldman, F., “Heuristic anti-virus technology.” http://vx.netlux.org/lib/static/vdat/epheurs1.htm. Last accessed Sep. 16, 2010.

[149] “Virus Total.” http://www.virustotal.com/. Last accessed Mar. 6, 2009.

[150] “VMPsoft VMProtect.” http://www.vmprotect.ru/. Last accessed Mar. 6, 2009.

[151] Wagner, D., Static Analysis and Computer Security: New Techniques for Software Assurance. Ph.D. dissertation, University of California at Berkeley, 2000.

[152] Wagner, D. and Soto, P., “Mimicry attacks on host based intrusion detection systems,” in Proceedings of the Ninth ACM Conference on Computer and Communications Security (CCS), (Washington, DC), November 2002.

[153] Wahbe, R., Lucco, S., Anderson, T., and Graham, S., “Efficient software-based fault isolation,” ACM SIGOPS Operating Systems Review, vol. 27, no. 5, pp. 203–216, 1993.

[154] Wang, K., Parekh, J. J., and Stolfo, S. J., “Anagram: A content anomaly detector resistant to mimicry attack,” in Proceedings of RAID, 2006.

[155] Wang, K. and Stolfo, S., “Anomalous payload-based network intrusion detection,” in Proceedings of RAID, 2004.

[156] Wang, Y.-M., Beck, D., Jiang, X., Roussev, R., Verbowski, C., Chen, S., and King, S. T., “Automated Web Patrol with Strider HoneyMonkeys: Finding Web Sites That Exploit Browser Vulnerabilities,” in NDSS, 2006.

[157] Wang, Z., Jiang, X., Cui, W., and Wang, X., “Countering persistent kernel rootkits through systematic hook discovery,” in Proceedings of the 11th International Symposium on Recent Advances in Intrusion Detection, 2008.

[158] Wang, Z., Jiang, X., Cui, W., and Ning, P., “Countering kernel rootkits with lightweight hook protection,” in Proceedings of the 16th ACM Conference on Computer and Communications Security (CCS), 2009.

[159] Wikipedia, “CIH (Computer Virus).” http://en.wikipedia.org/wiki/CIH_(computer_virus). Last accessed Sep. 16, 2010.

[160] Wikipedia, “Dr. Solomon’s Antivirus Toolkit.” http://en.wikipedia.org/wiki/Dr_Solomon's_Antivirus. Last accessed Sep. 16, 2010.

[161] Wikipedia, “Timeline of computer viruses and worms.” http://en.wikipedia.org/wiki/Timeline_of_computer_viruses_and_worms. Last accessed Sep. 16, 2010.

[162] Wilhelm, J. and Chiueh, T.-c., “A forced sampled execution approach to kernel rootkit identification,” in Proceedings of RAID, 2007.

[163] Willems, C., Holz, T., and Freiling, F., “Toward Automated Dynamic Malware Analysis Using CWSandbox,” IEEE Security and Privacy, vol. 5, no. 2, pp. 32–39, 2007.

[164] Wondracek, G., Kruegel, C., and Kirda, E., “Automatic network protocol analysis,” in Proceedings of the Annual Network and Distributed System Security Symposium (NDSS), 2008.

[165] Wright, C., Cowan, C., Smalley, S., Morris, J., and Kroah-Hartman, G., “Linux security modules: General security support for the Linux kernel,” in Proceedings of the 11th USENIX Security Symposium, 2002.

[166] Wroblewski, G., “General method of program code obfuscation.” PhD thesis, Wroclaw University of Technology, 2002.

[167] Yin, H., Liang, Z., and Song, D., “HookFinder: Identifying and understanding malware hooking behaviors,” in Proceedings of the 16th Annual Network and Distributed System Security Symposium, 2008.

[168] Yin, H., Song, D., Egele, M., Kruegel, C., and Kirda, E., “Panorama: Capturing system-wide information flow for malware detection and analysis,” in Proceedings of ACM Conference on Computer and Communication Security, Oct. 2007.

[169] Young, A. and Yung, M., “Cryptovirology: Extortion based security threats and countermeasures,” in Proceedings of the IEEE Symposium of Security and Privacy, 1996.

[170] Zorz, Z., “ malware investigated.” http://www.net-security.org/malware news.php?id=1223. Last accessed Sep. 16, 2010.
