Classification of Malware Using Reverse Engineering and Data Mining Techniques

CLASSIFICATION OF MALWARE USING REVERSE ENGINEERING AND DATA MINING TECHNIQUES

A Thesis
Presented to
The Graduate Faculty of The University of Akron

In Partial Fulfillment
of the Requirements for the Degree
Master of Science

Ravindar Reddy Ravula
August, 2011

Thesis

Approved:
  Advisor: Dr. Kathy J. Liszka
  Committee Member: Dr. Chien-Chung Chan
  Committee Member: Dr. Zhong-Hui Duan

Accepted:
  Department Chair: Dr. Chien-Chung Chan
  Dean of the College: Dr. Chand K. Midha
  Dean of the Graduate School: Dr. George R. Newkome

ABSTRACT

Detecting new and unknown malware is a major challenge in today’s software security profession. Many approaches for detecting malware using data mining techniques have already been proposed. The majority of these works used static features of malware. However, static detection methods fall short of detecting present-day complex malware. Although some researchers have proposed dynamic detection methods, those methods did not use all the malware features.

In this work, an approach for the detection of new and unknown malware was proposed and implemented. A total of 582 malware and 521 benign software samples were collected from the Internet. Each sample was reverse engineered to analyze its effect on the operating environment and to extract its static and behavioral features. The raw data extracted from the reverse engineering was preprocessed, and two datasets were obtained: a dataset with reversed features and a dataset with API Call features.
Feature reduction was performed manually on the dataset with reversed features, and the features that do not contribute to the classification were removed. The J48 machine learning classification algorithm was applied to the dataset with reversed features to obtain a decision tree and its classification rules. To reduce the tree size and to obtain an optimum number of decision rules, the attribute values in the dataset with reversed features were discretized, and another dataset was prepared with the discretized attribute values. J48 was applied to the new dataset, generating a decision tree with another set of classification rules. To further reduce the tree and the number of decision rules, the dataset with discretized features (DDF) was given to BLEM2, a machine learning tool that is based on rough sets and produces decision rules. To test the accuracy of the rules, the dataset with decision rules from BLEM2 was given as input to the J48 algorithm. The same procedure was followed for the dataset with API Call features (DAF).

Another set of experiments was conducted on the three datasets using the Naïve Bayes classifier to generate training models for classification. All the training models were tested with an independent testing set. The J48 decision tree algorithm produced better results with the DDF and DAF datasets, with accuracies of 81.448% and 89.140% respectively. The Naïve Bayes classifier produced better results with the DDF dataset, with an accuracy of 85.067%.

ACKNOWLEDGMENTS

I would like to express my sincere gratitude to the people who made this research possible. I want to express my heartfelt thanks to Dr. Kathy J. Liszka for giving me the opportunity to work on the thesis. Her invaluable guidance and support at every stage led to the successful conclusion of the study. I would like to thank Dr. Chien-Chung Chan for his expert advice on data mining and for his insightful suggestions, which have been very helpful in the study.
In addition, I want to thank Dr. Zhong-Hui Duan for taking the time to serve on the thesis committee. I want to convey my special thanks to my parents, sister, brother, brother-in-law, cousin and friends for their love and continuous encouragement. Their blessings and moral support have been invaluable at every stage of my life. Thank you all for standing by me at all times.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
CHAPTER
I. INTRODUCTION
II. LITERATURE REVIEW
III. TYPES OF MALWARE AND ANTI-MALWARE DEFENSE TECHNIQUES
  3.1 Malware Types
    3.1.1 Virus
    3.1.2 Worm
    3.1.3 Backdoor
    3.1.4 Trojan Horse
    3.1.5 Rootkit
    3.1.6 Spyware
    3.1.7 Adware
  3.2 Antivirus Detection Techniques
    3.2.1 Signature Based Detection
    3.2.2 Heuristic Approach
    3.2.3 Sandbox Approach
    3.2.4 Integrity Checking
IV. REVERSE ENGINEERING
  4.1 Controlled Environment
  4.2 Experimental Setup
  4.3 Static Analysis
    4.3.1 Cryptographic Hash Function
    4.3.2 Packer Detection
    4.3.3 Code Analysis
  4.4 Dynamic Analysis
    4.4.1 File System Monitor
    4.4.2 Registry Monitor
    4.4.3 API Call Tracer
V. DATA MINING
  5.1 System Design
  5.2 KDD Process
    5.2.1 Target Data
    5.2.2 Preprocessing
    5.2.3 Transformation
    5.2.4 Data Mining
    5.2.5 Interpretation/Evaluation
VI. RESULTS AND DISCUSSIONS
  6.1 Experiment 1: Classification of DRF
  6.2 Experiment 2: Classification of DDF
  6.3 Experiment 3: Classification of DDF using BLEM2
  6.4 Experiment 4: Classification of DDF from BLEM2 using J48
  6.5 Experiment 5: Classification of DAF
  6.6 Experiment 6: Classification of DAF using BLEM2
  6.7 Experiment 7: Classification of DAF from BLEM2 using J48
  6.8 Accuracies
  6.9 Pattern in API Call Frequencies
VII. CONCLUSIONS AND FUTURE WORK
  7.1 Conclusions
  7.2 Future Work
REFERENCES
APPENDICES
  APPENDIX A. DATASETS

LIST OF TABLES

  5.1 Attributes in DRF
  5.2 Attributes in DRF after Transformation
  5.3 Discretized Values
  6.1 Decision rules for DRF for the decision label “YES”
  6.2 Decision rules for DRF for the decision label “NO”
  6.3 Decision rules for DDF for the decision label “YES”
  6.4 Decision rules for DDF for the decision label “NO”
  6.5 BLEM2 rules for DDF for the decision label “YES”
  6.6 BLEM2 rules for DDF for the decision label “NO”
  6.7 Decision rules for DDF from BLEM2 for the decision label “YES”
  6.8 Decision rules for DDF from BLEM2 for the decision label “NO”
  6.9 Decision rules for DAF for the decision label “YES”
  6.10 Decision rules for DAF for the decision label “NO”
  6.11 Decision rules for DAF from BLEM2 for the decision label “YES”
  6.12 Decision rules for DAF from BLEM2 for the decision label “NO”
  6.13 Testing set Results against Training Models from Experiments 1, 2 and 3
  6.14 Testing set Results against Training Models from Experiments 4 and 5
  6.15 Experimental Results from Naïve Bayes Classifier
  A1: An Instance for Attributes File Name, File Size and MD5 Hash in DRF
  A2: An Instance for Attributes Packer, File Access, Directory Access and Internet Access in DRF
  A3: API Calls Accessed By the Trojan
  A4: DLLs Accessed By the Trojan
  A5: Registry Keys Added By the Trojan
  A6: Registry Keys Modified By the Trojan
  A7: Registry Keys Deleted By the Trojan
  A8: URL References Made By the Trojan
  A9: Programming Language used, Strings and Decision label of the Trojan
  A10: An Instance of DRF Dataset after Preprocessing
  A11: An Instance of DDF Dataset
  A12: An Instance of DAF Dataset

LIST OF FIGURES

  3.1 Typical Malware Signature
  4.1 Snapshot Manager
  4.2 Normal PE File
  4.3 Packed PE File
  4.4 PEiD
  4.5 IDA Pro Disassembler
  4.6 File Monitor
  4.7 Registry Monitor
  4.8 Registry Key Changes Made by a PE
  4.9 Maltrap