Master's Thesis Computer Science Thesis no: MCS-2011-07 January 2011

Runtime Analysis of Malware

Muhammad Shahid Iqbal Muhammad Sohail

School of Computing
Blekinge Institute of Technology
SE – 371 39 Karlskrona
Sweden

This thesis is submitted to the School of Computing at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Computer Science. The thesis is equivalent to 20 weeks of full time studies.

Contact Information:

Authors:
Muhammad Sohail
E-mail: [email protected]

Muhammad Shahid Iqbal
E-mail: [email protected]

University advisor(s):
Bengt Carlsson
Martin Boldt
Department of Systems and Software Engineering

School of Computing
Blekinge Institute of Technology
SE – 371 39 Karlskrona
Sweden
Internet: www.bth.se/com
Phone: +46 455 38 50 00
Fax: +46 455 38 50 57

ABSTRACT

Context: Every day an increasing number of malicious programs spread around the world, infecting not only end users but also large organizations. This results in a massive security threat to private data and expensive computer resources. A lot of research is under way to cope with this large amount of malicious software, and researchers and practitioners have developed many new methods to deal with it. One of the most effective methods used to capture malicious software is dynamic malware analysis. The dynamic analysis methods used today are very time consuming and resource greedy: it normally takes hours, or even days, to analyze a single instance of suspected software. This is not good enough, especially considering the number of attacks occurring every day.

Objective: To save the time and expensive resources used to perform these analyses, AMA, an automated malware analysis system, is developed to analyze large numbers of suspected software. Analysis of any software inside AMA results in a detailed report of its behavior, which includes changes made to the file system, registry and processes, and the network traffic consumed. The main focus of this study is to develop a model that automates the runtime analysis of software and provides a detailed analysis report, and to evaluate its effectiveness.

Methods: A thorough background study is conducted to gain knowledge about malicious software and its behavior. Software analysis techniques are then studied to arrive at a model that automates the runtime analysis of software. A prototype system is developed, and a quasi experiment is performed on malicious and benign software to evaluate the accuracy of the newly developed system. The generated reports are compared with those of Norman and Anubis.

Results: Based on the background study, an automated runtime analysis model is developed, and a quasi experiment is performed with the implemented prototype system on selected malicious and benign software. The experiment results show that AMA captures more detailed software behavior than Norman and Anubis and that it could be used to better classify software.

Conclusions: We conclude that AMA can capture more detailed behavior of the analyzed software and that it gives a more accurate classification of the software. The experiment results also show that there are no concrete distinguishing factors between the general behaviors of the two types of software. However, by digging a bit deeper into the analysis report one can understand the intentions of the software. This means that the reports generated by AMA provide enough information about software behavior and can be used to draw correct conclusions.

Keywords: Malware Analysis, Automated malware analysis, malicious software.

ACKNOWLEDGMENTS

In the name of Allah the most Merciful and Beneficent

We are thankful to Almighty Allah, who gave us the opportunity and strength to accomplish this study to the best of our efforts. We would like to express our gratitude to our supervisor Dr. Bengt Carlsson, both for his continuous guidance throughout this period and for always finding the time. We are thankful to our co-supervisor Dr. Martin Boldt for his assistance and guidance in developing the prototype of the automated malware analyzer. We would also like to thank Charlie Svahnberg for providing us with the experiment environment and equipment. We are thankful to our friends, who supported us throughout this period and stood by us in all hard times. Last but not least, we will forever be grateful to our parents for their countless prayers and unconditional support, and for always being the best parents; it is because of them that we are able to stand here.

TABLE OF CONTENTS

RUNTIME ANALYSIS OF MALWARE
ABSTRACT
ACKNOWLEDGMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES

1 INTRODUCTION
  1.1 CHALLENGES OF THE FIELD
  1.2 STRUCTURE OF THE THESIS

2 BACKGROUND
  2.1 RELATED WORK
  2.2 AVAILABLE SOLUTIONS
    2.2.1 Web based Analysis Tools
    2.2.2 Open source Tools
    2.2.3 GUI Automation
    2.2.4 GUI Automation Tools
  2.3 STATIC ANALYSIS VS. DYNAMIC ANALYSIS
  2.4 IDENTIFYING LEGITIMATE AND ILLEGITIMATE SOFTWARE
    2.4.1 Behavior of Legitimate Software
    2.4.2 Behavior of Illegitimate Software
  2.5 EMULATION
  2.6 VIRTUALIZATION
    2.6.1 Classification of Virtualization
    2.6.2 VMM Advantage
  2.7 DIFFERENCE BETWEEN VIRTUALIZATION AND EMULATION
  2.8 ORACLE VIRTUAL BOX

3 RESEARCH DESIGN
  3.1 RESEARCH MOTIVATION
  3.2 AIMS AND OBJECTIVES
  3.3 RESEARCH QUESTION
  3.4 RESEARCH METHOD
  3.5 BACKGROUND STUDY
    3.5.1 Search Strategy

4 CHARACTERISTICS OF MALICIOUS SOFTWARE
  4.1 MALWARE SOURCES
  4.2 OBSERVED MALICIOUS BEHAVIOR
    4.2.1 File System Activity
    4.2.2 Registry Activity
    4.2.3 Network Activity
    4.2.4 Incomplete Un-installation Process

5 DESIGN AND IMPLEMENTATION OF AUTOMATED MALWARE ANALYZER (AMA)
    5.1.1 System Architecture
    5.1.2 Order of execution
    5.1.3 Components
    5.1.4 Reason to Choose Selected Software
    5.1.5 Communication

6 EXPERIMENT
  6.1 GOAL
  6.2 PLANNING
    6.2.1 Experiment Environment
    6.2.2 Variable Selection
    6.2.3 Subject Selection
  6.3 EXPERIMENT DESIGN
    6.3.1 Instrumentation
  6.4 OPERATION OF EXPERIMENTS
  6.5 VALIDITY EVALUATION

7 RESULTS
  7.1 AMA REPORT
    7.1.1 General Information
    7.1.2 File Activity
    7.1.3 Registry Activity
    7.1.4 Services
    7.1.5 Process Activity
    7.1.6 Port Activity
  7.2 EXPERIMENT RESULTS
  7.3 RESULTS FOR ILLEGITIMATE GROUP
  7.4 RESULTS FOR LEGITIMATE GROUP
  7.5 COMPARISON

8 DISCUSSION

9 CONCLUSION

10 FUTURE WORK

11 REFERENCES

12 APPENDIX
  APPENDIX A: TERMINOLOGIES AND DEFINITIONS
    Virus
    EULA
    Binary
    PIS
    Malware
    Worms
    Trojan horse
    Exe Packer
    MSI
    WPF
    UI Automation Framework
    Snapshot
  12.1 APPENDIX B
    12.1.1 Observed Behavior of Malware
    12.1.2 Top 10 Auto-start locations
    12.1.3 Trigger Conditions
    12.1.4 Illegitimate Signature from Anubis and Norman
    12.1.5 AMA Report Sample

LIST OF FIGURES

Figure 1 Research Methodology
Figure 2 Broader View of AMA
Figure 3 AMA system components and work flow
Figure 4 Sequence Diagram
Figure 5 AMA Host User Interface
Figure 6 Auto Installer Test Run
Figure 7 SupermonX, Auto Installer and other tools working in AMA Guest
Figure 8 Network Traffic for Illegitimate Group
Figure 9 Network Traffic for Legitimate Group

LIST OF TABLES

Table 1 Database Search Results
Table 2 Snapshot comparison results after installation (Illegitimate Group)
Table 3 Snapshot comparison results after un-installation (Illegitimate Group)
Table 4 Comparison result of legitimate program after installation
Table 5 Snapshot comparison results after un-installation (Legitimate Group)
Table 6 Report Comparison for Illegitimate Group
Table 7 Report Comparison for Legitimate Group
Table 8 Observed Common Behavior by Ulrich [30]
Table 9 Top 10 Auto-start locations [59]
Table 10 Trigger Conditions [5]
Table 11 Illegitimate Signature from Anubis and Norman

CHAPTER 1

1 INTRODUCTION

Since the emergence of the internet, computers have become a vital part of our daily life. Remarkable development in the field of computer science and extensive use of technology have changed our lifestyle radically. This has transformed the way we interact with each other and handle business dealings, banking, shopping and entertainment, all of which are available on personal computers (PCs) through the internet. This massive evolution in internet technology gave birth to e-commerce involving money transactions [38], and we are now heavily dependent on technology.

On the other hand, an increasing number of people with evil intentions have stepped into this era. Commonly known as hackers, they exploit the situation as a new way of making money or growing businesses by stealing personal information, credit card numbers and bank account details from the internet, or by accessing and misusing a computer's private resources. This raises a massive security risk to users' privacy and data. It also damages internet infrastructure, which affects quality of service and customer satisfaction.

One recent trend on the internet, collecting personal information such as browsing habits and selling it to advertising companies that target audiences based on those habits, has become a business in itself [38]. Software that collects personal information such as browsing habits is known as spyware [57]. Advertising agencies distribute their ads on the internet, where they can be served directly to web sites as banners; software developed particularly for this purpose is called adware [38]. Adware displays ads through pop-up windows that irritate users. Any software that steals personal information without the consent of the authentic user is generally classified as Privacy Invasive Software (PIS)¹ [33]. Mainstream users are unaware that they are being monitored by malicious PIS, which runs in the background either as part of legitimate software or separately. PIS can be Trojans, malware, adware or key-loggers [9].

To run their business, some legitimate organizations gather information about their users so that they can target advertisements according to a particular user's activities. This kind of software lies on the boundary of the legal zone, known as the grey zone², which makes it difficult to identify and highlight illegitimate behavior of legitimate software. The majority of users install software without reading the End User License Agreement (EULA) [28]; they ignore whatever is written in it, and malicious software is thereby installed in the background at the same time [9].

A number of solutions are available that can detect and remove malicious software. They work as anti-virus software, also known as anti-spyware or anti-malware, and use signature based or heuristic based techniques to detect malicious activity [35]. There is tough competition between illegitimate and anti-illegitimate software: developers on both sides are continuously trying to find new ways of getting ahead of each other.

In the first part of this study a thorough background study is performed. It presents the state of the research in this area and helped us to differentiate malicious from normal software through their identified characteristics and behavior.

In the second part, on the basis of the knowledge gained from the background study, we developed a model to automate the runtime analysis of software. A proof of concept system has been developed, which answers our first research question: "To what extent can automatic runtime analysis be used to analyze software characteristics?"

The third part of the study consists of quasi experiments that were performed to evaluate the performance of the proposed automated malware analysis system. A reliable database of malicious and benign software is used for the experiments. The system generates a report covering all observed behavior and characteristics; this report is compared with those of the Norman and Anubis malware analyzers and could also be used to classify the software as malicious or benign. This answers our second research question: "Does automated analysis model lead to more accurate capturing of system changes compared to existing solutions?" The report could also be used in a software reputation system [52] that helps end users learn more about any software before installing it, including whether it carries any kind of risk, and so prevents them from installing malicious software.

¹ PIS stands for Privacy Invasive Software; it generalizes all software classed as illegitimate.
² Grey zone: software that is neither completely legitimate nor completely illegitimate and lies on the border line between the two.

1.1 Challenges of the Field

Here we describe the problems and challenges that one can face while developing a tool like AMA. The typical problem is integrating and synchronizing different software: individual pieces of software have to be assembled into one system in which every piece does its work at a specific time and, after finishing, sends its results to the next part, and so on.

When many modules run together in one system, it also becomes very important that the modules can communicate with each other. On a single machine this is easy, but when some modules run inside a virtual machine and others on the host machine, communication becomes a problem.

Another big issue is handling the automated installation of software, so that user interaction is reduced and the process becomes fully automated. This is the most challenging and important part: handling the unpredictable behavior of installer applications needs intensive knowledge from a variety of domains, such as artificial intelligence, machine learning, human computer interaction and algorithms. Reacting to triggered events at every step of the installation is a crucial task, and the fact that we cannot predict the upcoming event makes it more difficult.

Furthermore, differentiating between legitimate and illegitimate behavior is one of the hardest and most critical points. Software companies use advertising as a means to generate revenue by adding advertisements to their software. Distinguishing legitimate from illegitimate software that resides in the grey zone is difficult, because only a minor difference places a program in one group or the other. Identifying the group from behavioral analysis alone, in an automated manner, is even more complicated because the behavior is so similar: both kinds of software create, modify and delete files and registry keys and generate network traffic. Whether legitimate or illegitimate, at first view they behave in the same manner.

1.2 Structure of the Thesis

Chapter 1 gives an introduction, the challenges of the field and a brief overview of the thesis. Chapter 2 covers related work, available solutions, the difference between static and dynamic analysis, the identification of legitimate and illegitimate software, and emulation and virtualization; it discusses the classification of virtualization, follows with some examples, and ends with a detailed explanation of VirtualBox. Chapter 3 describes our research design, research questions and proposed solution, explains the methodology, and closes with the search strategy. Chapter 4 describes the common behaviors of malicious software, which we use in our results to identify related behaviors. Chapter 5 describes the overall system design, its components and the order of execution. Chapter 6 gives an overview of the experiment design. Chapter 7 presents our experiment results and their comparison. The discussion follows in Chapter 8, the conclusion in Chapter 9 and future work in Chapter 10. The thesis ends with the references in Chapter 11 and the appendix in Chapter 12, which contains related material and the definitions and terminology used throughout the thesis.

CHAPTER 2

2 BACKGROUND

A lot of research is going on in the area of malware behavioral analysis and in protecting end users by informing them about software before installation. One of the most recent studies in this area proposes a "reputation system" [52] to inform and protect users before they install any software. It works as an online web service like IMDB³ [26] and uses user ratings to inform other users about a specific piece of software. The reputation system has been developed as a countermeasure against privacy invasive software (PIS) [2, 33]: it provides online ratings and reviews so that users gain knowledge about the software before installing it and infecting their computers with malware.

Studies on EULA based [34] analysis of malicious software have been conducted using data mining techniques. They alert the user to hidden behavior of the software, or to details about malicious software that is going to be installed together with legitimate software [28]. Because of the large amount of complicated text in a EULA, most users do not read it and accept it without knowing what is written in it.

Another similar study uses a web crawler to find malicious objects in executables found on conventional web pages on the internet. After obtaining a considerably large sample, the authors performed an automated analysis and concluded that online users are not safe anywhere: they are exposed to a large amount of malicious software while browsing the internet [5].

Currently there is no proper system or study that distinguishes legitimate from malicious software by doing runtime analysis with snapshots to find their characteristics and behavior. In this study we automated the runtime analysis of software to monitor its behavior after installation and un-installation, and then to identify the behavior that is considered malicious.

³ Internet Movie Database, http://www.imdb.com/

2.1 Related Work

Different tools exist for automatically analyzing the behavior of malicious software. They use different techniques to analyze malicious software through real time monitoring. Despite some similarities, our prototype has the advantage of providing a detailed behavior report with just one click of a button; it uses a snapshot based technique, and the whole analysis process is fully automated. The following tools are available, but not all of them use a snapshot technique, and they do not analyze the un-installation process:

The Norman Sandbox [44] simulates the entire computer and its connected network, then executes the malware inside the emulated environment. Norman can also run malware with a live internet connection. This kind of simulation is transparent to malware, which cannot detect that it is being executed within a simulated environment [13]. As a consequence, the malware cannot interfere with, infect or modify running processes, because no real process is running in the simulation. By not monitoring this aspect, valuable information might be missed. By using a real system environment, AMA can capture this interference as well.

GFISandbox, formerly known as CWSandbox [13], is a popular malware behavioral analysis tool that provides in-depth details about an executable. It creates a new process image for the malware binary to be analyzed and then injects a DLL into the target application's address space. With the help of the DLL, CWSandbox performs API hooking and sends all observed behavior via a communication channel to cwsandbox.exe [13]. It then uses system call sequence analysis to observe the behavior of the malware process and constructs a report by correlating the collected data [13]. Besides this, it also uses real time monitoring and a virtual environment to analyze malicious binaries.

A tool similar to GFISandbox is TTAnalyze [12]. It uses API hooking with the PC emulator QEMU instead of a virtual machine, which makes it harder for malware to detect that it is running in a controlled environment [13]. By watching the execution of the program in the emulated system, it can generate a report on the tracked Windows API functions called by the executable. A drawback of TTAnalyze is that the execution of a binary inside an emulator may differ from a real execution and may introduce errors into the execution process.

Chas Tomlin's Litterbox [32] uses a different approach in which malware is executed in a real Windows environment, but after 60 seconds of execution the host machine is forced to reboot into Linux. After booting, the Linux environment mounts the Windows partition and fetches the registry and a complete file list [13]. During the malware analysis the machine is connected to a virtual internet with an IRC⁴ server that responds actively to every incoming request. The tool captures all packets in order to analyze the network traffic.

Threat Expert [58] is another tool based on a simulated environment for analyzing malicious behavior. It places the executable in a self-contained simulated environment and executes it to monitor its behavior. A combination of file system, registry and memory snapshots is recorded. Furthermore, hooks intercept the routes that are exploited by the threat infection [58].

FARM (Forensic Analysis Repository for Malware) [27] integrates open source and commercial analysis tools into an automated system in order to reduce the time spent by analysts [27]. It uses reverse engineering (RE) to obtain the assembly code and applies automated scanners; furthermore, it uses different antivirus (AV) products to identify malware at runtime. Nowadays it is difficult to reverse engineer all applications, because many malware samples are encrypted and decrypt only the specific part being executed. Using AV to identify activity is an option for future integration into AMA. Because of its functionality, FARM can be categorized as a static analysis tool.

AMAS (Automated Malware Analysis Station) [6] is a Linux based analyzer that runs an executable inside VirtualBox and monitors the runtime changes made to the file system and registry. The whole analysis process requires continuous user interaction, for example while fetching the executable from source storage and running it inside the virtual environment [6]. Capturing runtime changes generates a lot of noise data that is not useful; AMA has the added advantage of reducing this noise data by taking a snapshot of the current state of the machine.

Anubis [8] provides a web based malware analysis utility. Running Anubis produces a report that contains information about the actions of the analyzed binary, including details about modifications made to the file system and registry and the traffic generated during execution [8].

The analyzers mentioned above use different analysis methods. Some represent static analysis techniques for identifying malicious software, such as EULA based identification using machine learning techniques, whereas others, such as FARM, combine static and dynamic analysis. All of the solutions mentioned above either run an executable or apply RE techniques. None of them clearly mentions installing and uninstalling applications automatically. Furthermore, not a single available tool provides a fully automated solution that is based on snapshot technology and performs automated installation and un-installation of Windows installers in order to analyze their behavior. AMA uses both methods to reduce the noise data and to keep the main focus on the specific changes made by a particular executable, with minimum user interaction.

⁴ Internet Relay Chat (IRC)

2.2 Available Solutions

A variety of real time analyzers are available, but none of them is a complete analysis system that analyzes all activities, is fully automated, and is based on snapshot technology. Most of them are Linux based and are not meant for ordinary end users: they require in-depth technical knowledge to configure the analysis environment and conduct the malware analysis. Some tools that can analyze malicious executables are described below.

2.2.1 Web based Analysis Tools

These web based interfaces provide free analysis of malicious binaries. The user submits a malicious binary to one of the available web based analyzers and receives a behavioral report after some execution time. The major drawbacks of such systems are that their internal functionality is hidden, so the user does not know how the analysis is performed, and that files larger than a specific size cannot be uploaded.

2.2.1.1 Joe Box

Joe Box [29] is a runtime malware analysis system. It uses emulated, virtual and native system environments for the analysis. The report generated by Joe Box provides details about the behavior of the malicious program and the internet communication performed during execution, and it keeps a memory dump of the malicious processes [29]. After registration, it provides a web interface for uploading malicious binaries.

2.2.1.2 Comodo

Comodo [14] is a provider of a wide range of security solutions for both home and business users, covering different kinds of applications such as antivirus, an internet security suite, e-mail and messaging security, and SSL certificates. It also provides a free online malware analyzer: the user submits a malicious binary through the web interface, and Comodo analyzes its behavior and returns a detailed report of the actions performed by the binary.

2.2.1.3 SysTracer

SysTracer [51] is a tool by Blue Project Software that scans and analyzes a computer to find changed (added, modified and deleted) data in the registry and files. It can take system snapshots and compare them to produce a difference report.
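The snapshot comparison idea can be illustrated with a minimal sketch. It assumes that a snapshot is simply a mapping from an item path (file or registry key) to a content fingerprint; the function name `diff_snapshots`, the sample paths and the fingerprints are hypothetical and do not describe how SysTracer or AMA store their snapshots.

```python
from typing import Dict, List, NamedTuple


class SnapshotDiff(NamedTuple):
    added: List[str]
    modified: List[str]
    deleted: List[str]


def diff_snapshots(before: Dict[str, str], after: Dict[str, str]) -> SnapshotDiff:
    """Compare two snapshots that map an item path to a content fingerprint."""
    added = sorted(path for path in after if path not in before)
    deleted = sorted(path for path in before if path not in after)
    modified = sorted(
        path for path in after if path in before and before[path] != after[path]
    )
    return SnapshotDiff(added, modified, deleted)


# Hypothetical snapshots taken before and after running an installer.
before = {r"C:\app\a.dll": "h1", r"C:\app\b.cfg": "h2"}
after = {r"C:\app\a.dll": "h1", r"C:\app\b.cfg": "h9", r"C:\app\c.exe": "h3"}
print(diff_snapshots(before, after))
```

Because only the two end states are compared, intermediate activity that leaves no trace in the final state does not show up in the difference report, which is one way to read the noise reduction attributed to snapshots.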

2.2.1.4 GFISandbox

GFISandbox [13], formerly known as CWSandbox, is a popular malware behavioral analysis tool that provides in-depth details about an executable. It creates a new process image for the malware binary to be analyzed and then injects a DLL into the target application's address space. With the help of the DLL, CWSandbox performs API hooking and sends all observed behavior via a communication channel to cwsandbox.exe [13]. It then uses system call sequence analysis to observe the behavior of the malware process and constructs a report by correlating the collected data.
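The principle behind API hooking can be shown with an in-process analogy: a wrapper records each call and then forwards it to the original function. This is only a sketch of the idea in Python, not the DLL injection mechanism used by CWSandbox; `create_file` is a hypothetical stand-in for an operating system API function.

```python
import functools
from typing import Any, Callable, List, Tuple

# Collected observations: (function name, positional arguments).
call_log: List[Tuple[str, tuple]] = []


def hook(func: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap func so each call is recorded before the original runs."""
    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        call_log.append((func.__name__, args))
        return func(*args, **kwargs)
    return wrapper


# Hypothetical stand-in for an operating system API function.
def create_file(path: str) -> str:
    return f"handle:{path}"


create_file = hook(create_file)  # install the hook
handle = create_file("C:\\temp\\x.txt")
print(handle, call_log)
```

The caller still receives the original return value, so the monitored program behaves as before while the log accumulates the call sequence that a report could later be built from.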

2.2.1.5 Norman

The Norman Sandbox [41] simulates the entire computer and its connected network, then executes the malware inside the emulated environment. It can also run malware with a live internet connection. This kind of simulation is transparent to malware, which cannot detect that it is being executed within a simulated environment [13].

2.2.1.6 Threat Expert

Threat Expert [58] is another tool based on a simulated environment for analyzing malicious behavior. It places the executable in a self-contained simulated environment and executes it to monitor its behavior. A combination of file system, registry and memory snapshots is recorded. Furthermore, hooks intercept the routes that are exploited by the threat infection [58].

2.2.2 Open source Tools

Open source tools are available for analyzing malware, but they are not fully automated and do not use snapshot based techniques for analysis.

2.2.2.1 Zero Wine

Zero Wine (ZW) is an open source research project that performs dynamic analysis of malware behavior [70]. It runs malware using WINE in an emulated, virtual, sandboxed environment to collect information about all APIs called by the program [70]. ZW is distributed as a QEMU virtual machine image with a Debian system installed. A user needs a technical background to run it properly and perform malware analysis, whereas AMA requires no special technical background for operating it and conducting behavioral analysis [70].

2.2.2.2 SysAnalyzer

SysAnalyzer [50] was developed by iDefense Labs for automated runtime analysis of malicious code. It collects data through real time monitoring and can monitor and compare running processes, open ports, loaded drivers, injected libraries, file modifications and network traffic. Real time monitoring can generate a lot of noise data, which makes the identification of malicious behavior difficult, and the tool requires continuous user interaction [50].

2.2.3 GUI Automation

The graphical user interface (GUI) is a major milestone of computer software development. Because of the diversity of existing GUI forms, their complexity and their increasingly growing scale [43], GUI automation is a very complicated logical analysis process; implementing efficient and intelligent GUI automation is therefore a great challenge [43]. The specific reasons are as follows [4] [3] [66]:

1. GUI software contains a large number of states.
2. Various forms of synchronization and dependence exist among GUI controls.
3. GUI software contains unpredictable states with different elements.

2.2.4 GUI Automation Tools

Some GUI tools are described below. They help to automate specific software by recording mouse clicks and replaying them as an automated run.

2.2.4.1 AutoIt

AutoIt [10] is a freely available BASIC-like scripting language designed for automating the Windows graphical user interface. It uses a combination of simulated keystrokes, mouse movements and window control manipulation to automate tasks [10]. In AutoIt the user writes the step by step operations that have to be performed on a specific GUI. Our requirement is to install different software with different behavior and different user interfaces (UIs), which cannot be handled with AutoIt.

2.2.4.2 Qaliber

Qaliber [48] is an open source test automation framework for automating the testing process. It includes a tool for developing automation in .NET and a GUI tool for composing automation [48]. It also depends on a specific GUI, for which all scripts and procedures have to be written.

2.2.4.3 Xnee

Like AutoIt and Qaliber, Xnee [24] can record and replay user actions. It can be used to automate tests and demonstrate programs. Unlike our auto installer, all of the above lack the dynamic functionality needed to handle most Windows installers and perform the required installation actions automatically.
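To make the contrast with record-and-replay tools concrete, the following sketch shows one hypothetical way an installer driver could decide at runtime which button to press, instead of following a fixed script. The priority list `PREFERRED` and the function `choose_button` are assumptions made for illustration; they are not taken from the AMA auto installer.

```python
from typing import List, Optional

# Assumed priority order of button captions that move an installer forward.
PREFERRED = ["i agree", "accept", "next", "install", "finish", "close"]


def choose_button(captions: List[str]) -> Optional[str]:
    """Pick the visible button most likely to advance the installation."""
    # Drop accelerator markers ("&") and decorations such as " >" before matching.
    normalized = {c.replace("&", "").strip(" >").lower(): c for c in captions}
    for wanted in PREFERRED:
        for text, original in normalized.items():
            if text.startswith(wanted):
                return original
    return None  # nothing recognizable: the caller has to handle this state


print(choose_button(["< &Back", "&Next >", "Cancel"]))
```

A decision rule like this reacts to whatever window is currently shown, which is the property that a recorded sequence of clicks for one specific GUI lacks.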

2.3 Static Analysis vs. Dynamic Analysis

Static analysis is a process in which an executable is analyzed by opening and reading the file, sometimes disassembling it to obtain code or generating control flow graphs, and finding anomalies in order to draw conclusions by these means [12]. Dynamic analysis, in contrast, is the process of analyzing an executable by running the program and monitoring its execution in order to draw conclusions [12]. During execution a program makes different changes to the system, which can include adding malicious software; creating, modifying or deleting system files and registry keys; and spawning new processes in the infected system to access confidential data or to damage the computer system.

Both approaches have strengths and weaknesses, and both are used effectively in the real world. A computer system can execute a program that creates some of its instructions at runtime, after running some prior instructions; these instructions are therefore not part of the executable file stored in the file system. Such software is known in the literature as a self-modifying program [12]. Static analysis has the advantage that it can view the complete program code and is usually faster than its dynamic counterpart. Its main weakness is that self-modifying programs and exe-packed executables are out of its scope [12]. Dynamic analysis, on the other hand, can see the traces left during program execution and track changes.
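A very small instance of static analysis is extracting readable strings from a file without executing it. The sketch below assumes printable ASCII runs of at least four characters are of interest; the function `extract_strings` and the sample bytes are hypothetical and only stand in for the contents of an executable.

```python
import re
from typing import List


def extract_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII characters at least min_len long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, data)]


# Hypothetical bytes standing in for the contents of an executable file.
sample = b"\x00\x01MZ\x90\x00http://example.invalid/x\x00\xffRunOnce\x00ab"
print(extract_strings(sample))
```

If the same file were packed or encrypted, such strings would only exist in memory after the program unpacks itself, which illustrates why exe-packed executables fall outside the reach of this kind of inspection.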

2.4 Identifying Legitimate and Illegitimate Software

This is an important and critical part, where we have to distinguish between legitimate and illegitimate software. Generating a false alarm, that is, identifying legitimate software as illegitimate, can raise trust issues. At present both categories lie close to the border line between them, which is why it is hard to distinguish them.

2.4.1 Behavior of Legitimate Software

We consider software legitimate if it does not install any unwanted hidden software along with it, does not display advertisements while installing, does not contain any hidden process that is installed without user consent, and does not make changes to critical areas and services of the system. Such software is legitimate because it does not perform any illegal or malicious activity.

2.4.2 Behavior of Illegitimate Software

Software that shows advertisements or installs additional software on the system can be categorized as illegitimate. For example, some software automatically downloads a toolbar or a third party application from the Internet during installation and runs other applications in the background. Malicious software also tries to register itself automatically among the startup programs in order to survive a reboot. Furthermore, it creates, modifies or deletes files and registry keys inside critical directories. Such symptoms are major and common behaviors of malicious programs (see Section 4 for more details).

2.5 Emulation

Emulation is typically defined as "A process whereby one computer is setup to permit the execution of programs written for another computer, this is done with hardware features and software" [25]. A personal computer (PC) emulator is software that emulates a physical PC, including its processor, graphics card and other hardware components [12].

2.6 Virtualization

Virtualization is a constantly growing research topic and a technology that has been around for over 40 years in the area of mainframe computers [12]. Virtualization is commonly understood as the process of creating one or several virtual machines, where "a virtual machine is taken to be an efficient, isolated duplicate of the real machine" [22, 12]. Put simply, a virtual machine is software that behaves like a real physical machine. The software that manages the creation and operation of virtual machines is called a virtual machine monitor (VMM) or virtualizer. According to [22], a VMM has three essential characteristics:
1. The VMM provides an environment for programs that is essentially identical with the original machine.
2. Programs run in this environment show at worst only minor decreases in speed.
3. The VMM is in complete control of system resources.
A program that runs on a virtual machine runs in almost the same manner as on the original machine, because the created environment is almost identical and contains all modules necessary to run the application in the same way. It is very important to keep the analysis in an isolated environment to prevent any possible infection from spreading to the outside world. To achieve this isolation we used VirtualBox [45], which provides a prominent and reliable virtual environment. The most significant advantage of VirtualBox is that it is available as an open source tool and provides robust performance compared to its counterparts. Other virtual machine solutions are available, such as VMware and Microsoft Virtual PC [41, 63].

2.6.1 Classification of Virtualization

There are two classifications of virtualization techniques according to their characteristics.

Virtualization vs. Para-virtualization: Virtualization creates machines that are "efficient and isolated duplicate of real machine" [12], i.e. the virtual machine runs inside a real operating system as another machine. Para-virtualization is a different technique: it runs directly on the hardware to achieve greater performance, instead of running inside an operating system [12]. Xen [46] is a good example of para-virtualization.

2.6.2 VMM Advantage

Virtualization technology provides many advantages. The most important in our case are isolation and low overhead, which are described here in detail.

Isolation: A VMM can be used to create multiple virtual machines on a single host operating system running on the same computer hardware [12]. Although the virtual machines run on the same operating system and hardware, they are isolated from each other, and the operating system running in one virtual machine does not notice the presence of another virtual machine on the same computer [12].

Low overhead/high performance: The overhead caused by a VMM amounts to a very small percentage, because in most cases virtual hardware is directly mapped to the real hardware [12]. Most importantly, instructions on the virtual processor are mostly mapped directly to the real processor for direct execution, which increases the efficiency of the virtual machine [12]. Moreover, a virtual machine uses a small amount of resources, and these are configurable according to the nature of the machine. For example, the RAM allocated to it can be increased or decreased depending on the amount of memory available on the real system.

2.7 Difference between Virtualization and Emulation

It is important to differentiate emulators from virtualizers, because they may look alike but work very differently. PC emulators and virtualizers can both run an unmodified operating system on top of another operating system, but PC emulators go one step further: they simulate all instructions in software [12]. Virtualizers, in contrast, execute the dominant share of instructions directly on the real processor, which gives them better performance than emulators. Emulators, however, have one major advantage: they can imitate hardware that is not actually present, so they can execute instructions written for hardware other than the hardware they are running on. This feature is not available in virtualizers. Bochs [31] and QEMU [19] are examples of widely used emulators, and VirtualBox, VMware [64], PeerPC [47] and Virtual PC [42] are well-known virtualizers.

2.8 Oracle Virtual Box

Oracle VirtualBox [45] is a general purpose full virtualizer for x86 hardware. It provides different functionalities, such as creating multiple isolated virtual machines that can run different operating systems, and taking a snapshot of the complete virtual hard disk that can be restored to its saved state at any time. It is available as open source, which means that its source can be modified according to one's needs. It also includes a command line management interface that can do nearly everything that can be done with the GUI [45]. Because of these functionalities we decided to use VirtualBox for our prototype, where we need a responsive and easy to handle interface to manage the virtual machine. VBoxManage [45], the command line interface of VirtualBox, provides everything we need in one package: it offers commands to start and stop the virtual machine and to restore it to its clean state, which is very important in our case. Beyond this, VBoxManage commands provide access to all other capabilities of VirtualBox.
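The start, stop and restore operations mentioned above can be sketched as thin wrappers around the VBoxManage command line. This is a minimal sketch, not the prototype's code: the thesis does not list the exact commands it used, the VM name and snapshot name below are hypothetical, and the available options may differ between VirtualBox versions.

```python
import subprocess

VBOXMANAGE = "VBoxManage"  # assumption: the executable is on PATH


def start_vm_cmd(vm_name, headless=True):
    """Build the command that boots a virtual machine."""
    cmd = [VBOXMANAGE, "startvm", vm_name]
    if headless:
        cmd += ["--type", "headless"]
    return cmd


def poweroff_vm_cmd(vm_name):
    """Build the command that powers the virtual machine off."""
    return [VBOXMANAGE, "controlvm", vm_name, "poweroff"]


def restore_snapshot_cmd(vm_name, snapshot_name):
    """Build the command that reverts the VM to a saved (clean) snapshot."""
    return [VBOXMANAGE, "snapshot", vm_name, "restore", snapshot_name]


def run(cmd, dry_run=True):
    """Return the command line; execute it only when dry_run is False."""
    if not dry_run:
        subprocess.run(cmd, check=True)
    return " ".join(cmd)
```

With `dry_run=True` the helpers only show what would be executed, for example `run(restore_snapshot_cmd("AMA-XP", "clean"))`, which makes it possible to check the command sequence without a VirtualBox installation.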

CHAPTER 3

3 RESEARCH DESIGN

3.1 Research Motivation

Malicious software is becoming increasingly destructive, not only for end users but also for organizations and service providers. It invades privacy and consumes precious computer resources, causing not only performance and productivity problems but also serious security issues for users' secret data and computer resources. Techniques used to analyze software are mostly manual and not very effective, because as detection and prevention techniques become more sophisticated, malicious software also adopts advanced techniques to escape from security applications. It now comes in many different flavors and is sometimes integrated inside legitimate software. Even some legitimate software performs activities that are hidden from or not well communicated to users, e.g. collection of personal data or usage of computer resources. As a result, it is becoming hard or impossible for security applications (e.g. anti-malware) to discover them. Manual analysis of such software is very time consuming and requires in-depth knowledge of malicious software behavior, which may be known to security researchers but not to system administrators or, especially, end users who want to check whether the software they intend to install is safe to use.

There is a need for a more generalized solution that can automatically analyze any number and type of software and present a fully customized analysis report that can be used to classify the software as safe or unsafe, i.e. to inform the user about the software's behavior and activities. The proposed solution automates the analysis of software at runtime. No user interaction is needed: our application automatically selects the next sample, installs and uninstalls it, captures its activities and generates a detailed report. Any number of samples can be analyzed, and a detailed analysis report is prepared for each of them.

3.2 Aims and Objectives

The aim of the study is to automate the runtime analysis of software in order to analyze the behavior of malicious software. This behavior includes the changes made in different areas of the system during and after installation and uninstallation, e.g. creation or deletion of files or registry keys and generation of network traffic. The following objectives will guide us to achieve the intended aim.
• A model to automate the analysis of malware will be proposed.
• The proposed model will be implemented.
• Experiments will be conducted to evaluate the performance of the proposed model.


This study will help users to distinguish legitimate and illegitimate software so they can select software that is free from any kind of malicious activity.

3.3 Research Question

1. To what extent can automatic runtime analysis be used to analyze software characteristics?
2. Does the automated analysis model lead to more accurate capturing of system changes compared to existing solutions?

3.4 Research Method

In this study the constructive research method is used, which implies finding a solution to a real world problem. It consists of two phases: construction of an artifact, which can be practical or theoretical, and evaluation of that artifact. In this type of research, fuzzy information is acquired from different sources such as literature reviews, practical experience or processes, and serves as theoretical knowledge. Based on the gained theoretical body of knowledge, an innovative construct, framework or solution is derived for the problem at hand [15]. The newly produced construct adds to the practical and epistemic boundaries of knowledge [69]. In this study the artifact is an automated analysis system that analyzes software at runtime for its specific behavior, and the targeted knowledge is the performance of the system, that is, how accurately the developed system can capture the behavior of analyzed software.

This study consists of three parts: a background study, modeling of an automated software behavior analysis system, and finally quasi experiments to evaluate the developed automated analysis model. All parts are presented here in detail.

First, a thorough analysis is performed to enhance our knowledge about malicious software: why software is labeled as malicious and what makes it malicious, including its distinct behavior. Behavior covers the activities that make software malicious and how it differs from legitimate software. In the next part we studied different analysis techniques used to analyze software behavior. We examined how behavior analysis techniques capture software behavior and, more importantly, how this process can be automated. The possibility of automating the whole analysis process is then investigated. Based on this study, a model is developed that can be followed to construct an automated analysis system to capture software behavior.

[Figure: stages Background Study, Modeling (Implementation of Prototype) and Experimentation (Evaluation of model through quasi experiments)]

Figure 1 Research Methodology

In the next part of the study we evaluated our proposed automation model by quasi experiment. A prototype system is implemented based on the proposed model, as proof of concept, which automates the runtime analysis of any software to capture its behavior. Experiments are performed to evaluate the performance of the system, that is, how effectively it captures software behavior. We used a quasi experiment design [1] because a full experiment design does not fit this situation: we cannot perform proper randomized subject selection. Experiments of this kind are less common and are considered less effective, but they have their own benefits [1, 65]. We used legitimate software and malware in this experiment. The malware samples were obtained from the malware research lab at BTH, and the legitimate software was downloaded from reliable online resources, i.e. www.download.com [16]. Experiment results are analyzed to assess the effectiveness of the proposed automated analysis model.

3.5 Background Study

A thorough background study is the procedure to identify, analyze and interpret the available research related to the research questions. The purpose of this study is to search the available research related to our identified questions (see Section 3.3) and then narrow down our topic for conducting research. The method used to evaluate the search results, in order to scrutinize the research material, select the appropriate material and exclude unwanted material, is also discussed here.

Keywords: Malware, automated analysis, runtime analysis, dynamic analysis, characteristics, behaviour, detection, malicious behaviour

3.5.1 Search Strategy

We followed these steps to search the available online databases and e-journals. We had already identified our research questions, so we used them to extract keywords that are directly related to the research questions. In the next step, we used the keywords and alternative keywords identified in the previous step to search through the available data sources of scientific research material (e.g. articles and journals). We searched the following databases:
• ACM Digital Library
• IEEE Xplore
• Scopus
• Google Scholar
• Inspec/Compendex
• ISI Web of Science
These databases cover the major scientific literature published to date. All keywords are combined in the search using Boolean operators such as 'AND' and 'OR'. We restricted our search to literature published between 2000 and 2010. The table below shows the hits we found in each database.


Date        Database             Hits  Reflections
2010-10-12  IEEE Xplore          55    English
2010-10-12  ACM Digital Library  67    English
2010-10-12  Engineering Village  95    English
2010-10-12  Scopus               38    English
2010-10-12  ISI Web of Science   40    English
2010-10-12  Google Scholar       451   English, too many hits

Table 1 Database Search Results
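The keyword combination described in the search strategy can be illustrated with a short sketch. The thesis does not give its exact search strings, so the grouping of keywords into concepts below is an assumption made only for illustration: synonyms are joined with OR, and the resulting groups are joined with AND.

```python
def build_query(concept_groups):
    """Join synonyms with OR inside a group, and groups with AND."""
    parts = []
    for group in concept_groups:
        # Quote multi-word terms so they are searched as phrases.
        quoted = ['"%s"' % term if " " in term else term for term in group]
        parts.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(parts)


# Hypothetical grouping of the keywords listed in Section 3.5.
groups = [
    ["malware", "malicious behaviour"],
    ["automated analysis", "runtime analysis", "dynamic analysis"],
]
query = build_query(groups)
```

Each database has its own query syntax, so a string built this way would still need adapting before use in a particular search interface.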

Our literature search resulted in a large amount of research material, which was then filtered according to the following inclusion, exclusion and appraisal criteria. We selected only research material that is relevant to the research questions under consideration. First we eliminated duplicate hits found in different databases, and then we filtered the material by title, which generally gives an idea of what to expect in the article under consideration. After that we studied the abstract and conclusions of each article, which resulted in a considerable decrease in the amount of material. We excluded research material:
• which does not discuss a topic related to any of the research questions
• which is not available in full text
It is also important to evaluate the quality of the selected articles. This evaluation is based on the structure of the article, the presentation of its contents and the understandability of the research material, and is done by careful reading. We started with a large amount of material, and after applying rigorous filtering we ended up with 49 articles.

CHAPTER 4

4 CHARACTERISTICS OF MALICIOUS SOFTWARE

Malicious software (or malware) is a major threat in the modern era of information technology. The increasing number of threats and of techniques with which malware is built is alarming for the security of computer systems and users' privacy. To mitigate this threat, researchers have to observe malicious behavior in order to identify malware and provide remedies accordingly. Unfortunately, malware analysis has to face hiding techniques such as polymorphism and obfuscation [30]. By using such techniques malware can change its behavior or pretend to be legitimate. Our aim is to understand malicious behavior, that is, how malware behaves and how it launches attacks. This understanding is important to develop effective countermeasures. Ulrich Bayer and Imam Habibi [59] have identified a collection of the most common behaviors from a vast number of malware analyses using Anubis [8]. The publicly available web interface of Anubis receives malware samples from all over the world. It also gets binaries from a number of security organizations and anti-malware companies [30]. Behavior patterns were collected over two years, during which Anubis analyzed almost one million unique binaries (based on their MD5 file hashes) [30].

4.1 Malware Sources

As stated in [59], Anubis received samples from more than 120 different countries. Table 8 in Appendix B (see 12.1.1) shows the behavior observed in this large collection of malware. This can help researchers to identify or cross-check malicious activity more effectively and easily. Using this existing study, we can evaluate our results by checking whether our analyzed software shows similar common behavior.

4.2 Observed Malicious Behavior

In this part we discuss the file, registry and network activity that was observed in [59, 60] when analyzing Anubis submissions. The goal is to provide insight into malware behavior that is common among a variety of malware programs. An overview of this behavior is shown in Appendix B (see Table 8).

4.2.1 File System Activity

From Table 8 in Appendix B (see 12.1.1) we can see that a large number of malicious samples (42.57% - 79.87% of all binaries) [59] make changes to the file system, i.e. creation, deletion and modification of files. While analyzing the created files in depth, Ulrich and Habibi found that most malware copies its executable to a known location (such as the Windows system folder). Some samples try to install their executable file in the Windows folder or its subfolders, such as the user's account folder. This kind of activity shows that malware wants to run its executable as a privileged user [59]. According to [59], modifications to existing files are somewhat less interesting, because the majority of this activity is performed by the event logger on the system audit file [59]. According to Ulrich Bayer, however, some malware tries to put its infected files inside the utilities folder in the system folder or in Internet Explorer (IE). Ulrich also examined the deleted files in more detail and found that most delete operations recorded by the event logger concern temporary files that the malware created during execution. They also found that some malware tries to clear its traces from the system by removing the event log file [59].

4.2.2 Registry Activity

According to Table 8, almost 64.71% of malware samples create registry entries. According to Ulrich, the created registry keys that are related to objects registered with Windows are identified as benign [59]. However, some malware keeps the IDs of its components unchanged, and these IDs are very helpful for identifying the presence of malware of certain families. Their findings show that some malware samples created an entry under the key SystemCertificates\TrustedPublisher\Certificates. By doing this, malware tries to register itself by adding its own certificate. The research in [59] shows that there are specific portions of the registry where most malware makes changes. Table 9 in Appendix B (see 12.1.2) shows an excerpt of the top ten locations that malware targets [59]. An important key that contains instructions to auto-start software at startup [59] is also listed in Table 9 (see Appendix B 12.1.2). This key allows malware to survive a reboot and run at startup. Windows service registration keys, in turn, are collected in the Services registry key section [59].
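Checking created registry keys against such sensitive locations can be sketched as follows. This is an illustrative sketch only: the watch list below is an assumption based on the locations named in the text and on the well-known Windows auto-start and service keys, and it is not the content of Table 9. The sample key paths are hypothetical.

```python
# Assumed watch list: fragments of registry paths and why they matter.
WATCHED = {
    r"software\microsoft\windows\currentversion\run": "auto-start at startup",
    r"system\currentcontrolset\services": "Windows service registration",
    r"systemcertificates\trustedpublisher\certificates": "trusted publisher certificate",
}


def flag_registry_keys(created_keys):
    """Return (key, reason) pairs for keys that fall under a watched location."""
    findings = []
    for key in created_keys:
        lowered = key.lower()  # registry paths are case-insensitive
        for fragment, reason in WATCHED.items():
            if fragment in lowered:
                findings.append((key, reason))
                break
    return findings


# Hypothetical keys as they might appear in a snapshot comparison.
sample_keys = [
    r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Run\updater",
    r"HKCU\Software\ExampleApp\Settings",
    r"HKLM\SYSTEM\CurrentControlSet\Services\examplesvc",
]
findings = flag_registry_keys(sample_keys)
```

A match only marks a key for closer inspection. Legitimate installers also write to these locations, so a finding by itself does not classify the software as malicious.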

4.2.3 Network Activity

Another important way to identify the presence of malicious software is to monitor network traffic. Most malware families try to connect to the outside world to download data, to upload data from the infected system, or to receive commands. Much malicious software tries to connect to an HTTP (Hypertext Transfer Protocol) or IRC (Internet Relay Chat) server for further communication [60]. Any unknown and unwanted traffic can be a clue that something is wrong. We then have to go into detail to identify the reason for the generated traffic and try to find traces, if any.
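A first pass over captured connections can be sketched as a simple grouping by process and protocol. The record format used here, a tuple of process name, remote address and remote port, is a hypothetical simplification and not the output format of any specific tool. The port numbers are the conventional defaults (80 for HTTP, 6667 for IRC), and traffic on other ports is only labeled "other".

```python
from collections import Counter

# Conventional default ports; servers may of course listen elsewhere.
KNOWN_PORTS = {80: "HTTP", 6667: "IRC"}


def summarize_connections(connections):
    """Count outgoing connections per (process, protocol label)."""
    counts = Counter()
    for process, remote_ip, remote_port in connections:
        label = KNOWN_PORTS.get(remote_port, "other")
        counts[(process, label)] += 1
    return dict(counts)


# Hypothetical records; 192.0.2.0/24 is a documentation-only address range.
observed = [
    ("setup.exe", "192.0.2.10", 80),
    ("setup.exe", "192.0.2.10", 80),
    ("helper.exe", "192.0.2.77", 6667),
    ("helper.exe", "192.0.2.78", 4444),
]
summary = summarize_connections(observed)
```

Such a summary only points the analyst to processes that generate unexpected traffic. Deciding whether the traffic is malicious still requires the detailed inspection described above.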

4.2.4 Incomplete Un-installation Process

It is alarming if software leaves traces in the system, that is, if it does not remove all of its parts or clear all entries that it made at installation time. Attackers often attach their malicious application to legitimate software, so that it gets installed together with the good software. When the user uninstalls it, the legitimate software removes all of its traces, but the malicious software that was hosted on it remains. Moreover, real-time monitoring is not able to find such hidden malware during the un-installation process. By using snapshots we can easily track the remaining elements, whether the software modified, deleted or created system files and registry keys. Any activity performed in addition to that of the actual software can be suspicious and needs further in-depth analysis. Table 10 in Appendix B (see 12.1.3) shows the triggering conditions under which we can suspect software of performing malicious activity.
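The snapshot comparison described above can be sketched as a set difference between the system state before installation and the state after uninstallation. This is a minimal sketch under the assumption that a snapshot can be represented as a mapping from an item path (file or registry key) to a fingerprint of its content. It does not reproduce the format or behavior of the snapshot tool used in the prototype, and the paths are hypothetical.

```python
def diff_snapshots(before, after):
    """Compare two snapshots given as {path: fingerprint} dictionaries."""
    before_paths, after_paths = set(before), set(after)
    return {
        # Items that exist only after the run: leftovers if taken post-uninstall.
        "created": sorted(after_paths - before_paths),
        "deleted": sorted(before_paths - after_paths),
        "modified": sorted(
            p for p in before_paths & after_paths if before[p] != after[p]
        ),
    }


# Hypothetical state before installation and after uninstallation.
clean_state = {r"C:\WINDOWS\system32\drivers\etc\hosts": "a1", r"C:\boot.ini": "b2"}
after_uninstall = {
    r"C:\WINDOWS\system32\drivers\etc\hosts": "ff",
    r"C:\boot.ini": "b2",
    r"C:\WINDOWS\system32\helper.dll": "c3",
}
leftovers = diff_snapshots(clean_state, after_uninstall)
```

If uninstallation were complete, all three lists would be empty. Any remaining entry is a candidate for the further in-depth analysis mentioned in the text.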

CHAPTER 5

5 DESIGN AND IMPLEMENTATION OF AUTOMATED MALWARE ANALYZER (AMA)

In this section, we describe the design and implementation of AMA. First, we present the system architecture, that is, how AMA works and the internal structure and operations of the system. The next section contains a detailed description of the system components and their relations to each other. Then we describe the virtual environment and what happens inside the virtual machine. Finally, we describe the order of execution, that is, how everything is synchronized and works together without components interfering with each other.

5.1.1 System Architecture

AMA is a fully automated tool that analyzes applications for their activities during and after installation and uninstallation. The aim of the analysis is to install the application in an isolated environment (a virtual machine in our case), monitor the different system resources it uses during and after installation, check whether it performs any kind of malicious activity, and present a detailed results report to the user. Figure 2 shows the basic function of AMA, where a user wants to check whether an application is safe before installing it on his computer. A broader view of AMA is given below:

[Figure: AMA Host (Application Handler, Report Handler, Virtual Machine Handler) with local DB and local HD, where the user can handle and view reports; Virtual Box running Windows XP SP3 with AMA Guest (Auto Installer, Auto Un-installer, Windump, Process Monitor, Supermon, Fport)]

Figure 2 Broader View of AMA

AMA has a clean and simple system architecture. A simpler architecture makes development easier and internal operations more transparent, which gives better control over every component and eventually leads to better results. To develop AMA we used some freely available tools to monitor the file system, registry, kernel services, processes and open network connections. SupermonX [49] is used to create snapshots of the file system, registry and kernel services. Fport [20] is used to monitor all open TCP/IP and UDP ports and map them to the owning application. Windump [68] is used to monitor all network traffic. We could have developed our own components to perform all monitoring, but that would have been a case of "reinventing the wheel", because there are many small open source tools available to perform each task, so we decided to use existing tools. Selecting and testing appropriate tools nevertheless took considerable time, and integrating these tools is complicated. However, we were unable to find an automatic application installer that can automate the installation of applications. We therefore developed our own component, which is capable of installing most applications that use known installation systems such as Microsoft Installer, Wise Install and InstallShield, without the need for any user interaction. Figure 3 shows the overall system design, components and work flow.

[Figure: AMA Host (Virtual Box Controller, shared folder on host system, reports list in GUI, local database and local HD) and AMA Guest inside Virtual Box on Win XP SP3 (Auto Installer, Auto Un-installer, Supermon with Snapshot1 and Snapshot2, Fport, Windump, Process Monitor). Work flow steps are numbered 1 to 8: the user adds executables, a shared folder is created and the executables are copied to it, Virtual Box and AMA Guest are started and controlled, snapshots are saved, all reports are merged into a single report, the final report is saved, and reports are shown in the GUI list.]

Figure 3 AMA system components and work flow


AMA has two main components, AMAHost and AMAGuest. As the names suggest, AMAHost resides in the host operating system and AMAGuest resides in the guest operating system inside the virtual machine. Since we intended to study applications on the MS Windows platform, the most used and most attacked platform, we decided to use Windows XP SP3 (footnote 7) as the guest operating system. For the host we also used Windows XP SP3, but AMAHost can run on any Windows flavor based on Windows NT. AMAHost controls all AMA operations. It is responsible for managing all tasks assigned to AMA, which includes managing the applications to be analyzed, controlling the virtual machine, managing the reports received from AMAGuest and, most importantly, controlling AMAGuest, which is the central machinery of the system. AMAGuest is the main component; it does the actual analysis work and sends a detailed report back to AMAHost. Internally it is densely packed, consisting of many smaller components that perform their intended jobs synchronously on call. As can be seen in Figure 7, it has a component called SupermonX [49], which is responsible for taking snapshots of the file system, registry and kernel services when asked to. It has the Auto Installer, which takes an application installer (.exe) file and installs that application by itself without any user interaction, which is a very important and difficult part of the analysis process. Auto Installer works with most of the widely used Windows installer programs. Fport [20], which captures all open TCP/IP and UDP ports and maps them to their owning application along with its complete path, is also part of AMAGuest. Another component of AMAGuest is Windump (footnote 8) [68], which is used to capture network packets, filter them and obtain statistics about network traffic. It uses WinPcap (footnote 9) [68], which enables Windump to capture network traffic from the network interface.
Last but by no means least comes the Reporter, the component that gathers the information scattered across the many reports and log files generated by the different components of AMAGuest, compiles it into a clean, high level report and transfers it to AMAHost. The communication between AMAHost and AMAGuest is achieved through two features of VirtualBox [45] (the virtual machine used in AMA): shared folders and the VBoxManage command line interface.
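The role of the Reporter can be illustrated with a small sketch that merges the output of several tools into one sectioned report. The report layout, the section titles and the sample entries are hypothetical. The thesis does not specify the report format at this point, so this only shows the general idea of compiling scattered logs into a single document.

```python
def merge_reports(application, sections):
    """Combine per-tool findings ({title: [lines]}) into one text report."""
    lines = ["Analysis report for " + application, ""]
    for title in sorted(sections):
        lines.append("== " + title + " ==")
        entries = sections[title]
        # Keep empty sections visible so missing activity is explicit.
        lines.extend(entries if entries else ["(no activity recorded)"])
        lines.append("")
    return "\n".join(lines)


# Hypothetical findings collected from the individual tools.
report = merge_reports(
    "example_setup.exe",
    {
        "Snapshot differences": ["created: C:\\Program Files\\Example\\app.exe"],
        "Open ports": [],
        "Network traffic": ["2 packets to 192.0.2.10:80"],
    },
)
```

Sorting the section titles keeps the layout stable between runs, which makes reports for different applications easier to compare.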

5.1.2 Order of execution

In this section the order of execution is explained. It is important to maintain a smooth order when performing the different operations inside AMA, because many functions depend on the output of others; if that output is not available, the system can break down. Figure 4 shows the sequence diagram of AMA functions.

7 SP3: Service Pack 3
8 Win-Dump: Network traffic capturing utility
9 Win-Pcap: Provides an interface to capture network traffic by using windump.

Figure 4 Sequence Diagram

The first step is taken by the user, who would like to know about an application's activities and adds (1) some applications to the main interface of AMA, which is AMAHost, by clicking the browse button and selecting the applications to be analyzed. (The numbers in parentheses refer to steps 1 to 8 marked in Figure 3, which show the order of execution.) When the user presses the start button, the actual functions of AMA are triggered. First of all, AMAHost makes a list of the applications to be analyzed, and then for the first application it starts the virtual machine (3). After the virtual machine is started, AMAHost runs a command inside the virtual machine to mount the shared folder and copies the first application to this folder (2). After the application is copied, AMAHost again runs a command inside the virtual machine OS to start AMAGuest (4) and provides the name of the application as a parameter. Once AMAGuest has started, AMAHost waits until the analysis is finished and the reports are available.

AMAGuest starts, takes the application name from its argument and checks whether the application is available in the shared folder. If it is, AMAGuest starts the analysis by triggering SupermonX, which takes a snapshot (6) of the file system, registry and services. After the snapshot is taken, AMAGuest starts the process watcher and Windump and triggers Auto Installer to install the application. While Auto Installer installs the application, process activities and network activities are captured and written to log files (6). When installation finishes, the process watcher and Windump are stopped and SupermonX is triggered once again to take a second snapshot, compare the first snapshot with the second and generate a report of the differences it found (6). After that the Reporter is called, and all reports are merged into one final report (5) and transferred to the shared folder (7), then from the shared folder to AMAHost (8).

For application uninstallation the same process is repeated, except that this time Auto Installer uninstalls the application installed in the first part of the analysis; the reports are again merged by the Reporter at the end and copied to the shared folder. At the end of the analysis, when all reports are available in the shared folder, AMAGuest signals to AMAHost that the analysis is done by creating a file in the shared folder. When AMAHost finds that file in the shared folder, it looks for the reports, copies them to a specific location and reverts the virtual machine OS to its clean state. If there are other applications in the queue to be analyzed, the same process is repeated for each of them. This is how AMA performs all its operations, which are carefully designed so that no two components conflict with each other.
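The file-based signalling between guest and host described above can be sketched as a polling loop on the shared folder. This is a minimal sketch and not the prototype's implementation, which is written in C#. The marker file name, the report file suffix, the timeout and the polling interval are all assumptions chosen for illustration.

```python
import os
import shutil
import time


def wait_for_marker(shared_dir, marker_name="analysis.done", timeout=600.0, poll=1.0):
    """Poll the shared folder until the guest's marker file appears or time runs out."""
    marker = os.path.join(shared_dir, marker_name)
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(marker):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)


def collect_reports(shared_dir, dest_dir, suffix=".txt"):
    """Copy report files from the shared folder to the host's report directory."""
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(shared_dir)):
        if name.endswith(suffix):
            shutil.copy2(os.path.join(shared_dir, name), os.path.join(dest_dir, name))
            copied.append(name)
    return copied
```

On the host side, a call to `wait_for_marker` would be followed by `collect_reports` and then by reverting the virtual machine to its clean state before the next application in the queue is processed. The timeout guards against a guest that never signals completion, a case the text does not discuss.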

5.1.3 Components

AMA is basically divided into two main components, which control the system operation and integrate the different tools. They are located in two different places: in the host OS and in the guest OS inside the virtual machine. The component in the host OS is accordingly named AMAHost; it is the main control center of AMA and is responsible for managing the applications to analyze, controlling the virtual machine and generating reports. The other component is AMAGuest; as the name suggests, it resides in the guest OS inside the virtual machine. It controls SupermonX, which creates snapshots of the file system, registry and kernel services; AutoInstaller, which we developed to automate the application installation process; Fport, which reports all open TCP/IP and UDP ports and maps them to the owning application; and Windump, which captures network traffic. In the following, all components are explained in detail.

AMAHost: AMAHost is the main controller of the system. We developed this component ourselves using C#. It resides in the host OS (Windows XP) and controls all operations of AMA. Its operations consist of managing the applications and reports and controlling the virtual machine. It manages all the applications given to the system for analysis and shows them in a list. The user can modify this list afterwards to remove one or more applications or to add more applications to analyze. AMAHost has a Windows Forms based GUI that is simple and easy to use. To add applications, the user just needs to click the browse button on the main form or the add button in the main menu; a file browser appears and the user can add one or more applications. AMAHost consists of further components that perform their intended operations on call.


Figure 5 AMA Host User Interface

Application Handler: When the user adds applications to analyze, the Application Handler gets their full paths and adds them to a list with the application name and path.

Reports Handler: It is responsible for getting the generated reports from the shared folder and saving them to the specified directory. It also adds a link to each report in the list on the main window of AMAHost, next to the name and path of the application to which the report belongs.

Virtual Machine Handler: It is responsible for all operations that have to be performed on the virtual machine, mainly reverting the virtual machine OS to its clean state after the analysis of an application. Other tasks include starting and stopping the virtual machine. These handlers can be seen in the broader view of AMA (see Figure 2).

AMAGuest: AMAGuest is the second main component of our prototype. It runs inside the virtual machine, in the guest OS (Windows XP), and controls all operations performed there. It is a console-based C# application that we also developed ourselves. It manages SupermonX, which we use to create snapshots of the file system, Windows registry and kernel services, and the Auto-Installer, which we wrote ourselves and which installs any given application. It also manages Fport and WinDump, which are used to monitor all open TCP/IP and UDP ports and the network traffic respectively. The Reporter is also part of AMAGuest; it collects all the information produced by SupermonX, Fport and WinDump, sorts it and produces a high-level report. AMAGuest is a very important component of AMA because it is active for most of the analysis: it gets the application from the host, takes the first snapshot, installs the application, takes the second snapshot, gets information from Fport and WinDump, builds the final report and sends the reports back to the host. The most difficult part is controlling and synchronizing all the different components so that each one takes part in the process at its specific time; if this order is broken for any reason, the system fails, resulting in wrong or irrelevant information in the final report. AMAGuest consists of the following subcomponents.

Auto-installer: We implemented it to automate the installation of applications. It is written in C# as a console application. It is a vital building block of AMA because all the monitoring is focused on application installation. We found no available tool that automates application installation. Different applications use many different kinds of installers, each with its own behavior. We analyzed most of these installers and developed this automation component so that it can handle most standard installation programs. It is also able to uninstall applications in an automated manner. Figure 6 shows the auto installer running while it installs an application automatically.

Figure 6 Auto Installer Test Run

SupermonX: SupermonX [49] is an open source utility that creates snapshots of the file system, registry, and kernel drivers and services of Windows PCs. Using these snapshots, SupermonX can report or verify the changes made to the system by program installations or other activity. Figure 7 below shows the working state of all components, including SupermonX.
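The snapshot comparison idea can be illustrated for the file system alone with a minimal Python sketch. This is not SupermonX's implementation; the function names and the choice of (size, SHA-1) as the per-file fingerprint are assumptions made for illustration, and registry and service snapshots are left out.

```python
import hashlib
import os


def take_snapshot(root):
    """Map each file's path relative to root to (size, SHA-1 of its content)."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as handle:
                digest = hashlib.sha1(handle.read()).hexdigest()
            snapshot[os.path.relpath(full, root)] = (os.path.getsize(full), digest)
    return snapshot


def diff_snapshots(before, after):
    """Classify paths as created, deleted or modified between two snapshots."""
    old, new = set(before), set(after)
    return {
        "created": sorted(new - old),
        "deleted": sorted(old - new),
        "modified": sorted(p for p in old & new if before[p] != after[p]),
    }
```

Taking one snapshot before and one after an installation and passing both to `diff_snapshots` yields the three lists that a difference report of this kind is built from.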


Figure 7 SupermonX, Auto Installer and other tools working in AMA Guest

Fport: Fport [20] is a free tool that reports all open TCP/IP and UDP ports and maps them to the owning application. This is the same information that the "netstat -an" command provides, but Fport also maps those ports to running processes with the PID, process name and path. Fport can be used to quickly identify unknown open ports and their associated applications.

WinDump: WinDump is the Windows version of tcpdump. It captures network traffic through the WinPcap packet capture driver, which adds raw packet capturing capability to the Windows kernel. It also provides some functions to develop network test and monitor programs. We selected WinDump because of its high capturing performance and flexibility. WinDump [68] includes the following functions:

• Capture the raw traffic from the network and pass it to a user level application.
• Filter the incoming packets by executing BPF¹¹ pseudo-machine code. This means that the capture application can define a standard BPF program and pass it to the driver. The driver will discard the incoming packets that do not satisfy the filter.
• Hold the packets in a buffer when the application is busy or is not fast enough to sustain the flow of packets coming from the network.
• Collect the data from several packets and return it as a unit when the application does a read. To maintain packet boundaries, packets are encapsulated in a header (the same used by BPF) that includes a time stamp, length, and offsets for data alignment.
• Write raw packets to the network.
• Calculate statistics on the network traffic.

Reporter: As the name suggests, the Reporter is responsible for gathering all the information generated by the different components of AMA and merging it into a detailed high-level report. This report is then transferred to the shared folder.

5.1.4 Reason to Choose Selected Software

All software tools that are part of AMA were selected after intensive research. The main reason for selecting them is that they are freely available and effective at their respective tasks. SupermonX was selected because it is an open source tool available with its source code; it can take snapshots of the file system, registry keys and services, and it lets us manage and configure snapshot regions. It generates a comparison report that is easy for humans to understand, and using this report we can effectively identify the behavior of applications. Fport was selected because it provides more detailed information than "netstat" does, i.e. it gives the full path of the process connected to an open port. For capturing network traffic we used WinDump, an efficient and widely used network traffic sniffer. Using WinDump we can capture traces if any malicious software tries to connect to a foreign host. The other available tools (see Section 2.2) do not provide the full functionality that we require to cover all expected areas.

5.1.5 Communication

Given the way AMA works, it is very important that AMAHost and AMAGuest are able to talk to each other in order to synchronize the operations performed on both sides. AMAHost needs a way to control the virtual machine, which is easy with the powerful and complete command line interface available for Virtual Box. But controlling AMAGuest inside the virtual machine is the tricky part. AMAHost must have access inside the box to control and synchronize the operations of the whole system. The communication between AMAHost and AMAGuest is also critical because the applications we intend to run inside the virtual machine are mostly suspected of carrying dangerous code, or at least a trigger that sets off some critical malfunction. There is always a chance that an infection could leak from the virtual machine and infect the host OS and the connected network as well. It is therefore important to keep this risk of leakage to a minimum, so that no infection can escape into the open network and cause damage to the host or to other computers connected to the network. The communication problem is solved through a feature of Virtual Box [45] (the virtual machine used in AMA) called shared folder. The shared folder resides on the host OS and is shared with the guest OS with or without access permissions. We shared this folder with the guest with only restricted permissions to transfer files to and from AMAGuest, and VBoxManage commands are used to control AMAGuest. The shared folder and VBoxManage provide a simple solution to a complex problem with almost no security risk, which keeps the chance of malware propagating outside the virtual machine to a minimum.

11 BPF (Berkeley Packet Filter) provides a raw interface to data link layers, permitting raw link-layer packets to be sent and received, developed by PCAUSA [11]
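As an illustration of controlling the virtual machine from the host through the command line interface, the following Python sketch builds three VBoxManage invocations (`startvm`, `controlvm ... poweroff` and `snapshot ... restore`). The VM name and snapshot name used in the usage note are placeholders, the helper functions are hypothetical, and the exact sequence AMAHost issues is not documented here; the sketch only shows one plausible way to revert to a clean state.

```python
import subprocess

VBOXMANAGE = "VBoxManage"  # assumed to be on the PATH of the host


def start_vm_cmd(vm_name):
    """Command line that starts the named virtual machine."""
    return [VBOXMANAGE, "startvm", vm_name]


def power_off_cmd(vm_name):
    """Command line that powers the running virtual machine off."""
    return [VBOXMANAGE, "controlvm", vm_name, "poweroff"]


def restore_snapshot_cmd(vm_name, snapshot_name):
    """Command line that restores a previously taken snapshot."""
    return [VBOXMANAGE, "snapshot", vm_name, "restore", snapshot_name]


def revert_to_clean_state(vm_name, snapshot_name, run=subprocess.run):
    """Power the VM off, then restore the clean snapshot.

    `run` is injectable so the sequence can be exercised without VirtualBox.
    """
    for cmd in (power_off_cmd(vm_name), restore_snapshot_cmd(vm_name, snapshot_name)):
        run(cmd, check=True)
```

A call such as `revert_to_clean_state("AMAGuestXP", "clean")` would then be followed by `subprocess.run(start_vm_cmd("AMAGuestXP"), check=True)` for the next sample; both names are placeholders.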

CHAPTER 6

6 EXPERIMENT

In this chapter we present the quasi experiments performed to evaluate the results of the application analysis done by the automated malware analyzer (AMA) prototype. A quasi experiment design [1] was selected because it was not possible to randomize subject selection. Special efforts were made to minimize threats to internal validity, which are discussed in Section 6.5. The experiments¹² were designed carefully so that nothing would prevent us from obtaining useful results. The experiments are described here in enough detail that they can be verified and repeated.

6.1 Goal

The goal of the experiment is to investigate the performance of AMA, that is, how accurately it captures the activities of the given application samples. In this experiment each sample application is installed and uninstalled in a virtual environment; the activities performed by the application during this time are logged and a report is prepared. The captured activities include file changes, registry changes, Windows services, ports and network traffic. The AMA analysis report is evaluated against other similar tools available online, such as Norman [44] and Anubis [8]. These tools generate almost identical reports for applications submitted to them online. Assessing the reports generated by AMA against those of the other tools helped us to evaluate the performance of AMA: where does the analysis performed by AMA stand? Is it effective and efficient enough to capture the activities performed by applications during installation and uninstallation, and so help us decide whether an application is safe or malicious?

6.2 Planning

6.2.1 Experiment Environment

For conducting the experiments, we set up two desktop systems with Windows XP installed on them. Both systems are equipped with AMAHost and Virtual Box; inside Virtual Box we installed Windows XP and set up AMAGuest and its components. After creating and configuring the virtual machine, we took a snapshot of the current state of Windows XP inside Virtual Box. This snapshot is used to restore the clean state after every analysis. The experiment is conducted using 50 applications divided into two groups: one group contains malware and the other contains safe applications. The applications in each group are selected randomly. The experiments are conducted inside a lab provided for the purpose.

12 "Experiment" refers to quasi experiment in this document unless otherwise specified.

It is an offline study conducted by graduate students as a final thesis project at the university. The project is general, since it involves the investigation of malware samples that are selected at random, and it addresses a real problem, namely the evaluation of malware analysis techniques.

6.2.2 Variable Selection

In this study the independent variable is the application analyzed through the AMA analysis process, which can be one of two types, i.e. a normal application or malware. The dependent variable is the correctness of the analysis reports generated by AMA.

6.2.3 Subject Selection

The application samples used as subjects in this study were collected from trusted sources. Legitimate applications were downloaded from well-known, trusted internet resources. While downloading, we tried to ensure that each application is safe and performs no malicious activity; one cannot be absolutely sure about this, but we did our best to choose secure and well-known download sources. Getting malware samples is the tricky part. We could obtain malware from any unsafe web source by downloading applications, scanning them with a security product to check whether they are malware, and selecting the known malware among them. However, since a research lab at our university maintains a repository of malware samples, we decided to use this repository instead of taking the longer route.

6.3 Experiment Design

The design principle of balancing is used in this experiment, since the same number of subjects is used for every treatment: 25 safe applications and 25 malicious applications. Balancing simplifies the analysis of the output of the experiment.

6.3.1 Instrumentation

To conduct this experiment we prepared two desktop computer systems with Windows XP installed. The computers are equipped with a 2.1 GHz AMD dual core processor, 1 GB RAM and an 80 GB hard drive. Virtual Box is installed and configured as required by the experiment, and AMAGuest is set up inside the virtual machine. A checklist is prepared to control and monitor every step of the experiment. This checklist is also used as a guideline to conduct the experiments and to collect and document the results; it ensures that every step is performed as planned and nothing is missed. Reports are collected and saved in a planned destination for later analysis. Reports are analyzed manually without using any other tool.

6.4 Operation of Experiments

We set up the computers in the lab provided by the university so that we could perform the experiments on them. The setup includes the installation of Windows XP on both the host side and the guest side inside Virtual Box. All settings were checked so that AMA could run on the computers. AMA requires the Virtual Box guest additions [62] installed inside Virtual Box and Microsoft .NET Framework 3 or later. WinPcap is installed on both systems so that WinDump can be used for network packet capturing. Once all the preparations were made, the experiment was ready to be conducted. All the applications, both malware and legitimate applications, were downloaded to both computers; we added them to AMA and then analyzed them one by one. The AMA analysis process consists of many small steps. It begins with adding the samples in the main interface of AMAHost and starting the analysis by clicking the "start analysis" button. The first step is to start the virtual machine; then AMAGuest is triggered inside the virtual machine and a sample is passed to it by copying the sample to the shared folder. AMAGuest is triggered with the application sample name as its argument. It looks for that name in the shared folder and, if it finds the application there, copies it to the working directory and starts the analysis process. After completion, a final report is sent back to AMAHost, which saves the report to its destination. This process is followed for all the given samples. At the completion of the experiment we collected the reports generated by AMA, and various other statistics were recorded for each experiment. For the validation of the data we used a checklist: it contained questions about the experiment procedure, which were answered at the time of experiment execution. This method was used to ensure that the experiment was conducted as planned.
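The guest-side step of locating the sample in the shared folder and copying it to the working directory can be sketched in Python as follows. The function name and the directory arguments are hypothetical; AMAGuest itself is a C# console application, and this sketch only mirrors the described behavior.

```python
import shutil
from pathlib import Path


def fetch_sample(sample_name, shared, workdir):
    """Copy the named sample from the shared folder into the working directory.

    Returns the path of the local copy, or None if the sample is not present.
    Only the final path component of sample_name is used, so the argument
    cannot point outside the shared folder.
    """
    safe_name = Path(sample_name).name
    source = Path(shared) / safe_name
    if not source.is_file():
        return None
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    target = workdir / safe_name
    shutil.copy2(source, target)
    return target
```

Returning `None` instead of raising lets the caller decide whether a missing sample should abort the run or simply be reported back to the host.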

6.5 Validity Evaluation

Experimental studies are always prone to threats to the validity of their results. Threats to validity are classified into four groups: internal, external, construct and conclusion threats [53]. However, not every study is exposed to every kind of threat. We discuss here all major threats to the validity of this study. As this study includes a quasi experiment, it is mainly exposed to internal threats: if subject selection is not randomized, this could affect the causal relationship between the independent and dependent variables. Special care was taken in subject selection (explained in Section 6.2.3), and two groups were formed on the basis of reputation, one consisting of legitimate software and the other of malicious software. The study is also exposed to external threats because of the small number of subjects, which makes generalization of the experiment results questionable. The measures used can affect the outcomes; for example, file system changes are captured with SupermonX, which is specialized for this task, and Fport and WinDump are also well reputed. The measures were selected after proper testing. As the measures used in these experiments are well verified, their effect on the outcomes is minimized. The AMA system design ensures the same treatment for every subject in the experiments.

CHAPTER 7

7 RESULTS

This chapter focuses on the results generated by running AMA. Section 7.1 describes the complete structure of the report. The experiment results are elaborated in Sections 7.2 through 7.4, and a comparison of AMA results with other malware analysis tools is presented in Section 7.5.

7.1 AMA Report

AMA is an automated behavioral analysis tool for analyzing malware. Its report is based on a snapshot technique that identifies changes by comparing snapshots taken before and after installation. Information collected from other simultaneously running modules, such as the process monitor, port monitor and network traffic monitor, is added to one single final report. The final report is a text file structured in six sections:

1. General Information: This section contains basic information about the software analyzed in AMA, such as the file name, type, size, and MD5 and SHA-1 hashes of the file, as well as the date and time of analysis.
2. File System Activity: This section shows the activity performed during installation or un-installation, i.e. files and folders created, deleted and modified.
3. Registry Activity: This part shows the registry keys and values created, deleted and modified during the analysis period.
4. Services: Services that are created, modified or deleted are listed in this part.
5. Process Activity: This section shows the creation and termination of processes along with other process-related actions that are performed.
6. Port Activity: This section shows all ports that are open after completion of installation. Each port is listed with the process that is connected to it at that moment.

Moreover, network traffic is captured and saved separately in a log file, which contains all packets captured during execution of the application. A complete sample report is given in Appendix B (see 12.1.5).

The process monitor used in AMA is responsible for monitoring process activity in real time while installation and uninstallation are in progress. With it we keep track of process creation and termination. Using Fport we save the status of active ports at a particular time. WinDump is used to capture the network traffic generated while installing or uninstalling an application; the captured packets show attempts to connect to the outside world. Tracking the files, folders, registry keys, registry values and services that are created, modified or deleted is the responsibility of SupermonX. It takes snapshots of the system before and after installation and uninstallation, and then compares both snapshots to produce a difference report, which contains all changes made in the places mentioned above.
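The six-section structure of the final report can be illustrated with a small Python sketch that assembles a plain-text report from per-section findings. The section titles follow the list in this section; the function name, the input dictionary and the indentation style are assumptions, and the real Reporter output format may differ (see Appendix B for an actual sample).

```python
SECTIONS = [
    "General Information",
    "File System Activity",
    "Registry Activity",
    "Services",
    "Process Activity",
    "Port Activity",
]


def build_report(findings):
    """Render findings (section title -> list of text entries) as one text report.

    Sections are always emitted in the fixed order above, even when empty,
    so that reports for different samples stay comparable line by line.
    """
    lines = []
    for number, title in enumerate(SECTIONS, start=1):
        lines.append(f"{number}. {title}")
        entries = findings.get(title, [])
        if entries:
            lines.extend(f"    {entry}" for entry in entries)
        else:
            lines.append("    (no entries)")
        lines.append("")
    return "\n".join(lines)
```

For example, `build_report({"Services": ["created: ExampleSvc"]})` produces all six headings with a placeholder under every section except Services; the service name here is invented for the example.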

7.1.1 General Information

Basic information about the software analyzed by AMA is given at the start of the final report. It contains the file name, file type, file size, and the MD5 and SHA-1 hash values of the file. The hash values and size can be used to identify a file uniquely. That is, if an executable has been modified, or someone has attached a malicious file to a legitimate file, then we can distinguish it from the original by its MD5 hash and file size.

7.1.2 File Activity

The file activity section of the report lists the files created, deleted and modified during installation and uninstallation. This information is important for tracking behavior, i.e. where and what changes are made by a particular piece of software.

7.1.3 Registry Activity

This section of the report lists the activity performed in the registry, i.e. the keys and values created, deleted or modified during analysis of the software. It gives detailed information about the changes that occurred in the registry, i.e. in which part a key was created or which particular key was modified.

7.1.4 Services

In this section all services that are created, modified or deleted by the software installation are listed separately, so that the user has a better understanding of the changes caused by the installation.

7.1.5 Process Activity

The AMA report shows the processes created, deleted or modified during installation. It also shows the name of each process, its process ID, its parent process ID and the time at which the change occurred. We created this utility ourselves to monitor and keep track of process-related activity.

7.1.6 Port Activity

Fport is used to monitor port activity, i.e. after installation we can see whether any software tries to connect to an outside host. This part of the report shows the process ID, the process name, the port on which it is connected and the directory path where the application is located. With this information it is easy to track down an application if it is causing problems.
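Turning port-mapping output into the fields listed above (process ID, process name, port, path) can be sketched in Python as follows. The line layout assumed here, `Pid Process -> Port Proto Path`, is an assumption about Fport's console output and should be checked against the actual tool before relying on it; the function name is hypothetical.

```python
import re

# Assumed layout of one data line: "<pid> <process> -> <port> <TCP|UDP> [path]"
LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s+->\s+(\d+)\s+(TCP|UDP)\s*(.*)$")


def parse_port_map(text):
    """Extract one record per matching line; banner and header lines are skipped."""
    entries = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            pid, process, port, proto, path = match.groups()
            entries.append({
                "pid": int(pid),
                "process": process,
                "port": int(port),
                "proto": proto,
                "path": path.strip(),
            })
    return entries
```

Comparing the parsed records taken before and after an installation would show which ports were opened by the installed software and which executable owns them.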

7.2 Experiment Results

In this section we show the results generated by AMA while analyzing two sets of software, one illegitimate and one benign. The test samples marked as legitimate were downloaded from www.download.com, and the samples categorized as illegitimate were taken from the malware repository of BTH. These samples include some grey-zone applications that are not very harmful to the system but contain adware and other malicious behaviors. In this experiment we do not try to identify applications by malicious signatures. Our method of detecting suspicious behavior is based on the differences between two snapshots. We can suspect software on the basis of its behavior: how many files and registry keys it creates, modifies or deletes, and in which folders these changes are made. We also generate a difference report for the un-installation process, to see whether the software removes all of its traces or leaves something behind. This is important: if software leaves installed components behind, this immediately makes us suspicious and prompts us to check what is left, since it might be malicious software attached to and hosted on legitimate software.
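The check for traces left behind after un-installation amounts to a set intersection, as the following Python sketch shows. It assumes two inputs that would have to be extracted from the reports: the paths created during installation and the paths still present after un-installation. Both function names are hypothetical.

```python
def leftovers(created_on_install, present_after_uninstall):
    """Paths created by the installer that still exist after un-installation."""
    return sorted(set(created_on_install) & set(present_after_uninstall))


def removal_ratio(created_on_install, present_after_uninstall):
    """Fraction of installed paths that the uninstaller actually removed."""
    created = set(created_on_install)
    if not created:
        return 1.0  # nothing was installed, so nothing can be left behind
    remaining = created & set(present_after_uninstall)
    return 1.0 - len(remaining) / len(created)
```

A ratio below 1.0 does not by itself indicate malware, since benign software also leaves files behind; it only points to the entries in the final report that deserve a closer look.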

7.3 Results for Illegitimate Group

In our results we found that during installation most of the software creates and modifies files and registry keys (see Table 2), whereas deleted files, registry keys and services are fewer in number. According to Table 8 and Table 9 in Appendix B, software becomes suspicious if it creates or modifies files and registry keys in the critical folders described in Table 9. Table 3 below shows the results of the analysis performed while un-installing the software. According to these results, almost all test subjects have an incomplete un-installation process. The un-installation process leaves files and registry changes behind, which is also seen in benign software. For a closer view we can look into the final report generated by AMA to see which files are still present after uninstalling the application. Let us take "Stop-Sign_Install" (see Table 2) as an example: it created 1910 files while being installed in our test system. When AMA uninstalls it and generates the difference report, we can see that 1 file still remains and 7 new registry keys have been created (see Table 3). A thorough check of the final report confirms that this application is not completely un-installed: its un-installation process leaves files behind in the system.

Illegitimate Programs Results After Installation
(Cre = Created, Del = Deleted, Mod = Modified)

                                   System Files      Registry Keys       Services
Software Name                      Cre   Del   Mod   Cre   Del   Mod   Cre   Del   Mod
3dspringblossoms_2066               14     0    35    41    11    85     0     0     1
abcscrabblewe                       54     0    23    48     0    35     1     0     1
acezJukebox                         33     0     9  2404     0    48     0     0     1
capex_captureexpress                16     0    31    38     0    33     0     0     0
cfree4_1_pro_setup                 615     0    23    77     0    31     0     0     1
Em2                                      N/A               N/A               N/A
FiestaBarInstall_Spywareguide       10     0    25    45     0    34     0     0     1
flash_decompiler_eltima             40     1    25    84     0    63     0     0     1
gdsol_goodsol                       75     0    25   520     0    35     0     0     0
kazaa_setup                          2     0    23    11     0    26     0     0     1
minibuginstaller                     6     0    23    26     0    26     0     0     0
NetPumper-1.50-setup                10     0     3     0     0    45     0     0     0
NewsReaderSetup_crawler             48     0     3   159     2    55     0     0     0
OEAddOn                                  N/A               N/A               N/A
RLSetup_Adsupported                 58     0    14    16     0    38     0     0     0
sdasetup                           378     1    25   611     8    68     5     0     2
Setup_hotbar                         0     0     3     1     0    11     0     0     0
SnackMan_enbrowser                       N/A               N/A               N/A
SPYWARE_TREND_SpywareGuide_ALPluginIE-1.0.2.3-setup_anti_leech.com
                                     6     0     9    22     0    18     0     0     1
Srv                                  0     0     3    18     0    18     0     0     0
stop-sign_install                 1910     0    15  2829     1    61     3     0     0
weather                              0     0     3     0     0     9     0     0     0
WebExtractProfessional               8     0     3   193     0    20     0     0     0
ZangoSA                             11     0     3    31     0    40     0     0     0
ZangoSADF                            0     0     4    26     0    11     0     0     0

Table 2 Snapshot comparison results after installation (Illegitimate Group)

From the above results we can see that files are still present in the installation directory after uninstallation. Any of the programs can be taken as an example of this behavior; they mostly behave in the same way, and we found this same behavior in both types of software.

Illegitimate Programs Results After Un-Installation
(Cre = Created, Del = Deleted, Mod = Modified)

                                   System Files      Registry Keys       Services
Software Name                      Cre   Del   Mod   Cre   Del   Mod   Cre   Del   Mod
3dspringblossoms_2066                    N/A               N/A               N/A
abcscrabblewe                       44     0    24    78     0    54     1     0     0
acezJukebox                         20     0    10  2107     0    63     0     0     1
capex_captureexpress                 3     0    38    12     0    46     0     0     0
cfree4_1_pro_setup                   8     0    27    69     0    44     0     0     0
Em2                                      N/A               N/A               N/A
FiestaBarInstall_Spywareguide        4     0    30    11     0    50     0     0     1
flash_decompiler_eltima             15     1    31    82     0    75     0     0     1
gdsol_goodsol                        5     0    30    19     0    49     0     0     1
kazaa_setup                          1     0    27    24     0    38     0     0     1
minibuginstaller                     3     0    27    25     0    40     0     0     0
NetPumper-1.50-setup                 1     0     8    13     0    21     0     0     0
NewsReaderSetup_crawler              1     0     8    55     0    76     0     0     0
OEAddOn                                  N/A               N/A               N/A
RLSetup_Adsupported                 58     0    14    16     0    38     0     0     0
sdasetup                           418     0    29   483     8   102     3     0     2
Setup_hotbar                         1     0     6    13     0    25     0     0     0
SnackMan_enbrowser                       N/A               N/A               N/A
SPYWARE_TREND_SpywareGuide_ALPluginIE-1.0.2.3-setup_anti_leech.com
                                     5     0    13     8     0    26     0     0     0
Srv                                  1     0     8    25     0    26     0     0     0
stop-sign_install                    1     0     6     7     0    21     0     0     0
weather                              1     0     7     7     0    19     0     0     1
WebExtractProfessional               2     0     7     7     0    30     0     0     0
ZangoSA                              1     0     6     9     0    30     0     0     1
ZangoSADF                            0     0     4     6     0    19     0     0     1

Table 3 Snapshot comparison results after un-installation (Illegitimate Group)

While installing and un-installing applications we collected the network traffic, so that we can easily check whether any software tries to establish a connection to the outside or to download files from an external server. For the set that contains malicious software, we can see that not all programs try to connect to the outside world and download data. The diagram below shows the network traffic for the malicious software group mentioned above.

[Chart: number of packets captured during installation and during un-installation for each program in the illegitimate group; vertical axis: number of packets, logarithmic scale from 1 to 10000.]

Figure 8 Network Traffic for Illegitimate Group

From Figure 8 we can see that most of the software tries to connect to the outside world to download data. This kind of activity is critical and plays a vital role in identifying malicious behavior. We can therefore suspect this software of being unsafe because of the behavior found in the reports generated by AMA.

7.4 Results for Legitimate Group

On the other hand, we have the results for the safe group, which contains software from www.download.com and other sources. We noticed that both groups of software behave similarly to some extent, in that they mostly create and modify files and registry keys (see Table 4). Furthermore, Table 5 below shows the results after un-installation of the software during the analysis process. We can see that the un-installation process of most of this software is not complete either. For further clarification we have to check the final report thoroughly to see whether the changes are related to the installed software or not. If they are related to the installed software and are also present in the installation report, then either the software might contain some malicious content or its un-installation process is simply incomplete. Figure 9 below shows the network packets captured during installation and uninstallation of the legitimate group. From Figure 9 we can see that most of the software generates traffic and tries to connect to the outside world. At this point, however, the question arises which kind of outside host the software is trying to reach: it may be updating to a new version, updating a database as most anti-virus software does, or registering the user to verify a license. To answer it, the final report and the log file containing the captured network packets have to be analyzed using machine learning techniques before any conclusion can be drawn from the traces found in the collected data.

Legitimate Programs Results After Installation
(Cre = Created, Del = Deleted, Mod = Modified)

                                   System Files      Registry Keys       Services
Software Name                      Cre   Del   Mod   Cre   Del   Mod   Cre   Del   Mod
asc-setup                          212     0    13   272     2    55     1     0     0
ccsetup302                          46     0     8    49     0    29     0     0     1
CNET_TechTracker_2_0_1_51_Setup     40     0    11   661     0    72     0     0     1
cpu-z_1.56-setup-en                 19     0    11    90     0    45     0     0     1
cuteftp                             49     0    14   504     2    69     0     0     1
DeepBurner1                         23     0     5    12     0    12     0     0     0
digsby_setup86                    4161     0     6    35     0    23     0     0     1
disk-defrag-setup                   58     0    11    98     0    43     0     0     2
FoxitReader43_enu_Setup             18     0    14   218     0    86     0     0     1
free-pdf                            52     0    11   148     2    44     0     0     2
FreeYouTubeToMp3Converter            0     0     7     0     0    43     0     0     0
GOMPLAYERENSETUP                   333     0     9   963     0   114     0     0     0
IncrediMailSetup                   113     0    12  5006     0    50     0     0     1
Install_Mario_Forever_v5_0          77     0     6    62     4    38     0     0     0
InternationalPrimoPDF               59     1    16     8     1    52     0     0     1
iview428_setup                      24     0     7    16     0    26     0     0     0
K-Lite_Codec_Pack_666_Mega         127     0    11   752     0    77     0     0     0
npp.5.8.6.Installer                138     0     6     7     0    11     0     0     0
Opera_1100_int_Setup               241     0     8   257     6    44     0     0     0
Paint.NET.3.5.6.Install            139     1    22   565     1    61     0     0     3
PDFViewerSetup                       3     0     7     7     0    15     0     0     0
PhotoScapeSetup_V3.5              1515     0     6    22     0    20     0     0     0
pidgin-2.7.9                       707     0     7    33     0    15     0     0     0
powarc1170int                      307     0    12   755     2    78     0     0     1
YouTubeDownloaderSetup264            9     0     7   470     0    22     0     0     0

Table 4 Comparison result of legitimate program after installation

Legitimate Programs Results After Un-Installation
(Cre = Created, Del = Deleted, Mod = Modified)

                                   System Files      Registry Keys       Services
Software Name                      Cre   Del   Mod   Cre   Del   Mod   Cre   Del   Mod
asc-setup                           33     0    17   280     2    86     1     0     0
ccsetup302                           1     0    14    17     0    40     0     0     0
CNET_TechTracker_2_0_1_51_Setup     41     0    18   667     0    55     0     0     1
cpu-z_1.56-setup-en                 13     0    19    96     0    45     0     0     1
cuteftp                             11     0    21    97     2    79     0     0     1
DeepBurner1                         23     0    13    42     0    33     0     0     0
digsby_setup86                       6     0    14    91     0    48     0     0     0
disk-defrag-setup                   13     0    17    98     0    59     0     0     1
FoxitReader43_enu_Setup             13     0    21   123     0    96     0     0     1
free-pdf                            48     0    17   122     0    56     0     0     1
FreeYouTubeToMp3Converter            1     0    13     6     0    49     0     0     0
GOMPLAYERENSETUP                     1     0    16   257     0   132     0     0     0
IncrediMailSetup                     1     0    20  2823     0    68     0     0     1
Install_Mario_Forever_v5_0           8     0    12    68     4    51     0     0     1
InternationalPrimoPDF                4     1    22     6     0    60     0     0     0
iview428_setup                       0     0    14    14     0    35     0     0     0
K-Lite_Codec_Pack_666_Mega           1     0    18    25     0    84     0     0     0
npp.5.8.6.Installer                 67     0    14    20     0    26     0     0     0
Opera_1100_int_Setup                 1     0    15   194     6    53     0     0     0
Paint.NET.3.5.6.Install             13     1    23   123     3    77     0     0     4
PDFViewerSetup                       1     0    15    12     0    25     0     0     0
PhotoScapeSetup_V3.5                 1     0    14    23     0    28     0     0     0
pidgin-2.7.9                         1     0    15     8     0    26     0     0     0
powarc1170int                        3     0    19   499     2    91     0     0     2
YouTubeDownloaderSetup264           40     0    17   486     0    59     1     0     1

Table 5 Snapshot comparison results after un-installation (Legitimate Group)

[Chart: number of packets captured during installation and during un-installation for each program in the legitimate group; vertical axis: number of packets, logarithmic scale from 1 to 10000.]

Figure 9 Network Traffic for legitimate Group

However, as already discussed, this study is intended to generate a comprehensive, detailed report containing the information gathered by AMA, which is a prototype tool for measuring the values of different parameters. The results shown in tables 2, 3, 4 and 5 give an overview of all changes that occurred in files, the registry and services. However, we cannot identify or differentiate malicious software just by looking at these tables. To draw a conclusion it is necessary to look inside the final report and find patterns; this deep analysis of the report is not automated because it is outside the scope of this thesis. Applying different techniques to the final report in order to draw a final conclusion could be done in future work. Table 8 in Appendix B (see 12.2.1), taken from Ulrich [59], shows the most commonly observed behaviors of malicious software. However, covering every aspect of monitoring an executable is a difficult and time-consuming task. The AMA report covers the following aspects: modifying files, modifying registry values, creating files, creating registry keys, captured network traffic, creating processes, deleting files and creating services. These details can be found in the final report, except for network traffic, which is kept in a separate log file.

7.5 Comparison

To validate our results and the effectiveness of our implemented prototype, we compared our report with those of two other tools available online. For the comparison we selected Norman and Anubis; both are well-known and reputable behavioral analysis solutions that are based on real-time monitoring and provide free online analysis. These two online tools were selected because they are available as a free service and are easy to use. We submitted the same software applications to both analyzers, and each returned its results in its own format. Norman sends its final analysis report to the e-mail address provided at the time of submission, and Anubis provides its final report online on its website. After collecting the reports from both, we manually analyzed the information they provide and compared it with the AMA report. Numerical values for file system, directory and registry changes were collected from the reports. Table 6 below compares the information available in the final reports of Anubis, Norman and AMA. However, the reports from Anubis and Norman do not all contain this information: some contain changes to the file system, directories and registry, and some contain no such material. There are also some software samples for which both analyzers were unable to generate a report, and some that could not be uploaded because of the file size limitation. In Table 6, fields marked "N/A" mean not available, either because the data is missing from the reports or because the installers were corrupted and were not executed by the analyzers. In Table 6, dark gray marks a count that is lower than that of the other tools, white marks a count that is greater than that of the other tools, and light gray marks a count that is the same in all reports.

Furthermore, the huge difference in the number of file activities could be due to the different techniques used to analyze the executables. Also, the internal functionality of Anubis and Norman is not disclosed to the public, so it is not known how they perform their analysis or how they filter the data displayed in their final reports. In AMA, however, we do not filter any data: everything that is captured and remains after taking the difference of the snapshots is displayed in the final report as it is. This could be the reason why AMA shows much larger numbers in the white fields. In some cases, however, Anubis and Norman show a larger number of file and registry activities than AMA.
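
The snapshot technique mentioned above amounts to comparing the system state before and after a sample runs. A minimal sketch in Python follows, assuming (hypothetically) that each snapshot is a mapping from a file path or registry key to a content fingerprint. The names `diff_snapshots` and `SnapshotDiff` and the sample paths are illustrative only and are not taken from the AMA implementation.

```python
from typing import Dict, NamedTuple, Set


class SnapshotDiff(NamedTuple):
    created: Set[str]
    deleted: Set[str]
    modified: Set[str]


def diff_snapshots(before: Dict[str, str], after: Dict[str, str]) -> SnapshotDiff:
    """Compare two snapshots that map an item (path or key) to a fingerprint."""
    before_keys = set(before)
    after_keys = set(after)
    created = after_keys - before_keys          # present only afterwards
    deleted = before_keys - after_keys          # present only beforehand
    modified = {k for k in before_keys & after_keys if before[k] != after[k]}
    return SnapshotDiff(created, deleted, modified)


# Hypothetical snapshots taken before and after installing a sample.
before = {
    r"C:\WINDOWS\system32\drivers\etc\hosts": "a1",
    r"C:\WINDOWS\win.ini": "b2",
}
after = {
    r"C:\WINDOWS\system32\drivers\etc\hosts": "ff",
    r"C:\Program Files\Sample\sample.exe": "c3",
}

result = diff_snapshots(before, after)
print(len(result.created), len(result.deleted), len(result.modified))
```

Because nothing is filtered, every item that differs between the two snapshots ends up in one of the three sets, which is consistent with the larger counts reported for AMA.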

50 ZangoSADF ZangoSA WebExtractProfessional weather stop-sign_install Srv setup_anti_leech.com 1.0.2.3- wareGuide_ALPluginIE- SPYWARE_TREND_Spy enbrowser SnackMan_ Setup_hotbar sdasetup RLSetup_Adsupported OEAddOn er NewsReaderSetup_crawl NetPumper-1.50-setup minibuginstaller kazaa_setup gdsol_goodsol flash_decompiler_eltima guide FiestaBarInstall_Spyware Em2 cfree4_1_pro_setup capex_captureexpress acezJukebox abcscrabblewe 3dspringblossoms_2066 SWName Cre. 10 10 72 12 26 0 2 8 5 4 8 7 0 8 2 4 2 Files System Del. Del. 0 0 0 5 0 0 2 0 0 3 0 0 0 1 9 0 0 Mod. 15 11 14 11 12 15 79 10 43 6 1 1 0 1 4 7 2 Cre. 0 2 0 1 1 3 1 0 1 0 2 1 1 0 0 0 1 Directory Anubis Del. Del. N/A N/A N/A N/A N/A N/A N/A N/A 0 0 8 1 0 0 0 0 0 0 0 0 0 0 0 0 0 Mod. 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Cre. Registry Keys Registry 11 0 0 0 0 0 0 0 0 3 2 0 0 0 0 0 3 Del. Del. 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 Mod. 152 108 515 182 11 29 29 24 17 4 1 3 2 0 0 0 0 Cre. 0 0 0 0 4 0 1 1 4 6 1 9 2 2 Files System Comparison of Illegitimate Software ComparisonofReports Illegitimate Del. Del. 0 0 0 0 0 0 1 0 1 0 0 6 0 0 Mod. 2 2 0 0 0 0 0 0 3 1 1 4 0 0 Cre. 0 0 2 2 4 0 1 1 3 4 2 0 0 1 Directory Norman Del. Del. N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Mod. 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Cre. Registry Keys Registry 0 0 0 2 0 0 0 0 0 0 7 0 0 0 Del. Del. 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Mod. 11 10 0 0 0 1 0 3 0 2 8 3 0 4 Cre. 1910 378 615 11 58 48 10 75 40 10 16 33 54 14 System Files System 8 0 0 6 0 6 2 Del. Del. 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 Mod. 15 25 14 23 23 25 25 25 23 31 23 35 4 3 3 3 3 9 3 3 3 9 Cre. 30 10 30 0 0 2 0 0 0 4 0 8 1 3 2 4 7 2 0 3 8 0 Directory Del. Del. AMA N/A N/A N/A 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Mod. 5 6 9 5 6 6 7 7 7 9 6 6 4 4 5 5 4 6 4 8 8 0 Cre. 2829 2404 193 611 159 520 Registry Keys Registry 26 31 18 22 16 26 11 84 45 77 38 48 41 0 1 6 Del. Del. 11 0 0 0 0 1 0 0 0 8 0 2 0 0 0 0 0 0 0 0 0 0 Mod. 
11 40 20 61 18 18 11 68 38 55 45 26 26 35 63 34 31 33 48 35 85 9

Table 6 Report Comparison for Illegitimate Group


In Table 6 above we can clearly see the difference between the three reports and the information present in them. The report generated by AMA contains more details for every sample than those of Anubis and Norman. However, both online analyzers offer more functionality than AMA, and they are fully functional and fully developed analyzers, whereas AMA is just a prototype that lacks many advanced techniques, for example scanning signatures, analyzing behaviors from reports and giving a classification. Still, the report generated by AMA provides a substantial amount of information every time, without missing or filtering any detail. For more clarity, and to identify the similar findings of all three analyzers, we can examine the reports in more detail. We have taken a random example from the analyzed software to give an insight into the reports. While comparing AMA results with Norman in detail we found that zangosa.exe tries to connect to "te.zango.com". It is also identified as W32/Malware.GEWD by Norman. An excerpt from the Norman result for network traffic is given in Appendix B. For comparison we have taken a small portion of the network traffic log file generated by AMA while analyzing the malicious software. We found similar traces showing that Zango.exe tries to connect to te.zango.com, and similar results were found for some other software from both groups. An excerpt of the log file is given in Appendix B. On the other hand, the same behavior is found in legitimate software: most of the legitimate samples also try to connect to an external host and leave many files and registry keys behind after un-installation, which makes them suspicious as well. However, the results from Norman and Anubis presented in Appendix B vary in some details and lack parameters that were present in the final report of AMA. The extra functionality of the online tools is not yet present in AMA and could be added in future. The difference between the reports can be attributed to one factor: the available tools give their final report after analyzing their findings and applying different techniques or using signature scanners, whereas AMA just reports the data collected using the snapshot technique. Furthermore, to give a final conclusion like Norman, Anubis or any other fully functioning tool, it is important to run further analysis on the AMA report. Finally, both online analyzers limit uploads to files below a specific size. AMA has no limitation regarding file size, and it provides consistent coverage of data gathering compared with other real-time monitoring analyzers.
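
The observation that samples leave files and registry keys behind after un-installation can be illustrated with a small sketch: the leftovers are the items created during installation that are still present afterwards. The function names and the sample data below are hypothetical and only illustrate the idea; they do not reproduce the AMA report format.

```python
from typing import Iterable, Set


def normalize(item: str) -> str:
    """Windows paths and registry keys are case-insensitive, so compare lowercased."""
    return item.strip().lower()


def leftovers(created_on_install: Iterable[str],
              present_after_uninstall: Iterable[str]) -> Set[str]:
    """Return the items created during installation that still exist afterwards."""
    created = {normalize(item) for item in created_on_install}
    remaining = {normalize(item) for item in present_after_uninstall}
    return created & remaining


# Hypothetical data for one sample.
created = [
    r"C:\Program Files\Sample\sample.exe",
    r"C:\Program Files\Sample\readme.txt",
    r"HKLM\SOFTWARE\Sample",
]
still_present = [
    r"c:\program files\sample\readme.txt",
    r"HKLM\SOFTWARE\Sample",
    r"C:\WINDOWS\win.ini",
]

left_behind = leftovers(created, still_present)
print(sorted(left_behind))
```

An empty result would indicate a clean un-installation; as noted in the text, a non-empty result alone does not prove that a sample is malicious.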

52 p264 YouTubeDownloaderSetu powarc1170int pidgin-2.7.9 PhotoScapeSetup_V3.5 PDFViewerSetup Paint.NET.3.5.6.Install Opera_1100_int_Setup npp.5.8.6.Installer ega Lite_Codec_Pack_666_M K- iview428_setup InternationalPrimoPDF _0 Install_Mario_Forever_v5 IncrediMailSetup GOMPLAYERENSETUP verter FreeYouTubeToMp3Con free-pdf p FoxitReader43_enu_Setu disk-defrag-setup digsby_setup86 DeepBurner1 cuteftp cpu-z_1.56-setup-en _1_51_Setup CNET_TechTracker_2_0 ccsetup302 asc-setup SWName Cre. System Files System 10 30 21 2 3 1 4 0 6 9 0 0 7 9 2 Del. Del. 2 0 2 2 1 2 0 3 0 0 0 0 0 2 5 Mod. 13 32 28 32 33 16 2 5 1 7 0 5 0 0 7 Cre. 10 1 1 1 0 1 0 1 2 0 8 0 3 0 0 Directory Del. Del. Anubis 0 N/A N/A N/A 0 0 N/A 0 N/A 0 0 0 N/A 0 N/A N/A N/A 0 0 0 0 0 0 0 N/A Mod. 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Cre. Registry Keys Registry 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 Del. Del. 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 Mod. 223 289 29 33 31 29 14 28 30 52 68 29 1 0 0 Cre. 2 1 1 0 0 0 0 0 1 1 2 2 1 0 Files System Comparison of Legitimate Software ComparisonReports of Legitimate Del. Del. 2 1 1 0 0 0 0 0 0 0 0 1 0 0 Mod. 0 0 0 0 0 0 0 0 1 0 0 2 0 0 Cre. 0 1 1 0 0 1 0 0 0 1 5 0 1 0 Directory Norman Del. Del. N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A 0 0 1 0 0 0 0 0 0 0 0 0 0 0 Mod. 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Cre. Registry Keys Registry 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Del. Del. 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Mod. 0 0 0 4 1 0 4 4 1 0 1 1 4 4 Cre. 1515 4161 307 707 139 241 138 127 113 333 212 24 59 77 52 18 58 23 49 19 40 46 System Files System 9 3 0 Del. Del. 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Mod. 12 22 11 16 12 11 14 11 14 11 11 13 7 7 6 7 8 6 7 6 9 7 6 5 8 Cre. 401 10 63 19 50 57 11 10 21 13 11 11 16 1 1 9 8 6 8 0 5 6 4 4 5 Directory Del. Del. AMA 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Mod. 11 12 10 10 10 18 12 10 12 11 13 13 13 10 15 13 13 12 11 13 13 13 11 12 8 Cre. 5006 470 755 565 257 752 963 148 218 504 661 272 Registry Keys Registry 33 22 16 62 98 35 12 90 49 7 7 8 0 Del. Del. 
0 2 0 0 0 1 6 0 0 0 1 4 0 0 0 2 0 0 0 0 2 0 0 0 2 Mod. 114 22 78 15 20 15 61 44 11 77 26 52 38 50 43 44 86 43 23 12 69 45 72 29 55

Table 7 Report Comparison for Legitimate Group

Table 7 above shows the comparison of reports for the legitimate group of software. We can see a clear difference in the amount of detail captured and present in the final report of AMA compared with Anubis and Norman. White fields show counts that are greater than those of the other tools, dark gray marks counts that are lower than in the other reports, and light gray shows counts that are the same for all tools. However, in some places there is a huge difference in the number of captured files. This could be due to the different techniques used for analyzing malware, which could be analyzed further in future work. For the current study we can say that the report generated by AMA contains more information for all software than the other tools with respect to the file system, registry and services. Furthermore, in the current study we only target file system changes, registry changes, service changes, port mapping and network traffic capturing. Other tools might have other advanced techniques as well, but those are not part of the comparison. The classifications of software by Anubis and Norman according to their identified signatures are presented in Appendix B (see 12.1.4). That table shows that the illegitimate group selected for analysis and experiment consists of malicious software; this is verified by the reports generated by the online analyzers, which include signature scanning using antivirus or signature scanners. From the AMA report, however, we can only classify software as suspicious, meaning that it seems to be malicious. For the current study we cannot give a final conclusion that a piece of software is a member of a specific malware family, and we do not identify signatures.

CHAPTER 8
8 DISCUSSION

In this section we try to answer the research questions that drive this study. Malware comes in many different forms and uses many advanced techniques to evade security software and hide itself. Some malware comes integrated in benign software and is installed as one of its components; the user never comes to know that malicious software is being installed, because it stays out of sight before installation and starts its intended function once the host software is installed. Some comes as an additional component with other software or is installed as an update. Much of it is so well planned and developed that it comes alone, without the cover of any other software; instead it presents itself as friendly software or uses advanced techniques such as obfuscation and polymorphism to avoid detection by security software. Even some legitimate software from reputed organizations performs illegitimate functions such as sending private user information back or using computer resources. Detecting such malicious software, or even differentiating malicious from benign software, therefore becomes very hard. Both kinds of software perform similar activities when they are installed, such as adding, deleting or changing files, registry values and kernel services, starting, stopping or modifying different processes, and trying to connect to different resources on the internet. Performing these activities is normal and there is nothing wrong with it, because all legitimate software is expected to perform them, and our experiment results show that all of the samples, whether malicious or benign, perform almost all of the above-mentioned activities. The big question is "where" the changes are made and "what kind" of changes they are.
It matters whether they change files in the application installation folder ("PROGRAM FILES") or in critical system paths such as the "WINDOWS" or "SYSTEM32" folders, where most software is normally not supposed to change anything. Likewise, trying to connect to a host on the internet does not by itself make software malicious; "where" and "for what" the connection is made are the important questions, that is, whether the software is just checking for the latest updates or sending private information to someone else. Also, when software is uninstalled it is supposed to undo all the changes it made during installation, in every location in the system, e.g. in the file system or the registry. But as we can see in the experiment results, almost all the sample software, both malicious and benign, leaves some files and registry keys behind. Most samples leave files behind, and many of them tried to connect to external hosts over the internet. Leaving some changes behind does not explicitly mark software as malicious, because it could also be the result of a poorly implemented un-installation. In the experiments we can see that both types of software perform almost the same kind of activities, which makes it very hard to draw a clear general line between their distinct activities; by looking more deeply into the AMA analysis reports and applying a machine learning algorithm, however, we could find the difference. After analyzing the reports one could conclude whether a piece of software is benign or malicious, but that is outside the scope of this study.

This part discusses the answer to the first research question, "To what extent can automatic runtime analysis be used to analyze software characteristics?"; further in the text the second research question, "Does automated analysis model lead to more accurate capturing of system changes compared to existing solutions?", is answered with discussion. To capture all the above-mentioned activities, this study proposes an automation model for analyzing any number of software samples without any user interaction, which is elaborated in detail in section 5. To prove this model we developed a prototype, the Automated Malware Analyzer (AMA). AMA analyzes software at runtime (known as dynamic analysis): first it installs and then it uninstalls the sample given to it. AMA demonstrates the use of open source, off-the-shelf components. We limited its functionality according to the time limits and scope of this master's thesis work. Moreover, we evaluated the performance of the proposed system by conducting experiments with AMA, in which we investigated the possibility of differentiating software as legitimate or illegitimate on the basis of its captured behavior. We conducted experiments on 50 software applications, installing and uninstalling each of them to track the activities it performs, i.e. whether the application makes changes to the system, whether it brings some other application along with it, and whether it removes its traces after un-installation. The main aim of AMA is to introduce an automated system that can reduce user interaction and save the examination time spent on a large number of executable binaries.
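
The distinction drawn earlier in this discussion, between changes inside the application installation folder and changes in critical system paths, can be sketched as a simple labeling step over the changed paths of a report. This is only an illustration under stated assumptions: the root folders, the function names and the sample paths are hypothetical, and AMA itself does not perform this step.

```python
from collections import Counter
from pathlib import PureWindowsPath
from typing import Iterable

# Assumed roots; a real system may use other drive letters or folder names.
CRITICAL_ROOTS = (PureWindowsPath(r"C:\WINDOWS"),)  # SYSTEM32 lies below WINDOWS
APPLICATION_ROOTS = (PureWindowsPath(r"C:\Program Files"),)


def is_under(path: PureWindowsPath, root: PureWindowsPath) -> bool:
    """True if path equals root or lies below it (Windows paths compare case-insensitively)."""
    return path == root or root in path.parents


def locate(path_text: str) -> str:
    """Label a changed path as 'critical', 'application' or 'other'."""
    path = PureWindowsPath(path_text)
    if any(is_under(path, root) for root in CRITICAL_ROOTS):
        return "critical"
    if any(is_under(path, root) for root in APPLICATION_ROOTS):
        return "application"
    return "other"


def summarize(paths: Iterable[str]) -> Counter:
    """Count how many changed paths fall into each location class."""
    return Counter(locate(p) for p in paths)


# Hypothetical list of changed files from one report.
changed = [
    r"C:\Windows\System32\sample.dll",
    r"c:\program files\Sample\sample.exe",
    r"C:\Program Files\Sample\readme.txt",
    r"D:\data\notes.txt",
]
summary = summarize(changed)
print(dict(summary))
```

Such a summary only answers the "where" question; deciding whether a change in a critical path is acceptable still requires further analysis of the report.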
The object-oriented design of AMA allows new modules to be added and integrated so that they work together with the existing modules. This results in a foundation and a mechanism that uses automated techniques to increase user awareness and to open new horizons for researchers studying malware behavior. AMA can be used together with other available solutions, such as a reputation system [52], to provide first-hand information directly from analysis. In this way the risk of false ratings can be reduced. The experiment has shown that most of the analyzed software exhibits suspicious behavior: it makes changes to critical files, folders or registry values, and in addition its un-installation is incomplete. All applications were analyzed with an open connection to the internet, and most of them tried to connect to a foreign host and downloaded data. Such connection attempts can be seriously harmful; the software can download malicious software or transmit sensitive data from the user's system, such as information about the user, account details, e-mail addresses, credit card numbers or any private document. One of the main reasons for performance degradation of computer systems is the presence of such applications. The finding that both software groups behave in the same way raises a new question for researchers: what kind of communication is performed under the name of legitimate software? By this we mean a case where, for example, Microsoft were collecting personal information from a user's system while presenting itself as legitimate, but performing illegitimate activities in the background. We believe it would be valuable to make such automated analysis a part of the operating system or of an organization's policy, so that any software a user or employee needs to install first has to pass through this analysis process. In the case of an operating system, it can alert the user to the pros and cons of the software, and the user then decides whether or not to install it. In the case of an organization, the installation of any software has to be approved by the security department; although this extra check adds a little delay and interruption to organizational processes, it makes a valuable and reliable contribution to security and data privacy. In both cases the end user benefits and the risk of infection is mitigated. Furthermore, within the scope of this thesis, the results generated using the snapshot-based technique are also effective, and the technique reduces the amount of unwanted noise in the data that is not worth having in the analysis report. Although classifying software as malicious or benign is outside the scope of this study, even a brief look at the final results in the reports shows that the software leaves traces and does not remove all the files created during installation. So we can still regard the analyzed software as suspicious. The report generated by AMA contains enough information about software behavior to be used for classifying software with machine learning and data mining techniques.
The reports can be used to extract data sets of both malicious and benign software, and these data sets could be used with a machine learning algorithm, e.g. a support vector machine (SVM), or a statistical analyzer to generate a classifier, which can then classify new instances of software as benign or malicious. To validate our report we also compared it with the results generated by other online analyzers. We found that our report contains much more detail about the files and registry keys that were created and modified, although our tool is just a prototype with limited functionality compared with the available professional tools. We can still get enough detail about software behavior and its effect on the system during and after installation and un-installation. It is also an interesting result that both groups of software exhibit the same behavior, which makes them hard to distinguish. Our scope of study is limited to the development of an automated tool that can perform automated analysis. Our findings and tool can be examined further in future work and can be improved with additional functionality.

CHAPTER 9
9 CONCLUSION

In this study we investigated the distinct behavior of malware that makes it questionable, proposed a model for automating the runtime analysis of malware that can capture the behavior of any number of given software samples, and evaluated the effectiveness of the proposed model by experiments. We found that software performs many types of changes in different areas of a computer system, including creation, deletion or modification of files, directories, registry values, kernel services and processes, and generation of network traffic. More importantly, these changes are performed by both types of software, benign and malicious, so in general we could not distinguish their activities or behavior. Further in the study we developed AMA, a prototype of the proposed automation model. AMA is built from already available open source software components such as SupermonX and WinDump, and from purpose-built components such as AMAHost, AMAGuest and Auto-Installer. To evaluate the performance of AMA, that is, how effectively it captures the behavior of the software samples analyzed in it, we conducted experiments using 25 benign and 25 malicious software samples. The experiment results show that both types of software behave almost identically; in general there are no differences in the activities performed during and after installation or un-installation. Both types of software make changes to files, the registry, kernel services and processes, and they generate network traffic (attempts to connect to the internet). No specific pattern could be outlined that points toward a specific software group. However, by examining the contents of the reports closely, one can understand the purpose of the analyzed software. The reports generated by AMA include all the details of the activities performed and all network traffic generated.

To further verify the capturing abilities of AMA, its reports were compared with reports from similar products available online. The comparison shows that AMA reports contain all the details available in the other solutions' reports, and in some cases even more, which is enough to classify the subjects. For classification, however, additional work needs to be done to analyze the reports deeply and find patterns, e.g. by using machine learning, which is outside the scope of this thesis. Furthermore, this study has raised a new question with the finding that legitimate and illegitimate software exhibit the same behavior, which makes them hard to distinguish from each other. This is also an interesting result that came forward after analyzing software with AMA.

10 FUTURE WORK

AMA was developed as a functional prototype for our thesis project so that we could examine the effectiveness of an automated analysis system. The project is presented as a milestone for future work: new modules can be plugged into AMA that analyze the generated report and scan for malicious signatures using AV scanners. AMA can be improved in different respects. More refined details can be obtained according to individual requirements, and machine learning algorithms such as Naive Bayes and SVM can be used to identify the presence of malicious patterns in the EULA [28] before the installation is completed, while the installer remains at the EULA acceptance screen. We can also double-check the activity of the analyzed software by comparing the analysis report with the EULA results, to see whether the software performs the same activities that are mentioned in the EULA. We can create a virtual internet environment for AMA so that a malicious program inside AMA can perform its activity as in a real environment. In future this would give us the network-related activity of the analyzed binary in more detail, and we could capture the actual data transmitted in this communication and analyze it automatically to classify software as either legitimate or illegitimate. AMA can be updated to look up the IP addresses or DNS names that a sample communicates with in a black-list of questionable IPs and DNS names. Furthermore, the prototype can be updated to automatically detect commercials, e.g. in pop-up windows, and to keep statistics on them. Auto-Installer can be improved to cope with as many Windows installers as possible; more intelligence can be added to it to make the installation process faster and more reliable for a vast set and variety of software. Currently executables are added from the local hard disk; in future an online system can be attached so that users from all around the world can submit their binaries for analysis. Other sources can also be attached, and AMA reports can further be compared with the malware descriptions of well-known anti-virus companies.
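
The proposed black-list lookup of IP addresses and DNS names could look roughly like the following sketch. It is not part of AMA: the function names, the black-list contents and the example hosts are assumptions made for illustration (the address ranges used are reserved documentation ranges, and `zango.com` is listed only because a connection to te.zango.com is mentioned in section 7.5).

```python
import ipaddress
from typing import Iterable, Set


def normalize_host(host: str) -> str:
    """Lowercase a host name and drop surrounding whitespace and a trailing dot."""
    return host.strip().rstrip(".").lower()


def is_blacklisted(host: str, bad_names: Set[str], bad_networks: Iterable) -> bool:
    """Check a contacted host (IP address or DNS name) against a black-list."""
    host = normalize_host(host)
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP address: treat it as a DNS name and match the name
        # itself or any of its parent domains.
        labels = host.split(".")
        return any(".".join(labels[i:]) in bad_names for i in range(len(labels)))
    return any(address in network for network in bad_networks)


# Hypothetical black-list entries.
BAD_NAMES = {"zango.com"}
BAD_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]

contacted = ["te.zango.com", "example.org", "203.0.113.7", "198.51.100.1"]
flagged = [h for h in contacted if is_blacklisted(h, BAD_NAMES, BAD_NETWORKS)]
print(flagged)
```

Matching parent domains as well as exact names is a design choice of this sketch; a stricter lookup could require exact matches only.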

11 REFERENCES

[1] Adelman, L., "Experiments, quasi-experiments, and case studies: A review of empirical methods for evaluating decision support systems,", IEEE Transactions on Systems, Man and Cybernetics, vol.21, no.2, pp.293-301, Mar/Apr 1991. [2] A. Jacobsson, M. Boldt, and B. Carlsson, ―PRIVACY-INVASIVE SOFTWARE IN FILE- SHARING TOOLS.‖ [3] A. M. Memon, M.E. Pollack and M.L. Soffa. Hierarchical GUI Test Case Generation Using Automation Planning, IEEE Trans. Software Eng. 200, 27(2): 144-155. [4] A. Memon. GUI Testing: Pitfalls and Process. Software Teclmologies. 2002: 87- 88. [5] A. Moshchuk, T. Bragin, S.D. Gribble, and H.M. Levy, ―A Crawler-based Study of Spyware on the Web.‖ [6] A.D.-it-yourself Kit and C.W. Wojnercertat, ―Mass Malware Analysis:,‖ October, 2009. [7] Anti-Spyware Coalition, ―Anti-Spyware Coalition‖, http://www.antispywarecoalition.org. Last accessed 15 Dec, 2010. [8] Anubis. http://anubis.iseclab.org/?action=about. Last accessed 15 Dec, 2010. [9] Arnett, K.P.: Busting the Ghost in the Machine. Communications of the ACM 48(8)(2005). [10] AutoIt. http://www.autoitscript.com/autoit3/index.shtml. Last accessed 15 Dec, 2010. [11] Barkley Packet Filter http://www.pcausa.com/. Last accessed 15 Dec, 2010. [12] C. Kr, ―Master‘s Thesis TTAnalyze: A Tool for Analyzing Malware,‖ 2005. [13] C. Willems, T. Holz, and F. Freiling, ―Toward Automated Dynamic Malware Analysis Using CWSandbox,‖ IEEE Security and Privacy Magazine, vol. 5, Mar. 2007, pp. 32-39. [14] COMODO. http://camas.comodo.com/. Last accessed 15 Dec, 2010. [15] D. C. Gordana. Constructive research and info-computational knowledge generation. Studies in Computational Intelligence, Model-Based Reasoning in Science and Technology, 314:359–380, 2010. [16] Download www.download.com. Last accessed 15 Dec, 2010. [17] E. Skoudis, ―Malware – Fighting Malicious Code‖, Prentice Hall PTR,Upper Saddle River NJ, 2004.

60 [18] EXEPacker http://www.symantec.com/connect/articles/understanding-difference- between-exe-and-msi. Last accessed 15 Dec, 2010. [19] Fabrice Bellard. Qemu. http://fabrice.bellard.free.fr/qemu/, 2005. Last accessed 15 Dec, 2010. [20] Fport http://www.mcafee.com/us/downloads/free-tools/fport.aspx. Last accessed 15 Dec, 2010. [21] F-Secure Virus Glossary. http://www.f-secure.com/virus-info/glossary.shtml, 2005. Last accessed 15 Dec, 2010. [22] Gerald J. Popek and Robert P. Goldberg. Formal requirements for virtualizable third generation architectures. Commun. ACM, 17(7):412–421, 1974. [23] Gibson Research Corporation, ―OptOut – Internet Spyware Detection and Removal‖, http://www.grc.com/optout.htm. [24] GNU Xnee http://www.gnu.org/software/xnee/. Last accessed 15 Dec, 2010. [25] H. A. Lichstein. When should you emulate? Datamation, 15:205–210, 1969. [26] Internet Database Internet Movie (IMDb), http://www.imdb.com last accessed, Last accessed 15 Dec, 2010. [27] J. Van Randwyk, K. Chiang, L. Lloyd, and K. Vanderveen, ―Farm: An automated malware analysis environment,‖ 2008 42nd Annual IEEE International Carnahan Conference on Security Technology, Oct. 2008, pp. 321-325. [28] J.L. Atangana, B. Carlsson, and M. Boldt, ―Countering Privacy-Invasive Software ( PIS ) by End User License Agreement Analysis Arvind Dathathri,‖ Science, 2007. [29] JoeBox. http://www.joebox.ch/index.php. Last accessed 15 Dec, 2010. [30] K. Rieck, T. Holz, C. Willems, P. D, and P. Laskov, ―Learning and Classification of Malware Behavior,‖ Learning, 2008, pp. 108-125. [31] Kevin Lawton et al. Bochs. http://bochs.sourceforge.net/, 2005. Last accessed 15 Dec, 2010. [32] Litterbox. www.wiul.org. Last accessed 15 Dec, 2010. [33] M. Boldt and B. Carlsson, ―Privacy-Invasive Software and Preventive Mechanisms,‖ 2006 International Conference on Systems and Networks Communication (ICSNC‟06), vol. 00, Oct. 2006, pp. 21-21. [34] M. Boldt, A. Jacobsson, N. Lavesson, and P. 
Davidsson, ―Automated Spyware Detection Using End User License Agreements,‖ 2008 International Conference on Information Security and Assurance (isa 2008), Apr. 2008, pp. 445-452. [35] M. Christodorescu and S. Jha, ―Testing Malware Detectors", in the proceedings of the 2004 ACM SIGSOFT international symposium on Software testing and analysis, 2004.

[36] M. Cova, C. Leita, O. Thonnard, A. Keromytis, and M. Dacier, "Gone Rogue: An Analysis of Rogue Security Software Campaigns," 2009 European Conference on Computer Network Defense, Nov. 2009, pp. 1-3.
[37] M.-W. Wu, Y.-M. Wang, S.-Y. Kuo, and Y. Huang, "Self-Healing Spyware: Detection, and Remediation," IEEE Transactions on Reliability, vol. 56, Dec. 2007, pp. 588-596.
[38] M.-wei Wu, Y.-min Wang, S.-yen Kuo, and F. Park, "A Stateful Approach to Spyware Detection and Removal Survivability Spyware:," Computing, 2006.
[39] Mendel Rosenblum. The reincarnation of virtual machines. Queue, 2(5):34–40, 2004.
[40] Microsoft Installer (MSI). http://www.symantec.com/connect/articles/understanding-difference-between-exe-and-msi. Last accessed 15 Dec, 2010.
[41] Microsoft Virtual PC. http://www.microsoft.com/windows/virtual-pc/features/default.aspx. Last accessed 15 Dec, 2010.
[42] Microsoft Virtual PC. http://www.microsoft.com/windows/virtualpc/, 2005.
[43] Mingming, Wang, and Wang Jiangqing. "Realization on Intelligent GUI Automation Testing Based-on .NET." Human Factors (2010): 14-17.
[44] Norman SandBox. http://www.norman.com/security_center/security_tools/en. Last accessed 15 Dec, 2010.
[45] Oracle Virtual Box. http://www.virtualbox.org/. Last accessed 15 Dec, 2010.
[46] Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, and Andrew Warfield. Xen and the art of virtualization. In SOSP '03: Proceedings of the nineteenth ACM symposium on Operating systems principles, pages 164–177, New York, NY, USA, 2003. ACM Press.
[47] Pearpc. http://pearpc.sourceforge.net/, 2005. Last accessed 15 Dec, 2010.
[48] Qaliber. http://www.qaliber.net/testbuilder.php. Last accessed 15 Dec, 2010.
[49] SupermonX. http://sourceforge.net/projects/supermonx/. Last accessed 15 Dec, 2010.
[50] SysAnalyzer. http://labs.idefense.com/software/malcode.php. Last accessed 15 Dec, 2010.
[51] Systracer. http://www.blueproject.ro/systracer. Last accessed 15 Dec, 2010.
[52] T. Larsson, N. Lind, and M. Boldt, "Blocking Privacy-Invasive Software Using a Specialized Reputation System," Science, 2007.
[53] T.D. Cook and D.T. Campbell, Quasi-Experimentation – Design and Analysis Issues for Field Settings, Houghton Mifflin Company, 1979.

[54] T.F. Stafford and R. Poston, "Online Security Threats and Computer User Intentions," Computer, vol. 43, Jan. 2010, pp. 58-64.
[55] T.-yen Wang, S.-jinn Horng, M.-yang Su, and C.-hsiung Wu, "A Surveillance Spyware Detection System Based on Data Mining Methods," 2006 IEEE International Conference on Evolutionary Computation, 2006, pp. 3236-3241.
[56] The Art of Computer Virus Research And Defense (Feb 2005).
[57] Thompson, R. 2005. Why spyware poses multiple threats to security. Commun. ACM 48, 8 (Aug. 2005), 41-43. DOI= http://doi.acm.org/10.1145/1076211.1076237.
[58] Threat Expert. http://www.threatexpert.com/overview.aspx. Last accessed 15 Dec, 2010.
[59] U. Bayer, I. Habibi, D. Balzarotti, E. Kirda, and C. Kruegel, "A View on Current Malware Behaviors."
[60] U. Bayer, P. Milani Comparetti, C. Hlauschek, C. Kruegel, and E. Kirda. Scalable, Behavior-Based Malware Clustering. In Symposium on Network and Distributed System Security (NDSS), 2009.
[61] UI Automation Framework. http://msdn.microsoft.com/en-us/library/ms747327.aspx. Last accessed 15 Dec, 2010.
[62] Virtual Box Guest Addition. http://www.virtualbox.org/wiki/VirtualBox. Last accessed 15 Dec, 2010.
[63] VM Ware VSphere. http://www.vmware.com/products/vsphere/. Last accessed 15 Dec, 2010.
[64] Vmware. http://www.vmware.com/, 2005. Last accessed 15 Dec, 2010.
[65] Vigdis By Kampenes, Tore Dybå, Jo E. Hannay, Dag I. K. Sjøberg, A systematic review of quasi-experiments in software engineering, Information and Software Technology, v.51 n.1, p.71-82, January, 2009.
[66] White L and Almezen H, Generating Test Cases for GUI Responsibilities Using Complete Interaction Sequence, Int. Symp. on Software Reliability Engineering, San Jose CA. 2000-10: 110-121.
[67] Windows Presentation Foundation (WPF). http://msdn.microsoft.com/en-us/library/ms750441.aspx. Last accessed 15 Dec, 2010.
[68] WinDump. http://www.winpcap.org/windump/. Last accessed 15 Dec, 2010.
[69] Wikipedia. http://en.wikipedia.org/wiki/Constructive_research, 2010.
[70] ZeroWine. http://zerowine.sourceforge.net/. Last accessed 15 Dec, 2010.

12 APPENDIX

Appendix A: Terminologies and Definitions

Virus

"A computer virus is code that recursively replicates a possibly evolved copy of itself. Viruses infect a host file or system area, or they simply modify a reference to objects to take control and then multiply again to form new generation." [56]. In [17], a virus is defined as: "A virus is a self-replicating piece of code that attaches itself to other programs and usually requires human interaction to propagate."

EULA

An End User License Agreement is the agreement that a user must accept before installing software.

Binary

A Portable Executable file, sometimes simply called a binary.

PIS

Privacy-Invasive Software can be any software ranging from legitimate to malicious software.

Malware

Short for "malicious software": "A common name for all kind of unwanted software such as viruses, worms, Trojan and jokes" [21].

Worms

Worms are network viruses that replicate primarily over networks. A worm usually executes itself automatically on a remote machine without any help from a user. Worms are typically standalone applications without a host program. Another definition states: "A worm is a self-replicating piece of code that spreads via networks and usually doesn't require human interaction to propagate." [17]

Trojan horse

"A trojan horse is a program that appears to have some useful or benign purpose, but really masks some hidden malicious functionality." [17]

Adware

The term adware combines "advertising" and "software", i.e. a program that displays advertisements during software installation or by other means. Adware can be defined as [7, 22]: "Any program that causes advertising content to be displayed."

Spyware

Any program that tracks user activity, steals information and sends it to an attacker is known as spyware. Steve Gibson defines it as follows [23]: "Spyware is any software which employs a user's Internet connection in the background (the so-called 'backchannel') without their knowledge or explicit permission."

Exe Packer

MSI installers often still come with an EXE (Setup.exe), the so-called 'bootstrapper' [18]. It does not perform the installation itself; it only checks whether the correct version of Windows Installer is present on the system. If it is not, the bootstrapper launches the MSI Redistributable (MsiInstA.exe or MSiInstW.exe, depending on the platform) and then launches MSIEXEC.EXE on the MSI file. In some installers the MSI and the MSI Redistributable are packed inside the EXE file, so they are not visible to the user.

Typical installations can come in three flavors:

1. A custom, third-party installation packed with other installer makers
2. A Windows Installer installation in an MSI file
3. An EXE file that bootstraps an MSI file

MSI files can only be installations, whereas EXE files can be anything that can run on your computer [18].

MSI

MSI files are database files used by Windows Installer. They contain information about an application, organized into features and components; every component may contain files, registry entries, shortcuts, DLLs, etc. An MSI file also contains the UI used during installation and the custom actions executed to complete it. MSI files are executed by MSIEXEC.EXE, which is included in Windows by default; it reads the data in the MSI file and executes it accordingly. Other packed executables cannot be run from the command line through msiexec.exe [40].
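The distinction between MSI and EXE installers can be illustrated with a minimal sketch that picks a launch command from the file extension. The helper name `build_install_command` is hypothetical and is not part of AMA; `/i` (install) and `/qn` (no user interface) are standard msiexec.exe options. The function only builds the command line and does not start anything.

```python
from pathlib import PureWindowsPath
from typing import List


def build_install_command(sample_path: str, silent: bool = True) -> List[str]:
    """Return the command line used to start an installer sample.

    Hypothetical helper for illustration only.
    """
    suffix = PureWindowsPath(sample_path).suffix.lower()
    if suffix == ".msi":
        # An MSI database is not executable itself: msiexec.exe interprets it.
        command = ["msiexec.exe", "/i", sample_path]
        if silent:
            command.append("/qn")  # suppress the installer UI
        return command
    if suffix == ".exe":
        # A custom installer or an MSI bootstrapper is started directly.
        return [sample_path]
    raise ValueError(f"unsupported installer type: {suffix!r}")
```

For example, `build_install_command(r"C:\samples\setup.msi")` yields a msiexec.exe invocation, while an EXE path is returned unchanged as a one-element command.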

WPF

Windows Presentation Foundation (WPF) is a next-generation presentation system developed by Microsoft for building Windows client applications with a visually rich user experience. WPF provides access to user interface elements (UI elements), which makes it possible to automate already built applications. By gaining control over the UI, an application can be driven without user interaction [67].

UI Automation Framework

Microsoft UI Automation is the new accessibility framework for Microsoft Windows, available on all operating systems that support Windows Presentation Foundation (WPF). UI Automation provides programmatic access to most user interface (UI) elements on the desktop, enabling assistive technology products such as screen readers to provide information about the UI to end users and to manipulate the UI by means other than standard input. UI Automation also allows automated test scripts to interact with the UI [61].

Snapshot

The term snapshot refers to capturing an image or dump of the system state. We take a complete snapshot of the file system, registry, and kernel services. Taking snapshots avoids the large amount of noise data generated during real-time monitoring; with so much unwanted data there is a considerable chance of missing an important behavioral trace. The snapshot technique is better in the sense that it dumps the complete current state of the operating system. By taking one snapshot before and one after a software installation, we can compare the two for changes that occurred during that interval. The comparison gives an inside view of all changes made to the file system, registry, and kernel services, of newly started processes or processes killed by malicious software, and of the open ports of the system together with the status of active connections.
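The before/after comparison can be sketched as a set difference over two snapshots. This is a minimal illustration, assuming each snapshot is a dictionary that maps an item (file path, registry key, service name) to some fingerprint such as a timestamp or hash; the function name `diff_snapshots`, the fingerprint values and the path `c:\temp\old.tmp` are hypothetical and do not describe how AMA stores its snapshots.

```python
from typing import Dict, List


def diff_snapshots(before: Dict[str, str], after: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two snapshots that map an item to a fingerprint."""
    # Present only afterwards: created during the interval.
    created = sorted(set(after) - set(before))
    # Present only beforehand: deleted during the interval.
    deleted = sorted(set(before) - set(after))
    # Present in both, but with a different fingerprint: modified.
    modified = sorted(
        item for item in set(before) & set(after) if before[item] != after[item]
    )
    return {"created": created, "deleted": deleted, "modified": modified}


before = {
    r"c:\windows\system32\config\software.LOG": "v1",
    r"c:\temp\old.tmp": "v1",
}
after = {
    r"c:\windows\system32\config\software.LOG": "v2",
    r"c:\windows\Prefetch\ZANGOSADF.EXE-0BD71310.pf": "v1",
}
changes = diff_snapshots(before, after)
```

The three result lists correspond to the CREATED, DELETED and MODIFIED sections of a report; the same function applies unchanged to registry keys, services or processes.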


12.1 Appendix B

Network Connection Result from Norman:

[Network services]
* Connects to "te.zango.com" on port 80 (TCP).
* Opens URL: te.zango.com/trackedevent.aspx?ver=0.0.0.0&pkg_ver=&rnd=13.

Network Traffic captured by AMA:
02:34:46.618081 IP xx.xx.se.65374 > xx.xx.se.53: 289+ A? te.zango.com. (30)
02:34:46.850180 IP xx.xx.se.1050 > te.zango.com.80: S 2188354373:2188354373(0) win 64240
02:34:46.868143 IP xx.xx.se.53 > xx.xx.se.59419: 33989 NXDomain 0/1/0 (116)
02:34:47.045180 IP te.zango.com.80 > xx.xx.se.1050: S 20544001:20544001(0) ack 2188354374 win 65535
02:34:47.045258 IP xx.xx.se.1050 > te.zango.com.80: . ack 1 win 64240
02:34:47.045830 IP xx.xx.se.1050 > te.zango.com.80: P 1:587(586) ack 1 win 64240
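To show how the captured lines relate to the Norman summary ("Connects to ... on port 80"), the following minimal sketch extracts DNS lookups and outbound TCP connection attempts from WinDump text output. It assumes the line layout seen in the capture above; the function name `summarize_capture` and the regular expression are illustrative and are not taken from AMA.

```python
import re
from typing import List, Tuple

# One text line: "<time> IP <src>.<port> > <dst>.<port>: <rest>"
LINE_RE = re.compile(r"^(\S+) IP (\S+)\.(\d+) > (\S+)\.(\d+): (.*)$")


def summarize_capture(lines: List[str]) -> Tuple[List[str], List[Tuple[str, int]]]:
    """Return (DNS names queried, outbound TCP connection attempts)."""
    dns_names: List[str] = []
    connections: List[Tuple[str, int]] = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        _time, _src, _sport, dst, dport, rest = match.groups()
        query = re.search(r"\bA\? (\S+)", rest)
        if query:
            # "A? te.zango.com." is a DNS lookup for an address record.
            dns_names.append(query.group(1).rstrip("."))
        elif rest.startswith("S ") and " ack " not in rest:
            # A SYN without ACK is the first packet of a new TCP connection.
            connections.append((dst, int(dport)))
    return dns_names, connections


capture = [
    "02:34:46.618081 IP xx.xx.se.65374 > xx.xx.se.53: 289+ A? te.zango.com. (30)",
    "02:34:46.850180 IP xx.xx.se.1050 > te.zango.com.80: S 2188354373:2188354373(0) win 64240",
    "02:34:47.045180 IP te.zango.com.80 > xx.xx.se.1050: S 20544001:20544001(0) ack 2188354374 win 65535",
    "02:34:47.045258 IP xx.xx.se.1050 > te.zango.com.80: . ack 1 win 64240",
]
dns_names, connections = summarize_capture(capture)
```

On these four lines the sketch reports one DNS lookup for te.zango.com and one connection attempt to te.zango.com on port 80, which matches the Norman result; the SYN/ACK reply and the plain ACK are ignored because they do not open a new connection.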

12.1.1 Observed Behavior of Malware

Observed Behavior                           Percentage of Samples
Modifying a file                            79.87%
Modifying a registry value                  74.59%
Creating a file                             70.78%
Creating a registry key                     64.71%
Network Traffic                             55.18%
Creating a process                          52.19%
Deleting a file                             42.57%
Display a GUI window                        33.26%
Installation of a Windows service           12.12%
Installation of a Windows kernel driver      3.34%
Modifying the hosts file                     1.97%
Installation of an IE BHO                    1.72%
Writing to stdout                            1.09%
Writing to stderr                            0.78%
Installation of an IE Toolbar                0.07%

Table 8: Observed Common Behavior by Ulrich [30]


12.1.2 Top 10 Auto-start locations

Auto-start Location                                                               Percentage of Samples
HKLM\System\Currentcontrolset\Services\%\Imagepath                                17.53%
HKLM\Software\Microsoft\Windows\Currentversion\Run%                               16.00%
HKLM\Software\Microsoft\Active Setup\Installed Components%                         2.50%
HKLM\Software\Microsoft\Windows\Currentversion\Explorer\BrowserHelper Objects%     1.72%
HKLM\Software\Microsoft\Windows\Currentversion\Runonce%                            1.60%
HKLM\Software\Microsoft\Windows\Currentversion\Explorer\Shellexecutehooks%         1.30%
HKLM\Software\Microsoft\Windows NT\Currentversion\Windows\Appinit Dlls             1.09%
HKLM\Software\Microsoft\Windows NT\Currentversion\Winlogon\Notify%                 1.04%
HKLM\Software\Microsoft\Windows\Currentversion\Policies\Explorer\Run%              0.67%
C:\Documents and Settings\%\Start Menu\Programs\Startup\%                          0.20%

Table 9: Top 10 Auto-start locations [59]
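A list of changed registry keys can be checked against such locations with a small matcher. The sketch below assumes that "%" in the table stands for "any sequence of characters" and that matching is case-insensitive; both are assumptions, as are the function names and the sample keys (`Updater`, `Foo`). Only three of the ten locations are included for brevity.

```python
import re
from typing import List

AUTOSTART_LOCATIONS = [
    r"HKLM\System\Currentcontrolset\Services\%\Imagepath",
    r"HKLM\Software\Microsoft\Windows\Currentversion\Run%",
    r"HKLM\Software\Microsoft\Windows\Currentversion\Runonce%",
]


def compile_location(location: str) -> re.Pattern:
    """Turn a table entry into a regex, treating '%' as 'any characters'."""
    parts = [re.escape(part) for part in location.split("%")]
    return re.compile("^" + ".*".join(parts) + "$", re.IGNORECASE)


def autostart_hits(changed_keys: List[str]) -> List[str]:
    """Return the changed keys that fall under a known auto-start location."""
    patterns = [compile_location(loc) for loc in AUTOSTART_LOCATIONS]
    return [key for key in changed_keys if any(p.match(key) for p in patterns)]


keys = [
    r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Run\Updater",
    r"HKLM\Software\Zango\Zango\PI",
    r"HKLM\SYSTEM\CurrentControlSet\Services\Foo\ImagePath",
]
hits = autostart_hits(keys)
```

Keys reported with the full hive name (for example \\HKEY_LOCAL_MACHINE\...) would first have to be normalized to the HKLM form used in the table; that step is left out here.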

12.1.3 Trigger Conditions

Trigger                 Condition
Process Creation        Single or multiple new processes are launched, excluding known processes.
File System Activity    A file is created or modified, excluding those in "safe" folders.
Registry Activity       Sensitive registry entries are modified, such as those launched on reboot, or a critical registry value is modified to grant privileged access.
Network Activity        Any process tries to connect to an outside host or website.

Table 10: Trigger Conditions [5]
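The four conditions in Table 10 can be read as simple predicates over an analysis report. The following is a minimal sketch under stated assumptions: the report is a dictionary of lists, and the caller supplies the known processes, the "safe" folders and the sensitive registry prefixes in lower case. The function name `fired_triggers`, the dictionary keys and the example lists are hypothetical and are not taken from [5] or from AMA.

```python
from typing import Dict, List, Set


def fired_triggers(
    report: Dict[str, List[str]],
    known_processes: Set[str],
    safe_folders: List[str],
    sensitive_keys: List[str],
) -> List[str]:
    """Return the names of the trigger conditions met by one report."""
    fired = []
    # Process Creation: any new process that is not on the known list.
    if any(p.lower() not in known_processes for p in report.get("processes_created", [])):
        fired.append("Process Creation")
    # File System Activity: any created or modified file outside safe folders.
    touched = report.get("files_created", []) + report.get("files_modified", [])
    if any(not f.lower().startswith(tuple(safe_folders)) for f in touched):
        fired.append("File System Activity")
    # Registry Activity: any modified key under a sensitive prefix.
    if any(
        k.lower().startswith(tuple(sensitive_keys))
        for k in report.get("registry_modified", [])
    ):
        fired.append("Registry Activity")
    # Network Activity: any contact with an outside host.
    if report.get("remote_hosts"):
        fired.append("Network Activity")
    return fired


report = {
    "processes_created": ["ZangoSA.exe"],
    "files_created": [r"c:\windows\Prefetch\SRV.EXE-39DB67EF.pf"],
    "registry_modified": [r"HKEY_LOCAL_MACHINE\software\Microsoft\Cryptography\RNG"],
    "remote_hosts": ["te.zango.com"],
}
fired = fired_triggers(
    report,
    known_processes={"explorer.exe", "svchost.exe"},
    safe_folders=[r"c:\windows\prefetch"],
    sensitive_keys=[r"hkey_local_machine\software\microsoft\windows\currentversion\run"],
)
```

Which folders count as safe and which keys count as sensitive is a policy decision; the values above are placeholders chosen only to exercise each branch.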

12.1.4 Illegitimate Signature from Anubis and Norman

Illegitimate Group

Software                        Anubis                                                    Norman
NetPumper-1.50-setup            Trojan.Win32.C4DLMedia (Sig-Id: 31614184)                 W32/Suspicious_Gen2.DJLJV
NewsReaderSetup_crawler         No report generated                                       NO_MALWARE NO_VIRUS
RLSetup_Adsupported             No Threat Found                                           Setup file corrupted
Sdasetup                        No report generated                                       Setup file corrupted
Setup_hotbar                    not-a-virus: WebToolbar.Win32.Zango (Sig-Id: 1314766)     NO_MALWARE NO_VIRUS
SPYWARE_TREND_SpywareGuide_ALPluginIE-1.0.2.3-setup_anti_leech.com
                                No Threat Found                                           W32/Banker.FLEX
Srv                             not-a-virus: AdWare.Win32.HotBar (Sig-Id: 28190094)       W32/180Solutions.AGP
stop-sign_install               No report generated                                       NO_MALWARE NO_VIRUS
Weather                         AdWare.Win32.HotBar (Sig-Id: 919846)                      W32/180Solutions.AGQ
WebExtractProfessional          No Threat Found                                           NO_MALWARE NO_VIRUS
ZangoSA                         adWare.Zango.C (Sig-Id: 340064)                           W32/Malware.GEWD
ZangoSADF                       No report generated                                       W32/180Solutions.AHQ
Abcscrabblewe                   not-a-virus: AdWare.Win32.WebHancer (Sig-Id: 11103017)    Suspicious_Gen2.HCTC
acezJukebox                     Wise_Installer vna SN:1361                                NO_MALWARE NO_VIRUS
capex_captureexpress            No Threat Found                                           NO_MALWARE NO_VIRUS
cfree4_1_pro_setup              No Threat Found                                           Setup file corrupted
FiestaBarInstall_Spywareguide   No Threat Found                                           NO_MALWARE NO_VIRUS
flash_decompiler_eltima         No report generated                                       Setup file corrupted
gdsol_goodsol                   No report generated                                       Setup file corrupted
kazaa_setup                     No Threat Found                                           Error Starting Program
Minibuginstaller                Wise_Installer vna SN:1543; Wise_Installer vna SN:1361    NO_MALWARE NO_VIRUS
3dspringblossoms_2066           No Threat Found                                           NO_MALWARE NO_VIRUS
Em2                             Trojan.Win32.Genome (Sig-Id: 15429377)                    Suspicious_Gen2.ACLQC
OEAddOn                         Application.Win32.AdWare.Hotbar (Sig-Id: 221510)          W32/HotBar.ALL.
SnackMan_enbrowser              Trojan-Downloader.Win32.VB (Sig-Id: 10796444)             W32/DLoader.SUG

Table 11: Illegitimate Signature from Anubis and Norman

12.1.5 AMA Report Sample

AMA ANALYSIS REPORT 1/3/2011 2:39:44 AM

File Name: ZangoSA.exe
File Type: .exe
File size: 745 KB
MD5: 23B2B86E0F9A8CF3D5F7B22C4BBCE246
SHA-1: 8906632B7E0FF848A121D6F14697B10A4CB16E2A

------FILES CREATED

c:\windows\Prefetch\INSTALL.EXE-03DE807A.pf
c:\windows\Prefetch\INSTALL.EXE-1ADFD7D2.pf
c:\windows\Prefetch\VCREDIST_X86.EXE-02095225.pf
c:\windows\Prefetch\VCREDIST_X86.EXE-0832A9DF.pf
c:\windows\Prefetch\AMAGUEST.EXE-02BC11CA.pf
c:\windows\Prefetch\DWWIN.EXE-30875ADC.pf
c:\windows\Prefetch\FPORT.EXE-21617D85.pf
c:\windows\Prefetch\NETSTAT.EXE-2B2B4428.pf
c:\windows\Prefetch\PSLIST.EXE-20338648.pf
c:\windows\Prefetch\RUNDLL32.EXE-16224446.pf
c:\windows\Prefetch\SRV.EXE-39DB67EF.pf
c:\windows\Prefetch\SUPERMONX.EXE-26A6C969.pf
c:\windows\Prefetch\SUPERMONX.EXE-2BED928B.pf
c:\windows\Prefetch\TASKMGR.EXE-20256C55.pf
c:\windows\Prefetch\ZANGOSADF.EXE-0BD71310.pf

------FILES DELETED

------FILES MODIFIED

c:\windows\WinSxS\
c:\windows\WinSxS\Manifests\
c:\windows\WinSxS\Policies\x86_policy.9.0.Microsoft.VC90.ATL_1fc8b3b9a1e18e3b_x-ww_9e7eb501\
c:\windows\WinSxS\Policies\x86_policy.9.0.Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_x-ww_b7353f75\
c:\windows\WinSxS\Policies\x86_policy.9.0.Microsoft.VC90.MFC_1fc8b3b9a1e18e3b_x-ww_4ee8bb30\
c:\windows\WinSxS\Policies\x86_policy.9.0.Microsoft.VC90.MFCLOC_1fc8b3b9a1e18e3b_x-ww_b8438ace\
c:\windows\WinSxS\Policies\x86_policy.9.0.Microsoft.VC90.OpenMP_1fc8b3b9a1e18e3b_x-ww_6ad67377\
c:\windows\WinSxS\x86_Microsoft.VC90.ATL_1fc8b3b9a1e18e3b_9.0.30729.1_x-ww_d01483b2\
c:\windows\WinSxS\x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.30729.1_x-ww_6f74963e\
c:\windows\WinSxS\x86_Microsoft.VC90.MFC_1fc8b3b9a1e18e3b_9.0.30729.1_x-ww_405b0943\
c:\windows\WinSxS\x86_Microsoft.VC90.MFCLOC_1fc8b3b9a1e18e3b_9.0.30729.1_x-ww_b0db7d03\
c:\windows\WinSxS\x86_Microsoft.VC90.OpenMP_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_ecc42bd1\
c:\program files\Microsoft.NET\RedistList\
c:\program files\Online Services\
c:\program files\Windows Media Player\Sample Playlists\
c:\windows\assembly\NativeImages_v4.0.30319_32\Temp\
c:\windows\Help\Tours\WindowsMediaPlayer\
c:\windows\Help\Tours\WindowsMediaPlayer\Audio\
c:\windows\Help\Tours\WindowsMediaPlayer\Img\
c:\windows\pchealth\helpctr\Config\CheckPoint\
c:\windows\pchealth\helpctr\Logs\
c:\windows\Prefetch\
c:\windows\Prefetch\NOTEPAD.EXE-336351A9.pf
c:\windows\repair\
c:\windows\security\logs\
c:\windows\SoftwareDistribution\DataStore\Logs\
c:\windows\system32\config\software.LOG
c:\windows\system32\config\system.LOG
c:\windows\system32\config\systemprofile\Local Settings\Application Data\Microsoft\Windows Media\9.0\
c:\windows\system32\ias\
c:\windows\system32\MsDtc\Trace\
c:\windows\system32\oobe\html\
c:\windows\system32\spool\
c:\windows\Web\Wallpaper\

------REGISTRY CREATED

\\HKEY_CURRENT_USER\software\Microsoft\Windows\ShellNoRoam\MUICache\\C:\Documents and Settings\test\Desktop\Srv.exe
\\HKEY_CURRENT_USER\software\Microsoft\Windows\ShellNoRoam\MUICache\\C:\Documents and Settings\test\Desktop\ZangoSADF.exe
\\HKEY_CURRENT_USER\software\Microsoft\Windows\ShellNoRoam\MUICache\\C:\Experiment\SupermonX\amaguest.exe
\\HKEY_CURRENT_USER\software\Microsoft\Windows\ShellNoRoam\MUICache\\C:\Experiment\SupermonX\netstatReport.bat
\\HKEY_CURRENT_USER\software\Microsoft\Windows\ShellNoRoam\MUICache\\C:\Experiment\SupermonX\PsList.exe
\\HKEY_CURRENT_USER\software\Microsoft\Windows\ShellNoRoam\MUICache\\C:\Experiment\SupermonX\report.bat
\\HKEY_CURRENT_USER\software\Microsoft\Windows\ShellNoRoam\MUICache\\C:\WINDOWS\system32\shell32.dll
\\HKEY_CURRENT_USER\software\Microsoft\Windows\ShellNoRoam\MUICache\\C:\WINDOWS\system32\taskmgr.exe
\\HKEY_CURRENT_USER\software\Sysinternals\
\\HKEY_CURRENT_USER\software\Sysinternals\PsList\
\\HKEY_CURRENT_USER\software\Sysinternals\PsList\\EulaAccepted
\\HKEY_CURRENT_USER\software\Zango\
\\HKEY_CURRENT_USER\software\Zango\HostOI\
\\HKEY_CURRENT_USER\software\Zango\HostOI\Updates\
\\HKEY_CURRENT_USER\software\Zango\HostOI\Updates\\CRA
\\HKEY_CURRENT_USER\software\Zango\HostOI\Updates\\LastBStatDynamicReport
\\HKEY_CURRENT_USER\software\Zango\HostOI\Updates\\LastDayReportDW
\\HKEY_CURRENT_USER\software\Zango\HostOL\
\\HKEY_CURRENT_USER\software\Zango\HostOL\Updates\
\\HKEY_CURRENT_USER\software\Zango\HostOL\Updates\\CRA
\\HKEY_CURRENT_USER\software\Zango\HostOL\Updates\\LastBStatDynamicReport
\\HKEY_CURRENT_USER\software\Zango\HostOL\Updates\\LastDayReportDW
\\HKEY_CURRENT_USER\software\Zango\Zango\
\\HKEY_CURRENT_USER\software\Zango\Zango\Updates\
\\HKEY_CURRENT_USER\software\Zango\Zango\Updates\\CRA
\\HKEY_CURRENT_USER\software\Zango\Zango\Updates\\LastBStatDynamicReport
\\HKEY_CURRENT_USER\software\Zango\Zango\Updates\\LastDayReportDW
\\HKEY_LOCAL_MACHINE\software\Microsoft\RFC1156Agent\
\\HKEY_LOCAL_MACHINE\software\Microsoft\RFC1156Agent\CurrentVersion\
\\HKEY_LOCAL_MACHINE\software\Microsoft\RFC1156Agent\CurrentVersion\Parameters\
\\HKEY_LOCAL_MACHINE\software\Microsoft\RFC1156Agent\CurrentVersion\Parameters\\TrapPollTimeMilliSecs
\\HKEY_LOCAL_MACHINE\software\Zango\
\\HKEY_LOCAL_MACHINE\software\Zango\Zango\
\\HKEY_LOCAL_MACHINE\software\Zango\Zango\PI\
\\HKEY_LOCAL_MACHINE\software\Zango\Zango\PI\3.2\
\\HKEY_LOCAL_MACHINE\software\Zango\Zango\PI\3.2\\PID00

------REGISTRY DELETED

------REGISTRY MODIFIED

Generated by SupermonX.exe Version 0.40

\\HKEY_CURRENT_USER\software\
\\HKEY_CURRENT_USER\software\Microsoft\Windows\CurrentVersion\Explorer\BitBucket\c\
\\HKEY_CURRENT_USER\software\Microsoft\Windows\CurrentVersion\Explorer\CLSID\{645FF040-5081-101B-9F08-00AA002F954E}\DefaultIcon\
\\HKEY_CURRENT_USER\software\Microsoft\Windows\CurrentVersion\Explorer\CLSID\{645FF040-5081-101B-9F08-00AA002F954E}\DefaultIcon\\
\\HKEY_CURRENT_USER\software\Microsoft\Windows\CurrentVersion\Explorer\ComDlg32\LastVisitedMRU\
\\HKEY_CURRENT_USER\software\Microsoft\Windows\CurrentVersion\Explorer\ComDlg32\LastVisitedMRU\\a
\\HKEY_CURRENT_USER\software\Microsoft\Windows\CurrentVersion\Explorer\ComDlg32\OpenSaveMRU\*\
\\HKEY_CURRENT_USER\software\Microsoft\Windows\CurrentVersion\Explorer\ComDlg32\OpenSaveMRU\*\\MRUList
\\HKEY_CURRENT_USER\software\Microsoft\Windows\CurrentVersion\Explorer\ComDlg32\OpenSaveMRU\bat\
\\HKEY_CURRENT_USER\software\Microsoft\Windows\CurrentVersion\Explorer\ComDlg32\OpenSaveMRU\bat\\MRUList
\\HKEY_CURRENT_USER\software\Microsoft\Windows\CurrentVersion\Explorer\Discardable\PostSetup\
\\HKEY_LOCAL_MACHINE\software\
\\HKEY_LOCAL_MACHINE\software\Microsoft\
\\HKEY_LOCAL_MACHINE\software\Microsoft\Cryptography\RNG\
\\HKEY_LOCAL_MACHINE\software\Microsoft\Cryptography\RNG\\Seed
\\HKEY_LOCAL_MACHINE\software\Microsoft\Windows NT\CurrentVersion\ProfileList\S-1-5-21-1960408961-1078145449-1957994488-1003\
\\HKEY_LOCAL_MACHINE\software\Microsoft\Windows\CurrentVersion\Group Policy\
\\HKEY_LOCAL_MACHINE\software\Microsoft\Windows\CurrentVersion\Group Policy\State\Machine\Extension-List\{00000000-0000-0000-0000-000000000000}\
\\HKEY_LOCAL_MACHINE\software\Microsoft\Windows\CurrentVersion\Group Policy\State\Machine\Extension-List\{00000000-0000-0000-0000-000000000000}\\EndTimeHi
\\HKEY_LOCAL_MACHINE\software\Microsoft\Windows\CurrentVersion\Group Policy\State\Machine\Extension-List\{00000000-0000-0000-0000-000000000000}\\EndTimeLo
\\HKEY_LOCAL_MACHINE\software\Microsoft\Windows\CurrentVersion\Group Policy\State\Machine\Extension-List\{00000000-0000-0000-0000-000000000000}\\StartTimeHi
\\HKEY_LOCAL_MACHINE\software\Microsoft\Windows\CurrentVersion\Group Policy\State\Machine\Extension-List\{00000000-0000-0000-0000-000000000000}\\StartTimeLo

------SERVICES CREATED

------SERVICES DELETED

------SERVICES MODIFIED

MSIServer Windows Installer WIN32_SHARE_PROCESS RUNNING DEMAND_START C:\WINDOWS\system32\msiexec.exe /V LocalSystem

------PROCESSES CREATED

ZangoSA.exe PID: 168 ParentID: 1508 Time: 1/3/2011 2:35:09 AM

ZangoSA.exe PID: 1228 ParentID: 1508 Time: 1/3/2011 2:35:18 AM

------PROCESSES DELETED

ZangoSA.exe PID: 168 ParentID: 1508 Time: 1/3/2011 2:35:11 AM

ZangoSA.exe PID: 1228 ParentID: 1508 Time: 1/3/2011 2:35:20 AM

------PROCESSES MODIFIED

System Idle Process PID: 0 ParentID: 0 Time: 1/3/2011 2:36:44 AM
svchost.exe PID: 1060 ParentID: 668 Time: 1/3/2011 2:36:44 AM
explorer.exe PID: 1508 ParentID: 1484 Time: 1/3/2011 2:36:44 AM
VBoxTray.exe PID: 1816 ParentID: 1508 Time: 1/3/2011 2:36:44 AM
System PID: 4 ParentID: 0 Time: 1/3/2011 2:36:44 AM
WinDump.exe PID: 400 ParentID: 1056 Time: 1/3/2011 2:36:44 AM
csrss.exe PID: 600 ParentID: 528 Time: 1/3/2011 2:36:44 AM
lsass.exe PID: 680 ParentID: 624 Time: 1/3/2011 2:36:44 AM
wmiprvse.exe PID: 932 ParentID: 880 Time: 1/3/2011 2:36:44 AM
amaguest.exe PID: 964 ParentID: 448 Time: 1/3/2011 2:36:44 AM

------TCP/IP PRCESS TO PORT MAPPED

Pid   Process        Port   Proto  Path
968             ->   135    TCP
4     System    ->   139    TCP
4     System    ->   445    TCP
212             ->   1025   TCP
0     System    ->   1052   TCP

0     System    ->   123    UDP
212             ->   137    UDP
0     System    ->   138    UDP
968             ->   445    UDP
4     System    ->   500    UDP
0     System    ->   1900   UDP
4     System    ->   4500   UDP
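The MD5 and SHA-1 values in the report header identify the analysed sample independently of its file name. As a minimal sketch of how such fingerprints can be computed, the following uses Python's standard hashlib module; the function name `file_fingerprints` is hypothetical, and the thesis does not state which tool AMA uses for this step.

```python
import hashlib
from typing import Dict


def file_fingerprints(path: str, chunk_size: int = 65536) -> Dict[str, str]:
    """Compute the MD5 and SHA-1 digests of a file, reading it in chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large samples need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    # The report prints digests in upper case.
    return {"MD5": md5.hexdigest().upper(), "SHA-1": sha1.hexdigest().upper()}
```

Both digests are computed in a single pass over the file, so the sample is read only once.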
