Framework for Easy Malware Analysis
MASARYK UNIVERSITY
FACULTY OF INFORMATICS

Framework for Easy Malware Analysis

BACHELOR'S THESIS

Radoslava Povalová

Brno, autumn 2014

Declaration

I hereby declare that this paper is my original authorial work, which I have worked out on my own. All sources, references and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Advisor: Mgr. Vít Bukač

Acknowledgement

I would like to thank my supervisor Mgr. Vít Bukač and my consultant RNDr. Václav Lorenc for their continuous support, guidance and valuable feedback, which helped me to write this thesis.

Abstract

The primary purpose of this thesis is to study tools which are used for malware deobfuscation and to implement a web application for deobfuscation and detection of malware. The web application allows the user to set the types of deobfuscation methods, their keys, and their order. It also supports automatic analysis without the need to set an analysis configuration. The application uses the YARA pattern matching system for malware detection.

Keywords

malware, deobfuscation, Python, web application, Flask, cluster, Celery, YARA

Contents

1 Introduction
2 Malware Overview
2.1 General Malware Categories
2.2 Malware Lifecycle
2.3 Examples of Malware Lifecycle
2.3.1 New Malware Sample Example Scenario
2.4 Obfuscation
3 Analysis
3.1 Overview of Existing Tools
3.2 Specification of Requirements
3.2.1 Functional Requirements
3.2.2 Non-functional Requirements
4 System Design
4.1 Available Technologies
4.1.1 Web Framework
4.1.2 Database
4.1.3 User Interface
4.1.4 Parallel Processing
4.1.5 Deobfuscation Heuristics
5 Implementation
5.1 User Interface
5.1.1 Home Page
5.1.2 Administration
5.1.3 Analysis Configuration
5.1.4 Details of a Result
5.2 Application Layer
5.2.1 Flask
5.2.2 Celery
5.2.3 Creation of a New Deobfuscation Analysis
5.3 Database Layer
5.4 Presentation Layer
5.5 Deployment
5.5.1 Application Preparation
5.5.2 Nginx and uWSGI Deployment
5.5.3 Apache and uWSGI
5.5.4 Alternative Deployment
5.6 Extending the Application
5.6.1 Adding New YARA Rules
5.6.2 Changing the Configuration
5.6.3 Adding New Operation
6 Testing
6.1 Unknown Malware
6.2 Kordeef Trojan Horse
6.3 Infected Word Document
7 Conclusions
7.1 Future Improvements
7.2 Conclusion
A Attachments
B Contents of attached ZIP archive

Chapter 1 Introduction

Nowadays it is hard to imagine a world without computers. They are used in banking, medicine, various control and communication systems, business applications, and so on. This IT era gives us many new possibilities and improves efficiency, but it also brings security risks. The best solution to these risks would be to completely prevent intrusions and attacks on systems. In the real world, however, despite various defensive and safety mechanisms, some attacks succeed, a system is infected, and the security incident has to be resolved by analyzing it.

Malware analysis is dissecting the malware to understand how it works, how to identify it, and how to defeat or eliminate it [39, p. 29]. There are two types of analysis techniques.
Dynamic analysis involves launching and debugging an executable file in a controlled and monitored environment so that its effects on a system can be observed and documented [5, p. 287]. Static analysis involves loading the executable into a disassembler and examining the program instructions to discover what the program does. Dynamic analysis is outside the scope of this thesis; I will focus mainly on static analysis.

In static analysis, the disassembled executable is often not readable, because malware authors usually use techniques to mask the code and to hide their targets, for example common encryption, encoding or obfuscation. Therefore, at the beginning of the analysis it is necessary to convert the executable to its decoded form. It is more efficient and comfortable for a security analyst to use an automatic tool for this conversion.

In my work I will implement a tool which a security analyst can use during static analysis to obtain the decoded disassembled executable. The user will have the option to manage the analysis process completely: it will be possible to choose the type, parameters and order of the functions used to decode the given source. In addition to the set of predefined malware detection pattern rules, the tool will also allow the user to define custom pattern rules and to use only those. The tool will be implemented as an intuitive and user-friendly web application. The application will be developed in collaboration with a global Computer Incident Response Team (CIRT) at Honeywell International Inc.

Chapter 2 Malware Overview

2.1 General Malware Categories

Malware is any software that does something that causes harm to a user, computer, or network [39, p. 29]. Software which is not primarily harmful but performs potentially unwanted actions, such as gathering sensitive information, is also often considered malware.
The most common forms of malware are executable programs; others include scripts, applets, plugins, etc. Malicious software can be categorized into the following groups:

• Rootkits modify the operating system so that they are capable of hiding themselves and other system components from users and even the operating system itself [5, p. 309]. They can operate in kernel mode (Ring 0 on MS Windows systems) or userland mode (Ring 3).
• Backdoors have the primary function of creating remote access to the compromised system or network. Most of them are used during direct attacks to ensure access to the victims and for data exfiltration. They are often combined with rootkits.
• Downloaders exist mainly to download other malicious code. They are usually part of a botnet ecosystem that is available as a service to other malware authors, who can pay the botmaster (the owner of a botnet) to spread their malware.
• Spyware collects sensitive information. Collected data often includes information about bank accounts, user behavior, browsing history and user accounts.
• Worms use self-replication to spread infection rapidly. They do not modify files, but create copies of themselves that spread via security vulnerabilities. They may cause some harm, but they are not designed to modify systems.
• Trojan horses infect other executables to spread infection. They usually carry another malware payload and spread by modifying existing files, for example executables, documents or scripts.
• Adware displays unwanted advertisements. It is often classified as a Potentially Unwanted Program (PUP), which is not intended to be malicious. It is often installed as part of a legitimate program.
• Ransomware tries to obtain money by threatening users. Most ransomware displays either a notice purporting to be from the police, demanding a payment to avoid prosecution for illegal activity, or a notice that the user's important files have been encrypted and that the user must pay the malware author to decrypt them.
• Bots connect to a botnet to receive commands from the botmaster. An infected computer becomes a zombie in a large botnet consisting of other infected systems that receive commands from the botmaster. A botnet is usually used for DDoS (Distributed Denial of Service) attacks, sending spam, mining bitcoins, spreading other malware and so on.

It is difficult to categorize malware exactly, because a sample often combines several of the categories mentioned above.

2.2 Malware Lifecycle

A typical malware lifecycle begins with a delivery system that spreads the malware to potential targets, most often in the form of a script on a compromised web site. This is done by an exploit kit, a set of exploits: pieces of software, chunks of data, or sequences of commands that take advantage of a vulnerability or bug in other software to execute instructions intended by the attacker. An exploit can cause unusual behavior in the vulnerable software [40, p. 191]. An exploit kit tries to detect outdated plugins or software on potential targets and exploits them to deliver malware. Another type of delivery is sending phishing emails with a malicious payload, such as an infected PDF or a ZIP archive containing an executable file with a double extension.

Figure 2.1: Malware lifecycle stages [1]

The next stage is to deliver the malware by exploiting the victim and to ensure persistence, so that the malware survives a reboot of the compromised host. After persistence is ensured, the malware begins to perform the actions it has been designed for. The actions differ between malware samples, but usually include establishing a callback (C2 channel) to the CnC server to receive commands, or data exfiltration.

The malware lifecycle can often be different. For example, the delivery system can be omitted because the malware is deployed by another malware or by a stager. The stager is responsible for downloading a large payload (the actual malware), injecting it into memory, and passing execution to it.
By using a stager, it is almost guaranteed that the malware will be deployed on the victim host.

2.3 Examples of Malware Lifecycle

Zeus [30] (Trojan horse) and CryptoLocker [23] (ransomware): A victim browses to a web site infected by the Blackhole exploit kit [21], a well-known exploit kit. When the victim opens the infected web site, the exploit kit is executed (in the form of JavaScript) and detects that the victim has an outdated Java version.