Applying Dynamic Binary Instrumentation to Light-Weight Malware Behavior Analysis Maksim Shudrak About Me
Total Page:16
File Type:pdf, Size:1020Kb
Tricky Sample ? Hack it easy! Applying dynamic binary instrumentation to light-weight malware behavior analysis Maksim Shudrak About Me Interests BIO Vulnerabilities Hunting 2018 – present: Senior Offensive Security Researcher Fuzzing 2016: Defended PhD (Vulns Hunting) in Tomsk, Russia Reverse-engineering 2015-2017: Researcher, IBM Research, Haifa, Israel Malware Analysis 2011-2015: Security Researcher, PhD student Dynamic Binary Instrumentation Projects Drltrace – transparent API-calls tracing for malware analysis https://github.com/mxmssh/drltrace WinHeap Explorer – PoC for heap-based bugs detection in x86 code https://github.com/WinHeapExplorer/WinHeap-Explorer IDAMetrics – IDA plugin for machine code complexity assessment https://github.com/mxmssh/IDAmetrics Outline ● Why Dynamic Analysis? ● Current approaches ● Runtime Overhead vs Visibility ● Dynamic Binary Instrumentation ● Technique Overview ● DBI Frameworks Comparison ● DrLtrace ● How Does It Work ? ● Usage ● Examples & Demo 3 Why dynamic ? ● Obfuscated & packed code hard for static analysis. ● In some cases, we need only a high- level view on malware behavior. 4 Current Situation in Dynamic Analysis Emulation (e.g. QEMU+PANDA) Library Calls Tracing (e.g. APIMonitor) Visibility Syscalls Tracing (e.g. ProcMon) Runtime overhead 5 Emulation. Visibility Example 6 Runtime Overhead. Stalling Code 7 10 min on CPU = 1d08h in emulator 8 Emulation. Visibility Example 9 Syscalls Tracing. Visibility Example 10 API Tracing. Visibility Example 11 Ltrace for Linux 12 Current Situation in Dynamic Analysis Emulation ? (e.g. QEMU+PANDA) Library Calls Tracing (e.g. APIMonitor) Visibility Syscalls Tracing (e.g. ProcMon) Runtime overhead 13 Current Situation in Dynamic Analysis Emulation (e.g. QEMU+PANDA) Dynamic Binary Instrumentation (e.g. Intel Pin) Library Calls Tracing (e.g. APIMonitor) Visibility Syscalls Tracing (e.g. ProcMon) Runtime overhead 14 Dynamic Binary Instrumentation (DBI) is a technique of analyzing the behavior of a binary application at runtime through the injection of instrumentation code. 15 Modern DBI Frameworks DynamoRIO Intel PIN Redistribution Open-source, BSD – license Proprietary model Supported x86, x86-64, ARM, AArch64 x86, x86-64 architectures Supported Linux, Windows, MacOS, Linux, Windows, MacOS, Platforms Android Android Average 108% (no tool) 130% (no tool) runtime overhead 139% (BBs counter) 162% (BBs counter) Language C/C++ C/C++ (some Python wrappers available) Technology Binary code transformation callout/trampolines 16 How Does DynamoRIO Work ? (10000 foot view) DynamoRIO Application in memory Launcher Target application shared system libs Kernel 17 How Does DynamoRIO Work ? (10000 foot view) DynamoRIO Application in memory (1) Launch (suspended) Launcher Target application shared system libs . Kernel 18 How Does DynamoRIO Work ? (10000 foot view) DynamoRIO Application in memory (1) Launch (suspended) Launcher Target application (2) shared system libs Inject instrumentation library Kernel 19 How Does DynamoRIO Work ? (10000 foot view) DynamoRIO Application in memory (1) Launch (suspended) Launcher Target application shared system libs (3) Hook entry point DynamoRIO lib + user-defined libs Kernel 20 How Does DynamoRIO Work ? (10000 foot view) DynamoRIO Application in memory (1) Launch (suspended) Launcher Target application shared system libs (3) (4) Hook entry point DynamoRIO lib + user-defined libs basic block Take first basic block Take first basic ins1 ins2 ins3 Kernel 21 How Does DynamoRIO Work ? (10000 foot view) DynamoRIO Application in memory (1) Launch (suspended) Launcher Target application shared system libs (3) (4) Hook entry point DynamoRIO lib + user-defined libs Code cache ins1 DR’s ins1 basic block DR’s ins2 (5) ins2 Take first basic block Take first basic ins1 DR’s ins3 ins2 transformation DR’s ins4 ins3 ins3 DR’s ins5 DR’s ins6 Kernel 22 How Does DynamoRIO Work ? (10000 foot view) DynamoRIO Application in memory (1) Launch (suspended) Launcher Target application Take next basic block shared system libs (3) (4) (6) Hook entry point DynamoRIO lib + user-defined libs Code cache ins1 DR’s ins1 basic block DR’s ins2 (5) ins2 Take first basic block Take first basic ins1 DR’s ins3 ins2 transformation DR’s ins4 ins3 ins3 DR’s ins5 DR’s ins6 Kernel 23 How Does Drltrace Work ? Application in memory Main module kernel32.dll ntdll.dll DBI engine dynamorio.dll drltrace.dll 24 How Does Drltrace Work ? Application in memory Main module kernel32.dll ntdll.dll (1) DBI engine Instrument LoadLibrary call dynamorio.dll (1.1) drltrace.dll Setcallback 25 How Does Drltrace Work ? Application in memory Main module (2) LoadLibrary(msvcrt.dll) kernel32.dll ntdll.dll (1) DBI engine Instrument LoadLibrary call dynamorio.dll (1.1) drltrace.dll Setcallback 26 How Does Drltrace Work ? Application in memory Main module (2) LoadLibrary(msvcrt.dll) kernel32.dll ntdll.dll (1) msvcrt.dll (being loaded) DBI engine Instrument LoadLibrary call (3) dynamorio.dll (1.1) drltrace.dll Setcallback Instrument exported functions exported Instrument 27 DynamoRIO and Drltracelib in The Memory of Calculator 28 Usage drltrace.exe –logdir . – malware.exe Why DrLtrace Rock ● Support x86 and x64 (ARM in future). 30 Why DrLtrace Rock ● Support x86 and x64 (ARM in future). ● Support Windows and Linux (macOS in future). 31 Why DrLtrace Rock ● Support x86 and x64 (ARM in future). ● Support Windows and Linux (macOS in future). ● Support self-modifying code. 32 Why DrLtrace Rock ● Support x86 and x64 (ARM in future). ● Support Windows and Linux (macOS in future). ● Support self-modifying code. ● Support all types of library API calls (static and dynamic). 33 Why DrLtrace Rock ● Support x86 and x64 (ARM in future). ● Support Windows and Linux (macOS in future). ● Support self-modifying code. ● Support all types of library API calls (static and dynamic). ● Not-detectable by standard malware anti-research techniques (anti-hooking, anti-debugging and anti- emulation). 34 Why DrLtrace Rock ● Support x86 and x64 (ARM in future). ● Support Windows and Linux (macOS in future). ● Support self-modifying code. ● Support all types of library API calls (static and dynamic). ● Not-detectable by standard malware anti-research techniques (anti-hooking, anti-debugging and anti- emulation). ● External configuration file to add new API calls. 35 Why DrLtrace Rock ● Support x86 and x64 (ARM in future). ● Support Windows and Linux (macOS in future). ● Support self-modifying code. ● Support all types of library API calls (static and dynamic). ● Not-detectable by standard malware anti-research techniques (anti-hooking, anti-debugging and anti- emulation). ● External configuration file to add new API calls. ● Easy-to-use (no additional dependencies, no heavy- weight GUI). 36 Why DrLtrace Rock ● Support x86 and x64 (ARM in future). ● Support Windows and Linux (macOS in future). ● Support self-modifying code. ● Support all types of library API calls (static and dynamic). ● Not-detectable by standard malware anti-research techniques (anti-hooking, anti-debugging and anti- emulation). ● External configuration file to add new API calls. ● Easy-to-use (no additional dependencies, no heavy- weight GUI). ● Open-source (BSD-license). 37 Runtime Overhead 50 44,6 45 43,3 40 37,2 35 32 30 25,6 25 22 20,2 20 16,5 14,7 15 RUNTIME OVERHEAD 10 10 5 0 PowerPoint Word Calculator Internet Explorer VMWare IDA PRO StarCraft II Dropbox Chrome Notepad++ 38 Launcher Example 1. EmbusteBot ● Type - Brazilian Banking Trojan ● Language - Delphi ● Main Functionality – Keylogger, Screenshots capturing ● Obfuscation – time-based anti-research checks, encryption of sensitive strings, no code packing ● Operation period – (2017 – present) Report- https://securityintelligence.com/brazilian-malware-never-sleeps-meet-embustebot/ More details - https://github.com/mxmssh/drltrace/wiki/Malware-Analysis-Examples 39 Example 1. EmbusteBot drltrace.exe -logdir . -print_ret_addr – vdeis.exe 40 Example I. EmbusteBot. Searching for Tab with Bank Name 41 Example I. EmbusteBot. Screenshots Capturing and Keylogging API Calls 42 Example I. EmbusteBot. Trigger 43 Example II. Gootkit Loader ● Type : Banking Trojan ● Language : C ● Main Functionality: Unpack actual payload loader, deliver Gootkit malware on victim’s machine ● Obfuscation: anti-VM, anti-debugging, anti-emulation, time-based anti-research, packed payload, machine- code obfuscation, anti-sandboxing. ● Operation period: (2014 – present) Technical report - https://drive.google.com/file/d/0BzFSoGMCVlTORUExdF9RTklpX3c/view More details - https://github.com/mxmssh/drltrace/wiki/Malware-Analysis-Examples 44 Example II. Gootkit Loader drltrace.exe -logdir . -print_ret_addr -- 477c305~f01.exe 45 Example II. Gootkit Loader. Unpacking 46 Example II. Gootkit Loader. Process Hollowing. New Process Creation 47 Example II. Gootkit Loader. Process Hollowing. New Section 48 Example II. Gootkit Loader. Process Hollowing. Write & Resume 49 Example II. Gootkit Loader 50 Example II. Gootkit Loader 51 Example II. Gootkit Loader 52 DEMO. NotPetya/PetrWrap 53 54 Drltrace. API calls visualization script python api_calls_vis.py -i wannacry.jpeg -gr -t drltrace_log_wannacry.log 55 Drltrace. API calls visualization script python api_calls_vis.py -ht wannacry.html -gr -t drltrace_log_wannacry.log 56 Future Work ● Make DynamoRIO more resistant against anti-DBI tricks. ● Add heuristics to search for certain (YARA rules?) malicious patterns in logs. ● ARM and macOS. ● Attach drltrace into running process 57 Conclusion ● Dynamic binary instrumentation is a reasonable trade-off for dynamic malware analysis. ● Drltrace is the first efficient and light-weight solution for API calls tracing in modern sophisticated malicious samples based on DBI technique. ● The solution allowed to revel in several minutes a lot of internal technical details about malicious sample without even starting IDA or debugger. 58 Thank you! https://github.com/mxmssh/drltrace https://www.linkedin.com/in/mshudrak https://twitter.com/MShudrak .