2020 IEEE 20th International Conference on Software Quality, Reliability and Security (QRS)

Non-intrusive Analysis and Reverse Debugging with SWAT

Pavel Dovgalyuk, Ivan Vasiliev, Natalia Fursova, Denis Dmitriev, Mikhail Abakumov, Vladimir Makarov
Institute for System Programming, Moscow, Russia

Abstract: This paper presents SWAT, the System-Wide Analysis Toolkit. It is based on open source emulation and debugging projects and implements approaches for non-intrusive system-wide analysis and debugging: lightweight OS-agnostic virtual machine introspection, full system execution replay, non-intrusive debugging with WinDbg, and full system reverse debugging. These features are based on novel non-intrusive introspection and reverse debugging methods. They are useful for stealth debugging and for analysis of platforms with custom kernels. SWAT includes a multi-platform QEMU with additional instrumentation and debugging features, a GUI for convenient QEMU setup and execution, a QEMU plugin for non-intrusive introspection, and a modified version of GDB. Our toolkit may be useful for developers of virtual platforms, […], and firmware, drivers, and operating systems. The virtual machine introspection approach requires neither guest agents nor the source code of the OS. Therefore it can be applied to ROM-based guest systems and can be combined with record/replay of the system execution. This paper describes the SWAT components, the analysis methods, and some SWAT use cases.

Index Terms: Software instrumentation; Dynamic analysis; Virtual machine; Introspection; QEMU; SWAT

978-1-7281-8913-0/20/$31.00 ©2020 IEEE. DOI 10.1109/QRS51102.2020.00036

I. INTRODUCTION

System-wide analysis and debugging are needed for operating system (OS) development, malware analysis, driver debugging, and so on. Tools that aid in this work use virtual machines to provide isolation and to ease instrumentation and analysis.

The following techniques are the concrete dynamic analysis methods used for virtual machine and user-level debugging.

Reverse debugging is used for examining the past states of a system or program and for deterministically replaying recorded executions. There are some reverse debugging tools for programs (e.g., Mozilla RR [1]).

Existing solutions for system-wide execution record/replay [2], [3] can't be easily obtained, or lack the capability to reverse debug the whole machine with the state of all virtual devices replayed. Therefore drivers and firmware can be observed only as code plus memory, without any information about the current virtual device state.

Most of the virtual machine debugging tools (like GDB) are targeted at Linux. Debugging of Windows software is different, because GDB can't read symbol files for Windows. There is a powerful debugger, WinDbg, provided by Microsoft. But this debugger requires a running debug server within the guest system, which limits its applicability, because analyzed malware can detect the server.

Virtual machine introspection (VMI) is used to extract the structure and behavior of the guest system and programs [4]. Most of the existing VMI methods are intrusive: they require code injection into the guest (modified kernel, introspection agent, and so on) [5] and can't work without build tools inside the VM or when the execution is recorded. Others (like PyREBox [6]) can only work with a limited set of OS versions and builds.

Therefore such methods can't be applied to custom builds of Linux that do not have build tools inside. The WinDbg server and VMI agents also can't be used when the execution is recorded, because they must be running live to provide the data.

Our work aims to overcome the following limitations of state-of-the-art methods:
• No introspection tool for custom Linux kernels. Every existing tool requires either instrumenting the source code or loading a guest agent into the virtual machine. Therefore such tools can't introspect custom Linux kernels that do not provide an SDK for compiling the kernel or a guest agent.
• No stealth/repeatable debugging with support for Windows internals. The Windows SDK includes WinDbg, which can be used for system-wide introspection and debugging. But it can't be used in the deterministic mode of execution, because the guest debugging server must run inside the virtual machine. For the same reason WinDbg can't be used for stealth debugging, because malware may detect the debugging server.
• No convenient tool for reverse debugging of virtual machines (GDB + GUI). Reverse debugging of a virtual machine requires recording and replaying its execution. But none of the existing tools provides a handy interface for replaying executions and configuring the virtual machine devices. Manual command line configuration of hardware, reverse debugging, and introspection is error-prone.

SWAT (System Wide Analysis Toolkit) solves the above problems by providing the following tools and methods:
• An extended lightweight introspection method that supports platform- and OS-agnostic virtual machine API monitoring.
• Modified QEMU, a multi-platform emulator. It includes features that are not available in the vanilla version: reverse debugging, a WinDbg server, an instrumentation layer, and plugin support. These features provide stealth and deterministic debugging and analysis methods.
• An introspection QEMU plugin for virtual machines based on i386, x86_64, ARM, and AArch64 platforms. This plugin supports both Windows and Linux kernels.
• GDB with reverse debugging of improved performance.
• A method for extracting command line parameters from QEMU for visual management of the virtual machine command line configuration.
• qemu-gui, a tool for managing QEMU-based virtual machines with support for execution recording and replaying.

Figure 1: Components of the System Wide Analysis Toolkit (GDB, QEMU, GUI, VMI, WinDbg; recorded execution, VM configuration, syscall and API log). WinDbg is an external tool from the Windows SDK.

II. SYSTEM WIDE ANALYSIS TOOLKIT

SWAT aims to make full system debugging and analysis easier. At its core lies QEMU, a multi-platform emulator [7], which was modified to add debugging and analysis tools (Figure 1).

Execution record/replay is mandatory for reverse debugging, because a program can't run in the backward direction without recording prior execution steps [8]. It is also very helpful for dynamic binary analysis, because the analysis can be decoupled from the execution and won't affect the guest behavior [9].

Recent versions of QEMU include execution record and replay. To make it more useful we added a WinDbg debug server to QEMU [10] and implemented reverse debugging commands for use with GDB [11]. We also extended GDB to improve its reverse debugging capabilities.

Dynamic binary instrumentation is the only option for analysis of both kernel and user-level code in virtual machines [12], [13]. Our modified version supports instrumentation of the guest code and includes a plugin subsystem [5]. We also provide a plugin for non-intrusive virtual machine introspection.

SWAT also includes a graphical utility for managing virtual machine configurations, similar to the ones provided with VirtualBox and VMWare. Our tool has some additional features that allow convenient control of virtual machine recordings to make full system debugging more user-friendly.

III. EXECUTION RECORD AND REPLAY

Virtual machine record and replay is the feature which allows recording a whole machine execution and later replaying it for the sake of debugging or analysis [8].

Modern approaches to record and replay of virtual machine execution are implemented within several emulators (QEMU [8], Simics [3]) and dynamic analysis frameworks (PANDA [2], Crosscut [14]). Another approach to analysis of the recorded execution is collecting detailed traces for analysis instead of replaying system behavior [15].

We used QEMU in our toolkit because it is open source, has high performance and wide cross-ISA support, and can be modified to add reverse debugging and introspection capabilities. Unlike PANDA, it replays the whole emulator behavior (including the video output and the state of the virtual devices), which is useful for convenient debugging and for development and debugging of virtual devices within the emulator.

IV. GUI FOR QEMU

We introduced a graphical utility for virtual machine management. Our aim was to help two categories of users: regular QEMU users that debug virtual machines, and QEMU developers that may have several builds of QEMU to test their virtual machines.

QEMU's command line is very tricky for a regular user. One would often forget or miss parameters, or would not be able to configure QEMU for a specific operation mode like execution record or replay. QEMU developers could use some help too, because they need a convenient way to switch between different builds or versions of the emulator.

Different versions of QEMU also support different virtual devices. We can't hardcode all of them in our tool, therefore we needed to extract the list of supported hardware from the emulator.

We focused our qemu-gui project on providing the following possibilities:
• Managing existing virtual machine configurations.
• Graphical tree view for editing the VM hardware configuration.
• Switching between several QEMU builds.
• Recording and replaying virtual machine executions.
• Embedded QEMU management console window.

A. Retrieving Command Line Parameters from QEMU

We chose not to hardcode all possible QEMU options for configuring virtual machine devices for the following reasons:
• One can use different QEMU builds or versions that have different capabilities. For example, developers can extend QEMU and test its features within different code branches.
• QEMU is a live project, and keeping the supported options synchronized with qemu-gui needs additional effort.

QEMU provides the QEMU machine protocol (QMP), which allows communication between side tools and QEMU with JSON-based messages. This protocol supports introspection of the virtual devices: we can read the devices already installed in the virtual machine, and request the list of available hardware that can be added through the command line or QMP.

We used the following commands for extracting the information about supported virtual devices:
• query-machines: list of the supported machines
• query-cpu-definitions: list of the supported CPUs
• qom-list-types: list of the supported devices
• device-list-properties: list of the specific device properties

An example of QMP interaction is presented in Figure 2. However, these commands do not provide enough details about the buses that connect controllers and peripheral devices.

Therefore we started with network adapter recognition. We distinguish network adapters by filtering the devices that include the netdev field, which means that they are connected to the network. We also need only PCI-based network adapters. To get them, we build the inheritance tree and keep only the subclasses of the abstract PCI device.

For example, e1000 inherits its properties from the following device classes: object -> device -> pci-device -> e1000-base -> e1000.

Completing this feature is the first thing to do in this project. We plan to extend QMP on the QEMU side to make the required bus and connection information available to qemu-gui.

B. Execution Record and Replay

Virtual machine record and replay is the feature which allows recording a whole machine execution and later replaying it for the sake of debugging or analysis [8].

Enabling record or replay for an existing virtual machine may be tricky. The command line in these modes is much more complex than in the regular execution mode (Figure 3). The QEMU documentation describes additional options that should be specified for enabling record/replay, for recording/replaying all disk operations, and for recording/replaying all network operations. The user may also need a disk overlay image, which protects the original disk image from being overwritten by the analyzed scenario and by virtual machine snapshots.

Therefore we added recordings management to qemu-gui. First, it automatically creates a disk overlay. All disk changes made during the recorded scenario are written to that file. It also stores all created virtual machine snapshots.

Second, qemu-gui forms a command line which uses the -icount option for enabling record or replay, and adds replay-specific filters to the disk images and network adapters. These filters are required because interaction with the network and the order of disk operations may be non-deterministic and must be recorded.

QEMU also supports recording of the data from […] and the signal from the microphone, but this is done automatically when -icount is enabled.

C. Comparison with the Existing Tools

Existing virtual machine management solutions are usually designed for […] with hardware virtualization, but some of them can be applied to QEMU. Libvirt is a portable API for managing virtual machines [16]. It includes support for QEMU, but mostly targets cloud-like virtualization rather than debugging and analysis.

virt-manager is a desktop user interface for managing virtual machines through libvirt [17]. The available peripheral devices are hardcoded in this tool, and it does not support arbitrary bus configuration.

AQEMU is a project which provides a GUI for managing local virtual machines on QEMU [18]. It is closer to our GUI subproject, but it doesn't support reading the available configuration options from QEMU through the QEMU machine protocol, and doesn't have any features that aid record/replay. That is why we couldn't use that project for our purposes without a complete rewrite.

In contrast with all existing tools, qemu-gui is designed for users that require precise peripheral configuration and record/replay capabilities for system analysis and debugging. Therefore qemu-gui has the following benefits over the existing solutions:
• VM hardware configuration. qemu-gui is designed to support all parameters that QEMU allows to configure through the command line. E.g., one may need to specify the order of the devices on some bus, because it may affect the execution.
• Switching between several QEMU builds, which is useful when working with different unofficial builds (with QEMU-based code analysis tools) or debug builds.
• Recording and replaying virtual machine executions. None of the existing tools is capable of creating the command line for record and replay. The command line for

-> {"execute": "query-machines"}
<- {"return": [
     {"hotpluggable-cpus": true, "name": "pc-i440fx-2.2", "cpu-max": 255},
     {"hotpluggable-cpus": true, "name": "pc-q35-2.4", "cpu-max": 255},
     ...
     {"hotpluggable-cpus": true, "name": "pc-i440fx-2.8", "cpu-max": 255},
     ...
     {"hotpluggable-cpus": true, "name": "pc-1.0", "cpu-max": 255}]}

-> {"execute": "device-list-properties", "arguments": {"typename": "e1000"}}
<- {"return": [
     {"name": "bootindex", "type": "int32"},
     {"name": "mitigation", "description": "on/off", "type": "bool"},
     {"name": "addr", "description": "Slot and optional function number, example: 06.0 or 06", "type": "int32"},
     ...
     {"name": "vlan", "description": "Integer VLAN id to connect to", "type": "int32"},
     {"name": "romfile", "type": "str"},
     {"name": "rombar", "type": "uint32"},
     {"name": "autonegotiation", "description": "on/off", "type": "bool"},
     {"name": "netdev", "description": "ID of a netdev to use as a backend", "type": "str"}]}

Figure 2: Querying machines list and e1000 netcard properties through QMP
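The network adapter recognition described in Section IV-A can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the qemu-gui implementation: the replies are canned data rather than a live QMP session, the `parent` and `abstract` fields are assumed to follow the shape of `qom-list-types` replies, and the `usb-net`, `usb-device`, and `pci-bridge` entries as well as all function names are hypothetical additions for the example. Only the e1000 inheritance chain and the `netdev` property come from the text.

```python
import json


def qmp_command(name, **arguments):
    """Serialize one QMP request (e.g. device-list-properties) to JSON."""
    message = {"execute": name}
    if arguments:
        message["arguments"] = arguments
    return json.dumps(message)


def ancestors(type_name, parents):
    """Return the inheritance chain from the root class down to type_name."""
    chain = [type_name]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    return list(reversed(chain))


def pci_network_adapters(types, properties):
    """Keep concrete device types that have a netdev property and
    inherit from the abstract pci-device class."""
    parents = {t["name"]: t["parent"] for t in types if "parent" in t}
    found = []
    for t in types:
        name = t["name"]
        if t.get("abstract"):
            continue  # abstract classes cannot be instantiated
        has_netdev = any(p["name"] == "netdev" for p in properties.get(name, []))
        if has_netdev and "pci-device" in ancestors(name, parents)[:-1]:
            found.append(name)
    return sorted(found)


# Canned type list (illustrative; a real one would come from qom-list-types).
TYPES = [
    {"name": "object"},
    {"name": "device", "parent": "object", "abstract": True},
    {"name": "pci-device", "parent": "device", "abstract": True},
    {"name": "e1000-base", "parent": "pci-device", "abstract": True},
    {"name": "e1000", "parent": "e1000-base"},
    {"name": "pci-bridge", "parent": "pci-device"},
    {"name": "usb-device", "parent": "device", "abstract": True},
    {"name": "usb-net", "parent": "usb-device"},
]

# Canned per-type properties (a real tool would query device-list-properties).
PROPERTIES = {
    "e1000": [{"name": "bootindex", "type": "int32"},
              {"name": "netdev", "type": "str"}],
    "pci-bridge": [{"name": "addr", "type": "int32"}],
    "usb-net": [{"name": "netdev", "type": "str"}],
}

print(qmp_command("device-list-properties", typename="e1000"))
print(" -> ".join(ancestors("e1000", {t["name"]: t["parent"]
                                      for t in TYPES if "parent" in t})))
print(pci_network_adapters(TYPES, PROPERTIES))
```

With the canned data, the USB adapter is rejected because its chain does not pass through `pci-device`, and the PCI bridge is rejected because it has no `netdev` property, which mirrors the two filters named in the text.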

Command line for regular execution

qemu-system-i386 -hda disk.qcow2 -nic user

Command line for recording the execution

qemu-system-i386 -icount shift=7,rr=record,rrfile=replay.bin -drive file=disk.qcow2,if=none,snapshot,id= -drive driver=blkreplay,if=none,image=img,id=img-blkreplay -device ide-hd,drive=img-blkreplay -netdev user,id=net1 -device rtl8139,netdev=net1 -object filter-replay,id=replay,netdev=net1

Figure 3: QEMU command line becomes very complex when using record/replay
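The command line construction that Figure 3 illustrates can be sketched as a small Python helper. This is an illustrative sketch, not code from qemu-gui: the function name and parameters are hypothetical, and because Figure 3 elides the id of the first drive, the sketch assumes the id `img-direct` for it. The remaining options are copied from the recording command line in Figure 3.

```python
import shlex


def replay_command_line(binary, disk, mode, rrfile="replay.bin",
                        drive_id="img-direct"):
    """Build a QEMU argument list for recording or replaying an execution.

    drive_id is an assumed identifier; the figure does not show it.
    """
    if mode not in ("record", "replay"):
        raise ValueError("mode must be 'record' or 'replay'")
    return [
        binary,
        # Deterministic instruction counting plus the record/replay log.
        "-icount", f"shift=7,rr={mode},rrfile={rrfile}",
        # Backing drive opened in snapshot mode so the image is not modified.
        "-drive", f"file={disk},if=none,snapshot,id={drive_id}",
        # blkreplay filter makes the order of disk operations deterministic.
        "-drive", f"driver=blkreplay,if=none,image={drive_id},id=img-blkreplay",
        "-device", "ide-hd,drive=img-blkreplay",
        # Network backend, adapter, and the filter that records its traffic.
        "-netdev", "user,id=net1",
        "-device", "rtl8139,netdev=net1",
        "-object", "filter-replay,id=replay,netdev=net1",
    ]


print(shlex.join(replay_command_line("qemu-system-i386", "disk.qcow2", "record")))
```

Returning a list of arguments rather than one string avoids quoting problems when the list is passed to a process launcher; `shlex.join` is used only to display it.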

these tasks may be very complex, so automatic command line creation helps users avoid mistakes.

V. REVERSE DEBUGGING

Reverse debugging is a feature that allows a debugger to "travel back in time" from a failure to uncover its cause [3].

QEMU includes a GDB debug server which emulates the behavior of an in-guest GDB server and allows debugging the virtual machine through a remote connection. The GDB remote protocol supports two reverse debugging commands: reverse-stepi and reverse-continue. The first one steps a single instruction backwards in time, and the second one finds the most recent of the prior breakpoints.

We can't literally execute code in the backward direction. But we can replay it to the desired moment of the execution. When GDB sends any reverse command to QEMU, QEMU loads one of the prior VM snapshots and replays the execution. Then the emulator stops at a moment of the execution that comes earlier than the original one. Therefore reverse debugging requires at least one snapshot created in advance.

Reverse continue is intended to search for breakpoints in the past. Snapshots split the whole execution into several parts. To make the search faster, the emulator first searches for breakpoints in the latest part (from the latest snapshot to the current moment). Therefore a reverse continue operation may include several "scanning" executions before stopping at the breakpoint (Figure 4):
• Load the snapshot.
• Replay to the moment where the operation started, checking for breakpoints on the way.
• If a breakpoint or watchpoint was hit in this pass, then
  – Load the snapshot again.
  – Replay to the latest breakpoint.
• Else, repeat all previous steps from the earlier snapshot.

Figure 4: Reverse debugging with execution replay. Steps of the reverse-continue workflow: load VM snapshot, search breakpoints, load VM snapshot, proceed to the breakpoint.

Reverse step loads the nearest snapshot and replays the execution until the required instruction is reached. In some cases it is necessary to step a couple of instructions backward. The reverse-stepi command with a parameter (the number of instructions) can be used for this purpose. Unfortunately, GDB will simply repeat reverse-stepi as many times as needed. This leads to unnecessary overhead, because every time the execution steps one instruction backward, QEMU needs to load the latest snapshot and run forward to the desired instruction.

A. Reverse Roll

Unfortunately, reverse step support in the GDB remote protocol can't be extended with a number-of-instructions parameter. Therefore, to make reverse stepping more efficient, we introduced the reverse-roll command, whose observable behavior is equivalent to reverse-stepi N.

The reverse roll command takes a parameter (the number of instructions), loads the nearest snapshot, and replays the execution until the required instruction is reached, without splitting the command into a sequence of single steps. The reverse roll command without a parameter is equivalent to the reverse-stepi command.

GDB with support for the non-standard reverse-roll command is included in SWAT.

VI. WINDBG

GDB is inefficient for remote debugging of Windows applications. It can't distinguish processes, show kernel structures, or load debug symbols. WinDbg is a debugger for Windows which supports all of these options and is also capable of debugging applications remotely [19].

However, it requires a running debug server in the guest OS (Kdsrv.exe). A guest-side debug service can be detected by the analyzed application, and it also prevents the use of record and replay, because the debug server behavior can't change during replay.

That is why we created a module running within QEMU which supports the WinDbg remote protocol and extracts the required information from the guest memory and registers. Thus, the debugger connects to the debugging module, not to the kernel of the guest operating system. The module reads all necessary information from the guest CPU and memory to reply to the debugger requests. The details of its implementation are presented in paper [10].

To use the QEMU WinDbg server, the emulator must be started with the option -[…] pipe:[…] with some […]. QEMU will pause and wait for the WinDbg connection. WinDbg should be started in remote debugging mode with the pipe option, which connects it to the virtual machine instead of a serial port.

After the execution starts, WinDbg will detect at some moment that the kernel is loaded. Then WinDbg enters the command mode, and the virtual machine may be debugged without switching the OS into the debug mode.

Our module supports all WinDbg requests for the i386 and x86_64 architectures. The debugger can't detect that our module is different and works with QEMU as it would with the original server. QEMU with these modifications is included in SWAT.

A. Related Work on Debugging

Our idea of debug server emulation is similar to the existing implementation of the QEMU GDB server [20]. Since GDB has a set of reverse debugging commands, we implemented their support in QEMU, using execution record and replay with the aid of virtual machine snapshotting.

Other open source reverse debugging tools are Mozilla RR [1] and standalone GDB [21]. However, these do not support full system debugging. Simics is a full system emulator which supports reverse debugging [3]. But this tool can't be easily obtained even for evaluation.

A recent version of PANDA supports system-wide reverse debugging [2], but it offers only CPU and memory recording. Therefore the virtual device state and the output to the display can't be observed during the debug session.

There is another implementation of a WinDbg server emulator, the Winbagility project [22]. However, it does not support all the required WinDbg functions. It also uses VirtualBox as a hypervisor, which can't provide execution record/replay functions.

IceBox is a Winbagility-based project for debugging with WinDbg [23], therefore it has the same limitations.

In contrast, our implementation of the WinDbg server may be used together with record/replay, allowing convenient debugging of heisenbugs.

VII. VIRTUAL MACHINE INTROSPECTION

SWAT includes an introspection plugin for our modified version of QEMU. This plugin monitors the virtual machine execution to extract runtime information that might be useful for application debugging and analysis.

The SWAT introspection plugin uses an approach based on system call monitoring [5].

We extended this monitoring method from i386 to x86_64, ARM, and AArch64-based virtual machines. The method is based on tracking system call/return instructions to collect information about kernel objects. Every system call is executed in the context of an OS process (which is identified by the value of the CR3/TTBR0_EL1 registers). System calls operate on kernel entities. By intercepting them, we can collect the following entities for every process: files, sections (for Windows), and file mappings to the address space.

Commodity operating systems use memory-mapped files to load executables and dynamic libraries. We hook mapping/unmapping operations and try to parse the file contents. If the file has PE or ELF format, we add tracepoints for every exported function in that file. This approach allows us to track named function calls (API calls) in the context of every process in the virtual machine.

Our plugin currently supports Windows XP on i386, Windows 10 on x86_64, and Linux (all possible versions) on i386, x86_64, ARM, and AArch64. Linux introspection does not depend on the version or build, because the introspection extracts information from the intercepted system calls, and system call IDs in Linux never change. Figure 5 shows a fragment of the introspection plugin output.

  Function b75cee10:__getpid
  Function b7592840:strdup
  Function b758e300:malloc
  Function b758e950:free
  Function b76fcc30:__udivdi3
  Function b76fcd60:__umoddi3
  Function b76138d0:__snprintf_chk
  Function b7613900:__vsnprintf_chk

Figure 5: Sample of QEMU output with the introspection plugin enabled. It prints the list of executed named functions and the list of executed system calls.

Monitoring of API function calls may be useful in itself (e.g., for detecting anomalies), and also for recovering more system information than is available from the system calls (e.g., hooking CreateProcess in Windows may be used to recover the executed processes).

A. Requirements and Limitations

We aimed to create an introspection method which allows analyzing systems with minimum knowledge about their internals. This approach requires the following data from the system ABI: the system call mechanism and format.

This is less than other introspection methods need. E.g., we don't require parsing of system internals like the process queue.

There are two main limitations of our toolkit. First, it does not support multi-CPU replay and analysis. However, this limitation is shared by all emulation-based debuggers and analysis tools. Second, introspection can't detect executables when there is no knowledge about the system calls. This case is left for our future efforts.

B. Other Instrumentation and Introspection Approaches

There were many attempts at adding an instrumentation engine to QEMU [12], [24]–[26]. Some of them are targeted at upstreaming instrumentation functions into the QEMU core. Therefore we don't focus on instrumentation efficiency in our work, because we plan to adopt the approaches included into QEMU someday. SWAT just includes a proof-of-concept instrumentation engine which allows adding plugins that we can use for analysis and debugging of the systems.

There also were several attempts at implementing Pin-compatible whole-system dynamic instrumentation: PinOS [13] and PEMU [27].

PinOS can use plugins developed for the Pin dynamic instrumentation framework, but it can only boot Linux on an x86 virtual machine. PinOS incurs a significant execution slowdown (up to 120x) even without any instrumentation.

PEMU solves the semantic gap problem by forwarding system calls to the guest. These system calls are used to retrieve guest-level information from the virtual machine. PEMU is intrusive, because the forwarded syscalls may alter the guest system behavior. Therefore it cannot be used for offline analysis of recorded executions.

Virtual machine introspection is usually accompanied by a guest-side agent which collects the data from the guest system or is used for setting up the introspection algorithms executed on the host [25], [28].

Another approach is using memory forensic tools like Volatility [29] or Rekall [30]. The RTKDSM system uses Volatility for locating the OS-specific data structures and uses a host-side monitoring agent to keep track of the changes in these structures [31]. The main limitation of the RTKDSM system is that it targets the x86 platform, because it uses the […] hypervisor for the virtual machine.

Nitro is a KVM-based framework for VMI [32]. It was tested with guest Windows and Linux on 32- and 64-bit platforms. Nitro is able to trace system calls on all these platforms, but it can't support platforms such as ARM and MIPS, because it uses KVM.

Intrusive methods are not suitable for analysis scenarios that utilize record/replay of the virtual machine executions. They also require a full SDK for the analyzed system and don't work when such an SDK is not supplied (e.g., for custom Linux builds).

In contrast, SWAT implements the introspection approach described in [5], which is applicable to replayed executions and to read-only virtual machines without an SDK. It also supports non-x86 platforms, because it uses QEMU instead of hardware virtualization.

VIII. EVALUATION

A. Instrumentation

We tried to run the instrumentation detection tests described in paper [33]. These tests target Windows 7 and include scenarios that try to do things not allowed within application-level instrumentation frameworks.

As we are using QEMU, such tests were passed, but the test which checks the correctness of FPUInstructionPointer failed. It means that the FPU emulation in QEMU could be improved.

But in other aspects, instrumenting the program in the virtual machine is effective against simple instrumentation detection, because the virtual machine is intended to emulate all the details of the environment for the program. Execution within a virtual machine can be detected too, but these methods usually differ from userspace instrumentation detection methods [34].

B. Introspection

We applied our introspection framework to publicly available router and firewall distributions (1), to some other Linux distributions, and to Windows releases for x86, x86_64, ARM, and AArch64 platforms.

Kernels in the firmware images are usually customized. Therefore we cannot apply any of the predefined profiles from the other tools. Introspection agents couldn't be uploaded, because there are no development tools in most of the firmware images.

We evaluated the execution overhead caused by enabling the introspection by booting the OS and measuring the time to boot without any introspection and with API monitoring enabled (Table I).

Table I: Firmware and OS introspection evaluation

  Firmware / OS, kernel           Regular OS load   API tracing OS    Overhead
                                  time, sec         load time, sec
  i386
  […] 3.7.0 4.9.65-1              76                83                9.2 %
  […] 4.2.3-1                     24                31                29 %
  DD-WRT v24 2.6.23.17            59                66                12 %
  Endian 2.1.2 2.6.9-55           42                45                7.1 %
  floppyfw 3.0.14 2.4.37.10       23                31                34 %
  IPCop 2.1.8 3.4-3               156               230               47 %
  IPFire 2.19 3.14.79             480               516               7.5 %
  LEAF 3.1 2.4.34                 57                79                39 %
  LEAF 6.1.1 4.9.68               613               653               6.5 %
  LEDE 17.01.4 4.4.92             51                52                1.9 %
  MikroTIK 6.41                   59                82                39 %
  Openwall 3.1 2.6.18-408         38                46                21 %
  OpenWRT 15.05 3.18.23           27                28                3.7 %
  Untangle 3.16.0.4               241               280               16 %
  Windows XP                      20                529               2545 %
  […] 2.0 3.4.6                   362               460               27 %
  x86_64
  […] 7 3.2.0-4                   66                121               83 %
  OpenWRT 19.07.01 4.14.167       16                23                43 %
  Windows 10                      60                509               748 %
  ARM
  Debian 7 3.2.0-4                65                134               106 %
  OpenWRT 19.07.01 4.14.167       6.2               7.5               21 %
  Raspbian                        57                239               319 %
  AArch64
  Debian 10.0.3 4.19.0-5          24                54                125 %

There are some cases with extremely large overhead. We plan to investigate them and improve the performance of the instrumentation and introspection. But in most cases the overhead is quite reasonable, which allows using our API monitoring tool for full analysis.

(1) https://en.wikipedia.org/wiki/List_of_router_and_firewall_distributions

IX. CONCLUSION

SWAT is a toolkit aimed at full system debugging and analysis. It is available on GitHub at https://github.com/ispras/swat and supports full system reverse debugging with GDB and full system non-intrusive debugging with WinDbg. SWAT provides a GUI for managing QEMU virtual machines and creating execution recordings.

We also extended QEMU with an instrumentation subsystem and created an introspection plugin which is capable of full system API tracing for Windows XP/i386, Windows 10/x86_64, and various Linux-based operating systems for i386, ARM, x86_64, and AArch64. Most of the introspection code is agnostic to the OS version and hardware platform, therefore other Windows versions and other CPUs for Linux can be easily supported.

We plan to add more analysis and introspection capabilities like process and thread tracking, stack monitoring, and process memory dumping. We will also extend the debugging capabilities to allow WinDbg-style separate process analysis for non-Windows operating systems.

ACKNOWLEDGMENTS

The work was partially supported by the Ministry of Education and Science of Russia, research project No. 2.6146.2017/8.9, and by the Russian Foundation of Basic Research (research grant 18–07–00900 A).

REFERENCES

[1] R. O’Callahan, C. Jones, N. Froyd, K. Huey, A. Noll, and N. Partush, “Engineering record and replay for deployability: Extended technical report,” CoRR, vol. abs/1705.05937, 2017. [Online]. Available: http://arxiv.org/abs/1705.05937
[2] B. Dolan-Gavitt, J. Hodosh, P. Hulin, T. Leek, and R. Whelan, “Repeatable reverse engineering for the greater good with panda,” Oct. 2014.
[3] J. Engblom, “A review of reverse debugging,” in S4D, 2012.
[4] T. Garfinkel and M. Rosenblum, “A virtual machine introspection based architecture for intrusion detection,” in Proc. Network and Distributed Systems Security Symposium, 2003, pp. 191–206.
[5] P. Dovgalyuk, N. Fursova, I. Vasiliev, and V. Makarov, “Introspection of the linux-based embedded firmwares: Work-in-progress,” in Proceedings of the International Conference on Embedded Software, ser. EMSOFT ’18. Piscataway, NJ, USA: IEEE Press, 2018, pp. 3:1–3:2. [Online]. Available: http://dl.acm.org/citation.cfm?id=3283535.3283538
[6] “Python scriptable reverse engineering sandbox, a virtual machine instrumentation and inspection framework based on qemu.” [Online]. Available: https://github.com/Cisco-Talos/pyrebox
[7] F. Bellard, “Qemu, a fast and portable dynamic translator,” in Proceedings of the Annual Conference on USENIX Annual Technical Conference, ser. ATEC ’05. Berkeley, CA, USA: USENIX Association, 2005, pp. 41–41. [Online]. Available: http://dl.acm.org/citation.cfm?id=1247360.1247401
[8] P. Dovgalyuk, “Deterministic replay of system’s execution with multi-target qemu simulator for dynamic analysis and reverse debugging,” in Proceedings of the 2012 16th European Conference on Software Maintenance and Reengineering, ser. CSMR ’12. Washington, DC, USA: IEEE Computer Society, 2012, pp. 553–556.
[9] J. Chow, T. Garfinkel, and P. M. Chen, “Decoupling dynamic program analysis from execution in virtual environments,” in USENIX 2008 Annual Technical Conference on Annual Technical Conference, ser. ATC’08. Berkeley, CA, USA: USENIX Association, 2008, pp. 1–14. [Online]. Available: http://dl.acm.org/citation.cfm?id=1404014.1404015
[10] A. M.A. and D. P.M., “Stealth debugging of programs in qemu emulator with windbg debugger,” Proceedings of the Institute for System Programming of the RAS, vol. 30, no. 3, pp. 87–92, 2018. [Online]. Available: http://ispras.ru/proceedings/docs/2018/30/3/isp_30_2018_3_87.pdf
[11] P. Dovgalyuk, D. Dmitriev, and V. Makarov, “Don’t panic: Reverse debugging of kernel drivers,” in Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering, ser. ESEC/FSE 2015. New York, NY, USA: ACM, 2015, pp. 938–941. [Online]. Available: http://doi.acm.org/10.1145/2786805.2803179

[12] E. G. Cota and L. P. Carloni, “Cross-ISA machine instrumentation using fast and scalable dynamic binary translation,” in Proceedings of the 15th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, ser. VEE 2019. New York, NY, USA: ACM, 2019, pp. 74–87. [Online]. Available: http://doi.acm.org/10.1145/3313808.3313811
[13] P. P. Bungale and C.-K. Luk, “PinOS: A programmable framework for whole-system dynamic instrumentation,” in Proceedings of the 3rd International Conference on Virtual Execution Environments, ser. VEE ’07. New York, NY, USA: ACM, 2007, pp. 137–147. [Online]. Available: http://doi.acm.org/10.1145/1254810.1254830
[14] J. Chow, D. Lucchetti, T. Garfinkel, G. Lefebvre, R. Gardner, J. Mason, S. Small, and P. M. Chen, “Multi-stage replay with Crosscut,” in Proceedings of the 6th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, ser. VEE ’10. New York, NY, USA: ACM, 2010, pp. 13–24. [Online]. Available: http://doi.acm.org/10.1145/1735997.1736002
[15] N. Binkert, B. Beckmann, G. Black, S. K. Reinhardt, A. Saidi, A. Basu, J. Hestness, D. R. Hower, T. Krishna, S. Sardashti, R. Sen, K. Sewell, M. Shoaib, N. Vaish, M. D. Hill, and D. A. Wood, “The gem5 simulator,” SIGARCH Comput. Archit. News, vol. 39, no. 2, pp. 1–7, Aug. 2011. [Online]. Available: http://doi.acm.org/10.1145/2024716.2024718
[16] M. Bolte, M. Sievers, G. Birkenheuer, O. Niehörster, and A. Brinkmann, “Non-intrusive virtualization management using libvirt,” in Proceedings of the Conference on Design, Automation and Test in Europe, ser. DATE ’10. 3001 Leuven, Belgium, Belgium: European Design and Automation Association, 2010, pp. 574–579. [Online]. Available: http://dl.acm.org/citation.cfm?id=1870926.1871061
[17] “Virtual machine manager.” [Online]. Available: https://virt-manager.org/
[18] “AQEMU: a GUI for virtual machines using QEMU as the backend.” [Online]. Available: https://github.com/tobimensch/aqemu
[19] “Remote debugging using WinDbg.” [Online]. Available: https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/remode-debugging-using-windbg
[20] “Using the gdbserver program.” [Online]. Available: ftp://ftp.gnu.org/old-gnu/Manuals/gdb/html_node/gdb_130.html
[21] “GDB: The GNU project debugger.” [Online]. Available: https://www.gnu.org/software/gdb
[22] “Winbagility.” [Online]. Available: https://winbagility.github.io/
[23] “Icebox.” [Online]. Available: https://github.com/thalium/icebox
[24] P. Dovgalyuk, N. Fursova, I. Vasiliev, and V. Makarov, “QEMU-based framework for non-intrusive virtual machine instrumentation and introspection,” in Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, ser. ESEC/FSE 2017. New York, NY, USA: ACM, 2017, pp. 944–948. [Online]. Available: http://doi.acm.org/10.1145/3106237.3122817
[25] B. Dolan-Gavitt, J. Hodosh, P. Hulin, T. Leek, and R. Whelan, “Repeatable reverse engineering with PANDA,” in Proceedings of the 5th Program Protection and Reverse Engineering Workshop, ser. PPREW-5. New York, NY, USA: ACM, 2015, pp. 4:1–4:11. [Online]. Available: http://doi.acm.org/10.1145/2843859.2843867
[26] V. Chipounov, V. Kuznetsov, and G. Candea, “S2E: A platform for in-vivo multi-path analysis of software systems,” SIGPLAN Not., vol. 47, no. 4, pp. 265–278, Mar. 2011. [Online]. Available: http://doi.acm.org/10.1145/2248487.1950396
[27] J. Zeng, Y. Fu, and Z. Lin, “PEMU: A Pin highly compatible out-of-VM dynamic binary instrumentation framework,” SIGPLAN Not., vol. 50, no. 7, pp. 147–160, Mar. 2015. [Online]. Available: http://doi.acm.org/10.1145/2817817.2731201
[28] Y. Hebbal, S. Laniepce, and J. Menaud, “Virtual machine introspection: Techniques and applications,” in 2015 10th International Conference on Availability, Reliability and Security, Aug. 2015, pp. 676–685.
[29] “Volatility framework - volatile memory extraction utility framework.” [Online]. Available: https://github.com/volatilityfoundation/volatility
[30] “Rekall memory forensic framework.” [Online]. Available: http://www.rekall-forensic.com
[31] J. Hizver and T.-c. Chiueh, “Real-time deep virtual machine introspection and its applications,” in Proceedings of the 10th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, ser. VEE ’14. New York, NY, USA: ACM, 2014, pp. 3–14. [Online]. Available: http://doi.acm.org/10.1145/2576195.2576196
[32] J. Pfoh, C. Schneider, and C. Eckert, Advances in Information and Computer Security: 6th International Workshop, IWSEC 2011, Tokyo, Japan, November 8-10, 2011. Proceedings. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011, ch. Nitro: Hardware-Based System Call Tracing for Virtual Machines, pp. 96–112. [Online]. Available: http://dx.doi.org/10.1007/978-3-642-25141-2_7
[33] D. C. D’Elia, E. Coppa, S. Nicchi, F. Palmaro, and L. Cavallaro, “SoK: Using dynamic binary instrumentation for security (and how you may get caught red handed),” in Proceedings of the 2019 ACM Asia Conference on Computer and Communications Security, ser. Asia CCS’19. New York, NY, USA: Association for Computing Machinery, 2019, pp. 15–27. [Online]. Available: https://doi.org/10.1145/3321705.3329819
[34] R. Paleari, L. Martignoni, G. F. Roglia, and D. Bruschi, “A fistful of red-pills: How to automatically generate procedures to detect CPU emulators,” in Proceedings of the 3rd USENIX Conference on Offensive Technologies, ser. WOOT’09. USA: USENIX Association, 2009, p. 2.
