PyPANDA: Taming the PANDAmonium of Whole System Dynamic Analysis
Luke Craig∗, Andrew Fasano∗†, Tiemoko Ballo∗, Tim Leek∗, Brendan Dolan-Gavitt‡, William Robertson†
∗MIT Lincoln Laboratory, {Luke.Craig,Andrew.Fasano,Tleek,Tiemoko.Ballo}@ll.mit.edu
†Northeastern University, [email protected], [email protected]
‡New York University
[email protected]

Abstract—When working with real-world programs, dynamic analyses often must be run on a whole system instead of just a single binary. Existing whole-system dynamic analysis platforms generally require analyses to be written in compiled languages, a suboptimal choice for many iterative analysis tasks. Furthermore, these platforms leave analysts with a split view between the behavior of the system under analysis and the analysis itself: in particular, the system being analyzed must commonly be controlled manually while analysis scripts are run. To improve this process, we designed and implemented PyPANDA, a Python interface to the PANDA dynamic analysis platform. PyPANDA bridges the gap between guest virtual machine behavior and analysis tasks; enables painless integrations with other program analysis tools; and greatly

IDA Pro¹, and Ghidra² all support conducting analyses from scripting languages, such functionality is rarely present in whole-system dynamic analysis platforms, leading to cumbersome workflows. For example, consider the task of conducting a whole-system dynamic taint analysis on data sent to a custom kernel module that ultimately flow into a user-space application. An analyst must approach this task through two distinct, but complementary, processes. First, they must drive the guest system's behavior: boot the system, log in, obtain the relevant source code and toolchains, compile the code (or copy in a prebuilt binary), and load the kernel module.
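
To make the guest-driving steps concrete, the following minimal sketch scripts the compile-and-load portion of that workflow. It is an illustration under stated assumptions, not the authors' implementation: `drive_guest`, `fake_run_cmd`, the path `/root/module`, and the module name `example.ko` are all hypothetical. The sketch assumes the guest is already booted and logged in (for example, restored from a snapshot) and that the caller supplies a function that runs one shell command in the guest and returns its output. We believe a serial-console helper such as PyPANDA's `run_serial_cmd` could fill that role, but the sketch does not depend on it.

```python
from typing import Callable, List, Tuple


def drive_guest(
    run_cmd: Callable[[str], str],
    module_src_dir: str = "/root/module",  # hypothetical location of the module source
    module_name: str = "example.ko",       # hypothetical kernel module file name
) -> List[Tuple[str, str]]:
    """Run the guest-side setup steps in order.

    run_cmd executes a single shell command inside the guest and returns
    its output. Returns a list of (command, output) pairs as a simple log.
    """
    steps = [
        f"make -C {module_src_dir}",               # compile the code inside the guest
        f"insmod {module_src_dir}/{module_name}",  # load the kernel module
        "lsmod",                                   # list loaded modules to confirm
    ]
    return [(cmd, run_cmd(cmd)) for cmd in steps]


def fake_run_cmd(cmd: str) -> str:
    """Stand-in transport so the sketch runs without any guest attached."""
    return f"ran: {cmd}"


log = drive_guest(fake_run_cmd)
for cmd, out in log:
    print(f"{cmd} -> {out}")
```

Passing the command runner in as a parameter keeps the guest-driving logic separate from the transport, so the same sequence can be exercised against a stub, as here, or against a real guest console.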