Submission Version, 2016/11/12

Emulization: a Framework for Static Analysis of Dynamic Code via Emulation

Abbas Naderi-Afooshteh, Anh Nguyen-Tuong, Jason D. Hiser, Jack W. Davidson
Department of Computer Science, University of Virginia

Abstract

Static analysis of dynamic scripting code is a very challenging problem due to the extensive use of dynamic features such as run-time code generation, dynamic aliasing, dynamic weak typing, and implicit object creation, to name a few. With the increasing popularity of dynamic languages, prior research has focused on theoretical, or practical but ad-hoc, solutions to handle many of these dynamic features. Unfortunately, a comprehensive static analysis method that scales to popular real-world applications is lacking. This work tackles the problem from a new perspective. Our approach enables an accurate, scalable method for analysis of dynamic applications.

Our method, emulization, is based on a novel dynamic analysis powered by emulation, combined with static analysis. The combination provides a framework for context- and flow-sensitive dynamic and static analysis of dynamic applications.

To demonstrate its scalability and accuracy, emulization was applied to large and very popular dynamic applications such as Wordpress and Joomla (262k and 472k lines of code respectively, with more than 40 different dynamic features). Our analysis of these applications resulted in the discovery of at least 7 bugs in PHP, at least 2 of which are confirmed security vulnerabilities. We also applied emulization to a benchmark group of applications commonly evaluated by prior work.

Finally, we demonstrate the simplicity of implementing a practical analysis on top of our framework by creating a taint tracker in less than 3 hours consisting of 300 lines of code. The taint tracker confirmed several reported vulnerabilities from the benchmark and discovered two new security vulnerabilities in Wordpress and Joomla.

Categories and Subject Descriptors  F.3.2 [Logics and Meanings of Programs]: Semantics of Programming Languages - Program Analysis; D.3.4 [Programming Languages]: Processors - Interpreters; D.3.1 [Programming Languages]: Formal Definitions and Theory

General Terms  Design, Security

Keywords  Dynamic Analysis, Static Analysis, Emulation

1. Introduction

Static analysis of dynamic applications, such as those written in PHP, Javascript, Python or Bash Script, is an open research problem. Several challenging problems plague analysis of dynamic code. Constructing a sound or complete control flow graph (CFG), reasoning about dynamic evaluations (i.e., calls to eval), type inference, error handling and alias inference are some of these issues.

Most existing solutions try to extrapolate static analysis methods developed for strongly typed, static (or less dynamic) languages such as C or Java to dynamic languages, while providing novel solutions for some of the problems specific to dynamic languages (such as dynamic includes). None of these approaches has demonstrated analysis of dynamic real-world applications scaling above 100k lines of code.

Another sizable category of solutions uses a combination of static and dynamic analysis, while employing a wide range of ad-hoc approaches to tackle challenges of statically analyzing dynamic code. These solutions either use dynamic analysis on small basic blocks, or obtain a trace of the application to help in the static analysis phase.

We believe neither of these approaches is satisfactory for analysis of dynamic code, especially when the code base is large (as is the case with most popular applications today). This belief is strengthened by the fact that none of the prior work could reason about such applications. In fact, many projects note that it is not possible to precisely analyze dynamic code via their approach [1, 3, 9, 10, 15, 22, 30, 36].

With respect to the implementation of analysis techniques, many solutions modify the interpreter [1, 28, 33]. Others instrument the code, either lightly or heavily, to extract the data and perform post-processing analysis [27, 29]. The
remaining solutions use formal methods to parse and model the dynamic code and then reason about it [14–16, 22].

We believe that these implementation approaches are not well suited for the analysis of dynamic applications. Modifying the interpreter is very hard, error prone, and requires modeling all functionality. Formal reasoning also requires modeling of all language features, which can take thousands of hours of work to provide reasonable completeness. That is why the majority of prior work notes that it is not complete, due to lack of modeling or implementation of various challenging dynamic features. Finally, instrumenting the code breaks metaprogramming and reflection, and incurs significant performance overhead.

In contrast, our solution is based on emulation, i.e., creating an interpreter of the dynamic language in itself. There are several benefits. Emulation supports proxying, allowing the implementation of necessary features with the desired granularity, while proxying all undesired features of the language to the underlying interpreter to preserve accuracy.

Since emulation is done in the dynamic language itself, low-level details of its semantics are abstracted away (e.g., memory management, optimizations and implicit type casts), significantly reducing the complexity of implementation while minimizing the risks associated with incorrect modeling or erroneous implementation of semantics [10, 17, 28].

Finally, emulation in the high-level language supports rapid development. Our prototype implementation of the PHP emulator consists of 4000 total lines of code, whereas the analyzer by Dahse et al. [10] is roughly 65,000 lines of code. The implementation of Weverca [15] is about 200,000 lines of code, while supporting much less of the PHP language than our tool. As far as we are aware, our work is the first to fully model its target language, including 44 dynamic features necessary for accurate analysis.

The primary drawback of emulation is that it adds another layer on top of the interpreter, which impacts performance. However, our evaluations show that the performance is competitive with other analysis methods.

Contributions. The main contributions of this research are as follows:

• A discussion of difficult-to-analyze dynamic features in dynamic programming languages, and potential solutions.
• A novel method for analysis of dynamic code, accompanied by a dynamic code analysis framework based on the method and powered by emulation.
• An open source, well-documented and extensible emulator for PHP supporting all features of the target language.
• An open source, extensible flow- and context-sensitive analysis framework for PHP.
• An evaluation of the framework and its capabilities, including discovery of seven new bugs in PHP, as well as two new security vulnerabilities in Wordpress and Joomla.

The paper is organized as follows. Section 2 provides the technical background and discusses why dynamic features present challenges for static analysis. Section 3 describes our method, which is based on emulation. Section 4 covers challenges of emulating dynamic code and our experience with addressing these challenges. Section 5 presents our analysis framework and its capabilities. The evaluation of our framework is contained in Section 6. It includes analysis coverage and performance measurements as well as case studies. Section 7 discusses the related work and contrasts it with our work, while Section 8 provides a summary.

2. Background

2.1 Dynamic Code Definition

Throughout this paper dynamic code is defined to be any code that does not have a clear separation between data and executable code, relying on the interpreter regularly to execute data (e.g., variable contents) as code.

Based on this definition, we use dynamic programming languages to refer to languages which pervasively use dynamic features, such as PHP, Ruby, Python, Perl, Javascript, Lua, Bash Script and several other scripting languages, in contrast with languages such as C# and Java, which have an interpreter ready and provide dynamic coding features such as Reflection and CodeDOM, but are not typically used to create dynamic code. Dynamic languages are also frequently used for metaprogramming, i.e., programs that modify themselves during runtime and execute data as code.

The goal of this research is to provide a platform for analysis of dynamic applications developed using dynamic programming languages (which is the majority of applications developed using such languages).

2.2 Dynamic Features

In this section, we discuss some of the dynamic features common in dynamic code that make its analysis challenging:

2.2.1 Dynamic Include

Dynamic languages, just like any other programming language, benefit from structuring code into several files, organized into several directories. Many applications developed using these languages have one or few entry points into the application, even though all their files can be executed individually. This design allows such applications to load libraries and code required to perform desired functionality, then internally dispatch the request to the respective individual script, enabling easier control over flow of the application while reducing complexity of individual scripts.

One of the most problematic features in analyzing dynamic languages is the dynamic include. Includes in dynamic languages, in contrast with static languages, are not preprocessed or static; instead they are typically an expression. An include expression can be repeated (resulting in the same code being included and executed multiple times), can be located inside an anonymous function, and commonly involves subexpressions such as function calls and operators.

Our framework puts extra focus on dynamic includes, and the difficulty in analyzing them is one of the important underlying assumptions of our method. Our framework analyzes dynamic applications as a whole rather than individually analyzing scripts available in a larger application.

Resolving dynamic includes is necessary both for constructing a precise, yet complete Control Flow Graph (CFG), and for proper code discovery.

Much of the prior work assumes that the entire code of one application resides in a directory hierarchy, and exercises this assumption to find possible candidates for dynamic includes. Although this assumption is usually valid, many applications fetch code from the Internet, databases and other sources (e.g., license validation, plugins, auto-update, etc.). Javascript is notorious for pulling code from many different servers dynamically. The dynamic loading of external code results in incomplete code discovery and analysis if dynamic includes are not properly resolved.

Another issue is when analyzers discover code that is unused in the application. Many popular dynamic applications support plugin architectures for several of their components such as themes, modules, languages, layouts etc. Usually only one (or a few) of these available plugins are loaded into one instance of the application, and because these architectures generally use similar structures (e.g., same function, class and file names), including all of them in the analysis will result in significant loss of precision.

It is important to note that in many applications, input is used as part of the expression that determines included files, as observed by [30] and [18]. For example, the list of active plugins in Wordpress is loaded dynamically from the database, and database configurations are loaded dynamically from a configuration file.

Dynamic includes can be abstracted as converting data into code. Although the included file is typically executed as is, there are many cases in which the file is modified dynamically before inclusion and execution (such as the cases of template engines or temporary files).

Perhaps the most challenging instances of dynamic includes are calls to dispatch blocks, code that transforms inputs into dynamic calls and dispatches them (see Listing 1). As apparent from the code, it is very hard, if not impossible, to reason about such basic blocks sans context.

    function dispatch(...$args) {
        // Map the first argument (user input) to a callable name.
        $callee = dispatch_transform($args[0]);
        if (is_callable($callee))
            return call_user_func_array(
                array_pop($args), $args);
    }
    include dispatch($_GET['app'], $user, $pass);

Listing 1: A dispatch block in PHP. Input to the function is used to dynamically determine the next call.

These dispatch blocks are very frequent in popular applications: about 184 instances exist in Wordpress and more than 80 exist in Joomla. The transforms used in these dispatch blocks rely heavily on application configurations, application state and user input.

2.2.2 Weak Typing

Weak typing and unspecified, undocumented semantics for explicit and implicit type conversions are among the most important issues in statically analyzing dynamic languages [3, 14, 18, 19]. The limited type analysis cripples static analysis, and is particularly troublesome in dynamic analysis.

For example, both if statements in Listing 2 can evaluate to true, due to implicit casting of strings to integers and floats when being compared to a number. These implicit type casts aim to make the application more resilient, a desired property in dynamic languages.

    1  $input = $_GET['password'];
    2  if (md5($input) ==
    3      "0e123456789012345678901963424806")
    4      echo "Access granted!", PHP_EOL;
    5
    6  if ($input < 100)
    7      echo "Access granted!!", PHP_EOL;

Listing 2: Two counter-intuitive implicit casts done by PHP resulting in security vulnerabilities.

In lines 1 to 4, a password is received via user input, then its MD5 checksum is compared against a literal MD5 to validate the password. Because the literal MD5, although in string form, is the scientific notation representation of zero (i.e., 0 multiplied by 10 to the power of 1234567890...), any input that generates an MD5 of a similar form will be accepted, since both sides of the == operator are automatically converted into floating point numbers.

Line 6 checks whether the input is less than a hundred. PHP automatically converts a string to a number when comparing it against a number, using the leading numeric portion of the string (e.g., "123hello" would become 123), and defaults to zero when no viable candidate is found.

Unless all these subtle behaviors are modeled accurately, an analyzer will produce imprecise results, as observed in previous work [10, 14, 22].

2.2.3 Function Overloading

Dynamic languages typically do not support function overloading, because type checking is not performed when calling functions, causing developers to employ branching logic in the beginning of the functions to provide the desired functionality based on the types of function arguments.

For example, many PHP functions use func_get_args(), call_user_func() and similar constructs to push or retrieve parameters available on the stack (or directly access them on the application backtrace), branching inside the function based on the types of the variadic arguments. As another
example, Python supports unpacking lists and dictionaries into function parameters in its calling convention, e.g., func(a,b,c,*list,**dict). Such behavior is further discussed in Section 2.2.8.

2.2.4 Autoloading

Autoloading is the ability to dynamically resolve symbols that are used, but not yet defined. Dynamic code commonly benefits from autoloading, typically via dispatch blocks that include files instead of calling functions, by receiving the symbol name (e.g., class name) as input and dynamically generating the file name that needs to be included (see Listing 3).

Several autoloaders can be registered in a dynamic application, and the interpreter will try all of them before throwing an error due to lack of declaration.

    1  function autoloader($class) {
    2      preg_match_all('!([A-Z][A-Z0-9]*(?=$|[A-Z][a-z0-9])|[A-Za-z][a-z0-9]+)!', $class, $matches);
    3      $file = __DIR__ . "/classes/" . implode("/", $matches[1]) . ".";
    4      include $file;
    5  }
    6  spl_autoload_register("autoloader");
    7  $obj = new MySampleClass();

Listing 3: A common autoloader used by many PHP projects and conventions.

For example, many PHP frameworks provide autoloading functions that convert a class name to its representative source code file in the application directory hierarchy, either using a map or dynamic conversion.

The code in Listing 3 registers an autoloader to be invoked whenever an undefined symbol is used, then attempts to create an object of type MySampleClass. Since the class is undeclared, the autoloader is invoked with the class name as its parameter. The regular expression on line 2 converts the class name into an array of words (based on CamelCase notation); line 3 constructs a file-system path from those words, and line 4 includes the file.

2.2.5 Dynamic Referencing

Almost all entities (e.g., functions, variables, classes and constants) in dynamic languages can be dynamically created. For example, the PHP function create_function() can dynamically create a function, the PHP function define() can be used to define constants dynamically and the PHP function class_alias() can be used to create aliases for existing classes. These entities can also be dynamically referenced. Consider Listing 4:

     1  class C {
     2      static $x = 1;
     3      public $y = 2;
     4  }
     5  $o = new C;
     6  $z = 3;
     7  const u = 4;
     8  function v() { return 5; }
     9
    10  $in = $_GET['1'][0];
    11  assert($in <= 'z' and $in >= 'u');
    12  if (function_exists($in))
    13      echo $in();
    14  if (isset($$in))
    15      echo $$in;
    16  if (isset(C::$$in))
    17      echo C::$$in;
    18  echo @$o->$in;
    19  if (defined($in))
    20      echo constant($in);

Listing 4: Examples of dynamically referencing entities such as classes, methods, functions, variables and constants in PHP.

At line 10, an input is received from the user. Line 11 ensures that the input is between characters 'u' and 'z' lexically. Then lines 12 to 20 attempt to dynamically reference the input via dynamic function referencing, dynamic variable referencing (a.k.a. variable variables), dynamic class member referencing, dynamic object property referencing and dynamic constant referencing respectively. Note that on line 18, the error suppression operator @ is used, which suppresses any possible errors in its following expression. Dynamic languages also support dynamic referencing of arbitrary-dimension array elements, a feature commonly used by applications.

It is important to note that these features are not rarely used; rather, they are frequently employed by dynamic applications. Wordpress, for example, has more than 850 instances of dynamic function referencing in its core code, executing functions whose names are stored in a dynamic array populated by application configurations and user input.

2.2.6 Dynamic Evaluation

Perhaps the most challenging feature of dynamic languages, one that hinders many static analysis methods applicable to static languages, is dynamic evaluation. Dynamic evaluation basically means executing a string variable as code, and is commonly available via the eval() function in dynamic languages. It is important to note that dynamic evaluation is by no means limited to the eval() function: many other functions and language constructs in dynamic languages use dynamic evaluation internally. This fact is often ignored by prior work [18].

Unless the input of a dynamic evaluation can be statically determined and reasoned about, static analysis will lose significant accuracy whenever dynamic evaluation is exercised. Prior work has attempted to refactor evals into static language constructs, as many use cases of eval are for brevity and are statically resolvable, but admits that there are cases of eval usage in many applications that cannot be resolved statically [18]. In fact, eval is truly only required in these cases, which involve user input in constructing the evaluated string; otherwise there would be no need for supporting eval in the programming language.
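As a minimal sketch of the point that dynamic evaluation is not limited to eval(), the following PHP fragment shows two other ways in which plain data ends up being executed: calling a function through a string that names it, and including a file whose contents were produced at run time. The helper names run_named and run_via_include are hypothetical and chosen only for illustration; they are not part of the framework described in this paper.

```php
<?php
// Sketch only: two constructs other than eval() that execute data as code.
// run_named() and run_via_include() are illustrative names, not a real API.

// 1. A string that names a function can be called directly.
function run_named(string $name, array $args)
{
    // $name is plain data (e.g., from configuration or user input).
    return is_callable($name) ? call_user_func_array($name, $args) : null;
}

// 2. A string holding source code can be written to a file and included;
//    the interpreter then parses and executes it.
function run_via_include(string $source)
{
    $path = tempnam(sys_get_temp_dir(), 'dyn');
    file_put_contents($path, "<?php\n" . $source);
    try {
        // include yields the value of the file's top-level return statement.
        return include $path;
    } finally {
        unlink($path); // remove the temporary file in all cases
    }
}

echo run_named('strtoupper', ['abc']), PHP_EOL;   // prints ABC
echo run_via_include('return 6 * 7;'), PHP_EOL;   // prints 42
```

In both cases an analyzer that only looks for the literal eval() call would miss the executed code, since the callee name and the included source exist only as string values at run time.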

Listing 5 shows a sample of dynamic code that implements an advanced calculator using a few lines of PHP code:

    1  function calculate($in) {
    2      if (!preg_match('/^[\sa-z0-9\(\)\.\/\*\+\-]*$/i', $in)) return null;
    3      preg_match_all("/(\w+)\s*\(.*?\)/i", $in, $matches);
    4      foreach ($matches[1] as $m)
    5          if (!in_array($m, explode(",", "abs,cos,log,max,min,pow,rand,sin,sqrt,tan"))) return null; // unsafe function
    6      return eval("return {$in};");
    7  }

Listing 5: An advanced calculator implemented in only a few lines of code, by relying on the interpreter's dynamic evaluation.

The code uses eval to evaluate arbitrary mathematical expressions, but only allows the mathematical functions whitelisted on line 5. Static analysis would have a very hard time reasoning about the resulting dynamically-generated expression.

Eval is typically used when the application forms a sequence of statements constructed from sanitized user input, performing a complex task using built-in interpreter features. Eval is also commonly used to address compatibility and portability issues when the platform does not support certain symbols or operations. In such use cases, availability of the functionality is checked first, then the code is executed using eval to prevent parse errors that can arise prior to execution.

2.2.7 Object Handling

Dynamic languages implement objects differently than static languages. Objects in dynamic languages are not necessarily instances of strongly typed classes; instead they can change their class and type implicitly or explicitly, aside from not conforming to their original class definitions. For example, Javascript has the ability to change an object's prototype, then create new objects based on the modified object's prototype. PHP has the ability to dynamically add functions and properties to an object.

Dynamic languages also typically support dynamic object handlers, enabling programmers to take control of object behaviors in a wide range of programming constructs, such as using objects as loop iterators and array-like access.

PHP provides magic methods, methods that are called when a certain operation is performed on an object and the default behavior is not applicable. For example, accessing a property that does not exist or is not visible, casting to a type that is not implicitly possible, and calling a method that is not available, all of which fail by default, are redirected to magic methods in PHP. Custom object handlers, available via inheriting certain core classes such as ArrayIterator and ArrayObject, are also supported by PHP.

Javascript, Ruby and a range of other dynamic languages also provide these features. Because object handling mechanisms are sometimes counter-intuitive and their semantics are not well documented and change with every release of the interpreter, many prior works have reported having issues with modeling object handling in dynamic code analysis [9, 19, 36]. As far as we are aware, none of the prior work fully supports modeling and analyzing object-oriented PHP code, even though it has existed since 2004.

2.2.8 Interpreter Introspection

A very important dynamic feature frequently exercised by dynamic code is what we call Interpreter Introspection. Applications relying on interpreter introspection make control flow decisions based on the state of the interpreter, such as the backtrace and its parameters, the list of executed files, the amount of memory used by the interpreter, the version of the interpreter etc.

Interpreter introspection is very common in dynamic applications. Wordpress uses more than 60 different functions that probe the interpreter state in more than 100 instances throughout its core code. Listing 6 is taken from wp-includes/theme.php in Wordpress:

    1792  if ( 'title-tag' == $feature ) {
    1793      // Don't confirm support unless called internally.
    1794      $trace = debug_backtrace();
    1795      if ( ! in_array( $trace[1]['function'], array( '_wp_render_title_tag', 'wp_title' ) ) )
    1796          return false;
    1797  }

Listing 6: Interpreter introspection example in Wordpress.

The code uses interpreter introspection to check whether the immediate caller (available via the backtrace on line 1794) is a certain function or not, and makes control flow decisions based on this information.

To the best of our knowledge, interpreter introspection is not explicitly categorized or supported by any of the prior work, and is unique to this work. Most previous work provides ad-hoc solutions to certain use cases of interpreter introspection, but does not discuss general solutions for modeling them. We further investigate challenges involved with modeling and analysis of interpreter introspection in Sections 3 and 4.

2.2.9 Other Features

There are many other features common in dynamic languages that are very hard to model and properly implement. These features are outliers, but our evaluations show that without proper support, analysis produces invalid artifacts.

For example, prior to providing support for dynamic property handlers (a.k.a. magic methods) and property visibility checks, our framework would mistakenly retrieve a null private property from an object, rather than calling the dynamic property handler, missing about 30% of the code.

3. Method

Considering the challenges discussed in Section 2, and with respect to the noted inability of the related work to scale
to large and popular dynamic applications (Section 7), we developed a novel combined analysis method called Emulization that revolves around dynamic analysis accompanied by traditional static analysis as a means of analyzing dynamic code.

Emulization is grounded on accurate dynamic analysis, as we believe (due to reasons described in Section 2 and the emerging patterns in the cumulative experience of prior work) that a static analysis of dynamic languages is either infeasible or an unsolved hard problem. Emulization provides static analysis with concrete artifacts discovered and refined via dynamic analysis, either concurrently or consecutively. Our hypothesis is that emulization enables us to address several of the challenges in analysis of dynamic code.

[Figure 1: Caption. Flowchart over index.php, theme_default.php, database.php and custom_theme.php with the nodes: load config; input available?; database config?; No DB Config; content available?; database connect?; draw theme; show_content; do 404; db_error.]

    //index.php
    load_config...;
    if($input_available)
        if($database_config)
            include 'database.php';
        else
            die("No DB Config.");
    else
        include default_theme;

    //theme_default.php
    if($content_available){
        draw_theme...;
        show_content($input);
    } else
        do_404...;

    //database.php
    def show_content($arg): print sql($arg);
    if($database_connect...)
        include $selected_theme;
    else
        call($error_func,$db_error,$input);

    //custom_theme.php
    draw_theme...;
    show_content($input);

Listing 7: Pseudo code for Figure 1.

3.1 Combined Analysis

Instead of starting with static analysis and dynamically reconstructing basic blocks and dynamic evaluations, we start with dynamic analysis powered by emulation and statically analyze blocks that are procured. This is what distinguishes our method from all other approaches. Our framework exercises a two-pass interpretation on dynamic code, inspired by the natural behavior of many interpreters, by first extracting definitions, and then emulating the code. The two-pass interpretation segregates statements into declarations and executions. This approach allows multiple definitions for certain artifacts (such as functions) to co-exist in different blocks of an application. For example, consider the code in Listing 8.

    $input = $_GET['algorithm'];
    if ($input == 1) {
        function f($x) { return $x * 2; }
    } elseif ($input == 2) {
        function f($x) { return $x * $x; }
    } else {
        function f($x) { return $x; }
    }
    $v = 5;
    print f($v);

Listing 8: Example of dynamic code with multiple definitions for the same function, only one of which is used at runtime.

Emulation enables discovery of at least one viable control flow graph, including all dynamically included files. Our evaluations demonstrate that in many cases, one trace corresponds to more than 50% of the application files as well as a majority of application code being discovered, due to the common design patterns used in dynamic applications, while our counterfactual, multi-path emulation provides coverage for up to 100% of application files and code.

3.2 Emulation

Although there are alternatives to emulation as a means of dynamic analysis (many of which are used by prior work), we believe emulation provides desired characteristics that are not viable in alternative approaches. In this section, we go through the alternatives and contrast them with emulation.

One alternative is to use a shadow interpreter. A shadow interpreter supplies the interpreter with shadow computations that keep track of the analysis artifacts in parallel with the actual interpretation. This approach would provide better performance and is easier for quick modifications, but has
several significant drawbacks, the most important of which is understanding and maintenance of low-level, highly optimized interpreter code. The difficulty of fully understanding interpreter code and changing it accordingly has resulted in the relevant prior work being incomplete and imprecise (as noted by [9]). Our experience shows that implementing a particular feature in a shadow interpreter takes 5 to 10 times the effort of implementing the same feature in the high-level language.

Another approach is to instrument the source code. Due to the highly reflective nature of dynamic code, instrumentation is very likely to change the behavior of the code (e.g., the caller will be different). Prior work has shown that even minimal instrumentation incurs more than a 20X performance overhead [29].

The third approach, used by at least five of the prior works [7, 14, 16, 22, 39], defines the semantics of the target language in another language. This method has historically been the most error prone, and also requires the most effort.

Emulation, on the other hand, has several benefits:

Functionality proxy. An emulator can proxy all features not necessary in a certain analysis, increasing performance and reducing complexity, while maintaining correctness and precision. A basic emulator can consist of a parser and a loop that proxies statements to the underlying interpreter. Desired features can be iteratively added to this basic emulator.

High-level code. Our prototype emulator consists of less than 4000 lines of PHP code, about 2000 lines of which are to emulate object-oriented features. As far as we are aware, our emulator supports more features of the PHP language than any prior work.

Native extensions. High-level languages provide convenient means of adding certain functionalities natively, typically for performance and compatibility reasons. Our evaluations show that native implementation of performance intensive features of the emulator (e.g., state isolation, cyclic reference detection) improves analysis performance by one to two orders of magnitude.

A key observation is that dynamic (scripted) applications are typically scripts intended to run and terminate in a reasonable amount of time, and differ from daemons and GUI applications where the application runs hypothetically ad infinitum. This assumption enables emulization to discover as much of the application as possible in reasonable time.

4. Emulation Challenges

In this section, we will discuss challenges in designing an emulator for dynamic languages. We realize that developing an emulator that supports all dynamic features mentioned in Section 2 is a hard problem. Our prototype PHP emulator was developed over a period of 6 to 9 months, providing us with many valuable insights.

Error Handling. Many scripting languages support error handlers. These error handlers deal with both fatal and non-fatal errors in execution. Commonly, fatal errors only allow clean-up, and are non-recoverable, hence emulators should be able to recognize fatal errors before they happen, as there would be no way to recover from them and continue or suspend emulation (this becomes more important in counterfactual execution, where counterfactual branches frequently result in a recoverable fatal error).

Emulators need to register an error handler via the interpreter to handle emulation errors, to ignore suppressed errors, and to forward errors to the possible error handler registered in the emulated application. Emulators also need to have try-catch blocks on each block of executed code to catch potential exceptions in the interpreter.

Interpreter State. Every dynamic language has several functions that either probe or modify the interpreter state. Such functions have to be overwritten (mocked) by the emulator, returning the emulator state instead. For example, most dynamic languages have functions that return the current symbol table, list of defined functions etc., as well as functions overriding other functions, changing operators etc. Correct implementation of such features results in an accurate Interpreter Introspection functionality (Section 2.2.8).

Variable Handling. As variables can be unset in dynamic languages (i.e., removed from a symbol table), the variable handling logic needs to return the symbol table that contains a certain variable, as well as the variable name in that symbol table. Returning a direct reference to the variable hinders unsetting of variables (and some major prior work does not support this [9, 14, 22]). Our approach separates variable handling logic into symbol resolving and variable access, making future data-flow analyses easier.

Aliasing can be handled either explicitly or implicitly in an emulator. If the analysis requires explicit access to aliasing information, alias information can be maintained in the variable state, but this requires tracking how native functions modify aliasing (or checking and rebuilding alias information after every native function call). Otherwise, aliasing can be proxied to the interpreter, and alias detection logic used on demand.

Function Calls. Emulators need to support several types of function call. Arguments to a function can be parse-tree elements needing to be emulated before passing to the function, or they can be emulated symbols and literals. The called function can be an emulated function (user function) or an interpreter core function (native function). For the former, the function execution should be emulated; for the latter, variables and their types should be converted, the native function called, and then the results expanded. Native functions can also be mocked for analysis and accurate emulation, in which case the mocked version should be called instead of the original.
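The dispatch between user, native, and mocked functions described under Function Calls can be illustrated with a toy dispatcher. This is a minimal Python sketch, not the authors' PHP implementation: the class `ToyEmulator`, its method names, and the modelling of a parse-tree argument as a zero-argument callable are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List


class ToyEmulator:
    """Toy call dispatcher (hypothetical): mocks win over user and native functions."""

    def __init__(self) -> None:
        # User functions receive the emulator so their bodies can be emulated.
        self.user_functions: Dict[str, Callable[["ToyEmulator", List[Any]], Any]] = {}
        # Native functions are proxied straight to the host interpreter.
        self.native_functions: Dict[str, Callable[..., Any]] = {"len": len, "abs": abs}
        # Mocks replace natives that would otherwise expose interpreter state.
        self.mocked_functions: Dict[str, Callable[..., Any]] = {}
        self.backtrace: List[str] = []

    def evaluate(self, node: Any) -> Any:
        # Assumption: an unevaluated parse-tree argument is modelled as a
        # zero-argument callable; anything else is an already-emulated value.
        return node() if callable(node) else node

    def call(self, name: str, arg_nodes: List[Any]) -> Any:
        # Evaluate arguments before touching the backtrace, since argument
        # expressions may themselves call functions or have side effects.
        args = [self.evaluate(node) for node in arg_nodes]
        self.backtrace.append(name)
        try:
            if name in self.mocked_functions:
                return self.mocked_functions[name](*args)
            if name in self.user_functions:
                return self.user_functions[name](self, args)
            if name in self.native_functions:
                return self.native_functions[name](*args)
            raise NameError(f"call to undefined function {name}()")
        finally:
            self.backtrace.pop()


emulator = ToyEmulator()
emulator.user_functions["double"] = lambda emu, args: args[0] * 2
doubled = emulator.call("double", [lambda: 21])      # argument given as a "parse tree"
native_len = emulator.call("len", [[1, 2, 3]])       # proxied to the host built-in
emulator.mocked_functions["len"] = lambda value: -1  # hypothetical mock
mocked_len = emulator.call("len", [[1, 2, 3]])       # the mock now takes precedence
```

Checking the mock table first is a design choice of this sketch: it lets an analysis override any native function without modifying the native table.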

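The split between symbol resolving and variable access described under Variable Handling can be pictured as follows. This is a hedged Python sketch with hypothetical names (`Scopes`, `resolve`, `unset`); it only shows why returning the containing table plus the name, rather than a reference to the value, keeps unsetting possible.

```python
from typing import Any, Dict, List, Optional, Tuple

SymbolTable = Dict[str, Any]


class Scopes:
    """Toy scope stack (hypothetical) separating symbol resolving from access."""

    def __init__(self) -> None:
        self.global_table: SymbolTable = {}
        self.local_tables: List[SymbolTable] = []  # one table per active call

    def current_table(self) -> SymbolTable:
        return self.local_tables[-1] if self.local_tables else self.global_table

    def resolve(self, name: str) -> Optional[Tuple[SymbolTable, str]]:
        # Symbol resolving: report where the variable lives, not its value.
        table = self.current_table()
        return (table, name) if name in table else None

    def write(self, name: str, value: Any) -> None:
        self.current_table()[name] = value

    def read(self, name: str) -> Any:
        found = self.resolve(name)
        if found is None:
            raise NameError(f"undefined variable {name}")
        table, key = found
        return table[key]

    def unset(self, name: str) -> bool:
        # Possible only because resolve() hands back the containing table.
        found = self.resolve(name)
        if found is None:
            return False
        table, key = found
        del table[key]
        return True


scopes = Scopes()
scopes.write("title", "hello")
value_before = scopes.read("title")
removed = scopes.unset("title")
still_defined = scopes.resolve("title") is not None
```

A data-flow analysis could hook `resolve` alone to observe every variable lookup, which is one plausible reading of why the separation helps later analyses.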
Conversion of arguments is particularly important for types that are different between the emulator and the interpreter when calling native functions. Two such types are emulated objects, which need to be converted to native objects, and callbacks, which should be wrapped in another callback that calls the original callback via the emulator rather than directly. The wrapping ensures that native functions can call callbacks that point to emulated functions, rather than attempting to call a non-existent callback. For languages that do not have a specific type for callbacks, a list of callback arguments in native functions is needed to model callback conversion properly.

Dynamic languages typically support importing global and other scope variables into the current symbol table, using keywords such as global. Such variables should be aliased into the current symbol table. It is important to fully evaluate the arguments before creating the new symbol table and adding the function entry to the backtrace, because arguments might (and typically do) have side-effects. Finally, it is vital that the emulator captures the standard output of native function calls, and redirects it as emulation output.

Expressions. Dynamic expressions, like those used to evaluate variable names dynamically, or those which extract containers into the current symbol table or vice versa, need to be handled carefully. Many dynamic languages also support one or more forms of error suppression, typically as expressions, allowing an expression to run without emitting errors and warnings. Two other noteworthy expressions are eval, which evaluates a string as code, and include, which executes a code file. Includes in dynamic languages can return values. Remaining expressions should be relatively easy to emulate, typically by proxying them to the underlying interpreter.

Object-Orientation. Objects and classes, just like functions, should be divided into native and user. User classes will be fully emulated, and native classes and objects will be handled by the underlying interpreter. When new objects are created, aside from pulling all the properties from the class hierarchy into the object, the appropriate constructor should be called.

Wrapping all user classes in one native class defined by the emulator, e.g., EmulatorObject, allows using automatic object handling features of the underlying interpreter for proxying, such as constructors, destructors, copy-constructors and magic methods. Using the wrapper class also helps generalize expression behavior when objects are involved, regardless of whether they are native objects or user objects.

Object-oriented expressions, such as fetching object/class properties and calling object/class methods, are typically very dynamic, commonly containing the name of the class in one variable and the name of the function/property in another, e.g., $class::${$property}. Calling object-oriented methods is essentially the same as aliasing the active object (this) to the current symbol table, and calling a function, after finding the designated function in the class hierarchy.

One tricky area is handling mixed inheritance, where a user class inherits from a native class. A possible workaround in such situations is to create an object of the core class, and pass method calls and property fetches to the native class using reflection if the emulated hierarchy fails to discover the method/property.

Mocked Functions. All dynamic languages have a myriad of functions and classes that provide information about and modify the state of the interpreter. These functions need to be mocked to return and modify the corresponding state in the emulator instead of the interpreter. In many cases, these functions output part of the interpreter state for debugging purposes, but this output is sometimes extracted and used by dynamic applications, and thus needs to be mocked as well.

In our prototype PHP emulator, we had to mock about 100 functions, spanning areas such as reflection, object probing, output buffering, error handling, debugging, variable handling, serialization, autoloading and dynamic calling.

5. Framework

We developed an open source emulator for PHP that supports all dynamic features in the language (up to PHP 5.3). The open source emulator consists of about 4000 lines of PHP code, about 2000 lines of which are to emulate object-oriented features in the language. Development of the PHP emulator was an intense task requiring 6 to 9 months of development, mostly due to lack of granular specifications for certain language features, which had to be either reverse engineered or extracted from the interpreter source code.

We also developed an open source analysis framework (called the framework henceforth) for PHP based on the emulator. The analysis framework provides several critical features for analysis of dynamic applications, keeps track of all artifacts procured and provides fine-grained statistics.

The goal of the analysis framework is to be easily extensible so that special purpose analysis (called ad-hoc analysis henceforth) can be easily implemented on top of it, while providing data-flow and control-flow artifacts to the ad-hoc analysis at hand.

The remainder of this section discusses some of the most important features of the analysis framework.

5.1 Isolation

One of the most crucial features of the framework is the ability to isolate emulation. Isolation enables emulating code and evaluating expressions that cause side-effects to the program state in a controlled environment, observing the changes that they make, without damaging the original state of the program.

Isolation is used frequently by the framework; for example, it is used when executing counterfactual code branches (i.e., branches whose path invariant is not satisfied), as well as when the post-processing static phase evaluates complex expressions used in dynamic includes.
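One way to picture isolation is as snapshot-and-restore over the emulation state, with a stack so that isolations can nest. The following is a minimal Python sketch under stated assumptions: `EmulationState` and `IsolatingEmulator` are hypothetical names, and the standard-library `copy.deepcopy` stands in for the framework's own deep copy. Unlike the copy-on-write scheme the paper describes, `deepcopy` copies eagerly, so this shows the semantics only, not the performance characteristics.

```python
import copy
from typing import Any, Dict, List


class EmulationState:
    """Toy emulation state (hypothetical): a symbol table plus captured output."""

    def __init__(self) -> None:
        self.symbols: Dict[str, Any] = {}
        self.output: List[str] = []


class IsolatingEmulator:
    """Runs code against a scratch copy of the state, then restores the original."""

    def __init__(self) -> None:
        self.state = EmulationState()
        self._isolation_stack: List[EmulationState] = []

    @property
    def isolation_depth(self) -> int:
        return len(self._isolation_stack)

    def begin_isolation(self) -> None:
        # Save the live state and continue on a deep copy. deepcopy keeps
        # aliasing inside the copy: two names bound to one list stay bound
        # to one (copied) list.
        self._isolation_stack.append(self.state)
        self.state = copy.deepcopy(self.state)

    def end_isolation(self) -> EmulationState:
        # Hand the scratch state back so the caller can inspect what changed,
        # then resume from the saved state.
        scratch = self.state
        self.state = self._isolation_stack.pop()
        return scratch


emulator = IsolatingEmulator()
items = [1]
emulator.state.symbols["items"] = items
emulator.state.symbols["alias"] = items   # alias of the same list

emulator.begin_isolation()
emulator.state.symbols["items"].append(2)            # side effect under isolation
alias_inside = list(emulator.state.symbols["alias"])  # alias sees the change
scratch = emulator.end_isolation()
```

Returning the scratch state from `end_isolation` is a design choice of this sketch; it makes the changes produced under isolation observable without letting them leak into the original state.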

Isolation is done by duplicating the emulation state via deep copying the emulator. A deep copy clones the object and all its underlying objects, arrays and other variables, then rebuilds the aliasing relations between the cloned variables. All cloned variables are copy-on-write versions of the original variables (except for primitive types such as integers), to preserve memory and improve performance.

The isolation operation is very expensive, especially when the size of the emulation state grows. The framework provides deep copy functionality both as pure PHP code, and as a native PHP extension, the latter of which is one to two orders of magnitude faster and is used in all our evaluations.

It is worth noting that development of the native version of deep copy functionality required at least one order of magnitude more effort than creating the same functionality in pure PHP, even though native code has access to all private and hidden properties of variables, all of which have to be obtained using different hacks in the pure PHP version. This extra effort was mainly due to strict memory management requirements in the native interpreter.

One particularly important and challenging aspect of isolation is handling external state, such as database connections and file handles. Fortunately, there are a limited number of these in dynamic languages and the framework handles them by re-mocking emulator functions that handle these external connections. For a majority of them, dismissing resource deallocation solves the problem. For example, a database connection object will use the same connection variable (handle) when deep copied, but once the deep copy is disposed of, the destructor will attempt to disconnect the database connection. Ignoring deallocations under isolation resolves the issue in this case.

The framework also supports nested isolations (maintaining an isolation stack), so that one isolation can emulate code without damaging its own state, an important requirement for counterfactual execution.

5.2 Counterfactual Execution

The framework supports emulating arbitrary code under counterfactual execution, an isolated environment which will not modify the original emulation state, but is useful in artifact procurement. For example, to exercise all branches of an application without satisfying the branch invariant, the framework uses counterfactual execution mode. This mode enables the framework to discover much more of the CFG and data-flow artifacts, without damaging the original emulation state (for example by discovering new dynamic include statements that resolve into new code files).

The isolated counterfactual code can also be sandboxed to disable any unpredictable external effect it might have on the system, by using sandboxed versions of external-state affecting functions in the emulator. This sandboxing is beneficial in analysis of untrusted code such as malware.

The framework can also keep track of counterfactual code blocks that are executed, creating a summary and using that summary instead of re-emulating the code block, improving performance when multiple counterfactual executions exercise the same functions.

When encountering errors, counterfactual execution supports two options: either terminate execution and return to the previous isolation (n.b. nested isolations), or create a new isolation and execute the rest of the code block (diehard mode; see footnote 1).

Footnote 1: The exit statement in PHP is die();

5.3 Static Analysis

A flow-discovering static analysis is also one of the features of the framework. This static analysis can be invoked at any time during the analysis, but two common cases are after analysis of each basic block and after the dynamic analysis is completed.

The static analysis operates on all code that has not been dynamically emulated, and attempts to resolve complex dynamic expressions used in control flow statements (e.g., branches and dynamic includes) using the artifacts procured by the dynamic analysis. Once a new code block is discovered, the static analysis either reiterates or invokes dynamic analysis on it.

Our evaluations show that such static analysis is very effective, discovering much more of the control-flow and data-flow artifacts in the application compared to the baseline dynamic emulation.

Our static analysis implementation is conservative. It does not resolve include statements that it cannot determine with confidence. This conservative approach means that overlapping files and folders, such as themes and certain plugins, are not discovered, because only one of them is active in an application setup. This conservatism can easily be reversed by an ad-hoc analysis when necessary.

5.4 Sensitivity

As one of the major reasons for the development of the framework was constructing accurate CFGs, the analysis that it provides is flow-sensitive on all the code that it discovers. Analysis is context-sensitive for all the code dynamically analyzed, and the statically analyzed code is not context-sensitive by default (although an ad-hoc analysis can try to maintain context). Accompanying counterfactual execution with a constraint solver will also make the analysis fully path-sensitive.

5.5 Extensibility

The framework can keep track of changes between two emulation states, such as definition changes, context changes and heap changes. This information is accessible to the ad-hoc analysis in desired granularity.

The framework is extensible via object-oriented inheritance and overriding different functionality instead of providing an observer pattern API. Each operation of the framework is encapsulated in a well documented method, and by extending and overriding each method, the ad-hoc analysis can modify the analysis to suit its particular needs (e.g., keep track of isolations).

The emulator is similarly extensible. In fact, higher level programming features of the emulator such as object orientation are implemented via extending the basic emulator, which can only emulate the procedural subset of the PHP language.

6. Evaluation

In this section, we evaluate our work in three dimensions. First, we evaluate code discovery, a very important aspect of analysis methods that has been mostly neglected in prior work. Analysis of dynamic languages without a full-featured analyzer will simply miss much of the code, either by the inability to construct a precise and complete CFG spanning the entire code base, or by the inability to discover and reason about dynamically generated code (which cannot be measured by lines of code in existing files). Second, we evaluate the performance of our method. Finally, we evaluate the practicality of our method and the ease of use of our framework by implementing a data-flow taint analysis as a case study.

All our evaluations were performed on an iMac Late 2012, with 32 GB of memory and a 2.9 GHz Core i5 Intel CPU. The operating system was Mac OS X 10.11. The analysis did not use more than 300 MB of memory at any time. The evaluations are performed on a prototype analyzer built on top of our PHP emulator. PHP was selected as the target language because it is one of the most popular dynamic languages used to build web applications.

Initially, we developed the emulator using PHP 5 targeting PHP 5.3. In the midst of our research, PHP 7 was released, a major new PHP version with vastly improved performance. After porting, our emulator ran 2 to 10 times faster. Note that porting the emulator needed changing only a few lines of code. In contrast, porting a shadow interpreter would have required starting from scratch as PHP 7 is a complete rewrite of PHP 5.

The correctness of the emulator was evaluated by comparing the output generated by running the native PHP interpreter on evaluated projects to the output emulated by the emulator.

We focus our evaluation on Wordpress, both because it powers more than 32% of Web and 25% of top 10 million websites [5] and because it is a sizable project (250+ KLOC) that utilizes dynamic PHP features pervasively. We also provide evaluations for Joomla (470+ KLOC), the second most popular Web application with more than 50 million downloads [26], as well as one of the largest PHP applications. Finally, we evaluate several applications evaluated by prior work to provide a baseline for comparison, although these applications do not frequently use dynamic features and are significantly smaller in size.

During our development and testing we were able to discover several bugs and vulnerabilities in PHP, Joomla and Wordpress, all of which were responsibly reported to their respective authors. We discovered and reported a total of 7 PHP bugs using our analyzer, 4 of which were discrepancies between documentation, specification and behavior of the interpreter, and 3 of which caused crashes and were vulnerable to malicious inputs. 3 of these bugs were fixed in PHP release 7.0.7, 2 more were fixed in PHP 7.0.9 and at least one more will be fixed in the upcoming PHP release.

6.1 Performance and Coverage

We test our framework in different configurations. First, we perform a simple static analysis of the application, to provide a baseline for comparison with prior work. Then we perform a simple, dynamic analysis of the application. Then we combine dynamic and static analyses, to obtain better coverage. Finally, the normal dynamic analysis is replaced with a counterfactual dynamic analysis, with and without accompanying static analysis.

Table 1 provides performance and coverage numbers of our prototype under different analysis configurations. The last three columns of the table list the number of statements existing in the analyzed component of the application and the total number of statements and lines of code in the application.

The coverage numbers in the table are based on the number of statements in the application and its main strongly connected component (SCC). Web applications usually have more than one SCC. The entry point and all functionality provided by one SCC differ from those of another, thus the SCCs of one application can be considered independent applications. For example, in Wordpress, the administrative interface is a totally different application than the actual content management system. Applications such as HotCRP or myBB have several strongly connected components. This was the common design in older web applications.

As apparent from the table, projects commonly studied by prior work (the bottom part of the table) are easily discoverable using basic static analyses, as they do not use dynamic features pervasively. They are also quickly analyzed even in the most exhaustive mode (Counterfactual+Static), taking only a few seconds on a full counterfactual run, compared to more than 500 seconds for Wordpress or Joomla (column T of Counterfactual+Static mode).

For Joomla, we were unable to complete a counterfactual analysis due to bugs in PHP (see footnote 2) that would crash the analyzer. We were still able to run the analyzer up to a certain point into the application (roughly 70% of the code) and extract useful statistics. Thus, numbers marked with a star in Table 1 are estimated.

Footnote 2: We discovered and reported a new use-after-free security vulnerability in PHP that caused the analysis of Joomla to crash at several points. Hopefully, PHP developers will fix the bug by publication time so that accurate numbers can be reported.

T = Time, C = Coverage

                  Static          Dynamic         Counterfactual  Dynamic+Static  Counterfactual+Static  SCC     Total
Application       T       C       T       C       T       C       T       C       T       C              Stmts   Stmts   LOC
Wordpress 4.2.2   3ms     0%      10.1s   49%     522s    56%     41s     81.0%   579s    81.6%          36034   58786   262k
Joomla 3.5.1      9ms     0%      9.5s    21.7%   485s*   48%*    16.8s   22.4%   525s*   67%*           50058*  95271   472k
phpBB 2.0.23      0.5s    100%    458ms   40.7%   1.9s    100%    570ms   40.7%   2s      100%           1189    5787    20k
HotCRP 2.10       115ms   8%      24ms    2.6%    468ms   10.1%   127ms   8%      2.7s    89.5%          5112    24002   58k
myBB 1.8.07       60ms    0%      356ms   31.4%   8.5s    39.6%   1.8s    100%    13.4s   100%           7704    25961   149k
mybloggie 2.1.3b  250ms   81%     12ms    25%     104ms   57%     70ms    100%    720     100%           673     1008    9.14k
mybloggie 2.1.4   285ms   72%     30ms    20%     3.9s    51.8%   99ms    100%    4.6s    131%**         831     979     9.15k
NOCC 1.9.4        0.9s    43%     9ms     2.9%    1.6s    50%     141ms   44%     3.95s   101%**         4732    4732    29.5k

Table 1: Coverage of main strongly connected component (index.php) of different PHP applications under different configurations of the analysis method. Results marked with * are estimated, while results marked with ** include generated code (i.e., eval).
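The coverage figures above 100% marked with ** follow from simple arithmetic once generated code is counted. The sketch below is an assumption about how such a ratio could be computed, not the framework's actual formula: it takes the statements existing in the files of the main SCC as the denominator and adds statements discovered in eval-generated code to the numerator. The function name and the numbers are hypothetical.

```python
def coverage_percent(covered_file_statements: int,
                     covered_generated_statements: int,
                     scc_statements: int) -> float:
    """Coverage relative to statements that exist in the files of the SCC.

    Statements from dynamically generated code (e.g., eval) are covered but
    are not part of the denominator, so the result can exceed 100.
    """
    if scc_statements <= 0:
        raise ValueError("scc_statements must be positive")
    covered = covered_file_statements + covered_generated_statements
    return 100.0 * covered / scc_statements


# Hypothetical numbers, chosen only to illustrate the effect.
plain = coverage_percent(950, 0, 1000)        # no generated code
with_eval = coverage_percent(950, 100, 1000)  # 100 statements came from eval
```

With these made-up inputs, the second call reports more than 100% because the generated statements have no counterpart in the on-disk statement count.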

Another interesting observation is coverage above 100% for myBloggie 2.1.4 and NOCC 1.9.4, which is due to dynamically generated code using eval in these applications.

We also studied the number of files covered compared to the total number of files available in each application. These numbers correlate with the number of covered statements.

A conclusion from the table is that in concentrated projects such as Wordpress, a dynamic trace can cover up to 50% of the CFG (column C of Dynamic mode), whereas in sparse projects such as HotCRP, a dynamic trace covers roughly 2% of the entire code base. A counterfactual analysis, although taking significantly more time, can uncover a significant portion of the code base. For example, dynamic analysis of NOCC uncovers only 2.9% of the code, whereas a counterfactual analysis covers up to half the code. This is because NOCC dynamically loads different parts of its application based on user input, in contrast with Wordpress, which exhibits little additional coverage under counterfactual execution (column C of Counterfactual and Counterfactual+Static modes).

Since the framework's counterfactual analysis exercises each counterfactual block only once, it can be exercised on all entry points of the application to provide a CFG for the entire application rather than individual SCCs. The framework's combined mode supports this feature. Table 2 provides more detailed analysis results for Wordpress and Joomla, including the number of basic blocks and branches exercised by counterfactual mode.

Finally, the conclusion that confirms our hypothesis is that although static analysis is not very effective in covering code and discovering the CFG of dynamic and complex applications, dynamic analysis combined with static analysis is a viable means of code discovery and analysis for large-scale realistic applications.

                                          Wordpress   Joomla
total files in application                480         2476
files in main SCC                         204         1049
basic blocks in SCC                       9759        26,526
statements in SCC                         36,034      50,058*
number of include expressions             678         446
number of exit nodes                      313         2339
number of unique visited statements       8054        5724
number of unique executed statements      7650        5483
total number of visited statements        109,535     77,850
total number of executed statements       83,186      66,177
number of parsed statements               17,673      10,899
number of executed if branches            25,578      22,679
number of executed else branches          3681        2410
number of executed elseif branches        1826        441
number of executed switch statements      121         300
number of switch cases                    1254        2157
static analysis iterations                5           2
includes resolved statically              102         16
includes resulting in non-existent files  25          14
includes statically unresolvable          12          46
files included in counterfactual mode     189         184
branches taken counterfactually           91,426      28,640
unique branches taken counterfactually    3104        1319
errors in counterfactual mode             140         68
warnings in counterfactual mode           2599        160

Table 2: More analysis results from exercising the framework on Wordpress and Joomla.

6.2 Case Study

To demonstrate the effectiveness of emulization, we have used the data-flow infrastructure provided by the framework to perform taint tracking and discover SQL and code injection vulnerabilities.

The taint tracking code consists of less than 120 lines of code. It was easily created by extending the data-flow functionality of the framework. The taint tracker keeps taint as a floating point number. Initially, all inputs to the application are tainted with a value of 1.0. The function call data-flow functionality is extended by assigning taint scores to each internal PHP function that retains taint. These scores are multiplied by the taint of the arguments sent to the function, and determine the taint of the function result. The expression evaluation data-flow functionality is extended by assigning taint scores to certain expressions (e.g., concatenation) and setting taint to zero on others (e.g., bitwise operators).

Finally, the function call functionality of the framework is extended to check whether the called function is a sensitive sink (i.e., a function that should not receive tainted data), and whether the inputs to the function are sufficiently tainted. The analyzer then alerts if the criteria are satisfied. This entire process was coded in less than 3 hours.

This taint analysis does not add significant overhead to the baseline metrics reported in Table 1 (less than 5% runtime overhead and less than 5% memory overhead).

Our taint analysis confirms several vulnerabilities reported in prior work for the applications listed in the bottom part of Table 1. In addition, we discovered one new SQL injection vulnerability in Joomla, and one new SQL injection vulnerability in Wordpress. All newly discovered vulnerabilities were responsibly reported to the respective authors.

7. Related Work

Perhaps the most notable frameworks for static/dynamic code analysis are ATOM and PIN [25, 35]. Both PIN and ATOM provide observer pattern APIs for ad-hoc analysis developers, whereas since our framework is built in a high-level language, it provides a simpler, object-oriented API.

There are several approaches to static analysis of dynamic languages. For brevity, we focus on works performed on PHP, due to the sheer amount of mature and sizable code available for it, while mentioning works on other languages, specifically Javascript (which is growing rapidly in prevalence).

Xie et al. are among the earliest to provide a method for static analysis of dynamic code [39]. They had difficulty modeling regular expressions, and evaluated their work on a benchmark that did not include sizable applications.

Pixy, a tool developed by Jovanovic et al., uses Java to parse and analyze PHP code [19]. They do not support object-oriented features, and report significant false positives and problems with dynamic includes, despite manually guiding evaluation on a group of small web applications.

Balzarotti et al. are among the first to provide a combined static and dynamic analysis in a tool called Saner [2]. Their static analysis discovers viable candidates, and they use dynamic analysis to reduce the number of false positives discovered in the static phase. They had a significant number of false positives when evaluating a medium-sized popular web application (PHP-Fusion) consisting of 50,000 LOC.

Perhaps the closest prior work to ours is that of Kneuss et al. in PHANTM, where they use a dynamic trace to determine a reasonable CFG, taking a snapshot at a manually selected sensitive point after the initialization of the application, and performing static analysis using concrete data from the snapshot [22]. Although their observation that "static analysis typically gives very imprecise results in such cases [where application configurations determine CFG]" is accurate and their approach is novel, it is limited to applications that fall under their key assumption that configurations are only used in the early initialization segments of an application.

Hauzar et al. provide several works resulting in a tool called Weverca [14, 15], a PHP parser and analyzer implemented in C#.NET. Weverca is the most comprehensive PHP static analyzer that we could find. It includes an SMT solver to backtrace vulnerabilities to concrete values, properly recognizes many dynamic features, and is very fast. However, it has a huge code base (80 megabytes of packages), which is the result of several years of dedicated work by scientists, and has not been evaluated on sizable PHP applications, reportedly due to Phalanger [11] limits.

Schäfer et al. present a novel determinacy analysis approach in which they use dynamic analysis to refactor as many of the dynamic feature instances (such as evals) as they reliably can [30]. Their approach enables discovery of a much broader CFG and is accompanied by a soundness proof. Because they take the conservative approach when facing dynamic features that are non-determinate, i.e., marking the entire heap as non-determinate, their approach does not scale well. They have evaluated their work on jQuery versions prior to 1.3, while noting that their approach fails on jQuery 1.3 due to dynamic event handlers. They also introduce counterfactual execution, which is similar to concolic execution without actually changing the values to satisfy constraints.

Arguably the most significant static analysis of the PHP language is performed by Dahse et al. in their tool called RIPS [9]. It includes an extensive model of PHP and is implemented in PHP. RIPS has also been able to find several new vulnerabilities in a broad range of medium-sized PHP projects. The available version of RIPS, however, was unable to analyze Wordpress, getting stuck 30 to 120 minutes into the analysis. The original publication focuses on data-flow analysis, and does not evaluate Wordpress and other sizable projects. A follow-up publication, which focuses on code reuse (POP chain) attacks in Web applications [10], does evaluate Joomla and Wordpress, but the analysis is not context-sensitive or flow-sensitive, and is unable to find true positives on Wordpress, due to a multitude of listed reasons, such as reflection (which is considered an unsolved problem in static analysis [4, 16, 23]).

Artzi et al. use a modified PHP interpreter to perform static analysis with the aim of discovering malformed HTML output [1]. Even though their implementation is written in C, its performance is not scalable (15-20 minutes for small Web applications with less than 20,000 lines of code and 100 files).

Jensen et al. provide several arguments and data points as to why it is hard to statically analyze dynamic applications, including statistics confirming that a majority of popular Ruby metaprogramming and Javascript applications use the eval construct, many of which are applied on user input [18].

Most prior work on Javascript focused on client-side applications, which were relatively small compared to the new wave of server-side applications using platforms such as Node.js [8] (many of which are enterprise applications) [6, 12, 18, 21].

8. Summary

In this paper, we discussed why context-sensitive static analysis of dynamic code is challenging. We then presented a novel analysis method, called emulization, as a means to enable static analysis of dynamic code and described how an emulization-based analysis framework addresses the many challenges associated with analysis of dynamic code. Using emulization, we developed a prototype analysis framework with PHP as the target language. The power and effectiveness of emulization was demonstrated by analyzing Wordpress and Joomla, two large, dynamic web applications. The results showed that when compared to previous approaches, emulization significantly increased analysis coverage of these important dynamic applications. To further demonstrate the power of emulization, a case study was presented where the base analysis framework was augmented with a dynamic taint-tracking data-flow analyzer. The taint tracker was able to confirm previously reported vulnerabilities in applications studied in prior work, and it discovered two new security vulnerabilities in Wordpress and Joomla.

References

[1] S. Artzi, A. Kiezun, J. Dolby, F. Tip, D. Dig, A. Paradkar, and M. D. Ernst. Finding Bugs in Web Applications Using Dynamic Test Generation and Explicit-State Model Checking. IEEE Transactions on Software Engineering, 36(4):474–494, 2010.
[2] D. Balzarotti, M. Cova, V. Felmetsger, N. Jovanovic, E. Kirda, C. Kruegel, and G. Vigna. Saner: Composing Static and Dynamic Analysis to Validate Sanitization in Web Applications. IEEE Symposium on Security and Privacy, pages 387–401, 2008.
[3] P. Biggar and D. Gregg. Static analysis of dynamic scripting languages. Draft: Monday 17th August, 2009.
[4] E. Bodden, A. Sewe, J. Sinschek, H. Oueslati, and M. Mezini. Taming reflection: Aiding static analysis in the presence of reflection and custom class loaders. Proceedings of the 33rd International Conference on Software Engineering, pages 241–250, 2011.
[5] BuiltWith. CMS Usage Statistics. http://trends.builtwith.com/cms, 2016.
[6] M. Chang, E. Smith, R. Reitmaier, M. Bebenita, A. Gal, C. Wimmer, B. Eich, and M. Franz. Tracing for web 3.0: trace compilation for the next generation web applications. ACM, Mar. 2009.
[7] C. Csallner, N. Tillmann, and Y. Smaragdakis. DySy: dynamic symbolic execution for invariant inference. Proceedings of the 30th international conference on Software engineering, May 2008.
[8] R. Dahl. Node.js, a JavaScript runtime. https://nodejs.org/, 2009.
[9] J. Dahse and T. Holz. Simulation of Built-in PHP Features for Precise Static Code Analysis. NDSS 2014, 2014.
[10] J. Dahse, N. Krein, and T. Holz. Code Reuse Attacks in PHP: Automated POP Chain Generation. ACM, New York, New York, USA, Nov. 2014.
[11] DEVSENSE. Phalanger, open-source PHP implementation introducing the PHP language into the family of compiled .NET languages. https://phalanger.codeplex.com/, 2013.
[12] K. Dewey, V. Kashyap, and B. Hardekopf. A parallel abstract interpreter for JavaScript. In CGO '15: Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization. University of California, Santa Barbara, IEEE Computer Society, Feb. 2015.
[13] M. Furr, J.-h. D. An, and J. S. Foster. Profile-guided static typing for dynamic scripting languages. ACM SIGPLAN Notices, 44, Oct. 2009.
[14] D. Hauzar and J. Kofron. On Security Analysis of PHP Web Applications. In Computer Software and Applications Conference Workshops (COMPSACW), 2012 IEEE 36th Annual, pages 577–582. IEEE, 2012.
[15] D. Hauzar and J. Kofron. Framework for Static Analysis of PHP Applications. LIPIcs - Leibniz International Proceedings in Informatics, 37:711, 2015.
[16] M. Hills, P. Klint, and J. Vinju. An empirical study of PHP feature usage: a static analysis perspective. Proceedings of the 2013 International Symposium on Software Testing and Analysis, July 2013.
[17] Y.-W. Huang, F. Yu, C. Hang, C.-H. Tsai, D.-T. Lee, and S.-Y. Kuo. Securing web application code by static analysis and runtime protection. WWW, 2004.
[18] S. H. Jensen, P. A. Jonsson, and A. Møller. Remedying the eval that men do. Proceedings of the 2012 International Symposium on Software Testing and Analysis, July 2012.
[19] N. Jovanovic, C. Kruegel, and E. Kirda. Pixy: a static analysis tool for detecting Web application vulnerabilities. Security and Privacy, pages 6 pp.–263, 2006.
[20] N. Jovanovic, C. Kruegel, and E. Kirda. Precise alias analysis for static detection of web application vulnerabilities. PLAS, 2006.
[21] V. Kashyap, K. Dewey, E. A. Kuefner, J. Wagner, K. Gibbons, J. Sarracino, B. Wiedermann, and B. Hardekopf. JSAI: a static analysis platform for JavaScript. In FSE 2014: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering. Harvey Mudd College, ACM, Nov. 2014.
[22] E. Kneuss, P. Suter, and V. Kuncak. Runtime instrumentation for precise flow-sensitive type analysis. International Conference on Runtime Verification, 2010.
[23] B. Livshits, J. Whaley, and M. S. Lam. Reflection analysis for Java. Asian Symposium on Programming Languages and Systems, 2005.
[24] V. B. Livshits and M. S. Lam. Finding Security Vulnerabilities in Java Applications with Static Analysis. Usenix Security, 2005.
[25] C.-K. Luk, R. S. Cohn, R. Muth, H. Patil, A. Klauser, P. G. Lowney, S. Wallace, V. J. Reddi, and K. M. Hazelwood. Pin - building customized program analysis tools with dynamic instrumentation. PLDI, 2005.
[26] MarketWired. Joomla! CMS Passes 50 Million Downloads. http://www.marketwired.com/press-release/joomla-cms-passes-50-million-downloads-1882565.htm, 2014.
[27] A. Naderi-Afooshteh and A. Nguyen-Tuong. Joza: Hybrid Taint Inference for Defeating Web Application SQL Injection Attacks. 2015 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2015.
[28] A. Nguyen-Tuong, S. Guarnieri, D. Greene, J. Shirley, and D. Evans. Automatically Hardening Web Applications Using Precise Tainting. In Security and Privacy in the Age of Ubiquitous Computing, pages 295–307. Springer US, Boston, MA, 2005.
[29] I. Papagiannis, M. Migliavacca, and P. Pietzuch. PHP Aspis: Using Partial Taint Tracking to Protect Against Injection Attacks. WebApps 2011, 2011.
[30] M. Schäfer, M. Sridharan, J. Dolby, and F. Tip. Dynamic determinacy analysis. ACM SIGPLAN Notices, 48, June 2013.
[31] Y. Smaragdakis, M. Bravenboer, and O. Lhoták. Pick your contexts well: understanding object-sensitivity. ACM SIGPLAN Notices, 2011.
[32] S. Son and V. Shmatikov. SAFERPHP - finding semantic vulnerabilities in PHP applications. PLAS, 2011.
[33] S. Son, K. S. McKinley, and V. Shmatikov. Diglossia: detecting code injection attacks with precision and efficiency.

15th conference on USENIX Security Symposium - Volume 15. Stanford University, USENIX Association, July 2006.
Proceedings of the 2013 ACM SIGSAC Conference on Computer & Communications Security, Nov. 2013. [34] M. Sridharan, S. Artzi, M. Pistoia, S. Guarnieri, O. Tripp, and R. Berg. F4F - taint analysis of framework-based web applications. OOPSLA, 2011. [35] A. Srivastava and A. Eustace. ATOM - A System for Building Customized Program Analysis Tools. PLDI, 1994. [36] Z. Su and G. Wassermann. Sound and precise analysis of web applications for injection vulnerabilities. ACM Sigplan Notices, 42, June 2007. [37] S. Wei and B. G. Ryder. A practical blended analysis for dynamic features in javascript. 2012. [38] S. Wei and B. G. Ryder. State-Sensitive Points-to Analysis for the Dynamic Behavior of JavaScript Objects. ECOOP, 2014. [39] Y. Xie and A. Aiken. Static detection of security vulnerabilities in scripting languages. In USENIX-SS’06: Proceedings of the
