Rapid Development of Flexible and Efficient Dynamic Program

Rapid Development of Flexible and Efficient Dynamic Program Analysis Tools ∗ Danilo Ansaloni University of Lugano Switzerland [email protected] ABSTRACT calling-context profiling, there is a lack of tools that allow In this work, we show that it is possible to raise the abstrac- users to specify custom analyses. Moreover, since prevailing tion level of dynamic program analysis frameworks, pro- analysis tools lack the flexibility to profile only a specific viding high-level constructs to improve flexibility, efficiency, subset of the observed application, resource utilization and and usability of dynamic analysis tools. We introduce a new performance are usually measured after the implementation framework based on aspect-oriented programming that en- phase, when the program is stable enough to be deployed ables rapid development of efficient dynamic program anal- and executed. However, detecting and solving runtime issues ysis tools. Moreover, we present high-level constructs for so late in the development process often incurs high costs, the parallelization of specific code regions and for runtime particularly if it turns out that some taken design choices adaptation of the scope of the analysis. The viability and prevent good performance and the design has to be revised. benefits of our approach are confirmed with several dynamic We believe there is a need for a novel dynamic program analysis tools implemented with our framework. analysis framework that allows programmers to rapidly specify custom analyses for measuring runtime performance of parts of a program, during the implementation phase. To 1. PROBLEM AND MOTIVATION this end, we identify the following shortcomings in prevailing Prevailing programming languages, such as Java, C#, dynamic analysis tools: Scala, as well as some implementations of Python and Ruby, • Limited flexibility: bytecode instrumentation is usu- can be compiled to an intermediate code representation (i.e., ally performed using low-level techniques that require bytecode) and executed on a virtual machine. This approach expert knowledge at the level of the virtual machine. allows code portability across different platforms, providing While this approach allows developers to specify any a standard abstraction of the underlying system. Moreover, possible instrumentation, tool development is tedious state-of-the-art virtual machines offer automatic memory and error-prone, and the resulting tools are often management, just-in-time compilation, and automatic code hardly extensible and customizable [2, 24, 46]. optimizations. All these features move the task of producing optimized code from the programmer to the virtual machine, • Limited efficiency: the inserted bytecode usually col- reducing development time to achieve good performance. lects dynamic information about the observed program The presence of a virtual machine layer in the software and performs some analysis. In many cases, this com- stack makes it difficult to understand the performance of putation could be done in parallel with the execution complex applications executing on modern hardware plat- of the observed application. However, only few dy- forms. For example, automatic memory management and namic program analysis tools take advantage of this runtime optimizations performed by the virtual machine opportunity to efficiently exploit under-utilized CPU may have a significant impact on overall performance. Since cores and reduce analysis overhead [3, 4]. these virtual machine activities depend on the runtime behavior of an application, static program analysis may not • Performance interference: due to the aforementioned always locate potential hotspots. In addition, some features issues, the user is often forced to perform a set of perva- of the programming language, such as polymorphism and sive and computationally expensive analyses that may reflection, may limit the scope of static analysis tools and create some bias in the observed metrics. potentially lead to biased results. To overcome these limitations, dynamic program analysis The goal of our research is to overcome the aforemen- tools provide insight into an application’s runtime behavior tioned limitations by raising the abstraction level of current by measuring relevant system activities while the program dynamic program analysis frameworks, providing high-level is executing. Such tools are usually based on bytecode in- constructs to improve flexibility, efficiency, and usability of strumentation techniques to insert monitoring code into the dynamic analyses. bytecode of the observed application. While there are dynamic analysis tools for common tasks, 2. BACKGROUND such as data race detection, memory leak detection, and Aspect-oriented programming (AOP) [23] enables the ∗ Adviser: Prof. Walter Binder, University of Lugano, specification of cross-cutting concerns in applications, avoid- Switzerland ing related code that is scattered throughout methods, classes, or components. Aspects specify pointcuts to inter- Javassist [14] is a library for load-time structural re- cept certain points in the execution of programs (so-called flection and for defining new classes at runtime. The join points), such as method calls, field accesses, etc. Ad- API allows two different levels of abstraction: source-level vice are executed before, after, or around the intercepted and bytecode-level. Compared to the aforementioned ap- join points, and have access to contextual information of proaches, the source-level abstraction of Javassist is partic- the join points. During the weaving process, aspect weavers ularly interesting as it does not require any knowledge of modify the code of the base application to execute advice at the Java bytecode format and it allows insertion of code intercepted join points. fragments specified with source text. Traditionally, AOP has been used for disposing of“design While these low-level approaches are very flexible because smells”such as needless repetition, and for improving main- the possible transformations are not restricted, tool devel- tainability of applications. AOP has also been successfully opment is tedious and error-prone. As another drawback, applied to the development of software engineering tools, the resulting analysis tools are often complex and difficult such as profilers, debuggers, or testing tools [7,28,40], which to maintain and to extend [2]. in many cases can be specified as aspects in a concise man- ner. Hence, in a sense, AOP can be regarded as a versatile 3.2 Java Profiling Tools approach for specifying certain program transformations at Prevailing Java profilers, such as JProf 4, and JFluid [16] a high level, hiding low-level implementation details, such are good examples of current state-of-the-art profilers. They as bytecode manipulation, from the programmer. are optimized for common tasks, such as memory-leak detec- Dynamic AOP allows aspects to be changed and code tion, CPU utilization monitoring, and heap walking. How- to be rewoven in a running system, thus enabling runtime ever, prevailing profilers fall short when it comes to provide adaptation of applications. Dynamic AOP has been used a way to specify custom instrumentations to perform more for adding persistence or caching at runtime [26], or for de- specific analysis. The same limitation applies to many dy- bugging and fixing bugs in a running server [15]. However, namic analysis tools, such as *J [17], jPredictor [12], and prevailing aspect languages have not been specifically devel- THOR [34]. oped for implementing dynamic analyses, thus lacking some BTrace 5 provides an annotation-based language to let the features to express essential instrumentations for certain dy- user specify which program attributes to profile. However, namic analyses. this language is limited and does not support the specification of more advanced instrumentations. For example, a BTrace profiler cannot allocate new objects, specify synchro- 3. RELATED WORK nized blocks, or have loops. Since our framework relies on Java bytecode instrumen- AOP-based tools, such as DJProf [28], allow the devel- tation, Section 3.1 describes prevailing approaches for Java oper to define custom instrumentation by writing custom bytecode manipulation. Section 3.2 reviews common profil- aspects. However, mainstream AOP languages lack high- ing tools for Java. Section 3.3 offers a description of state- level constructs to develop efficient dynamic analysis tools of-the-art frameworks for dynamic AOP. Finally, Section 3.4 that can take advantage of modern multicore architectures. discusses how recent dynamic analysis tools take advantage of multicore machines. 3.3 Dynamic Aspect-Oriented Programming While dynamic AOP is often supported in dynamic lan- 3.1 Java Bytecode Instrumentation guages [8, 21] such as Lisp or Smalltalk, where code can be easily manipulated at runtime, it is more difficult and chal- Java bytecode instrumentation is usually performed with lenging to offer dynamic AOP for languages like Java. low-level tools that require in-depth knowledge of the byte- PROSE [27, 29, 30] is an adaptive middleware platform code format. 1 2 offering dynamic AOP. PROSE has evolved in three versions, BCEL and ASM allow programmers to analyze, create, which make use of the three different approaches to dynamic and manipulate Java class files by means of a low-level API. AOP. The latest version of PROSE supports two weaving Java classes, constant pools, methods, and bytecode instruc- strategies: advice weaving, based on method replacement, tions are represented as

Rapid Development of Flexible and Efficient Dynamic Program

Static Analysis the Workhorse of a End-To-End Securitye Testing Strategy

The LLVM Instruction Set and Compilation Strategy

Precise and Scalable Static Program Analysis of NASA Flight Software

Advanced Systems Security: Symbolic Execution

Static Program Analysis Via 3-Valued Logic

Toward IFVM Virtual Machine: a Model Driven IFML Interpretation

Opportunities and Open Problems for Static and Dynamic Program Analysis Mark Harman∗, Peter O’Hearn∗ ∗Facebook London and University College London, UK

Superoptimization of Webassembly Bytecode

The Art, Science, and Engineering of Fuzzing: a Survey

DEVELOPING SECURE SOFTWARE T in an AGILE PROCESS Dejan

Using Static Program Analysis to Aid Intrusion Detection

Engineering of Reliable and Secure Software Via Customizable Integrated Compilation Systems