Analyzing Large-Scale Object-Oriented Software to Find and Remove Runtime Bloat

Dissertation

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By

Guoqing Xu

Graduate Program in Computer Science and Engineering

The Ohio State University

2011

Dissertation Committee:

Atanas Rountev, Advisor
Feng Qin
Michael D. Bond

ABSTRACT

Over the past decade, the pervasive use of object-oriented languages and the increasing complexity of problems solved by computer software have led to the proliferation of large-scale framework-intensive applications. These applications are typically built by combining standard packages (e.g., J2EE application server frameworks), third-party layers for domain-specific functionality, and in-house solutions. While employing libraries and frameworks eases the development effort, it can be expensive to invoke the general APIs provided by these libraries, especially when they are used for simple tasks. As a result, many applications suffer from excessive memory footprint caused by chronic runtime bloat that significantly impacts scalability and performance. In addition, programmers are taught to pay more attention to abstractions and models, and to leave the performance optimization to the runtime system.

While a few redundant objects, calls, and field copies may seem insignificant, the problem quickly gets exacerbated due to nesting and layering. At some point, these inefficiencies accumulate, and compilers (e.g., the JIT in a JVM) can no longer eliminate them, since layers of abstractions grow to be deep, dynamic, and well beyond the capabilities of compiler analysis.

In this dissertation, the term bloat is used to refer to the general phenomenon of using excessive work and memory to achieve seemingly simple tasks. Bloat exists commonly in large-scale object-oriented programs and it is a major challenge that stands in the way of bridging the productivity-performance gap between managed and unmanaged languages. The overarching goal of our work is to find large bloat-related optimization opportunities, with a small amount of developer time. As a fundamental methodology to achieve this goal, we advocate tool-assisted manual optimization, in order to combine developer insight with automated tool support. In particular, we have designed, implemented, and evaluated code checking techniques accomplished by bytecode-level static analysis, and runtime optimization techniques achieved by VM-level dynamic analysis, for detecting and removing runtime bloat in large framework-intensive Java applications. These techniques can be used to (1) help a programmer detect the root cause if a problem is seen; (2) remove frequently-occurring bloat patterns; and (3) prevent bloat from occurring in an early stage of development. Incorporating these techniques in an existing Java compiler or virtual machine will lead to increased developer productivity by helping programmers quickly locate bloat-related performance bottlenecks, as well as improved scalability of large programs by optimizing away bloat at run time. In addition, it is possible to generalize these approaches to handle similar problems in programs written in other object-oriented languages (e.g., C#), so that non-Java communities would also benefit from them. While these techniques are designed specifically to solve the bloat problem, many of them are applications of more general (theoretical) frameworks developed in the dissertation that can be instantiated to solve other problems.

Bloat contains wasteful operations that, while not strictly necessary for forward progress, are executed nevertheless. We propose novel dynamic analysis techniques to detect such wasteful operations and to produce information that is necessary for the programmer to pinpoint the performance bottlenecks. One such analysis, as the first contribution of this dissertation, is copy profiling. This analysis is designed based on an important observation that the wasteful operations often consist of data copy activities that move data among locations without any useful computation. By profiling copies, this analysis looks for program regions containing large volumes of copies and data structures whose construction involves data copied frequently from other data structures.

Different from this “from-symptom-to-cause” approach that finds bloat from the symptoms through which it manifests, the second dynamic analysis this dissertation advocates attempts to capture directly the center of bloat, which is the set of operations that, while expensive to execute, produce values of little benefit. Detecting the operations that have high cost-benefit ratios is, thus, an important step towards tracking down the causes of performance problems. In order to compute cost and benefit efficiently, we propose a novel technique, called abstract dynamic thin slicing, that performs dynamic thin slicing over bounded abstract domains. We demonstrate, using real-world examples, that this technique can also be adopted to solve a range of backward data flow problems efficiently. With the help of a variety of data aggregation approaches, these analyses can help a programmer quickly find potential performance problems.

The accumulation of bloat effects may result in memory leaks, which occur when object references that are no longer needed are unnecessarily maintained. Existing dynamic analyses for leak detection track fine-grained information about individual objects, producing results that are hard to interpret and lack precision. The third contribution of the dissertation is a novel container-based heap-tracking technique, based on the observation that many memory leaks in Java programs occur due to containers that keep references to unused data entries. By profiling containers and understanding their semantics, it is much easier to track down the causes of memory leak problems, compared to existing leak detection approaches based on the tracking of arbitrary objects.

Although container profiling is effective in detecting container-induced memory leaks, it has difficulties dealing with leaks that are caused by general unnecessary references instead of containers. In order to help programmers identify the root causes of these general leaks, we propose a specification-based dynamic technique called LeakChaser, as the fourth contribution of this dissertation. LeakChaser brings high-level application semantics into low-level leak detection by allowing programmers to specify and infer object liveness properties. This new technique exploits object lifetime relationships and uses varying levels of abstraction to help both experts and novices quickly explore the leaky behavior to pinpoint the leak cause.

Using these four dynamic analyses, we have found many interesting bloat patterns that can be regularly observed in the execution of large Java programs. A further step to avoid bloat is to develop static analyses that can find and remove such patterns during application development, so that small performance issues can be prevented before they accumulate and become significant.

One interesting pattern is the inefficient use of Java containers. The fifth contribution of this dissertation is a static analysis that identifies inefficiencies in the use of containers, regardless of inputs and runs. Specifically, this static analysis detects underutilized and overpopulated containers by employing a context-free-language (CFL)-reachability formulation of container operations, and by exploiting container-specific properties. The analysis is client-driven and demand-driven. It always generates highly-precise reports, but trades soundness for scalability. We show that this analysis exhibits small false positive rates, and large optimization opportunities can be found by inspecting the generated reports.

Another bloat pattern that we have regularly observed is constructing and initializing data structures that are invariant across loop iterations. As the sixth contribution of this dissertation, we develop a static analysis that uses a type and effect system to help programmers find such loop-invariant data structures. Instead of transforming a program to hoist these data structures automatically (which must be over-conservative and thus becomes ineffective in practice), we advocate a semi-automated approach that uses the static analysis to compute hoistability measurements. These measurements indicate how likely it is that these data structures can be hoisted, and are presented to the user for manual inspection. Our experimental results indicate that the proposed technique can be useful both in the development phase (for finding small performance issues before they accumulate) and in performance tuning (for identifying significant performance bottlenecks).

In conclusion, despite the existence of a number of profiling tools, this dissertation attempts to establish more systematic ways of identifying run-time inefficiencies. The dynamic analyses presented in the dissertation are implemented in J9 (a commercial JVM developed by IBM) and Jikes RVM (an open-source JVM written in Java). The static analyses are implemented in Soot, a popular Java program analysis framework. All of the proposed analyses have been shown to scale to real-world Java applications. Our experimental results strongly suggest that these techniques can be used in practice to find and remove bloat in large-scale applications, leading to performance gains. We hope that with the help of the analyses we have developed, performance tuning could be made much easier and will no longer be a daunting task that requires special skills and experience. Developers should be able to easily understand performance and perform optimizations, when they are assisted by good tools and do not need to focus on every low-level detail of the execution behavior and the analysis process. The productivity-performance gap between managed languages and unmanaged languages could be further reduced by using these techniques and tools so that performance would no longer be an issue that stands in the way of using object-oriented languages to implement performance-critical systems. Furthermore, we hope that the examples and patterns discovered by this dissertation can be used to raise the awareness of bloat in real-world software development—developers should understand the performance impact of their decisions and should try to avoid these bloat patterns in order to have high-performance implementations.

To my grandfather Jiming Xu and my wife Jingyuan Yang

ACKNOWLEDGMENTS

I would like to start by giving my deepest thanks to my Ph.D. advisor Nasko Rountev, who has spent a tremendous amount of time and energy forming me into both a good researcher and a nice person. His unwavering patience and optimism have been a constant source of encouragement that helped me overcome numerous challenges and difficulties, and keep pushing my projects forward towards seemingly impossible goals. During the past six years, Nasko offered me the largest possible freedom that one can get in graduate school. I was able to do almost whatever I would like to do—pursuing my own research interests, working with other groups, doing internships, attending classes that have nothing to do with my projects, etc. His dedicated mentorship made my six-year graduate school experience a fulfilling and satisfying one.

The work described in this dissertation has benefited from collaboration with a number of researchers from IBM Research. Nick Mitchell and Gary Sevitsky are the ones who coined the term “runtime bloat” after many years’ study of performance issues in real-world large-scale applications. They are my main motivation to explore the bloat problem. Nick also did a case study and some empirical comparisons for the copy profiling project discussed in Chapter 3. Matthew Arnold mentored me during my first internship with IBM and provided me with tremendous support for understanding and modifying the J9 virtual machine. Manu Sridharan invested a great amount of time into the alias analysis project. Michael Hind wrote a reference letter to support my job application. Erik Altman and Edith Schonberg, my group managers, made a lot of effort to keep me on the project after my summer internships ended, allowing me to have access to IBM internal resources when I was in school. All in all, this dissertation would not have been possible without the two summer internships at IBM, not to mention that much of my education and research at Ohio State was supported through a Ph.D. fellowship and other funds from IBM Research.

Many people at Ohio State made contributions to the content of this dissertation. I need to thank Feng Qin and Michael Bond for serving on my dissertation committee. Feng is always willing to help when I need to deal with large-scale systems code. Through our regular meetings and discussions, I learned many interesting ways that systems researchers use to solve programming problems. Mike has been a constant support for both modifying Jikes RVM and answering my job interview-related questions. I especially want to thank Dacong Yan, a graduate student in my research group, who implemented the frequency profiler for the loop-invariant data structure hoisting project and helped me with a lot of things when I was working from home in California during my last year.

I thank God for showing himself to me and guiding me throughout my life. This work wouldn’t have been possible without the amazing support, encouragement, and love from my dear wife Jingyuan, my parents, and my grand-parents. Jingyuan has been absolutely my biggest support during my six years of Ph.D. studies: she took care of most of the family duties so that I could have large chunks of time to work on research projects. My grandfather Jiming Xu was my first teacher in my life who spent his last twenty years teaching me love and care, and how to be a better person. I dedicate this dissertation to them.

The material presented in this dissertation is based upon work supported by the National Science Foundation under CAREER grant CCF-0546040 and grant CCF-1017204, by an IBM Ph.D. Fellowship Award, and by an IBM Software Quality Innovation Faculty Award. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

VITA

September 2010 – August 2011 ...... IBM Ph.D. Fellow at The Ohio State University
June 2010 ...... M.S. Computer Science, The Ohio State University
September 2006 – August 2010 ...... Graduate Research Associate, The Ohio State University
September 2005 – August 2006 ...... University Graduate Fellow, The Ohio State University
March 2005 ...... M.S. Computer Science, East China Normal University
June 2002 ...... B.S. Computer Science, East China Normal University

PUBLICATIONS

Research Publications

Guoqing Xu, Dacong Yan, and Atanas Rountev. Finding Loop-Invariant Data Structures. In Technical Report OSU-CISRC-8/11-TR24, CSE/OSU, August 2011.

Dacong Yan, Guoqing Xu, and Atanas Rountev. Context-Sensitive Demand-Driven Analysis for Java. In ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA’11), pages 155-165, July 2011.

Guoqing Xu, Michael D. Bond, Feng Qin, and Atanas Rountev. LeakChaser: Helping Programmers Narrow Down Causes of Memory Leaks. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’11), pages 270-282, June 2011.

Raffi Khatchadourian, Phil Greenwood, Awais Rashid, and Guoqing Xu. Pointcut Rejuvenation: Recovering Pointcut Expressions in Evolving Aspect-Oriented Software. In IEEE Transactions on Software Engineering (TSE), to appear, 2011.

Guoqing Xu, Matthew Arnold, Nick Mitchell, Atanas Rountev, and Gary Sevitsky. Software Bloat Analysis: Finding, Removing, and Preventing Performance Problems in Modern Large-Scale Object-Oriented Applications. In ACM SIGSOFT FSE/SDP Workshop on the Future of Software Engineering Research (FoSER’10), pages 421-425, November 2010.

Guoqing Xu, Matthew Arnold, Nick Mitchell, Atanas Rountev, Edith Schonberg, and Gary Sevitsky. Finding Low-Utility Data Structures. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’10), pages 174-186, June 2010.

Guoqing Xu and Atanas Rountev. Detecting Inefficiently-Used Containers to Avoid Bloat. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’10), pages 160-173, June 2010.

Raffi Khatchadourian, Phil Greenwood, Awais Rashid, and Guoqing Xu. Pointcut Rejuvenation: Recovering Pointcut Expressions in Evolving Aspect-Oriented Software. In IEEE/ACM International Conference on Automated Software Engineering (ASE ’09), short paper, pages 575-579, November 2009.

Guoqing Xu, Atanas Rountev, and Manu Sridharan. Scaling CFL-Reachability-Based Points-To Analysis Using Context-Sensitive Must-Not-Alias Analysis. In European Conference on Object-Oriented Programming (ECOOP’09), pages 98-122, July 2009.

Guoqing Xu, Matthew Arnold, Nick Mitchell, Atanas Rountev, and Gary Sevitsky. Go with the Flow: Profiling Copies to Find Runtime Bloat. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’09), pages 419-430, June 2009.

Guoqing Xu and Atanas Rountev. Merging Equivalent Contexts for Scalable Heap-Cloning-Based Context-Sensitive Points-to Analysis. In ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA’08), pages 225-236, July 2008.

Guoqing Xu and Atanas Rountev. Precise Memory Leak Detection for Java Software Using Container Profiling. In ACM SIGSOFT/IEEE International Conference on Software Engineering (ICSE’08), pages 151-160, May 2008. ACM SIGSOFT Distinguished Paper Award.

Guoqing Xu and Atanas Rountev. AJANA: A General Framework for Source-Code-Level Interprocedural Dataflow Analysis of AspectJ Software. In International Conference on Aspect-Oriented Software Development (AOSD’08), pages 36-47, March 2008.

Atanas Rountev, Mariana Sharp, and Guoqing Xu. IDE Dataflow Analysis in the Presence of Large Object-Oriented Libraries. In International Conference on Compiler Construction (CC’08), pages 53-68, April 2008.

Guoqing Xu, Atanas Rountev, Yan Tang, and Feng Qin. Efficient Checkpointing of Java Software Using Context-Sensitive Capture and Replay. In ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE’07), pages 85-94, September 2007.

Guoqing Xu and Atanas Rountev. Regression Test Selection for AspectJ Software. In ACM SIGSOFT/IEEE International Conference on Software Engineering (ICSE’07), pages 65-74, May 2007.

FIELDS OF STUDY

Major Field: Computer Science and Engineering

Studies in:
Programming Languages and Compilers (Prof. Atanas Rountev)
Software Systems (Prof. Feng Qin)
Distributed Computing (Prof. Gagan Agrawal)

TABLE OF CONTENTS

Page

Abstract...... ii

Dedication...... viii

Acknowledgments...... ix

Vita ...... xii

List of Figures ...... xix

List of Tables ...... xxiii

Chapters:

1. Introduction...... 1

1.1 Challenges ...... 3
1.2 Compiler-Assisted Manual Tuning: Bloat Detection via Program Analysis ...... 6
1.2.1 Dynamic Analyses That Detect Bloat by Making Sense of Heaps and Executions ...... 6
1.2.2 Static Analyses That Remove and Prevent Bloat ...... 10
1.3 Impact ...... 12

2. Analysis Frameworks ...... 15

2.1 Dynamic Analysis Implementation ...... 15
2.2 Dynamic Information Flow Infrastructure ...... 16
2.3 Static Analysis Implementations ...... 20
2.4 Benchmarks and Execution Platforms ...... 20

3. Making Sense of Copy Chains ...... 22

3.1 Profiling Copy Activity ...... 24
3.2 Profiling Copy Chains and Copy Graph ...... 26
3.2.1 Copy Chain ...... 26
3.2.2 Copy Graph ...... 29
3.2.3 Context Sensitivity ...... 32
3.3 Copy Graph Construction ...... 33
3.3.1 Data Structure Design ...... 34
3.3.2 Instrumentation Relation CG ⇒a CG′ ...... 36
3.4 Copy Graph Client Analyses ...... 38
3.4.1 Hot Copy Chains ...... 39
3.4.2 Clone Detector ...... 40
3.4.3 Not Assigned To Heap (NATH) ...... 41
3.4.4 Other Potential Clients ...... 42
3.5 Using Copy Profiles to Find Bloat ...... 43
3.5.1 DaCapo Bloat ...... 43
3.5.2 Java 5 GregorianCalendar ...... 44
3.5.3 DaCapo Eclipse ...... 46
3.6 Copy Graph Characteristics ...... 49
3.7 Summary and Interpretation ...... 55

4. Making Sense of Cost and Benefit ...... 57

4.1 Technical Challenges and The Basic Idea ...... 58
4.2 Cost Computation Using Abstract Slicing ...... 64
4.2.1 Abstract Dynamic Thin Slicing ...... 64
4.2.2 Cost Computation ...... 71
4.2.3 Construction of Gcost ...... 74
4.3 Relative Object Cost-Benefit Analysis ...... 79
4.3.1 Analysis Algorithm ...... 80
4.3.2 Comparison with Other Design Choices ...... 85
4.4 Evaluation ...... 88
4.4.1 Gcost Characteristics and Bloat Measurement ...... 88
4.4.2 Case Studies ...... 92
4.5 Summary and Interpretation ...... 98

5. Making Sense of Container Usage: Memory Leak Detection Using Container Profiling ...... 100

5.1 Leak Confidence Analysis ...... 105

5.1.1 Memory Contribution ...... 108
5.1.2 Staleness Contribution ...... 110
5.1.3 Putting it All Together: Leaking Confidence ...... 111
5.2 Memory Leak Detection for Java ...... 113
5.2.1 Container Modeling ...... 113
5.2.2 Instrumentation ...... 116
5.2.3 Profiling ...... 117
5.2.4 Data Analysis ...... 120
5.3 Empirical Evaluation ...... 122
5.3.1 Detection of Real-World Memory Leaks ...... 122
5.3.2 Static and Dynamic Overhead ...... 134
5.4 Summary and Interpretation ...... 138

6. Making Sense of Transaction Behaviors: Memory Leak Detection Using LeakChaser ...... 140

6.1 Overview ...... 147
6.2 Assertions and Transactions ...... 153
6.3 Implementation ...... 160
6.4 Case Studies ...... 163
6.5 Overhead ...... 174
6.6 Summary and Interpretation ...... 177

7. Statically Detecting Inefficiently-Used Containers to Avoid Bloat ...... 179

7.1 Formulation of Container Operations ...... 190
7.1.1 CFL-Reachability Formulation of Points-to Analysis ...... 191
7.1.2 Example ...... 194
7.1.3 Formulation of Container Operations ...... 195
7.1.4 Analysis Algorithms ...... 202
7.2 Execution Frequency Comparison ...... 206
7.2.1 Dynamic Frequency Profiling ...... 206
7.2.2 Static Inference of Potentially-Smaller Relationships ...... 207
7.2.3 Comparison between Static Inference and Dynamic Profiling ...... 216
7.3 Empirical Evaluation ...... 216
7.4 Summary and Interpretation ...... 225

8. Statically Detecting Loop-Invariant Data Structures to Avoid Bloat ...... 226

8.1 Overview ...... 231
8.2 Loop-Invariant Logical Data Structures ...... 237
8.2.1 Language, Semantics, and Effect System ...... 238

8.2.2 Hoistable Logical Data Structures ...... 249
8.3 Computing Hoistability Measurements ...... 253
8.4 Implementation ...... 256
8.5 Evaluation ...... 259
8.5.1 Static Analysis and Hoisting ...... 259
8.5.2 Case Studies ...... 261
8.6 Summary and Interpretation ...... 265

9. Related Work ...... 267

9.1 Dynamic Analysis ...... 267
9.1.1 Bloat Detection ...... 267
9.1.2 Control- and Data-Based Profiling ...... 269
9.1.3 Dynamic Slicing ...... 271
9.1.4 Dynamic Information Flow Analysis ...... 272
9.1.5 Dynamic Memory Leak Detection ...... 273
9.1.6 Heap Assertions ...... 274
9.2 Static Analysis ...... 276
9.2.1 Static Memory Leak Detection ...... 276
9.2.2 CFL-Reachability-Based Analyses ...... 277
9.2.3 Object Reachability Analysis ...... 278
9.2.4 Recency Abstraction ...... 279
9.2.5 Compiler Optimizations for Performance Improvement ...... 280

10. Conclusions ...... 282

11. Future Work ...... 287

Bibliography ...... 292

LIST OF FIGURES

Figure Page

2.1 Representation of an object in the Java heap and its shadow memory. f, g, and h represent the fields in this object. d represents a constant offset...... 17

3.1 The steps a commercial document management server uses to decode a cookie; the original version tokenizes and returns the entire map, even if the caller needs only one name-value pair...... 23

3.2 A breakdown of activity in a document processing server application. The baseline, at 100%, is the original code run with JIT optimizations disabled. This baseline is compared to the original code with JIT optimizations enabled, and to an implementation with a dozen hand- tunings...... 25

3.3 Copy profiling running example...... 27

3.4 A copy chain due to ListClient.slowClone. Line numbers 6, 9, 13, and 25 correspond to the code in Figure 3.3...... 28

3.5 Partial copy graph with context-insensitive and context-sensitive object naming...... 31

3.6 Data structure overview...... 34

3.7 Run-time effects of instrumentation...... 37

4.1 (a) A simple program and the updates of the tracking data; (b) Corre- sponding data dependence graph; an edge b → a shows that instruction a uses the value defined by instruction b...... 60

4.2 Data dependence graphs for three BDF problems. Line numbers are used to represent the corresponding instructions. Arrows with solid lines are def-use edges. (a) Null origin tracking. (b) Typestate history recording; arrows with dashed lines represent “next-event” relationships. (c) Extended copy profiling; Oi denotes the allocation site at line i...... 65

4.3 (a) Code example. (b) Corresponding dependence graph Gcost; nodes in boxes/circles write/read heap locations; underlined nodes create ob- jects. (c) Nodes in method A.foo (Node), their frequencies (Freq), and their abstract costs (AC ); (d) Relative abstract costs i-RAC and bene- fits i-RAB for the three allocation sites; i is the level of reference edges considered...... 70

4.4 Inference rules defining the run-time effects of instrumentation.. . . . 76

4.5 (a) Relative abstract cost and benefit. Nodes considered in computing RAC and RAB for O.g (where O is the allocation site for the object referenced by z) are included in the two circles, respectively; (b) Illustration of n-RAC and n-RAB for the object created by o = new O; dashed arrows are reference edges...... 82

4.6 Real-world example that our analysis found in eclipse...... 84

5.1 Container hierarchy in Java...... 103

5.2 A sample memory time graph...... 109

5.3 Modeling of container java.util.HashMap...... 114

5.4 Compressed recording of OPR events...... 119

5.5 Reports for JDK bug #6209673...... 124

5.6 Reports for JDK bug #6559589...... 127

5.7 Memory footprint before and after fixing JDK bug #6559589. .... 128

5.8 Report for the SPECjbb2000 bug...... 130

5.9 Report for the leak bug from IBM developerWorks...... 133

5.10 Dynamic overhead under different configurations...... 138

6.1 Illustration of the diagnosis process. Spirals at each level indicate that a user may need to run each tier multiple times with refined specifi- cations (e.g., smaller transaction regions, more precise shared object specifications, etc.) to gain sufficient knowledge to proceed to a lower tier...... 145

6.2 Overview of the technique, illustrated using a simplified version of Eclipse bug #115789...... 148

6.3 A simple assert language: (a) abstract syntax (b) semantic domains. 154

6.4 Operational semantics...... 156

6.5 Checking and inferring of unnecessary references...... 158

6.6 Assertion table implementation that allows quick assertion checks. . . 161

6.7 (a) Throughput improvements after fixing the memory problems detected in SPECjbb; (b) Comparison between the memory footprints of Mckoi before and after fixing the leak...... 169

7.1 (a) An example of an underutilized container, extracted from program JFlex, a scanner generator for Java. Vector list is used only to pass the three Interval objects to the constructor of RegExpr1; (b) An ex- ample of an overpopulated container, extracted from program Muffin,a WWW filtering system written in Java. Given a request, method cgi always decodes the entire request string into Hashtable attrs. How- ever, this HashMap is later subjected to at most two lookup operations. 181

7.2 Running example...... 190

7.3 Flow graph for the running example...... 193

7.4 Inequality graph for the running example...... 211

xxi 8.1 Real-world examples of heavy-weight creation of loop-independent data structures. (a) A SimpleDateFormat object is created inside the loop to parse the given Date objects; (b) An XMLTransformer object is created within the loop to transform the input XML files...... 227

8.2 Running example...... 233

8.3 Data structures identified for the running example and their effect annotations. (a) Points-to effects among objects; (b) Dependence effects among memory locations...... 234

8.4 A simple while language: (a) abstract syntax; (b) semantic domains. 238

8.5 Concrete instrumented semantics...... 241

8.6 Syntax of types and abstract effects...... 244

8.7 Typing...... 245

8.8 Auxiliary operations...... 246

8.9 Abstraction relation ⊏...... 248

LIST OF TABLES

Table Page

1.1 In a commercial document management server, no single frequently-executed method can be optimized for easy performance gains...... 4

3.1 Eclipse 3.1 performance problems, fixes, and performance improvements...... 48

3.2 Copy graph size and time/space overhead, part 1. Shown are the orig- inal running time Torig , as well as the total numbers of graph nodes N0 and edges E0, the total amount of memory consumed M0, the run- ning time T0, and the slowdown (shown in parentheses) when using a context-insensitive copy graph...... 51

3.3 Copy graph size and time/space overhead, part 2. The columns report the same measurements as Table 3.2, but for 1-object sensitive copy graph with 4 context slots...... 52

3.4 Copy graph size and time/space overhead, part 3. The columns report the same measurements for 1-object sensitive copy graph with 8 context slots...... 52

3.5 Copy graph size and time/space overhead, part 4. The columns report the same measurements for 1-object sensitive copy graph with 16 context slots...... 53

3.6 Copy chains and NATH objects. All copy graph paths with length ≤ 5 are traversed to compute hot chains. The columns show the total number of generated chains, the average length of these chains, and the number of NATH allocation sites and NATH run-time objects. . 53

3.7 Average node fan-out for context-insensitive (CIFO) and context-sensitive (CSFO-i) copy graphs, as well as average context conflict ratios (CCR-i) for the context-sensitive copy graphs...... 54

4.1 Characteristics of Gcost (I). Reported are the numbers (in thousand) of nodes (N ) and edges (E), the memory overhead (in megabytes) excluding the size of the shadow heap (M ), the running time overhead (O), and the context conflict ratio (CR)...... 89

4.2 Characteristics of Gcost (II). This table reports the total number (in billion) of instruction instances (I ), the percentages of instruction in- stances (directly and transitively) producing values that are ultimately dead (IPD), the percentages of instruction instances (directly or tran- sitively) producing values that end up only in predicates (IPP), and the percentages of Gcost nodes such that all the instruction instances represented by these nodes produce ultimately-dead values (NLD). . 90

5.1 Partial report of LC, MC, and SC values...... 111

5.2 Mapping between actual container methods and abstract container op- erations: (a) methods defined in java.util.Collection; (b) methods defined in java.util.Map...... 115

5.3 Data collected by our profiler...... 117

5.4 Confidences for leak-free programs...... 134

5.5 Static analysis running time...... 135

5.6 Dynamic overhead I: (a) original running time; (b) running with 1/15 gc rate; (c) run-time overhead...... 136

5.7 Dynamic overhead II: (a) original running time; (b) running with 1/50 gc rate; (c) run-time overhead...... 137

6.1 Tier H statistics for transactions for the case studies. Shown are the number of transaction runs (#Tran), the total number of objects tracked (#TO), the number of transaction-local objects (#LO), the number of shared objects inferred (#SO), the number of violations reported after filtering (#V), and the number of violations filtered by our filtering system (#F)...... 165

6.2 Time and space overheads incurred by our infrastructure for the FastAdaptiveImmix configuration. Shown in column (a) are the original overall execution times (T1) in seconds, the execution times for our modified JVM (T2), and the overheads (OT) in times (×). Column (b) reports the GC times and overheads: G1 and G2 show the average times for each GC run in the original RVM and our modified RVM, respectively. OG reports the GC overheads as G2/G1. Memory consumption and overheads are shown in column (c): S1 and S2 are the maximum amounts of memory used during the executions in the original RVM and our RVM, respectively. OS reports the space overheads as S1/S2...... 175

6.3 Overheads for the cases that we have studied under FastAdaptiveIm- mix. * indicates that we measure throughput (#operations per second) instead of running time. For both MySQL and Mckoi, a fixed number of iterations (100) was run for this measurement...... 177

7.1 Analysis statistics, part I. #M is the number of methods (in thousands) in Soot’s Spark context-insensitive call graph [82, 147]. #Con is the total number of containers inspected in the application code. There are two rows for eclipse: (1) analyzing all plugins together and (2) analyzing them one at a time. Results are shown with T1 = 20 minutes and T2 = 40 minutes (in Table 7.2) limit for the static tool to inspect each container...... 218

7.2 Analysis statistics, part II. The last part of the table compares the static and dynamic analyses...... 219

8.1 Shown in the first seven columns of the table are the general statistics of the benchmarks and the analysis: the benchmark names, the numbers of methods (in thousands) in Soot’s Spark context-insensitive call graph (M), the numbers of loops inspected (L), the numbers of loop objects considered (Obj), the running times of the analysis (Time), the total numbers of disjoint data structures (DDS), and the total numbers of hoistable logical data structures (HDS). Columns SF and SIF in section “Dependence-based hoistability” show the total numbers of fields (SF) and the numbers of loop-invariant fields (SIF), averaged among the top 10 DDS that we chose to inspect. These data structures are ranked based on the dependence-based hoistability measurement (DH). Columns DF and DIF report the same measurements as SF and SIF, except that the inspected data structures are ranked using DDH that incorporates dynamic frequency information...... 258

CHAPTER 1: Introduction

A major and sustained technology trend in the past few decades is the proliferation of software designed to solve problems with increasing levels of complexity. These computer programs range from stand-alone desktop applications developed to meet our individual needs and interests, to large-scale long-running servers that can process requests for many millions of concurrent users. Thanks to the advancement of object-oriented programming languages and the standardization of software management processes, the difficulty of developing and maintaining modern programs keeps decreasing. Much of this achievement is due to the community-wide recognition of abstraction and reuse: software should be designed in a modular way so that specifications and implementations are well separated, functional components communicate with each other only through interfaces, and component interfaces are declared as general as possible in order to provide services in a variety of different contexts. The concretization of these ideas has led to an explosion of patterns [42,53,73,78], models [25, 151], and development methodologies [2, 52], which have been considered as the most important software development guidelines that every programmer should know.

However, while software reuse makes development tasks easier, it often comes with certain kinds of excess, leading to performance degradation. Implementation details are hidden from the external users, who have to (and are encouraged to) rely on general-purpose APIs to fulfill their specific requests. When their needs are much narrower than the service that these APIs can provide, wasteful operations start emerging. For example, a study from [92] shows that the conversion of a single date field from a SOAP data source to a Java object can require more than 200 method calls and the generation of 70 objects. Programmers are not aware of the potential performance problems, because they believe the runtime system (e.g., the just-in-time (JIT) compiler) can eliminate these inefficiencies behind the scenes. In addition, the pervasive use of managed languages such as Java and C# makes it possible for programmers to take their attention away from issues of memory usage, because whenever an object becomes unreachable, it is reclaimed by the garbage collector. Programmers are thus encouraged to create objects, taking it for granted that creating and reclaiming objects is free. In a large program that is typically built on top of many layers of frameworks and libraries, a small set of inefficiencies can multiply and quickly get magnified to cause significant slowdown of the system. When the call stack grows to be deep, the usefulness of the compiler analyses in a runtime system becomes limited and the optimizer can no longer remove these inefficiencies.

In this dissertation, the term bloat is used to refer to the general phenomenon of using excessive work and memory to achieve seemingly simple tasks. Bloat commonly exists in large-scale object-oriented applications, and significantly impacts their performance and scalability. A program that suffers severe bloat such as a memory leak can crash due to OutOfMemory errors. In most cases, excessive memory consumption and significant slowdown may be seen in programs that contain bloated operations.

It is important to find and remove these wasteful operations in large-scale and long-running applications. This need is motivated not only by the desire to reduce the running time and memory usage of these applications. Software maintainers often interpret the growing amount of inefficiency as a result of lack of support from the hardware side, and intend to perform expensive system upgrades (e.g., memory increase, replacement of CPUs with more cores, use of higher-speed network devices, etc.) to help an application reach the expected performance line, which could have been achieved by (more) efficient use of memory and (more) careful choice of operations. For example, as revealed by a recent study [6] conducted at IBM Research, fixing memory and execution inefficiencies (without any system upgrade) has increased the throughput of a large document management server more than twice as much as the alternative that tackles the problem by improving both the quality and the quantity of hardware. Hence, the elimination of bloat can lead to reduced requirements for the execution environment, resulting in significant savings in power and cost. Removing bloat is especially relevant as multicore systems get popular: the excess that exists in memory consumption and execution becomes increasingly painful because memory bandwidth per core goes down, and we cannot rely on speed increases to ameliorate ever-increasing levels of inefficiency.

1.1 Challenges

The JIT’s difficulty Bloat cannot be easily removed. Modern JITs have sophisticated optimizers that offer important performance improvements, but they are often unable to remove the penalty of bloat. One problem is that the code in large applications is relatively free of hot spots. Table 1.1 shows a breakdown of the top ten methods from a commercial document management server. This application executes over 60,000 methods, with no single method contributing more than 3.19%

method                    CPU time
HashMap.get               3.19%
Id.isId                   2.84%
String.regionMatches      2.12%
CharToByteUTF8.convert    2.04%
String.hashCode           1.77%
String.charAt             1.70%
SimpleCharStream.         1.65%
ThreadLocalMap.get        1.32%
String.toUpperCase        1.30%

Table 1.1: In a commercial document management server, no single frequently-executed method can be optimized for easy performance gains.

to total application time, and only 14 methods contributing more than 1%. JITs are faced with a number of important methods, and have to rely heavily on the method inliner to combine code into larger, hopefully optimizable, regions.

Forming perfect code regions, and then optimizing them, is an immensely challenging problem [124]. Method inlining is determined based on control flow profiling, and it is not necessary for the frequently-executed regions to contain large optimization opportunities, which are, in many cases, related to data creation and propagation (e.g., non-escaping objects). In addition, optimizations that can be easily performed by a programmer (e.g., moving an invocation of a side-effect-free method out of a loop) can require orders of magnitude greater JIT effort to achieve the same effect. That call may ultimately perform thousands of method invocations with call stacks hundreds deep, and allocate many objects. Automatically performing such a transformation requires a number of powerful analyses to work together. If language features that restrict optimization (e.g., reflection and precise exceptions) are taken into account, there is little hope that a performance bottleneck can be identified and removed by a fully automated solution. As an example, a study described later found that to perform the seemingly simple task of inserting a single small document in the database, the document server invokes 25,000 methods and creates 3000 temporary objects, even after JIT optimizations. However, with less than one person-week of manual tuning, a performance expert was able to reduce the object creation rate by 66%. Such results indicate that vast improvements are possible when tuning is made easier with more powerful tool support.

The human experts’ difficulty While a skillful performance expert may find larger optimization opportunities through manual tuning, it is still not easy for her to perform the tuning task, primarily due to the lack of good tools. Despite the existence of many profilers [51, 80, 109], the information that they report is far from being sufficient for a programmer to locate the performance bottleneck. For example, in most cases, JProfiler can report the numbers of instances of different types during the execution. However, the fact that type java.util.HashMap$Entry has the highest number of instances tells the programmer nothing about the HashMaps that hold these entries and the calling contexts under which the elements are added to the HashMaps. In addition to the limitations of existing tool technology, tuning requires deep understanding of implementation logic and a great deal of programming experience. Thus, there are often only a handful of programmers who are capable of manually performing this task, which can involve many (iterative) cycles of program inspection, cost measurement and analysis, configuration change, and source code modification, and can be extremely time-consuming.

1.2 Compiler-Assisted Manual Tuning: Bloat Detection via Program Analysis

Program analysis targeting performance problems The goal of this dissertation is to develop compiler and tool support that can help a programmer detect and remove bloat throughout the entire software development process. Our objective is not to replace JIT optimizations, but to complement them, thus making it easier to perform tuning that targets runtime bloat. Through a combination of metrics and analyses focused on bloat, we hope to quickly guide developers to the problematic areas of the application, allowing them to transform the program to eliminate the problem. The techniques we have designed range from static analyses that find wasteful operations in the early stage of development, to dynamic analyses that detect bloat based on various kinds of heuristics and symptoms being observed during the execution. The static analyses can be easily incorporated in existing coding assistance tools to give programmers warnings whenever inefficiency is seen, and to transform the program to avoid the potential bloat. The dynamic analyses can be either implemented in offline diagnosis tools, or embedded in Java Virtual Machines (JVMs) to offer both coarse-grained and fine-grained problem diagnosis, with varying time (from 50% to 70×) and space (up to 2×) overheads.

1.2.1 Dynamic Analyses That Detect Bloat by Making Sense of Heaps and Executions

One of the most difficult tasks during performance tuning is making sense of the heap, with its large number of objects, and of the execution that may have millions of method invocations and last for hours and days. We have developed four dynamic analysis techniques that exploit activity profiling and garbage collection runs to expose bloat symptoms. The essence of these techniques is to highlight performance-impacting activities in programs. Understanding these activities makes it possible for programmers to make more sense of execution behavior and heap usage, so that they can quickly recognize the problems and perform appropriate design and/or implementation modifications to overcome them.

Existing compilers and tools focus on control flow (e.g., method execution times and frequencies) when making optimization decisions. However, in large-scale sys- tems, data-based activities are sometimes stronger signs of excess than control-based ones. Examples of such symptoms include excessive data copying, operations of high cost and little benefit, problematic container behaviors, and stale objects (that are reachable in the object graph but not used for a long time). The first part of this dissertation presents four dynamic analyses that identify inefficiencies by focusing on such interesting data-based activities.

Profiling copies to make sense of data flow Bloat often occurs due to unnecessary work along data flows. Such work may consist of data copy activities that move data among heap locations without any useful computation. By optimizing copies, one is likely to remove the objects that carry copied values, and the method calls that allocate and populate these objects. We design a dynamic analysis (described in Chapter 3) [156] that looks for program regions containing large volumes of copies, and data structures whose construction involves data copied frequently from other data structures. By detecting frequently-occurring copy chains and computing copy volumes for data structures, the analysis can quickly guide developers to the problematic areas of the application.
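The fragment below is a hypothetical illustration (not taken from the dissertation's benchmarks) of the kind of activity such a copy profile would highlight: every statement in the loop merely moves data from one heap location to another, so the construction of the output data structure is attributed a large copy volume. The class and field names are invented for illustration only.

import java.util.ArrayList;
import java.util.List;

class CopyChainExample {
    static class Record { String name; String value; }

    // Each element is read from 'input', wrapped, and stored into 'output';
    // every step is a pure copy, so the whole loop contributes copy volume
    // without performing any computation on the copied data.
    static List<Record> repackage(List<String[]> input) {
        List<Record> output = new ArrayList<>();
        for (String[] pair : input) {
            Record r = new Record();
            r.name = pair[0];   // copy: heap location -> heap location
            r.value = pair[1];  // copy: heap location -> heap location
            output.add(r);      // the data only moves; it is never used here
        }
        return output;
    }
}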

Abstract dynamic slicing to make sense of cost and benefit Object-oriented programming encourages the creation of interfaces with general functionalities. While these interfaces can be used in different contexts, it is often the case that the service they provide is more general than what a specific client needs, leading to wasteful operations and runtime bloat. A common effect of these operations is that they produce values that could have been obtained in much easier ways. It is possible for a programmer to quickly detect performance bottlenecks, if she can understand the cost of producing a certain value and the benefit of consuming it. We propose a run-time cost-benefit analysis (described in Chapter 4) [155] that can help a programmer understand accurately the performance of her program and find the reasons for performance problems seen during program execution. The cost and benefit computation is based on a novel technique called abstract dynamic thin slicing, that applies dynamic slicing over bounded abstract domains, leading to a tunable analysis framework with adjustable abstractions and memory usage. This technique can serve as the foundation for efficiently solving a set of backward dynamic flow problems that exhibit bounded-domain properties.
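As a hedged illustration of this cost/benefit mismatch, the hypothetical method below produces a value at high cost (a full copy and sort of a list) whose only benefit is a single comparison; a cost-benefit report would single out such a value. The code is an invented sketch, not an example from the dissertation's case studies.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CostBenefitExample {
    // Assumes a non-empty list of timestamps.
    static boolean hasRecentEvent(List<Long> timestamps, long cutoff) {
        List<Long> sorted = new ArrayList<>(timestamps); // cost: O(n) copy
        Collections.sort(sorted);                        // cost: O(n log n) sort
        long latest = sorted.get(sorted.size() - 1);     // benefit: a single value
        return latest >= cutoff;                         // a one-pass max scan would suffice
    }
}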

Profiling container operations to make sense of (unnecessary) reference holding An important source of bloat is the inefficient use of containers. For example, failure to remove objects from containers can lead to a memory leak, resulting in performance degradation and a program crash. We develop a dynamic analysis (described in Chapter 5) [158] to detect memory leaks by profiling container operations, based on the observation that many memory leaks in Java programs occur due to containers that keep references to unused data entries. Compared to existing dynamic leak detection techniques that are based on heap differencing and fine-grained object tracking, container profiling has fewer false positives and is capable of quickly directing the programmer to the problematic container that is exactly the cause of the leak.
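A hypothetical instance of this pattern is sketched below: a long-lived map keeps entries reachable after their last use because the corresponding remove call is missing, so the container itself, rather than the individual entries, is the natural diagnosis target. All names here are illustrative only.

import java.util.HashMap;
import java.util.Map;

class SessionCache {
    private static final Map<String, byte[]> CACHE = new HashMap<>();

    static void onRequest(String sessionId, byte[] data) {
        CACHE.put(sessionId, data);   // ADD: the reference is retained indefinitely
        // ... data is used only while this request is processed ...
    }
    // There is no CACHE.remove(sessionId) when the session ends, so the map
    // grows without bound; container profiling would point at CACHE as the
    // source of the leak rather than at the individual byte[] objects.
}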

Checking and inferring object lifetime invariants to make sense of object lifetime relationships While the container profiling technique can precisely detect container-related memory leaks, many real-world leak problems are not caused by containers, but by simple unnecessary references that programmers forget to invalidate. To help programmers narrow down the causes of these leaks, we design a run-time analysis that exploits invariants on object lifetimes.

For many objects, implicit invariants may exist among their lifetimes, such as “object a references object b and when a dies, b will also die”. These invariants hold for most of the execution and if, occasionally, they are violated (e.g., a is garbage collected but b is not, because there exists an unnecessary reference from another object to b), a memory leak may occur. We develop a multi-level dynamic technique called LeakChaser (described in Chapter 6) [157] that exploits such (lifetime) relationships to detect Java memory leaks. At the lowest level of this approach is a small set of basic assertions (such as diesBefore(a, b)) that can be employed by users to specify these relationships. These assertions are checked regularly at garbage collection runs. We then introduce high-level application semantics into the analysis by considering transactions, which are periodically-occurring events during the execution. Objects that are supposed to be specific to a particular transaction (i.e., they should be transaction-local), but instead are observed by the analysis to be shared among transactions, are flagged as potentially leaking. At the highest level of the approach is an inference algorithm. The user only needs to tell the tool an object which defines the duration of the transaction. The algorithm automatically infers the corresponding transaction-local and shared objects based on their lifetimes, in order to detect memory leaks. This approach provides users at different levels of skills and code familiarity with sufficient flexibility to specify memory leak properties, and generates reports containing adequate semantic information to facilitate understanding.
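The sketch below suggests how such lifetime specifications might look in code. Only the diesBefore relationship is named in the text; the LeakChaser facade, its transaction methods, and the stub bodies shown here are hypothetical stand-ins for the interface described in Chapter 6, included purely for illustration.

final class LeakChaser {
    static void diesBefore(Object a, Object b) { /* checked at GC time in the real tool */ }
    static Object beginTransaction(Object anchor) { return anchor; }
    static void endTransaction(Object tx) { /* transaction-local vs. shared inference point */ }
}

class LifetimeAssertionExample {
    static void handleRequest(byte[] request) {
        Object tx = LeakChaser.beginTransaction(request); // the request object defines the transaction
        StringBuilder scratch = new StringBuilder();
        LeakChaser.diesBefore(scratch, request);           // scratch should die before the request does
        scratch.append(new String(request));
        // ... per-request processing ...
        LeakChaser.endTransaction(tx);                     // objects outliving the transaction become leak suspects
    }
}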

Summary Our dynamic analysis work addresses the challenge of automatically detecting bloated computations that fall out of the purview of conventional JIT optimization strategies. In general, existing bloat detection work can be classified into two major categories: manual tuning methods (i.e., mostly based on measurements of bloat) [49,50,96,97], and fully automated performance optimization techniques such as the entire field of JIT technology [8] and the research from [124]. The work described in this dissertation sits in between: we provide sophisticated analyses to support manual tuning, guiding programmers to the program regions where bloat is most likely to exist, and then allowing human experts to perform the code modification and refactoring. By doing so, we hope to help the programmers quickly get through the hardest part of the tuning process—finding the likely bloated regions—and yet use their (human) insight to perform application-specific optimizations.

1.2.2 Static Analyses That Remove and Prevent Bloat

Using the dynamic analyses, we have identified a number of common bloat patterns that are regularly seen during the executions of large Java applications. We have developed static analysis tools that can identify these patterns early during development, in order to allow a programmer to inspect the code to eliminate them before program execution.

Detecting inefficiently-used containers to avoid bloat We found that a significant source of runtime bloat is the inefficient use of containers. To find and prevent container inefficiencies early in development, we develop a static analysis (described in Chapter 7) [159] that can identify two specific types of container inefficiencies, namely, underutilized container and overpopulated container. This work proposes the first static analysis to identify bloat: this analysis automatically extracts container semantics and does not require any user annotations. We focus on two abstract container operations ADD and GET, and the analysis detects heap stores and loads that concretize them based on the context-free-language (CFL) reachability formulation of points-to analysis. The second step of the analysis is to approximate the frequencies of the identified stores/loads to/from heap locations based on the nesting relationships among the loops where they are located. If the number of ADD operations is very small, the container is underutilized. If the number of ADD operations is significantly larger than the number of GET operations, the container is overpopulated because many elements are not retrieved at all. Reports generated from this analysis have been shown to be more precise than those produced by the corresponding dynamic analysis.
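In the spirit of the JFlex example shown in Figure 7.1, the following hypothetical code exhibits an underutilized container: it receives only a small, fixed number of ADD operations and is used solely to hand its elements to a constructor. The types and values are invented for illustration.

import java.util.ArrayList;
import java.util.List;

class UnderutilizedContainerExample {
    static class Interval { final int lo, hi; Interval(int lo, int hi) { this.lo = lo; this.hi = hi; } }
    static class RegExpr { RegExpr(List<Interval> parts) { /* consumes the three elements */ } }

    static RegExpr build() {
        List<Interval> list = new ArrayList<>(); // only ever holds 3 elements (few ADDs, no lookups)
        list.add(new Interval(0, 9));
        list.add(new Interval(10, 19));
        list.add(new Interval(20, 29));
        return new RegExpr(list);                // the list is used once and discarded
    }
}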

Hoisting invariant loop data structures to remove bloat Another important bloat pattern that we have regularly observed is the construction and initialization of data structures that are invariant across loop iterations. We develop a static analysis (described in Chapter 8) [161] that can help programmers find such data structures. The analysis focuses on logical data structures that may be hoisted.

Specifically, we design a novel type and effect system that identifies, for each data structure created in the loop, its hoistable part. The larger this part is, the more likely this data structure can be hoisted. In order to quantify this likelihood, our analysis computes hoistability measurements. These measurements are eventually presented to the user for further manual inspection. Our experimental results indicate that the proposed technique can be useful both in the development phase for finding small performance problems, and in the tuning phase for identifying significant performance bottlenecks.
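The code below, modeled on the SimpleDateFormat case mentioned in Figure 8.1, shows the pattern and the manual hoisting a programmer would perform after inspecting such a report; the class and method names are illustrative only.

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

class LoopInvariantExample {
    static List<Date> parseAll(List<String> inputs) throws ParseException {
        List<Date> dates = new ArrayList<>();
        for (String s : inputs) {
            // The formatter does not depend on the iteration, so this object
            // (and its internal tables) is rebuilt needlessly on every pass.
            SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
            dates.add(fmt.parse(s));
        }
        return dates;
    }

    static List<Date> parseAllHoisted(List<String> inputs) throws ParseException {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd"); // hoisted out of the loop
        List<Date> dates = new ArrayList<>();
        for (String s : inputs) {
            dates.add(fmt.parse(s));
        }
        return dates;
    }
}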

Summary The static analysis work presented in this dissertation is the first attempt to detect performance problems using static compiler-based approaches. All existing techniques targeting bloat are dynamic approaches that try to locate performance problems by identifying suspicious run-time bloat-indicating behaviors. Our insight is that some of these behaviors are the dynamic reflection of certain static properties inherent in the source code of the program, which can be easily seen at compile time. By finding these properties, a static analysis can also identify potential performance issues, despite the lack of run-time information such as method execution frequencies and object allocation counts.

1.3 Impact

Our techniques are based on novel insights into programming languages and on empirical studies of large-scale and long-running applications, and thus, can be more precise in detecting bloat and more effective at removing inefficiencies than prior approaches. The detection and diagnosis techniques rely on observations of different bloat effects, and novel analysis algorithms to perform both light-weight and heavy-weight profiling efficiently. At the center of these bloat effects are suspicious data-based activities. They are better indicators of bloat than what has been traditionally thought of as performance-related properties, such as execution time and numbers of objects created. In addition, checking and inferring object liveness properties can bridge the gap between performance problems and traditional specification-based research. This may open up new research directions where the existing body of software verification and validation techniques can be employed to find memory bloat effects, which are non-functional and essentially caused by inefficient use of memory.

The static analysis work presented in the dissertation is the first attempt to understand performance issues related to runtime bloat in object-oriented languages with the help of static approaches. The success of these analyses has demonstrated that it is possible to identify certain kinds of bloat at compile time, thus providing new insights for future development of static bloat detectors and compiler optimizations.

The work in this dissertation has immediate practical impact because the static analyses can be incorporated in existing coding-assistance tools (e.g., Eclipse) to find inefficiencies during software development, while the dynamic analyses can be embedded as a JVM service that can be enabled when problem diagnosis is needed.

The broader impact of this work is that it sheds new light on performance optimization of large applications, and demonstrates the viability of quickly finding and removing performance bottlenecks with the support of systematic techniques that can make more sense of the execution, the heap and resource usage, and the behavior of containers. This attention to typical patterns of inefficiencies (e.g., copies and loop-invariant data structures) can also help with benchmark design and future improvements in code optimization technology. Benchmark designers can use metrics that target certain bloat patterns to ensure that the benchmarks exhibit the categories of bloat that occur in real-world applications. If benchmark suites contain such bloat measurements, researchers could use these metrics to evaluate any proposed bloat detection/optimization techniques. In addition, future compiler design can take into account these frequently-observed bloat patterns, and can include optimization techniques that work specifically for each such pattern.

CHAPTER 2: Analysis Frameworks

This chapter presents infrastructures that we have implemented for our dynamic and static analyses. An important component required by many dynamic analysis techniques described in this dissertation is a framework that supports whole-program information flow tracking. This chapter first introduces one general information flow tracking framework we have developed. It is used in both the copy profiler (Chapter 3) [156] and the cost-benefit analysis (Chapter 4) [159]. Second, we present the platforms and environments used to implement the dynamic [155–158] and static analyses [159,161]. Finally, the chapter presents the benchmarks and platforms used in the experimental studies.

2.1 Dynamic Analysis Implementation

We implemented the copy profiling (discussed in Chapter 3) and the cost-benefit analysis (discussed in Chapter 4) in J9 (build 2.4, J2RE 1.5.0 for Linux x86-32), a commercial Java Virtual Machine developed by IBM [144]. This JVM is highly versatile and is used as the basis for many of IBM's product offerings from embedded devices to enterprise solutions. By piggy-backing the analyses on J9's JIT compiler, we are able to apply and evaluate the developed techniques on large and long-running Java programs such as database servers, JSP/servlet containers, and application servers.

This makes it possible to find problems in these real-world applications, which are widely used and have significant impact on the software industry. While implemented in J9, these analyses are general enough to be migrated to other JVMs and runtime systems such as Microsoft's common language runtime (CLR) [143]. The analyses were built as JVM services. Enabling/disabling them can be controlled by specific command line options (e.g., -Xjit:...) used to invoke the JVM. We ran the JVM in the JIT-only mode to avoid having to modify the interpreter, whose source code is not available to us.

The LeakChaser memory leak detector (described in Chapter 6) was implemented in Jikes RVM 3.1.0, an open-source Java-in-Java Virtual Machine. We piggy-back the analysis on its garbage collector (GC), and check the specified liveness properties at the end of each GC run. Similarly to other dynamic analyses presented in this dissertation, the LeakChaser tracking and analysis were implemented as a JVM service that can be switched on/off by using different Jikes RVM command line options.

The implementation of the container-based memory leak detector (Chapter 5) exploits the standard JVM Tool Interface (JVMTI) [145], which is supported by all commercial JVMs. We did not implement this technique inside a JVM because it does not rely on any VM-internal information, and the event callbacks provided by JVMTI are sufficient to perform the needed profiling.

2.2 Dynamic Information Flow Infrastructure

As a key component of our dynamic analysis framework (used in [155,156] and implemented in J9), this infrastructure tracks the flow of data and performs the pre-defined (side-effect-free) operations as instructions are executed. Each piece of data used during the execution is associated with a piece of tracking data, which is updated when its corresponding execution data is updated. Tracking data is contained in a shadow memory, which is separate from the memory space created for the execution. Our information flow infrastructure supports shadowing of all memory in the application, including local variables, static fields, arrays, and instance fields.

Figure 2.1: Representation of an object in the Java heap and its shadow memory. f, g, and h represent the fields in this object. d represents a constant offset.

Shadow variable A local variable is shadowed simply by introducing an extra variable of the same type on the stack.

Shadow heap Shadowing of static fields, arrays and instance fields is supported by the use of a shadow heap [102]. The shadow heap is a contiguous region of memory equal in size to the Java heap. To allow quick access to shadow information, there is a constant distance between the shadow heap and the Java heap. Thus, the scratch space for every byte of data in the Java heap can be referenced by adding this constant offset to its address. The size of the tracking data equals the size of its corresponding data in the Java heap. While the creation of the shadow heap introduces 2× space overhead, it has not limited us from collecting data from large-scale and long-running applications such as the aforementioned document management server (built on top of the WebSphere application server and DB2 database server). In fact, with a 1GB Java heap and a 1GB shadow heap, we were able to successfully run all programs we encountered, including large production web server applications.
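As an illustration of this constant-offset mapping, the sketch below computes a shadow address from a heap address. The class, the addresses, and the offset constant are hypothetical; the actual mapping is performed by JIT-generated native code rather than by Java code.

// Minimal sketch of constant-offset shadow addressing (illustrative; not the J9 code).
// Assumption: heap addresses are modeled as long values, and SHADOW_OFFSET is the
// fixed distance between the Java heap and the shadow heap (0x10000 in Figure 2.1).
public class ShadowHeapSketch {
    static final long SHADOW_OFFSET = 0x10000L; // hypothetical constant distance

    // The shadow location for any heap address is found by adding the constant offset.
    static long shadowAddress(long heapAddress) {
        return heapAddress + SHADOW_OFFSET;
    }

    public static void main(String[] args) {
        long fieldAddress = 0x2F00L;                  // address of a field in the Java heap
        long tracking = shadowAddress(fieldAddress);  // where its tracking data lives
        System.out.printf("field at 0x%x -> shadow at 0x%x%n", fieldAddress, tracking);
    }
}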

Tagging objects In addition to the tracking data, the infrastructure supports object tagging. For example, we often need to tag each run-time object with its allocation site in order to relate the run-time behavior of the object to the source code that creates it. In other cases, calling contexts are also associated with objects to enable context-sensitive diagnosis [20,21,155,156]. One way of implementing object tagging is to store the associated information into the object headers. For example, for some JVMs such as Jikes RVM, each object header has a few free bytes that are not used by the VM, and these free bytes can be employed to store the tag. However, in J9, most bytes in an object header are used by the garbage collector, and modifying them can crash the JVM at run time. Our infrastructure (implemented in J9) stores the tag in the shadow heap: the shadow space corresponding to an object header contains the tag for the object. Because an object header in J9 takes two words (i.e., 8 bytes), the infrastructure allows us to tag objects with up to 8 bytes of data; this space is much larger than the free bytes in the object header. A representation of the Java heap and shadow heap is shown in Figure 2.1, where the constant offset between the two heaps is 0x10000.

Tracking stack The infrastructure also supports the passing of tracking data interprocedurally through parameters and return values. This is achieved by the use of a tracking stack, which is similar to the Java call stack. Tracking data for the actual parameters is pushed onto the stack at a call site, and is popped at the entry of the callee that the call site invokes. Similarly, tracking data for the return variable is pushed onto the stack at the return site, and is popped immediately after the call returns. The tracking of exceptional data flow is not supported in our framework, because it usually does not carry important data across method invocations.
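The sketch below illustrates this tracking-stack discipline with an ordinary Deque. The class and the shadow values are hypothetical stand-ins for the instrumentation the JIT inserts around call sites; they only show where pushes and pops occur relative to a call.

import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch (not the J9 instrumentation): shadow values for actuals are
// pushed before a call and popped at the callee entry; the return-value shadow is
// pushed at the callee exit and popped after the call site.
public class TrackingStackSketch {
    static final Deque<Long> trackingStack = new ArrayDeque<>();

    static long callee(long arg) {
        long shadowArg = trackingStack.pop();      // popped at callee entry
        long result = arg + 1;                     // the actual computation
        trackingStack.push(shadowArg);             // shadow of the return value
        return result;
    }

    public static void main(String[] args) {
        long shadowOfActual = 0xABCDL;             // tracking data for the actual parameter
        trackingStack.push(shadowOfActual);        // pushed at the call site
        long r = callee(41);
        long shadowOfResult = trackingStack.pop(); // popped after the call returns
        System.out.println(r + " shadow=" + Long.toHexString(shadowOfResult));
    }
}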

Flow transfer function and framework This infrastructure is designed so that it can be easily parameterized and reused for a wide variety of dynamic analyses that require information flow tracking, such as taint analysis [38,59,103,108], null origin tracking [24], web security analysis [60], and information flow strength measurement [88]. Such reuse requires the definition of a flow transfer function for each instruction, which can be given by implementing the callback interfaces defined by the infrastructure. There are two types of callbacks: compile-time inlining callbacks and run-time invocation callbacks. Functions implementing compile-time inlining callback interfaces are actually never invoked. Instead, code in them is "inlined" into the executing program when the JIT compiles the program. This type of interface is suitable for simple instrumentation that needs to perform only Java operations and does not rely on internal VM support. On the other hand, functions implementing run-time invocation callback interfaces are invoked at run time. Our framework only inserts calls to these functions at appropriate positions during compilation. This type of interface is suitable for heavy-weight instrumentation that usually maintains data structures external to the Java program, performs non-Java operations such as storing pointer values, and requires special JVM support such as sampling and object graph traversal during garbage collection. Of course, larger run-time overhead can be incurred by this type of instrumentation.
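To make the two callback styles concrete, the sketch below shows one possible shape for such interfaces. The interface and method names (FlowCallbacks, InlinedTransfer, InvokedTransfer, Reporter, onCopy, onHeapStore, report) are hypothetical illustrations, not the actual API of our J9-based framework.

// Hypothetical callback interfaces for the information flow framework; names are
// illustrative only, since the real framework is internal to our modified J9 build.
public interface FlowCallbacks {

    // Compile-time inlining callback: its body is "inlined" into the compiled code,
    // so it may only perform plain Java operations on the tracking data.
    interface InlinedTransfer {
        void onCopy(long sourceShadow, long targetShadow);
    }

    // Run-time invocation callback: the JIT inserts a call to it, so it may maintain
    // external data structures or rely on VM support (at higher run-time cost).
    interface InvokedTransfer {
        void onHeapStore(Object base, int fieldOffset, long sourceShadow);
    }

    // Report hook, triggered when the configured event occurs
    // (JVM shutdown, a GC run, or the end of a sampling period).
    interface Reporter {
        void report();
    }
}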

The user also needs to provide a "report" function, and a parameter that tells the infrastructure when the report should be generated (e.g., the JVM shuts down, a garbage collection run occurs, or a sampling period completes). The framework automatically invokes the function when the specified event is triggered, to analyze the profiled data and generate a report.

2.3 Static Analysis Implementations

Our static analysis and transformation techniques are implemented in Soot [132,147] (version 2.3.0), a popular program analysis framework for Java. The analyses make use of the Spark component [83] in Soot to obtain context-insensitive points-to information. The work on inefficiently-used container detection [159] also uses the Sridharan-Bodik framework [134] implemented in Soot to solve the context-free-language (CFL) reachability formulation of container behaviors. The detailed description of CFL-reachability and the formulation can be found in Chapter 7.

2.4 Benchmarks and Execution Platforms

A common set of benchmarks that we use to evaluate both our static and dynamic analyses is DaCapo [15,17,40] (version 2006-10-MR1), which contains 11 open-source, real-world applications with non-trivial memory loads. In addition to this set of programs, the work on copy profiling [156] is evaluated using the IBM document management server and the SPECjbb 2000 benchmark. For the cost-benefit analysis [155], we also considered the additional 7 programs included in a new version of DaCapo as of October 2009, which are typically widely-used server applications running on top of layers of frameworks and libraries. The benchmark set for static analyses additionally includes programs from SPECjvm98 [139] and Ashes [10], and some other well-known Java applications such as muffin (a WWW filtering system) and polyglot (an extensible compiler framework).

Experiments for copy profiling [156] and cost-benefit analysis [155] were undertaken on a machine with a 1.99GHz Dual Core AMD Opteron processor, running Linux 2.6.9. The programs were run with JIT optimizations turned off to collect data from the unmodified source programs. The work on LeakChaser [157], container profiling [158], and all the static analyses [159,161] were evaluated on a machine with an Intel Xeon 2.80GHz processor, running Linux 2.6.9.

CHAPTER 3: Making Sense of Copy Chains

One important symptom of runtime bloat is the existence of large volumes of copies in bloated regions: data is transferred from one object to another, with no computation done on it. Long copy chains can often be seen, simply to form objects of certain types required by APIs of one framework from their original representations used in another framework. The aforementioned example (in Chapter 1) of the SOAP protocol clearly illustrates this point: the original objects are wrapped in a way so that they can be passed to the SOAP layer for network transmission, and the SOAP representations are unwrapped on the other side of the network to obtain the data.

This entire process does not contain any computation on the data that is transmitted.

In this example, while data copying is a necessary step for the network transmission, the observation of large volumes of copies may quickly remind the user to consider the overall design and architecture of the application. Is it really worth the use of SOAP? Is there any cheaper way of transmitting data if the format of the data is simple enough?

If the SOAP example exhibits design issues, the following example that can be observed in the document management server reveals bloat caused by a programmer's mistake. The server extracts name-value pairs from a cookie that the client transmits in a serialized, string form. The methods that use these name-value pairs expect Java objects, not strings. They invoke a library method to decode the cookie string into a Java HashMap, yet another transient form of this very simple data. In the common case, the caller extracts one or two elements from the 8-element map, and never uses that map again. Figure 3.1 illustrates the steps necessary to decode a cookie in this application.1 Decoding a single cookie, an operation that occurs repeatedly, costs 1000 method invocations and 35 temporary objects, after JIT optimizations. A hand-optimized specialization for the common case that only requires one name-value pair uses just 4 method invocations and constructs 2 temporary objects.

Figure 3.1: The steps a commercial document management server uses to decode a cookie; the original version tokenizes and returns the entire map, even if the caller needs only one name-value pair. (a) Original version. (b) Specialized version.

The inefficiencies at the heart of the SOAP example and the cookie decoding example are common to many bloated implementations. In these implementations, there is often a chain of information flow that carries values from one storage location to another, often via temporary objects [97]; e.g., as visualized in Figure 3.1. Bloat of this form manifests itself in a number of ways: temporary structures to carry values, and a large number of method invocations that allocate and initialize these structures and copy data between them.

1We thank Nick Mitchell for providing this illustration.

In our experience, it is this data copying activity that is an excellent indicator of bloat. When copy activities are reduced through code transformations, this often reduces the need for creating and deallocating the corresponding objects, and for invoking methods on these objects. For example, the optimized cookie decoding in Figure 3.1 eliminates the cost of constructing the HashMap and the related key-value pairs.

Section 3.1 provides an example of the amount of copy activity in a server application, and shows how it is not handled well by the JIT in a state-of-the-art JVM. This example illustrates the need to track copy operations and to provide a programmer with a useful summarization of chains of such operations. For example, consider the cookie decoding in Figure 3.1(b). It is neither the result of tuning the HashMap put or get methods, nor of tuning the HashMap data structure. Specializing this scenario requires understanding the chains of copies: which storage locations carry values, and which methods contain the copies. During program execution, there will be billions of copy chains. In Section 3.2, we introduce an abstraction, the copy graph, that concisely summarizes chains of copies.

3.1 Profiling Copy Activity

A copy operation is a pair of heap load and heap store instructions that transfers a value, unmodified, from one heap location to another. A copy profile counts the number of copies; a copy operation is associated with the method that performed the write to the heap. Although the profiling tracks the propagation through stack locations (in order to determine whether a store is the second half of a copy), the profiling reports do not include that level of detail. Since stack variables will likely be assigned to registers, chains of copies between stack locations will usually involve only register transfer operations. They are also more likely to be optimized by conventional dataflow analysis.

Figure 3.2: A breakdown of activity in a document processing server application. The baseline, at 100%, is the original code run with JIT optimizations disabled. This baseline is compared to the original code with JIT optimizations enabled, and to an implementation with a dozen hand-tunings.

Figure 3.2 shows a comparison of four scenarios of the document management server, executing in the IBM J9 production JVM. The baseline, at 100%, represents the behavior of the original code, with JIT optimizations disabled, during a 10-minute load run. This baseline is compared to the original code with JIT optimizations enabled, and to a version of the code that had been hand-tuned (both with and without JIT optimizations). The figure also shows the number of comparison operations, the number of ALU operations, and the total number of loads and stores. While the JIT successfully reduces the number of ALU operations and loads/stores, it does not significantly affect the number of copies and comparisons.

3.2 Profiling Copy Chains and Copy Graph

Individual copies are usually part of longer copy chains. Optimizing for bloat requires understanding the chains as a whole, as they may span large code regions that need to be examined and transformed. We now show how to form an abstraction, the copy graph, that can be used to identify chains of copies.

3.2.1 Copy Chain

Definition 3.2.1 (Copy chain). A copy chain is a sequence of copies that carry a value through two or more heap storage locations. Each copy chain node is a heap location. Each edge represents a sequence of copies that transfers a value from one heap location to another, abstracting away the intermediate copies via stack locations, parameter passing, and value returns.

The heap locations of interest are fields of objects and elements of arrays. A copy chain ends if the value it carries is the operand of a computation, which produces a new value, or is an argument to a native method. It is important to note that, in a copy chain, each maximal-length subsequence of stack copies is abstracted by a single edge directly connecting two heap locations.

The code in Figure 3.3 is used for illustration throughout this chapter. The example is based on a common usage scenario of Java collections. A simple implementation of a data structure List is used by a client ListClient. ListClient declares two clone methods fastClone and slowClone, which return a new ListClient object by reusing the old backing list and by copying list elements, respectively. The entry method main creates two lists data1 and data2 and initializes them with 1000 Integer and 5 String objects (lines 40 and 42). The two lists are then passed into two ListClient objects and eventually two new ListClient objects are created by calling slowClone and fastClone. For simplicity, the approach is described at the level of Java source code, although our implementation works with a lower-level virtual machine intermediate representation (IR).

1  class List{
2    Object[] elems; int count;
3    List(){ elems = new Object[1000]; }
4    List(List l){ this(); // call default constructor
5      for(Iterator it = l.iterator(); it.hasNext();)
6      { add(it.next()); } }
7    void add(Object m){
8      Object[] t = this.elems;
9      t[count++] = m;
10   }
11   Object get(int ind){
12     Object[] t = this.elems;
13     Object p = t[ind]; return p;
14   }
15   Iterator iterator(){
16     return new ListIterator(this);
17   }
18 }
19 class ListIterator{
20   int pos = 0; List list;
21   ListIterator(List l){
22     this.list = l;
23   }
24   boolean hasNext(){ return pos < list.count - 1;}
25   Object next(){ return list.get(pos ++);}
26 }
27 class ListClient{
28   List myList;
29   ListClient(List l){ myList = l; }
30   ListClient slowClone(){
31     List j = new List(myList);
32     return new ListClient(j);
33   }
34   ListClient fastClone(){
35     return new ListClient(myList);
36   }
37 }
38 static void main(String[] args){
39   List data1 = new List();
40   for(int i = 0; i < 1000; i++) data1.add(new Integer(i));
41   List data2 = new List();
42   for(int i = 0; i < 5; i++){ data2.add(new String(args[i]));
43     System.out.println(data2.get(i)); }
44   ListClient c1 = new ListClient(data1);
45   ListClient c2 = new ListClient(data2);
46   ListClient new_c1 = c1.slowClone();
47   ListClient new_c2 = c2.fastClone();
48 }

Figure 3.3: Copy profiling running example.

Figure 3.4: A copy chain due to ListClient.slowClone. Line numbers 6, 9, 13, and 25 correspond to the code in Figure 3.3.

Figure 3.4 depicts the steps in the creation of a single-edge copy chain. This chain results from the invocation of slowClone (line 46), which copies Integer object references from the array referenced by field elems of one List to the array referenced by field elems of another List. The source array and the target array will be denoted by O3 since they are created at line 3 in the code. (For now, the reader can ignore the naming scheme; it will be discussed shortly.) The copy chain in Figure 3.4 is O3.ELM → O3.ELM, where ELM represents any array element.

To represent the source and the sink of the data propagated along a copy chain, we can augment the chain with two nodes: a producer node added at the beginning, and a consumer node added at the end. The producer node can be a constant value, a new X expression, or a computation operation representing the creation of a new value. The consumer node has only one instance (denoted by C) in the copy graph, showing that the data goes to a computation operation or to a native method. These two types of nodes are not heap locations, and are added solely for the purpose of subsequent client analyses. Note that not every chain has these two special nodes. For the producer node, we are interested only in reference-typed values because they are important for further analysis and program understanding. Thus, chains that propagate values of primitive types do not have producer nodes. Not every piece of data goes to a consumer and therefore not every chain has a consumer node. The absence of a consumer is a strong symptom of bloat and can be used to identify performance problems. An example of a full augmented copy chain starting from producer O42 (i.e., new String) is O42 → O3.ELM → C. This chain ends in consumer node C because the data goes into method println, which eventually calls native method write.

3.2.2 Copy Graph

Profiling copy chains can be extremely space-expensive, because it requires maintaining a distinct node for each heap location on each copy chain, regardless of whether chains have shared heap locations. In addition, for each heap location, it is necessary to maintain the history information regarding all chains that go through this location, which may incur significant running time overhead. To make the analysis scale to large applications, we apply a series of abstractions on copy chains. These abstractions are also essential for producing summarized reports that do not overwhelm the tool user with millions of chains. The first abstraction is to merge all copy chains in a copy graph, so that nodes shared among chains do not need to be maintained separately. In addition, the copy graph construction algorithm can be designed to profile only graph edges (i.e., one-hop heap copies), which is much more efficient than profiling entire chains.

Definition 3.2.2 (Copy graph). A copy graph G = (N, E) has node set N ⊆ AL ∪ IF ∪ SF ∪ {C}. Here AL is the domain of allocation sites Oi, which serve as producer nodes and do not have any incoming edges. IF is the domain of instance field nodes Oi.f. SF is the domain of static field nodes. C is the consumer node; it has only incoming edges. The edge set is E ⊆ N × Integer × Integer × N. Each edge is annotated with two integer values: the frequency of the heap copy and the number of copied bytes (i.e., 1, 2, 4, or 8).
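For concreteness, the following minimal sketch models Definition 3.2.2 in plain Java. The class and field names are illustrative, and the real profiler uses the native, array-based layout described in Section 3.3.1 rather than Java objects.

import java.util.ArrayList;
import java.util.List;

// Illustrative Java model of Definition 3.2.2. Nodes are allocation sites, instance
// field nodes Oi.f, static field nodes, or the single consumer node C; edges carry
// a frequency and a per-copy size in bytes.
public class CopyGraphSketch {
    enum Kind { ALLOC, INSTANCE_FIELD, STATIC_FIELD, CONSUMER }

    static class Node {
        final Kind kind;
        final String name;                    // e.g., "O3.ELM" or "C"
        final List<Edge> out = new ArrayList<>();
        Node(Kind kind, String name) { this.kind = kind; this.name = name; }
    }

    static class Edge {
        final Node target;
        long frequency;                       // how many times this one-hop copy occurred
        final int bytes;                      // 1, 2, 4, or 8 bytes per copy
        Edge(Node target, int bytes) { this.target = target; this.bytes = bytes; }
    }

    // Adds (or increments) the edge source -> target, as done on each heap store.
    static void addCopy(Node source, Node target, int bytes) {
        for (Edge e : source.out) {
            if (e.target == target && e.bytes == bytes) { e.frequency++; return; }
        }
        Edge e = new Edge(target, bytes);
        e.frequency = 1;
        source.out.add(e);
    }
}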

There could be many different ways to map the run-time execution to these abstractions. The rest of this section describes the mapping used in our current work; future work could explore other choices with varying cost, precision, and usefulness for tool users.

Object naming scheme Following an abstraction technique widely adopted in static analysis, an allocation site is used to represent the set of run-time instances that it creates. Similarly, all heap locations that an instance field dereference expression a.f represents are projected to a set of nodes {Oi.f} such that the objects that a points to are projected to set {Oi}. Applying this abstraction reduces the number of allocation site nodes AL and instance field nodes IF. Each element of an array a is represented by a special field node Oa.ELM, where Oa denotes the allocation site of a and ELM represents the field name. Individual array elements are not distinguished: considering each element separately may introduce infeasible time and space overhead.

Figure 3.5: Partial copy graph with context-insensitive and context-sensitive object naming.

For illustration, consider the partial copy graph in Figure 3.5(a). The figure shows only paths starting from nodes in method main in the running example. An allocation site is named Oi, where i is the number of the code line containing the site. Each copy graph edge is annotated with two numbers: its frequency and the number of bytes it copies. For example, edge O40 --(1000,4)--> O3.ELM copies the Integer objects created at line 40 into the array referenced by data1's elems field. This edge consists of a sequence of copies via parameter passing (line 40 and line 9). This sequence of copies occurs 1000 times, and each time 4 bytes of data are transferred. Both O40 --(1000,4)--> O3.ELM and O3.ELM --(1000,4)--> O3.ELM are hot edges: their frequencies and the total number of bytes copied are much larger than those of other edges. When there exists a performance problem in the program, a better design might be needed to eliminate these copies.

It is important to note again that nodes that represent different objects may be merged due to the employed abstraction. For example, although variable t at line 9 points to different objects at run time, the array element node t[count++] is represented by a single node O3.ELM, regardless of the List object that owns the array. Consider the self-pointing edge with annotation (1000,4) at node O3.ELM. The edge captures the data flow illustrated in Figure 3.4. This sequence of copies moves object references from the array pointed-to by O39.elems to the array pointed-to by O31.elems. Since both arrays are represented by O3, their elements are merged into O3.ELM in the copy graph and this self-pointing edge is generated.

Merging of nodes could lead to spurious copy chains that are inferred from the copy graph. For example, from Figure 3.5(a), one could imprecisely conclude that both O40 and O42 will eventually be consumed, because both the (1000,4) edge and the (5,4) edge can lead to consumer node C. The cause of the problem is the context-insensitive object naming scheme, which maps each run-time object to its allocation site, regardless of the larger data structure in which the object appears. In order to model copy chains more precisely, we introduce a context-sensitive object naming scheme.

3.2.3 Context Sensitivity

When naming a run-time object, a context-sensitive copy graph construction algorithm takes into account both the allocation site and the calling context of the method in which the object is allocated. Existing static analysis work proposes two major types of context sensitivity for object-oriented programs: call-chain-based context sensitivity (i.e., k-CFA) [125], which considers a sequence of call sites invoking the analyzed method, and object-sensitivity [91], in which the context is the sequence of static abstractions of the objects (i.e., allocation sites) that are run-time receivers of methods preceding the analyzed method on the call stack. Of particular interest for our work is the object-sensitive naming scheme because, to a large degree, it reflects object ownership and is suitable for improving the analysis precision for real-world applications making use of a large number of object-oriented data structures.

Figure 3.5(b) shows the 1-object-sensitive version of the copy graph, in which an object is named using its allocation site together with the allocation site of the receiver object of the method in which the object is created. For objects created in a constructor, the context is usually their run-time owner. By adding context sensitivity, paths that start from O40 and O42 do not share any nodes. Note that there are no contexts for nodes O39,...,O45 because they are created in static method main, which does not have a receiver object. Although longer context strings may increase precision, our tool limits the length of the context to 1 since it could be prohibitively expensive (both in time and space) to employ longer contexts in a dynamic analysis.
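As a minimal sketch, a 1-object-sensitive name can be thought of as a pair of allocation-site IDs. The class below is an illustrative model only; the profiler encodes names in the shadow heap and the node arrays of Section 3.3 rather than as Java objects.

// Illustrative 1-object-sensitive object name: the allocation site plus the
// allocation site of the receiver of the method performing the allocation.
// A static method (such as main) has no receiver, modeled here as NO_CONTEXT.
public class ObjectNameSketch {
    static final int NO_CONTEXT = -1;

    record ObjectName(int allocSiteId, int receiverAllocSiteId) {}

    static ObjectName nameFor(int allocSiteId, Integer receiverAllocSiteId) {
        return new ObjectName(allocSiteId,
                receiverAllocSiteId == null ? NO_CONTEXT : receiverAllocSiteId);
    }

    public static void main(String[] args) {
        // O3 allocated in List's constructor during slowClone: context is O31 (the new List).
        System.out.println(nameFor(3, 31));
        // O39 allocated in static main: no receiver, hence no context.
        System.out.println(nameFor(39, null));
    }
}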

3.3 Copy Graph Construction

This section presents the details of the copy profiling technique. As the program executes and application data is read or written, the information flow analysis presented in Chapter 2 updates the corresponding tracking data.

The copy graph construction algorithm consists of two main components: (1) "compile time" instrumentation, which occurs at run time during JIT compilation, and (2) run-time profiling. To avoid having to modify both the interpreter and the JIT, we run the VM in a JIT-only mode such that all methods in the program are compiled by the JIT prior to their first invocation, allowing the tool to track data flow throughout the entire program.

Figure 3.6: Data structure overview.

3.3.1 Data Structure Design

The data structure design for the copy graph is important for minimizing overhead. The goal of the design is to allow efficient mapping from a run-time heap location to its name (which in our analysis is a copy graph node address). Figure 3.6 shows an overview of the data structures for the copy graph. Static field nodes are stored in a singly-linked list that is constructed at instrumentation time. The node address is hard-coded in the generated executable code, so that the retrieval of nodes does not contribute to running time (thus, the analysis does not need to use the shadow locations for static fields). Each node has an edge pointer, which points to a linked list of copy graph edges that leave this node. Edge adding occurs at run time. If an existing edge is found for a pair of a source node and a target node, a new edge is not added. Instead, the frequency field of the existing edge is incremented. The size field (i.e., number of bytes) can be determined at compile time by inspecting the type of data that the copy transfers.

Allocation site nodes and instance field nodes are implemented using arrays to allow fast access. For each allocation site, a unique integer ID is generated at compile time (the IDs start from 0). The ID is used as the index into an array of allocation headers. Each allocation header corresponds to one ID, and points to an array of allocation nodes and to an array of field nodes, both specific to this ID. For a context-insensitive copy graph, the allocation node array for the ID has only one element. For the context-sensitive copy graph that requires a unique allocation node for each calling context (i.e., the allocation site ID of the receiver object of the surrounding method), each element of the allocation node array corresponds to a different calling context. In the current implementation the array does not grow dynamically, thus the number of calling contexts for each allocation site is limited to a pre-defined value c. We have experimented with different values of c and these results are reported in Section 3.6. An encoding function maps an allocation site ID representing a context to a value in [0, c−1]; currently, we use a simple mod operation contextAllocId % c to encode contexts. As reported in Section 3.6, very few contexts for an object have conflicts (i.e., they map to the same value) when using this function. A default c value of 4 was used for the studies described in Section 3.6.

The field node array is created similarly. The order of different fields in the array depends on the offsets of these fields in the class. We build a class metadata table at the time the class is resolved by the JIT. The table sorts fields based on their offsets, and maps each field to a unique ID (starting from 0) indicating its order in the field node array. For each instance field declared in the type (and all its supertypes) instantiated at the allocation site, there are 1 (i.e., for context-insensitive naming) or c (i.e., for 1-object-sensitive naming) entries in the field node array. For example, consider an instance field dereference a.f for which the allocation site ID of the object pointed-to by a is 1000, the corresponding context allocation ID is 245, the offset of f is 12, and this offset (at compile time) is mapped to field ID i = class_metadata[12]. The corresponding copy graph node address can be obtained from the element with index c * i + (245 % c) in the array pointed-to by column Fields of alloc_headers[1000].
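A worked sketch of this index arithmetic is shown below. The constant C, the helper names, and the example field ID are illustrative assumptions; the actual implementation operates on native arrays inside the JIT-generated code.

// Illustrative index arithmetic for field nodes, assuming c context slots per
// allocation site (default c = 4). Names and array layout are simplified.
public class NodeIndexSketch {
    static final int C = 4; // number of context slots per allocation site

    // Context encoding: the context allocation-site ID is folded into [0, c-1].
    static int encodeContext(int contextAllocId) {
        return contextAllocId % C;
    }

    // Index of the field node for field ID i under the given context,
    // within the field-node array of the base object's allocation header.
    static int fieldNodeIndex(int fieldId, int contextAllocId) {
        return C * fieldId + encodeContext(contextAllocId);
    }

    public static void main(String[] args) {
        // The example from the text: allocation site 1000, context 245, hypothetical field ID i = 3.
        int i = 3;
        System.out.println(fieldNodeIndex(i, 245)); // 4*3 + 245%4 = 13
    }
}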

3.3.2 Instrumentation Relation CG ⇒a CG′

Our instrumenter takes the assembly-like J9 intermediate representation (IR) as input, and feeds the instrumented IR to the code generator. The goal of the instrumentation is to insert code to propagate the address of a copy graph node at run time. The copy graph node represents the heap location from which a piece of data comes.

[1. local=alloc]
SH(i) = (AllocID(new O), SH(this) & 0xFFFFFFFF)
CG′ = CG ∪ CreateAllocHeaderEntry(O, AllocID(new O))
E ⊢ SH(i) : addr_rhs
shadow_i = addr_rhs
CG ⇒(i = new O) CG′

[2. local=static]
E ⊢ F : addr_rhs
CG′ = CG
shadow_i = addr_rhs
CG ⇒(i = F) CG′

[3. local=instance field dereference]
E ⊢ (SH(a), Offset(f)) : addr_rhs
CG′ = CG
shadow_i = addr_rhs
CG ⇒(i = a.f) CG′

[4. local=local]
CG′ = CG
shadow_i = shadow_j
CG ⇒(i = j) CG′

[5. static=local]
E ⊢ F : addr_lhs
CG′ = CG ∪ CreateEdge(shadow_i, addr_lhs)
CG ⇒(F = i) CG′

[6. instance field dereference=local]
E ⊢ (SH(a), Offset(f)) : addr_lhs
CG′ = CG ∪ CreateEdge(shadow_i, addr_lhs)
CG ⇒(a.f = i) CG′

[7. local=computation]
edge_c = CreateEdge(shadow_c, C)
edge_d = CreateEdge(shadow_d, C)
CG′ = CG ∪ edge_c ∪ edge_d
CG ⇒(i = c + d) CG′

Figure 3.7: Run-time effects of instrumentation.

The intraprocedural instrumentation is illustrated at a high level in Figure 3.7. Based on the techniques described earlier, the name environment E maps each heap location to the address of its corresponding copy graph node. Function SH (i.e., shadow heap) returns, for each object, its allocation site ID and its context allocation site ID. For example, E ⊢ SH(i) : addr_rhs in rule 1 says that, given the (allocation ID, context ID) pair for the heap object pointed-to by local variable i, E maps this pair to the copy graph node at address addr_rhs. Here addr_rhs and addr_lhs represent the addresses of the copy graph nodes for the heap locations corresponding to the right/left-hand-side expressions of an instruction. Each rule describes the update of the copy graph (i.e., CG) for a type of instruction, with unprimed and primed symbols representing the copy graph before and after the instruction is executed.

In rule 1, the shadow of local variable i is assigned the address of the copy graph allocation node representing the newly-created heap object. If the method containing the allocation site is an instance method, the context object is the object referenced by this. The bit operation (& 0xFFFFFFFF) retrieves the lower 4 bytes from the shadow heap location, which stores the allocation site ID for this itself (while the higher 4 bytes contain the allocation site ID of this's context). A static method does not have a context.

Before each call site in a caller, the shadow variables for the actual parameters are pushed on the tracking stack, and they are popped at the entry of the callee method. Similarly, at the exit of the callee method, the shadow variable for the returned value is pushed, and it is popped after the call site in the caller. Data carried by exception flow is not tracked by the tool.

Once a heap load operation is seen (rules 2 and 3), the address of the node representing the heap location is stored in the shadow variable. Upon a heap store (rules 5 and 6), an edge with the source node address (contained in the shadow variable) and target node address (obtained from the heap location) is created, and the graph is updated with this new edge. In rule 7, once data comes to a computation instruction, we create edges to connect the copy graph node for each participating variable with the consumer node C.
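The sketch below restates the effect of a heap load (rule 3) and a heap store (rule 6) as plain Java. Here nodeOf stands in for the environment E; in the real analysis the node is determined from the base object's allocation-site tag in the shadow heap rather than from its class, so this is only a conceptual rendering, not the JIT-emitted code.

public class InstrumentationSketch {
    static Object nodeOf(Object base, String field) {
        // Simplification: name the node by the base object's class and field.
        // The real analysis uses the allocation-site tag stored in the shadow heap.
        return base.getClass().getName() + "." + field;
    }

    static void createEdge(Object source, Object target) {
        System.out.println("copy graph edge: " + source + " -> " + target);
    }

    static Object shadowI; // shadow variable for local i

    // Rule 3 (i = a.f): a heap load stores the node of a.f into i's shadow.
    static Object load(Object a, String f, Object value) {
        shadowI = nodeOf(a, f);
        return value;
    }

    // Rule 6 (a.f = i): a heap store creates an edge from i's shadow to the node of a.f.
    static void store(Object a, String f, Object value) {
        createEdge(shadowI, nodeOf(a, f));
        // the original program then performs the actual store of value into a.f
    }

    public static void main(String[] args) {
        Object list = new java.util.ArrayList<String>();
        Object v = load(list, "elems", "payload"); // i = a.elems
        store(new Object(), "f", v);               // b.f = i
    }
}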

3.4 Copy Graph Client Analyses

This section presents three client analyses implemented in J9. These clients analyze the copy graph and generate reports that are useful for understanding run-time behavior and pinpointing performance bottlenecks.

3.4.1 Hot Copy Chains

Given a copy chain with frequency n and data size s, its copy volume is n × s. The copy volume of a chain is the total amount of data transmitted along that chain. Chains with large copy volumes are more likely to be sources of performance problems. Another important metric is chain length: the longer a copy chain is, the more wasteful memory operations it contains. Considering both factors, we compute a waste factor (WF) for each chain as the product of length and copy volume. The goal of the hot chain analysis is to find copy chains that have large WF values.

The first issue is how to recover chains from copy graph edges. We use a brute-force approach which traverses the copy graph and computes the set of all distinct paths whose length is smaller than a pre-defined threshold value. If a path is a true copy chain, all its edges should have the same frequency. Based on this observation, the WF for each path is computed by using its smallest edge frequency as the path frequency. The resulting copy graph paths are ranked based on their WF values, and the top paths are reported. An example of a chain reported for benchmark antlr from DaCapo is as follows:

(355162, 2): array[antlr/PreservingFileWriter:61].ELM — [java/io/BufferedWriter.write:198, 177581, 2] → array[java/io/BufferedWriter:108].ELM — [sun/io/CharToByteUTF8.convert:262, 177759, 2] → array[sun/nio/cs/StreamEncoder$ConverterSE:237].ELM

The chain contains three nodes connected by two edges. The pair (355162, 2) shows the WF and the chain length. Each node in this example is an array element node. For instance field nodes and array element nodes, the allocation site of the base object is also shown. In this example, line 61 in class antlr.PreservingFileWriter creates the array whose elements are the sources of the copy chain. An edge shows the method where its last copy operation occurs (e.g., line 198 in method java.io.BufferedWriter.write), the edge frequency (e.g., 177581), and the data size (e.g., 2 bytes).
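A small sketch of the waste-factor computation over a recovered path is shown below. The Edge record and the example values are illustrative assumptions; the real client operates directly on the profiler's native edge lists rather than Java objects.

import java.util.List;

// Illustrative WF computation for a recovered copy graph path: the path frequency is
// the smallest edge frequency, the copy volume is frequency * size, and
// WF = length * copy volume, as described above.
public class WasteFactorSketch {
    record Edge(long frequency, int bytes) {}

    static long wasteFactor(List<Edge> path) {
        long pathFreq = Long.MAX_VALUE;
        int bytes = 0;
        for (Edge e : path) {
            pathFreq = Math.min(pathFreq, e.frequency());
            bytes = e.bytes();                // all edges of a true chain move the same value
        }
        long copyVolume = pathFreq * bytes;   // total data moved along the chain
        return (long) path.size() * copyVolume;
    }

    public static void main(String[] args) {
        // Example: a two-edge chain that moves a 4-byte value at least 1000 times.
        System.out.println(wasteFactor(List.of(new Edge(1000, 4), new Edge(1200, 4))));
    }
}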

3.4.2 Clone Detector

Many applications make expensive clones of objects. A cloned object can be obtained via field-to-field copies from another object (e.g., as usually done in clone methods), or by adding data held by another object during initialization (e.g., many container classes have constructors that can initialize an object from another container object). Although clones are sometimes necessary, they indicate the existence of wasteful operations and redundant data. For instance, in our running example, slowClone initializes a new list by copying data from an existing list. Invoking this method many times may cause performance problems. The goal of this analysis is to find pairs of allocation sites, each of which represents the top (i.e., root) of a heap object subgraph, such that a large amount of data is copied from one subgraph to the other.

For each copy graph edge O1.f --(a,b)--> O2.g, where f and g are instance fields, the value of a × b is counted as part of the direct flow from O1 to O2. The total direct flow for pair (O1, O2) shows how many bytes are copied from fields of O1 to fields of O2. Next, the analysis considers the indirect flow between objects. Suppose that some field of O1 points to an object O3, and some field of O2 points to an object O4. Furthermore, suppose that there is direct flow (i.e., some copy volume) from O3 to O4. In addition to attributing this copy volume to the pair (O3, O4), we want to also attribute it to the pair (O1, O2). This is done because O1 may potentially be the root of an object subgraph for a data structure containing O3. Similarly, O2 may be the root of a data structure containing O4. If copying is occurring for the entire data structures, the copy volume reported for pair (O1, O2) should reflect this.

The analysis considers all objects Oi reachable from O1 along reference chains of a pre-defined length (length 3 was used for the experiments). Similarly, all objects Oj reachable from O2 along reference chains of this length are considered. The copy volume reported for (O1, O2) is the sum of the direct copy volumes for all such pairs (Oi, Oj), including the direct flow from O1 to O2. To determine all relationships of the form "O′ points to O", the analysis considers chains such that O is the producer node, that is, the value propagated along the chain is a reference to O. For any field node O′.h in such a chain, object O′ points to object O.

In the running example, slowClone illustrates this approach. At line 31, a new List object is created. Its field elems points to an array which is initialized with the contents of the array pointed to by the List created at line 39. In the first step of the analysis, volume 4000 is associated with the two array objects (1000 copies of 4-byte references to Integer objects). This volume is then also attributed to the two List objects, represented by pair (O39, O31), and to the two ListClient objects that own the lists, represented by pair (O44, O32). Ultimately, the reason for this entire copy volume is the cloning of a ListClient object, even though it manifests in the copying of the array data owned by this ListClient. Reporting the pair (O44, O32) highlights this underlying cause.
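The sketch below illustrates the first, direct-flow step of this client. The record types and labels are hypothetical, and the indirect attribution to enclosing data structure roots (reference chains of length up to 3) is omitted from this sketch.

import java.util.HashMap;
import java.util.Map;

// Illustrative direct-flow aggregation for the clone detector: for every copy graph
// edge O1.f --(a,b)--> O2.g between instance fields, a*b bytes are attributed to the
// base-object pair (O1, O2).
public class CloneDetectorSketch {
    record FieldEdge(String srcAlloc, String dstAlloc, long frequency, int bytes) {}
    record Pair(String src, String dst) {}

    static Map<Pair, Long> directFlow(Iterable<FieldEdge> edges) {
        Map<Pair, Long> volume = new HashMap<>();
        for (FieldEdge e : edges) {
            volume.merge(new Pair(e.srcAlloc(), e.dstAlloc()), e.frequency() * e.bytes(), Long::sum);
        }
        return volume;
    }

    public static void main(String[] args) {
        // Hypothetical example: 1000 copies of 4-byte references between two arrays.
        System.out.println(directFlow(java.util.List.of(
                new FieldEdge("O_srcArray", "O_dstArray", 1000, 4))));
    }
}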

3.4.3 Not Assigned To Heap (NATH)

The third client analysis detects allocation sites that are instantiated many times and whose object references do not flow to the heap. For instance, O44 and O45 in the running example represent objects whose references are never assigned to any heap object or static field. These allocation sites are likely to represent the tops of temporary data structures that are constructed many times to provide simple services. For example, we have observed an application that creates GregorianCalendar objects inside a loop. These objects are used to construct the date fields of other objects. This causes significant performance degradation, as construction of GregorianCalendar objects is very expensive. In addition, these objects are usually temporary and short-lived, which may lead to frequent garbage collection. A simple fix that moves the object construction out of the loop can solve the problem. The escape analysis performed by a JIT usually does not remove this type of bloat, because many such objects escape the method where they are created, and are eventually captured far away from the method. Using the copy graph, this analysis can be easily performed by finding all allocation nodes that do not have outgoing edges. These nodes are ranked based on the numbers of times that they are instantiated. Using the information provided by this analysis, we have found in Eclipse 3.1 a few places where NATH objects are heavily used. Running time reduction can be achieved after a simple manual optimization that avoids the creation of these objects.
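A minimal sketch of the NATH client over a copy graph model is shown below; the classes, field names, and example counts are illustrative only, and the instantiation counts would come from the profiler.

import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Illustrative NATH client: report allocation-site nodes with no outgoing copy graph
// edges (their objects are never stored into the heap), ranked by instantiation count.
public class NathSketch {
    record AllocNode(String site, int outgoingEdges) {}

    static List<AllocNode> nath(List<AllocNode> allocNodes, Map<String, Long> instantiations) {
        return allocNodes.stream()
                .filter(n -> n.outgoingEdges() == 0)
                .sorted(Comparator.comparingLong(
                        (AllocNode n) -> instantiations.getOrDefault(n.site(), 0L)).reversed())
                .toList();
    }

    public static void main(String[] args) {
        List<AllocNode> nodes = List.of(new AllocNode("O44", 0), new AllocNode("O39", 2));
        System.out.println(nath(nodes, Map.of("O44", 5L, "O39", 1L)));
    }
}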

3.4.4 Other Potential Clients

There are a variety of performance analyses that can take advantage of the copy graph. For example, one can measure and identify useless data by finding nodes that cannot reach the consumer node, and by aggregating them based on the objects that they belong to. As another example, developers of large applications usually maintain a performance regression test suite, which will be executed across versions of a program to guarantee that no performance degradation results from the changes. However, these performance regression tests can easily fail due to bug fixes or the addition of new features that involve extra memory copies and method invocations. It is labor-intensive to find the cause of these failures. Differentiating the copy graphs constructed from the runs of two versions of the program with the same input data can potentially help pinpoint performance problems that are introduced by the changes. A possible direction for future work is to investigate these copy-graph-based analyses.

3.5 Using Copy Profiles to Find Bloat

This section presents three case studies of using copy profiles, both flat and ones derived from the copy graph, to pinpoint sources of useless work.

3.5.1 DaCapo Bloat

Inspecting the total copy count of the DaCapo bloat benchmark, we found a high volume of data copies. Averaged across all method invocations, 28% of all operations were copies from one heap location to another. This indicated that there were big opportunities for optimizing away excessive computations and temporary object construction.

When inspecting the cumulative copy profile (i.e., a copy profile that counts copies in a method and any methods it invokes), we found that approximately 50% of all data copies came from a variety of toString and append methods. Inspecting the source code, we found that most of these calls centered around code of the form Assert.isTrue(cond, "bug: " + node). This benchmark was written prior to the existence of the Java assert keyword. This coding pattern meant that debugging logic resulted in entire data structures being serialized to strings, even though most of the time the strings themselves were unused; the isTrue method does not use the second parameter if the first parameter is true. We made a simple modification to eliminate the temporary strings created during the most important copying methods.2 This resulted in a 65% reduction in objects created, and a 29–35% reduction in execution time (depending on the JVM used; we tried Sun 1.6.0_10 and IBM 1.6.0 SR2).

The DaCapo suite is geared towards JVM and hardware designers. In the design of this suite, it is important to distinguish inefficiencies that a JIT could possibly eliminate from ones that require a programmer with good tools.

3.5.2 Java 5 GregorianCalendar

A recurring problem with the Java 1.5 standard libraries is the slow performance of calendar-related classes [142]. Many users experienced a 50× slowdown when upgrading from Java 1.4 to Java 1.5. The problems centered around methods in class GregorianCalendar, which is an important part of date formatting and parsing. We ran the test case provided by a user and constructed a context-sensitive copy graph. The test case makes intensive calls of the before, after, and equals methods. The report of hot copy chains includes a family of hot chains with the following structure:

array[Calendar:907].ELM — [Calendar.clone:2168,510000] → array[Calendar:2169].ELM

2We commented out the toString methods of Block, FlowGraph, RegisterAllocator, Liveness, Node, , Label, MemberRef, Instruction, NameAndType, LocalVariable, Field, and Constant.

This chain (and others similar to it, for the fields of a calendar) suggests that clone is invoked many times to copy values from one Calendar to another. To confirm this, we ran the clone detector and the top four pairs of allocation sites were as follows:

340000: (GregorianCalendar[GregorianCalendarTest:11],array[Calendar:2168])

340000: (array[Calendar:906],array[Calendar:2168])

340000: (array[Calendar:907],array[Calendar:2169])

340000: (array[Calendar:908],array[Calendar:2170])

The first pair shows that an array created at line 2168 of Calendar gets a large amount of data from the GregorianCalendar object created in the test case. The remaining three pairs of allocation sites also suggest the occurrence of clones, because the first group of objects (i.e., at lines 906, 907, 908) are arrays created in the constructor of Calendar, while the second group (i.e., at lines 2168, 2169, and 2170) are arrays created in clone. By examining the code, we found that clone creates a new object by deep copying all array fields from the old Calendar object. These copies also include the cloning of a time zone from the zone field of the existing object. Upon further inspection, we found the cause of the slowdown: methods before, after, and equals invoke method compareTo to compare two GregorianCalendar objects, which is implemented by comparing the current times (in milliseconds) obtained from these objects. However, getMillisOf does not compute time directly from the existing calendar object, but instead makes a clone of the calendar and obtains the time from the clone.

The JDK 1.4 implementation of Calendar does not clone any objects. This is because the 1.4 implementation of getMillisOf mistakenly changes the internal state of the object when computing the current time. In order to avoid touching the internal state, the implementers of JDK 1.5 made the decision to clone the calendar and get the time from the clone. Of course, this is not a perfect solution, as it fixes the original bug at the cost of introducing a significant performance problem. Our tool highlighted the useless work being done in order to work around the getMillisOf issue.

3.5.3 DaCapo Eclipse

As a large framework-based application, Eclipse suffers from performance problems that result from the pile-up of wasteful operations in its plugins. These problems impact usability, and even programmers' choice when comparing Java development tools [68]. We ran Eclipse 3.1 from the DaCapo benchmark set and used the NATH analysis to identify allocation sites whose run-time objects are never assigned to the heap. The top nine allocation sites are shown below:

(1) 295,004: org/eclipse/jdt/internal/compiler/ISourceElementRequestor$MethodInfo

[SourceElementParser:968]

(2) 161,169: .../SimpleWordSet[SimpleWordSet:58]

(3) 145,987: .../ISourceElementRequestor$FieldInfo[SourceElementParser:1074]

(4) 46,603: .../ContentTypeCatalog$7[ContentTypeCatalog:523]

(5) 46,186: .../ISourceElementRequestor$TypeInfo[SourceElementParser:1190]

(6) 45,813: .../Path[PackageFragment:309]

(7) 44,703: .../Path[CompilationUnit:786]

(8) 37,201: .../ContentTypeHandler[ContentTypeMatcher:50]

(9) 30,939: .../HashtableOfObject[HashtableOfObject:123]

Each line shows an allocation site and the number of times it is instantiated. For example, the first line is for an allocation site at line 968 in class SourceElementParser, which creates 295,004 objects of type ISourceElementRequestor$MethodInfo. Sites 4 and 8 are from plugin org.eclipse.core.resources. The remaining sites are located in org.eclipse.jdt.core. Because the Eclipse 3.1 release does not contain the source code for org.eclipse.core.resources, we inspected only the seven sites in the JDT plugin.

The first site is located in class SourceElementParser, which is a key part of the JDT compiler. JDT provides many source code manipulation functionalities that can be used for various purposes, such as automated formatting and refactoring. The observer pattern is used to provide source code element objects when a client needs them. Method notifySourceElementRequestor, which contains this site, plays the observer role: once a requestor (i.e., a client) asks for a compilation unit node (i.e., a class), the method notifies all child elements (i.e., methods) of the compilation unit by calling method enterMethod, which will subsequently notify source code statements in each method. Method enterMethod takes a MethodInfo object as input; this object contains all necessary information for the method that needs to be notified. The site creates MethodInfo objects which are then provided to enterMethod.

Because enterMethod is defined in an interface, we checked all implementations of the method. Surprisingly, none of these implementations invoke any methods on this parameter object. They extract all information about the method to be notified from fields of the object; these fields are previously set by notifySourceElementRequestor. The third and the fifth allocation sites from above tell the same story: these hundreds of thousands of objects are created solely for the purpose of carrying data across one-level method invocations. It is expensive to create and reclaim these objects, and to perform the corresponding heap copies. We modified the interface and all related implementations to pass data directly through parameters. This modification reduces the number of allocated objects by millions and improves the running time by 2.8%.

In large applications with no single hot spot, significant performance improvements are possible by accumulating several such “small” improvements, as illustrated below.

Table 3.1 shows a list of several problems we identified with the help of the analy- ses. For each problem, the table shows the problematic class (Class), our code modifi- cation, the number of allocated objects (#Objs), the total number of GC invocations

(#GCs), and the running times. Row Original characterizes the original execution.

Each subsequent row shows the cumulative improvements due to our changes in the

JDT plugin. The second row corresponds to allocation sites 1, 3, and 5 listed above, the third row is for sites 6 and 7, the fourth row is for site 2, and the last row is for site 9.

By modifying the code to eliminate redundant copies and the related creation of objects, we successfully reduced the number of GC runs, the number of allocated objects, and the total running time. With the help of the tool, it took us only a few hours to find these problems and to make modifications in a large application we had never studied before.

It is important to note that this effort just scratches the surface: significant per-

formance improvement may be possible if a developer or a performance expert care-

fully examines the tool reports (with different tests and workloads) and eliminates

the identified useless work. This is the kind of manual tuning that is already being

done today for large Java applications with performance problems that cannot be

attributed to a single hot spot. This tedious and labor-intensive process can be made

more efficient and effective by the dynamic analyses proposed in our work. Future

studies should investigate such potential performance improvements for a broad range

of Java applications.

3.6 Copy Graph Characteristics

This section presents characteristics of the copy graph and its construction. The maximum heap size specified for each run was 500Mb; hence, the size of the shadow heap for each run was also 500Mb. IBM DMS is the IBM document management server, which is run on top of a J2EE application server. Each DaCapo benchmark was run with the large workload for two iterations, and the running time for the second iteration is shown. SPECjbb and IBM DMS are server applications that report throughput, not total running time; both were run for 30 minutes with a standard workload.

Table 3.2 presents the time and space overhead of context-insensitive copy graphs.

The second column, labeled Torig , presents the original running times in seconds. The

remaining columns show the total numbers of nodes N0 and edges E0, the amount

of memory M0 needed by the analysis (in megabytes), the running times T0 (in seconds), and the performance slowdowns (shown in parentheses). The slowdown for

each program is T0/Torig. Because the shadow heap is 500Mb, the space overhead of the copy graph is M0 − 500Mb.

In Table 3.3, Table 3.4, and Table 3.5, the same measurements are reported for

1-object-sensitive copy graphs. To understand the impact of the number of context slots (i.e., parameter c from Section 3.3.1), we experimented with values 4, 8 and 16 when constructing the 1-object-sensitive copy graph. The slowdown for each program was calculated as Ti/Torig (the original time from Table 3.2), where i ∈{4, 8, 16}.

The copy graph itself consumes a relatively small amount of memory. Other than for IBM DMS, the space overhead of the copy graph does not exceed 27Mb even when using 16 context slots. As expected, a context-sensitive copy graph consumes more memory than the context-insensitive one, and using more context slots leads to larger space overhead.

The running time overheads for profiling the context-insensitive copy graph and the three versions of 1-object-sensitive copy graphs are, on average, 36×, 37×, 37×, and 37× respectively. This overhead is not surprising because the analysis tracks the execution of every instruction in the program. The overhead also comes from syn- chronization performed by the instrumentation of allocation sites, which sequentially executes the allocation handler to create allocation header elements. The current implementation provides a general facility for mapping an object address to a context

ID. This is done even for the context-insensitive analysis, where the ID is always

0. Since the cost of this mapping is negligible, we have not created a specialized context-insensitive implementation. Hence, the difference between the running times of profiling context-insensitive and context-sensitive copy graphs is noise. The only significant difference between the context-insensitive and context-sensitive analyses is the space overhead.

Program     Torig(s)     #N0      #E0   M0(Mb)      T0(s) (×)
antlr            8.9    12516    56703   503.7      284.2 (31.9)
bloat          157.5    14058    14471   502.2     9812.2 (62.4)
chart           32.5    18113    12810   502.5     1053.2 (32.4)
fop              3.6    12419     7675   501.8       38.2 (10.6)
pmd             46.6    11289     8418   501.7     1542.4 (33.1)
jython          74.7    25653    21893   503.2     2826.1 (37.8)
xalan           64.8    13505    28678   502.6     3030.5 (46.8)
hsqldb          13.5    12294     9102   501.7      350.0 (25.9)
luindex         12.1    10154    10227   501.6      583.4 (48.2)
lusearch        19.2     8390    13849   501.5      662.8 (34.5)
eclipse        124.7    34074    52957   506.5     4343.8 (34.8)
SPECjbb        1800*    17146    12637   502.4      1800*
IBM DMS        1800*   147517    87531   519.6      1800*

Table 3.2: Copy graph size and time/space overhead, part 1. Shown are the original running time Torig, as well as the total numbers of graph nodes N0 and edges E0, the total amount of memory consumed M0, the running time T0, and the slowdown (shown in parentheses) when using a context-insensitive copy graph.

Although significant, these overheads have not hindered us from running the tool on any programs, including real world large-scale production applications. It was an intentional design decision not to focus on the performance of the analysis, but instead focus on the content collected and on demonstrating that the results are useful for finding performance problems in real programs. Now that the value of the tool has been established, a possible future direction is to use sampling-based profiling to obtain the same or similar results. Another possibility is to employ static pre-analyses that reduce the cost of the subsequent dynamic analysis.

                 1-object-sensitive (c = 4)
Program      #N4      #E4   M4(Mb)      T4(s) (×)
antlr       48556   112907   506.9      294.8 (33.1)
bloat       54960    35678   504.3    10182.9 (64.7)
chart       69438    25951   504.6     1079.4 (33.2)
fop         47893    11985   503.1       37.4 (10.4)
pmd         43740    15576   503.0     1586.7 (34.0)
jython      95493    32256   505.8     2865.6 (38.4)
xalan       52485    55367   504.9     2983.3 (46.0)
hsqldb      47666    13432   503.0      358.0 (26.5)
luindex     39319    17695   502.8      568.7 (47.0)
lusearch    32354    22163   502.6      643.5 (33.5)
eclipse    131065   124043   512.3     4521.5 (36.3)
SPECjbb     66102    23909   503.3      1800*
IBM DMS    193707   180187   533.7      1800*

Table 3.3: Copy graph size and time/space overhead, part 2. The columns report the same measurements as Table 3.2, but for 1-object sensitive copy graph with 4 context slots.

                 1-object-sensitive (c = 8)
Program      #N8      #E8   M8(Mb)      T8(s) (×)
antlr       96609   159042   510.2      300.7 (33.8)
bloat      109494    48840   506.5    10147.4 (64.4)
chart      137945    39133   507.3     1054.4 (32.4)
fop         95180    13509   504.6       37.2 (10.3)
pmd         86980    19568   504.5     1568.5 (33.7)
jython     188583    37005   509.0     2879.9 (38.6)
xalan       85751    88001   507.7     3067.6 (47.3)
hsqldb      94846    15201   504.6      346.7 (25.7)
luindex     78232    22912   504.3      581.1 (48.0)
lusearch    64280    26629   503.8      651.6 (33.9)
eclipse    259168   154004   517.4     4545.3 (36.4)
SPECjbb    131413    27660   507.2      1800*
IBM DMS    381072   242049   571.2      1800*

Table 3.4: Copy graph size and time/space overhead, part 3. The columns report the same measurements for 1-object sensitive copy graph with 8 context slots.

                 1-object-sensitive (c = 16)
Program     #N16     #E16   M16(Mb)     T16(s) (×)
antlr      192713   210522   515.2      309.5 (34.8)
bloat      218558    60483   510.5    10068.2 (63.9)
chart      274903    45071   511.9     1056.5 (32.5)
fop        189757    14388   507.7       36.8 (10.2)
pmd        173484    21339   507.3     1555.5 (33.4)
jython     374791    41027   515.0     2861.4 (38.3)
xalan      208119   117760   512.2     3067.6 (47.3)
hsqldb     189183    17190   507.7      345.9 (25.6)
luindex    156033    28333   507.0      564.8 (46.7)
lusearch   128152    32544   506.1      658.4 (34.3)
eclipse    516030   174846   526.4     4746.4 (38.1)
SPECjbb    261915    29017   511.0      1800*
IBM DMS    755829   304759   652.3      1800*

Table 3.5: Copy graph size and time/space overhead, part 4. The columns report the same measurements for 1-object sensitive copy graph with 16 context slots.

Program      #Chains   Length   #NATH Sites   #NATH Objects
antlr         250680     2.60           811         411536
bloat        6955316     4.00          1160       31217025
chart          29490     1.16          1652       15080848
fop           275835     3.36          1282         167808
pmd           436397     2.96          1062       54103059
jython       6827057     4.00           493       35926287
xalan          93263     2.60          1218        6186112
hsqldb          8595     1.80           828        3059666
luindex        30183     2.24           749        5543579
lusearch       10640     3.8            302        4200325
eclipse     10070910     1.24          3030        3494187
SPECjbb        21468     2.00           575         722800
IBM DMS      1937646     3.75          4695        1413528

Table 3.6: Copy chains and NATH objects. All copy graph paths with length ≤ 5 are traversed to compute hot chains. The columns show the total number of generated chains, the average length of these chains, and the number of NATH allocation sites and NATH run-time objects.

                    Average fan-out               Context conflict ratio
Program     CIFO   CSFO-4   CSFO-8   CSFO-16     CCR-4   CCR-8   CCR-16
antlr       4.66     2.33     1.64      1.09     0.237   0.131    0.081
bloat       1.03     0.64     0.44      0.27     0.199   0.090    0.068
chart       0.70     0.36     0.28      0.16     0.118   0.059    0.028
fop         0.62     0.25     0.15      0.08     0.134   0.060    0.043
pmd         0.75     0.35     0.22      0.12     0.131   0.059    0.051
jython      0.80     0.31     0.18      0.10     0.079   0.071    0.024
xalan       2.13     1.79     0.81      0.56     0.128   0.067    0.040
hsqldb      0.74     0.28     0.16      0.09     0.169   0.080    0.051
luindex     1.02     0.45     0.29      0.18     0.148   0.073    0.051
lusearch    1.68     0.68     0.41      0.25     0.127   0.082    0.052
eclipse     1.53     0.91     0.57      0.33     0.193   0.114    0.071
SPECjbb     0.75     0.36     0.21      0.11     0.144   0.065    0.026
IBM DMS     0.76     0.32     0.17      0.09     0.112   0.047    0.027

Table 3.7: Average node fan-out for context-insensitive (CIFO) and context-sensitive (CSFO-i) copy graphs, as well as average context conflict ratios (CCR-i) for the context-sensitive copy graphs.

Table 3.6 shows measurements for the copy chains obtained from a context- insensitive copy graph, including the total number of generated chains (#Chains) and the average length of these chains (Length). The table also shows the number of NATH allocation sites and NATH run-time objects. The significant numbers of

NATH objects indicate that eliminating such objects may be a worthwhile goal for future work on manual and automatic optimizations.

The first part of Table 3.7 lists the average node fan-out for the context-insensitive copy graph (CIFO) and the three versions of context-sensitive copy graphs (CSFO-i, where i is the number of context slots for each object). A node’s fan-out is the number of its outgoing edges. The average fan-out indicates the degree of node sharing among paths in the graph. Note that CIFO and CSFO-i are small, because there exist a

large number of producer nodes (allocation sites) that do not have outgoing edges.

In addition, the more slots are used to represent contexts, the smaller the average

fan-out, because more nodes are created to avoid path sharing.

The table also reports, for each context-sensitive copy graph, the average context conflict ratio (CCR-i). The CCR for an object o is defined as follows:

    CCR-i(o) = 0                                              if max_{0 ≤ k < i} nc[k] = 1
    CCR-i(o) = max_{0 ≤ k < i} nc[k] / Σ_{0 ≤ k < i} nc[k]    otherwise

Here nc[k] represents the number of distinct contexts that fall into context slot k. The

CCR value captures the degree to which our encoding function (i.e., id % k) causes distinct contexts to be merged in the copy graph. For example, the CCR is 0 if each context slot represents at most one distinct context; the CCR is 1 if all contexts for the object fall into the same slot. The table reports the average CCR for all allocation sites in the copy graph. As expected, the average CCR decreases with an increase in the number of context slots. Note that very few context conflicts occur even when c

= 4, because a large number of objects have only one distinct context during their lifetimes.
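As an illustration, the CCR of a single allocation site can be computed from the per-slot counts as in the following sketch; the code is hypothetical and assumes that the number of distinct contexts falling into each slot has already been collected.

final class ContextConflictRatio {
    // distinctContextsPerSlot[k] = number of distinct contexts mapped to slot k
    // for one allocation site; the array length is the number of slots c.
    static double ccr(int[] distinctContextsPerSlot) {
        int max = 0, sum = 0;
        for (int n : distinctContextsPerSlot) {
            max = Math.max(max, n);
            sum += n;
        }
        // Each slot holds at most one distinct context: no conflicts.
        if (max <= 1) return 0.0;
        // Otherwise: fraction of all contexts that collide in the fullest slot
        // (this is 1.0 when every context falls into a single slot).
        return (double) max / sum;
    }
}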

3.7 Summary and Interpretation

In large-scale systems, data-based activities, such as large volumes of copies, are sometimes stronger signs of excess than the control-based activities, such as method execution time, that are used primarily in compilers and tools. Based on this observation, this chapter introduces a technique to help developers find performance improvements that are beyond the scope of what is typically achieved by current JIT technology.

Using real-world examples, we show that this analysis can quickly guide the programmer to the problematic areas of the program, allowing them to inspect the code to find optimization opportunities. This analysis could also expose additional opportunities for interprocedural JIT optimizations, such as specializing across multiple components or hoisting complex, many-layered computations.

The success of this technique demonstrates that large optimization opportunities remain untapped by the JIT compiler. If developers are given tools that help them make better sense of heaps and executions, their insight and experience will let them find opportunities that the JIT cannot, and quickly fix problems that would otherwise require combining dozens of sophisticated analyses to fix automatically.

The study presented in this chapter confirms that data-based activities (e.g., copying of data) can sometimes be more interesting than control-based activities (e.g., method invocations) in bloat detection. When method invocation counts and execution times fail to expose performance bottlenecks (e.g., in the document management server discussed in Chapter 1), these data-oriented observations become valuable and more informative. Excessive copying is just one example of such an activity. In the next three chapters we present three more profiling techniques, focused on different kinds of data-based activities. All these techniques are effective in uncovering bloat and helping developers diagnose performance problems.

CHAPTER 4: Making Sense of Cost and Benefit

Bloat can also be caused by inappropriate choices of data structures and implementation algorithms, leading to computations with high cost (i.e., they are expensive to execute) and low benefit (i.e., they produce unnecessary data). As an example, in a large

Java program we found that the programmer creates many lists and adds thousands

of elements to each one of them, only for the purpose of obtaining list sizes. The

values contained in most fields of the list objects are never used; these values have

non-zero costs but zero benefits for the rest of the execution. Correct choices are

hard to make, as they require deep understanding of implementation logic and a

great deal of programming experience. These decisions often involve tradeoffs be-

tween space and time, between reusability and performance, and between short-term

development goals and long-term maintenance.

Querying the costs and benefits of certain data structures is a natural and effec-

tive way for a programmer to understand the performance of her program in order

to make appropriate choices. For example, questions such as “What is the cost of

using this data structure?” and “Why does this expensive call produce a rarely-used

value?” are often asked during software development and performance tuning/debug-

ging. Currently, these questions are answered mostly manually, typically through a

few labor-intensive rounds of code inspection, or with coarse-grained cost measure-

ments (e.g., method running times and numbers of created instances) with the help of

existing profiling tools. The answers are usually approximations that are far from the

actual causes of problems, making it extremely hard to track down the performance

bottlenecks. The goal of our work is to provide automated support for the perfor-

mance expert to measure costs and benefits at a fine-grained (instruction) level, thus

improving the precision of the answers and making the tuning tasks easier.

In addition to finding individual values that are likely to be results of wasteful op- erations, computing cost and benefit automatically provides many other advantages for resolving performance issues. For example, a number of high-level performance- related program properties can be quickly exposed by aggregating the costs and ben- efits of values contained in individual storage locations. These properties include, for example, whether a container is overpopulated (i.e., contains many objects but retrieves only a few of them), whether an object contains dead fields, and whether an object field is rewritten before it is read, etc. Such questions can be answered efficiently by our cost-benefit analyses presented in this chapter. To the best of our knowledge, these dynamic analyses are the first attempt to attack performance prob- lems with a cost-benefit computation.

4.1 Technical Challenges and The Basic Idea

Cost and benefit The cost cv of a value v can be defined as the total number of

bytecode instructions (transitively) required to produce v. Each instruction is treated

as having unit cost; the actual cost difference (e.g., add vs mul) is irrelevant for cost-

benefit analyses of large-scale applications that execute extremely large numbers of

instructions. Control dependences are not considered in the computation of cv, because

otherwise many instructions that contribute only to decision making would be taken

into account, making it hard to quantify the effort made only to produce v. Instead,

they are treated in a special manner, as discussed later. It is not obvious how to define the benefit of v, as there does not exist any explicit metric related to the “goodness” of consuming a value during the execution. We propose different benefit definitions for different analyses (i.e., for different targeted properties). For example, the benefit of v is zero when v is never used, or contributes only to the generation of never-used values.

Technical challenges The computation of cost for each value during the ex- ecution appears to be a problem which can be solved by associating a small amount of tracking data with each storage location and by updating this data as the corre- sponding location is written. A similar technique has been adopted, for example, in taint analysis [103], where the tracking data is a simple taint mark. In the setting of cost computation, the tracking data associated with each location records the cu- mulative cost of producing the value that is written into the location. Figure 4.1(a) shows a simple program and the update of the tracking data at each execution step; tx denotes the tracking data for location x. For an instruction s, a simple approach of updating the tracking data for the left-hand-side variable is to store in it the sum of the tracking data for all right-hand-side variables, plus the cost of s itself.
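The following stand-alone sketch (hypothetical code, not the tool's actual instrumentation) mimics this shadow-update rule for the program of Figure 4.1; parameter passing is modeled as a cost-free copy so that the resulting numbers match the figure.

import java.util.HashMap;
import java.util.Map;

final class NaiveCostTracker {
    private final Map<String, Long> shadow = new HashMap<>();

    // x = <constant>: the instruction itself has unit cost.
    void def(String x) { shadow.put(x, 1L); }

    // Parameter binding: the shadow value is copied without extra cost.
    void copy(String x, String y) { shadow.put(x, shadow.getOrDefault(y, 0L)); }

    // x = op(operands): shadow(x) = 1 + sum of the operands' shadows.
    void compute(String x, String... operands) {
        long cost = 1L;
        for (String op : operands) cost += shadow.getOrDefault(op, 0L);
        shadow.put(x, cost);              // may double-count shared producers
    }

    long costOf(String x) { return shadow.getOrDefault(x, 0L); }

    public static void main(String[] args) {
        NaiveCostTracker t = new NaiveCostTracker();
        t.def("a");                 // line 1: a = 0;           ta = 1
        t.copy("e", "a");           // call f(a): e receives a's shadow
        t.compute("retF", "e");     // line 6: return e >> 2;   tf = te + 1 = 2
        t.compute("c", "retF");     // line 2: c = f(a);        tc = tf + 1 = 3
        t.compute("d", "c");        // line 3: d = c * 3;       td = tc + 1 = 4
        t.compute("b", "c", "d");   // line 4: b = c + d;       tb = tc + td + 1
        System.out.println(t.costOf("b"));  // prints 8, although only 5
                                            // instructions transitively produce b
    }
}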

It is easy to calculate in Figure 4.1(a) that the value of tb is 8. However, the number of instructions that the instruction at line 4 transitively depends on is 5, shown in

Figure 4.1(b). These two numbers differ because c contributes to the generation of both d and b, and including the costs of both d and c in the cost of b actually counts the cost of c twice. While this difference is small for the example, it can quickly accumulate and become significant when the size of the program increases.

For example, we have observed, using our tool, that this cost can quickly grow and cause a 64-bit integer to overflow for even moderate-size applications.

    1  a = 0;             ta = 1;
    2  c = f(a);          tc = tf + 1;
    3  d = c * 3;         td = tc + 1;
    4  b = c + d;         tb = tc + td + 1;
    5  int f(int e) {
    6    return e >> 2;   tf = te + 1;
    7  }

Figure 4.1: (a) A simple program and the updates of the tracking data; (b) Corresponding data dependence graph; an edge b → a shows that instruction a uses the value defined by instruction b.

In addition, such dynamic tracking cannot provide any additional information about the data

flow, and thus, its usefulness for helping diagnose performance problems is limited.

Backward dynamic flow problems To avoid double-counting in the computation of cost cb, one could record all instruction instances before b is written and their dependences, i.e., the dependence graph in Figure 4.1(b). Cost cb can then be computed by traversing backward the dependence graph and counting the number of instructions. There are many other dynamic analysis problems that have similar characteristics. These problems, for example, include null value propagation analysis [24], dynamic object type-state checking [9], event-based execution fast forwarding [169], copy chain profiling [156], etc. One common feature of this class of dynamic analyses is that they require additional trace information recorded for the purpose of diagnosis or debugging, as opposed to simpler approaches such as taint analysis. Because the solutions these analyses need to compute are history-related

and can be obtained by traversing backward the recorded traces, we will refer to them

as backward dynamic flow (BDF) problems.

In general, BDF problems can be solved by dynamic slicing [150,165,166,168]. In our example of cost computation, cb is essentially the size of the data-dependence- based backward dynamic slice starting from the instruction that directly produces b

(i.e., b = c + d). While there exists a range of static optimization [108,166,169] and online data compression [79, 150, 167] techniques, it is still extremely expensive to perform whole-program dynamic slicing, and no existing algorithms have been shown to scale to large and long-running Java applications. This is because the amount of memory needed for whole-program dynamic slicing is unbounded, and is determined completely by the run-time behavior of the program.

We introduce abstract dynamic thin slicing, a technique that applies dynamic thin slicing [134] over bounded abstract domains. The resulting dependence graph contains abstractions of instructions, rather than their actual run-time instances.

In the context of cost computation, instead of calculating the actual costs for the values produced by concrete instruction instances, they are approximated using the abstract costs for the abstractions of instructions, which are essentially the aggrega- tions of costs for concrete instructions. This technique has two advantages. First, the amount of memory required for the dependence graph is bounded by the number of abstractions, which significantly reduces the space overhead. Second, while slicing over concrete domains can be more precise, the costs computed for concrete instruc- tions eventually have to be aggregated in order to present a meaningful report to the user. When applying slicing over abstract domains, it is not necessary to perform aggregation, because dependence graph nodes are already abstractions of concrete

instructions. Section 4.2 shows that, in addition to cost-benefit analysis, such slicing

can solve a range of other BDF problems.

Thin slicing [134] is a technique that considers only direct locations that are part of the data flow in the generated slice, while filtering out locations that are indirectly used to obtain the direct locations. For instance, for a seed statement a.f = b, its thin slice consists of the statements that contribute to the generation of the value in location a.f, but excludes the statements that contribute to the generation of the object reference a. This technique is particularly suitable for our cost and benefit computation, because the cost of producing the value in a.f should accurately reflect the effort made only to produce this value, but not the cost of forming the reference value in a. This property makes thin slicing especially attractive for computing costs for programs that make extensive use of object-oriented data structures—with traditional slicing, the cost of each data element retrieved from a data structure would include the cost of producing the object references that form the layout of the data structure, resulting in significant imprecision.

To help the programmer diagnose performance problems, we propose several cost- benefit analyses that take as input the abstract dynamic dependence graph and report information related to the underlying causes. Section 4.3 defines in detail one of these analyses, which computes costs and benefits at the level of data structures by aggregating costs and benefits for individual heap locations.

The proposed analyses were implemented in the J9 VM and were successfully applied to large-scale and long-running applications such as derby, tomcat, and trade. Section 4.4 presents an evaluation of analysis cost. Similarly to previous work [102], a shadow heap is used to record information about fields of live objects; this shadow

heap has the same size as the original Java heap. The use of a shadow heap is not essential; for example, it could be replaced with a hash table that maps heap locations to their tracking data. Due to the abstractions being used, the additional memory needed to maintain the dependence graph is small.

The current prototype implementation imposes an average slowdown of 71 times when whole-program tracking is enabled. While this overhead is too high for production runs, it is acceptable for performance tuning and debugging, as it can reduce sig- nificantly the amount of human effort required to track down complex performance problems. Even without any overhead reduction efforts, the current implementation is able to analyze large production applications. We also show that it is possible to significantly reduce the overhead (i.e., by up to 10 times) by enabling tracking only for relevant components instead of the entire program.

Section 4.4 describes six case studies with real-world applications. Using the tool, we found hundreds of performance problems, and eliminating these problems resulted in 2% – 37% performance improvement. These problems include inefficiencies caused by common programming idioms, repeated work whose results need to be cached, computation of redundant data, and choices of unnecessary expensive operations.

Some of these findings also provide useful insights for automatic code optimization in compilers.

The contributions of this work are:

• Cost and benefit profiling, a methodology that identifies run-time inefficiencies

by understanding the cost of producing values and the benefit of consuming

them.

• Abstract dynamic thin slicing, a general technique that performs dynamic thin

slicing over bounded abstract domains. It produces much smaller and more

63 relevant slices than traditional dynamic slicing and can be used for cost-benefit

analysis and for other dynamic analyses.

• Relative cost-benefit analysis which reports data structures that have unbal-

anced cost-benefit rates.

• A J9-based implementation and six case studies using real-world programs,

demonstrating that the tool can help a programmer to find opportunities for

performance improvements.

4.2 Cost Computation Using Abstract Slicing

In this section, we first formalize our abstract dynamic thin slicing technique and show example clients that can take advantage of this technique. We then give the definition of cost and present a runtime profiling technique that constructs the dependence graph. While our tool works on the low-level JVM intermediate representation, the presentation of the algorithms requires the three-address-code representation of the program. In this representation, each statement corresponds to a bytecode instruction (i.e., it is either an assignment or a computation that contains only one operator). This makes it straightforward to traverse the dependence graph to compute cost, as each statement has unit cost. We will use the terms statement and instruction

interchangeably in this chapter, both meaning a statement in the three-address-code

representation.

4.2.1 Abstract Dynamic Thin Slicing

Figure 4.2: Data dependence graphs for three BDF problems. Line numbers are used to represent the corresponding instructions. Arrows with solid lines are def-use edges. (a) Null origin tracking. (b) Typestate history recording; arrows with dashed lines represent “next-event” relationships. (c) Extended copy profiling; Oi denotes the allocation site at line i.

In dynamic slicing [76], the instrumented program is first executed to obtain an execution trace with control flow and memory reference information. At a pointer

dereference, both the data that is referenced and the pointer value (i.e., the address

of the data) are captured. Our technique considers only data dependences and the

control predicates are treated in a special way as described later. Based on a dynamic

data dependence graph inferred from the trace, a slicing algorithm is executed. Let

I be the domain of static instructions and N be the domain of natural numbers.

Definition 4.2.1 (Dynamic Data Dependence Graph). A dynamic data dependence graph (V, E) has node set V ⊆I×N , where each node is a static instruction annotated

with an integer j, representing the j-th occurrence of this instruction in the trace. An

edge from aj to bk (a, b ∈ I and j, k ∈N ) shows that the j-th occurrence of a writes

a location that is then used by the k-th occurrence of b, without an intervening write

to that location. If an instruction accesses a heap location through v.f, the reference

value in stack location v is also considered to be used.

Thin slicing [134] is a static technique that focuses on statements that flow values

to the seed, ignoring the uses of base pointers. In this chapter, the technique is re-

stated for dynamic analysis. A thin data dependence graph, formed from the execution

trace, has exactly the same set of nodes as its corresponding dynamic data dependence

graph. However, for an access v.f, the base pointer value in v is not considered to be used. A thin data dependence graph contains fewer edges and leads to smaller slices. Both for standard and thin dynamic slicing, the amount of memory required for representing the dynamic dependence graph cannot be bounded before or during the execution.

For some BDF problems, there exists a certain pattern of backward traversal that

can be exploited for increased efficiency. Among the instruction instances that are

traversed, equivalence classes can usually be seen. Each equivalence class is related

to a certain property of an instruction from the program code, and distinguishing

instruction instances in the same equivalence class (i.e., with the same property)

does not affect the analysis precision. Moreover, it is only necessary to record one

instruction instance online as the representative for that equivalence class, leading to

significant space reduction of the generated execution trace. Several examples of such

problems will be discussed shortly.

To solve such BDF problems, we propose to introduce the semantics of a target

analysis into profiling, by defining a problem-specific bounded abstract domain D

containing identifiers that define equivalence classes in N . An unbounded subset of

elements in N can be mapped to an element in D. For a particular instruction a ∈ I,

an abstraction function fa : N → D is used to map a^j, where j ∈ N, to an abstracted instance a^d. This yields an abstraction of the dynamic data dependence graph. For

our purposes we are interested in thin slicing. The corresponding dependence graph

will be referred to as an abstract thin data dependence graph.

Definition 4.2.2 (Abstract Thin Data Dependence Graph). An abstract thin data dependence graph (V′, E′, F, D) has node set V′ ⊆ I × D, where each node is a static instruction annotated with an element d ∈ D, denoting the equivalence class of instances of the instruction mapped to d. An edge from a^j to b^k (a, b ∈ I and j, k ∈ D) shows that an instance of a mapped to a^j writes a location that is used by an instance of b mapped to b^k, without an intervening write to that location. If an instruction accesses the heap through v.f, the base pointer value v is not considered to be used. F is a family of abstraction functions fa, one per instruction a ∈ I.

For simplicity, we will use “dependence graph” to refer to the abstract thin data dependence graph defined above. The number of static instructions (i.e., the size of I) is relatively small even for large-scale programs, and by carefully selecting domain D and abstraction functions fa, it is possible to require only a small amount of memory for the graph and yet preserve necessary information needed for a target analysis.
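One possible in-memory representation of such a graph is sketched below; the class and method names are hypothetical, and the sketch only illustrates how keying nodes by (instruction, abstract domain element) keeps the graph bounded while frequencies record how many concrete instances each node stands for.

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.function.LongFunction;

final class AbstractDepGraph<D> {
    static final class Node<D> {
        final int instructionId;   // element of I
        final D abstraction;       // element of the bounded domain D
        long frequency;            // number of concrete instances mapped here
        Node(int instructionId, D abstraction) {
            this.instructionId = instructionId;
            this.abstraction = abstraction;
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Node)) return false;
            Node<?> n = (Node<?>) o;
            return instructionId == n.instructionId
                && Objects.equals(abstraction, n.abstraction);
        }
        @Override public int hashCode() {
            return 31 * instructionId + Objects.hashCode(abstraction);
        }
    }

    private final Map<Node<D>, Node<D>> nodes = new HashMap<>();
    private final Map<Node<D>, Set<Node<D>>> defUse = new HashMap<>();

    // Record the j-th instance of instruction a, abstracted by f_a.
    Node<D> instance(int a, long j, LongFunction<D> fa) {
        Node<D> node = nodes.computeIfAbsent(new Node<>(a, fa.apply(j)), k -> k);
        node.frequency++;          // memory stays bounded by |I| x |D|
        return node;
    }

    // def wrote a location that use now reads, with no intervening write.
    void addEdge(Node<D> def, Node<D> use) {
        defUse.computeIfAbsent(def, k -> new HashSet<>()).add(use);
    }
}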

Many BDF problems exhibit bounded-domain properties. Their analysis-specific dependence graphs can be obtained by defining the appropriate abstraction functions.

The following examples show a few analyses and their formulations in our framework.

Note that although some of these analyses can be implemented in simpler ways (e.g., they do not need tracking all instructions and their dependences), we formulate them as abstract slicing problems to show the general applicability of our framework. In reality, their implementations can be easily derived from our framework by, for ex- ample, profiling other kinds of edges between instruction abstractions (rather than def-use edges).

67 Propagation of null Values When a NullPointerException is observed

in the program, this analysis locates the program point where the null value starts

propagating and the propagation flow. Compared to existing null value tracking

approaches (e.g., [24]) that track only the origin of a null value, this analysis also

provides information about how this value flows to the point where it is derefer-

enced, allowing the programmer to quickly track down the bug. Here, D contains

two elements null and not null. Abstraction function fa(j) = null if a^j produces null, and not null otherwise. Based on the dependence graph, the analysis traverses backward from node a^null, where a ∈ I is the instruction whose execution causes the

NullPointerException. The node that is annotated with null and that does not have incoming edges represents the instruction that created the null value originally.

Figure 4.2(a) shows an example of this analysis. Annotation nn denotes not null. A

NullPointerException is thrown when line 4 is reached.
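A minimal sketch of this instantiation, with hypothetical names, is shown below; the abstract domain has exactly two elements, and the abstraction function inspects only the value produced by an instruction instance.

// Hypothetical sketch of the null-propagation instantiation.
enum NullAbstraction { NULL, NOT_NULL }

final class NullOriginTracking {
    // f_a(j): abstract the j-th instance of instruction a by the value it wrote.
    static NullAbstraction abstraction(Object producedValue) {
        return producedValue == null ? NullAbstraction.NULL : NullAbstraction.NOT_NULL;
    }
    // To find the origin of a NullPointerException, traverse the dependence
    // graph backward from the NULL-annotated node of the faulting instruction
    // and stop at a NULL-annotated node that has no incoming edges.
}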

Recording Typestate History Proposed in QVM [9], this analysis tracks the typestates of the specified objects and records the history of the state changes. When the typestate protocol of an object is violated, it provides the programmer with the recorded history. Instead of recording every single event in the trace, a summarization approach is employed to merge these events into DFAs. We show how this analysis can be formulated as an abstract slicing problem, and the DFAs can be easily derived from the dependence graph.

Domain D is O × S, where O is a specified set of allocation sites (whose objects need to be tracked) and S is a set of predefined states s0,s1,...,sn of the objects cre-

ated by the allocation sites in O. Abstraction function fa(j) = (alloc(a^j), state(a^j)) if instruction instance a^j invokes a method on an object created by an allocation site in O, and the method can cause the object to change its state. The function is undefined otherwise (i.e., all

other instructions are not tracked). Here alloc is a function that returns the allo- cation site of the receiver object at aj, and function state returns the state of this object immediately before aj. The state can be stored as a tag of the object, and updated when a method is invoked on this object.

An example is shown in Figure 4.2(b). Consider the object O1 created at line 1, with states ’u’ (uninitialized), ’oe’ (opened and empty), ’on’ (opened but not empty), and ’c’ (closed). Arrows with dashed lines denote the “next-event” relationships.

These relationships are added to the graph for constructing the DFA described in [9], and they can be easily obtained by memorizing the last event on each tracked object.

When line 10 is executed, the typestate protocol is violated because the file is read after it is closed. The programmer can easily identify the problem when she inspects the graph and finds that line 10 is executed on a closed file. While the example shown in Figure 4.2(b) is not strictly a dependence graph, the “next-event” edges can be conceptually thought of as def-use edges among nodes that write and read the object state tag.
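The following sketch (hypothetical protocol and names) illustrates how the (allocation site, state) tags and the “next-event” history could be maintained for a file-like object:

import java.util.ArrayList;
import java.util.List;

final class TypestateRecorder {
    enum State { UNINIT, OPEN_EMPTY, OPEN_NONEMPTY, CLOSED }

    static final class Tag {                 // stored with each tracked object
        final int allocSite;
        State state = State.UNINIT;
        Tag(int allocSite) { this.allocSite = allocSite; }
    }

    // "Next-event" edges recorded as (allocSite, stateBefore, event, stateAfter).
    private final List<String> history = new ArrayList<>();

    void onEvent(Tag tag, String event, State stateAfter) {
        history.add(tag.allocSite + ": " + tag.state + " --" + event + "--> " + stateAfter);
        tag.state = stateAfter;
    }

    public static void main(String[] args) {
        TypestateRecorder r = new TypestateRecorder();
        Tag file = new Tag(1);                       // object allocated at line 1
        r.onEvent(file, "create", State.OPEN_EMPTY);
        r.onEvent(file, "put", State.OPEN_NONEMPTY);
        r.onEvent(file, "close", State.CLOSED);
        r.onEvent(file, "get", State.CLOSED);        // read after close: violation
        r.history.forEach(System.out::println);
    }
}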

Extended Copy Profiling Chapter 3 [156] describes how to profile copy chains that represent the transfer of the same data without any computation. Nodes in a copy chain are fields of objects represented by their allocation sites

(e.g., Oi.f). An edge connects two field nodes, abstracting away intermediate stack copies. Each stack variable has a shadow stack location, which records the field from which its value originated. An extended version of this analysis is to include interme- diate stack nodes along copy chains, because they are important for understanding the methods through which the values are transferred.


Figure 4.3: (a) Code example. (b) Corresponding dependence graph Gcost; nodes in boxes/circles write/read heap locations; underlined nodes create objects. (c) Nodes in method A.foo (Node), their frequencies (Freq), and their abstract costs (AC ); (d) Relative abstract costs i-RAC and benefits i-RAB for the three allocation sites; i is the level of reference edges considered.

Now we show that this complex dynamic analysis can also be instantiated in our framework using abstract slicing. The key to the analysis is the ability to distinguish instructions that access data originating from fields of different allocation sites. Hence, the abstraction domain D is the Cartesian product O × F, where O is the domain of allocation sites, and F is the domain of field identifiers. We allow D to have a special element ⊥, representing that the current data does not come from any field (e.g., it is a constant, a newly-created object, or a result of a computation instruction).

Abstraction function fcp(ins, i) = map(shadow(ins^i)) if ins is an assignment instruction, or ⊥ otherwise. Here function shadow maps an instruction instance to the tracking data (i.e., an object field) contained in the shadow location of its lhs variable, and function map maps a field o.g to its static abstraction new O.g. An example of this analysis is shown in Figure 4.2(c). For example, in order to identify the intermediate stack locations in the copy chain between locations O1.f and O3.f, one can traverse the generated dependence graph backward from node 6^{O3.f} (which writes field f of the object created by allocation site O3). The traversal follows nodes that have the same annotation O1.f until it reaches the node that reads the field O1.f (i.e., node 2^{O1.f}).

Similarly to the way that a data flow analysis [116, 120] or an abstract inter- preter [39] deals with static data flow, abstract slicing employs abstract domains to handle dynamic data flow problems, recognizing that it is only necessary to distinguish instruction instances that are critical to the program property that the target analysis intends to discover. Note that there are many dynamic analyses that cannot be for- mulated as abstract slicing problems, because they do not exhibit bounded-domain properties. For example, an analysis that automatically finds bugs by comparing traces of a successful run and a failed run has to inspect the two entire execution paths. Abstractions in the dependence graph may cause the differencing algorithm to report imprecisely the cause of the bug.

4.2.2 Cost Computation

Definition 4.2.3 (Absolute Cost). Given a non-abstract thin data dependence graph G and an instruction instance a^j (a ∈ I, j ∈ N) that produces a value v, the absolute cost of v is the number of nodes that can reach a^j in G.

Absolute costs are expensive to compute and it does not make much sense to present them to the programmer, unless they are aggregated in some meaningful way across instruction instances so that they can help understand the overall execution. In our approach the instructions are abstracted based on dynamic calling contexts. The

contexts are represented with object sensitivity [91], which is well suited for modeling

of object-oriented data structures.

A calling context is represented by a chain of static abstractions (i.e., allocation sites Oi ∈ O) of the receiver objects for the invocations on the call stack. Domain

Dcost contains all possible chains of allocation sites. Abstraction function fa(j) =

objCon(cs(aj )), where function cs takes a snapshot of the call stack when aj is

executed, and function objCon maps this snapshot to the corresponding chain of

allocation sites Oi for the run-time receiver objects. Dcost is not finite in the presence

of recursion, and even for a recursion-free program its size is exponential. We limit

the size of Dcost further to be a fixed number s (short for “slots”), specified by the

user as a parameter of the profiling tool. Now the domain is simply the set of integers

0 to s−1. An encoding function h is used to map an allocation site chain to such

an integer; the description of h will be presented shortly. With this approach, the

amount of memory required for the analysis is linear in program size.

Each node in the dependence graph is annotated with an integer, representing the

execution frequency of the node. Based on these frequencies, an abstract cost for each

node can be computed as an approximation of the total costs of values produced by

the instruction instances represented by the node.

Definition 4.2.4 (Abstract Cost). Given a dependence graph Gcost, the abstract cost of a node n^k is Σ_{a^j | a^j ⇝ n^k} freq(a^j), where a^j ⇝ n^k if there is a path from a^j to n^k in Gcost, or a^j = n^k.
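Assuming the dependence graph exposes each node's execution frequency and its predecessors (the nodes it directly depends on), the abstract cost can be obtained by a backward worklist traversal that sums frequencies over all backward-reachable nodes, as in this hypothetical sketch:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class AbstractCost {
    // frequency: node id -> execution frequency
    // predecessors: node id -> nodes it directly depends on (def-use edges)
    static long abstractCost(String target,
                             Map<String, Long> frequency,
                             Map<String, List<String>> predecessors) {
        Set<String> visited = new HashSet<>();
        Deque<String> worklist = new ArrayDeque<>();
        worklist.push(target);
        long cost = 0;
        while (!worklist.isEmpty()) {
            String node = worklist.pop();
            if (!visited.add(node)) continue;            // count each node once
            cost += frequency.getOrDefault(node, 0L);
            for (String pred : predecessors.getOrDefault(node, List.of())) {
                worklist.push(pred);
            }
        }
        return cost;
    }
}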

Example Figure 4.3 shows a code example and its dependence graph for cost

computation. While some statements (line 29) may correspond to multiple bytecode

instructions, they are still considered to have unit costs. These statements are shown

for illustration purposes and will be broken into multiple ones by our tool.

All nodes are annotated with their object contexts (i.e., elements of Dcost ). For ease of understanding, the contexts are shown in their original forms, and the tool actually uses the encoded forms (through function h). Nodes in boxes represent

instructions that write heap locations. Dashed arrows represent reference edges; these

edges can be ignored for now. The table shown in part (c) lists nodes for the execution

of method A.foo (invoked by the call site at line 34), their frequencies, and their

abstract costs.

The abstract cost of a node computed by this approach may be larger than the

exact sum of absolute costs of the values produced by the instruction instances rep-

resented by the node. This is because for a node a such that a ⇝ n, there may not exist any dependences between some instruction instances of a and some instruction

instances of n. This difference can be large when the abstract cost is computed after traversing long dependence graph paths, and the imprecision gets magnified. More importantly, this cost represents the cumulative effort that has been made from the very beginning of the execution to produce the values. It may still not make much sense for the programmer to diagnose problems using abstract costs, as it is almost certain that nodes representing instructions executed later will have larger costs than those representing instructions executed earlier. In Section 4.3, we address this prob- lem by computing a relative abstract cost, which measures execution bloat at the object level by traversing dependence graph paths connecting nodes that read and write object fields.

Special nodes and edges in Gcost To measure execution bloat, we augment the graph with two special kinds of nodes: predicate nodes and native nodes, both representing the consumption of data. A predicate node is created for each if statement, and a native node is created for each call site that invokes a native method.

These nodes do not have associated contexts. In addition, we mark nodes that allo- cate objects (underlined in Figure 4.3 (b)), that read heap locations (nodes in circles), and that write heap locations (nodes in boxes). These nodes are later used to identify object structures.

Reference edges are used to represent reference relationships. For each heap store a.f = b, a reference edge is created to connect the node representing this store (i.e., a boxed node) and the node allocating the object that flows to a (i.e., an underlined node). For example, there exists a reference edge from 28^{O32} to 24^{O32}, because 24^{O32} allocates the array object and 28^{O32} stores an integer to the array (which is similar to writing an object field). These edges will be used to aggregate costs for individual heap locations to form costs for objects and data structures.

4.2.3 Construction of Gcost

Selecting encoding function h There are two steps in mapping an allocation site chain to an integer d ∈ Dcost (i.e., [0,...,s−1]). The first step is to encode the

chain into a probabilistically unique value that will accurately represent the original

object context chain. An encoding function proposed in [21] is adapted to perform

this mapping: g_i = 3 * g_{i−1} + o_i, where o_i is the i-th allocation site ID in the chain and g_{i−1} is the probabilistic context value computed for the chain prefix of length i−1.

While simple, this function exhibits very small context conflict rate, as demonstrated

in [21]. In the second step, this encoded value is mapped to an integer in the range

[0,...,s−1] using a simple mod operation.
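A sketch of this two-step encoding, with hypothetical names, is shown below:

final class ContextEncoder {
    // Step 1: fold the allocation-site chain into g_i = 3 * g_{i-1} + o_i;
    // step 2: reduce the result to one of s context slots with a mod operation.
    static int encode(int[] allocSiteChain, int slots) {
        long g = 0;
        for (int o : allocSiteChain) {
            g = 3 * g + o;
        }
        return (int) Math.floorMod(g, (long) slots);   // value in [0, s-1]
    }
}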

Profiling for constructing Gcost Instead of recording the full execution trace and building a dependence graph offline, we use an online approach that combines dynamic flow tracking and slicing. The key issue is to identify the data dependences online, which can be done using shadow locations [99, 102]. For each location l, a shadow location l′ contains the address of the dependence graph node representing the instruction instance that wrote the last value of l. When an instruction is executed, the node n to which this instruction instance is mapped is retrieved, and all nodes mi that last wrote the locations read by the instruction are identified. Edges are then added between n and each mi.

For a local variable, its shadow location is a new local variable on the stack. For heap locations we use a shadow heap [102] that has the same size as the Java heap. To enable quick access, there is a predefined distance dist between the starting addresses of these two heaps. For a heap location l, the address of l′ can be quickly obtained as l + dist.
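The constant-offset lookup can be pictured as follows. Because this sketch is plain Java rather than VM-internal code, the two heaps are modeled as the two halves of one array and dist is an index offset; the names are hypothetical.

final class ShadowHeap {
    private final long[] cells;   // [0, dist) = "heap", [dist, 2*dist) = shadow
    private final int dist;

    ShadowHeap(int heapSize) {
        this.dist = heapSize;
        this.cells = new long[2 * heapSize];
    }

    // The shadow of location l holds (the id of) the node that last wrote l.
    long readShadow(int location)             { return cells[location + dist]; }
    void writeShadow(int location, long node) { cells[location + dist] = node; }
}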

A tracking stack is maintained in parallel with the call stack to pass data depen- dence relationships across calls. For each invocation, the tracking stack also passes the receiver object chain for the caller. The next context is obtained by concatenating the caller’s chain and the allocation site of this.

Instrumentation semantics Figure 4.4 shows a list of inference rules defin- ing the instrumentation semantics. Each rule is of the form V, E, H, S, P, T ⇒a:i=...

V′, E′, H′, S′, P′, T′ with unprimed and primed symbols representing the state before and after the execution of statement a. In cases where a set does not change (e.g., when S = S′), it is omitted.

Rules: Alloc, Assign, Computation, Predicate, Load Field, Store Field, Load Static, Store Static, Method Entry, Return.

Figure 4.4: Inference rules defining the run-time effects of instrumentation.

Node domain V contains nodes of the form a^h(c), where a denotes the instruction and h(c) denotes the encoded integer of the object context c. Edge domain E : V × V is a relation containing dependence relationships of the form a^l ⊳ k^n, which represents that an instance of a abstracted as a^l is data dependent on an instance of k abstracted as k^n. Shadow environment S : M → V maps a run-time storage location to the content in its corresponding shadow location (i.e., to its tracking data). Here M is the domain of memory locations. For each location, its shadow location contains the (address of the) node that performs the most recent write to this location. Rules Assign, Computation, Predicate, Load Static, and Store Static update the environments in expected ways. In rule Predicate, instruction instances are not distinguished and the node is represented by a^ε.

Here, type can be ′U ′ (i.e., underlined) representing the allocation of an object, ′B ′

(i.e., boxed) representing a field store, or ′C ′ (i.e., circled) representing a field load.

Elements alloc and field denote the object and the field on which the effect occurs.

For instance, triple (′U ′, O, ′ ′) means that a node contains an allocation site O, while triple (′B ′, O, f) means that a node writes to field f of an object created by allocation site O. A reference edge can be added between a (store) node with effect

(′B ′, O, ∗) and another (allocation) node with effect (′U ′, O, ′ ′), where ∗ represents any field name. In order to perform this matching, we need to provide access to the allocation site ID for each run-time object. This is done using tag environment P that maps a run-time object to its allocation site ID.

However, the reference edge could be spurious if the store node and the allocation node are connected using only allocation site ID O, because the two effects (i.e., ′B ′ and ′U ′) could occur on different instances created by O. To improve the precision of the client analyses, object context is used again to annotate allocation sites. For example, in rule Alloc, H is updated with effect triple (′U ′, (new X)h(c), ′ ′), where the allocation site new X is annotated with the encoded context integer h(c). This triple matches only (store) node with effect (′B ′, (new X)h(c), ∗), and many spurious reference edges can thus be eliminated. In rule Alloc, (new X)h(c) is used to tag

the newly-created run-time object o (by updating tag environment P), and this in-

formation will be retrieved later when o is dereferenced. In rules Load Field and

Store Field, ov denotes the run-time object that variable v points to. P(ov) is used

to retrieve the allocation site (annotated with the context) of ov, which is previously

set as ov’s tag upon its allocation.

The last two rules show the instrumentation semantics at the entry and the return

site of a method, respectively. At the entry of a method with n parameters, tracking

stack T contains the tracking data for the actual parameters of the call, as the n

top elements T(1),..., T(n), followed by the receiver object chain for the caller of

the method (as element T(n + 1)). In rule Method Entry, the tracking data

for a formal parameter ti is updated with the tracking data for the corresponding

actual parameter (stored in T(i)). The new object context is computed by applying

concatenation operator ◦ to the old chain T(n + 1) and the allocation site of the

run-time receiver object othis pointed to by this (or an empty string if the current

method is static). Function AllocId removes the context annotation from the tag

of othis, leaving only the allocation site ID. The stack is updated by removing the tracking data for the actuals, and storing the new context on the top of the stack.

This new context is available for use by all rules applied in the body of the method

(denoted by c in those rules). At the return site, T is updated to remove the current context and to store the tracking data for the return variable i.

The rule for call sites is not shown in Figure 4.4, as it requires splitting a call site into a call part and a return part, and reasoning about both of them. Immediately

before the call, the tracking data for the actual parameters is pushed on tracking stack

T. Immediately after the call, the tracking data for the returned value is popped from

T and used to update the dependence graph and the shadow location for the left-hand-

side variable at the call site. If the method invoked at the call site is a native method,

we create a node (without context) for it, and add edges between each node contained

in the shadow locations of the actual parameters and this node, representing that the

values of parameters are consumed by this native method.

Implementation of P A natural idea of implementing object tagging is to save the tag in the header of each object (i.e., the header usually has unused space).

However, in the J9 VM that we use, this 64-bit header cannot be modified. To solve this problem, the corresponding 64 bits on the shadow heap are used to store the object tag. Hence, although environments P and S have different mathematical

meanings, both are implemented using shadow locations.

4.3 Relative Object Cost-Benefit Analysis

This section describes a novel diagnosis technique that identifies data structures

with high cost-benefit rates. As discussed in Section 4.4, this analysis effectively

uncovers significant optimization opportunities in six large real-world applications.

We propose to compute a relative abstract cost for an object, which measures the

effort of constructing the object from data already available in fields of other objects

(rather than the cumulative effort from the beginning of the execution). Similarly,

we compute a relative abstract benefit for an object, which explains how the data contained in the object is used to construct other objects. These metrics can help a programmer pinpoint specific objects that are expensive to construct (e.g., there are large costs of computing the data being written into this object) but are not very

useful (e.g., the only use of this object is to make a clone of it and then invoke methods

on the clone).

We first develop an object cost-benefit analysis that aggregates relative costs and

benefits for individual fields of an object in order to compute the cost and benefit for

the object itself. Next, the cost and benefit for a higher-level data structure is ob-

tained in a similar manner, by gathering costs and benefits of lower-level objects/data

structures accessible through reference edges.

4.3.1 Analysis Algorithm

Definition 4.3.1 (Relative Abstract Cost). Given Gcost, the heap-relative abstract cost (HRAC) of a node n^k is Σ_{a^j | a^j ⇀ n^k} freq(a^j), where a^j ⇀ n^k if a^j reaches n^k in Gcost and there exists a path from a^j to n^k such that no node on the path reads from a static or object field. The relative abstract cost (RAC) for an object field represented by O^d.f is the average HRAC of the store nodes n^k that write to O^d.f.

Consider the entire flow of a piece of data (from the input of the program to its output) during the execution. This flow consists of multiple hops of data transformations among heap locations. Each hop performs the following three steps: reading values from heap locations, performing stack copies and computations on them, and writing the results to other heap locations. Consider one single hop with multiple sources and one target along the flow, which reads values from heap locations l1, l2, ..., ln, transforms them to produce a new value, and writes it back to heap location l′. The

RAC of l′ measures the amount of work needed (on the stack) to complete this hop of transformations.

The computation of HRAC for a node n^k requires a backward traversal from n^k, which finds all nodes on the paths between each heap-reading node and n^k, and calculates the sum of their frequencies. For example, the HRAC for node 35^ǫ is only 1 (instead of 4007), because the node depends directly on a node (i.e., 4^O33) that reads heap location this.t. The RAC for a heap location is the average HRAC of the nodes that can write this location. For example, the RAC for O33^ǫ.t is the HRAC for 19^O33, which is 4005. The RAC for O24^O32.ELM (i.e., the elements of the array object) is 2, which equals the HRAC of node 28^O32 that writes this field.
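To make the backward traversal concrete, the following is a minimal sketch over an assumed Node type with predecessors(), frequency(), and isHeapRead(); this is an illustration only, not the tool's actual data structures:

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashSet;
    import java.util.Set;

    // Sketch: HRAC(n) sums the frequencies of all nodes that reach n along paths
    // containing no heap-reading node; the walk stops at nodes that read a static
    // or object field, and their frequencies are not counted.
    class HracSketch {
        interface Node {
            long frequency();
            boolean isHeapRead();
            Iterable<Node> predecessors();
        }

        static long hrac(Node target) {
            long sum = 0;
            Set<Node> visited = new HashSet<>();
            Deque<Node> worklist = new ArrayDeque<>();
            worklist.push(target);
            visited.add(target);
            while (!worklist.isEmpty()) {
                Node n = worklist.pop();
                if (n != target && n.isHeapRead()) continue; // a new hop starts here
                sum += n.frequency();
                for (Node pred : n.predecessors()) {
                    if (visited.add(pred)) worklist.push(pred);
                }
            }
            return sum;
        }
    }

The RAC of a heap location is then the average of hrac(n) over the store nodes n that write that location.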

Definition 4.3.2 (Relative Abstract Benefit). Given Gcost, the heap-relative abstract benefit (HRAB) of a node n^k is Σ_{a^j | n^k ⇁ a^j} freq(a^j), where n^k ⇁ a^j if n^k reaches a^j in Gcost and there exists a path from n^k to a^j such that no node on the path writes to a static or object field. The relative abstract benefit (RAB) for an object field represented by O^d.f is the average HRAB of load nodes n^k that read from O^d.f.

Symmetric to the definition of RAC that focuses on how a heap value is produced, the RAB for l explains how a heap value is consumed. Consider again one single hop

(but with one source and multiple targets) along the flow, which reads a value from

location l, transforms this value (together with values read from other locations), and

writes the results to a set of other heap locations l′1, l′2, ..., l′n. The RAB of l measures the amount of work performed (on the stack) to complete this hop of transformations.

For example, the RAB for O33^ǫ.t is the HRAB of node 4^O33 that reads this field, which is 2 (because the relevant nodes a^j are only 4^O33 and 35^ǫ). Figure 4.5 (a) illustrates

the computation of RAC and RAB.


Figure 4.5: (a) Relative abstract cost and benefit. Nodes considered in computing RAC and RAB for O.g (where O is the allocation site for the object referenced by z) are included in the two circles, respectively; (b) Illustration of n-RAC and n-RAB for the object created by o = new O; dashed arrows are reference edges.

This definition of benefit captures both the frequency and the complexity of data use. First, the more target heap values that the value read from l is used to (transitively) produce, the larger benefit location l can have for the construction of these other objects. Second, the more effort is made to transform the value from l to other heap values, the larger benefit l can have. This is because the purpose of writing a value into a heap location is, intuitively, to keep the value so that it can be reused later and the (heavy) cost of re-computing it can be avoided. Whether to store a value in a heap location is essentially a decision involving space-time tradeoffs. If l's value v can be easily converted to some other value v′ and v′ is immediately stored in another heap location (i.e., little computation performed), the benefit of keeping v in l becomes less obvious, since v and v′ may differ slightly and it may not be necessary

to use two different heap locations to cache them. In the extreme case where v′ is

simply a copy of v, the RAB for l is 1 and storing v is not desirable at all if the RAC

for l is large. Special treatment is applied to consumer nodes: we assign a large RAB

to a heap location if the value it contains can flow to a predicate or a native node.

This means the value contributes to control decision making or is used by the JVM,

and thus benefits the overall execution.

Definition 4.3.3 (n-RAC and n-RAB). Consider an object reference tree RT_n of height n rooted at O^d. The n-RAC for O^d is the sum of the RACs for all fields O_i^k.f, such that both O_i^k and the object O_i^k.f points to are in RT_n. Similarly, the n-RAB for O^d is the sum of the RABs for all such fields O_i^k.f.

The object reference (points-to) tree can be constructed by using reference edges in the dependence graph, and by removing cycles and nodes more than n reference edges away from O^d. We aggregate the RACs and RABs for individual fields through the tree edges to form the RACs and RABs for objects (when n = 1) and high-level data structures (when n > 1). Figure 4.5 (b) illustrates n-RAC and n-RAB for an object created by o = new O. The n-RAC (n-RAB) for this object includes the RAC (RAB) of each field written by a boxed node (i.e., heap store) shown in the figure. For all case studies and experiments, n = 4 was used as this is the reference chain length for the most complex container classes in the Java collection framework (i.e., HashSet).
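The aggregation itself can be sketched as a simple recursive sum over the tree; ObjNode and FieldEdge below are assumed types (built beforehand from the reference edges, with cycles already removed), not the tool's actual data structures:

    // Sketch: aggregate per-field RACs over an object reference tree of height n.
    class RacAggregator {
        interface ObjNode { Iterable<FieldEdge> fields(); }
        interface FieldEdge { double rac(); ObjNode child(); }

        // n-RAC for the root = sum of RAC(O_i.f) over every field of every object
        // within n reference edges of the root whose target is also in the tree.
        static double nRac(ObjNode root, int n) {
            if (n <= 0 || root == null) return 0.0;
            double sum = 0.0;
            for (FieldEdge e : root.fields()) {
                if (e.child() != null) {
                    sum += e.rac();
                    sum += nRac(e.child(), n - 1);  // descend one more level
                }
            }
            return sum;
        }
        // n-RAB is computed identically over per-field RAB values.
    }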

Table (d) in Figure 4.3 shows examples of 1- and 2-RACs and RABs. Both the 1-RAB and the 2-RAB for O24^O32 are 0, because the array element is never used in the code. Objects O32^ǫ and O33^ǫ have large cost-benefit rates, which indicates the existence of wasteful operations. This is indeed the case in this example: for O32^ǫ, there is an element added but never retrieved; for O33^ǫ, there is a large cost of computing the value

class ClasspathDirectory {
  boolean isPackage(String packageName) {
    return directoryList(packageName) != null;
  }

  List directoryList(String packageName) {
    List ret = new ArrayList(); /* problematic */
    // try to find all the files in the dir packageName
    // if nothing is found, set ret to null
    …
    return ret;
  }
}

Figure 4.6: Real-world example that our analysis found in eclipse.

stored in its field t, and the value is copied to another heap location (in IntList)

immediately after it is calculated. The creation of object O33^ǫ is not beneficial at all because this value could have been stored directly to the array.

Finding bloat Several usage scenarios are intended for this cost-benefit analysis. First, it can find long-lived objects that are written much more frequently than they are read. Second, it can find containers that contain many more objects than they should. These containers are often the sources of memory leaks. The analysis can

find that they have large RAC/RAB rates because few elements are retrieved and assigned to other heap locations. Third, it can find allocation sites that create large volumes of temporary (short-lived) objects. These objects are often created simply to carry data across method invocations. Data that is computed and written into them is read somewhere else and assigned to other object fields. This simple use of the data causes these objects to have large cost-benefit rates. The next section shows that our tool finds all three categories of problems in real-world Java applications.

Real-world example Figure 4.6 shows a real-world example that illustrates

how our analysis works. An object with high costs and low benefits is highlighted in

the figure. The code in the example is extracted from eclipse 3.1.2, a popular Java

development tool. Method isPackage returns true/false based on whether the given

package name corresponds to an actual Java package. This method is implemented by

calling (reusing) directoryList which invokes many other methods to compute a list of files and directories under the package specified by the parameter. isPackage then returns whether the list computed by directoryList is null. While the reference

to list ret is used in a predicate, its fields are not read and do not participate in

computations. Hence, when the RACs and RABs for its fields are aggregated based

on the object hierarchy, the imbalance between the cost and benefit for the entire List

data structure can be seen. To optimize this case, we created a specialized version of

directoryList, which returns immediately when the package corresponding to the

given name is found.
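As an illustration of the shape of this fix (the helper name hasAnyEntry, the constructor, and the directory lookup below are hypothetical; the actual eclipse 3.1.2 code is more involved), the specialization avoids building the List altogether:

    // Sketch of a specialized check that replaces directoryList() in isPackage();
    // names other than isPackage/directoryList are made up for illustration.
    class ClasspathDirectory {
        private final java.io.File root;
        ClasspathDirectory(java.io.File root) { this.root = root; }

        boolean isPackage(String packageName) {
            // Use the specialized check instead of materializing the whole list.
            return hasAnyEntry(packageName);
        }

        // Returns as soon as the directory for the package is known to exist,
        // without allocating an ArrayList at all.
        private boolean hasAnyEntry(String packageName) {
            java.io.File dir = new java.io.File(root, packageName.replace('.', '/'));
            return dir.isDirectory();
        }
    }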

4.3.2 Comparison with Other Design Choices

Cost/benefit for computation vs cost/benefit for cache Note that

the relative cost and benefit for an object are essentially measured in terms of the

computations that produce values written into the object. The goal of this analysis is

to find objects such that they contain (relatively) useless values and these values are

produced by (relatively) expensive operations. Upon identifying these operations, the

user may find more efficient ways to achieve the same functionality. This is orthogonal

to measuring the usefulness of a data structure as a cache, where the cost of the cache

should include only the instructions executed to create the data structure itself (i.e.,

without the cost of computing the values being cached) and the benefit should be

(re-)defined as a function of the amount of work cached and the number of times the

cached values are used. It would be interesting to investigate, in future work, how

these new definitions of cost and benefit can be used to find inappropriately-used

caches.

Single-hop cost/benefit vs multi-hop cost/benefit The analysis limits the scope of tracked data flow to one single hop—that is, reading data from the heap, transforming it through stack locations, and writing the results back to the heap. While this design choice can produce easy-to-understand reports, it could miss problematic data structures because of its “short-sightedness”. For example, our tool may consider a piece of data that is ultimately-dead to be appropriately used, because it is indeed involved in complex computations within the one hop seen by the analysis. To alleviate this problem, we have developed an additional analysis, based on Gcost, which identifies computations that can reach ultimately-dead values.

Section 6.5 presents measurements of redundant computations based on this analysis.

A different way of handling this issue is to consider multiple hops when computing costs and benefits based on graph Gcost , so that more detailed information about data production and consumption can be obtained. For example, costs and benefits for an instruction can be recomputed by traversing multiple heap-to-heap hops on

Gcost backward and forward, respectively, starting from the instruction. Of course, extending the inspected region of the data flow would make the report hard to verify as the programmer has to inspect larger program scopes to understand the detected problems. In future work, it would be interesting to compare empirically problems

found using different scope lengths, and to explore particular tradeoffs between the

scope length considered and the difficulty of explaining the report.

Considering vs ignoring control decision making Our analysis does not

consider the effort of making control decisions as part of the costs of computing values

under these decisions. The major reason is that by doing so we could potentially

include the costs of computing many values that are irrelevant to the value of interest

into the cost of that value, leading to imprecise and hard-to-understand reports.

However, ignoring this effort of control decision making could lead to information loss.

For example, the tool may miss problematic objects due to the underestimation of

the cost of constructing them. In future work, we will also be interested in accounting

for this effort, and investigating the relationship between the scope of control flow

decisions considered (e.g., the closest n predicates on which an instruction is control-

dependent) and the usefulness of the analysis output.

Other analyses Graph Gcost (annotated with other information) can be used as a basis for discovering a variety of performance-related program properties. For example, we have implemented a few clients that can answer specific performance-related queries. These clients include an analysis that computes method-level costs

(i.e., the cost of producing the return value of a method relative to its inputs), an analysis that detects locations that are re-written before being read, an analysis that identifies nodes producing always-true or always-false predicate conditions, and an analysis that searches for problematic collections by ranking collection objects based on their RAC/RAB rates. While these analyses are currently implemented inside a

JVM, they could be easily migrated to an offline heap analysis tool that provides

user-friendly interfaces and improved usability (i.e., the JVM only needs to write

Gcost to external storage).

4.4 Evaluation

We have performed a variety of studies with our technique using the DaCapo benchmark set [15], which contains 11 programs in its original version (from antlr to eclipse in Table 4.1 and Table 4.2) and an additional set of 7 programs in its new release (from avrora to tradesoap). We were able to run our tool on all these 18 large programs, including both client and server applications. 16 programs (except tradesoap and tradebeans) were executed with their large workloads. tradesoap and tradebeans were run with their default workloads, because these two benchmarks were not stable enough and running them with large workloads can fail even without our tool. All experiments were conducted on a 1.99GHz Dual Core machine. The evaluation has several components: cost graph characteristics, evaluation of the time and space overhead of the tool, the measurements of bloat based on nodes producing dead values, and six case studies that describe problems found by the tool in real applications.

4.4.1 Gcost Characteristics and Bloat Measurement

Parts (a) and (b) in Table 4.1 report, for two different values of s (the number of slots for each object used to represent context), the numbers of nodes and edges in

Gcost, as well as the space overheads and the time overheads of the tool. Note that all programs can successfully execute when we increase s to 32, while the offline traversal of the graph (to generate statistics) can make the tool run out of memory for some large programs. The space overhead does not include the size of the shadow heap, which

Program     |            (a) s = 8              |            (b) s = 16
            | #N(K) #E(K) M(Mb)  O(×)   CR(%)   | #N    #E    M     O      CR
antlr       | 183   689   10.2   82     0.066   | 355   949   16.1  77     0.041
bloat       | 201   434   9.8    78     0.089   | 396   914   17.4  76     0.051
chart       | 288   306   13.2   76     0.068   | 567   453   22.6  76     0.047
fop         | 195   120   8.4    45     0.067   | 381   162   14.0  46     0.045
pmd         | 184   187   8.0    55     0.075   | 365   313   13.6  96     0.052
jython      | 288   275   12.6   28     0.065   | 666   539   26.1  27     0.042
xalan       | 168   594   8.5    75     0.066   | 407   1095  18.1  74     0.044
hsqldb      | 192   110   8.0    88     0.072   | 379   132   13.7  86     0.050
luindex     | 160   177   6.7    92     0.073   | 315   331   11.5  86     0.040
lusearch    | 139   110   5.5    48     0.079   | 275   223   11.0  52     0.053
eclipse     | 525   2435  28.8   47     0.072   | 1016  5724  53.1  53     0.047
avrora      | 189   108   7.9    67     0.086   | 330   125   11.2  56     0.034
batik       | 361   355   15.8   85     0.086   | 662   614   24.9  89     0.049
derby       | 308   314   13.9   63     0.080   | 425   530   22.1  57     0.049
sunflow     | 206   152   8.2    92     0.076   | 330   212   10.3  91     0.040
tomcat      | 533   1100  25.4   94     0.098   | 730   2209  48.6  92     0.063
tradebeans  | 825   1010  38.2   89/8*  0.053   | 1568  1925  58.9  82/8*  0.036
tradesoap   | 860   1370  41     82/17* 0.062   | 1628  2536  63.6  81/16* 0.040

Table 4.1: Characteristics of Gcost (I). Reported are the numbers (in thousand) of nodes (N ) and edges (E), the memory overhead (in megabytes) excluding the size of the shadow heap (M ), the running time overhead (O), and the context conflict ratio (CR).

is 500Mb for all programs. Note that the shadow heap is not compulsory for using

our technique. For example, it can be replaced by a global hash table that maps

each object to its tracking data (and an object entry is removed when the object

is garbage collected). The choice of shadow heap in our work is just to allow quick

access to the tracking information. When the number of context slots s grows from

8 to 16, the space overhead increases while the running time is almost not affected.

The instrumentation significantly increases the running times (i.e., 71× slowdown on

average for s = 8 and 72× for s = 16 when the whole-program tracking is enabled).

This is because (1) Gcost is updated at each instruction instance and (2) the creation of

Gcost nodes and edges needs to be synchronized to guarantee that the tool is race-free.

It was an intentional decision not to focus on the performance of the profiling, but instead to focus on the collected information and on demonstrating that the results

Program     | (c) Bloat measurement for s = 16
            | #I(B)  IPD(%)  IPP(%)  NLD(%)
antlr       | 4.9    3.7     96.2    17.5
bloat       | 91.2   26.9    69.9    19.3
chart       | 9.4    8.0     91.7    30.0
fop         | 0.2    28.8    60.9    30.5
pmd         | 5.6    7.5     92.1    27.0
jython      | 14.6   13.1    81.9    26.8
xalan       | 25.5   17.8    82.0    19.4
hsqldb      | 1.3    6.4     92.4    31.0
luindex     | 3.5    4.6     93.0    24.6
lusearch    | 9.1    9.3     65.2    29.1
eclipse     | 28.6   21.0    78.3    22.0
avrora      | 3.3    3.2     94.8    34.5
batik       | 2.4    27.1    71.1    26.7
derby       | 65.2   5.0     94.0    23.7
sunflow     | 82.5   32.7    43.7    31.7
tomcat      | 29.1   24.2    72.2    23.1
tradebeans  | 15.1   14.9    80.0    22.3
tradesoap   | 41.0   24.5    59.4    20.1

Table 4.2: Characteristics of Gcost (II). This table reports the total number (in billion) of instruction instances (I ), the percentages of instruction instances (directly and transitively) producing values that are ultimately dead (IPD), the percentages of instruction instances (directly or transitively) producing values that end up only in predicates (IPP), and the percentages of Gcost nodes such that all the instruction instances represented by these nodes produce ultimately-dead values (NLD).

are useful for finding bloat in real-world programs. One effective way of reducing overhead is to choose only relevant components to track. For example, for the two transaction-based applications tradebeans and tradesoap, there is 5-10× overhead reduction when we enable tracking only for the load runs (i.e., the application is not tracked for the server startup and shutdown phases). Hence, it is possible for a programmer to identify suspicious program components using lightweight profiling tools such as a method execution time profiler or an object allocation profiler, and run our tool on the selected components for detailed diagnosis. It is also possible to

employ various sampling-based or static pre-processing techniques (e.g., from [166])

to reduce the dynamic effort in data collection.

A small amount of memory is required to store the graph, and this is achieved primarily by employing abstract domains. The space reduction resulting from abstract slicing can also be seen from the comparison between the number of nodes in the graph (N ) and the total number of instruction instances (I ), as N represents

the size of the abstract domain employed in the analysis while I represents the size

of the actual concrete domain that fully depends on the run-time behavior of the

application. CR measures the degree to which distinct contexts are mapped to the

same slots by our encoding function h. Following [156], CR-s for an instruction i is

defined as:

    CR-s(i) = 0                                        if max_{0≤j≤s}(dc[j]) = 1
    CR-s(i) = max_{0≤j≤s}(dc[j]) / Σ_{0≤j≤s} dc[j]     otherwise

where dc[j] represents the number of distinct contexts that fall into context slot

j. CR is 0 if each context slot represents at most one distinct context; CR is 1 if all contexts fall into the same slot. The table reports the average CR for all instructions in Gcost. Note that both CR-8 and CR-16 show very small numbers. This is because many methods in a program only have a small number of distinct object chains throughout the execution.
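As a small illustration (a sketch, not the tool's code), CR-s for one instruction can be computed from the per-slot counts dc[j] as follows:

    // Sketch: CR-s for one instruction, given dc[j] = number of distinct contexts
    // that were mapped into context slot j.
    static double contextConflictRatio(int[] dc) {
        int max = 0;
        long sum = 0;
        for (int c : dc) { max = Math.max(max, c); sum += c; }
        if (max <= 1) return 0.0;          // every slot holds at most one context
        return (double) max / sum;         // approaches 1 as contexts collapse
    }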

Columns IPD and IPP in part (c) of Table 4.2 report the measurements of inefficiency for s = 16. IPD represents the percentage of instruction instances that produce only dead values. Suppose D is the set of non-consumer nodes in Gcost that do not have any outgoing edges (i.e., no other instructions are data-dependent on them), and D∗ is the set of nodes that can lead only to nodes in D. Hence, D∗ contains nodes that ultimately produce only dead values. IPD is calculated as the ratio between the

sum of execution frequencies of the nodes in D∗ and the total number of instruction

instances during the execution (shown in column I ). Similarly, suppose P∗ is the set of nodes that can lead only to predicate consumer nodes, and IPP is calculated as the ratio between the sum of execution frequencies of the nodes in P∗ and I.
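For illustration, D∗ and IPD could be computed from Gcost roughly as in the following sketch; the Node interface is an assumption standing in for the tool's actual graph representation:

    import java.util.Collection;
    import java.util.HashSet;
    import java.util.Set;

    // Sketch: compute D* (nodes whose values are ultimately dead) by a fixed point.
    class DeadValueAnalysis {
        interface Node {
            Collection<Node> successors();  // outgoing data-dependence edges
            boolean isConsumer();           // predicate/native/output node
            long frequency();               // execution frequency of the node
        }

        static Set<Node> deadOnly(Collection<Node> allNodes) {
            Set<Node> dstar = new HashSet<>();
            boolean changed = true;
            while (changed) {
                changed = false;
                for (Node n : allNodes) {
                    if (dstar.contains(n)) continue;
                    boolean dead;
                    if (n.successors().isEmpty()) {
                        dead = !n.isConsumer();      // a dead sink
                    } else {
                        dead = true;                 // dead iff all successors are dead
                        for (Node s : n.successors()) dead &= dstar.contains(s);
                    }
                    if (dead) { dstar.add(n); changed = true; }
                }
            }
            return dstar;
        }

        // IPD = (sum of frequencies over D*) / (total number of instruction instances).
        static double ipd(Collection<Node> allNodes, long totalInstances) {
            long sum = 0;
            for (Node n : deadOnly(allNodes)) sum += n.frequency();
            return (double) sum / totalInstances;
        }
    }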

Programs such as bloat, eclipse, and sunflow have large IPDs, which indicates that

there may exist large optimization opportunities. In fact, these three programs are

the ones for which we have achieved the largest performance improvement after re-

moving bloat (as discussed shortly in case studies). Clearly, a significant portion of

the set of instruction instances is executed to produce only control flow conditions.

While this does not help performance diagnosis directly, a high IPP indicates that the program performs a large amount of comparison-related work, which may be a sign of over-protective or over-general implementations.

Column NLD in part (c) reports the percentage of nodes in D∗, relative to the total number of graph nodes. The higher NLD a program has, the easier it is for

a programmer to find problems from Gcost. Despite the merging of a large number

of instruction instances in a single graph node, there are on average 25.5% nodes in

the graph that have this property. Large performance opportunities may be found by

inspecting the report to identify these wasteful operations.

4.4.2 Case Studies

We have carefully inspected the tool reports for the following six large applica-

tions: bloat, eclipse, sunflow, derby, tomcat, and trade. These applications have large code bases, and are representatives of various kinds of real-world applications, including program analysis tools (bloat), Java development tools (eclipse),

image renderers (sunflow), database servers (derby), servlet containers (tomcat), and transaction-based enterprise applications (trade). We have found significant optimization opportunities for unoptimized programs, such as bloat (37% speedup). For the other five applications that have been well maintained and tuned, the removal of the bloat detected by our tool can still result in considerable performance improvement (2%-15% speedup). More insightful changes could have been made if we were familiar with the overall design of functionality and data models. We use the Da-

Capo versions of these programs, because the server applications are converted to run fixed loads, and the performance can be measured simply by using running time rather than other metrics such as throughput and the number of concurrent users. It took us about 2.5 weeks to find the problems and implement the fixes for these six applications that we had never studied before.

sunflow Because it is an image rendering tool, much of its functionality is based on matrix and vector computations, such as transpose and scale. However, each such method in classes Matrix and Vector starts by cloning a new Matrix or Vector object and assigning the result of the computation to the new object. Our tool reported that these newly created (short-lived) objects have extremely large unbalanced costs and benefits, as they serve primarily the purpose of carrying data across method invocations. Another few lines of the report directed us to an int array where some slots of the array are used to contain float values. These float values are converted to integers using method Float.floatToIntBits and assigned to the array elements.

Later, the encoded integers are read from the array and converted back to float values.

These operations occur in the most-frequently executed methods in the program and are therefore expensive to perform. By eliminating unnecessary clones and

keeping track of the float values that need to be passed across method boundaries (to avoid the back-and-forth conversions), we observed a 9%-15% running time reduction.

eclipse Some of the allocation sites that have the largest cost-benefit rates create objects of inner classes and Iterators, which implement visitor patterns to traverse the workspace. These visitor objects do not contain any data and are passed into iterators, where their visit method is invoked to process individual child elements of the workspace. However, the Iterator used here is a stack-based class that provides general functionality for traversing different types of data structures

(e.g., graph, tree, etc.), while the workspace has a very simple tree structure. We replaced the visitor implementation with a worklist implementation, and this simple specialization eliminated millions of run-time objects. The second major problem found by the tool is with the hash computation implemented in a set of Hashtable classes in the JDT plugin. One of the most frequently used classes in this set is called

HashtableOfArrayToObject, which uses arrays of objects as keys. Every time the

Hashtable is expanded, its rehash method needs to be invoked and the hash codes of all existing entries have to be recomputed. Because the key can be a big object array, computing its hash code can trigger invocations of the hashCode method in many other objects, and can thus take a considerable amount of time. We created an int array field in the Hashtable class to cache the hash codes of the entries, and the recorded hash codes are used when rehash is executed. To conclude, by removing these high-cost-low-benefit operations, we have managed to reduce the running time by 14.5% (from 151s to 129s), and the number of objects by 2% (5.5 million).
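A simplified sketch of this kind of fix is shown below; class, field, and method names are illustrative and differ from the actual JDT HashtableOfArrayToObject:

    import java.util.Arrays;

    // Sketch: cache each key's hash code so that rehashing avoids recomputing
    // hashCode() over (potentially large) key arrays.
    class CachedHashTable {
        private Object[][] keys = new Object[16][];
        private Object[] values = new Object[16];
        private int[] hashes = new int[16];   // cached hash code per slot
        private int size = 0;

        void put(Object[] key, Object value) {
            if (size * 2 >= keys.length) rehash(keys.length * 2);
            int h = Arrays.hashCode(key);     // computed once per key
            insert(keys, values, hashes, key, value, h);
            size++;
        }

        private void rehash(int newCapacity) {
            Object[][] nk = new Object[newCapacity][];
            Object[] nv = new Object[newCapacity];
            int[] nh = new int[newCapacity];
            for (int i = 0; i < keys.length; i++) {
                if (keys[i] != null) {
                    // reuse the cached hash code instead of recomputing it
                    insert(nk, nv, nh, keys[i], values[i], hashes[i]);
                }
            }
            keys = nk; values = nv; hashes = nh;
        }

        private static void insert(Object[][] ks, Object[] vs, int[] hs,
                                   Object[] key, Object value, int h) {
            int slot = (h & 0x7fffffff) % ks.length;
            while (ks[slot] != null) slot = (slot + 1) % ks.length; // linear probing
            ks[slot] = key; vs[slot] = value; hs[slot] = h;
        }
    }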

bloat Previous work [156] has found that bloat suffers from excessive string creations. This finding is confirmed by our tool report. 46 allocation sites out of the

top 50 that have the largest cost-benefit rates create String and StringBuffer objects in the set of toString methods. Most of these objects eventually flow into methods Assert.isTrue and db, which print the strings when certain debugging-related conditions hold. However, in production runs where most bugs have been

fixed, such conditions can rarely evaluate to true, and there is no benefit in constructing these objects. Another problem exposed by our tool (but not reported in [156]) is the excessive use of objects of an inner class NodeComparator, which contains no data but methods to compare a pair of AST nodes. The comparison starts with the given root nodes, and recursively creates NodeComparator objects to compare children nodes. Comparing two large trees usually requires the allocation (and garbage collection) of hundreds of objects, and such comparisons occur in almost all methods related to ASTs, even including hashCode and equals. Eliminating the unnecessary String and StringBuffer objects and replacing the visitor pattern with a breadth-first search algorithm result in 37% reduction in running time, and 68% reduction in the number of objects created.

derby The tool report shows that an int array in class FileContainer has large cost-benefit rates. After inspecting the code, we found it is an array containing the information of a file-based container. Every time the (same) container is written into a page, the array needs to be updated. Hence, it is written much more frequently

(with the same data) than it is read. To solve the problem, we modified the code to update this array only before it is read. Another set of objects that were found to have unbalanced cost-benefit rates are the strings representing IDs for different ContextManagers. These strings are used to retrieve the ContextManagers in a variety

of ways, but mostly serve as HashMap keys. Because the database contexts are fre-

quently switched, clear performance improvement can be seen when we replaced these

strings with integer IDs. Eventually, the running time of the program was reduced

by 6%, and the number of objects created was reduced by 8.6%.

tomcat tomcat is a well-tuned JSP and servlet container. There are only a

few objects that have large cost-benefit rates according to the tool report. One set of such

objects is arrays used in util.Mapper, representing the (sorted) list of existing contexts. Once a context is added or removed from the manager, an update algorithm is executed. The algorithm creates a new array, inserts the new context at the right position in this new array, copies the old context array to the new one, and discards the old array. To remove this bloat, we maintain only two arrays, using them back and forth as the main context list and the backing array used for the update algorithm. Another problem reported by our tool pointed to string comparisons in various getContents and getProperty methods. These methods take a property name and a Class object (representing the type of the property) as input, and return the value corresponding to the property using reflection. To decide the type of the property, the implementations of these methods first obtain the names of the argument classes and compare them with the embedded names such as “Integer” and “Boolean”. Because a property can have only a few types, we removed such string comparisons and inserted code to directly compare the Class objects. After the modifications, the program could run 3 seconds faster (about 2% reduction).

tradebeans tradebeans is an EJB application that performs database queries to simulate a stock trading process. One problem that our tool reported was with the use of KeyBlock and its iterators. This class represents a range of integers that will be

given as IDs for the accounts and holdings when they are requested. We found that

for each ID request, the class needs to perform a few redundant database queries and

updates. In addition, a simple int array can suffice to represent IDs since the KeyBlock

and the iterators are just wrappers over integers. By removing the additional database

queries and using the int array directly, we have managed to make the application run

9 seconds faster (from 350s to 341s, 2.5% reduction). The number of objects created

was reduced by 2.3%. DaCapo has another implementation (tradesoap) of trade, which uses the SOAP protocol to perform client-server communication and runs much slower than tradebeans. An interesting comparison between these two benchmarks is that the major high-cost-low-benefit objects reported for tradesoap are the bean objects created in the set of convertXBean methods. As part of the SOAP protocol, these methods perform large volumes of copies between different representations of the same bean data, resulting in significant performance slowdown.

Summary With the help of the cost-benefit analyses, we have found various performance problems in these large applications, with which we had no prior experience. These problems include inefficiencies caused by common programming idioms such as visitor patterns, repeated work whose result needs to be cached (e.g., the hash code example in eclipse), computation of data that is not necessarily used (e.g., strings in bloat), and choices of expensive operations (e.g., string comparison in tomcat and the use of SOAP in tradesoap). For specific bloat patterns such as the use of inner classes, it is also possible for compiler/optimizer designers to take them into account and develop optimization techniques that remove the bloat without restricting programmers from using these patterns.

4.5 Summary and Interpretation

What is the cost of constructing this object? Is it really worth performing such a heavyweight operation? Such questions often arise during software development.

They represent the most natural and explicit form in which a programmer can express her concerns about performance. Tuning could be much easier if there existed tool support that allows these questions to be (even partially) automatically answered. As a step towards achieving this goal, this chapter introduces a dynamic analysis of data

flow, motivated by the observation that much functionality in a large application is about propagating and transforming data. Optimizations based on control flow information (e.g., execution frequency) are insufficient to capture redundancies that accumulate during data manipulation. Our approach defines measurements of the cost of generating a piece of data, and computes an assessment of the way this data is used. This assessment is made at the level of data structures, as object-oriented data structures are extensively used and programmers often are not aware of their internal implementations.

In future work, it would be interesting to consider the space of other design choices discussed in Section 4.3. We are also interested in investigating future extensions and applications of abstract slicing as a general technique, so that it could potentially benefit a wider range of dynamic analyses. For example, we mentioned the use of abstract slicing to implement extended copy profiling. Future work may consider ways to realize this idea to give developers more information about the reported copy chains.

It is also interesting to extend the notions of cost and benefit (defined in terms of computations in this chapter) in many other ways to help performance evaluation and problem diagnosis. One example is to measure the effectiveness of data structures

used as caches. The way of redefining costs and benefits for caches was discussed in

Section 4.3. As another example, one can adapt the proposed cost and benefit for data to measure performance of control-flow entities, such as methods, components, and plugins. Faced with a large and complex application, a developer would need to first identify such coarser-grained program constructs that can potentially cause performance issues, in order to track down a performance problem through subsequent more detailed profiling.

CHAPTER 5: Making Sense of Container Usage: Memory Leak Detection Using Container Profiling

A major category of run-time memory bloat is the leaking of memory during the execution. While garbage-collected languages can reduce memory-related bugs such as dangling pointers, programs written in these languages can still suffer from memory leaks caused by keeping references to useless objects. Leaks degrade run-time performance, and significant leaks even cause the program to run out of memory and crash.

In addition, memory leak bugs are notoriously difficult to find. Static analyses can be used to attempt the detection of such leaks. However, this detection is limited by the lack of scalable and precise reference/heap modeling (a well-known deficiency of static analyses), as well as by reflection, multiple threads, scalability for large programs, etc. Thus, in practice, identification of memory leaks is more often attempted with dynamic analyses. Existing dynamic approaches for heap diagnosis have serious limitations. Commercial tools such as JProfiler [51], JProbe [109] and LeakHunter [80] were developed to help understand types, instances and memory usage. However, this information is insufficient for programmers to locate a bug. As an example already mentioned in Chapter 1, in most cases, the fact that type java.util.HashMap$Entry has the highest number of instances tells the programmer nothing about the hash maps that hold these entries. Research tools for memory leak detection typically focus on heap differencing [43,44,71] and fine-grained object tracking [20,61,62,107].

Of existing dynamic techniques, LeakBot [95], Cork [71], and Sleigh [20] represent the state of the art. There are two major research challenges in Java memory leak detection. Imprecision can result if these challenges are not appropriately handled.

Challenge 1: definition of memory leak symptom All the dynamic

approaches start with observing memory leak symptoms during the execution. What

is a good indicator of a memory leak? Both LeakBot and Cork use heap growth as a

heuristic, treating the increase of instances of certain types across garbage collection

runs as a memory leak symptom. This could result in false positives, because growing

instances are not necessarily true leaks and they may be collected later during the

execution. Sleigh, on the other hand, uses staleness (time since last use) to find leaks.

This approach could lead to imprecision as well. As an example, a frame in a Java

Swing program cannot be treated as a leak, although it may never be used after it is

created. In addition, larger objects that are less stale may have greater contribution

towards the leak. For example, more attention should be paid to a big container that

is not used for a while than to a never-used string.

Challenge 2: from-symptom-to-cause diagnosis All existing tools follow a traditional from-symptom-to-cause approach that starts from tracking all objects

and finds those that could potentially be useless (symptom). It then tries to find the

leaking data structure (cause) by analyzing direct and transitive references to these

useless objects. However, the complex run-time reference relationships among objects

in modern Java software significantly increase the difficulty of locating the source of

the leak, which could lead to imprecise leak reports. It becomes even harder to find

the cause of a leak if there are multiple data structures that are contributing to the

problem. For example, as reported in [71], it took the authors a significant amount

of time to find the sources of leaks after they read the reports generated by Cork.

Our proposal The inefficient use of containers is an important source of systemic bloat. Programming languages such as Java include a collection framework which provides abstract data types for representing groups of related data objects

(e.g., lists, sets, and maps). Based on this collection framework, one can easily construct application-specific container types such as trees and graphs. Real-world programs make extensive use of containers, both through collection classes and through user-defined container types. Programmers allocate containers in thousands of code locations, using them in a variety of ways including storing data, implementing unsupported language features such as returning multiple values, and wrapping data in

APIs to provide general service for multiple clients.

Arguably, misuse of (user-defined or Java built-in) containers is a major source of memory leak bugs in real-world Java applications. For example, most of the memory leak bugs reported in the Sun bug repository [141] were caused (directly or indirectly) by inappropriate use of containers. We propose a novel technique for Java that detects memory leaks using container profiling. The key idea behind the proposed technique is to track operations on containers rather than on generic objects, and to report containers that are most likely to leak. The major difference between our technique and the from-symptom-to-cause diagnosis approach is that we start by suspecting that all containers are leaking, and then use the “symptoms” to rule out most of them. Hence, we avoid the process of symptom-to-cause searching that can lead to imprecision and reduced programmer productivity.

[Figure: a three-layer container hierarchy — user-defined containers (e.g., B-Tree, Transaction-container) built on Java collection classes (e.g., HashMap, ArrayList, HashSet), which in turn hold the element objects.]

Figure 5.1: Container hierarchy in Java.

Figure 5.1 shows the container hierarchy typically used in a Java program: user-defined containers in the top layer use containers provided by the Java collection framework (illustrated in the second layer), which eventually store data in arrays (the bottom layer). The focus of our technique is on containers in the first and second layers, because in most cases these containers are directly manipulated by programmers and hence are usually sources of leaks. Our technique does not track arrays, since in real-world Java programs they rarely cause leaks directly. Approaches such as [123] can be used to complement our technique in order to detect leaks directly caused by arrays.

Our technique requires ahead-of-time lightweight modeling of container behavior: users of the tool need to build a simple “glue layer” that maps methods of each container type to primitive operations (e.g., ADD, GET, and REMOVE). An automated tool instruments the application code and uses the user-supplied annotations to connect invocations of container methods with our run-time profiling libraries. In order

to write this glue code, users have to be familiar with the container types used in

the program. This does not increase the burden on the programmers: when using existing leak detection tools [20,71,95], they have to inspect the code to gain similar knowledge about containers so that they can interpret the tool-generated reports. Using our approach simply requires learning such knowledge in advance. Of course, the tool embeds pre-defined models for containers from the Java collection framework, and therefore programmers need to model only user-defined containers. As shown in our studies, running the tool even without modeling user-defined containers can still provide useful insights for finding leaks: in our reports, top-level Java library containers (the second layer in Figure 5.1) can direct one's attention to their direct or transitive owners, which are likely to be user-defined containers (the first layer in

Figure 5.1) that are the actual causes of bugs.

Unlike previous approaches, our technique computes a heuristic leaking confidence value for each container based on a combination of its memory consumption and the staleness of its data elements, which could yield more accurate results compared to existing approaches [20,71,95]. For each container, the technique also ranks call sites

in the source code, based on the average staleness of the elements retrieved at these

sites. This container ranking and the related call site ranking provide information

that can assist a programmer to quickly identify the source of the memory leak.

The conceptual model used to compute these values and our implementation of the

technique for Java are presented in Section 5.1 and Section 5.2, respectively. Our tool

achieved high precision in reporting causes for two memory leak bugs from the Sun

bug database [141] and a known memory leak bug in SPECjbb [138]—in fact, the

top containers in the reports included the ones that leaked memory. In addition, an

evaluation of the run-time performance of our technique showed that it has acceptable overhead for practical use.

Contributions The main contributions of this work are:

• A dynamic analysis that computes a confidence value for each container, which

provides the basis for ranking and reporting of likely-leaking containers.

• A memory leak detection technique for Java based on the confidence analysis.

• A tool that implements the proposed technique.

• An experimental study of leak identification and run-time performance. The

results indicate that our technique can precisely detect memory leak bugs with

practical run-time overhead.

5.1 Leak Confidence Analysis

This section presents a confidence analysis that computes leaking confidence values for tracked containers. The goal of the analysis is to quantify the contribution of a container to memory leaks. Before describing the details of the analysis, we first provide some basic definitions.

Definition 5.1.1 (Container). A container type Γ is an abstract data type with a set Σ of element objects, and two basic operations ADD and GET that manipulate

Σ. A container object γ^n is an instantiation of Γ with n elements in its element set

Σ_γ. An element can be of any subtype of O, which denotes the root of the type tree.

ADD is a mapping of the form (Γ, O) → Γ that maps a pair of a container object and an element object to a container object. GET is a mapping Γ → O from a container object to one of its elements. The effects of the operations are as follows:

• ADD(γ_pre^n, o) : γ_post^m  ≡  o ∉ Σ_γpre ∧ o ∈ Σ_γpost ∧ m = n+1 ∧ ∀p : p ∈ Σ_γpre ∧ p ≠ o : p ∈ Σ_γpost

• GET(γ^n) : o  ≡  o ∈ Σ_γ

• REMOVE(γ_pre^n, o) : γ_post^m  ≡  o ∈ Σ_γpre ∧ o ∉ Σ_γpost ∧ m = n−1 ∧ ∀p : p ∈ Σ_γpre ∧ p ≠ o : p ∈ Σ_γpost

We treat all (Java library and user-defined) containers as implementations of the container ADT. Here and later in this dissertation, we use the term “container” to denote a container object. Tracking operations on a container requires user-supplied annotations to bridge the gap between methods defined in the Java implementations and the three basic ADT operations. We have already defined such annotations for the container types from the standard Java libraries.

During the execution of a program, let the program's memory consumption at a timestamp τi be mi. In cases when τi is a moment immediately after garbage collection (we will refer to such moments as gc-events), it will be denoted by τi^gc and its memory consumption will be denoted by mi^gc. A program written in a garbage-collected language has a memory leak symptom within a time region [τs, τe] if (1) for every gc-event τi^gc in the region, ms ≤ mi^gc ≤ me, and (2) in this region, there exists a subsequence ss = (τ1^gc, τ2^gc, ..., τn^gc) of gc-events such that τi^gc < τ(i+1)^gc and mi^gc < m(i+1)^gc for i = 1, ..., n−1. The period [τs, τe] will be referred to as a leaking region.

This definition helps to identify the appropriate time region to analyze, because

most programs do not leak from the beginning. Moment τe can be specified by tool

users as an analysis parameter, and can be different for different kinds of analyses. For

post-mortem off-line diagnosis, τe is either the ending time of the program, or the time

when an OutOfMemory error occurs. For on-line diagnosis done while the program

is running, τe could be any time at which the user desires to stop data collection and

to start analysis of this collected data. We use gc-events as “checkpoints” because at

these times the program’s memory heap consumption does not include objects that

are unreachable.

The definition of a memory leak symptom does not require the amount of consumed memory at each gc-event to be larger than it was at the previous one, because in many cases some gc-events reclaim large amounts of memory, while in general the memory footprint still keeps increasing. The ratio between the number of elements n in the subsequence ss and the size of the entire sequence of gc-events within the leaking region can be defined by tool users as another analysis parameter, in order to control the length of the leaking region. There could be multiple definitions of starting moment τs corresponding to this user-defined ratio. Our approach chooses the smallest such value as τs, which defines the longest leaking region and allows more precise analysis. (Additional details are described in Section 5.2.)
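As a minimal sketch only (the actual region selection is part of the tool described in Section 5.2, and the subsequence test here is a greedy simplification of the definition above), the starting gc-event of the leaking region could be chosen as follows, given (timestamp, memory) samples taken right after each collection and the user-defined ratio:

    import java.util.List;

    // Assumed record type for a gc-event sample.
    record GcEvent(long timestamp, long memory) {}

    // Sketch: pick the earliest start index s such that, within [s, end], every
    // gc-event's memory lies between the region's start and end memory, and a
    // strictly increasing chain covers at least `ratio` of the events. The greedy
    // chain is a simplification; the definition only requires that such a
    // subsequence exists.
    class LeakRegionFinder {
        static int findStart(List<GcEvent> gcEvents, double ratio) {
            int end = gcEvents.size() - 1;
            for (int s = 0; s <= end; s++) {
                long ms = gcEvents.get(s).memory();
                long me = gcEvents.get(end).memory();
                int increasing = 1;
                long last = ms;
                boolean bounded = true;
                for (int i = s; i <= end; i++) {
                    long m = gcEvents.get(i).memory();
                    if (m < ms || m > me) { bounded = false; break; }
                    if (m > last) { increasing++; last = m; }
                }
                int total = end - s + 1;
                if (bounded && increasing >= ratio * total) {
                    return s;            // smallest s => longest leaking region
                }
            }
            return -1;                   // no leaking region found
        }
    }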

A container σ is memory-leak-free if either (1) at time τe, it is in the state σ^0 (i.e., empty), or (2) it is garbage collected within the leaking region. That is, σ^n does not leak memory if at time τe, its accumulated number of ADD operations is equal to its accumulated number of REMOVE operations, assuming we treat the deallocation of σ^n as being equivalent to n REMOVE operations. Containers that are not memory-leak-free contribute to the memory leak symptom and are subject to further evaluations. However, this does not necessarily mean that all of them leak memory. For example, if an OutOfMemory error occurs before some REMOVE

operations of a container, this container is not memory-leak-free according to the

above definition, although in reality it may very well be leak-free.

For each container that is not memory-leak-free by this definition, we compute a confidence value that indicates how much contribution it makes to the memory leak symptom. As mentioned earlier, our technique considers both the memory consumption and the staleness when computing the confidence for a container.

5.1.1 Memory Contribution

One factor that characterizes a container’s contribution to the leak is the amount of memory the container consumes during its lifetime. We quantify this factor by defining a memory time graph which captures a container’s memory footprint.

The relative memory consumption of a container σ at time τ is the ratio between the sum of the memory consumption of all objects that are reachable from σ in its

object graph, and the total amount of memory consumed by the program at τ. The

memory time graph for σ is a curve where the X-axis represents the relative time of program execution (i.e., τi/τe for timestamp τi) and the Y-axis represents the relative memory consumption of σ (i.e., mem(σ)_i/total_i corresponding to X-point τi/τe). The

starting point of the X-axis is τ0/τe, where τ0 is max(τs, allocation time of σ), and the ending point of the X-axis is τ1/τe, where τ1 is min(τe, deallocation time of σ).

A sample graph is illustrated in Figure 5.2. The X-axis starts at 0.4 relative time

(i.e., 0.4 × τe absolute time), which represents either the starting time of the leak region τs, or σ’s allocation time, whichever occurs second. The graph indicates that

σ does not get freed within the leak region, because the X-axis ends at 1, which represents the ending time τe of the leak region.


Figure 5.2: A sample memory time graph.

Using the memory time graph, a container’s memory contribution (MC) is defined to be the area covered by the memory consumption curve in the graph. In the example in Figure 5.2, this area is shown in dark. Because the memory time graph starts from τs (or later), the MC considers only a container’s memory consumption within the leaking region. For a particular container, both its memory consumption and its lifetime contribute to its MC. Since MC should reflect the influence of both the container itself and all objects (directly or transitively) referenced by it, the memory consumption of the container is defined as the amount of memory consumed by its entire object graph.

Because relative values (i.e., between 0 and 1) are used to measure the memory consumption and the execution time, the MC of a container is also a value between 0 and 1. Containers that have larger MC contribute more to the memory leak symptom.

Note that in practice it is likely to be too expensive to compute the exact MC value for a container, because the container’s memory consumption changes frequently as the program executes. Section 5.2 presents a sampling approach that can be used to approximate this value.
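For illustration (the actual sampling strategy is the one described in Section 5.2; names here are hypothetical), MC can be approximated from periodic samples of the curve with a trapezoidal sum:

    // Sketch: approximate MC as the area under the relative-memory curve,
    // given samples of (relative time, relative memory) in [0, 1] x [0, 1].
    static double approximateMC(double[] relTime, double[] relMem) {
        double area = 0.0;
        for (int i = 1; i < relTime.length; i++) {
            double dt = relTime[i] - relTime[i - 1];
            // trapezoidal rule between consecutive samples
            area += dt * (relMem[i] + relMem[i - 1]) / 2.0;
        }
        return area;   // a value between 0 and 1
    }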

5.1.2 Staleness Contribution

The second factor that characterizes a container’s contribution to the leak is the staleness of the container’s elements. The staleness of an object is defined in [20] as the time since the object’s last use. Our work provides a new definition of staleness in terms of a container and its elements.

The staleness of an element object o in a container σ is τ2 − τ1, where REMOVE(σ, o) occurs at τ2, GET(σ):o occurs at τ1, and there does not exist another GET operation that returns o in the region [τ1, τ2]. If τ1 < τs, τ1 is redefined to be τs. If τ2 < τs, the staleness is undefined. In other words, the staleness of o is the distance between the time when o is removed from σ and the most recent time when o is retrieved from σ. If o is never retrieved from σ, τ1 should correspond to the ADD operation that adds o to σ. If o is never removed from σ, τ2 is either the deallocation time of σ, or the ending time of the leaking region τe. The intuition behind the definition is that if the program no longer needs to retrieve an element from a container, the element becomes useless to that container. Hence, the staleness of the element measures the period of time when the element becomes useless but is still being kept by the container. In addition, tracking occurs only within the leaking region: if an element's removal time τ2 is earlier than the starting time of the leaking region, we do not compute the staleness for the element. Note that the last GET operation of a container element may not correspond to the last use site of this object—a reference obtained from the GET operation may be stored somewhere else and used later from there. However, it is important to keep in mind that the staleness of the element, as defined above, is a measurement of a container's misbehavior—the object no longer needs to be obtained from the container, but the container still references it. Hence,

ID        | Type                | LC          | MC         | SC
11324773  | util.HashMap        | 0.449       | 0.824      | 0.495
18429817  | util.LinkedList     | 0.165       | 0.820      | 0.194
8984226   | util.LinkedList     | 0.050       | 0.809      | 0.062
2263554   | util.WeakHashMap    | 0.028       | 0.820      | 0.034
15378471  | util.LinkedList     | 0.018       | 0.029      | 0.256
5192610   | swing.JLayeredPane  | 0.011       | 0.824      | 0.013
30675736  | swing.JPanel        | 0.011       | 0.824      | 0.013
19526581  | swing.JRootPane     | 0.011       | 0.824      | 0.013
17933228  | util.Hashtable      | 0.000023    | 0.0007     | 0.026
33263898  | util.ArrayList      | 0.00000026  | 0.0000032  | 0.046

Table 5.1: Partial report of LC, MC, and SC values.

regardless of whether an object is stored elsewhere via other references, a container

behaves properly as long as it removes the object once the container does not need

to keep it.

The staleness contribution (SC) of a container σ is the ratio of (Σ_{i=1}^{n} staleness(oi) / n) and (τe − τs), where the sum is over all elements o1, ..., on that have been added to σ and whose staleness is well-defined. Thus, SC is the average staleness of elements that have ever been added to σ, relative to the length of the leaking region.

In addition, the removal time of these elements must be within the leaking region.

Because the staleness of each individual element is ≤ the length of the leaking region, SC is a value between 0 and 1. Containers that have larger SC values contribute more to the memory leak symptom.
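A minimal sketch of how element staleness and SC could be accumulated from the profiled ADD/GET/REMOVE events is shown below; class and method names are illustrative, and the real tool tracks elements by identity rather than through a plain HashMap:

    import java.util.HashMap;
    import java.util.Map;

    // Sketch of per-container staleness bookkeeping. lastGet(o) starts at the ADD
    // time and is refreshed on every GET; on REMOVE (or at the end of the leaking
    // region) the element's staleness is the time since lastGet(o), clamped to the
    // start of the leaking region.
    class StalenessTracker {
        private final long regionStart, regionEnd;            // tau_s and tau_e
        private final Map<Object, Long> lastGet = new HashMap<>();
        private double stalenessSum = 0.0;
        private int definedElements = 0;

        StalenessTracker(long regionStart, long regionEnd) {
            this.regionStart = regionStart;
            this.regionEnd = regionEnd;
        }

        void onAdd(Object o, long time) { lastGet.put(o, time); }
        void onGet(Object o, long time) { lastGet.put(o, time); }

        void onRemove(Object o, long removeTime) {
            Long t1 = lastGet.remove(o);
            if (t1 == null || removeTime < regionStart) return;  // staleness undefined
            stalenessSum += removeTime - Math.max(t1, regionStart);
            definedElements++;
        }

        // Call once at tau_e: elements never removed are charged up to regionEnd.
        double staleContribution() {
            for (long t1 : lastGet.values()) {
                stalenessSum += regionEnd - Math.max(t1, regionStart);
                definedElements++;
            }
            return definedElements == 0 ? 0.0
                 : (stalenessSum / definedElements) / (regionEnd - regionStart);
        }
    }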

5.1.3 Putting it All Together: Leaking Confidence

Based on the memory contribution and the staleness contribution, we define a container's leaking confidence (LC) to be computed as SC × MC^(1−SC). Clearly, LC

111 is a value between 0 and 1; also, increasing either SC or MC while keeping the other

factor unchanged increases LC. We define LC as an exponential function of SC to

show that staleness is more important than memory consumption in determining a

memory leak. This definition of LC has several desirable properties:

• MC=0 and SC∈[0, 1] ⇒ LC=0. If the memory contribution of a container is

small enough (i.e., close to 0), the confidence of this container is close to 0, no

matter how stale its elements are. This property helps filter out containers that

hold small objects, such as strings.

• SC=0 and MC∈[0, 1] ⇒ LC=0. If every element in a container gets removed

immediately after it is no longer used (i.e., the time between the GET and

REMOVE operations is close to 0), the confidence of this container is 0, no

matter how large the container is.

• SC=1 and MC∈[0, 1] ⇒ LC=1. If all elements of a container never get removed

after they are added (i.e., every element crosses the entire leaking region), the

confidence of the container is 1, no matter how large the container is.

• MC=1 and SC∈[0, 1] ⇒ LC=SC. If the memory contribution of a container is

extremely high (close to 1), the confidence of this container is decided by its

staleness contribution.
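To make the boundary behavior concrete, the following small Java sketch (a hypothetical helper, not part of our tool) computes LC from the two contributions:

  /* Leaking confidence LC = SC * MC^(1-SC), with MC and SC in [0,1]. */
  public final class LeakConfidence {
      static double lc(double mc, double sc) {
          return sc * Math.pow(mc, 1.0 - sc);
      }
      public static void main(String[] args) {
          System.out.println(lc(0.824, 0.495)); // first container in Table 5.1: ~0.449
          System.out.println(lc(0.0, 0.9));     // tiny memory contribution: LC = 0
          System.out.println(lc(0.9, 0.0));     // elements removed immediately: LC = 0
          System.out.println(lc(1.0, 0.3));     // MC = 1: LC equals SC
      }
  }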

Our study shows that this definition of confidence effectively separates containers that are the sources of leaks from those that do not leak. A sample report that includes LC, MC, and SC for several containers is illustrated in Table 5.1. This table is a part of the report generated by our tool when analyzing Sun’s bug #6209673.

The first container in the table is the one that actually leaks memory. Note that

the LC value of this container is much larger than the LC values for the remaining

containers. Based on this report, it is straightforward for a programmer to find and

fix this bug.

5.2 Memory Leak Detection for Java

Based on the leak confidence analysis, this section presents our memory leak

detection technique for Java.

5.2.1 Container Modeling

For each container type, there is a corresponding “glue” class. For each method

in the container type that is related to ADD, GET, and REMOVE operations, there

is a static method in the glue class whose name is the name of the container method

plus the suffix "_before" or "_after". The suffix indicates whether calls to the glue method should be inserted before or after call sites invoking the original method. The

parameter list of the glue method includes a call site ID, the receiver object, and the

formal parameters of the container method. For the suffix "_after", the return value

of the container method is also added. Figure 5.3 shows the modeling of container

class java.util.HashMap. It is important to note that most of this glue code can be generated automatically using predefined code templates.

The glue methods call our profiling library to pass the following data: the call site

ID (csID), the container object, the element object, the number of elements in the container before the operation is performed, and the operation type. The call site ID is generated by our tool during instrumentation.

class HashMap {
  Object put(Object key, Object value) {...}
  Object get(Object key) {...}
  Object remove(Object key) {...}
  ...
}
                    (a) Container class HashMap

class Java_util_HashMap {
  static void put_after(int csID, Map receiver, Object key,
                        Object value, Object result) {
    /* if key does not exist in the map */
    if (result == null) {
      /* use user-defined hash code as ID */
      Recorder.v().useUserDefHashCode();
      /* record operation ADD(receiver, key) */
      Recorder.v().record(csID, receiver, key,
                          receiver.size()-1, Recorder.EFFECT_ADD);
    }
  }
  static void get_after(int csID, Map receiver, Object key, Object result) {
    /* if an entry is found */
    if (result != null) {
      Recorder.v().useUserDefHashCode();
      /* record operation GET(receiver):key */
      Recorder.v().record(csID, receiver, key,
                          receiver.size(), Recorder.EFFECT_GET);
    }
  }
  static void remove_after(int csID, Map receiver, Object key, Object result) {
    if (result != null) {
      Recorder.v().useUserDefHashCode();
      /* record operation REMOVE(receiver, key) */
      Recorder.v().record(csID, receiver, key,
                          receiver.size()+1, Recorder.EFFECT_REMOVE);
    }
  }
}
                    (b) Glue class for HashMap

Figure 5.3: Modeling of container java.util.HashMap.

The container object, the element object, and the operation type are used to compute the SC for the container. Recording the number of elements in a container is needed because once the leaking region is decided, we want to analyze only the data collected within the region. Knowing the number of elements of a container at the starting time of the region can help to avoid scanning the data before the region.

(a) Methods defined in java.util.Collection:

  Container Method Call    Interpretation
  A.add(o)                 ADD(A, o)
  o = A.get(..)            o = GET(A)
  o = A.remove(..)         REMOVE(A, o)
  A.addAll(B)              ∀ o ∈ B: o = GET(B); ∀ o ∈ B: ADD(A, o)
  A.removeAll(B)           ∀ o ∈ B: o = GET(B); ∀ o ∈ A ∩ B: REMOVE(A, o)
  A.retainAll(B)           ∀ o ∈ B: o = GET(B); ∀ o ∈ A \ B: REMOVE(A, o)
  A.containsAll(B)         ∀ o ∈ B: o = GET(B)
  A.toArray()              ∀ o ∈ A: o = GET(A)
  A.iterator()             ∀ o ∈ A: o = GET(A)

(b) Methods defined in java.util.Map:

  Container Method Call    Interpretation
  v = A.get(k)             k = GET(A) if v ≠ null
  r = A.put(k, v)          ADD(A, k) if r = null; k = GET(A) otherwise
  r = A.remove(k)          REMOVE(A, k) if r ≠ null
  A.putAll(B)              ∀ k ∈ B.keySet(): k = GET(B);
                           ∀ k ∈ B.keySet(): k = GET(A) if k ∈ A.keySet(), ADD(A, k) otherwise
  A.keySet()               ∀ k ∈ A.keySet(): k = GET(A)
  A.values()               ∀ k ∈ A.keySet(): k = GET(A)
  A.entrySet()             ∀ k ∈ A.keySet(): k = GET(A)
  A.clear()                ∀ k ∈ A.keySet(): REMOVE(A, k)

Table 5.2: Mapping between actual container methods and abstract container op- erations: (a) methods defined in java.util.Collection; (b) methods defined in java.util.Map.

In order to reduce the run-time overhead, we use an integer ID to track each object

(i.e., container and element). The first time a container performs its operation, we

tag the container object with the ID (using JVMTI). The ID for a container object

(e.g., receiver in Figure 5.3) is its identity hash code determined by its internal address in the JVM. For an element object, the identity hash code is used as element

ID if the container does not have hash-based functions; otherwise, the element ID is the user-defined hash code. For example, in Figure 5.3, calls to useUserDefHashCode

inform our library that the ID for key should be its user-defined hash code. For

HashMap, we only track key as a container element, because key is representative of

a map entry. Methods that retrieve the entire set of elements, such as toArray and iterator are treated as a set of GET operations performed on all container elements.

Note that it may not be precise to model iterator in this way, since a particular object is retrieved only when method iterator.next() is invoked. However, it is common in Java programs that all elements of a container are retrieved after an iterator over the container is obtained by invoking method iterator(). In addition, this treatment avoids the use of heavyweight static analysis that has to be employed to relate Iterator objects with their corresponding container objects. The mapping of a typical set of container methods defined in interfaces java.util.Collection and java.util.Map is illustrated in Table 5.2. Upper-case letters and lower-case letters are used to represent containers and elements, respectively.
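To illustrate how the bulk methods of Table 5.2 can be turned into glue code, the sketch below gives a possible addAll_after method; it reuses the Recorder interface shown in Figure 5.3, and the element-count argument is only an approximation of the pre-operation size (the exact handling in our generated glue code may differ):

  import java.util.Collection;

  public final class Java_util_Collection {
      /* Called after a site "A.addAll(B)"; per Table 5.2(a), this is a GET on every
         element of B followed by an ADD of that element into A. */
      static void addAll_after(int csID, Collection receiver,
                               Collection arg, boolean result) {
          int sizeBefore = receiver.size() - arg.size(); // approximate, ignores duplicates
          for (Object o : arg) {
              Recorder.v().record(csID, arg, o, arg.size(), Recorder.EFFECT_GET);
              Recorder.v().record(csID, receiver, o, sizeBefore++, Recorder.EFFECT_ADD);
          }
      }
  }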

5.2.2 Instrumentation

Our tool uses the Soot program analysis framework [132,147] to perform code instrumentation. For each call site in an application class at which the receiver is a container, calls to the corresponding glue method are inserted before and/or after the site. For a container object, code is also inserted after its allocation site in order to track its allocation time.
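For illustration, the rewriting performed at an instrumented call site has roughly the following shape (written here in source form for readability; the tool performs the equivalent transformation on bytecode, and the call site ID 2042 is an arbitrary example):

  import java.util.HashMap;
  import java.util.Map;

  class InstrumentedClient {
      private final Map cache = new HashMap();

      void update(Object key, Object value) {
          // original statement
          Object old = cache.put(key, value);
          // inserted call to the glue class of Figure 5.3
          Java_util_HashMap.put_after(2042, cache, key, value, old);
      }
  }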

Naively instrumenting a Java program can cause tracking of a large number of containers, which may introduce significant run-time overhead. Because thread-local and method-local containers3 are not likely to be the source of a leak, we employ an

3Containers that are not reachable from multiple threads, and whose lifetime is limited within their allocating methods.

  Name    Description                                               Purpose
  GCT     GC timestamps                                             To identify the leaking region
  GCM     Total live memory after GCs                               To identify the leaking region
  CONM    Memory taken up by containers                             To compute MC for containers
  CONT    Timestamps when measuring CONM                            To compute MC for containers
  CONA    Allocation times of containers                            To compute MC and SC for containers
  COND    Deallocation times of containers                          To compute MC and SC for containers
  OPR     Operations (csID, container, element, #elements, type)    To compute SC for containers

Table 5.3: Data collected by our profiler.

escape analysis to identify a set S of thread-local and method-local objects. We do

not instrument call sites if the points-to sets of their receiver variables are subsets of

S.

5.2.3 Profiling

Table 5.3 lists the types of data that need to be obtained by our profiler. In order

to identify the leaking region, we need to collect GC finishing times (GCT ), and live memory at these times (GCM ). This can be done by using JVMTI agents.

In order to compute MC for containers, we need to collect amounts of memory taken up by the entire object graphs of containers (CONM ), and the corresponding collection times (CONT ). We measure the memory usage of a container by traversing the object graph starting from the container (using reflection). As mentioned in

Section 5.1, it is impractical to compute the exact value of MC. Sampling is used

during the execution, and the obtained values are used to approximate the memory

time graph. Frequent sampling results in precise approximation, but increases run-

time overhead.
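A simplified version of this reflective traversal is sketched below; the per-object cost is a placeholder (one unit per visited object), whereas the actual tool uses a more accurate size estimate:

  import java.lang.reflect.Array;
  import java.lang.reflect.Field;
  import java.lang.reflect.Modifier;
  import java.util.ArrayDeque;
  import java.util.Collections;
  import java.util.Deque;
  import java.util.IdentityHashMap;
  import java.util.Set;

  final class GraphMeasure {
      /* Rough footprint of the object graph reachable from root, in "object units". */
      static long measure(Object root) {
          if (root == null) return 0;
          Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
          Deque<Object> work = new ArrayDeque<Object>();
          work.push(root);
          long size = 0;
          while (!work.isEmpty()) {
              Object o = work.pop();
              if (!visited.add(o)) continue;
              size++; // placeholder cost; a real tool would add the shallow size of o
              if (o.getClass().isArray()) {
                  if (!o.getClass().getComponentType().isPrimitive())
                      for (int i = 0; i < Array.getLength(o); i++) {
                          Object e = Array.get(o, i);
                          if (e != null) work.push(e);
                      }
                  continue;
              }
              for (Class<?> c = o.getClass(); c != null; c = c.getSuperclass())
                  for (Field f : c.getDeclaredFields()) {
                      if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) continue;
                      f.setAccessible(true);
                      try {
                          Object v = f.get(o);
                          if (v != null) work.push(v);
                      } catch (IllegalAccessException ignored) { }
                  }
          }
          return size;
      }
  }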

We launch periodic object graph traversals (for a set of tracked containers) every time after a certain number of gc-events is seen. The number of gc-events between two traversals can be given as a parameter to our tool to control precision and overhead.

Our experimental study indicates that choosing 50 as the number of gc-events between traversals can keep the overhead low while achieving high precision.

Because an object graph traversal can be expensive, this task is assigned to a newly-created thread executing in parallel with the main program. Note that such a solution is particularly well-suited for modern architectures with multi-core processors. Once a container operation is performed (i.e., record in Figure 5.3 is invoked),

record adds the ID of the container to a global queue. When the given number of

gc-events complete, our JVMTI agent activates this thread, which reads IDs from

the queue, retrieves the corresponding objects, and performs graph traversals. The

allocation time of a container (CONA) can be collected by the instrumentation at the allocation site, and our JVMTI agent can provide the deallocation time of a tagged container (COND).
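The division of work between application threads and the measurement thread can be pictured as in the following schematic sketch; in the real implementation the activation comes from the JVMTI agent and the tag-to-object lookup goes through JVMTI, both of which are only stubbed here:

  import java.util.Queue;
  import java.util.concurrent.ConcurrentLinkedQueue;

  final class TraversalWorker extends Thread {
      /* Container IDs enqueued by record() (Figure 5.3) from the application threads. */
      static final Queue<Integer> pendingContainers = new ConcurrentLinkedQueue<Integer>();

      private final Object wakeup = new Object();

      /* Conceptually invoked by the JVMTI agent after the configured number of gc-events. */
      void activate() {
          synchronized (wakeup) { wakeup.notify(); }
      }

      @Override public void run() {
          while (true) {
              synchronized (wakeup) {
                  try { wakeup.wait(); } catch (InterruptedException e) { return; }
              }
              Integer id;
              while ((id = pendingContainers.poll()) != null) {
                  Object container = resolveTag(id); // stub for the JVMTI tag lookup
                  if (container != null) {
                      long units = measure(container);
                      // a full implementation would append (id, units, timestamp) to the CONM/CONT log
                  }
              }
          }
      }

      private static Object resolveTag(int id) { return null; } // placeholder
      private static long measure(Object o) { return 0; }       // reflective traversal, as sketched earlier
  }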

In order to compute SC for containers, we have to record every operation that a tracked container performs (OPR). Because OPR events can result in large amounts of data, we use a data compression strategy to reduce space overhead. The OPR data is stored in a tree structure. Data at a higher level of the tree is likely to be more frequently repeated. For example, type java.util.HashMap, which is at the highest

Figure 5.4: Compressed recording of OPR events. [The figure shows a tree with the root at the top, container types (e.g., java.util.HashMap) at the first level, container IDs (e.g., 2345765) at the second level, call-site/operation codes CSID*10+OPR_TYPE (e.g., 2042) at the third level, and leaf tuples of elementID, #elements, and timestamp (e.g., 122365, 125, 1145).]

level of the tree, appears in the event sequence for many container IDs. Similarly, for a single container ID, many call sites and operations need to be recorded. The tree representation is illustrated in Figure 5.4. The type of container is a parent of the container ID. A child of the container ID is a combination of the call site ID and the operation type (encoded as a single integer csID*10+opr type). The leaf

nodes contain tuples of element ID, number of elements in the container before this

operation, and a timestamp.
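In Java terms, the compressed log can be represented by nested maps that mirror the tree of Figure 5.4; the class below is a simplified sketch of that representation:

  import java.util.ArrayList;
  import java.util.HashMap;
  import java.util.List;
  import java.util.Map;

  final class OprLog {
      /* Leaf tuple: element ID, number of elements before the operation, timestamp. */
      static final class Entry {
          final int elementID, numElements;
          final long timestamp;
          Entry(int e, int n, long t) { elementID = e; numElements = n; timestamp = t; }
      }

      /* container type -> container ID -> (csID*10 + operation type) -> leaf tuples */
      private final Map<String, Map<Integer, Map<Integer, List<Entry>>>> root = new HashMap<>();

      void record(String type, int containerID, int csID, int oprType,
                  int elementID, int numElements, long timestamp) {
          root.computeIfAbsent(type, k -> new HashMap<>())
              .computeIfAbsent(containerID, k -> new HashMap<>())
              .computeIfAbsent(csID * 10 + oprType, k -> new ArrayList<>())
              .add(new Entry(elementID, numElements, timestamp));
      }
  }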

Keeping too much profiling data in memory degrades program performance. We

periodically record the data to disk to reduce its influence on the run-time execution.

The frequency of recording is the same as that of object graph traversal: our JVMTI

agent creates a recording thread that is activated at the same time as the graph

traversal thread is activated. All the threads synchronize when recording to disk is

about to start. As discussed earlier, selecting an appropriate recording (and sampling)

rate is key to reducing the run-time overhead.

5.2.4 Data Analysis

Our current implementation performs an offline analysis of the collected data after the program finishes or runs out of memory. Thus, the end of the leaking region τe is the ending time of the program. The implementation can easily be adapted to run the analysis online (in another process) and generate the report while the original program is still running.

The first step of the analysis is to scan GCT and GCM information to determine the leaking region. The current implementation employs 0.5 as the ratio used to define this region, which means that at least half of the gc-events form a subsequence with increasing memory consumption (recall the leak region definition from Section 5.1).
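One way to implement this scan is sketched below; it assumes the GC timestamps and the corresponding live-memory values are given as parallel arrays, and it reads the region definition as: the earliest sample from which a non-decreasing subsequence of live-memory values covers at least half of all gc-events. This is only an illustration of the idea, not the exact procedure used by the tool.

  final class LeakRegion {
      /* gcTimes[i] is the finishing time of the i-th GC; liveMem[i] the live memory after it. */
      static double findRegionStart(double[] gcTimes, long[] liveMem) {
          int n = gcTimes.length;
          if (n == 0) return 0;
          for (int start = 0; start < n; start++) {
              int[] best = new int[n];     // longest non-decreasing subsequence ending at i
              int longest = 0;
              for (int i = start; i < n; i++) {
                  best[i] = 1;
                  for (int j = start; j < i; j++)
                      if (liveMem[j] <= liveMem[i]) best[i] = Math.max(best[i], best[j] + 1);
                  longest = Math.max(longest, best[i]);
              }
              if (longest * 2 >= n) return gcTimes[start]; // the 0.5 ratio used above
          }
          return gcTimes[n - 1]; // no qualifying region found
      }
  }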

After the smallest τs that satisfies this constraint is found, each container’s OPR data is uncompressed into individual operations and they are sorted by timestamp. The container ID and its operation list are stored in map oper map. For each container,

the analysis also determines the first operation that is performed after τs; the container

ID and the number of container elements at this first operation are stored in map size map. Operations that occurred before τs are discarded.

For each container, CONM and CONT data is used to approximate the memory

time graph and the MC value. The approximation assumes that the memory used by the container does not change between two samples. Thus, MC is computed as Σ_{i=0}^{n−1} (CON_{T,i+1} − CON_{T,i}) × CON_{M,i}, where i represents the i-th sample.

Algorithm 1: Computing SC for containers.

 1: FIND_SC(Double τe, Double τs, Map size_map, Map oper_map)
 2:   /* operation list for each container */
 3:   List oper_list
 4:   /* The result map contains each container ID and its SC */
 5:   Map result = ∅
 6:   for each container ID c in oper_map do
 7:     Map temp = ∅  /* a temporary helper map */
 8:     oper_list = oper_map.get(c)
 9:     Integer total = 0  /* total number of elements */
10:     Double sum = 0  /* sum of staleness */
11:     /* Number of elements in c at time τs */
12:     Integer ne = size_map.get(c)
13:     for each operation opr in oper_list do
14:       if opr.type == "ADD" then
15:         temp.add(opr.elementID, opr.timestamp)
16:       end if
17:       if opr.type == "GET" then
18:         update temp with (opr.elementID, opr.timestamp)
19:       end if
20:       if opr.type == "REMOVE" then
21:         if temp.contains(opr.elementID) then
22:           Integer lastget = temp.get(opr.elementID)
23:           sum += opr.timestamp − lastget
24:           total += 1
25:           temp.remove(opr.elementID)
26:         else
27:           /* The element is added before τs */
28:           sum += opr.timestamp − τs
29:           total += 1
30:           ne −= 1
31:         end if
32:       end if
33:     end for
34:     if temp.size > 0 then
35:       /* These elements are never removed */
36:       for each elementID in temp do
37:         Integer lastget = temp.get(elementID)
38:         sum += τe − lastget
39:         total += 1
40:       end for
41:     end if
42:     if ne > 0 then
43:       /* Elements are added before τs and never removed */
44:       sum += (τe − τs) × ne
45:       total += ne
46:     end if
47:     c.SC = (sum/total)/(τe − τs)
48:     result.add(c, c.SC)
49:   end for
50:   return result

Algorithm 1 shows the computation of SC for containers. The algorithm scans a container’s operation list, and for each element ID, finds its last GET operation, its

REMOVE operation, and the distance between them. (Recall that the deallocation

of the container is treated as a set of REMOVE operations on all elements.) For

an element that is added before τs (lines 27–30), staleness is the distance between

the REMOVE operation and τs. For an element that is never removed (lines 34–39),

staleness is the distance between τe and the last GET operation. For elements that

are added before τs and never removed (lines 42–45), staleness is τe − τs.

Leaking call sites For each element in a container, the analysis finds the

call site ID corresponding to its last GET operation. Then, it computes the average

staleness of elements whose last GET operations correspond to that same call site

ID. These call site IDs are then sorted in decreasing order of this average value.

Thus, the tool reports not only the potentially leaking containers (sorted by the LC

value), but also, for each container, the potentially leaking call sites (with their source

code location) sorted in descending order by their average staleness. Our experience

indicates that this information can be very helpful to a programmer trying to identify

the source of the memory leak bug.
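The per-call-site ranking can be computed directly from the per-element staleness results, for instance as follows (names are illustrative; staleness values are assumed to be already computed as described above):

  import java.util.ArrayList;
  import java.util.HashMap;
  import java.util.List;
  import java.util.Map;

  final class CallSiteRanking {
      /* One tracked element: the call site of its last GET and its staleness. */
      static final class ElementInfo {
          final int lastGetCallSite;
          final double staleness;
          ElementInfo(int cs, double s) { lastGetCallSite = cs; staleness = s; }
      }

      /* Returns call site IDs sorted by decreasing average staleness of their elements. */
      static List<Integer> rank(List<ElementInfo> elements) {
          Map<Integer, double[]> acc = new HashMap<>(); // csID -> {sum of staleness, count}
          for (ElementInfo e : elements) {
              double[] a = acc.computeIfAbsent(e.lastGetCallSite, k -> new double[2]);
              a[0] += e.staleness;
              a[1] += 1;
          }
          List<Integer> sites = new ArrayList<>(acc.keySet());
          sites.sort((x, y) -> Double.compare(acc.get(y)[0] / acc.get(y)[1],
                                              acc.get(x)[0] / acc.get(x)[1]));
          return sites;
      }
  }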

5.3 Empirical Evaluation

To evaluate the proposed technique for container-based memory leak detection for

Java, we performed a variety of experimental studies focusing on leak identification

and execution overhead. Section 5.3.1 illustrates the ability of our technique to help

a programmer find and fix real-world bugs. Section 5.3.2 presents a study of the

incurred overhead.

5.3.1 Detection of Real-World Memory Leaks

The experiments were performed on a 2.4GHz dual-CPU PC with 2GB RAM, running Windows XP. Three different sampling/recording rates were used: 1/15gc,

1/50gc, and 1/85gc (i.e., once every 15, 50, or 85 gc-events). The experimental subjects were two memory leak bugs reported in the Sun bug database [141], a known leak in SPECjbb [138], as well as a bug contained in a leak example from an IBM developerWorks column [57].

Java AWT/Swing Bugs

About half of the memory leak bugs in the JDK come from AWT and Swing. This is the reason we chose two AWT/Swing related leak bugs #6209673 and #6559589 for evaluation. The first bug has already been fixed in Java 6, while the second one is still open and unresolved.

Bug report #6209673 describes a bug that manifests itself when switching between a running Swing application that shows a JFrame and another process that uses a different display mode (e.g., a screen saver); the Swing application eventually runs out of memory. According to a developer’s experience [104], the bug was very difficult to track down before it was fixed. We instrumented the entire awt and swing packages, and the test case provided in the bug report. We then ran the instrumented program and reproduced the bug. Figure 5.5 shows the tool reports with three sampling rates. Each report contains the top three containers, for each container the top three potentially leaking call sites (---cs), and the time used to analyze the data.

Sampling rates 1/15gc and 1/50gc produce the same containers, in the same order.

The first container in the reports is a HashMap in class javax.swing.RepaintManager.

We inspected the code of RepaintManager and found that the container was an instance field called volatileMap. The call site in the report (with average staleness

0.507) directed us to line 591 in the source code of the class, which corresponds to a

GET operation

Container:11324773 type: java.util.HashMap (LC: 0.449, SC: 0.495, MC: 0.825)
 ---cs: javax.swing.RepaintManager:591 (Average staleness: 0.507)

Container:18429817 type: java.util.LinkedList (LC: 0.165, SC: 0.194, MC: 0.820)
 ---cs: java.awt.DefaultKeyboardFocusManager:738 (0.246)

Container:8984226 type: java.util.LinkedList (LC: 0.051, SC: 0.062, MC: 0.809)
 ---cs: java.awt.DefaultKeyboardFocusManager:851 (0.063)
 ---cs: java.awt.DefaultKeyboardFocusManager:740 (0.025)
Data analyzed in 149203ms
                    (a) 1/15gc sampling rate

Container:29781703 type: java.util.HashMap (LC: 0.443, SC: 0.480, MC: 0.855)
 ---cs: javax.swing.RepaintManager:591 (Average staleness: 0.480)

Container:2263554 type: class java.util.LinkedList (LC: 0.145, SC: 0.172, MC: 0.814)
 ---cs: java.awt.DefaultKeyboardFocusManager:738 (0.017)

Container:399262 type: class javax.swing.JPanel (LC: 0.038, SC: 0.044, MC: 0.860)
 ---cs: javax.swing.JComponent:796 (0.044)
Data analyzed in 21593ms
                    (b) 1/50gc sampling rate

Container:15255515 type: java.util.HashMap (LC: 0.384, SC: 0.426, MC: 0.835)
 ---cs: javax.swing.RepaintManager:591 (0.426)

Container:19275647 type: java.util.LinkedList (LC: 0.064, SC: 0.199, MC: 0.244)
 ---cs: java.awt.SequencedEvent:176 (0.204)
 ---cs: java.awt.SequencedEvent:179 (0.010)
 ---cs: java.awt.SequencedEvent:128 (1.660E-4)

Container:28774302 type: javax.swing.JPanel (LC: 0.036, SC: 0.042, MC: 0.839)
 ---cs: javax.swing.JComponent:796 (0.042)
Data analyzed in 10547ms
                    (c) 1/85gc sampling rate

Figure 5.5: Reports for JDK bug #6209673.

image = (VolatileImage)volatileMap.get(config)

The tool report indicates that the image obtained at this call site may not be properly removed from the container. For a programmer that is familiar with the code, this information may be enough to identify the bug quickly. Since the code was new for

us, we had to learn more about this class and the overall display-handling strategy

of Swing to understand the bug. Because the bug was already resolved, we examined

the bug evaluation, which confirmed that volatileMap is the root of the leak. The

cause of the bug is caching by RepaintManager of all VolatileImage objects, regard-

less of whether or not they are currently valid. Upon a display mode switch, the old

GraphicsConfiguration objects under the previous display mode get invalidated and

will not be used again. However, the VolatileImage for an obsolete GraphicsConfigu-

ration is never removed from volatileMap, and hence all resources allocated by the image continue taking up memory until an OutOfMemory error occurs.

Note that the report with sampling rate 1/85gc “loses” the LinkedList in

DefaultKeyboardFocusManager, which appears as the second container in the other two reports. Although this container is not the source of the bug, it demonstrates that sampling at 1/85gc may not be frequent enough to maintain high precision. Note that analysis time decreases with the decrease in sampling rate, because the tool loads and processes less data during the analysis.

Compared to our reports, existing approaches that keep track of generic objects

(i.e., do not have our container-centric view) would report allocation sites of some types of objects that either (1) continuously grow in numbers or (2) are not used for a while. For bug #6209673, for example, there are growing numbers of objects of numerous types that are reachable by VolatileImage and GraphicsConfiguration objects. Tools such as Cork [71] have to backward-traverse the object graph from the growing objects to find the type of objects that do not grow in numbers. However, the useless objects are inter-referenced, and moreover, traversing back from these growing objects can potentially find multiple types whose instances remain unchanged. In

this case, the container that holds GraphicsConfigurations, the JFrame window, the

GraphicsDevice object, the map that holds VolatileImages, etc. can all be data

structures that are backward-reachable from the growing objects and whose numbers

of instances do not grow. Tools such as Sleigh [20] report errors based solely on the

staleness of objects. In this case, the JFrame object would be the most stale object

because it is never used after it is created. In addition, there are numerous types of

objects that are more stale than VolatileImages, such as all the components in the

frame. Hence, Sleigh could report all these objects as the sources of the leak, including

many false positives. Finally, both of these existing approaches require non-standard

JVM modifications and support, while our technique uses only code instrumentation

and the standard JVMTI interface.

Currently, the report generated by our tool does not contain calling context information, which has been considered to be useful in locating the cause of a bug. This, in fact, does not undermine the practical effectiveness of our tool. Unlike tools that track arbitrary objects and therefore need this information to locate the bug-inducing operations, our tool pinpoints the cause containers and the last GET operations, which are strong indications of the location of the bug. For instance, many call chains reported by Sleigh from [20] go from the last use sites of stale objects backward to call sites that invoke container methods, which are important to investigate. These call chains would not be as useful if the call sites corresponding to container operations were directly reported (as in our tool).

Report #6559589 describes a bug in Java 6 build 1.6.0 01: calling method

JScrollPane.updateUI() in a Swing program that uses JScrollPane causes the

Container:5678233 type: java.util.Vector (LC: 0.890, SC: 0.938, MC: 0.427)
 ---cs: java.awt.Window:1825 (0.938)

Container:3841106 type: java.beans.PropertyChangeSupport (LC: 0.645, SC: 0.779, MC: 0.427)
 ---cs: java.awt.Component:7007 (0.779)

Container:24333128 type: javax.swing.UIDefaults (LC: 0.644, SC: 0.875, MC: 0.087)
 ---cs: javax.swing.UIDefaults:334 (0.868)
 ---cs: javax.swing.UIDefaults:308 (0.660)
Data analyzed in 454ms
                    (a) 1/15gc sampling rate

Container:5678233 type: java.util.Vector (LC: 0.890, SC: 0.938, MC: 0.427)
 ---cs: java.awt.Window:1825 (0.938)

Container:30318493 type: java.beans.PropertyChangeSupport (LC: 0.668, SC: 0.828, MC: 0.288)
 ---cs: java.awt.Component:7007 (0.828)

Container:9814147 type: javax.swing.UIDefaults (LC: 0.101, SC: 0.327, MC: 0.175)
 ---cs: javax.swing.UIDefaults:334 (0.984)
 ---cs: javax.swing.UIDefaults:308 (0.903)
Data analyzed in 282ms
                    (b) 1/50gc sampling rate

Container:5678233 type: java.util.Vector (LC: 0.293, SC: 0.425, MC: 0.525)
 ---cs: java.awt.Window:1825 (0.425)

Container:30502607 type: javax.swing.JLayeredPane (LC: 0.117, SC: 0.221, MC: 0.441)
 ---cs: javax.swing.JComponent:796 (0.162)

Container:2665317 type: javax.swing.UIDefaults (LC: 0.096, SC: 0.363, MC: 0.124)
 ---cs: javax.swing.UIDefaults:334 (0.359)
 ---cs: javax.swing.UIDefaults:308 (0.340)
Data analyzed in 297ms
                    (c) 1/85gc sampling rate

Figure 5.6: Reports for JDK bug #6559589.

number of listeners to grow. Because it is common knowledge that PropertyChange-

Listeners are managed by java.beans.PropertyChangeSupport, we modeled this class as a container and wrote a glue class for it. The generated reports are shown in

Figure 5.6. The first container in all three reports is a vector in java.awt.Window,


Figure 5.7: Memory footprint before and after fixing JDK bug #6559589.

which corresponds to an instance field ownedWindowList. Line 1825 of Window con-

tains an ADD operation

ownedWindowList.addElement(weakWindow)

Field ownedWindowList is used to hold all children windows of the current window.

The reporting of this call site by the tool indicates that when a Window object is added to the vector, it may not be properly removed later. We quickly concluded that this cannot be the source of the bug, because windows in a Swing program usually hold references to each other until the program finishes. This forced us to look at the second container in reports (a) and (b), which is a PropertyChangeSupport object in java.awt.Component. The reported call site at line 7007 of Component is

changeSupport.addPropertyChangeListener(listener)

The container is an instance field changeSupport, which stores all PropertyChange-

Listeners registered in this component. The call site indicates that the bug may be

caused by some problem in JScrollPane that does not appropriately remove listen-

ers. Registering and unregistering of listeners for JScrollPane is done in a set of

ScrollPaneUI classes. The test case uses a metal look and feel, which is represented

by class MetalScrollPaneUI, a subclass of BasicScrollPaneUI. We checked method

uninstallListeners in MetalScrollPaneUI, which is supposed to release listeners

from the component, and found that this method calls the method with the same

name in its super class, but does not remove the scrollBarSwapListener object

held by a private field in the subclass. Further investigation revealed an even more

serious problem: method uninstallListeners in the subclass was not executed at

all, because its signature was different from the signature of the method with the

same name in superclass BasicScrollPaneUI:

/* BasicScrollPaneUI */

void uninstallListeners(JComponent c)

/* MetalScrollPaneUI */

void uninstallListeners(JScrollPane scrollPane)

Hence, the causes of the bug are (1) uninstallListeners in MetalScrollPaneUI fails to override the appropriate method in superclass BasicScrollPaneUI, and (2) the listener defined in subclass MetalScrollPaneUI is not removed by its own uninstallListeners. We modified the code accordingly, and the memory leak disappeared. The memory footprint before and after fixing the bug is illustrated in

Figure 5.7. We have submitted our modification as a comment in the bug database.
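Our change followed the pattern sketched below; the sketch is schematic (the listener registration code is omitted) but shows the two ingredients of the fix, namely overriding the correct signature and removing the subclass's own listener:

  import java.beans.PropertyChangeListener;
  import javax.swing.JComponent;
  import javax.swing.plaf.basic.BasicScrollPaneUI;

  public class FixedMetalScrollPaneUI extends BasicScrollPaneUI {
      private PropertyChangeListener scrollBarSwapListener;

      /* The parameter type now matches the superclass method, so this override
         is actually executed when the UI is uninstalled. */
      protected void uninstallListeners(JComponent c) {
          super.uninstallListeners(c);
          /* Also drop the listener registered by this subclass, so the component's
             changeSupport no longer references it. */
          c.removePropertyChangeListener(scrollBarSwapListener);
      }
  }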

Again, the report that used the 1/85gc sampling rate failed to include the PropertyChangeSupport object, which is the source of the leak.

SPECjbb Bug

Benchmark SPECjbb2000 simulates an order processing system and is intended for evaluating server-side Java performance [138]. The program contains a known memory leak bug that manifests itself when running for a long time without changing

Container:4451472 type: java.util.Hashtable (LC: 0.135, SC: 0.190, MC: 0.659)
 ---cs: spec.jbb.StockLevelTransaction:225 (0.214)
 ---cs: spec.jbb.StockLevelTransaction:211 (0.190)

Container:7776424 type: java.util.Hashtable (LC: 0.110, SC: 0.157, MC: 0.659)
 ---cs: spec.jbb.StockLevelTransaction:211 (0.157)
 ---cs: spec.jbb.StockLevelTransaction:225 (0.114)

Container:28739781 type: java.util.Hashtable (LC: 0.102, SC: 0.146, MC: 0.654)
 ---cs: spec.jbb.StockLevelTransaction:211 (0.146)
 ---cs: spec.jbb.StockLevelTransaction:225 (0.122)
Data analyzed in 4078ms
                    (a) before modeling of longBTree, using 1/50gc

Container:27419736 type: spec.jbb.infra.Collections.longBTree (LC: 0.687, SC: 0.758, MC: 0.666)
 ---cs: spec.jbb.District:264 (0.826)
 ---cs: spec.jbb.StockLevelTransaction:225 (0.624)
 ---cs: spec.jbb.StockLevelTransaction:211 (0.519)

Container:21689791 type: spec.jbb.infra.Collections.longBTree (LC: 0.685, SC: 0.757, MC: 0.662)
 ---cs: spec.jbb.District:264 (0.783)
 ---cs: spec.jbb.StockLevelTransaction:211 (0.370)
 ---cs: spec.jbb.District:406 (2.944E-4)

Container:27521273 type: spec.jbb.infra.Collections.longBTree (LC: 0.667, SC: 0.727, MC: 0.727)
 ---cs: spec.jbb.Warehouse:456 (0.798)
 ---cs: spec.jbb.District:264 (0.784)
 ---cs: spec.jbb.StockLevelTransaction:211 (0.484)
                    (b) after modeling of longBTree, using 1/50gc

Figure 5.8: Report for the SPECjbb2000 bug.

warehouses. The report generated by our tool for rate 1/50gc is shown in Figure 5.8.

Due to the imprecision of using sampling rate 1/85gc, the report for it is not shown.

We also do not show the report of using sampling rate 1/15gc, because the containers and their order in this report are the same as in the report for 1/50gc.

The program was first instrumented without modeling any user-defined containers.

The result is shown in Figure 5.8(a). It is straightforward to see that none of the containers in the list are likely to leak memory, because their confidences are very

small. The first container in the report refers to a hashtable that holds stocks of an

order line. We did not find any problem with the use of this container. However, we

observed that the order lines are actually obtained from an order table, which has a

type of longBTree. We found that longBTree is a container class that implements a

BTree data structure and is used to hold orders. It took us several minutes to write a glue class for longBTree. We then re-instrumented and re-ran the program. The

resulting tool report is shown in Figure 5.8(b). The top three containers in the report

are now instances of longBTree.

Line 264 of spec.jbb.District is an ADD operation

orderTable.put(anOrder.getId(), anOrder)

which indicates that orderTable may leak memory. Methods removeOldestOrder,

removeOldOrders, and destroy contain REMOVE operations for orderTable. We

focused on the first two methods, because destroy could not be called when a

district is still useful. Using a standard IDE, we found the callers of these meth-

ods: removeOldestOrder is called only once within DeliveryTransaction, and

removeOldOrders is never called. Therefore, when a transaction completes, it re-

moves only the oldest order from the table, which causes the heap growth. Inserting

code to remove orders from the table fixed the bug. We used less time (a few hours)

than the authors of [71] did (a day) to locate the bug in this program, which we had

never studied before.

Memory Leak from Java DeveloperWorks

An IBM developerWorks column [57] contains a sample leak bug that uses behav-

iors of three containers to illustrate different levels of leaking severity. Specifically, the

first container never removes elements. The second container removes elements after

they become useless, but its usage has an error that leads to several Integer objects

not being removed in each iteration. The third container removes all elements before

the end of each iteration. The program executes in iterations and exercises operations

of all the three containers in each iteration. We modify the program by adding small

objects (arrays of 1000 integers) to the first container (that causes the quick leak),

adding large objects (arrays of 10000 integers) to the second container (that causes

the slow leak), and adding even larger objects (arrays of 50000 integers) to the third

container. Although this is not a real-world leak bug, the presence of leaks of multiple

levels of severity is useful for us to evaluate the sensitivity of the proposed leak con-

fidence model in separating containers that have different contributions to the leak.

We instrumented the program and ran it until OutOfMemoryError was caught. The

generated report for this program is shown in Figure 5.9.

Note that the leak confidence computed for the first container is much larger than those computed for the remaining two containers. The primary reason is that this container does not remove any elements during the execution. In other words, its large staleness contribution dominates the fact that the other two containers contain large objects. It is interesting to see that the third (leak-free) container is ranked higher than the second (slowly-leaking) one: they have similar staleness contributions, but the third one consumes much more memory. According to this report, the third container is actually the one to which we need to pay more attention, since it has more significant influence on run-time performance than the second one. Removing elements earlier, after they become useless (instead of waiting until the end of an iteration) can fix the problem. From this experiment, it can be seen that the leak confidence model presented in this chapter is not only able to find leaking containers,

Container:18296328 type: class java.util.ArrayList (LC: 0.921, SC: 0.985, MC: 0.011)
 ---cs: LeakExample:25 (0.327)   // first container

Container:24764524 type: class java.util.ArrayList (LC: 0.008, SC: 0.035, MC: 0.241)
 ---cs: LeakExample:57 (0.031)   // third container

Container:16224256 type: class java.util.ArrayList (LC: 0.007, SC: 0.037, MC: 0.180)
 ---cs: LeakExample:39 (0.033)   // second container
Data analyzed in 47275ms
                    (a) Report for the bug from IBM developerWorks, using 1/15gc

Container:18296328 type: class java.util.ArrayList (LC: 0.845, SC: 0.971, MC: 0.01)
 ---cs: LeakExample:25 (0.327)   // first container

Container:12077888 type: class java.util.ArrayList (LC: 0.008, SC: 0.031, MC: 0.245)
 ---cs: LeakExample:57 (0.031)   // third container

Container:16224256 type: class java.util.ArrayList (LC: 0.005, SC: 0.033, MC: 0.141)
 ---cs: LeakExample:39 (0.032)   // second container
Data analyzed in 26198ms
                    (b) Report for the bug from IBM developerWorks, using 1/50gc

Figure 5.9: Report for the leak bug from IBM developerWorks.

but can also help to understand run-time performance issues and to find execution

bottlenecks.

Leak-Free Programs

The tool was also used to analyze several programs that have been used widely and tested extensively for years, and do not have any known memory leaks. Table 5.4 shows the confidence values computed for these programs. The goal of this experiment was to determine whether the tool produced any false positives on these (almost certainly) leak-free programs. The low confidence values reported by the tool are the expected and desirable outcome for this experiment.

Program    Confidence
antlr      4.1E-5
chart      2.7E-6
fop        1.3E-5
hsqldb     4.4E-7
jython     5.0E-8
luindex    9.1E-5
lusearch   2.3E-2
pmd        4.3E-6
xalan      5.2E-5
jflex      1.8E-7

Table 5.4: Confidences for leak-free programs.

5.3.2 Static and Dynamic Overhead

This section describes our study of the overhead introduced by our technique.

This study utilizes the three real-world bugs described earlier, as well as a set of

Java programs shown in Table 5.4. The benchmark set includes 9 programs from the

DaCapo suite [40], and jflex, a lexer generator. For each DaCapo program, we ran it

with default workload size, and the input for jflex is a grammar file corresponding to a

finite state machine with 21769 states. We ran the instrumented programs with rates

1/15gc and 1/50gc. The maximum JVM heap size for each run was set to 512MB

(JVM option Xmx512m). For each sampling rate, we ran the programs once with the

default initial heap size and once with a large initial heap size (JVM option Xms512m),

in order to observe different numbers of gc-events. We checked the generated reports

under the four configurations (i.e., 1/15gc with 512M heap, 1/50gc with 512M heap,

1/15gc with default heap, and 1/50gc with default heap) and found that for all

programs, the top five containers reported by our tool under these configurations are

the same. The remaining containers (other than the top five) in each report had fairly

small confidence values, and therefore, we can assume that the precision loss under

configurations with 512M initial heap or 1/50gc sampling rate is small enough so that

it is not hurtful in the practical use of the tool.

Program    #IS    #IS_e   IT (s)
antlr       176    123      87
chart       894    867     202
fop        1378   1375     125
hsqldb      684    674     116
jython      443    416     135
luindex     442    409      65
lusearch    442    388      81
pmd         814    690     111
xalan       755    752     114
jflex       522    438      92
bug 1      3109   2768     487
bug 2      3105   2770     502
specjbb      74     73     142

Table 5.5: Static analysis running time.

Table 5.5, Table 5.6, and Table 5.7 describe the static analysis running time, the dynamic overhead of the tool when using 1/15gc, and when using 1/50gc sampling rate, respectively. In Table 5.5, columns #IS and #IS_e represent the numbers of call sites instrumented without and with employing escape analysis, respectively. Column IT

(“instrumentation time”) represents the static overhead of the tool—that is, the time

(in seconds) it takes to produce the escape-analysis-based instrumented version of the original code. Column RT_o (“running time”) in both Table 5.6 and Table 5.7 contains the original running times of the programs.

The dynamic overhead of the approach is described in the remainder of Table 5.6 and Table 5.7. Columns GC_d and GC_l show the numbers of gc-events with the default and with the large initial heap size, respectively. Similarly, RT_d and RT_l show the program running times with these two choices of initial heap size. OH_d and OH_l, respectively, represent run-time overhead introduced by our tool when executed with

Program    RT_o(s)   #GC_d   RT_d(s)   #GC_l   RT_l(s)   %OH_d    %OH_l
antlr        17.9      387     18.4       10     18.1      2.8%     1.1%
chart         8.5     5368     38.0      185     35.4    347.0%   316.4%
fop           4.5      693      8.6       24      7.8     91.1%    73.3%
hsqldb        4.3       54      4.7        8      4.4      9.3%     2.3%
jython        7.3     1653     31.8      126     28.2    335.6%   286.3%
luindex      19.5     1446     24.4       40     23.7     25.1%    21.5%
lusearch      2.9      418      9.1       21      3.9    213.8%    34.5%
pmd           5.9     2938     26.9      716     18.4    355.9%   211.9%
xalan         1.4      655      7.7       30      4.0    450%     185.7%
jflex        45.1     4171    170.7     1493    130.3    278.5%   188.9%
bug 1           –    18630      600     7420      600        –        –
bug 2        38.1      512     53.0      243     42.3     39.1%    11.0%
specjbb         –    18605     3600    15080     3600        –        –
average         –        –        –        –        –    195.3%   121.2%

Table 5.6: Dynamic overhead I: (a) original running time; (b) running with 1/15gc rate; (c) run-time overhead.

the default heap size and the 512M heap size, which correspond to RT_d and RT_l in

columns (c). More specifically, these two values are computed as

OH_d = (RT_d − RT_o) / RT_d

OH_l = (RT_l − RT_o) / RT_l

The average overhead for each configuration is added at the bottom of the tables.

For bug 1 and specjbb, we ran the test case for 10 minutes and an hour, respectively, because the execution of these two programs does not terminate.

Applying escape analysis reduces the number of call sites that need to be tracked

(the reduction varies from 3 to 124 call sites), while still maintaining reasonable instrumentation time. For the purpose of illustration and comparison, the dynamic overhead under different configurations is shown in Figure 5.10.

Program    RT_o(s)   #GC_d   RT_d(s)   #GC_l   RT_l(s)   %OH_d    %OH_l
antlr        17.9      387     18.4       10     18.1      2.8%     1.1%
chart         8.5     4109     36.5      185     35.1    329.4%   312.9%
fop           4.5      545      8.9       24      6.4     97.8%    42.2%
hsqldb        4.3       54      4.7        8      4.4      9.3%     2.3%
jython        7.3     1440     31.4      126     28.5    330.1%   290.4%
luindex      19.5     1390     23.9       40     23.7     22.6%    21.5%
lusearch      2.9      326      8.2       23      3.2    182.8%    10.3%
pmd           5.9     2766     25.2       37      6.6    327.1%    11.9%
xalan         1.4      605      6.2       18      3.7    342.9%   164.3%
jflex        45.1     2126    165.8      665    88.05    267.6%    95.2%
bug 1           –    11457      600     1983      600        –        –
bug 2        38.1      413     52.2       37       42     37.0%    10.2%
specjbb         –    16789     3600    10810     3600        –        –
average         –        –        –        –        –    177.2%    87.5%

Table 5.7: Dynamic overhead II: (a) original running time; (b) running with 1/50gc rate; (c) run-time overhead.

Using the same sampling rate, running a program with a large initial heap size takes less time, because this configuration reduces the number of gc-events, which in turn reduces the numbers of thread synchronizations, disk accesses, and object graph traversals performed by the dynamic analysis. For the same reason, decreasing the sampling rate reduces the run-time overhead.

By employing both larger initial heap and smaller sampling rate, the average run-time overhead for the programs can be reduced to 87.5%. Such overhead is acceptable for bug detection, but it may be too high for production runs. One possible approach to reducing overhead is to selectively instrument a program. Based on the manifestation of the bug, developers may have preferences and hints as to where to focus the effort of the tool. For example, certain parts of the program that are decided not to be the cause of the bug do not need to be instrumented and tracked at


Figure 5.10: Dynamic overhead under different configurations.

run time. The continuous optimization of the tool is part of our future work. For

example, the optimization may focus on reducing thread synchronizations within the

dynamic analysis. Our current implementation uses JVMTI, which runs agents and

invokes event callbacks in threads. The running of these threads adds synchronization

overhead. In addition, the retrieval of a container object from its tag through JVMTI

also contributes to the execution overhead. Hence, another part of our future work

is to re-implement the tool within an existing open-source JVM, such as the Jikes

RVM [70], in order to avoid the overhead caused by JVMTI.

5.4 Summary and Interpretation

This chapter presents a novel technique for detecting memory leaks for Java.

Unlike existing “from-symptom-to-cause” dynamic analyses for leak detection, the proposed approach employs a higher-level abstraction, focusing on container objects and their operations. The kind of data-based activity considered in this chapter

138 is, thus, data flow between containers and the rest of the program. The approach uses both memory contribution and staleness contribution to decide how significant a container’s leaking behavior is. This chapter presents an implementation of this technique and a set of experimental studies, demonstrating that the proposed tool can produce precise bug reports at a practical cost. These promising results indicate that the technique and any future generalizations are worth further investigation.

In this work, method semantics is provided by programmers as annotations. It would be interesting and useful to develop an analysis that could automatically infer such semantic information from the program source code. In Chapter 7, we propose such a static analysis that can automatically infer these annotations. This analysis may also be used by the approach described in this chapter, in order to understand method semantics so that programmers’ annotation efforts can be avoided.

This is the first memory leak detection technique that explicitly uses semantic information about the target program to help problem diagnosis. Given the size and complexity of today’s large-scale applications, there is little hope that a completely automatic analysis can precisely identify performance bottlenecks. We believe that developers’ insight is key to improving the precision and real-world usefulness of similar performance analysis tools. Research presented in this chapter has demonstrated that a program-semantics-aware bloat detector can produce much more relevant information than an analysis that does not rely on any semantic information. The next chapter presents another semantics-aware memory leak detection technique, called

LeakChaser. In that work, the approach relies on user-provided specifications of object lifetime relationships to help diagnose memory leaks.

CHAPTER 6: Making Sense of Transaction Behaviors: Memory Leak Detection Using LeakChaser

A Java memory leak is an important source of memory bloat which can have significant impact on program performance and scalability. Existing leak detectors attempt to find root causes of memory leaks by starting from the objects that look suspicious (i.e., the leak symptom) and traversing the object graph on the heap.

Due to the complexity of this graph, such an attempt is often a heuristics-based process that ends up reporting a sea of likely problems with the true causes buried among them. A detailed discussion of existing memory leak detectors can be found in the previous chapter. Because a memory leak often occurs due to the inappropriate handling of certain events, and existing tools profile the whole program execution without any focus, there is little hope that a completely automated tool can precisely pinpoint the problematic area(s) for large-scale Java applications. While the container profiling work (described in the previous chapter) avoids heuristics by focusing on containers [158], the root causes of many memory leaks are not containers, but instead cached references that the programmer forgot to invalidate, as represented by the leak cases in SPECjbb2000 and Eclipse bug #115789 (Section 6.4).

For a more focused leak detector, it is necessary to take advantage of human insights and use programmer specifications to guide leak detection. While prior work proposes heap assertions that can be checked during GC [1, 9, 112, 149], these low- level assertions may only be employed by programmers who have deep understanding

of an application and its algorithms, data structures, etc. This requirement limits

significantly their usefulness in leak detection for existing Java applications (where no

assertions were written during development), as very few performance experts have

sufficient program knowledge to use these assertions during postmortem tuning and

debugging.

Insight Almost all memory leaks we have studied are in regularly occurring

program events: unnecessary references can quickly accumulate and cause the memory

footprint to grow if each such event forgets to clean up a small number of references.

The techniques proposed in this chapter are based on the following two obser-

vations about these frequently executed code regions. First, in these regions there

usually exist implicit invariants among lifetimes of objects (e.g., “objects a and b must die together,” or “the old configuration object must die before the new configuration object is created”). When such invariants are violated, memory leaks result. It is often not easy to use heap assertions from prior work (which are typically based on reachability properties in the object graph) to express these lifetime relationships.

Second, for each such region, there often exist objects that are strongly correlated with the liveness of the entire region. For example, consider a (web or database) transaction in an enterprise Java application. A typical lifecycle consists of a sequence of events such as the creation of the transaction object, the creation of all other objects used in this transaction, the deallocation of these (other) objects, and the deallocation of the transaction object. For this lifecycle, the transaction object is the first one that is created and the last one that is reclaimed, and it is thus the object that controls the liveness of this entire transaction region.

Our proposal We propose a three-tier approach that exploits these insights

to help a programmer quickly identify unnecessary references leading to a memory

leak. This approach can be used by both novices and experts to diagnose memory

problems. The key idea is to introduce high-level semantics by explicitly considering

coarse-grained events where leaks are observed. We refer to these events as transactions. As extensions of (and inspired by) enterprise transaction models (e.g., EJB transactions), our transactions describe frequently executed code regions with user-defined boundaries. Objects associated with a transaction fall into three categories:

(1) transaction identifier object, (2) transaction-local objects, and (3) objects shared among transactions.

The identifier object of a transaction controls the lifetime of the transaction: all transaction-local objects should be created after this object is created and should die before it dies. Only shared objects are allowed to live after the identifier object dies.

Next, we introduce the three tiers of our approach in descending order of their levels of abstraction, which is often the order in which a programmer uses our tool to solve a real-world problem.

Tier H (the high-level approach): At this highest level of the approach, our framework attempts to automatically infer transaction-local and shared objects from the execution, while the programmer only needs to specify transaction boundaries and identifier objects. The tool starts to work in inference mode, and once objects shared among transactions are identified, it switches to check whether the (inferred) shared objects can actually be used in other transactions. Violations are reported if these shared objects are not used for a certain period of time. The programmer’s task at this level is the easiest to perform: we found that even when we had not studied

a program before, we could quickly identify these (coarse-grained) events in order to

perform the diagnosis.
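For instance, for a server that processes requests in a loop, a tier-H specification could look roughly like the following; beginTransaction and endTransaction are illustrative names for the boundary specification, and the supporting types are defined only to make the sketch self-contained:

  final class RequestLoop {
      void serve(RequestSource source) {
          while (source.hasNext()) {
              Request req = source.next();
              /* The request object is the transaction identifier: objects created while
                 handling it are expected to die no later than req itself. */
              Tool.beginTransaction(req);
              try {
                  handle(req);
              } finally {
                  Tool.endTransaction(req);
              }
          }
      }
      private void handle(Request req) { /* application logic */ }
  }

  interface RequestSource { boolean hasNext(); Request next(); }
  final class Request { }
  final class Tool {
      static void beginTransaction(Object id) { }
      static void endTransaction(Object id) { }
  }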

Tier M (the medium-level approach): In this tier, the user needs to specify not

only the boundary of a transaction and the transaction identifier object, but also the

objects shared across transactions, which the user does not specify in tier H.

LeakChaser checks the given specifications, and reports violations if the specified transaction-local objects in one instance of the transaction escape to another instance.

While employing this tier of the tool requires the user to have a deeper understanding of the program and the likely cause of the problem, it can generate a more precise report.

Tier L (the low-level approach): This lowest level is essentially an assertion framework that allows the user to specify lifetime invariants for focused memory leak detection. The framework contains binary assertions that directly express object lifetime relationships. For instance, one important assertion is to specify that object a must die before object b. This assertion fails only when LeakChaser observes definitive evidence, e.g., b is dead while a is still live.
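As a tier-L example, the lifetime invariant between an old and a new configuration object (mentioned earlier) could be asserted as soon as both objects exist; the assertion method name below is illustrative rather than the exact API:

  final class ConfigurationManager {
      private Configuration current;

      void reload(Configuration fresh) {
          Configuration old = current;
          current = fresh;
          if (old != null) {
              /* The old configuration must die before the new one: if a later GC
                 observes that fresh is dead while old is still live, the assertion
                 fails and the retaining references can be reported. */
              HeapAsserts.assertDiesBefore(old, fresh);
          }
      }
  }

  final class Configuration { }
  final class HeapAsserts {
      static void assertDiesBefore(Object first, Object second) { /* checked at GC time */ }
  }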

Compared to reachability-based assertions [1,9,112,149], our framework has three advantages. First, passing/failing of our assertions does not depend on where GC runs are triggered and thus, assertion checking does not produce false positives. Second, program locations where our assertions are placed have no influence on the evaluation of these assertions. For example, we can assert that object a dies before object b immediately after they are created, while a reachability-based assertion such as assertDead in prior work has to be placed in a location where the asserted object is about to become unreachable [1]. Third, our framework can be used to assert

arbitrary objects whose lifetimes are correlated due to some high-level semantics

(e.g., events), while a reachability-based assertion can work only on objects that have low-level structural relationships (e.g., reachability in the object graph).

Of course, using this assertion framework requires one to understand considerable design and implementation details of the program such as how a data structure is constructed. However, performing this level of diagnosis can give the user a very precise report. Hence, these assertions can be added by developers during coding, which may ease significantly the diagnosis of memory problems when performance degradation is observed. In fact, the transaction-based properties described in the other two tiers are translated by our framework into these low-level assertions at run time. Section 6.2 presents details about this translation.

As the level of abstraction decreases (from tier H to tier L), the diagnosis becomes more focused. Figure 6.1 illustrates the process of memory leak detection using this framework. In our experience, LeakChaser is especially useful to a performance expert who is unfamiliar with the program code, since he or she can follow an iterative process that involves all these levels of diagnosis (i.e., from tier H down to tier L). The programmer starts with the tier H analysis with little program knowledge and insight.

By repeating a higher-level analysis a few times (with refined specifications) and inspecting its reports, the programmer gains a deeper understanding of the program as well as more insight into the problem. As a result, she will be able to move on to a lower-level analysis for a more focused diagnosis. Very often, this process ends up narrowing down the information to the exact cause of the leak.

For large-scale applications, LeakChaser allows programmers to specify transactions at clients of these applications, without digging into application implementation


Figure 6.1: Illustration of the diagnosis process. Spirals at each level indicate that a user may need to run each tier multiple times with refined specifications (e.g., smaller transaction regions, more precise shared object specifications, etc.) to gain sufficient knowledge to proceed to a lower tier.

details. For example, to diagnose problems in a large database system, the pro-

grammer only needs to create transactions at client programs that perform database

queries. We found that this feature of the tool is quite useful in simplifying the

diagnostic task: all previous techniques require a programmer to understand a fair

amount of low-level details of the system before he can start the diagnosis. This

burden is reduced significantly by LeakChaser.

Implementation and experiments  We implemented our approach in Jikes RVM 3.1.0 (http://jikesrvm.org), a high-performance Java-in-Java virtual machine, and successfully applied it to real leaks in large-scale applications such as Eclipse and MySQL. The implementation techniques are discussed in detail in Section 6.3, and Section 6.5 evaluates the analysis overhead. We add one extra word to the header of each run-time object; this space is used to store the assertion information for the object. The overall space overhead of the tool is less than 10%, including both the extra header space and the memory used to store the metadata of our analysis. Using the optimizing compiler and the Immix garbage collector [16], the current implementation imposes an average slowdown of 2.3× for GC alone and 1.1× for overall execution for the framework infrastructure (i.e., with no assertions added). Additional overhead is incurred for checking and inferring specifications. For example, after adding assertions to SPECjbb2000, 256,236 assertions were executed and the overall slowdown was 5.5×. While this overhead is probably too high for production runs, we found it acceptable for performance tuning and memory leak diagnosis.

Section 6.4 presents six case studies on real-world memory leak problems. Among these problems, four are true leaks; for the remaining two, programmers experienced high memory footprints but were not sure whether there were leaks. Using our tool, we quickly identified root causes for the true leaks, and found reasons that could explain the high memory consumption in the other two cases. In SPECjbb2000, in addition to the already-known leak, we found memory issues that have not been reported previously and that actually cause more severe performance degradation than the already-known leak. We observed significant performance improvements after fixing these problems.

The experimental results strongly indicate that the proposed three-tier diagnosis methodology can be adopted in real-world development and tuning to find and prevent memory leaks, and that LeakChaser is useful in helping a programmer quickly identify unnecessary references that lead to leaks and other memory issues.

The contributions of this work are:

• A three-tier methodology that introduces different levels of abstraction to help both experts and novices understand and diagnose memory leak problems.

• A new heap assertion framework that allows programmers to assert object liveness properties instead of using reachability to approximate liveness.

• An implementation of LeakChaser in Jikes RVM that piggybacks on garbage collection to check assertions.

• Six case studies demonstrating that LeakChaser can help a programmer unfamiliar with the program source code to quickly find root causes of memory leaks.

6.1 Overview

We illustrate our technique using a simplified version of a real-world memory leak (Eclipse bug #115789). Figure 6.2(a) shows the code that contains the leak. This bug can be easily reproduced on Eclipse 3.1.2 by comparing the structures of two large JAR files multiple times using an Eclipse built-in comparison option. Method runCompare implements a comparison operation (in plugin org.eclipse.compare). It is invoked every time the comparison option is chosen, and this information can easily be obtained from the Eclipse plugin APIs. The method takes a parameter of type ISelection that contains information about the two selected files to be compared. A ResourceCompareInput object is first created using this parameter (line 3). This object is fairly heavyweight, as it caches the complete structures of the two files. The names of these files are then recorded in a list visitedFiles (lines 4 and 5), and this list may be used by the workspace GUI upon receiving a user request to view the history.

Next, openCompareEditorOnPage is invoked to open a compare editor in the workspace GUI that shows the differences between the two files.


Figure 6.2: Overview of the technique, illustrated using a simplified version of Eclipse bug #115789.

The (simplified) method body is shown at lines 11–17. In this method, the new ResourceCompareInput object is cached in a NavigationHistoryInfo object retrieved from the current workbench page, for future save or restore operations (lines 14–16). Caching this heavyweight input object is actually the cause of the leak: the structures of the two files keep being created and referenced. As a result, the memory footprint grows quickly, and Eclipse runs out of memory. Note that in the real Eclipse code, this cache operation (at line 16) and the runCompare method are in two different plugins, which makes it particularly hard to diagnose the problem, as these plugins are written by different groups of programmers. The two plugins communicate only through public interfaces, and it is unclear to programmers of one plugin what side effects the other plugin can have. It is interesting to see that, despite the fact that the developers of plugin org.eclipse.compare are aware that the input must be cleared after the comparison (in fact, the comment at line 8 is from the real code), these references are still unnecessarily kept somewhere outside their scope.

While these plugins have been studied in previous work [20, 71], we start with the tier H approach to simulate what a programmer would do at the beginning of a diagnostic task. Hence, our experience with this case, to a large degree, reflects how a programmer unfamiliar with a program can use the tool to diagnose memory leaks.

We first need to identify a transaction and let the tool infer unnecessary references for the transaction. This is easy: as the comparison is the regularly occurring event leading to the leak, it is natural to let the transaction cross the entire body of method runCompare, as shown in Figure 6.2(a). A transaction creation (at line 2) takes two parameters: a transaction identifier object and a mode in which the transaction works. The identifier object must be unique per transaction and must be created before the transaction starts. Here we choose s as the identifier object, because s refers to an ISelection object that is created per comparison operation before runCompare runs. Constant INFER informs the tool to run in inference mode.

We found that the identifier object is usually easy to find for a transaction: a transaction body often crosses a method that is invoked on an existing object. In many cases, either the receiver object or a parameter object of the method can be selected as the transaction identifier object (as long as it is created per transaction and does not cross multiple transaction invocations).
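For concreteness, the tier H annotation we used has roughly the following shape, written in the pseudocode style of the assert language and of Figure 6.2(a); identifiers such as fWorkbenchPage approximate the simplified code rather than the exact Eclipse names:

    void runCompare(ISelection s) {
        transaction(s, INFER) {                     // s is created once per comparison, so it can be the identifier
            fInput = new ResourceCompareInput(s);   // heavyweight: caches the full structures of both files
            visitedFiles.add(new String(s.left));   // file names recorded for possible history requests
            visitedFiles.add(new String(s.right));
            openCompareEditorOnPage(fInput, fWorkbenchPage);
            fInput = null;                          // intended release of the input
        }
    }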

For each transaction, its spatial boundary is specified by a pair of curly brackets (i.e., {...}) and its temporal boundary is defined by the lifetime of its identifier object. To find unnecessary references, we focus on objects created within its spatial boundary, and check their lifetimes against its temporal boundary. Informally, the semantics of a transaction is that each transaction-local object must die4 before the identifier object, and only shared objects can live after the identifier object dies. In inference mode, where a programmer does not explicitly specify local and shared objects, our tool infers shared objects automatically: objects created within the spatial boundary are treated as shared objects if they are still live at the time the identifier object dies. Once an object is marked as shared, the tool starts tracking its staleness and records a violation if it is not used for a given period of time, based on the intuition that an object that one transaction intends to share with other transactions should be used outside its creating transaction. Violations are aggregated and eventually reported in the order of their frequencies.

The following example shows a typical violation report that includes information about the transaction where the violating object is created, its creation site, violation type, the number of times this violation occurs, and the (heap) reference paths that lead to this object at the time of the violation. Each line in a reference path shows the information of an object on the path. While only one reference path is shown here (for illustration) for this violation, multiple paths were actually reported by the tool.

Transaction specified at: CompareAction:runCompare(ISelection), ln 2
Violating objects created at:
    CompareEditorInput:createOutlineContents(widgets.Composite), ln 439
Violation type: Objects shared among transactions are not used
Frequency: 4
Reference paths:
    Type: ArrayList, created at: NavigationHistory: (WorkbenchPage), ln 44
    --> Type: Object[], created at: ArrayList: (I), ln 119
    --> Type: NavigationHistoryEditorInfo, created at:
        NavigationHistory:createEntry(...), ln 553
    --> Type: ResourceCompareInput, created at:
        CompareAction:runCompare(ISelection), ln 3

4 In this chapter, an object is considered to die immediately after it becomes unreachable in the object graph.

This reference path makes it easy to explain why this ResourceCompareInput object is shared: the reference path exists because the object is cached (transitively) by a NavigationHistory object. However, this is not the only violation in the tier H report. The two strings created at lines 4 and 5 (together with many other objects) are also in the report, as they are not used at all if there is no user request to perform history-related operations. It takes some time to inspect this report, as it contains a total of 36 violations. By ranking violations (based on frequencies and other factors), it is not hard to eliminate many of them that are lower on the list and that are obviously not leaks. For example, after inspecting the warnings, we determine that these small strings (and other similar objects) are not the major cause of the leak, because the growth of used memory is so significant that it is unlikely to be due to small objects (especially because their frequencies are not significantly higher).

As we obtain this knowledge (regarding these strings and other irrelevant violating objects), we move down to tier M for a more focused analysis. In Figure 6.2(b), we explicitly mark these strings (and others created by the two add calls) as shared (with a share region) and let our tool run in checking mode. Objects created in the transaction but not in the share region are marked (implicitly) as transaction-local objects. Our tool then ignores objects marked as shared and reports violations only when transaction-local objects are found live after the transaction identifier object dies. This gives us a much cleaner report (with 4 violations), and of course, the violation shown earlier appears in the report, indicating that the ResourceCompareInput object is referenced from somewhere else.
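The tier M version differs from the tier H sketch above only in the mode and the explicit share region; again a sketch in the style of Figure 6.2(b), with approximate identifiers:

    void runCompare(ISelection s) {
        transaction(s, CHECK) {                     // checking mode: only explicit specifications are verified
            fInput = new ResourceCompareInput(s);   // implicitly transaction-local
            share {                                 // objects created here may legitimately outlive the transaction
                visitedFiles.add(new String(s.left));
                visitedFiles.add(new String(s.right));
            }
            openCompareEditorOnPage(fInput, fWorkbenchPage);
            fInput = null;
        }
    }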

After these two rounds of diagnosis, we have gained implementation knowledge (e.g., the general procedure of performing a comparison operation) and some insight into the problem (e.g., this leak might be caused by an unnecessary reference somewhere in NavigationHistory). During code inspection, we become interested in the comment at line 8 of the code: this statement has a clear purpose of releasing the input object, so why did we see the object still reachable in both the tier M and tier H reports? With this question in mind, we decide to perform a detailed (tier L) diagnosis using a lifetime assertion (e.g., assertDB asserts a "dies-before" relationship), as shown in Figure 6.2(c). This single assertion fails, which confirms our suspicion.

After some code inspection (with the help of the reference path associated with the violation), we found that NavigationHistory allows a user to step backward and forward through browsed editor windows. It keeps a list of NavigationHistoryEntry objects, each of which points to an EditorInfo object that, in turn, points to a CompareEditorInput object, the root of a data structure that holds the diff results. NavigationHistory uses a count to control the number of EditorInfo objects it caches, and removes an EditorInfo if the count drops to zero. However, NavigationHistory does not correctly decrement this count in some cases, leading to unnecessary references.

This example clearly shows the difficulty of diagnosing real-world memory problems. The root cause of this bug is that the NavigationHistory entries are cached and never removed due to reference-counting problems, and this occurs entirely on the UI side. The developers of the compare plugin may never have expected that calling a general interface to open a UI editor could cause a big chunk of memory to be cached. The complexity of the large-scale code base and the limited knowledge of each individual developer strongly call for LeakChaser's step-by-step approach. Such tool support can help a programmer who starts without insight into the program or its leak to systematically explore the leaky behavior in order to pinpoint its cause.

6.2 Assertions and Transactions

This section presents a formalism to describe our analysis of unnecessary references. The presentation proceeds in three steps. First, we define a simple garbage-collected language assert and its abstract syntax. We next give a semantics of this language by focusing on its traces. Each trace is a sequence of basic events (e.g., alloc, dealloc, and use) on objects, transaction events (e.g., start and end), and assertions. Finally, we formulate assertion checking and inference of unnecessary references as judgments on traces (i.e., trace validation). Note that in our implementation (discussed in Section 6.3), checking and inference are performed during GC runs. Here the trace collection phase and the trace validation phase are separated for ease of presentation and formal development.

Language assert  The abstract syntax and the semantic domains for language assert are defined in Figure 6.3. The program has a fixed set of global reference-typed variables. An allocation site has the form a = new ref_o, where o stands for an allocation site ID defined at compile time.

There are two types of assertions: assertDB and assertDBA. assertDB(a, b) asserts that object (pointed to by) a must die before (or together with) object (pointed to by) b.

(a) Abstract syntax:

    Variables         a, b ∈ V
    Allocation sites  o ∈ O
    Instance fields   f ∈ F
    Labels            l ∈ N
    Assertions        e  ::=  assertDB(a, b) | assertDBA(a, b)
    Transactions      t  ::=  transaction(a, m){ tb }
    Trans bodies      tb ::=  s ; tb | share { s } tb | ε
    Trans modes       m  ::=  CHECK | INFER
    Statements        s  ::=  a = b | a = new ref_o | a = null | while ... | a = b.f
                              | a.f = b | e | gc | if ... | s ; s
    Program           p  ::=  s ; p | t ; p | ε

(b) Semantic domains:

    Labeled objects   ô ::= o^l ∈ Φ
    Environment       ρ ∈ V → Φ ∪ {⊥}
    Heap              σ ∈ Φ × F → Φ ∪ {⊥}
    Operations        τ ::= ⟨ô, A | D | U⟩^o | ⟨ô, m, S | E⟩^t | ⟨S | E⟩^s | ⟨ô_a, ô_b, DBA | DB⟩^a
    Traces            α ::= τ , α | ε

Figure 6.3: A simple assert language: (a) abstract syntax (b) semantic domains.

Another assertion, assertDBA(a, b), specifies that object a must die before a new object is created by object b's allocation site (DBA is short for "Dies Before Allocation"). This assertion is useful to enforce a "replaces" relationship between two objects. For example, it can be used to enforce that an old (invalid) screen configuration is appropriately released before a new screen configuration is created upon repainting of an interface in a GUI program. As another example, in Figure 6.2(c), instead of using assertDB, we can also write assertDBA(fInput, fInput) to assert that the current ResourceCompareInput object must die before the next ResourceCompareInput object (created by the same allocation site) is allocated. While our framework includes a few other assertions, they are not discussed because they can be implemented using these two basic assertions.
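As a sketch of the "replaces" idiom just mentioned (the class Screen, field currentConfig, and method repaint are hypothetical names used only for illustration):

    class Screen {
        private ScreenConfig currentConfig;   // hypothetical field holding the active configuration

        void repaint() {
            // the old (invalid) configuration must be dead before this allocation
            // site produces its replacement; otherwise a violation is reported at GC time
            if (currentConfig != null)
                assertDBA(currentConfig, currentConfig);
            currentConfig = new ScreenConfig(/* ... */);
            // ... redraw using currentConfig ...
        }
    }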

To ease the formal development, a GC run can only be triggered by a gc statement, which traverses the object graph to reclaim unreachable objects.

Note that we do not allow the nesting of transactions. While this is easy to implement in our framework, we have not found it helpful for pinpointing memory leak causes.

Each run-time object is labeled with its allocation site o and an integer l denoting the index of this object among all objects created by that allocation site. The environment and the heap are defined in standard ways. A trace is a sequence of operations. There are four types of operations, each of which is a tuple annotated with a type symbol: o for an object operation, t for a transaction operation, s for a share region operation, and a for an assert operation. Each object operation is an object and event pair, where an event can be either A (i.e., Alloc), D (i.e., Dealloc), or U (i.e., Use). Each transaction operation is a triple containing its identifier object, the mode (i.e., CHECK or INFER), and whether this operation corresponds to the start of the transaction (S) or its end (E). A share region operation has only a type that indicates whether it is the start or the end of the region. Each assert operation contains two input objects and an assertion type (DBA for assertDBA and DB for assertDB). While the language does not explicitly consider threads, our analysis is thread-safe, as each transaction creation is a thread-local event. Different transaction instances created by different threads can exist simultaneously for the same transaction declaration.

    (New)        α′ = α ◦ ⟨ô, A⟩^o    ô.allocsite = o    ô.l = newObjIndex(o)
                 ─────────────────────────────────────────────────────────────
                 a = new ref_o , ρ, σ, α ⇓ ρ[a ↦ ô], σ, α′

    (Load)       a = b.f, ρ, σ, α ⇓ ρ[a ↦ σ(ρ(b).f)], σ, α ◦ ⟨ρ(b), U⟩^o

    (Store)      a.f = b, ρ, σ, α ⇓ ρ, σ[ρ(a).f ↦ ρ(b)], α ◦ ⟨ρ(a), U⟩^o

    (Tran)       s, ρ, σ, α ◦ ⟨ρ(a), m, S⟩^t ⇓ ρ′, σ′, α′    α′′ = α′ ◦ ⟨ρ(a), m, E⟩^t
                 ─────────────────────────────────────────────────────────────
                 transaction(a, m){s}, ρ, σ, α ⇓ ρ′, σ′, α′′

    (ShareReg)   s, ρ, σ, α ◦ ⟨S⟩^s ⇓ ρ′, σ′, α′    α′′ = α′ ◦ ⟨E⟩^s
                 ─────────────────────────────────────────────────────────────
                 share{s}, ρ, σ, α ⇓ ρ′, σ′, α′′

    (AssertDB)   assertDB(a, b), ρ, σ, α ⇓ ρ, σ, α ◦ ⟨ρ(a), ρ(b), DB⟩^a

    (AssertDBA)  assertDBA(a, b), ρ, σ, α ⇓ ρ, σ, α ◦ ⟨ρ(a), ρ(b), DBA⟩^a

    (GC)         tr = ◦{⟨ô, D⟩^o | ô ∈ σ ∧ ô ∉ reachable(ρ, σ)}    ∀⟨ô, D⟩^o ∈ tr : ô ∉ σ′    α′ = α ◦ tr
                 ─────────────────────────────────────────────────────────────
                 gc, ρ, σ, α ⇓ ρ, σ′, α′

    (Comp)       s1, ρ, σ, α ⇓ ρ′, σ′, α′    s2, ρ′, σ′, α′ ⇓ ρ′′, σ′′, α′′
                 ─────────────────────────────────────────────────────────────
                 s1; s2, ρ, σ, α ⇓ ρ′′, σ′′, α′′

Figure 6.4: Operational semantics.

Trace collection and validation are also performed on a per-thread basis. A special symbol ⊥ is added to the heap and the environment to represent a null value.

Language semantics  An operational semantics is given in Figure 6.4. A judgment of the form s, ρ, σ, α ⇓ ρ′, σ′, α′ starts with a statement s, which is followed by an environment ρ, a heap σ, and a trace α. The execution of s terminates with a final environment ρ′, heap σ′, and trace α′. Trace concatenation is denoted by ◦. Rules New, AssertDB, AssertDBA, and Comp are defined as expected. In rules Load and Store, the trace α is augmented with an object use event. In rule Tran, for each transaction, the two transaction events (i.e., S and E) are added at the beginning and at the end of the trace for the execution of the sequence of statements in the transaction. The share region events are handled in a similar way (in rule ShareReg). Rule GC removes unreachable objects from the heap, and records deallocation events for them in the trace α.

Trace validation  Checking and inference of unnecessary references are formulated as judgments ω, γ, π, ι ⊢ α ⇝ ω′, γ′, π′, ι′ on traces. Here α is an execution trace, ω is a stack of transactions and share regions, γ is a "diesBefore" map in which each pair (ô_a, ô_b) has been asserted to have a "diesBefore" relationship (i.e., ô_a must die before ô_b), π is a "diesBeforeAlloc" map which contains pairs (ô_a, o) where object ô_a has been asserted to die before allocation site o creates a new object, and ι maps each object that has been marked shared (i.e., in inference mode) to its staleness value (measured in terms of the number of transactions). As we currently do not support nested transactions, ω can contain at most one transaction identifier object (and one ⊥ symbol indicating the execution is in a share region). Transaction nesting could be implemented easily by allowing ω to contain multiple identifier objects.

The validation rules are given in Figure 6.5. Validity checks are marked in the figure; the remaining clauses are environment updates. Rule VAlloc first ensures that if an object ô₁ has been asserted to die before this allocation site creates a new object (i.e., due to an assertDBA assertion), ô₁ is not live anymore. A violation is recorded if there is such an object. If ω is not empty (meaning the execution is currently in a transaction) and top(ω) is not ⊥ (meaning we are not in a share region), a pair (ô, top(ω)) is added to map γ because, as mentioned earlier, all transaction-local objects are asserted to die before the transaction identifier object, and top(ω) returns the identifier object of the current transaction.

    (VAlloc)    ∄ô₁ : (ô₁, ô.allocsite) ∈ π    [validity check]
                γ′ = γ ∪ (ô, top(ω)), if ω ≠ ∅ ∧ top(ω) ≠ ⊥;   γ′ = γ, otherwise
                ─────────────────────────────────────────────────────────────
                ω, γ, π, ι ⊢ ⟨ô, A⟩^o ⇝ ω, γ′, π, ι

    (VDealloc)  γ′ = γ \ {(ô, ∗)}    π′ = π \ {(ô, ∗)}
                ∄ô₁ : (ô₁, ô) ∈ γ  ∧  ι′ = ι \ {(ô, ∗)}, if ω = ∅ ∨ MODE = CHECK    [validity check]
                ι′ = (ι \ {(ô, ∗)}) ∪ {(ô₁, 0) | (ô₁, ô) ∈ γ}, otherwise
                ─────────────────────────────────────────────────────────────
                ω, γ, π, ι ⊢ ⟨ô, D⟩^o ⇝ ω, γ′, π′, ι′

    (VTranS)    ω′ = ω ◦ ô    MODE = m    π′ = π ∪ (ô, ô.allocsite)    ι′ = ι[∀p̂ : p̂ ↦ ι(p̂) + 1]
                ─────────────────────────────────────────────────────────────
                ω, γ, π, ι ⊢ ⟨ô, m, S⟩^t ⇝ ω′, γ, π′, ι′

    (VTranE)    ω = ω′ ◦ ô    ∀p̂ ∈ dom(ι) : ι(p̂) < T    [validity check]
                ─────────────────────────────────────────────────────────────
                ω, γ, π, ι ⊢ ⟨ô, m, E⟩^t ⇝ ω′, γ, π, ι

    (VUse)      ι′ = ι \ {(ô, ∗)}
                ─────────────────────────────────────────────────────────────
                ω, γ, π, ι ⊢ ⟨ô, U⟩^o ⇝ ω, γ, π, ι′

    (VShareS)   ω, γ, π, ι ⊢ ⟨S⟩^s ⇝ ω ◦ ⊥, γ, π, ι

    (VShareE)   ω = ω′ ◦ ⊥
                ─────────────────────────────────────────────────────────────
                ω, γ, π, ι ⊢ ⟨E⟩^s ⇝ ω′, γ, π, ι

    (VDB)       ω, γ, π, ι ⊢ ⟨ô₁, ô₂, DB⟩^a ⇝ ω, γ ∪ {(ô₁, ô₂)}, π, ι

    (VDBA)      ω, γ, π, ι ⊢ ⟨ô₁, ô₂, DBA⟩^a ⇝ ω, γ, π ∪ {(ô₁, ô₂.allocsite)}, ι

    (VTrace)    ω, γ, π, ι ⊢ τ ⇝ ω′, γ′, π′, ι′    ω′, γ′, π′, ι′ ⊢ α ⇝ ω′′, γ′′, π′′, ι′′
                ─────────────────────────────────────────────────────────────
                ω, γ, π, ι ⊢ τ, α ⇝ ω′′, γ′′, π′′, ι′′

    where  ω ∈ Transaction stack: N → Φ ∪ {⊥}
           γ ∈ "diesBefore" map: Φ → Φ
           π ∈ "diesBeforeAlloc" map: Φ → O
           ι ∈ Staleness map for shared objects: Φ → N
           MODE: mode of the current transaction (CHECK or INFER)
           T: threshold staleness value

Figure 6.5: Checking and inferring of unnecessary references.

VDealloc removes all entries (ô, ∗) in the three maps upon the deallocation of ô. If there is no running transaction, or the running transaction is in CHECK mode, this rule checks whether there exists (ô₁, ô) ∈ γ. If this is the case, a violation is reported, because ô₁ is still live at the time ô dies. If the running transaction is in INFER mode, such an ô₁ is marked as a shared object and the tool starts to track its staleness: a pair (ô₁, 0) is added to map ι.

Rule VTranS first pushes the transaction identifier object onto stack ω. It then adds a "diesBeforeAlloc" assertion on the identifier object itself: adding a pair (ô, ô.allocsite) indicates that this current instance ô is asserted to die before allocation site ô.allocsite creates a new object. As the lifetimes of transaction identifier objects are used to specify the temporal boundaries of transactions, they are not allowed to overlap; lifetime overlapping could lead to ambiguity in transaction behaviors. For each object inferred to be a shared object (i.e., p̂ ∈ dom(ι)), its staleness value is incremented. An object's staleness is defined as the number of transactions since it has been marked as shared. At the end of each transaction (rule VTranE), this staleness value is checked against a user-defined threshold T. A violation is reported if a shared object's staleness exceeds this threshold. This rule also pops stack ω. The stack becomes empty after this rule, indicating that no transaction is currently running.

VUse removes from ι an object marked as shared: its staleness is no longer tracked because the object has been used. Rules VShareS/VShareE push/pop ⊥ onto/from stack ω. Having ⊥ on top of ω means the execution is in a share region of a transaction. Rule VTrace specifies the composition of two traces.

Our system allows one to specify transactions and assertions simultaneously in a program. Transaction properties are translated into basic assertions, which are checked together with the assertions specified by programmers. For example, upon the allocation of each object ô in a transaction (shown in VAlloc), ô and the transaction identifier object (i.e., top(ω)) are added into map γ, and this relationship is checked in exactly the same way as a normal assertDB assertion (shown in VDB).
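For readers who prefer code to inference rules, the following Java sketch mirrors the per-thread validation state of Figure 6.5 under this chapter's simplifying assumptions (no nested transactions, a single-valued "diesBefore" map). All class, method, and field names here are ours; the actual LeakChaser implementation instead piggybacks on garbage collection (Section 6.3).

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.IdentityHashMap;
    import java.util.Map;

    class TraceValidator {
        enum Mode { CHECK, INFER }
        private static final Object SHARE_MARK = new Object();             // plays the role of ⊥ on the stack
        private final int stalenessThreshold;                              // T in Figure 6.5
        private final Deque<Object> omega = new ArrayDeque<>();            // transaction / share-region stack
        private final Map<Object, Object> gamma = new IdentityHashMap<>(); // "diesBefore" pairs
        private final Map<Object, Integer> pi = new IdentityHashMap<>();   // "diesBeforeAlloc" (alloc-site ids)
        private final Map<Object, Integer> iota = new IdentityHashMap<>(); // staleness of inferred shared objects
        private Mode mode = Mode.CHECK;

        TraceValidator(int stalenessThreshold) { this.stalenessThreshold = stalenessThreshold; }

        void alloc(Object o, int site) {                                   // rule VAlloc
            pi.forEach((asserted, s) -> {
                if (s == site) violation(asserted, "must die before this site allocates again");
            });
            if (!omega.isEmpty() && omega.peek() != SHARE_MARK)
                gamma.put(o, omega.peek());                                // local object must die before identifier
        }

        void dealloc(Object o) {                                           // rule VDealloc
            gamma.remove(o); pi.remove(o); iota.remove(o);
            for (Map.Entry<Object, Object> e : gamma.entrySet()) {
                if (e.getValue() != o) continue;                           // e.getKey() was asserted to die before o
                if (omega.isEmpty() || mode == Mode.CHECK) violation(e.getKey(), "outlives its partner");
                else iota.put(e.getKey(), 0);                              // INFER: mark shared, track staleness
            }
        }

        void tranStart(Object id, int idSite, Mode m) {                    // rule VTranS
            omega.push(id); mode = m;
            pi.put(id, idSite);                                            // identifier dies before its site reallocates
            iota.replaceAll((k, v) -> v + 1);                              // shared objects age by one transaction
        }

        void tranEnd() {                                                   // rule VTranE
            omega.pop();
            iota.forEach((k, v) -> { if (v >= stalenessThreshold) violation(k, "shared but never used"); });
        }

        void use(Object o) { iota.remove(o); }                             // rule VUse
        void shareStart()  { omega.push(SHARE_MARK); }                     // rule VShareS
        void shareEnd()    { omega.pop(); }                                // rule VShareE
        void assertDB(Object a, Object b)        { gamma.put(a, b); }      // rule VDB
        void assertDBA(Object a, int bAllocSite) { pi.put(a, bAllocSite); }// rule VDBA

        private void violation(Object o, String why) {
            System.out.println("violation: " + o + " " + why);
        }
    }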

6.3 Implementation

We have implemented LeakChaser in Jikes RVM 3.1.0, a high-performance Java virtual machine [70]. LeakChaser is publicly available on the Jikes RVM Research Archive.5

Metadata and instrumentation  LeakChaser adds one word to the header of each object to track the allocation site information of the object. There are two dynamic compilers in Jikes RVM that transform Java bytecode into native code. The baseline compiler compiles each method when it first executes. When a method becomes hot (i.e., is executed frequently), the optimizing compiler recompiles it at increasing levels of optimization. Our approach changes both compilers to add the necessary instrumentation. LeakChaser adds instrumentation at each allocation site in order to store an identifier for the source code location (class, method, and line number) in the allocated object's extra header word.
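Conceptually, the injected code at every allocation site amounts to the following; the helper name and the id encoding are ours, chosen only to illustrate the effect, since the real instrumentation is emitted inside the two compilers at the IR level:

    Foo obj = new Foo();
    // inserted by the compilers, in effect:
    LeakChaserRuntime.setAllocSiteId(obj, siteId);   // siteId encodes class, method, and line number;
                                                     // it is written into obj's extra header word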

Garbage collection LeakChaser performs assertion checks during garbage collection runs. It uses a table structure (discussed shortly) to represent the assertions, and scans this table to perform checks at the end of each GC run.

To implement quick assertion checks, we create an assertion table, shown in Fig- ure 6.6, to record asserted “diesBefore” and “diesBeforeAlloc” relationships during the execution. Once a pair of objects is asserted, we create an assertion table entry for each of them, and then update the LeakChaser-reserved header word in each object

5http://www.jikesrvm.org/Research+Archive

Figure 6.6: Assertion table implementation that allows quick assertion checks.

with the address of the object's corresponding assertion table entry. The pointer to the source code information of the object, initially stored in this space, is moved to a field of this table entry.

For each table entry, we let its "assertion element pointer" field point to a list of assertion elements, each of which represents an object that has been asserted together with this object. For example, for an assertion assertDB(a, b), we create an assertion element (DB, id_b, null) and store its address in a's assertion table entry. Once a new assertion assertDBA(a, c) is executed, a new assertion element (DBA, id_c, null) is created and appended to the list (i.e., the previous element's next field now points to this new element). In this way, all objects that have been asserted through assert...(a, . . .) are in an assertion chain that is checked when a's entry is traversed. We do not need to create an assertion element representing a and associate it with b's and c's table entries, as this would unnecessarily duplicate information.

The current implementation of LeakChaser supports all non-generational garbage collectors (e.g., MarkSweep, MarkCompact, and Immix). At the end of each GC run, the assertion table is scanned twice: the first scan marks entries that correspond to objects that are unreachable in this GC (i.e., dead objects), and the second scan performs violation detection. Hence, in the second scan, all table entries represent live objects. To slow down the growth of the assertion table, table entries corresponding to unreachable objects are reclaimed and later reassigned to newly asserted objects. Our current implementation does not work correctly with a generational garbage collector, as a nursery GC scans only part of the heap and thus may cause LeakChaser to report either false positives or false negatives.

For each entry, its assertion element chain is traversed. For an element whose type is DB, if the object represented by this element has been reclaimed, we report a violation. It is much more difficult to check a DBA assertion. We create a global array (per thread) and each entry in this array records the status of an allocation site. Once an assertDBA(a, b) assertion is executed, the array entry corresponding to b’s allocation site is marked as “ASSERTED”. When this allocation site creates a new object, the status of its array entry is changed to “ALLOCATED”. During the scanning of the assertion table, for an assertion element (DBA, i, *), a violation is reported if the entry of this global array corresponding to the allocation site of the object whose assertion table entry id is i is “ALLOCATED”, because it creates a new object before the asserted object dies. No false information can result from assertion failures reported by the tool because the assertions directly specify the liveness of objects (instead of reachability on the object graph) and report violations only when definitive evidence is observed.
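The following declarations sketch the data layout just described; the names, types, and exact field set are illustrative rather than the actual Jikes RVM code:

    class AssertionTableEntry {
        int sourceInfo;                // allocation-site info moved here from the object's header word
        boolean deadInThisGC;          // set by the first table scan
        AssertionElement elements;     // head of the chain of assertions made on this object
    }

    class AssertionElement {
        static final int DB = 0, DBA = 1;
        int type;                      // DB or DBA
        int partnerEntryId;            // table entry id of the other object in the assertion
        AssertionElement next;
    }

    // Per-thread status array for assertDBA, indexed by allocation-site id:
    //   NONE -> ASSERTED (when assertDBA(a, b) executes) -> ALLOCATED (when b's site allocates again).
    // Second scan, for each live entry and each of its elements:
    //   (DB, i):  report a violation if the object of entry i has been reclaimed;
    //   (DBA, i): report a violation if the status of entry i's allocation site is ALLOCATED.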

In this chapter, we focus on the real-world utility of LeakChaser rather than its performance. Hence, this assertion table is scanned during every GC. Future work could incorporate sampling to reduce overhead (i.e., scan the assertion table less frequently). To report the reference paths that lead to a violating object, we modify the worklist-based algorithm used by the tracing collector so that when a reachable object is added to the worklist, its reference path (from which it is reached) is also added (to another worklist). The length of this path can be determined by the user. We developed a technique that aggregates violations of objects that are created by the same allocation sites and whose reference paths match, so that a violation is reported only after its (aggregated) frequency exceeds a user-defined threshold value. LeakChaser filters out violations that are associated with VM objects. Future work could consider more powerful aggregation and ranking functions, such as a combination of frequency, staleness, and the amount of memory leaked. To make LeakChaser work with a generational GC, future work could perform the two assertion table scans separately: the first scan (that marks dead objects) would be performed at the end of every GC, while the second scan (that checks assertions and detects violations) would be performed only at the end of each full-heap GC.
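A sketch of the path-carrying traversal follows; the method name, the referencesOf callback, and the bounded-path policy are ours, standing in for the collector's internal transitive-closure code:

    import java.util.*;
    import java.util.function.Function;

    class PathTracer {
        /** Traces from the roots and records a bounded reference path for each reached object. */
        static Map<Object, List<Object>> trace(Iterable<Object> roots,
                                               Function<Object, Iterable<Object>> referencesOf,
                                               int maxPathLen) {
            Map<Object, List<Object>> pathOf = new IdentityHashMap<>();   // also serves as the mark set
            Deque<Object> worklist = new ArrayDeque<>();
            for (Object root : roots) {
                pathOf.put(root, new ArrayList<>(List.of(root)));
                worklist.add(root);
            }
            while (!worklist.isEmpty()) {
                Object obj = worklist.poll();
                List<Object> path = pathOf.get(obj);
                for (Object child : referencesOf.apply(obj)) {
                    if (pathOf.containsKey(child)) continue;              // each object is traced once
                    List<Object> childPath = new ArrayList<>(path);
                    childPath.add(child);
                    if (childPath.size() > maxPathLen) childPath.remove(0);  // keep only the last maxPathLen hops
                    pathOf.put(child, childPath);
                    worklist.add(child);
                }
            }
            return pathOf;
        }
    }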

6.4 Case Studies

We have evaluated LeakChaser on six real, reported memory leaks. While our ultimate goal is to evaluate with enterprise-level applications such as application servers and programs running on top of them, Jikes RVM fails to run some larger server applications (such as trade in the latest DaCapo benchmark set [15]).

For these six cases, we also applied Sleigh [20], a publicly available research mem- ory leak detector for Java. Sleigh finds leaks by tracking the staleness of arbitrary objects and reporting allocation and last-use sites for stale objects.

For each case, we compared our report with the report generated by Sleigh. We found that, first, for the same amount of running time, the information reported by our (even tier H) approach is much more relevant than that reported by Sleigh. Sleigh requires much more time to collect information and generate relatively precise reports because it is designed for production runs and uses only one bit per object to encode information statistically, whereas LeakChaser tracks much more information for debugging and tuning (at a higher cost). We did not run Sleigh for as much time as in the prior work [20], and thus the Sleigh results are different from those reported earlier.

Second, our iterative technique produces more precise information at each tier, which eventually guides us to the root causes of leaks.

Experience summary  We found that it is quite easy to specify transactions in large programs, even for users who have never studied the programs before. From our experience using LeakChaser, we generalize an approach that can be employed by performance experts to select transactions. All large-scale applications we have studied can be classified into two major categories: event-based systems (e.g., web servers and GUI applications) and transaction-based systems (e.g., enterprise Java applications and databases). An event-based system often uses a loop to deal with the different events received, dispatching them to their corresponding handlers. For such a system, our transaction can cross the body of the loop; that is, the handling of each event is treated as a transaction. Each event object can be used as the identifier object for the transaction. It is much easier to determine a transaction for a transaction-based system, as each "system transaction" in the original program can be naturally treated as a transaction in LeakChaser.
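Schematically, for an event-based system the transaction is placed around the body of the dispatch loop (written in the same pseudocode style as the earlier sketches; Event, eventQueue, and dispatch are illustrative names):

    while (running) {
        Event e = eventQueue.take();          // one event per iteration
        transaction(e, INFER) {               // the event object is fresh per iteration, so it can be the identifier
            dispatch(e);                      // the handling of one event is one LeakChaser transaction
        }
    }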

Case      #Tran    #TO        #LO        #SO       #V    #F
Diff      8        2048148    1707244    340904    36    14
Jbb       4346     256236     186752     69484     14    4
Editor    12       512471     506620     5851      2     13
WTP       20       2774556    2767237    7319      27    39
MySQL     10000    319529     170902     148627    11    0
Mckoi     100      2689366    2243888    445478    10    193

Table 6.1: Tier H statistics for transactions for the case studies. Shown are the number of transaction runs (#Tran), the total number of objects tracked (#TO), the number of transaction-local objects (#LO), the number of shared objects inferred (#SO), the number of violations reported after filtering (#V), and the number of violations filtered by our filtering system (#F).

A particularly useful feature of LeakChaser is that it allows programmers to quickly specify transactions at a client that interacts with a complex system, without digging into the system to understand its implementation details. For example, we diagnose Eclipse framework bugs by specifying transactions in plugins (i.e., clients) that trigger these bugs. Likewise, for databases such as MySQL and Mckoi, we only need to create transactions at client programs that perform database queries. This feature allows performance experts like us (who are unfamiliar with these databases) to quickly get started with LeakChaser. If we could not put transactions around clients, it would be hard to even find a starting point in these systems that have thousands of classes and millions of lines of code.

Table 6.1 shows transaction and assertion statistics for all six cases. Despite the large number of assertions checked for each benchmark, LeakChaser reports a small number of warnings. We show statistics for tier H only, as it is the coarsest-grained approach and thus runs the most assertions and reports the most violations.

Diff (Eclipse bug #115789)  We quickly found this bug using our three-tier approach and the detailed diagnosis process discussed in Section 6.1.

We tried Sleigh on this program, and the top four last-use sites (i.e., where objects are last used) reported by Sleigh are in class java.util.HashMap, in methods put and addEntry.

In this case, the sites leading to these HashMap operations point back to method createContainer in class org.eclipse.compare.ZipFileStructureCreator, which indicates that the structures of the input files are cached but not used. At this point, it is completely up to the programmer to find out why these structures become stale and which objects reference them. Recall that even a violation reported by our tier H approach shows that the violating object is referenced by a NavigationHistory object, which is actually the root cause of the leak (shown in Section 6.1). According to [20], Sleigh could have reported much more precise information if the program were run for significantly more time. For example, LeakChaser reported the root cause after only 8 structure diffs were performed, while Sleigh may need more than 1000 diffs to report relatively precise information regarding where the stale objects are created. In addition, as we observed in our experiments, reporting reference paths can be more helpful for finding leak root causes than reporting program locations, which are usually far from where the true problem occurs.

Jbb  SPECjbb2000 simulates an online trading system. It contains a known memory leak that has been studied many times [1, 20, 71, 158]. This bug is caused by unnecessary caching of Order objects. To investigate the usefulness of LeakChaser in helping programmers unfamiliar with the code, we attempted to simulate what these programmers would do. First, we needed to understand the basic modules and functions of the program so that we could identify regularly occurring events. This is not hard at all, as SPECjbb is a transaction-based system where a TransactionManager class runs different transaction types, and the transaction-creating method is only a few calls away from method main. This method contains a loop that retrieves a command from an input map in each iteration, and creates and runs a transaction whose type corresponds to the command received. This was the knowledge we obtained quickly before trying the tier H approach.

Problem 1: unused objects  We added a transaction that encloses the main Jbb transaction-creating method. Each Jbb transaction object is used as the transaction identifier object of its corresponding transaction. We ran the program on our modified VM and let LeakChaser infer unnecessary references. The tool reports 14 violations (detailed statistics are shown in Table 6.1). Most of the violating objects are String and History objects cached in orders, and the rest are Order objects transitively referenced by District objects. We inspected the source code and found that these String objects are created to represent district information or names of order lines, and the History objects are used to represent histories of orders made by each company. These objects are indeed never used in the program. We modified the program to eliminate these unused strings and History objects. The throughput improvements are shown in Figure 6.7(a). We did not expect to find this problem, which had not been reported before.

Problem 2: mutual references  It was unclear to us what to do about the Order objects reported above: while many are not used again, some are retrieved by subsequent transactions. We further inspected the code in order to run a more focused diagnosis. The reference paths in our tier H report showed that the unused Order objects are referenced by Customer objects and then by Company objects. Following this clue, we inspected all classes related to Company and Customer. One clear piece of information that we could take advantage of was the ownership relationships among Company and Customer objects and the objects referenced by them. For example, each customer has an order array that stores orders made by that customer. It is clearly problematic if a customer object dies while an order made by this customer is still alive. Using this information, we wrote 72 diesBefore assertions to assert such relationships.
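These ownership assertions take roughly the following shape (the constructor signature is simplified; we only illustrate where the assertion goes, one per ownership edge):

    class Order {
        Order(Customer owner /* simplified signature */) {
            assertDB(this, owner);   // the order must die before (or together with) its owning customer
            // ...
        }
    }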

We easily added these assertions in constructors, wherever an owned object is assigned to an owner object. Running this version of the program resulted in 4 violations, all of which were related to orders. By inspecting reference paths associated with these violations, we found the following three important problems. (1) Order and Customer are mutually referenced, which explains the persistent increase in Order objects, even though customers frequently drop orders. (2) Customer objects are not released even after their container arrays in Company die; they are not released because Company holds unnecessary references to these objects. (3) Order and Heap are mutually referenced, which prevents Heap objects from being garbage collected. Fixing all of these problems (including the unused String and History objects) led to both the elimination of the quick growth of memory consumption and an overall 12.7% throughput improvement (from 13,644 to 15,372 ops/sec). Previous work has not reported all of these problems that we found using LeakChaser.

An important last-use site reported by Sleigh is in method getObject of a BTree data structure that is used to retrieve Order objects from customers. The associated calling contexts tell us that this call is made transitively by processLine.

Figure 6.7: (a) Throughput improvements after fixing the memory problems detected in SPECjbb; (b) Comparison between the memory footprints of Mckoi before and after fixing the leak.

While this is indeed a method that needs to be fixed, the key to understanding the problem is learning about the mutual references among Order and other classes. Without such reference information, it remains a daunting task to identify the root cause and develop a fix.

Editor (Eclipse bug #139465)  This reported memory leak in Eclipse 3.2 occurs when opening a .outline file using a customized editor and making selections on the "outline" page. Objects created while displaying the file keep accumulating, and Eclipse eventually crashes. This bug is still open because users can rarely reproduce the leak. While we could not observe the reported symptoms, we still ran the test case with LeakChaser. We wrote a plugin that repeats multiple times the process of opening the editor, making selections, and closing the editor. We added a transaction in this plugin that crosses each such process (which manipulates the editor), and used the editor object as the transaction identifier.

The tier H approach reported only two violations even when we set a very small number (2) as the staleness threshold (see T in Figure 6.5). For each violation, we could not find any object on its associated reference paths that is related to the editor of interest. Hence, we quickly concluded that there were no unnecessary references for this case in Eclipse 3.2 on the Linux platform where we ran the experiment: had there existed a problem with respect to a transaction, it should have been reported by the tier H approach, as the approach captures references to all objects that are shared but not used. For this case, Sleigh reported four last-use sites, and we found it difficult to verify whether or not these sites are relevant. The key is to understand how these objects are reachable in the heap, instead of where they are used in the program.

WTP (Eclipse bug #155898)  According to the bug report for this memory leak in the Eclipse Web Tool Platform (WTP), the memory footprint grows quickly when copying and pasting large text blocks in the JSP editor associated with the WTP framework. This bug is still open because it cannot always be reproduced. One developer suspected that "the bug might have already been fixed by the addition of other features in a later version of WTP" one year after it was reported. We reproduced the problem and saw the memory footprint growth by writing a plugin that automatically copies and pastes text several times. Similarly to the previous case, we added a transaction in the plugin and let it cross each iteration that involves a pair of copy-and-paste operations. Our tier H approach reported a total of 26 violations, among which 11 seemed clearly irrelevant to the problem (e.g., regarding Eclipse's Java Development Tools and other plugins).

To confirm this observation, we used this plugin to perform the same copy-and-paste operations in a regular text editor in Eclipse, which does not have memory problems. These 11 violations also appeared in the generated report; thus, they were safely discarded. As this bug is caused by copying and pasting text, we focused on string-related violations. Only one violation among the remaining 15 was about String objects, shown as follows:

Transaction specified at: Quick_copy_pastePlugin:mouseUp(MouseEvent), ln 258
Violating objects created at:
    StructuredTextUndoManager:createNewTextCommand(String, String), ln 302
Violation type: objects shared among transactions are not used
Frequency: 28
Reference paths:
    Type: StructuredTextViewerUndoManager, created at:
        StructuredTextViewerConfiguration:getUndoManager(ISourceViewer), ln 469
    --> Type: StructuredTextUndoManager, created at:
        BasicStructuredDocument:getUndoManager(), ln 1823
    --> Type: BasicCommandStack, created at: StructuredTextUndoManager: (), ln 160
    --> Type: ArrayList, created at: BasicCommandStack: (), ln 67
    --> Type: Object[], created at: ArrayList:ensureCapacity(I), ln 176

This violation clearly shows that the strings are (transitively) referenced by a text undo manager. This information quickly directed us to the classes responsible for undo operations. The cause of the problem was clear to us almost immediately after inspecting how an undo operation is performed. In this JSP editor, all commands (and their related data) are cached in a command stack, in case an undo is requested in the future. There is a tradeoff between "undoability" (i.e., how many commands are cached) and performance, especially when there is a large amount of data being cached with each command. In this version of WTP (WTP 1.5 with Eclipse 3.2.0), this command stack can grow to be very deep (it is not cleared even when a save operation is performed), and thus many strings can be cached, leading to significant performance degradation. A later version has limited the depth of this stack (for other purposes), implicitly fixing this bug. For this case, we understood the problem even without moving to tier M. We did not manage to run Sleigh for this case, as Sleigh was developed on top of an older version of Jikes RVM (2.4.2), which cannot run Eclipse 3.2.

MySQL  This case is a simplified version of a JDBC application that exhausts memory if the application keeps using the same connection but different SQL statements to access the database. Prior work reproduced this leak to evaluate leak toleration [23], but no prior work reports the cause of the leak. The leak occurs because the JDBC library caches already-executed SQL statements in a container unless the statements are explicitly closed. We created a transaction that crosses the creation and execution of each PreparedStatement (in a loop). We executed 100 iterations, and the tier H approach reported 10 violations, all of which were related to PreparedStatement, regarding either the statement object itself or objects reachable from the statement.
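Our driver looked roughly like the following (schematic: the query, table name, and loop structure are ours; only the transaction placement and the missing close() call matter, and the transaction block is the pseudocode notation used throughout this chapter):

    Connection conn = DriverManager.getConnection(url, user, password);
    for (int i = 0; i < 100; i++) {
        Object iterId = new Object();                 // any fresh per-iteration object can serve as the identifier
        transaction(iterId, INFER) {
            PreparedStatement ps = conn.prepareStatement("SELECT * FROM t WHERE id = ?");
            ps.setInt(1, i);
            ps.executeQuery();
            // ps.close();   // absent in the leaky pattern: without it, the driver's Connection
            //                  keeps the executed statement in its internal container
        }
    }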

From the associated reference paths, it took only a few minutes for us to identify the container that caches the statements: a HashMap created at line 1486 (in the constructor) of class com.mysql.jdbc.Connection. For this case, Sleigh reported warnings after the program ran for 303 iterations, significantly longer than for LeakChaser. Sleigh reported a few last-use sites where PreparedStatement objects are last touched. We did not find the HashMap information in Sleigh's report, which is the key to tracking down this leak bug.

Mckoi  Mckoi (http://www.mckoi.com) is an open-source SQL database system written in Java. It contains a leak that previous work reproduced to evaluate leak survival techniques [22, 23]. However, none of the existing work has investigated the root cause of the leak. We reproduced the leak by writing a client that repeatedly (100 times) establishes a connection, executes a few SQL queries, and closes the connection. We started with our tier H approach and created a transaction that crosses each iteration of this process in the client. Our tool reported 10 warnings, all regarding objects that are reachable from a thread object of type DatabaseDispatcher. In all the warnings reported, this thread object references a DatabaseSystem object, which transitively caches many other never-used objects. By inspecting only the constructors of DatabaseSystem and DatabaseDispatcher, we found that they are mutually referenced. The creation of each DatabaseSystem object explicitly creates a DatabaseDispatcher object and runs this thread in the background. There are two major problems:

(1) DatabaseDispatcher runs in a while(true) loop, which means that no DatabaseSystem object can ever be garbage collected even though its dispose method is invoked. To address this problem, we broke this reference cycle when dispose is invoked on DatabaseSystem. Next, in DatabaseDispatcher, we modified the while(true) loop to terminate if its referenced DatabaseSystem object becomes null. This modification resulted in a very slight improvement in performance, leading us to believe there must be a bigger problem.
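The shape of our fix for problem (1) is sketched below; the field names and the loop body are simplifications of the Mckoi code, not a verbatim patch:

    class DatabaseSystem {
        private DatabaseDispatcher dispatcher;    // created (and started) when the DatabaseSystem is created

        void dispose() {
            // ... original cleanup ...
            dispatcher.system = null;             // break the DatabaseSystem <-> DatabaseDispatcher cycle
            dispatcher = null;
        }
    }

    class DatabaseDispatcher implements Runnable {
        volatile DatabaseSystem system;

        public void run() {
            while (system != null) {              // was: while (true)
                // ... dispatch background work ...
            }
        }
    }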

(2) A DatabaseSystem object can be created in two situations: (a) a new database needs to be started (e.g., a connection is established); (b) the method dbExists (which checks whether a database instance already exists) is called. In every iteration, two DatabaseSystem objects are created but only one of them gets disposed. The one created by dbExists is never reclaimed, because dispose is not explicitly invoked on this object, which we quickly discovered by using a call graph generated by an IDE tool. The developer took it for granted that this object would be garbage collected, but it is referenced by a live thread. We added a call that invokes dispose at the end of method dbExists.

For this case, all warnings generated by Sleigh are about objects cached (transitively) by DatabaseSystem, and they are far away from the leaking thread. It would be difficult to understand why these objects are not garbage collected without appropriate reference paths, as reported by LeakChaser. The memory footprint of the database before and after the leak is fixed is shown in Figure 6.7(b). The original version of the program ran out of memory at the 106th iteration, while the modified version ran indefinitely (as far as we could tell).

These six case studies demonstrate that developers do not need to have much implementation knowledge in order to diagnose leaks with LeakChaser. LeakChaser generates more relevant reports than Sleigh: it is easier to find the root cause from reference chains that cache a violating object than from the allocation or last-use site of the object.

6.5 Overhead

We evaluated the performance of our technique using 19 programs from the SPECjvm98 [139] and DaCapo [15] benchmarks, on a dual-core machine with an Intel Xeon 2.83 GHz processor running Linux 2.6.18. We ran each program using the large workload.

Bench      (a) Overall time        (b) GC time             (c) Space
           T1      T2      OT      G1      G2      OG      S1      S2      OS
check      0.033   0.068   2.1     0.016   0.033   2.0     11.4    10.3    0.9
compr      9.4     9.5     1.0     0.016   0.033   2.0     17.9    18.0    1.0
jess       2.2     2.4     1.1     0.022   0.049   2.2     48.9    58.4    1.2
db         4.9     5.0     1.0     0.024   0.062   2.6     22.2    24.9    1.1
javac      2.1     2.4     1.1     0.030   0.064   2.1     98.1    99.0    1.0
mpeg       6.2     6.2     1.0     0.020   0.039   2.0     19.4    19.8    1.0
mtrt       1.6     1.6     1.0     0.031   0.081   2.6     46.5    43.6    0.9
jack       1.7     1.8     1.0     0.025   0.050   2.0     79.6    76.8    1.0
antlr      38.7    43.2    1.1     0.032   0.074   2.3     53.5    57.3    1.1
bloat      98.3    105.9   1.1     0.073   0.169   2.3     515.1   520.0   1.0
chart      19.7    21.4    1.1     0.084   0.232   2.8     367.4   479.4   1.3
eclipse    150.8   157.5   1.0     0.207   0.465   2.2     512.4   532.7   1.0
fop        1.4     1.8     1.3     0.064   0.15    2.3     107.6   102.2   0.9
hsqldb     9.3     16.4    1.8     0.331   1.21    3.7     420.5   432.6   1.0
jython     45.6    48.6    1.1     0.077   0.172   2.2     119.7   137.8   1.2
luindex    35.3    37.7    1.1     0.031   0.070   2.3     45.5    49.8    1.1
lusearch   7.8     8.1     1.0     0.038   0.088   2.3     105.0   129.5   1.2
pmd        22.9    23.6    1.0     0.055   0.127   2.3     115.3   157.5   1.4
xalan      32.3    39.5    1.2     0.066   0.145   2.2     214.2   209.0   1.0
GeoMean    -       -       1.1     -       -       2.3     -       -       1.1

Table 6.2: Time and space overheads incurred by our infrastructure for the FastAdaptiveImmix configuration. Shown in column (a) are the original overall execution times (T1) in seconds, the execution times for our modified JVM (T2), and the overheads (OT) as ratios (×). Column (b) reports the GC times and overheads: G1 and G2 show the average times for each GC run in the original RVM and our modified RVM, respectively; OG reports the GC overheads as G2/G1. Memory consumption and overheads are shown in column (c): S1 and S2 are the maximum amounts of memory used during the executions in the original RVM and our RVM, respectively; OS reports the space overheads as S2/S1.

Table 6.2 reports the time and space overheads that our tracking infrastructure incurs on normal executions (without any assertions or transactions written). The overhead measurements shown in this section are obtained with the (high-performance) FastAdaptiveImmix configuration, which uses the optimizing compiler and the state-of-the-art Immix garbage collection algorithm [16].

The LeakChaser infrastructure slows programs down by less than 10% on average using the optimizing compiler and an advanced garbage collector. Much of the overhead comes from extra operations during GC and from the barrier inserted at each object reference read to check the use of the object. Column (b) of Table 6.2 reports the detailed GC slowdown caused by our tool, which is 2.3× averaged across the 19 programs. The space overheads, reported in column (c) of Table 6.2, are less than 10%, primarily due to the extra word added to the header of each object. In some cases, the peak memory consumption for LeakChaser is even lower than that of the original run (i.e., OS is smaller than 1), presumably because GC is triggered at a different set of program points that happens to have a lower maximum reachable memory size.

The overhead of assertion checking and inference is reported in Table 6.3. The overall running time measurements are available only for Jbb, MySQL, and Mckoi, as the other three cases are all based on Eclipse IDE operations. Note that in a run with 4,346 transactions and 442,988 objects (shown in Table 6.1), our tool slows the program 6.6× and 5.5× overall for the two configurations. Similarly to Table 6.2, the overhead can be larger if we consider only GC time. For example, for Diff, a 2.8× slowdown can be seen for GC time for FastAdaptiveImmix. Note that these overheads have not prevented us from collecting data from any real-world application. In a production setting, run-time overhead can be effectively reduced by sampling (i.e., the current sampling rate is 100%). Future work could also define a tradeoff framework between the quality of the reported information and the frequency of assertion table

176 Bench (a) Overall time (b) GC time (c) Space T1 T2 OT G1 G2 OG S1 S2 OS Diff - - - 0.27 0.75 2.8 534.2 534.4 1.0 Jbb 50.7* 9.2* 5.5 0.008 0.056 7.0 313.7 315.3 1.0 Editor - - - 0.15 0.41 2.7 300.5 367.2 1.2 WTP - - - 0.22 0.58 2.6 52.7 52.9 1.0 MySQL 6.8 14.0 2.1 0.045 0.108 2.4 17.1 17.7 1.0 Mckoi 165.6 172.2 1.0 0.046 0.154 3.3 129.2 144.2 1.1 GeoMean - - 2.3 - - 3.2 - - 1.0

Table 6.3: Overheads for the cases that we have studied under FastAdaptiveImmix. * indicates that we measure throughput (#operations per second) instead of running time. For both MySQL and Mckoi, a fixed number of iterations (100) was run for this measurement.

scanning (similar to QVM [9]), and find an appropriate sampling rate that can enable

the reporting of sufficient information at acceptably low cost.
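To make the sampling idea above concrete, the following sketch (not part of LeakChaser; all names are hypothetical) shows one way a runtime could throttle an expensive per-object check to a configurable fraction of instrumentation points:

    import java.util.Random;

    // A minimal sketch of sampling-based check throttling, assuming a
    // hypothetical checkObject() that performs the expensive per-object work.
    public class SampledChecker {
        private final Random rng = new Random();
        private final double samplingRate;   // 1.0 corresponds to the 100% rate used in our experiments

        public SampledChecker(double samplingRate) {
            this.samplingRate = samplingRate;
        }

        // Called at each instrumentation point; the expensive check runs only
        // for a sampled fraction of invocations.
        public void maybeCheck(Object target) {
            if (rng.nextDouble() < samplingRate) {
                checkObject(target);
            }
        }

        private void checkObject(Object target) {
            // placeholder for the actual lifetime/assertion check
        }
    }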

6.6 Summary and Interpretation

This chapter presents another program-semantics-aware memory leak detector that allows programmers with little program knowledge to quickly find the root cause of a memory leak. The most significant advantages of this work over existing Java leak detection tools are that (1) the approach uses lightweight user annotations to improve the relevance of the generated reports, and can provide semantics-related diagnostic information (e.g., which object escapes which transaction), and (2) it is designed in a multi-tier way so that programmers at different levels of skill and code familiarity can use it to identify performance problems. Our experience shows that the three-tier technique is quite effective: the tool can help a programmer identify the root cause of a leak after an iterative process of specifying, inferring, and checking object lifetime properties. For the case studies we conducted, the tool provided more

relevant information than an existing memory leak detector, Sleigh. For example, using the tool, we quickly found the root causes of memory issues in the analyzed programs, including both known memory leaks and problems that had not been reported before. While LeakChaser incurs relatively large overhead (2.3× on average), we found it acceptable for debugging and performance tuning.

We envision that the assertion and transaction constructs proposed in this chapter may be incorporated into the Java language so that they can be used in a production

JVM to help detect memory problems. Similarly to functional specifications that are used widely in program testing and debugging, writing performance specifications can have significant advantages. For example, it could enable unit performance testing that can identify performance violations even before degradation is observed. In addition, performance specifications (similar to the ones described in this chapter) may help bridge the gap between performance analysis and the large body of work on model checking and verification, so that one may be able to prove (statically) that a program is “bloat-free” or “bloat-bounded” with respect to the performance specifications provided. One may even be able to use a compiler to automatically synthesize a bloat-free implementation that satisfies the given performance specifications. For future work, it is interesting to investigate what other performance specifications can be useful in exposing run-time performance problems.

CHAPTER 7: Statically Detecting Inefficiently-Used Containers to Avoid Bloat

Dynamic analysis has typically been the technique of choice in detecting per-

formance problems, as finding these problems requires a fair amount of run-time

information, such as method execution frequency and object allocation counts, which

cannot be obtained at compile time. However, during the studies of our dynamic anal-

ysis reports, we found a number of patterns of inefficiencies (i.e., bloat patterns) that

occur frequently in the execution of large-scale applications. These patterns often

point to problems inherent in the source code of a program, and thus, can be easily

recognized at compile time. While the execution of these patterns may or may not

exhibit noticeable performance degradation, it is certain that they are harmful and

may trigger significant performance issues when the application workload increases

or the layers of the application grow to be deep.

In the next two chapters, we present static analysis techniques targeting two specific bloat patterns, (1) inefficient use of containers and (2) loop-invariant data structures. These patterns are among the most-frequently occurring ones that we observed in our studies. Our static analyses are able to pinpoint instances of such patterns in the source code, even when the program is incomplete. Using these static analyses, developers can find and avoid performance problems early during their development. These analyses can also be incorporated into an optimizing compiler so

that the compiler may be able to either transform a program automatically to avoid

such patterns, or generate performance warnings during compilation.

Inefficient use of containers The inefficient use of containers is an im-

portant source of systemic bloat. Programming languages such as Java include a

collection framework which provides abstract data types for representing groups of

related data objects (e.g., lists, sets, and maps). Based on this collection framework,

one can easily construct application-specific container types such as trees and graphs.

Java programs make extensive use of containers, both through collection classes and

through user-defined container types. Programmers allocate containers in thousands

of code locations, using them in a variety of ways, such as storing data, implementing

unsupported language features such as returning multiple values, wrapping data in

APIs to provide general service for multiple clients, etc.

Containers are easy to misuse. In Chapter 5, we have seen memory leak problems caused by careless use of containers. Another problematic situation is when a container is used even though it is not strictly necessary for what the code needs to accomplish. Although most such containers are eventually garbage-collected and allocating them may not lead to OutOfMemory errors, they can have a conspicuous impact on performance. For example, as illustrated in Chapter 3, finding and specializing an inefficiently-used HashMap dramatically reduced the number of allocated objects in a commercial server application. Identifying containers that consume excessive resources (for what they accomplish) is an important step toward finding potential container-related optimization opportunities.

Motivation and problems The extensive use of containers in Java software makes it impossible to manually inspect the choice and the use of each container in

class CUP$LexParse$actions {
  RegExp makeNL(){
    Vector list = new Vector();
    list.addElement(new Interval('\n','\r'));
    list.addElement(new Interval('\u0085','\u0085'));
    list.addElement(new Interval('\u2028','\u2029'));
    RegExp1 c = new RegExp1(sym.CCLASS, list);
    ...
  }
}
(a)

class Httpd extends HttpConnection{
  Reply recvReply(Request request){
    if (request.getPath().equals("/admin/enable")){
      Hashtable attrs = cgi(request);
      String config = (String) attrs.get("config");
      String filter = (String) attrs.get("filter");
      ...
    } else if (request.getPath().equals("/admin/createConfig")){
      Hashtable attrs = cgi(request);
      String config = (String) attrs.get("config");
      ...
    }
    ...
  }
  Hashtable cgi(Request request){
    Hashtable attrs = new Hashtable(13);
    String query = request.getQueryString();
    String data = request.getData();
    if (query != null){
      StringTokenizer st = new StringTokenizer(decode(query), "&");
      while (st.hasMoreTokens()){
        String token = st.nextToken();
        String key = token.substring(0, token.indexOf('='));
        String value = token.substring(token.indexOf('=') + 1);
        attrs.put(key, value);
      }
    }
    ...
    return attrs;
  }
}
(b)

Figure 7.1: (a) An example of an underutilized container, extracted from program JFlex, a scanner generator for Java. Vector list is used only to pass the three Interval objects to the constructor of RegExp1; (b) An example of an overpopulated container, extracted from program Muffin, a WWW filtering system written in Java. Given a request, method cgi always decodes the entire request string into Hashtable attrs. However, this Hashtable is later subjected to at most two lookup operations.

the program. A dynamic bloat detector finds performance problems using a “from-

symptom-to-cause” approach: it starts by observing suspicious behaviors and then

attempts to locate the cause from the observed symptom. One fundamental problem

for such approaches is the selection of appropriate symptoms. It can be extremely dif-

ficult to define symptoms that can precisely capture the target bloat—in many cases,

the suspicious behaviors that the tool intends to capture are not unique characteris-

tics of the problematic program entities. Very often, they can also be observed from

entities that function appropriately, leading to false positives and imprecise reports.

The first research problem investigated in this chapter is whether it is possible to

alleviate this limitation of purely dynamic bloat analyses by using a static or a hybrid

technique. A static (or a hybrid static/dynamic) technique may be able to reduce false positives by exploiting certain program properties inherent in the source code.

The second major problem addressed in the chapter is related to understanding the semantics of containers. Of existing dynamic bloat detection techniques, tools from [158] and [122] were designed specifically for finding container-related problems.

Both tools require the user to provide annotations for each container type, which are subsequently used to understand the container semantics and relate container behavior to the profiling frameworks. However, it may be a heavy burden for the programmer to complete these annotations. In addition, when the interface of a container changes, its annotations have to be revised as well. Creating such annotations is also impractical when the container types come from large libraries and frameworks developed by others (e.g., containers of transaction data in enterprise applications). To reduce such an annotation burden and make the container analysis more general, this

chapter explores the possibility of extracting the container semantics automatically

from the container implementation, with minimal need for user interaction.

Targeted container inefficiencies We aim to detect containers that are

inefficiently used in two different ways.6

Underutilized container. A container is underutilized if it holds a very small num-

ber of elements during its lifetime. It is wasteful to use an underutilized container

to hold data. First, a container is usually created with a default number of slots

(e.g., 16), and a big portion of the memory space is wasted if only a few slots are

occupied. If the size of the container is fixed (e.g., 1), a specialized container (e.g., the immutable singleton set returned by Collections.singleton) can be employed to replace the original general type (e.g., HashSet). Second, the functionality associated with the container type may be much more general than what is actually needed. For example, the process of retrieving an element from a HashSet involves dozens of calls. If there is a small number of objects in the HashSet, it may be possible for a performance expert to replace this HashSet by introducing extra local variables, parameters, or an array.
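As a small illustration of this point (a sketch of ours, not taken from any benchmark), the fragment below contrasts a general HashSet that only ever holds one element with the specialized immutable set returned by java.util.Collections.singleton:

    import java.util.Collections;
    import java.util.HashSet;
    import java.util.Set;

    class SingletonExample {
        // General-purpose container: allocates a default-capacity hash table
        // even though it will only ever hold one element.
        Set<String> general() {
            Set<String> s = new HashSet<>();
            s.add("only-element");
            return s;
        }

        // Specialized container for the fixed-size-1 case: a single small
        // immutable object, no backing table.
        Set<String> specialized() {
            return Collections.singleton("only-element");
        }
    }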

Figure 7.1(a) illustrates an example of this type of problem, which was reported by our tool. The example is extracted from JFlex, a scanner generator from our benchmark set. Method makeNL creates a Vector object that by default allocates a

10-element array. The only purpose of this object is to pass the three Interval objects into the constructor of RegExp1. This Vector object is allocated more than 10,000 times during the execution of a large test case. In fact, there are many locations in the code where Vector objects are created solely for this purpose. Creating specialized

6We thank Gary Sevitsky for suggesting the terms “underutilized” and “overpopulated” to describe these patterns of inefficient use.

constructors of RegExp1 that allow the direct passing of Interval objects can avoid the allocation/deallocation of thousands of objects.
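A fix along these lines might look like the following sketch; the class shapes and the added constructor are hypothetical simplifications written only to illustrate how the temporary Vector can be bypassed, and do not reflect JFlex's actual code:

    // Simplified stand-ins for the real JFlex types.
    class Interval {
        final char start, end;
        Interval(char start, char end) { this.start = start; this.end = end; }
    }

    class RegExp1 {
        final int type;
        final Object content;

        RegExp1(int type, Object content) {            // original general constructor
            this.type = type;
            this.content = content;
        }

        RegExp1(int type, Interval... intervals) {     // hypothetical specialized overload
            this(type, (Object) intervals);            // no Vector allocation on the caller side
        }
    }

    class MakeNLExample {
        static final int CCLASS = 0;                   // stand-in for sym.CCLASS

        RegExp1 makeNL() {
            return new RegExp1(CCLASS,
                    new Interval('\n', '\r'),
                    new Interval('\u0085', '\u0085'),
                    new Interval('\u2028', '\u2029'));
        }
    }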

Overpopulated container. This problem occurs if there is a container that, while holding many objects, is looked up only a few times. Due to unnecessary data elements, memory is wasted and it takes longer to perform a container operation. In this case, a programmer may be able to inspect the code to find which objects will definitely not be retrieved from the container, and then find a way to avoid adding these objects or even creating them (if they are never used). Figure 7.1(b) shows an example of this problem, which was found by our analysis in Muffin, a WWW filtering system from our benchmark set. Many strings are generated and added into a Hashtable, but only the entries with keys "config" and "filter" are eventually requested. Instead of decoding and storing the entire request string, a specialized version of method cgi could declare an additional string parameter representing the requested key, and return the corresponding value immediately when the given key is found.
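One possible shape for such a specialization is sketched below; cgiLookup and the surrounding types are hypothetical simplifications of Muffin's code, shown only to illustrate the idea:

    import java.util.StringTokenizer;

    class CgiSketch {
        // Hypothetical specialized lookup: scan the query string for a single
        // key and return its value, instead of decoding every key/value pair
        // into a Hashtable that is probed at most twice.
        static String cgiLookup(String query, String wantedKey) {
            if (query == null) return null;
            StringTokenizer st = new StringTokenizer(decode(query), "&");
            while (st.hasMoreTokens()) {
                String token = st.nextToken();
                int eq = token.indexOf('=');
                if (eq < 0) continue;
                if (token.substring(0, eq).equals(wantedKey)) {
                    return token.substring(eq + 1);   // found: no table, no extra strings
                }
            }
            return null;
        }

        static String decode(String s) { return s; }  // stand-in for Muffin's decode()
    }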

Base static analysis: extracting container semantics There are three major technical challenges in automatically extracting semantics for different container types and implementations. The first challenge is to establish a unified model for different container types. For example, consider two concrete container classes in Java, Hashtable and LinkedList. The unified model has to capture the common behaviors that characterize their “container” property while ignoring the differences in their specific implementations and usage domains. To address this challenge, we propose to treat all container classes as an abstract data type with two basic operations: ADD and GET. Consider Hashtable and LinkedList again: despite their

many differences, we are interested only in the process by which objects are added

to (ADD) and retrieved from (GET ) these containers. Details of their implementa-

tions and usage (e.g., whether they store key-value pairs or individual objects) are

abstracted away.

The second challenge is to select program entities that correspond to these abstract

operations. Research from [122,158] focuses on methods. For example, methods put

and get implement the semantics of ADD and GET, respectively, for class Hashtable.

While identifying abstract operations at the method level is a straightforward idea, it is impossible to perform without user annotations because different container types use different methods for these operations. To automate this process, we propose to operate at the statement level. The key insight is that the core behavior of each operation can be implemented by a single statement. The statements that implement the ADD and GET operations are usually heap stores and loads, respectively. Such statements will be referred to as semantics-achieving statements. For example, for

class ArrayList, the statement achieving the functionality of ADD is a heap store

array[i] = o in method ArrayList.add, where array refers to the backing array of

the list and o is a formal parameter referring to the object7 to be added. Identifying

semantics-achieving statements bridges the gap between the low-level code analysis

and the high-level container semantics.
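The following heavily simplified, ArrayList-like class (our own sketch, not the real java.util.ArrayList source) marks the two statements that such an analysis would identify as semantics-achieving:

    // A heavily simplified, ArrayList-like container; only the two
    // semantics-achieving statements matter for the analysis.
    class SimpleList {
        private Object[] array = new Object[10];
        private int size = 0;

        void add(Object o) {
            if (size == array.length) grow();
            array[size++] = o;          // heap store: achieves ADD
        }

        Object get(int index) {
            return array[index];        // heap load: achieves GET
        }

        private void grow() {
            Object[] bigger = new Object[array.length * 2];
            System.arraycopy(array, 0, bigger, 0, size);
            array = bigger;
        }
    }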

The third challenge is to develop precise and efficient algorithms to discover con-

tainer structures. The identification of semantics-achieving statements for a particular

container object requires reasoning about the container data structure, in order to

detect the objects that are added to the container from the client code (i.e., element objects) as well as the helper objects that are created by the container (i.e., inside objects). There are usually multiple layers in the data structure of a container type, and

7For the rest of this chapter, we will use “object” to denote the static abstraction (i.e., allocation site) of a set of run-time objects created by the allocation site.

a naive approach based on points-to analysis may not be able to distinguish among

elements added to different container objects that have the same type. To obtain pre-

cise information about the use of a container, it is crucial to prune, context-sensitively,

nodes and edges irrelevant to the container in the points-to graph. While there exists

a body of precise reasoning techniques such as shape analysis (e.g., [29,33,121]) and

decision procedures for verification of pointer-based data structures (e.g., [81, 89]),

these analyses tend to be expensive and do not scale well to large applications.

Our analysis attempts to refine the object sub-graph rooted at each container

object by taking advantage of the CFL-reachability formulation of pointer aliasing.

The key observation is that if an object o can be reached from a container object c

through (direct or transitive) field dereferences, there must exist a chain of stores of

the form a0.f0 = o; a1.f1 = b0; a2.f2 = b1; . . . ; an.fn = bn−1; bn = c, such that the two reference variables in each pair (ai, bi) for 0 ≤ i ≤ n are aliases. Because aliasing relationships can be computed by solving CFL-reachability on a flow graph [133], the goal of our analysis is to find valid paths (in terms of both heap accesses and method calls) on the flow graph that contain such chains of stores. We consider all objects o that have such paths reaching the container object and that are not inside objects created by the container. Among those, element objects are the ones that have a chain of stores a0.f0 = o; a1.f1 = b0; . . . such that all ai and bi along the chain

point to inside objects of the container. We have successfully applied this demand-

driven analysis to large Java applications, including the eclipse framework (and its

plugins). The description of the analysis can be found in Section 7.1.

Static inference and dynamic profiling of execution frequencies If the ADD operations of a container are executed a very small number of times, the container may suffer from an underutilization problem. If the frequency of its GET operations is much smaller than the frequency of its ADD operations, the container may be overpopulated. The next step of the analysis is to compare the frequencies of these operations, using the semantics-achieving statements (annotated with the relevant calling contexts) identified by the base analysis.

A natural choice for comparing the frequencies of the semantics-achieving statements is to instrument these statements and to develop a dynamic analysis by profiling the observed frequencies. However, the usefulness of this approach may be limited because it does not directly point to the underlying cause of the problem. Furthermore, the generated results depend completely on the specific inputs and runs being observed: containers whose behaviors are suspicious in one particular run may behave appropriately in other runs, making it hard to identify problematic containers.

An alternative is to design a static analysis that detects performance problems by looking for certain source code properties that can approximate the relationship between execution frequencies, regardless of inputs and runs. There exist a number of analyses that can be employed to infer such a relationship. For example, semantics-achieving statements are often nested in loops. Various techniques such as interval analysis [140] and symbolic bound analysis [55,56] may be used to discover the loop bounds. However, such techniques are often ineffective in handling dynamic heap data structures, and it is difficult to scale them to large programs.

We take a much simpler approach where data flow does not need to be considered:

relative frequencies are inferred based on the nesting of the loops where the semantics-

achieving statements are located. Despite this simplicity, the inferred relationships

are execution independent and, in our experience, lead to low false positive rates

when used to find optimizable containers.
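For instance, in a hypothetical client fragment like the sketch below (written by us only to illustrate the nesting-based inference), the ADD operation is nested two loops deep while the single GET is not inside any loop; based on this nesting, the analysis would infer that the GET frequency is potentially much smaller than the ADD frequency and flag the list as potentially overpopulated:

    import java.util.ArrayList;
    import java.util.List;

    class NestingExample {
        String summarize(String[][] rows) {
            List<String> cells = new ArrayList<>();
            for (String[] row : rows) {
                for (String cell : row) {
                    cells.add(cell);                      // ADD nested two loops deep
                }
            }
            return cells.isEmpty() ? "" : cells.get(0);   // single GET outside any loop
        }
    }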

We have implemented both the dynamic frequency profiler (Section 7.2.1) and the static inference analysis (Section 7.2.2). Detailed comparison between them is provided in Section 7.2.2 and demonstrated experimentally in Section 7.3.

Features of the base analysis The base analysis needs to be sufficiently precise, as both the static inference algorithm and the dynamic frequency profiler rely on it to find problematic containers. Our algorithmic design is focused on three important features of the analysis. First, since we are interested only in containers, the algorithm is demand-driven, so that it can perform only the work necessary to answer queries about the usage of containers. Second, because a container type can be instantiated many times in the program, failure to distinguish elements added into different container objects of the same type could result in a large number of false positives. To avoid this, if the analysis cannot find a highly-precise solution under a client-defined time budget, it does not report any ADD and GET operations (instead of reporting them based on over-conservative approximations). In a practical tool that identifies potential bloat, the precise identification of inefficiently-used containers (i.e., reducing the false positive rate) is much more important than reporting all potentially problematic ones with many false warnings (i.e., reducing the false negative rate). This choice aims at higher programmer productivity and real-world usefulness. Finally, the analysis is client-driven, as the amount of information it produces can be controlled by the client-defined time budget.

Evaluation Section 7.3 presents experimental results showing that the static tool successfully finds inefficient uses of containers. It generates a total of 295 warnings for the 21 Java programs in our benchmark set. For each benchmark, we randomly picked 20 warnings for manual inspection. Among those, we found a very small number of false positives (e.g., 4 for the largest benchmark eclipse). Further experiments showed that (1) most of the statically reported containers indeed exhibit problematic behaviors at run time, and (2) the inefficient uses of these containers are much easier to understand than the uses of containers reported by the dynamic analysis.

The static inference approach is useful for detecting container problems during coding, before performance tuning has started. It is a good programming practice to

fix (static) performance warnings early, in order to avoid potential performance problems before they grow and become observable. It has already been recognized [94] that bloat can easily accumulate when insufficient attention is paid to performance during development. Once coding is complete and performance tuning starts, information about run-time frequency of container allocation can focus the programmer’s attention on statically-identified containers that are most likely to provide optimization payoffs.8

Using this approach, we studied the warnings for the DaCapo bloat and chart benchmarks, and easily identified fixes that reduced object creation rates by 30% for bloat and 5% for chart, leading to execution time reduction of 24.5% for bloat

8Allocation frequencies can even be collected before the static analysis, allowing the demand-driven static algorithm to focus on hot containers.

 1 class ContainerClient{
 2   static void main(String[] args){
 3     ContainerClient client = new ContainerClient();
 4     Container c = new Container();
 5     for(int i = 0; i < 1000; i++){
 6       client.addElement(c, new Integer(i));
 7     }
 8     client.foo(c);
 9     client.bar();
10   }
11   void foo(Container n){
12     Integer i = (Integer)n.get(10);
13     Container d = new Container();
14     addElement(d, new String("first"));
15     addElement(d, new String("second"));
16     String s = (String)d.get(0);
17   }
18   void bar(){
19     for(int j = 0; j < 5; j++){
20       Container a = new Container();
21       for(int i = 0; i < 10; i++)
22         addElement(a, new Double(i));
23       for(int i = 0; i < a.size(); i++){
24         Double b = (Double)a.get(i);
25         ...
26       }
27     }
28   }
29   void addElement(Container c, Object e){
30     c.add(e);
31   }
32 }
33
34 class Container{
35   Object[] arr;
36   int pos = 0;
37   Container(){
38     t = new Object[1000]; this.arr = t;
39   }
40   void add(Object e){
41     t = this.arr; t[pos++] = e;
42   }
43   Object get(int index){
44     t = this.arr; ret = t[index]; return ret;
45   }
46 }

Figure 7.2: Running example.

and 3.5% for chart. These promising initial findings suggest that our tools could be useful in practice to find and exploit opportunities for performance gains.

7.1 Formulation of Container Operations

This section starts with an outline of the CFL-reachability formulation of context-sensitive points-to/alias analysis for Java [133]. We formulate the base analysis for identification of semantics-achieving statements as a new CFL-reachability problem, and then present algorithms to solve this problem.

7.1.1 CFL-Reachability Formulation of Points-to Analysis

A variety of program analyses can be stated as CFL-reachability problems [115].

CFL-reachability is an extension of standard graph reachability that allows for filtering of uninteresting paths. Given a directed graph with labeled edges, a relation

R over graph nodes can be formulated as a CFL-reachability problem by defining a context-free grammar such that a pair of nodes (n, n′) ∈ R if and only if there exists a path from n to n′ for which the sequence of edge labels along the path is a word in the language L defined by the grammar. Such a path will be referred to as an L path. If there exists an L path from n to n′, then n′ is L-reachable from n (denoted by n L n′). For any non-terminal S in L’s grammar, S paths and n S n′ are defined similarly.

Existing work on points-to analysis for Java [133, 135] employs this formulation to model (1) context sensitivity via method entries and exits, and (2) heap accesses via object field reads and writes. A demand-driven analysis is formulated as a single-source L-reachability problem which determines all nodes n′ such that n L n′ for a given source node n. The analysis can be expressed by CFL-reachability for language

LF ∩RC. Language LF, where F stands for “flows-to”, ensures precise handling of field accesses. Regular language RC ensures a degree of calling context sensitivity. Both languages encode balanced-parentheses properties.

LF-reachability is performed on a graph representation G of a Java program (sometimes referred to as a flow graph), such that if a heap object represented by the abstract location o can flow to variable v during the execution of the program, there exists an LF path in G from o to v. The flow graph is constructed by creating edges for the following canonical statements: an edge o −new→ x is created for an allocation x = new O; an edge y −assign→ x is created for an assignment x = y; edges y −store(f)→ x and y −load(f)→ x are created for a field write x.f = y and a field read x = y.f, respectively.

Parameter passing is represented as assignments from actuals to formals; method return values are treated similarly. Writes and reads of array elements are handled by collapsing all elements into an artificial field arr elm.
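A minimal sketch of how the four canonical statement forms could be turned into labeled flow-graph edges is shown below; the Edge representation and method names are our own simplification, not the representation used by the tool:

    import java.util.ArrayList;
    import java.util.List;

    // A toy flow-graph builder for the four canonical statement forms.
    class FlowGraphSketch {
        static class Edge {
            final String from, to, label;
            Edge(String from, String to, String label) {
                this.from = from; this.to = to; this.label = label;
            }
        }

        final List<Edge> edges = new ArrayList<Edge>();

        void allocation(String objSite, String x) {          // x = new O
            edges.add(new Edge(objSite, x, "new"));
        }
        void assignment(String y, String x) {                // x = y
            edges.add(new Edge(y, x, "assign"));
        }
        void fieldWrite(String y, String x, String f) {      // x.f = y
            edges.add(new Edge(y, x, "store(" + f + ")"));
        }
        void fieldRead(String y, String x, String f) {       // x = y.f
            edges.add(new Edge(y, x, "load(" + f + ")"));
        }
    }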

Language LF Consider a simplified flow graph G with only new and assign edges.

In this case the language is regular and its grammar can be written simply as

flowsTo → new ( assign )∗, which shows the transitive flow due to assign edges.

Clearly, o flowsTo v in G means that o belongs to the points-to set of v.

For field accesses, inverse edges are introduced to allow a CFL-reachability formulation. For each graph edge x → y labeled with t, an edge y → x labeled with t¯ is introduced. For any path p, an inverse path p¯ can be constructed by reversing the order of edges in p and replacing each edge with its inverse. In the grammar this is captured by a new non-terminal flowsTo¯, used to represent the inverse paths for flowsTo paths. For example, if there exists a flowsTo path from object o to variable v, there also exists a flowsTo¯ path from v to o.

May-alias relationships can be modeled by defining a non-terminal alias such that

alias → flowsTo¯ flowsTo. Two variables a and b may alias if there exists an object

o such that o can flow to both a and b. The field-sensitive points-to relationships

can be modeled by flowsTo → new ( assign | store(f) alias load(f))∗. This production

checks for balanced pairs of store(f) and load(f) operations, taking into account the

potential aliasing between the variables through which the store and the load occur.


Figure 7.3: Flow graph for the running example.

Language RC The context sensitivity of the analysis ensures that method

entries and exits are balanced parentheses: C → entry(i) C exit(i) | C C | ǫ. Here entry(i) and exit(i) correspond to the i-th call site in the program. This production

describes only a subset of the language, where all parentheses are fully balanced.

Since a realizable path does not need to start and end in the same method, the full

definition of RC also allows a prefix with unbalanced closed parentheses and a suffix

with unbalanced open parentheses [133]. In the absence of recursion, the balanced-

parentheses language is a finite regular language (thus the notation RC instead of

LC); approximations are introduced as necessary to handle recursive calls. Context

sensitivity is achieved by considering entries and exits along a LF path and ensuring

that the resulting string is in RC. For the purposes of this context-sensitivity check,

an entry¯ edge is treated as an exit edge and vice versa.

7.1.2 Example

The code in Figure 7.2 shows an example used for illustration of the static analysis throughout the chapter. The example is based on a common usage scenario of Java containers. A simple implementation of a data structure Container is instantiated three times (at lines 4, 13 and 20) by a client ContainerClient. In the example code, statement t[pos++] = e (at line 41) is the one that achieves the functionality of

ADD, and statement ret = t[index] (at line 44) is the one achieving the functionality

of GET. Of course, these statements do not make much sense by themselves. For

each semantics-achieving statement, we also need to identify the calling contexts that

are relevant to the container of interest. The (statement,contexts) pair is used later

to find underutilized and overpopulated containers.

Through this example, we show how the CFL-reachability formulation of points-to

analysis works. We will use ti to denote the variable t whose first occurrence is at line

i, and oi to denote the abstract object for the allocation site at line i. For example,

e40 and o4 represent variable e declared at line 40 and the Container object created

at line 4. As another example, o14 represents the String object created at line 14.

Node tmpi denotes a temporary variable created artificially to connect an object and

an actual parameter. For example, tmp6 is used to link the Integer object created

at line 6 and the actual parameter of the call to addElement.

The program representation for the example is shown in Figure 7.3; for simplicity,

the inverse edges are not shown. Each entry and exit edge is also treated as an

assign edge for language LF to represent parameter passing and method returns. The

analysis can conclude that the points-to set of i12 has a single object o6, because there

exists a flowsTo path between them. To see this, first note that this41 alias this38

because object o4 can flow to both this41 and this38. Similarly, this44 alias this38 and this41 alias this44 can be derived. Second, t41 and t44 are aliases because object o38 can flow to both t41 and t44. For example, o38 flowsTo t41 can be derived as follows:

o38 −new→ t38 −store(arr)→ this38 alias this41 −load(arr)→ t41  ⇒  o38 flowsTo t41

Finally, o6 flowsTo i12 can be derived as follows:

o6 −new→ tmp6 −store(arr elm)→ t41 alias t44 −load(arr elm)→ i12  ⇒  o6 flowsTo i12

In addition, this flowsTo path is a realizable interprocedural path because it contains

matched pairs of entry and exit edges. The chain of entry and exit edges along the path is entry6 → entry30 → entry30¯ → entry6¯ → entry4 → entry4¯ → entry8 → entry12 → exit12, which does not have any unmatched pair of method entry and return. (Recall that an entry¯ edge is treated as an exit edge.)

7.1.3 Formulation of Container Operations

This section presents our formulation of container operations based on the CFL-

reachability formulation of points-to analysis. We first discuss a formalization of

container and container operations that we are interested in. This is very similar to

the container definition in Chapter 5.

Definition 7.1.1 (Container). A container type Γ is an abstract data type with a set Σ of element objects, and two basic operations ADD and GET that operate on Σ. A container object γn is an instantiation of Γ with n elements forming an element set Σγ . ADD maps a pair of a container object and an element object to a container object. GET maps a container object to one of its elements. The effects of the operations are as follows:

• ADD(γⁿpre, o) : γᵐpost  ≡  o ∉ Σγpre ∧ o ∈ Σγpost ∧ m = n + 1 ∧ ∀p : p ∈ Σγpre → p ∈ Σγpost

• GET(γⁿ) : o  ≡  o ∈ Σγ

Following the common Java practice, the definition does not allow a container to have primitive-typed elements.9 Furthermore, for the purposes of our tool, operations that reduce the size of a container are ignored. Despite their simplicity, these two abstract container operations capture many common usage scenarios for Java containers.
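Viewed as code, the abstract data type of Definition 7.1.1 corresponds roughly to the following interface; this is only an illustrative sketch of ours, and the analysis does not require programs to implement any particular interface:

    // The abstract container of Definition 7.1.1: only ADD and GET are
    // modeled; size-reducing operations are intentionally ignored.
    interface AbstractContainer<E> {
        void add(E element);   // ADD: grows the element set by one object
        E get(int index);      // GET: returns an element already in the set
    }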

To identify semantics-achieving statements for a container, we first introduce relation reachFrom (short for “reachable from”). For each abstract object o, set

{o′ | o′ reachFrom o} consists of the object itself and other abstract objects whose

run-time instances could potentially be reached from an instance of o through one

or more level(s) of field dereference(s). The semantic domains that will be used are

defined in a standard way as follows:

Obj♮: Domain of abstract objects, as represented by object allocation sites

C♮ ⊂ Obj♮: Domain of container objects

V♮: Domain of variable identifiers

F♮: Domain of instance field identifiers

Call♮: Domain of method entry and exit edges

9For brevity, we may use “container” instead of “container object”.

Definition 7.1.2 (Relation reachFrom). reachFrom ⊆ Obj♮ × Obj♮ is defined by

the following production

reachFrom → flowsTo store(f) flowsTo¯ reachFrom | ǫ

In addition, the string consisting of entry and exit edge labels on a reachFrom path

has to be accepted by language RC.

For example, there exists a reachFrom path from o6 to o4. To see this, first

note that o38 reachFrom o4 holds, because of o38 flowsTo t38 −store(arr)→ this38 flowsTo¯ o4. Second, there exists a reachFrom path from o6 to o38, because of o6 flowsTo e40 −store(arr elm)→ t41 flowsTo¯ o38. Finally, o6 reachFrom o4 due to the transitive property of the relation. In addition, this entire reachFrom path does not contain any unmatched pair of method entry and exit.

Note that an object subgraph reachable from a container (i.e., containing only nodes relevant to the container) can be computed by searching for the reachFrom paths ending at the container object. Nodes irrelevant to the container can be filtered out by the context-sensitivity check of language RC. Finding reachFrom paths is a single-target CFL-reachability problem with O(n³k³) complexity, where n is the number of nodes in the flow graph and k is the size of language LF. However, checking context sensitivity is exponential, as the size of language RC is exponential in the size of the program (due to the exponential number of call chains). To ensure both high precision and scalability, a time constraint (discussed later) is imposed on the analysis to inspect each container in the program. If no valid reachFrom paths are found within the given time budget, the analysis gives up on the container and moves on to check the next one.

Based on reachFrom, we can distinguish element objects and inside objects from

the set of all objects reachable from the container object.

Definition 7.1.3 (Element object and inside object). An object i ∈ Obj♮ is

an inside object with respect to a container object c (where i is different from c)

if (1) (i, c) ∈ reachFrom, and (2) i is created in the container class, its (direct or

transitive) superclass, or any other class specified by the user.

An object e ∈ Obj♮ is an element object with respect to c if (1) (e, c) ∈ reachFrom,

(2) e is neither c, nor an inside object of c, and (3) for some reachFrom path from

e to c, all object nodes along the path (except for e and c) are inside objects with

respect to c.

In our example, o4, o13 and o20 are container objects. Object o6 is an element object for o4; o14 and o15 are element objects for o13; and o22 is an element object for o20. Object o38 is an inside object for all three containers. In the rest of the discussion we will use Ic and Ec to denote the domains of inside objects and element objects with

respect to a container c ∈ C♮.

The definition of inside objects provides the flexibility to use a programmer-defined

list that separates the client classes from the classes that are involved in the imple-

mentation of the container functionality. At present, the tool does not require such

a list, as it targets only containers from the Java collections framework. In this case,

it is sufficient to distinguish a client object from a Java collection internal object by

checking if the object is created within the java.util package. (Note that the tool

does not check the efficient use of containers within the JDK library code.) However,

the class list will be useful if the tool is extended to inspect user-defined contain-

ers, because such containers may use an object created in a non-container class as

an internal object. Such a (non-container) utility class should be explicitly listed as

such.
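One possible way to encode this classification is sketched below; the AllocSite type and the user-supplied class list are simplifications introduced here for illustration only:

    import java.util.Set;

    // Classifying an allocation site as container-internal, either because it
    // is allocated inside java.util or because its declaring class is on a
    // user-supplied list of (non-container) utility classes.
    class InsideObjectFilter {
        static class AllocSite {
            final String declaringClass;   // fully qualified name of the class containing the allocation
            AllocSite(String declaringClass) { this.declaringClass = declaringClass; }
        }

        private final Set<String> userListedUtilityClasses;

        InsideObjectFilter(Set<String> userListedUtilityClasses) {
            this.userListedUtilityClasses = userListedUtilityClasses;
        }

        boolean isInsideAllocation(AllocSite site) {
            return site.declaringClass.startsWith("java.util.")
                    || userListedUtilityClasses.contains(site.declaringClass);
        }
    }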

The semantics-achieving statements are the loads/stores that read/write element

objects from/to inside objects of a container.

Definition 7.1.4 (Semantics-achieving statements). A statement that achieves

the functionality of ADD with respect to a container object c is a store of the form a.f

= b (where a, b ∈ V♮, f ∈ F♮), such that there exists an addTo path from an object ob that b points to, to the container object c. This addTo path has the following components: (1) a flowsTo path between ob and b, (2) an edge b −store(f)→ a representing the store, (3) a flowsTo¯ path between a and an object oa, and (4) a reachFrom path from oa to c. Using ⊕ to denote path concatenation, the path is

addTo(ob, c) ≜ flowsTo(ob, b) ⊕ b −store(f)→ a ⊕ flowsTo¯(a, oa) ⊕ reachFrom(oa, c)

where oa, ob ∈ Obj♮, oa ∈ Ic, and ob ∈ Ec.

A statement that achieves the functionality of GET with respect to a container

object c is a load of the form b = a.f (where a, b ∈ V♮, f ∈ F♮), such that there exists

a getFrom path from an object ob that b points to, to the container object c where

getFrom(ob, c) ≜ flowsTo(ob, b) ⊕ b −load(f)¯→ a ⊕ flowsTo¯(a, oa) ⊕ reachFrom(oa, c)

where oa, ob ∈ Obj♮, oa ∈ Ic, and ob ∈ Ec.

In addition, the string consisting of entry and exit labels on an addTo or a getFrom

path has to be accepted by language RC.

The goal of the analysis is to identify semantics-achieving statements by finding all addTo and getFrom paths for each container. An addTo path is a reachFrom

path from ob to c, which models the process of element object ob being added to the container. Hence, the computation of addTo paths can be performed along with the computation of reachFrom paths. However, the computation of a getFrom path requires a priori knowledge of reachFrom paths and cannot be performed until all reachFrom paths, element objects, and inside objects are identified. In the running example, there exists an addTo path from o6 to o4, and the semantics-achieving statement on this path is the store t[pos++] = e. There also exists a getFrom path from o6 to o4, because of o6 flowsTo ret and t44 flowsTo¯ o38 reachFrom o4. The

semantics-achieving statement on this path is the load ret = t[index] at line 44.

The identification of semantics-achieving statements is not sufficient to understand the usage of a particular container object, as different container objects can have the same semantics-achieving statements. For example, all of the addTo (or getFrom) paths between o6 and o4, o14 and o13, and o22 and o20 have t[pos++] = e at line 41 (or ret = t[index] at line 44) as their semantics-achieving statement. These statements are executed from multiple calling contexts and it is crucial to identify the contexts that correspond to the container object to be inspected.

Definition 7.1.5 (Relevant context). For each addTo or getFrom path p that ends at container object c, let r be the prefix of p that appears before the semantics-achieving statement on p. In other words, r is the flowsTo path before the corresponding store/load statement that achieves the ADD/GET functionality. Let l ∈ (Call♮)⋆ be the chain of entry and exit edges (some of which may be inverted) along r. The relevant context for the semantics-achieving statement on p is a sub-chain of l that contains only the unbalanced entry and exit edges.

The chain of entry and exit edges before the semantics-achieving statement on p

represents the method invocations that cause the element object ob to flow to variable b. This chain models the process of the element object being added to/retrieved from the container. The remaining entry and exit edges on p are irrelevant for this adding/retrieving process, because they represent calls that cause the inside objects

(rather than the element objects) to flow into the container. We do not need to consider balanced entry/exit edges, as they represent completed invocations along the data flow. An example will be given shortly to illustrate this modeling. Note that there could be multiple relevant contexts for a semantics-achieving statement, because a container can have multiple element objects and each element objects can be added to (and retrieved from) the container through multiple calls.

The chain of unbalanced entry and exit edges, together with the semantics-achieving

statement, can be used to represent a specific (ADD or GET ) operation executed on

a specific container object.

For example, the chain of entry and exit edges on the addTo path from o6 to o4

before semantics-achieving statement t[pos++] = e is entry6 → entry30, which is the

relevant context for the store operation with respect to container o4. As another

example, the chain on the getFrom path from o6 to o4 before the load operation ret

= t[index] is entry6 → entry30 → entry30¯ → entry6¯ → entry4 → entry4¯ → entry8 → entry12. Hence, the relevant context for this getFrom path is entry8 → entry12, which captures the fact that element object o6 is retrieved from container o4. Later we will consider the relationship between the execution frequencies of statements t[pos++]

= e and ret = t[index] only under their respective relevant contexts entry6 → entry30 and entry8 → entry12.

Algorithm 2: Solving single-source addTo-reachability.
Input: Flow graph, container c, context-insensitive points-to solution pts
Output: Map solution: pairs (heap store achieving ADD, relevant contexts)
 1 Map⟨Statement, Set⟨Stack⟩⟩ solution ← ∅
 2 Set⟨AllocNode⟩ reachFrom ← {c}                       // reachable objects
 3 Set⟨AllocNode⟩ elemObj ← ∅                           // element objects
 4 List⟨AllocNode⟩ objectList ← {c}                     // worklist
 5 List⟨Set⟨Stack⟩⟩ contextSetList ← {{EMPTY STACK}}
 6 while objectList ≠ ∅ do
 7   remove an allocation node o from the head of objectList
 8   remove a set contexts of context stacks from the head of contextSetList
 9   Set⟨Stack⟩ baseContexts ← ∅
10   foreach store a.f = b such that o ∈ pts(a) do
11     foreach context stack s ∈ contexts do
12       baseContexts ← baseContexts ∪ ComputeFlowsTo(o, a, s)
13     Set⟨Stack⟩ rhsContexts ← ∅
14     foreach allocation node ob ∈ pts(b) do
15       foreach context stack s ∈ baseContexts do
16         rhsContexts ← rhsContexts ∪ ComputeFlowsTo(b, ob, s)
17       if rhsContexts ≠ ∅ then
18         reachFrom ← reachFrom ∪ {ob}
19         if (o = c OR o is an inside object) AND (ob is NOT an inside object) then
20           elemObj ← elemObj ∪ {ob}                   // an element object is found
21           solution ← solution ∪ (a.f = b, rhsContexts)
22         else
23           objectList ← append(objectList, ob)
24           contextSetList ← append(contextSetList, rhsContexts)
25 return solution

7.1.4 Analysis Algorithms

The algorithms for solving addTo- and getFrom-reachability are shown in Algo- rithm 2 and Algorithm 3, respectively.

Both algorithms rely on an initial context-insensitive points-to set to find can- didates for semantics-achieving statements. Algorithm 2 iteratively computes a set of reachable objects. The i-th element of list contextSetList keeps a set of relevant contexts for the i-th object in worklist objectList. Each context is represented by

a stack, which contains exactly the chain of unbalanced entry and exit edges of a

flowsTo path. Initially, objectList contains the container object c and contextSetList

Algorithm 3: Solving single-source getFrom-reachability.
Input: Flow graph, container c, context-insensitive points-to solution pts, relation reachFrom, set elemObj
Output: Map solution: pairs (heap load achieving GET, relevant contexts)
 1 Map⟨Statement, Set⟨Stack⟩⟩ solution ← ∅
 2 foreach allocation node o ∈ elemObj do
 3   foreach load b = a.f, such that o ∈ pts(b) do
 4     Set⟨Stack⟩ lhsContexts ← ∅
 5     lhsContexts ← ComputeFlowsTo(o, b, EMPTY STACK)
 6     Set⟨AllocNode⟩ ins ← pts(a) ∩ reachFrom
 7     Set⟨Stack⟩ baseContexts ← ∅
 8     foreach AllocNode oa ∈ ins do                     // a candidate load
 9       foreach context stack s ∈ lhsContexts do
10         baseContexts ← baseContexts ∪ ComputeFlowsTo(a, oa, s)
11     if baseContexts ≠ ∅ then
12       solution ← solution ∪ (b = a.f, lhsContexts)
13 return solution

contains an empty stack. Map solution contains pairs of semantics-achieving state-

ment and relevant contexts, which will be returned after the function finishes. Func-

tion ComputeFlowsTo (o, a, s) at line 12 attempts to find flowsTo paths from an

object o to a variable a, under calling context s that leads to the method creating o.

Due to space limitations this function is not shown; conceptually, it is similar to the

FindPointsTo algorithm described in [133]. The function returns a set of contexts

(i.e., stacks) that are chains of unbalanced edges on the identified flowsTo paths from o to a. An empty set returned means that there does not exist any valid flowsTo path between them.

Similarly, function ComputeFlowsTo (b, ob, s) at line 16 attempts to find a

flowsTo path from variable b to object ob, under calling context s that leads to the method declaring b. The purpose of this function is to connect the chain of entry and exit edges on the flowsTo path from o to a with the chain of entry and exit edges on the flowsTo path from b to ob, and to check if the combined chain corresponds to a

realizable call path. If the combined chain is a realizable path (i.e., rhsContexts ≠ ∅

at line 17), ob is added to set reachFrom of reachable objects. Furthermore, if ob is found to be an element object (line 20 and line 21), it is included in set elemObj,

which will be used later by Algorithm 3. At this time, it is clear that store a.f =

b is a semantics-achieving statement, and its relevant contexts are contained in set

rhsContexts. If ob is not an element object, we append ob to the worklist (and also

append rhsContexts to contextSetList) for further processing. Some subsequent iter-

ation of the while loop will use this context set to compute flowsTo path from object

od to variable d for a new store c.f = d (line 12), etc. In this case, remembering

and eventually using rhsContexts ensures that no unrealizable paths can be produced

during the discovery of the container’s data structure.

Note that we omit a check for recursive data structures in the algorithm. In fact,

an object is not added into the worklist, if it has been visited earlier during the

processing of reachable objects. In other words, the back reference edges between

inside objects in the object graph are ignored, because they have nothing to do with

the element objects. It is also possible for an element object to have a back reference

edge going to an inside object or the container object (although this is not likely to

happen in practice). This back edge is also ignored, because we are interested in the

process where the element object is added to the container, rather than in the shape

of the data structure of the container object.

Algorithm 3 inspects each element object o computed in Algorithm 2 (line 2) and

attempts to find load statements of the form b = a.f such that b could point to o.

As before, the algorithm starts from a context-insensitive solution, and then checks

if there exists a flowsTo path from o to b (line 5). If such flowsTo paths are found, it uses the set of contexts returned (i.e., chains of unbalanced edges extracted from

these paths) to compute flowsTo paths from a to object oa that a may potentially point to. Note that we use an intersection between pts(a) and reachFrom (line 6) to filter out irrelevant objects that are not reachable from container c. If a flowsTo path

can be found (line 11-line 12), this load is a semantics-achieving statement and the

relevant contexts for it are contained in the set lhsContexts.

Precision improvement Not all load statements identified in getFrom paths correspond to GET operations. For example, methods equals and remove in many container classes need to load element objects for comparison (rather than returning them to the client). To avoid the imprecise results generated in these situations, we employ a heuristic when selecting statements that implement GET operations. A

load b = a.f is selected if (1) it is on a valid getFrom path, and (2) the points-to

set of b and the points-to set of the return variable of the method where the load is

located have a non-empty intersection. This heuristic is based on the common usage

of Java containers: only methods that can return element objects can be used to

retrieve objects by a client.

It would be interesting to investigate other heuristics in future refinements of the

analysis. In situations where individual statements are not precise enough to capture

the semantics of ADD and GET, it may be possible to use such heuristics to find

coarser-grained program entities (e.g., methods) that correspond to these abstract

operations. For example, a method may perform both stores and loads for a single

GET operation. At present, it is unclear how to generalize the analysis to model such

cases.

7.2 Execution Frequency Comparison

This section describes the dynamic profiling algorithm and the static inference algorithm to compare the execution frequencies of semantics-achieving statements.

7.2.1 Dynamic Frequency Profiling

During the execution, the profiling framework needs to record, for each container object, its ADD and GET frequencies. A key challenge is how to instrument the program in a way so that the frequencies of semantics-achieving statements can be associated with their corresponding container objects. In many cases, the container object is not visible in the method containing its semantics-achieving statements. For example, the store that implements ADD for HashMap is located in the constructor of class HashMap.Entry where the root HashMap object cannot be referenced. In this and similar cases, it is unclear where the instrumentation code should be placed to access the container object.

We use relevant contexts to determine the instrumentation points. Given a pair

(s, e0, e1, ..., en) of a semantics-achieving statement s and its context, we check whether the receiver of each call site ei can be the container object c. This check is performed in a bottom-up manner (i.e., from n down to 0). The instrumentation code is inserted before the first call site ei : a.f() found during the check such that the points-to set of a includes c. For example, one instrumentation site for HashMap is placed before the call to addEntry in method put, because the receiver variable of the call site can point to the HashMap object:

class HashMap{
  ...
  void put(K key, V value){
    ...
    // increment the ADD frequency for "this" container
    recordADD(this);
    this.addEntry(..., key, value, ...);
  }
  void addEntry(..., K key, V value, ...){
    ...
    table[...] = new Map.Entry(..., key, value, ...);
  }
}
class Entry{
  Entry(..., K key, V value, ...){
    // these are stores achieving ADD
    this.key = key;
    this.value = value;
  }
}
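The bookkeeping behind recordADD (and a matching recordGET) could be as simple as the following sketch, which keeps per-container counters keyed by object identity; this is our own illustration, not the profiler's actual implementation:

    import java.util.IdentityHashMap;
    import java.util.Map;

    // Per-container ADD/GET counters, keyed by the identity of the container object.
    class FrequencyRecorder {
        static class Counts { long adds; long gets; }

        private static final Map<Object, Counts> COUNTS = new IdentityHashMap<Object, Counts>();

        static synchronized void recordADD(Object container) {
            counts(container).adds++;
        }

        static synchronized void recordGET(Object container) {
            counts(container).gets++;
        }

        private static Counts counts(Object container) {
            Counts c = COUNTS.get(container);
            if (c == null) { c = new Counts(); COUNTS.put(container, c); }
            return c;
        }
    }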

7.2.2 Static Inference of Potentially-Smaller Relationships

This subsection describes the static inference algorithm that detects inefficiencies

by inferring potentially-smaller/larger relationships for the execution frequencies of

two (semantics-achieving statement, contexts) pairs. These relationships are com-

puted by traversing an interprocedural inequality graph, which models the interpro-

cedural nesting among the loops containing the semantics-achieving statements.

Definition 7.2.1 (Inequality graph). An inequality graph IG = (N , E) has node

set N ⊆ L ∪ M, where L is the domain of loop head nodes, and M is the domain of method entry nodes. The edge set is E ⊆ C ∪ I, where C represents call edges and I represents inequality edges.
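The sketch below shows one plain encoding of these node and edge kinds, written only to make the operations of Algorithm 4 concrete; the class names and representation are ours:

    import java.util.ArrayList;
    import java.util.List;

    // A minimal encoding of the inequality graph: loop-head and method-entry
    // nodes, connected by inequality ("<=") edges and call edges.
    class InequalityGraphSketch {
        static abstract class Node { final List<Edge> out = new ArrayList<Edge>(); }
        static class MethodEntryNode extends Node {
            final String method;
            MethodEntryNode(String method) { this.method = method; }
        }
        static class LoopHeadNode extends Node {
            final int line;
            LoopHeadNode(int line) { this.line = line; }
        }

        static abstract class Edge { final Node target; Edge(Node target) { this.target = target; } }
        static class InequalityEdge extends Edge { InequalityEdge(Node t) { super(t); } }   // "potentially smaller"
        static class CallEdge extends Edge {
            final int callSite;
            CallEdge(Node t, int callSite) { super(t); this.callSite = callSite; }
        }

        static void addInequalityEdge(Node from, Node to) { from.out.add(new InequalityEdge(to)); }
        static void addCallEdge(Node from, Node to, int callSite) { from.out.add(new CallEdge(to, callSite)); }
    }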

Inequality Graph Construction

For two statements s1 and s2 that are located in loops l1 and l2 respectively, we say that the execution frequency of s1 is potentially-smaller than the execution

frequency of s2 if l2 is nested within l1. Such a relationship does not necessarily

reflect the real run-time execution frequencies (thus the use of “potentially”). For

example, if s2 is guarded by a non-loop predicate inside l2, it is possible that it is

executed less frequently than s1 because the path containing s2 can be skipped many

times inside the body of l2. One example of this situation comes from a common

container implementation scenario. When the client attempts to add an object into

the container, the implementation first checks if the object is already in the container,

and stores it only if the container does not have it already. In this situation, the

semantics-achieving statement is under the non-loop predicate that checks whether

the element object is already in the container.

Despite this potential imprecision, this modeling of execution frequency, to a large degree, captures the high-level programmer’s intent. For example, in many cases the programmer just wants to add objects using the nested loops without even caring about whether they have been added before. Even though the statically-inferred potentially-smaller relationship may not hold for some particular runs of a program, the problems found using this relationship may reflect inefficient uses of containers in general. In addition, the loop nesting relationship itself may clearly suggest a fix if a problem really exists. For example, the problem may be solved simply by moving some operations out of a loop. We have found this approach based on loop nesting to work well in practice.

The algorithm for constructing the inequality graph is shown in Algorithm 4. For each method m in the call graph, intraprocedural inequality edges are first added

(line 4-line 11). For each loop head, we find the loop in which it is nested (line 6). If it is not nested in any loop (line 7-line 8), we create an inequality edge between the entry node of the method and the loop head node. Otherwise, the edge is created between the head of the surrounding loop and the node (line 9-line 11).

For each caller of m, a call edge is added to connect the two methods (line 14-25).

Specifically, the loop where the call site for m is located (or the entry node of the caller) is found (line 17), and a call edge is created to link the head of the loop (or

Algorithm 4: Algorithm for constructing the inequality graph.
Input: Call graph CG
Output: Inequality graph IG
 1 foreach method m in the call graph do
 2   EntryNode entry ← GetEntryNode(m)
 3   CFG cfg ← BuildCFG(m)
 4   foreach loop l ∈ cfg do                             // add inequality edges
 5     LoopHeadNode head ← GetLoopHeadNode(l)
 6     Loop l′ ← FindSurroundingLoop(head, cfg)
 7     if l′ = null then
 8       CreateInequalityEdge(entry −≤→ head)
 9     else
10       LoopHeadNode head′ ← GetLoopHeadNode(l′)
11       CreateInequalityEdge(head′ −≤→ head)
12   foreach incoming call graph edge e do               // add call edges
13     Method caller ← Source(e)
14     CFG cfg ← GetCFG(caller)
15     Loop l = FindSurroundingLoop(e.callsite, cfg)
16     if l = null then
17       EntryNode entry′ ← GetEntryNode(caller)
18       CreateCallEdge(entry′ −call(e)→ entry)
19     else
20       LoopHeadNode head ← GetLoopHeadNode(l)
21       CreateCallEdge(head −call(e)→ entry)

the entry node of the caller) and m’s entry node. Call edges are useful in filtering out irrelevant calling contexts during the traversal of the inequality graph. Figure 7.4 shows an inequality graph for the running example. Here we use ei to denote the

entry node for the method declared at line i, and li to denote the loop head node

located at line i. Each call edge is annotated with calli, which represents the call

site at line i. Each inequality edge is annotated with ≤i, where i is a globally-named

index. Inverse edges are allowed for call edges: if there is a call edge calle between nodes m and n, an inverse call edge exists between n and m. Unlike in the flow graph, there are no exit edges in the inequality graph because it is not necessary to model any data

flow.

An inequality edge is used to represent only potentially-smaller relationships. The

potentially-larger relationships could potentially be represented by inverse edges; how-

ever, we do not allow the use of such inverse edges because a path in the graph must

represent only one of these two relationships (i.e., either smaller or larger, but not

both).

Definition 7.2.2 (Valid potentially-smaller path). Given two inequality graph

nodes m and n, a path p from m to n is a valid potentially-smaller path if the chain of

call edges (including inverse edges) on p forms a realizable interprocedural path (i.e.,

the sequence of edge labels on the chain forms a string in language RC). Path p is a

strictly-smaller path if (1) p is a valid potentially-smaller path and (2) p contains at

least one inequality edge.

One can easily define a grammar with starting non-terminal potentially-smaller to

capture the above definition of validity. Similarly to the flowsTo computation, finding

a valid potentially-smaller path in the inequality graph requires a context-sensitivity

check of call and inverse call edges.

Inefficiency Detection as Source-Sink Problems

Detecting underutilized containers To find an underutilized container, we need to compare the execution frequencies of the container allocation site c and each store s that implements the functionality of ADD under relevant context r with respect to c. In the following definition, lh(s) denotes the loop head (if s is within a

loop) or the entry node of the method (if s is not in a loop) for statement s.


Figure 7.4: Inequality graph for the running example.

Definition 7.2.3 (Underutilized container). Given a container allocation site c

and a set of statement-contexts pairs that implement ADD operations for c, an under-

utilized container problem occurs for c if there does not exist a pair (store, contexts)

for which (1) a strictly-smaller path p exists from lh(c) to lh(store), and (2) there exists a context t ∈ contexts such that p ends with the chain of call edges represented

by t.

Informally, an underutilized container problem is reported if there does not exist

an ADD operation such that the loop where it is located is nested within the loop

where the container allocation site is located.
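To illustrate the definition with code that is not part of the running example, consider the following hypothetical methods. In process(), the list is allocated in the outer loop but each instance receives only a single element, so no ADD is nested more deeply than the allocation site and an underutilized container warning would be reported. In collect(), the ADD sits in a loop nested inside the loop containing the allocation, so no warning is reported.

    import java.util.ArrayList;
    import java.util.List;

    class UnderutilizedExample {
        void process(String[] items) {
            for (String item : items) {
                List<String> tmp = new ArrayList<>();   // allocation site
                tmp.add(item);                          // ADD executes once per instance
                consume(tmp);
            }
        }

        void collect(String[][] groups) {
            for (String[] group : groups) {
                List<String> all = new ArrayList<>();   // allocation site
                for (String s : group) {
                    all.add(s);                         // ADD nested more deeply
                }
                consume(all);
            }
        }

        void consume(List<String> l) { /* ... */ }
    }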

Consider again the running example. Recall that the statement-contexts pair that achieves ADD operation for container o4 is (t[pos++] = e, {entry6 → entry30}). There exists a strictly-smaller path ≤1→ call6 → call30 from node e2 to e40, which are the

method entry nodes for the allocation site of o4 (line 4) and for t[pos++] = e (line

41), respectively. In addition, this path contains the call chain call6 → call30, which is exactly the context contained in set {entry6 → entry30} in the pair (comparing

only labels on the call edges and the entry edges). Hence, no underutilized container

problem will be reported for container o4.

As another example, the statement-contexts pair for ADD on o13 is (t[pos++]

= e, {entry14 → entry30, entry15 → entry30}). The tool reports that o13 exhibits an

underutilized container problem, because no strictly-smaller paths from the entry

node e11 of its allocation site (line 13) to the entry node e40 of t[pos++] = e can

be found, under relevant calling context entry14 → entry30 or entry15 → entry30. The

problem does not occur for container o20, since there is a strictly-smaller path ≤4→

call22 → call30 from the allocation site at line 20 to the store operation t[pos++] = e

that implements ADD, under the relevant calling context entry22 → entry30.

Note that although the number of elements added to container o13 (i.e., 2) is in

fact larger than the number of times its allocation site can be executed (i.e., 1), it is

not a false positive to report it as an underutilized container. This is because the

creation of the container (which could cost hundreds of run-time instructions) can be

easily avoided by introducing extra variables for storing the data. It could be the case

that, while the ADD operations for a container are in the same loop as the allocation

site of the container, it may not be easy to perform an optimization because there

could be a large number of distinct ADD operations (e.g., the programmer intends to

add many elements without using loops). However, we have found that this situation

rarely occurs in real-world programs. Once an underutilized container problem is

reported, there is usually an obvious container that holds a very small number of

elements, and a specialization can be easily created.

Detecting overpopulated containers To find an overpopulated container, it

is necessary to compare the number of GET operations against the number of ADD operations for the container.

Definition 7.2.4 (Overpopulated container). Given a container allocation site

c, a set S1 of statement-contexts pairs that implement ADD for c, and a set S2 of statement-contexts pairs that implement GET for c, c is an overpopulated container if for any pair (store, contexts1) ∈ S1 and any pair (load, contexts2) ∈ S2, (1) there exists a valid potentially-smaller path p from lh(load) to lh(store), and (2) there exist a context t1 ∈ contexts1 and a context t2 ∈ contexts2 such that p starts with the chain of (inverse) call edges represented by t2 and ends with the chain of call

edges represented by t1.

Informally, an overpopulated container is reported if for every pair of GET and

ADD operations, a potentially-smaller relationship can be inferred between them.
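The following hypothetical snippet (not part of the running example) illustrates the pattern: many elements are added to the map, but it is consulted only once, so every GET is potentially-smaller than the ADD operations and the map would be reported as overpopulated.

    import java.util.HashMap;
    import java.util.Map;

    class OverpopulatedExample {
        String findName(int[] ids, String[] names, int wanted) {
            Map<Integer, String> byId = new HashMap<>();   // allocation site
            for (int i = 0; i < ids.length; i++) {
                byId.put(ids[i], names[i]);                // ADD in a loop
            }
            return byId.get(wanted);                       // single GET outside the loop
        }
    }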

In the running example, an overpopulated container problem is detected for con-

tainer o4. Recall that the statement-contexts pair that implements ADD is (t[pos++]

= e, {entry6 → entry30}), and the statement-contexts pair that implements GET is

(ret = t[index], {entry8 → entry12}). There exists a potentially-smaller path between these two statements: call12 → call8 → ≤1 → call6 → call30. This path contains the call edges call6 → call30 (i.e., t1) and call8 → call12 (i.e., t2). Container o13 is also overpopulated, because there exists a valid potentially-smaller path from the GET

to any of the two ADD operations.

Container o20 is not overpopulated, since no potentially-smaller path can be found

from its GET operation (i.e., the statement-contexts pair (ret = t[index], {entry24}))

to its ADD operation (i.e., pair (t[pos++] = e, {entry22 → entry30})).

Analysis Algorithm

In general, both the proof and the disproof of a certain path under certain calling contexts require a traversal of the inequality graph. The traversal has to follow the call edges represented by a start context and an end context, which are the relevant contexts associated with semantics-achieving statements. The start context is an empty stack if the statement is the allocation site of the container. A standard worklist-based algorithm is used in the expected way to perform a breadth-first traversal of the graph. The traversal terminates immediately if call edges in the path are detected to form a cycle, because there is no way to reason about the number of ADD and GET operations for a container if calls connecting these operations are involved in recursion.
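The following is a minimal sketch of such a traversal. The node, edge, and state types are hypothetical stand-ins for the actual implementation, context matching is omitted, and the recursion check is reduced to a simple cycle test over the call sites seen on the path.

    import java.util.*;

    class InequalityGraphSearch {
        enum Kind { CALL, INVERSE_CALL, INEQUALITY }
        static class Edge {
            final Kind kind; final int callSite; final int target;   // callSite unused (-1) for inequality edges
            Edge(Kind k, int cs, int t) { kind = k; callSite = cs; target = t; }
        }
        static class Node { final List<Edge> out = new ArrayList<>(); }

        static class State {
            final int node;
            final Set<Integer> callSitesOnPath;   // for the recursion (cycle) check
            final boolean sawInequality;
            State(int n, Set<Integer> c, boolean ineq) {
                node = n; callSitesOnPath = c; sawInequality = ineq;
            }
        }

        // Returns true if some strictly-smaller path (at least one inequality edge,
        // no recursive call chain) connects 'from' to 'to'.
        static boolean strictlySmaller(Map<Integer, Node> graph, int from, int to) {
            Deque<State> worklist = new ArrayDeque<>();
            worklist.add(new State(from, new HashSet<>(), false));
            while (!worklist.isEmpty()) {
                State s = worklist.poll();
                if (s.node == to && s.sawInequality) return true;
                for (Edge e : graph.get(s.node).out) {
                    Set<Integer> calls = new HashSet<>(s.callSitesOnPath);
                    if (e.kind != Kind.INEQUALITY && !calls.add(e.callSite)) {
                        continue;   // call edges would form a cycle: give up on this path
                    }
                    worklist.add(new State(e.target, calls,
                            s.sawInequality || e.kind == Kind.INEQUALITY));
                }
            }
            return false;
        }
    }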

Trade-off between the analysis scalability and the amount of information produced Because the inequality graph does not contain any data flow information, the graph traversal algorithm can follow arbitrary call and inverse call edges

when selecting the path. The start and end contexts are useful when the algorithm

attempts to decide which call or inverse call edge to follow. Suppose the algorithm is inspecting

method m. The algorithm decides to leave m through a call edge if (1) this edge is on

the end context, or (2) it can lead to the end context. On the other hand, it follows an inverse call edge if (1) the edge is on the start context, or (2) there does not exist any call edge going out of m that is on the end context or can lead to the end context. A call edge

is “leading to” a context when the method that the call edge goes to can (directly

or transitively) invoke the source method of the first call edge on the context. In

addition, we keep track of the set of methods that the related addTo and getFrom

paths have passed. The traversal of the inequality graph never enters a method if

this method is not in this set.

While the start and end contexts are useful, the worst-case time complexity of a naive graph traversal algorithm is still exponential, as the number of distinct calling contexts is exponential in the size of the program. This motivates the need for a framework that trades off the analysis scalability against the amount of information produced.

One factor considered by this framework is the number of unbalanced call/inverse-call edges in a valid potentially-smaller path. These numbers represent the length of the method sequences on the call stack that the path crosses, and they implicitly determine the running time of the algorithm. A path crossing too many calls is usually an indicator of an unrealizable interprocedural path (e.g., due to spurious call graph edges). Furthermore, even though an inefficiently-used container can be found by traversing a long interprocedural path on the inequality graph, it may be hard to optimize it, as the data it carries might be needed in many places in the program.

In this framework, the number of unbalanced call/inverse-call edges allowed in a path can be pre-set as a threshold. While this introduces unsoundness, it improves the scalability and presents to the user a set of containers that are potentially easy to specialize.

Another factor (orthogonal to the number of unbalanced call/inverse-call edges) that is taken into account is the time used to inspect each container. If a TimeOutException

is caught during the inspection of a container, the analysis moves on to the next

container, without generating any warnings about this current one. We experimented

with different time thresholds, and some of these results are described in Section 7.3.
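One way to realize this time budget is sketched below; the class and method names are hypothetical, and the tool's actual TimeOutException mechanism may be implemented differently.

    // Sketch of a per-container time budget. The traversal code calls
    // checkBudget() periodically; when the budget expires, the inspection of
    // the current container is abandoned and no warning is generated for it.
    class TimeOutException extends RuntimeException { }

    class BudgetedInspector {
        private long deadline;

        void inspectAll(Iterable<Object> containers, long budgetMillis) {
            for (Object container : containers) {
                deadline = System.currentTimeMillis() + budgetMillis;
                try {
                    inspect(container);              // may throw TimeOutException
                } catch (TimeOutException e) {
                    // give up on this container and move on to the next one
                }
            }
        }

        void checkBudget() {
            if (System.currentTimeMillis() > deadline) throw new TimeOutException();
        }

        void inspect(Object container) { /* CFL-reachability-based inspection */ }
    }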

7.2.3 Comparison between Static Inference and Dynamic Profiling

Compared with all existing bloat detection techniques based on dynamic analysis [94,96,105,122,156], a major weakness of a static analysis approach is its inability to estimate precisely the execution frequencies of various statements. However, it has the following three advantages over the dynamic approaches. First, the static analysis can be used as a coding assistance tool to find container-related problems during development, before testing and tuning have begun. It is desirable to avoid inefficient operations early, even before meaningful run-time executions are possible. Second, a problem detected by the static analysis usually indicates a programmer intent (or mistake) that is inherent in the program, while the results from a dynamic analysis depend heavily on the specific run-time execution being observed. Finally, the process of locating the underlying cause from the dynamically-observed symptoms is either completely manual, or involves ad hoc techniques that do not quickly lead a tool user to the problematic code. For example, a profiler can find a container exhibiting few lookup operations, but it is hard for it to effectively explain this behavior to the programmer. The static tool explicitly reports the loops that cause the generation of the warnings, thus reducing the effort to “connect the dots” from the manifestation of the problem to the core cause.

7.3 Empirical Evaluation

We have implemented the static and dynamic analyses based on the Soot program analysis framework [132, 147], and evaluated their effectiveness on the set of

21 Java programs shown in Table 7.1. All experiments used a dual-core machine

with an Intel Xeon 2.80GHz processor, running Linux 2.6.9 and Sun JDK 1.5.0 with

4GB of max heap space. The Sridharan-Bodik analysis framework from [133] was

adapted to compute CFL-reachability. A parallel version of the analysis was used: 4

threads were run to simultaneously inspect containers. Note that the total numbers

of reachable methods for some programs are significantly larger than the numbers

shown in previous work [133] for the same programs. This is because of the use of

the JDK 1.5.0 library which is much larger than the JDK 1.3 library used in that

previous work. Many of the programs were chosen from the DaCapo [15] benchmark

set. The analysis was able to run on all of the DaCapo programs, but we excluded

from the table the programs that do not use any Java containers. For eclipse, we analyzed the main framework and the following plugins that are necessary for the

DaCapo run: org.eclipse.jdt, org.eclipse.core, org.eclipse.text, org.eclipse.osgi, and org.eclipse.debug.

Static analysis Table 7.1 and the first part of Table 7.2 (T1 and T2) show the warnings generated by the static tool, the false positives, and the running times for two different configurations: 20 and 40 minutes allowed to inspect each container. If all containers can be completely inspected under the first configuration, the second configuration is not applied, and “-” marks the corresponding column. The table shows the number of underutilized container warnings (#UC), the number of overpopulated container warnings (#OC), the number of containers whose inspection is not completed due to time out (#NC), the number of false positives (#FP), and the total running time in seconds (RT). For programs with many warnings, we randomly picked 20 warnings (including both types of problems) for manual checking; the numbers of false positives found in these samples are reported and marked with ♯.

                                  T1 = 20 min
Benchmark    #M(K)   #Con    #UC   #OC    #NC   #FP    RT(s)
jack          12.5     34      8     4      2     *     1725
javac         13.4     45     12    10      5     *     5040
soot-c        10.4     15      0     1      3     0     1235
sablecc-j     21.4     29      3     5      0     1     1140
jess          12.8      4      2     2      0     0      790
muffin        21.4    108      4     7     78     0    28213
jb             8.2      9      1     7      0     0       64
polyglot       8.6     18      1     6      1     1     1259
jflex         20.2     44      2     5     17     0     6785
jlex           8.2     16      1     0      0     0      474
java-cup       8.4     10      1     3      0     0      519
antlr         12.9     15      2     2      0     0      584
bloat         10.8    260     18    46    118    1♯    34542
chart         17.4    286     15    29     52    1♯    26406
xalan         12.8      1      0     1      0     0      222
hsqldb        12.5      1      0     0      0     0       99
luindex       10.7      1      0     1      0     0       69
ps            13.5     42      0     8      0     0     1077
pmd           15.3     39      8     7      0     0     1322
jython        27.5     75      5    26      5    0♯     7055
eclipse1      41.0   1623     18    25   1097    0♯   447465
eclipse2      41.0   1623     47   121    351    3♯    32151

Table 7.1: Analysis statistics, part I. #M is the number of methods (in thousands) in Soot’s Spark context-insensitive call graph [82,147]. #Con is the total number of containers inspected in the application code. There are two rows for eclipse: (1) analyzing all plugins together and (2) analyzing them one at a time. Results are shown with T1 = 20 minutes and T2 = 40 minutes (in Table 7.2) limit for the static tool to inspect each container.

The analysis running time shown in the table includes the identification of semantics-

achieving statements and the inference of potentially-smaller relationships. They are

not listed separately because the running time is dominated by the former.

We have tuned the analysis by adjusting the maximum number of unbalanced

call/call edges traversed. All numbers from 3 to 10 were evaluated. We observed that problems caused by certain container usage patterns are missing in the reports when

                        T2 = 40 min                     Dynamic vs Static
Benchmark    #UC   #OC   #NC   #FP    RT(s)    #DU   #DO   #SN   #DN   #MS
jack           8     4     0     *     2561      8     4     *     *     *
javac         12    10     5     *     5040     12    10     *     *     *
soot-c         0     1     0     0     1440      0     1     0     0     0
sablecc-j      -     -     -     -        -      3     5     0     4     0
jess           -     -     -     -        -      2     2     0     0     0
muffin         5    16    56     0    66472      5    16     0    18     2
jb             -     -     -     -        -      1     7     0     0     0
polyglot       0     6     1     0     2447      0     6     2     1     1
jflex          2     5    14     0    16092      2     5     1     8     0
jlex           -     -     -     -        -      1     0     0     0     0
java-cup       -     -     -     -        -      1     3     0     0     0
antlr          -     -     -     -        -      2     2     0     2     0
bloat         24    76    17    2♯    49844     24    72     2    18     1
chart         21    38    12    1♯    35406     21    36     3    16     1
xalan          -     -     -     -        -      0     1     0     0     0
hsqldb         -     -     -     -        -      0     0     0     0     0
luindex        -     -     -     -        -      0     1     0     0     0
ps             -     -     -     -        -      0     8     0     5     0
pmd            -     -     -     -        -      5     7     0    19     0
jython         5    26     1    0♯     9745      5    21     1    17     2
eclipse1      18    32   956    0♯   825897     18    27     0    20     0
eclipse2      47   137   104    4♯    57897     45   110     5    20     0

Table 7.2: Analysis statistics, part II. The last part of the table compares the static and dynamic analyses.

this parameter is set to a number less than 7. Hence, 7 was chosen as the parameter

value for the experiments. This value appears to be an appropriate choice for the set

of benchmarks we used, and it may need to be re-adjusted for programs with different

container usage patterns.

False positives are determined by manually inspecting each program. Given a warning, we examine the code and check whether the container is appropriately used, and whether there is a better way of using it. This is a subjective choice in which we are trying to simulate what an “intelligent programmer” would do. Although there is no objective perfect answer, such an evaluation provides valuable indication of the real-world usefulness of the tool. For programs such as jack and javac whose source code is not publicly available, we mark the #FP columns with “*”, meaning that

the warnings are not checked. Overall, the results of the study are promising, since the

number of false positives is low across the benchmark set.

In general, false negatives can be introduced by the unsoundness of the analysis.

For example, a container problem can be missed if the container structure cannot be completely discovered within the allowed time limit, or if the call chain required to expose the problem is too long. The numbers of false negatives were not investigated because they can be reduced by increasing the resource budget (i.e., time limit and maximum call chain length). For example, for most programs in our study, a budget of

T = 40 min allows the analysis to successfully inspect all containers in the application code (i.e., to achieve #NC = 0).

The tool could not finish the inspection of many container objects in eclipse1.

This performance is caused in part by the extremely large code base of eclipse, and in part by the lack of precise contexts in many flowsTo or inverse flowsTo paths computed by the underlying Sridharan-Bodik framework (because of the use of “match edges”

[133]). In our current work we decided to use the framework as-is and to focus on demonstrating that the approach successfully identifies problematic containers.

Future work can improve the Sridharan-Bodik machinery to provide more precise context information for our analysis.

Splitting a large code base for scalability Note that both analysis time and the number of timed-out containers decrease substantially when eclipse plugins are analyzed individually (row eclipse2). This is because the number of calling contexts for each container method is reduced significantly. In general, it is not always valid to analyze separately the components of a large program. However, separate analysis can be made more general by first employing a relatively inexpensive escape

analysis, which identifies container objects that may be passed across the boundaries

of components (or plugins). A container that can never escape the component where

it is created can be safely analyzed in the absence of other components. This is

similar to the observation that context sensitivity is not necessary for an object that

never escapes its creating method. While element objects could still flow in and out

of components, there exist techniques to handle incomplete programs, for example,

by creating placeholders for missing objects [118]. For a non-escaping container, all

semantics-achieving statements under the relevant calling contexts triggered by the

creating component would be confined to that component.

Comparison with a dynamic approach The dynamic analysis instruments each semantics-achieving statement, runs the program, and reports containers whose

(1) ADD frequencies are smaller than 10, and (2) ADD frequency/GET frequency ratios are greater than 2. The intersections between the sets of statically and dynamically generated warnings are shown in the last part of Table 7.2: #DU and

#DO are the numbers of containers reported by the static analysis (from #UC and

#OC, respectively) that also appear in the dynamic analysis reports. Note that most

inefficiently-used containers found by the static analysis are also reported by the dy-

namic analysis, which shows that the static warnings indeed produce containers that

exhibit problematic run-time behavior.
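The reporting criteria of the dynamic analysis can be summarized by the following sketch. The profile record type is hypothetical, and the mapping of criterion (1) to underutilized containers and criterion (2) to overpopulated containers is an assumption made for illustration.

    class DynamicReportFilter {
        static class ContainerProfile {
            long addCount;   // profiled frequency of ADD operations
            long getCount;   // profiled frequency of GET operations
        }

        // criterion (1): fewer than 10 ADDs observed
        boolean fewAdds(ContainerProfile p) {
            return p.addCount < 10;
        }

        // criterion (2): ADD/GET ratio greater than 2 (written without division
        // so that a container with no GETs is also flagged)
        boolean manyMoreAddsThanGets(ContainerProfile p) {
            return p.addCount > 2 * p.getCount;
        }
    }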

It is not as easy to use the dynamic analysis to find containers that are optimizable across all inputs and runs, compared to using the static analysis. Columns #SN and

#DN show the numbers of containers reported by the static analysis and the dynamic analysis, respectively, for which we did not manage to come up with optimization solutions. To determine #SN, we examined each container that was already subjected

to a manual check for false positives (with T = 40 min, and with 20 randomly chosen containers for programs with more than 20 warnings). To determine #DN, we examined all dynamically-reported containers, if there were at most 20 of them; otherwise, we examined the 20 containers with the highest potential for performance improvement: the ten containers with the largest number of ADD operations, and the ten containers with the largest ratio of ADD to GET operations. Among all these examined containers, #DN is the number of those for which we could not determine

an appropriate optimization.

In the course of this experiment, it became clear that the problems reported by

the static analysis are easier to fix than those reported by the dynamic analysis. For

instance, among the top 20 dynamically-reported containers for bloat, a program

analysis framework in DaCapo, we eventually came up with optimization solutions

for only two, and one of them was also in the static analysis report. The remaining

containers are used to hold various kinds of program structures such as CFGs and

ASTs. While they are not retrieved frequently in one particular run (with the inputs

provided by DaCapo), it is hard to optimize them as their elements may be heavily

used when the program is run with other inputs. In contrast, most of the container

problems reported by the static analysis are straightforward and the programmer can

quickly come up with optimization solutions after she understands the loop nesting

relationships that cause the tool to report the warnings.

While static analysis reports can precisely pinpoint the causes of inefficiency prob-

lems, they cannot say anything about the severity of the warnings. When the analysis

is used during development (i.e., without any test runs) to find container-related prob-

lems, this information may not be required because a programmer should, ideally, fix

all reported warnings to avoid potential bloat. However, ranking of warnings be-

comes highly necessary when the tool is used for performance tuning and problem

diagnosis, as the reported warnings have different performance impact and impor-

tance. For example, when we attempt to find optimization opportunities for bloat, it is unclear, among the total of 100 warnings generated by the static analysis, which ones to inspect first. It is impractical to examine and fix all of them, a task that can be very labor-intensive and time-consuming. A natural solution is to rank the statically-reported containers based on their run-time allocation frequencies; this was the approach used in the two case studies described below.

As discussed earlier, the time budget limitations and the constraints on call chain length may cause the static analysis to miss problematic containers that are misused in complex ways. Consider the “top” inefficiently-used containers from the dynamic reports (the same 20 or fewer containers examined when determining #DN). Column

#MS shows the number of such containers that are missed by the static analysis.

While there are only a few missed containers, a more comprehensive study of their properties could be performed in future work, in order to identify new usage patterns that may be used to refine our current static analysis.

Performance improvement from specializing containers For bloat we found that among the containers that have warnings, the most frequently-allocated container is an ArrayList created by method children for each expression tree node to provide access to its children nodes. The number of objects added in this container is always less than three, and in many cases, it does not contain any objects. We studied the code and found an even worse problem: even when the children do not change, this container is never cached in the node. Every time method children is

invoked, a new list is created and the node’s children are added. By creating a special-

ized version that takes advantage of container types such as Collections.emptyList and Collections.singletonList and that caches the children list in each node, we were able to reduce the number of objects created from 129253586 to 89913518 (30% reduction), and the running time from 147 seconds to 110 seconds (24.5% reduction).
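A specialization along these lines might look like the following sketch; the class and field names are hypothetical, and bloat's actual expression-tree classes differ.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    // Sketch: the children list is cached per node, and the empty and singleton
    // cases avoid allocating an ArrayList altogether.
    class ExprNode {
        private final ExprNode[] kids;        // 0, 1, or 2 children in this sketch
        private List<ExprNode> childrenCache;

        ExprNode(ExprNode... kids) { this.kids = kids; }

        List<ExprNode> children() {
            if (childrenCache == null) {
                switch (kids.length) {
                    case 0:  childrenCache = Collections.emptyList(); break;
                    case 1:  childrenCache = Collections.singletonList(kids[0]); break;
                    default:
                        List<ExprNode> list = new ArrayList<>(kids.length);
                        Collections.addAll(list, kids);
                        childrenCache = list;
                }
            }
            return childrenCache;
        }
    }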

We have also inspected the report generated for chart. Many problems reported are centered around method getChunks declared in an interface, which returns an

ArrayList. This interface is implemented by more than 10 classes, each of which has its own implementation of getChunks. While many of these concrete classes do not

have any chunks associated, they have to create and return an empty ArrayList, in

order to be consistent with the interface declaration. By further studying the clients

that invoke these getChunks methods, we found that many of them need only to know

the number of chunks (i.e., by invoking getChunks().size()). We quickly modified the

code to replace the empty ArrayList with the specialized Collections.emptyList,

add a method getSize in each of the corresponding classes that calculates the num-

ber of chunks without creating a new list, and replace the calls getChunks().size()

with calls to the new getSize method. This process took us less than an hour. The

modified version achieved 3.5% running time reduction and 5% reduction of the num-

ber of generated objects. Note that these case studies only addressed two obvious

problems and did not go into any depth; in general, significant optimizations may

be possible if a developer familiar with the code thoroughly analyzes the reported

inefficiently-used containers and creates appropriate specialized versions.
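The essence of the chart change described above is sketched below with hypothetical interface and class names.

    import java.util.Collections;
    import java.util.List;

    // Sketch: clients that only need the number of chunks call getSize() instead
    // of getChunks().size(), and chunk-less implementations return the shared
    // immutable empty list instead of allocating a new ArrayList.
    interface ChunkSource {
        List<Object> getChunks();
        int getSize();                  // added method: count without allocating
    }

    class EmptyChunkSource implements ChunkSource {
        public List<Object> getChunks() { return Collections.emptyList(); }
        public int getSize() { return 0; }
    }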

7.4 Summary and Interpretation

This chapter presents a practical static analysis technique that can automatically

find inefficiently-used containers. The goal of this analysis is to check, for each container, whether it has enough data added and whether it is looked up a sufficient number of times. At the heart of these tools is a base static analysis that abstracts container functionality into basic operations ADD and GET, and detects them by formulating CFL-reachability problems. The identified operations can be used by a static inference engine that infers the relationship between their execution frequencies, and by a dynamic analysis that instruments these statements and finds bloat by profiling their frequencies. Experimental results show that the static analysis can scale to large

Java applications and can generate precise warnings about the suspicious usage of containers. Promising initial case studies suggest that the proposed techniques could be useful for identifying container-related optimization opportunities.

This analysis is the first work that attempts to identify performance problems using static approaches, and may thus open up a new research direction where compiler optimizations should target not just low-level redundancies in the program, but also much higher-level patterns of inefficiency. In fact, the success of this technique has demonstrated that it is possible for a static analysis to identify performance problems in large object-oriented programs as long as the analysis is focused on a clear performance-impacting pattern. The static analysis can be made even more useful by combining it with some lightweight run-time information. We believe that various similar analyses can be developed when additional bloat patterns are observed in the future. In fact, the next chapter presents a static analysis that targets another frequently-occurring bloat pattern we observed—loop-invariant data structures.

CHAPTER 8: Statically Detecting Loop-Invariant Data Structures to Avoid Bloat

In the culture of object orientation, Java programmers are taught to freely create

objects for whatever tasks they want to achieve, without concern for cost. They often

take for granted that the runtime system can optimize away all execution inefficiencies:

the Just-In-Time (JIT) compiler can remove whatever redundancy exists in the code,

and the Garbage Collector (GC) can quickly reclaim redundant objects created for

simple tasks. However, creating an object in Java with a new operator, in most cases, involves far more than allocating memory space, and can be much more expensive than a programmer realizes.

For example, object creation may need to execute large volumes of code to construct and initialize a data structure, and this process may even involve many slow I/O operations. One especially important case is when these expensive objects have data that is invariant. Frequently constructing data structures with unchanged data may have a significant effect on application running time and scalability. Large performance improvements can often be seen when these data structures are reused rather than recreated.

Loops are places where such data structures can cause significant harm and thus special attention needs to be paid to find and optimize them. We propose static analyses that can find data structures that are created in a loop but are independent of specific iterations. This work is motivated by bloat patterns that are regularly

    for (int i = 0; i < N; i++) {
        SimpleDateFormat sdf = new SimpleDateFormat();
        try {
            Date d = sdf.parse(date[i]);
            ...
        } catch (...) { ... }
    }
                              (a)

    Templates _template = factory.newTemplates(stylesheet);
    while (...) {
        XMLFile file = getNewInputFile();
        XMLTransformer transformer = _template.newTransformer();
        transformer.transform(file);
        ...
    }
                              (b)

Figure 8.1: Real-world examples of heavy-weight creation of loop-independent data structures. (a) A SimpleDateFormat object is created inside the loop to parse the given Date objects; (b) An XMLTransformer object is created within the loop to transform the input XML files.

seen in large-scale applications. Figure 8.1 shows two examples extracted from the real-world programs that we have studied. The code pattern in part (a) has appeared a great number of times in applications that were written by IBM’s customers and tuned by a group from IBM Research. The programmer may have never realized that creating one SimpleDateFormat object requires loading many resource bundles to obtain the current date, compiling the default date pattern string, and loading the time zone to create a calendar. The process involves many expensive operations such as object clones, hash table lookups, etc. Part (b) illustrates a problem detected by our tool in DaCapo/xalan. An XMLTransformer object is created in a loop to transform the input XML file. While the input file is updated in each loop iteration, the transformer object is loop-invariant. A great amount of effort is needed to create a transformer, and a significant performance improvement can be achieved after hoisting the creation of this transformer. Details of this example can be found in Section 8.5.
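For instance, the pattern in Figure 8.1(a) can be rewritten so that the loop-invariant formatter is created only once. The following sketch assumes single-threaded use (SimpleDateFormat is not thread-safe) and keeps exception handling minimal.

    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import java.util.Date;

    class DateParsing {
        Date[] parseAll(String[] date) throws ParseException {
            SimpleDateFormat sdf = new SimpleDateFormat();   // hoisted out of the loop
            Date[] result = new Date[date.length];
            for (int i = 0; i < date.length; i++) {
                result[i] = sdf.parse(date[i]);
            }
            return result;
        }
    }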

Technical challenges While loop optimizations have been extensively stud-

ied and used in modern optimizing compilers [4], they are mostly intraprocedural and

deal only with instructions that operate on scalar variables and simple data structures

(e.g., arrays and linked lists). They are far from reaching our goal of finding large

optimization opportunities in programs that make extensive use of object-oriented

data structures. Techniques such as loop-invariant code motion target instructions

whose input variables are not defined in the loop. Such techniques are usually ineffec-

tive at handling instructions involving objects: for an object created in the loop, even

though one of its fields used in an instruction is not defined, it is not safe to move this

instruction out of the loop, as other fields of the object may be modified elsewhere in

the loop. In an object-oriented program, data abstractions are much more complex

and data in different locations are tightly coupled based on logical object models.

Focusing on logical data structures In this work, we focus on the data side of the hoisting problem, that is, to find logical data structures that are loop-invariant, regardless of whether or not it is possible to hoist the actual code statements that access these data structures. If a logical data structure is loop-invariant, the programmer should modify its creating and accessing code statements in order to move it out of the loop. There are two important aspects in determining whether a logical data structure is hoistable. First, it is critical to understand how this data structure is built up. For example, all objects in a hoistable data structure have to be allocated together in one iteration of the loop. In addition, any object in a hoistable data structure must be owned only by this data structure, and it cannot escape to other data structures. These properties can be verified by checking points-to relationships among objects. Second, it is important to understand where this data

structure gets its values from. For example, all values contained in (heap locations

of) a hoistable data structure must not be computed from any loop-iteration-specific value. This aspect of the problem is naturally related to the data dependence problem,

and thus, such (value origin) properties can be verified by checking data-dependence

relationships.

These two kinds of relationships are formalized as two (points-to and dependence)

effects by a type and effect system presented in Section 8.2. As the identification of

loop-invariant data structures requires reasoning about whether objects connected

by these relationships are always created in the same iteration of a loop, our analy-

sis computes, for each loop object, a loop iteration count abstraction that indicates

whether or not an instance of the object created in one iteration of the loop can be

carried over to the next iteration. A more detailed description of this abstraction

can be found in Section 8.1. Section 8.2 presents a formalism that computes such

abstractions.

Manual tuning with the help of hoistability measurement Given logical

loop-invariant data structures identified by our analysis, the second challenge lies in

how to perform the actual hoisting. While it is attractive to design a transforma-

tion technique that automatically moves invariant data structures out of loops, we

found that there is little hope that a completely automated approach can effectively

hoist these data structures in practice. This is first because of the over-conservative

nature of any transformation technique, which may prevent the hoisting of many

loop-invariant data structures due to their complex usage in large-scale applications.

The chance of developing an effective transformation technique becomes even smaller

in the presence of the many Java dynamic features such as dynamic class loading

and reflection. Second, effectively optimizing real-world data structures requires de-

veloper insight. For example, a data structure with 100 fields cannot be transformed

if it has even a single non-loop-invariant field. In fact, by manually inspecting and

perhaps modifying the data model, it is highly likely that the data structure can be

made hoistable (i.e., by introducing a separate object to store that loop-dependent

field).

Our work advocates a semi-automated approach that is intended to identify larger optimization opportunities by bringing developer insight into the optimization process. Instead of eagerly looking only for completely-hoistable logical data structures, we also identify partially-hoistable logical data structures, by computing a hoistabil-

ity measurement for each logical data structure, and rank all such data structures based on these measurements to help manual tuning. The higher the measurement of a data structure, the more likely it is that it can be manually hoisted.

One additional advantage of manual tuning using hoistability measurements is that these metrics can be easily modified to incorporate dynamic information obtained from a profile. Section 8.3 presents one such modification that includes loop frequencies in the metrics so that the “loop hotness” factor is taken into account when the rank of a data structure is computed. To optimize a non-hoistable data structure, the programmer can either split the data model (e.g., to separate the loop-invariant

fields and non-invariant fields) and/or restructure the statements that access it (e.g., to eliminate dependences between hoistable and non-hoistable statements). In addition, highly-ranked data structures can often be indicators of other loop-related inefficiencies, such as inappropriate implementation choices. These problems may also be revealed during the inspection of the reported data structures.
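As a hypothetical illustration of such a split, a record with a single loop-dependent field can be separated so that the invariant part is created once, outside the loop.

    // The loop-dependent counter is moved into its own object, so the remaining
    // (invariant) part of the record becomes hoistable.
    class InvariantPart {
        final String name;
        final java.util.Map<String, String> config;
        InvariantPart(String name, java.util.Map<String, String> config) {
            this.name = name;
            this.config = config;
        }
    }

    class PerIterationPart {
        int counter;                      // the only loop-dependent field
    }

    class Driver {
        void run(String[] inputs, java.util.Map<String, String> config) {
            InvariantPart shared = new InvariantPart("task", config);  // hoisted
            for (String in : inputs) {
                PerIterationPart p = new PerIterationPart();           // cheap, per iteration
                p.counter = in.length();
                process(shared, p);
            }
        }
        void process(InvariantPart s, PerIterationPart p) { /* ... */ }
    }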

Evaluation We evaluated our technique using a set of 19 Java programs. With the help of hoistability measurements, we found optimization opportunities in most of these programs. We discuss the performance gains we have achieved for five representative programs: ps, xalan, bloat, soot-c, and sablecc-j. For example, we found a performance problem in DaCapo/xalan; removing it can improve the benchmark performance by 10.1%. As another example, we found a bottleneck in the core components of ps. After the optimization, the running time was reduced by 82.1%.

Detailed description of the empirical evaluation can be found in Section 8.5. These results indicate that the proposed technique can be useful both in the coding phase

(for finding small performance issues before they accumulate) and in the tuning phase

(for identifying performance bottlenecks).

8.1 Overview

Figure 8.2 shows a simple running example. This example contains 10 allocation sites (including string literals), and all of them are located in loops (i.e., either directly in a loop or in a method invoked in a loop). Our analysis inspects each object10 located in a loop and discovers its structure (including objects that are context-sensitively reachable from it) using a context-free language (CFL)-reachability formulation proposed in Chapter 7 [159]. Using this formulation, the analysis inspects individual objects in a loop, identifies their structures, and computes their hoistability, all without requiring a pre-computed whole-program points-to solution.

Points-to relationships Figure 8.3(a) illustrates three data structures (a.1),

(a.2), and (a.3) that are rooted at objects in the two loops shown in Figure 8.2. Each

10“Object” will be used in the rest of this chapter to denote a static abstraction (i.e., an allocation site), while “instance” denotes a run-time object.

object is given a name oi, where i is the number of the line in the code where the object is created. Each edge in a data structure represents a points-to relationship, and is annotated with a field name and a pair of integers (i, j). Field elm is a special

field used to represent array elements. Integers in this pair are the loop iteration count abstractions (ICAs) for the two objects connected by this edge, and they can

be used to determine whether these objects are created in the same iteration of the

loop. Following the iteration abstraction [100] and the recency abstraction [13], an

ICA can be one of three (abstract) values: 0, 1, or ⊤. Note that the use of ICA is not a contribution of this dissertation: the major contribution lies in the formulation of loop-invariant data structure detection into computing ICA-annotated points-to and dependence relationships, and in the development of quantitative measurements that use these relationships to help programmers identify hoistable data structures.

For a particular loop l, an object whose ICA is 0 with respect to l must be created

outside l. The ICA for an object being either 1 or ⊤ (with respect to l) means the object must be created inside l. In particular, let us consider a run-time iteration p of l and a run-time instance r created by allocation site or such that r is live during the execution of p. If the ICA for or is 1, r is guaranteed to be created during iteration p. In other words, the ICA for an object being 1 indicates that its instances must be ”fresh“ across iterations, that is, in any iteration where an instance of it is live, this instance must be created in that iteration (i.e., it must not be carried over from a previous iteration). An object that has a ⊤ in its ICA is created in a (previous) unknown iteration. For example, the ICA for o23 is 1, as it creates a fresh object in

each iteration of the loop at line 20. The ICA for o22 is ⊤, as the instance it creates

in one iteration can be carried over to the next iteration.

 1 class List {
 2     Object[] arr; int index = 0;
 3     List() { arr = new Object[1000]; }
 4     void add(Object o) { if (index < 1000) arr[index++] = o; }
 5     Object get(int i) { return arr[i]; }
 6 }
 7 class Pair {
 8     Object f; Object g;
 9     Pair(Object o1, Object o2) { this.f = o1; this.g = o2; }
10 }
11
12 class Client {
13     static void main(String[] args) {
14         for (int i = 0; i < 500; i++) {
15             List l = new List();
16             Pair p = new Pair("hello", "world");
17             l.add(p);
18         }
19         Integer b = null;
20         for (int j = 0; j < 400; j++) {
21             Integer a = new Integer(j);
22             if (j == 20) b = new Integer(10);
23             Pair q = new Pair(a, b);
24             Pair r = new Pair("good", a);
25             ... // use q and r
26         }
27     }
28 }

Figure 8.2: Running example.

Hence, for a points-to edge annotated with (i, j), i = j = 1 guarantees that the two objects connected by the edge must be created in the same iteration of the loop, while either i or j being ⊤ indicates that the two objects may be created in different iterations. A data structure is obviously not hoistable if it contains objects that are created in different iterations.

Dependence relationships Figure 8.3(b) illustrates the dependence (def-use) chains that start at memory locations in each data structure shown in Figure 8.3(a).

An edge o1.f ← o2.g indicates that a run-time value contained in a heap location abstracted by o2.g is required in computing a value written into (a heap location abstracted by) o1.f. Note that o21.value is a field of class Integer that stores the int value embedded in an Integer object. A stack location is also considered in a


Figure 8.3: Data structures identified for the running example and their effect anno- tations. (a) Points-to effects among objects; (b) Dependence effects among memory locations.

dependence chain, if this location (variable) has a primitive type and is in the method that contains the loop of interest (e.g., variable j in method main). Such a variable needs to be taken into account as it may contain iteration-specific values. Variables that are not in the loop-containing method are abstracted away from dependence chains, as they can get iteration-specific values only from heap locations or variables in the loop-containing method (e.g., variable o at line 4). Reference-typed variables

(e.g., a and b) do not need to be considered either (even though they are in the loop-containing method), because they can get loop-specific values only from heap locations, and thus, it is necessary to track only heap locations.

Note that each dependence edge o1.f ← o2.g is also annotated with a pair of ICAs

(for o1 and o2), which is used to determine whether o1 and o2 are always created in

the same loop iteration. If a node in a dependence edge is a stack variable, such as

j, its ICA is determined by whether or not it is declared in the loop. For example,

j’s ICA is 0, because it is declared before the loop starts. Its ICA would have been 1

if it were declared in the loop. The ICA for a stack variable can never be ⊤, as each variable declared in a loop must be initialized (i.e., get a new value) per iteration.

Hoistable logical data structures As discussed earlier, the analysis identifies

(completely or partially) hoistable logical data structures from a purely data perspec-

tive, regardless of the actual code and control flow. This can be done by reasoning about these two kinds of annotated relationships. A hoistable data structure has the following important properties.

(1) (Disjoint). Its run-time instances have to be disjoint. No object is allowed to

appear in multiple instances of one single logical data structure. This property can

be verified by checking whether all points-to edges in a data structure are annotated

with (1, 1). A data structure is not hoistable if any of its nodes has a non-1 ICA.

This guarantees that any instance of a hoistable data structure does not have objects

created outside the loop or in different iterations. For example, (a.2) is not hoistable, as the g-labeled edge from o23 to o22 may connect objects created in different iterations.

(2) (Loop-invariant). Fields of objects in a hoistable data structure have to be

loop-invariant. No data in any run-time instance of the data structure can be de-

pendent on specific loop iterations. We check this property by formulating it as a

data-dependence problem. A sufficient condition for the statement “object o is loop-

invariant” is that, for each field of o, (2.1) no edge on a dependence chain (e.g., shown

in Figure 8.3(b)) that starts from the field can have ⊤ in its annotated ICA pair, and

(2.2) for each memory location node o.f (i.e., heap location) or j (i.e., stack location) on the chain, if the ICA for o or j is 0, this node must not be involved in a dependence

cycle.

(2.1) enforces that all (stack and heap) locations from which a hoistable data

structure instance (allocated in one iteration) gets its data are either created in this

same iteration, or exist before the loop starts. No data in this instance can be

obtained from an object created in a different iteration. In addition, as stated in

(2.2), if one such location already exists before the loop starts (e.g., variable j), this

node must not be in a dependence cycle. Otherwise, its value may be updated by each

iteration and any data structure that is dependent on this value is not hoistable. For

example, in Figure 8.3(b), o21.value depends on variable j, whose ICA is 0. Because j is in a dependence cycle, o21.value may have iteration-specific values, and thus, any data structure that contains o21 is not hoistable (e.g., structures (a.2) and (a.3) in Figure 8.3(a)). Note that field o3.elm is loop-invariant: while it depends on field o15.index, which is involved in a cycle, o15’s ICA is 1 and thus, it is impossible for an iteration-specific value to propagate to this field.

Note that these two properties are sufficient (but not necessary) conditions for hoistable logical data structures. For example, the first condition (i.e., disjointness) is an over-conservative approximation of the shape of a hoistable data structure—it is perfectly possible for a hoistable data structure to contain objects that are created outside the loop (i.e., their ICAs are 0) but are not mutated in the loop. We choose not to consider such objects in our hoistable data structure definition primarily for scalability purposes—these objects (created outside the loop) often have extremely long dependence chains (as objects that they reference can come from arbitrary places) and thus tracking these chains (for checking the second condition) can reduce the scalability of our analysis by orders of magnitude. On the other hand, dependence chains for objects that are created in the loop and do not escape the loop (i.e., all

their ICAs are 1) are generally much shorter and the dependence analysis is much

more scalable when considering only these chains.

Computing hoistability measurements After inspecting these two conditions, it is clear that only data structure (a.1) is a completely hoistable logical data structure. However, there might still exist optimization opportunities with the other two (partially hoistable) data structures. For example, if we can move object o22

out of data structure (a.2), it may still be possible to hoist (a.2). In order to help

programmers discover such hidden optimization opportunities, hoistability measure-

ments are proposed to quantify the likelihood of manually hoisting data structures

out of loops.

For example, for each data structure shown in Figure 8.3, we compute two sep-

arate hoistability measurements based on the two orthogonal (points-to and depen-

dence) effects mentioned above: structure-based hoistability (SH) that considers how

many objects in the data structure must be allocated in the same loop iteration (i.e.,

that comply with condition 1), and dependence-based hoistability (DH) that consid-

ers how many fields in the data structure must contain loop-invariant data (i.e., that

comply with condition 2). Eventually, these three data structures are ranked based

on the two measurements and are then presented to the user for further inspection.

Detailed description of hoistability measurements can be found in Section 8.3.

8.2 Loop-Invariant Logical Data Structures

This section formalizes the notion of loop-invariant logical data structure, and in

this context, formally defines our analysis that identifies hoistable data structures.

(a) Abstract syntax:
    Variables         a, b ∈ V
    Allocation sites  o ∈ O
    Instance fields   f ∈ F
    Labels            l ∈ L
    Statements        e ∈ E
    e ::= a = b | a = new ref o | a = b.f | a.f = b | a = null
        | e ; e | if (*) then e else e | while^l (*) do e

(b) Semantic domains:
    Iteration count          i ::= 0 | 1 | 2 | ... ∈ N
    Iteration map            ν ∈ L → N
    Loop status              π ::= ⟨l, i⟩,  l ∈ L ∪ {0}
    Labeled object           ô ::= o^π ∈ Φ
    Heap                     σ ∈ Φ × F → Φ ∪ {⊥}
    Environment              ρ ∈ V → Φ ∪ {⊥}
    Data origin              µ ∈ V → 2^(Φ×F)
    Heap points-to effect    H ::= ∅ | H ∪ {ô1 £f ô2}
    Heap data dep. effect    Ω ::= ∅ | Ω ∪ {ô1.f ≺ ô2.g}

Figure 8.4: A simple while language: (a) abstract syntax; (b) semantic domains.

The presentation proceeds in three steps. First, we define a simple imperative lan-

guage and present its abstract syntax and operational semantics, which we will use to

formalize our analysis algorithms. Second, we present a type and effect system that

abstracts concrete objects and effects. Finally, the analysis that detects hoistable

data structures is described based on the abstract heap effects generated by the type

and effect system.

8.2.1 Language, Semantics, and Effect System

Language The abstract syntax and the semantic domains for the simple while

language that we use are defined in Figure 8.4. A program in this language has a

fixed set of global variables with reference types. While primitive-typed variables are

considered in our analyses, they are excluded from this language for the simplicity of

presentation. Each allocation site is labeled with an ID o. Each loop is annotated with a natural number label l (l > 0), which will be used as the ID of the loop.

We develop a concrete operational semantics for the language in order to detect

hoistable data structures. A loop iteration count i records the number of iterations

that a loop has executed. A global loop iteration map ν maps each loop (label) to

its current iteration count. Each object instance is represented as its allocation site

o annotated with a pair ⟨l, i⟩, where l is the label of the loop in which o is located

(always > 0), and i is the count of the iteration of l that creates this instance. If

an object is not located in any loop, the loop status π for its instances is always

⟨0, 0⟩. For simplicity, we assume our loops are not nested. While nested loops can be

handled easily in our framework (e.g., by creating and associating with each object

an iteration count map that records an iteration count for each loop in which the

object is located), we found that it is not useful in hoisting data structures for real-

world Java programs: it is extremely rare that a data structure can be hoisted out of

multiple loops.

A heap σ records object reference relationships, and an environment ρ maps vari-

ables to objects in the heap. They are defined in standard ways. A data origin map µ

records, for each stack variable v, a set of heap locations such that values in these loca-

tions are required (i.e., either as a pointer for dereferencing, or being copied through

a sequence of intermediate stack locations) to obtain a value written to v. This map

tracks dependences between variables and their relevant heap locations, and will be

used to compute dependence effects as described shortly. For example, after the execution of a sequence of statements c = d.f; b = c; a = b, we have µ(a) = µ(b) = µ(c)

= {od.f} ∪µ(d), where od represents the object that d points to. µ(d) is included here because d is required as a reference to an object from which the value is obtained.

Dependences via the intermediate stack locations (e.g., b and c) are abstracted away as we are interested only in fields of objects that form data structures.

Note that µ records only one-hop heap location dependence—if the value in d.f is

obtained from another heap location, µ(a), µ(b) and µ(c) remain the same. Multi-hop heap location dependences can be obtained by computing the transitive closure of µ.

As discussed earlier in Section 8.1, we use points-to and dependence relationships to reason about (1) how data structures are built up and (2) where they get values from, respectively. These relationships are modeled by the following two kinds of

effects in our system. A heap points-to effect ô1 £f ô2 ∈ H is generated if, at a certain point, object ô2 becomes reachable from ô1 through field f. A data dependence effect tracks the flow of data. One such effect ô1.f ≺ ô2.g ∈ Ω indicates that ô2.g is required to compute a value written into ô1.f. This effect captures a transitive data

dependence relationship between two heap locations, abstracting away a possible

sequence of dependences via intermediate stack variables. Data dependence effects

can be computed efficiently by using the data origin map µ.
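The µ bookkeeping of the Load and Store rules can be mirrored directly in code. The following sketch uses a hypothetical string encoding of heap locations ("o.f") and is only meant to show how one-hop origins are recorded and how a store turns them into heap dependence effects.

    import java.util.*;

    class OriginTracking {
        Map<String, Set<String>> mu = new HashMap<>();        // variable -> heap locations
        List<String[]> dependenceEffects = new ArrayList<>(); // pairs (o1.f, o2.g)

        // a = b.f : a's origin is b's origin plus the dereferenced location o_b.f
        void load(String a, String b, String objOfB, String f) {
            Set<String> origin = new HashSet<>(mu.getOrDefault(b, Collections.emptySet()));
            origin.add(objOfB + "." + f);
            mu.put(a, origin);
        }

        // a.f = b : emit one heap dependence effect per location in mu(a) ∪ mu(b)
        void store(String objOfA, String f, String a, String b) {
            Set<String> sources = new HashSet<>(mu.getOrDefault(a, Collections.emptySet()));
            sources.addAll(mu.getOrDefault(b, Collections.emptySet()));
            for (String src : sources) {
                dependenceEffects.add(new String[] { objOfA + "." + f, src });
            }
        }
    }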

Note that in this language, it is safe for a dependence chain to not include any

stack variable (like j in Figure 8.3 (b)). This is because the language supports only

reference-typed variables, which can never form a dependence cycle themselves (with-

out a heap location involved). While we choose not to include primitive types in this

  a = null, ν, σ, ρ, µ ⇓ ν, σ, ρ[a ↦ ⊥], µ[a ↦ ∅], ∅, ∅                                   (Assign-Null)

  ô.o = alloc    ô.π = ⟨l, ν(l)⟩    ρ′ = ρ[a ↦ ô]    σ′ = σ[∀f. ô.f ↦ ⊥]
  ───────────────────────────────────────────────────────────────────────                 (New)
  a = new ref_alloc, ν, σ, ρ, µ ⇓ ν, σ′, ρ′, µ[a ↦ ∅], ∅, ∅

  a = b, ν, σ, ρ, µ ⇓ ν, σ, ρ[a ↦ ρ(b)], µ[a ↦ µ(b)], ∅, ∅                                 (Assign)

  ρ(b) = ô    µ′ = µ[a ↦ µ(b) ∪ {ô.f}]
  ──────────────────────────────────────                                                   (Load)
  a = b.f, ν, σ, ρ, µ ⇓ ν, σ, ρ[a ↦ σ(ô.f)], µ′, ∅, ∅

  ρ(a) = ô₁    ρ(b) = ô₂    H = (ô₂ = null ? ∅ : {ô₁ £^f ô₂})
  Ω = {ô₁.f ≺ ôᵢ.gᵢ | ôᵢ.gᵢ ∈ µ(a) ∪ µ(b)}
  ────────────────────────────────────────────────────────────                             (Store)
  a.f = b, ν, σ, ρ, µ ⇓ ν, σ[ô₁.f ↦ ô₂], ρ, µ, H, Ω

  e₁, ν, σ, ρ, µ ⇓ ν′, σ′, ρ′, µ′, H₁, Ω₁    e₂, ν′, σ′, ρ′, µ′ ⇓ ν″, σ″, ρ″, µ″, H₂, Ω₂
  ─────────────────────────────────────────────────────────────────────────────────────   (Comp)
  e₁; e₂, ν, σ, ρ, µ ⇓ ν″, σ″, ρ″, µ″, H₁ ∪ H₂, Ω₁ ∪ Ω₂

  e₁, ν, σ, ρ, µ ⇓ ν′, σ′, ρ′, µ′, H, Ω
  ────────────────────────────────────────────────────────────                             (If-Else-1)
  if (∗) then e₁ else e₂, ν, σ, ρ, µ ⇓ ν′, σ′, ρ′, µ′, H, Ω

  e₂, ν, σ, ρ, µ ⇓ ν′, σ′, ρ′, µ′, H, Ω
  ────────────────────────────────────────────────────────────                             (If-Else-2)
  if (∗) then e₁ else e₂, ν, σ, ρ, µ ⇓ ν′, σ′, ρ′, µ′, H, Ω

  e, ν[j ↦ ν(j)+1], σ, ρ, µ ⇓ ν′, σ′, ρ′, µ′, H₁, Ω₁    while^j (∗) do e, ν′, σ′, ρ′, µ′ ⇓ ν″, σ″, ρ″, µ″, H₂, Ω₂
  ─────────────────────────────────────────────────────────────────────────────────────────────────────────────   (W)
  while^j (∗) do e, ν, σ, ρ, µ ⇓ ν″, σ″, ρ″, µ″, H₁ ∪ H₂, Ω₁ ∪ Ω₂

Figure 8.5: Concrete instrumented semantics.

language (for the simplicity of presentation), our implementation handles both primitive and reference types. It is also important to note that the effects shown in

Figure 8.4 are concrete effects. We will present an approach to project them into abstract effects, which are essentially the annotated edges shown in Figure 8.2.

Concrete instrumented semantics Figure 8.5 presents a big-step operational semantics for our language. A judgment of the form

e, ν, σ, ρ, µ ⇓ ν′, σ′, ρ′, µ′, H, Ω starts with a statement e, which is followed by the loop iteration map ν, heap σ, environment ρ, and data origin map µ. The execution of e terminates with a final iteration map ν′, heap σ′, environment ρ′, origin map µ′, heap points-to effect set H, and heap data dependence effect set Ω.

Rules Comp, If-Else-1, If-Else-2, and W are defined in expected ways. In rules

Assign-Null and New, the data origin for a (in µ) is assigned ∅, as the value is freshly generated and does not depend on any heap value. In rule New, for a labeled object ô, ô.o and ô.π denote the allocation site and the loop status pair for ô, respectively. In ô.π, the loop l where the allocation site o is located is determined

statically, and is associated with each instance created by o. The iteration count for

l is retrieved from the global iteration count map ν. Rule Assign propagates both the object reference and the data origin of this reference value from b to a. In rule

Load, the data origin map µ for variable a is updated in a way so that both ô.f and

the origin of the value in b are recorded as a’s origin. Hence, dependences via both

the value copy (from b.f to a) and the pointer dereference (i.e., dereferencing b) are

captured.

To handle a store a.f = b where b’s value is written to the heap, a points-to

effect ô₁ £^f ô₂ is first generated. The rule next generates a set of heap dependence

effects {ô₁.f ≺ ôᵢ.gᵢ | ôᵢ.gᵢ ∈ µ(b) ∪ µ(a)}, by consulting the data origin map µ. Each

dependence effect states that a value read from field gᵢ of object ôᵢ has been used to

produce a value written to ô₁.f during the execution.

◮ Example For illustration, consider the following example:

a = new ref_o1; e = new ref_o2; a.f = e; j = a.f;

while^1 (j) do { b = a.f; d = b; c = new ref_o3; c.g = d; } At the end of the first iteration of the loop, the semantic domains contain the following values:

ν = [1 ↦ 1], σ = [ô₁.f ↦ ô₂, ô₃.g ↦ ô₂], ρ = [a ↦ ô₁, b ↦ ô₂, c ↦ ô₃, d ↦ ô₂, e ↦ ô₂, j ↦ ô₂], µ = [a ↦ ∅, b ↦ {ô₁.f}, c ↦ ∅, d ↦ {ô₁.f}, e ↦ ∅, j ↦ {ô₁.f}],

H = {ô₁ £^f ô₂, ô₃ £^g ô₂}, Ω = {ô₃.g ≺ ô₁.f}. ◭

Before proceeding to the type and effect system, we need to show that our operational semantics adequately captures the heap points-to and dependence effects.

Lemma 8.2.1 (Operational semantics adequacy) In the operational semantics shown in Figure 8.5, H captures all possible run-time points-to relationships; Ω records, for each heap location o.f ∈ Dom(σ), all possible (one-hop) heap locations that have been used to (directly or transitively) produce values written into o.f.

Proof sketch. It is trivial to show that after the program terminates, H contains all possible heap points-to effects: a points-to relationship can be established only when a store is executed, and it is recorded by rule Store. To demonstrate that Ω captures all (one-hop) dependences among heap locations, we first have to show that the data origin map µ captures all the heap locations that each variable a depends on at run time. This can be seen by induction on the derivation. The key is to show that

(1) if the value of variable a is computed directly from a heap value, this dependence must be captured in µ at a load that reads that heap value, and (2) the dependence is appropriately propagated via a copy assignment (by rule Assign) and via another load that dereferences a (by rule Load). Next, since Ω is updated only at stores, it is necessary to focus on the execution of a store a.f = b to show the adequacy of

Ω: all values necessary to execute this store are in (stack) locations a and b, and the set of all heap locations that are used to produce these values is already included in

µ(a) ∪ µ(b) at this point (from the above reasoning about µ). Hence, all of the heap locations on which ôₐ.f is data dependent are recorded in Ω. □

  Iteration count abstraction     ĩ ::= 0 | 1 | ⊤
  Loop status abstraction         π̃ ::= ⟨l, ĩ⟩          (l ∈ L ∪ {0})
  Type                            τ̃ ∈ T ::= o^π̃ | ⊤
  Type environment                Γ ∈ V → T ∪ {⊥}
  Data origin abstraction         µ̃ ∈ V → 2^(T×F)
  Points-to effect abstraction    H̃ ::= ∅ | H̃ ∪ {τ̃₁ ¤^f τ̃₂}
  Dependence effect abstraction   Ω̃ ::= ∅ | Ω̃ ∪ {τ̃₁.f ≺ τ̃₂.g}

Figure 8.6: Syntax of types and abstract effects.

Abstract semantics The concrete semantics uses an unbounded number of objects and unbounded loop iteration counts. We next develop a type and effect system that describes an abstract semantics, which conservatively approximates the concrete semantics with a bounded set of objects and bounded loop iteration counts. The syntax of types and abstract effects is illustrated in Figure 8.6. The abstraction of each concrete domain (e.g., π) shown in Figure 8.4 is represented by its corresponding tilded symbol (e.g., π̃). Environment ρ is abstracted by the type environment,

denoted by Γ. A type τ̃ abstracts a labeled object instance ô by projecting its con-

crete iteration count ô.π.i to an iteration count abstraction (ICA) τ̃.π̃.ĩ, which can

have three abstract values: 0, 1, and ⊤. The meaning of these values was explained

in Section 8.1. Using this type and effect system, we can identify data structures

whose objects are guaranteed to be created in the same iteration by reasoning about

object ICAs. Note that each abstract effect in H̃ and in Ω̃ corresponds to an edge in

Figure 8.3 (a) and in Figure 8.3 (b), respectively.

Figure 8.7 shows the type rules, which are parallel with the operational semantics

in Figure 8.5. Auxiliary operations used in the type rules are shown in Figure 8.8.

  Γ, µ̃ ⊢ a = null : Γ[a ↦ ⊥], µ̃[a ↦ ∅], ∅, ∅                                             (TAssign-Null)

  τ̃.o = alloc    τ̃.π̃ = ⟨l, 1⟩    Γ′ = Γ[a ↦ τ̃]
  ──────────────────────────────────────────────                                           (TNew)
  Γ, µ̃ ⊢ a = new ref_alloc : Γ′, µ̃[a ↦ ∅], ∅, ∅

  Γ, µ̃ ⊢ a = b : Γ[a ↦ Γ(b)], µ̃[a ↦ µ̃(b)], ∅, ∅                                           (TAssign)

  τ̃ = Γ(b)    µ̃′ = µ̃[a ↦ µ̃(b) ∪ {τ̃.f}]
  ───────────────────────────────────────                                                  (TLoad)
  Γ, µ̃ ⊢ a = b.f : Γ[a ↦ ⊤], µ̃′, ∅, ∅

  τ̃₁ = Γ(a)    τ̃₂ = Γ(b)    H̃ = {τ̃₁ ¤^f τ̃₂} if τ̃₁ ≠ ⊥ and τ̃₂ ≠ ⊥, ∅ otherwise
  Ω̃ = {τ̃₁.f ≺ τ̃ᵢ.gᵢ | τ̃ᵢ.gᵢ ∈ µ̃(b) ∪ µ̃(a)}
  ────────────────────────────────────────────────────────────────────────────             (TStore)
  Γ, µ̃ ⊢ a.f = b : Γ, µ̃, H̃, Ω̃

  Γ, µ̃ ⊢ e₁ : Γ′, µ̃′, H̃₁, Ω̃₁    Γ′, µ̃′ ⊢ e₂ : Γ″, µ̃″, H̃₂, Ω̃₂
  ─────────────────────────────────────────────────────────────                            (TComp)
  Γ, µ̃ ⊢ e₁; e₂ : Γ″, µ̃″, H̃₁ ∪ H̃₂, Ω̃₁ ∪ Ω̃₂

  Γ, µ̃ ⊢ e₁ : Γ′, µ̃′, H̃₁, Ω̃₁    Γ, µ̃ ⊢ e₂ : Γ″, µ̃″, H̃₂, Ω̃₂
  ───────────────────────────────────────────────────────────────────────────────────────  (TIf-Else)
  Γ, µ̃ ⊢ if (∗) then e₁ else e₂ : Γ′ ⊎ Γ″, µ̃[∀v. v ↦ µ̃′(v) ∪ µ̃″(v)], H̃₁ ∪ H̃₂, Ω̃₁ ∪ Ω̃₂

  Γ[∀v. v ↦ (Γ(v).o)^(Γ(v).π̃ ⊕_j 1)], µ̃ ⊢ e : Γ, µ̃, H̃, Ω̃
  ────────────────────────────────────────────────────────                                 (TW)
  Γ, µ̃ ⊢ while^j (∗) do e : Γ, µ̃, H̃, Ω̃

Figure 8.7: Typing.

Some abstract semantic domains in Figure 8.6 are extended with ⊤ and/or ⊥ ele-

ments, as necessary.

Since the type and effect system abstracts the concrete semantics in Figure 8.5,

most of the rules in Figure 8.7 are similar to their corresponding operational semantics

rules. Here we explain only a few of them that differ significantly from their concrete

counterparts. In rule TNew, the ICA for a newly created object is always 1, and

this value will be changed later (by rule TW) if this object is carried over to the

next iteration. (For objects created outside of loops, the ICA is 0; for brevity, this

variation of TNew is not shown in Figure 8.7.) Rule TLoad types variable a with

an unknown type ⊤. This handling is overly conservative, but it is needed for soundness.

Our implementation improves this by consulting a points-to graph that is computed

on demand.

[Join of Γ]

(1) Γ₁ ⊎ Γ₂ = Γ₃, where ∀a ∈ Dom(Γ₃):
      Γ₃(a) = Γ₁(a)           if a ∈ Dom(Γ₁) and a ∉ Dom(Γ₂)
      Γ₃(a) = Γ₂(a)           if a ∈ Dom(Γ₂) and a ∉ Dom(Γ₁)
      Γ₃(a) = Γ₁(a) ⊎ Γ₂(a)   if a ∈ Dom(Γ₁) ∩ Dom(Γ₂)

(2) τ̃₁ ⊎ τ̃₂ = τ̃₁                       if τ̃₂ = ⊥
            = τ̃₂                       if τ̃₁ = ⊥
            = (τ̃₁.o)^(τ̃₁.π̃ ⊎ τ̃₂.π̃)    if τ̃₁.o = τ̃₂.o
            = ⊤                        otherwise

(3) π̃₁ ⊎ π̃₂ = ⟨π̃₁.l, π̃₁.ĩ ⊎ π̃₂.ĩ⟩

(4) ĩ₁ ⊎ ĩ₂ = ĩ₁    if ĩ₁ = ĩ₂
            = ⊤     otherwise

[Operator ⊕]

(5.1) π̃ ⊕_j 1 = ⟨π̃.l, π̃.ĩ ⊕ 1⟩   if π̃.l = j
              = π̃                 otherwise

(5.2) ĩ ⊕ 1 = 1    if ĩ = 0
            = ⊤    otherwise

Figure 8.8: Auxiliary operations.

Type environment join (⊎) needs to be performed in order to handle the different control flow paths of an if-else statement (in rule TIf-Else). Joining two environments (rules 1-4 in Figure 8.8) needs to consider both the allocation sites and the abstract loop iteration counts contained in types. If two types τ̃₁ and τ̃₂ do not have the same allocation site o (rule 2), performing a join on them yields ⊤. Otherwise, their loop status abstractions τ̃₁.π̃ and τ̃₂.π̃ are forced to join (rule 3). Loop labels (π̃.l) in the two status pairs have to be the same because they are associated with the same allocation site. Joining ICAs ĩ₁ and ĩ₂ is shown in rule 4: if ĩ₁ ≠ ĩ₂, the result is ⊤, meaning that nothing is known about the iteration where the object is created. A

finite-height type lattice can be defined based on the operations in Figure 8.8, with

⊤ and ⊥ as the maximum and minimum types in the lattice. Types with different allocation sites are not comparable.

At the beginning of each loop iteration (shown in rule TW), the ICA of each type (whose allocation site is under loop j) in the type environment is incremented by using operator ⊕, which is defined in Figure 8.8 (rule 5). The goal of this is to

“clear the loop status” of the objects that are carried over from the last iteration, so that these (old) objects and the fresh objects created in the current iteration can be distinguished. Note that a fixed point is computed for the handling of loops (as discussed further in Section 8.4): while each abstract iteration of the loop body may yield a different solution, the final fixed-point solution must not be smaller than any of these per-iteration solutions.
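
To make the lattice operations concrete, the following is a minimal sketch (with hypothetical names, not code from our tool) of the three-valued ICA domain together with the join of rule (4) and the ⊕ aging of rules (5.1) and (5.2) in Figure 8.8.

    // Sketch only: the three-valued ICA and the operations of Figure 8.8.
    enum ICA { ZERO, ONE, TOP }

    final class LoopStatusAbs {
        final int loopLabel;   // l
        final ICA count;       // abstract iteration count
        LoopStatusAbs(int loopLabel, ICA count) {
            this.loopLabel = loopLabel;
            this.count = count;
        }

        // Rule (4): equal ICAs stay; anything else joins to TOP
        static ICA join(ICA a, ICA b) {
            return (a == b) ? a : ICA.TOP;
        }

        // Rule (5.2): ZERO + 1 = ONE; any other value ages to TOP
        static ICA plusOne(ICA a) {
            return (a == ICA.ZERO) ? ICA.ONE : ICA.TOP;
        }

        // Rule (5.1): at the head of loop j, only statuses belonging to loop j are aged
        LoopStatusAbs ageForLoop(int j) {
            return (loopLabel == j) ? new LoopStatusAbs(loopLabel, plusOne(count)) : this;
        }
    }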

Next, we explain how to detect data structures whose objects are guaranteed to be allocated in the same loop iteration, using the type and effect system.

Lemma 8.2.2 (Connected objects created in the same iteration). For each heap

points-to effect τ̃₁ ¤^f τ̃₂ ∈ H̃, if τ̃₁.π̃.ĩ = τ̃₂.π̃.ĩ = 1 for a particular loop j (i.e., τ̃₁.π̃.l

= τ̃₂.π̃.l = j), then in each iteration of j where an instance of τ̃₁.o and an instance of

τ̃₂.o are connected by a store operation, these instances must be allocated in this same iteration.

Proof sketch. Consider a specific iteration k of j. If both objects are allocated in this iteration, their corresponding abstract iteration counts π̃₁.ĩ and π̃₂.ĩ are both updated to 1 upon their creation (rule TNew). In the very beginning of the next iteration k + 1, τ̃₁.π̃.ĩ and τ̃₂.π̃.ĩ will be incremented to ⊤ (rule TW) because these objects are carried over from the last iteration. If in this iteration, both of their allocation sites are executed again, the two ICAs (for the two new instances) are set back to 1

(rule TNew). This process (of setting the ICAs to ⊤ and then 1) is repeated if these allocations are executed during every iteration of j until j terminates. However, if one of the allocation sites (say o1) is not executed in iteration k + 1, its corresponding

(1) τ̃ ⊏ ô  ⇔  ô = ⊥ ∨ (ô.o = τ̃.o ∧ τ̃.π̃ ⊏ ô.π)

(2) π̃ ⊏ π  ⇔  π̃.l = π.l ∧ π̃.ĩ ⊏ π.i

(3) ĩ ⊏ i  ⇔  (i = ĩ = 0) ∨ (i > 0 ∧ ĩ = 1) ∨ ĩ = ⊤

(4) H̃ ⊏ H  ⇔  ∀(ô₁ £^f ô₂) ∈ H : ∃(τ̃₁ ¤^f τ̃₂) ∈ H̃ : (τ̃₁ ⊏ ô₁) ∧ (τ̃₂ ⊏ ô₂)
               ∧ ((τ̃₁.π̃.l = τ̃₂.π̃.l ∧ τ̃₁.π̃.ĩ = τ̃₂.π̃.ĩ = 1) ⇒ ô₁.π.i = ô₂.π.i)

(5) Ω̃ ⊏ Ω  ⇔  ∀(ô₁.f ≺ ô₂.g) ∈ Ω : ∃(τ̃₁.f ≺ τ̃₂.g) ∈ Ω̃ : (τ̃₁ ⊏ ô₁) ∧ (τ̃₂ ⊏ ô₂)
               ∧ ((τ̃₁.π̃.l = τ̃₂.π̃.l ∧ τ̃₁.π̃.ĩ = τ̃₂.π̃.ĩ = 1) ⇒ ô₁.π.i = ô₂.π.i)

(6) (Γ, µ̃) ⊏ (ρ, µ)  ⇔  (∀a ∈ Dom(ρ) : Γ(a) ⊏ ρ(a)) ∧ (∀a ∈ Dom(µ) : ∀ô.f ∈ µ(a) : ∃τ̃.f ∈ µ̃(a) : τ̃ ⊏ ô)

Figure 8.9: Abstraction relation ⊏.

ICA τ̃₁.π̃.ĩ will keep the value ⊤. Hence, at the end of iteration k + 1, τ̃₁.π̃.ĩ = ⊤ and

τ̃₂.π̃.ĩ = 1. Because the final solution Γ is a fixed point and ⊤ is greater than any

other abstract value, ⊤ will be recorded in Γ for τ̃₁.π̃.ĩ even though o₁ may allocate

instances again later in the loop.

Note that τ̃₁.π̃.ĩ = τ̃₂.π̃.ĩ = 1 does not necessarily indicate that τ̃₁.o and τ̃₂.o are

executed in every iteration of loop j. Their ICAs are 1 as long as their instances

cannot escape from the iteration where they are created to the next iteration of the

loop. This feature allows the analysis to report potentially-hoistable data structures

even though their construction code (i.e., stores that connect objects in them) is

guarded by conditionals.

Similarly, given a dependence effect τ̃₁.f ≺ τ̃₂.g, if τ̃₁.π̃.ĩ = τ̃₂.π̃.ĩ = 1 for a loop

j, we can safely conclude that this whole dependence (i.e., computation) chain from

τ̃₁.f to τ̃₂.g occurs in the same iteration of j, because there are only stack variables

between the two end heap locations of the chain. A detailed description of how the

type and effect system abstracts the concrete operational semantics can be found in

Figure 8.9. The abstraction relation is denoted by ⊏.

Given the operational semantics and the type rules, we can prove that the heap points-to and dependence effect abstractions are sound. We show that for each type derivation, the points-to and dependence effect abstractions conservatively approximate the actual points-to and dependence effects generated by the corresponding operational semantics derivation.

Theorem 8.2.3 (Effect abstraction soundness). The effect abstractions are sound:

(e, ν, σ, ρ, µ ⇓ ν′, σ′, ρ′, µ′, H, Ω) ∧ (Γ, µ̃ ⊢ e : Γ′, µ̃′, H̃, Ω̃) ∧ ((Γ, µ̃) ⊏ (ρ, µ))

⇒ H̃ ⊏ H ∧ Ω̃ ⊏ Ω ∧ (Γ′, µ̃′) ⊏ (ρ′, µ′)

The proof can be done by induction on the derivation, starting at the root and working toward the leaves.

8.2.2 Hoistable Logical Data Structures

In this subsection, we introduce the notion of hoistable logical data structures based on the points-to and dependence effect abstractions computed by the type and effect system. As discussed earlier, here we address the question “what data is hoistable in the best scenario”—that is, to find hoistable logical data structures that meet the two criteria discussed in Section 8.1. Whether and how they can actually be hoisted will be decided by the user upon inspection. This subsection presents mathematical properties of hoistable data structures.

Definition 8.2.4 (Constrained closures of H̃ and Ω̃) Constrained closures of H̃ and

Ω̃ are represented by relations ¤*_{j,ĩ₁,ĩ₂} and ≺*_{j,ĩ₁,ĩ₂}, where parameters j, ĩ₁, and ĩ₂ denote a loop label, a lower bound, and an upper bound of ICAs, used to compute transitive relationships. We define order ≤ on the ICA domain ĩ as 0 ≤ 1 ≤ ⊤.

(1) Closure ¤*_{j,ĩ₁,ĩ₂} (on H̃) selects a set of data structures (whose nodes are types),

in which each edge has the form o₁^π̃₁ ¤^f o₂^π̃₂ ∈ H̃, s.t. π̃₁.l = π̃₂.l = j, ĩ₁ ≤ π̃₁.ĩ ≤ ĩ₂, and ĩ₁ ≤ π̃₂.ĩ ≤ ĩ₂.

(2) Similarly, closure ≺*_{j,ĩ₁,ĩ₂} (on Ω̃) selects a set of computation chains, in which

each edge has the form o₁^π̃₁.f ≺ o₂^π̃₂.g ∈ Ω̃, s.t. π̃₁.l = π̃₂.l = j, ĩ₁ ≤ π̃₁.ĩ ≤ ĩ₂, and ĩ₁ ≤ π̃₂.ĩ ≤ ĩ₂.

Note that constraint ĩ₁ ≤ π̃.ĩ ≤ ĩ₂ is used to compute these closures: τ̃₁ ¤^f τ̃₂ (or

τ̃₁.f ≺ τ̃₂.g) is added into the closure ¤*_{j,ĩ₁,ĩ₂} (or ≺*_{j,ĩ₁,ĩ₂}) only when the ICAs τ̃₁.π̃.ĩ and τ̃₂.π̃.ĩ are “between” the specified parameters ĩ₁ and ĩ₂. For example, the general

closures ¤* and ≺* are special cases of their constrained closures when ĩ₁ = 0, ĩ₂ = ⊤,

and j is an arbitrary loop label. It is also easy to see that ¤*_{j,0,0} selects data structures whose objects are all created outside loops. Similarly, a data structure selected by

¤*_{j,0,1} is such that its objects can be created both inside and outside loop j, and the inside objects in any run-time instance of the data structure are always allocated in the same iteration. Using constrained closures, we give the following definitions.
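
As an illustration only (with hypothetical types; the actual analysis operates on Soot's intermediate representation), the sketch below shows the edge filtering performed by a constrained closure: an abstract points-to edge is retained only if both endpoint types belong to loop j and their ICAs lie within the given bounds, and the transitive closure over the retained edges then forms the candidate data structure.

    // Sketch only: filtering abstract points-to edges by constraint (j, lo, hi).
    import java.util.ArrayList;
    import java.util.List;

    final class AbstractType {
        final int allocSite;   // o
        final int loopLabel;   // pi-tilde.l
        final int ica;         // ICA encoded as 0, 1, or 2 (for TOP)
        AbstractType(int allocSite, int loopLabel, int ica) {
            this.allocSite = allocSite;
            this.loopLabel = loopLabel;
            this.ica = ica;
        }
    }

    final class PointsToEdge {
        final AbstractType src, dst;
        final String field;
        PointsToEdge(AbstractType src, String field, AbstractType dst) {
            this.src = src;
            this.field = field;
            this.dst = dst;
        }
    }

    final class ConstrainedClosure {
        static List<PointsToEdge> filter(List<PointsToEdge> heapEffects, int j, int lo, int hi) {
            List<PointsToEdge> kept = new ArrayList<>();
            for (PointsToEdge e : heapEffects) {
                boolean sameLoop = e.src.loopLabel == j && e.dst.loopLabel == j;
                boolean srcInRange = e.src.ica >= lo && e.src.ica <= hi;
                boolean dstInRange = e.dst.ica >= lo && e.dst.ica <= hi;
                if (sameLoop && srcInRange && dstInRange) {
                    kept.add(e);
                }
            }
            return kept;
        }
    }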

Definition 8.2.5 (Disjoint Data Structure (DDS)) For an allocation site p located

in loop j, a data structure (denoted δ_p^j) rooted at p with respect to loop j is a graph whose edge set is a subset of H̃. Its run-time instances are guaranteed to be disjoint

if, for any edge τ̃₁ ¤^f τ̃₂ of the data structure, there exists a type τ̃ for p, s.t.

τ̃.o = p ∧ τ̃.π̃.ĩ = 1 ∧ τ̃ ¤*_{j,1,1} τ̃₁ ∧ τ̃ ¤*_{j,1,1} τ̃₂

A DDS contains objects that are reachable from root p and that are created in the

loop. Each run-time instance of a DDS is guaranteed not to contain any object

instance created (1) outside the loop or (2) inside the loop but in a different iteration.

This is achieved by using constraint (j, 1, 1) for the closure computation.

Lemma 8.2.6 (Disjointness of DDS instances) Given two run-time instances of a

DDS δ_p^j created by two iterations of loop j, no run-time object exists in both instances.

Proof sketch. The lemma can be proven by contradiction. Suppose there is a run-time object that exists in both instances of the data structure. At the point it is added into the second data structure instance (created by a later iteration), the abstract loop iteration count for j contained in its type must be ⊤, which is recorded

in the abstract points-to effect that represents this addition. This contradicts the fact

that δ_p^j is constructed using closure ¤*_{j,1,1}, which limits the abstract iteration count

for each type to be 1. 

Definition 8.2.7 (Iteration-Dependent Field) A field of the form τ̃.f is loop-iteration-

dependent (LID) with respect to loop j if

(a) ∃τ̃′.g : τ̃.f ≺*_{j,0,⊤} τ̃′.g ∧ (τ̃′ = ⊤ ∨ τ̃′.π̃.ĩ = ⊤)
    ∨
(b) ∃τ̃′.g : τ̃′.π̃.ĩ = 0 ∧ τ̃.f ≺*_{j,0,1} τ̃′.g ∧ τ̃′.g ≺*_{j,0,1} τ̃′.g

Determining whether the value of an object field depends on a specific loop itera-

tion requires inspecting abstract dependence effects. As discussed in (condition 2 of)

Section 8.1, a field can be iteration-dependent if (1) it depends on a value read from

a field of an unknown object or an object created in an unknown (different) iteration,

or (2) it depends on a field of an object created outside the loop (e.g., τ̃′.g), and this

field is involved in a dependence cycle (i.e., it can transitively depend on itself).

Lemma 8.2.8 (Loop-Invariant Data Structure) A data structure is loop-invariant if, for each type τ̃ in the data structure, ∀τ̃.f ∈ Dom(Ω̃): τ̃.f is not an LID field.

Proof sketch. Let us negate the two conditions in Definition 8.2.7, that is, a loop-

invariant field o.f can depend only on (1) fields of objects guaranteed to be created

in the same iteration with o, or (2) fields of objects created outside the loop and

not involved in dependence cycles. For (1), the proof can be done by induction on

the chain of dependence edges leading to o.f. In the base case, fields without any

incoming dependence edge must be assigned newly created objects or null, and thus must be loop-invariant. For the inductive step, consider the n-th edge along the chain.

If the source field of the edge is loop-invariant, the target field of the edge must also be loop-invariant.

For (2), let us first consider a simplified situation where there is only one outside object field p.q (i.e., p is created outside the loop) involved in the dependence chain.

There are two subcases. First, field o.f (o is an object created inside the loop) depends on p.q and p.q is never written in the loop. It is straightforward to see that p.q does not carry any iteration-specific values and thus o.f is loop-invariant.

Second, suppose field p.q is written in iteration i with a value v produced in iteration i′. Here i must equal i′, because otherwise o.f could depend on a value computed in a different iteration, which contradicts the statement in (1). Since value v cannot

depend on p.q (otherwise p.q would depend on itself), it must come only from objects freshly created in this iteration. Based on the proof of case (1), we know that v must

be loop-invariant. If p.q is read later in iteration k > i to produce another value v′, v′

must also be the same across iterations because p.q is invariant. This reasoning can be easily generalized to handle the more complex situation where multiple outside object fields exist in the dependence chain. 

Theorem 8.2.9 (Hoistable Logical Data Structure (HDS)) If a data structure δ_p^j is

(1) disjoint and (2) loop-invariant, it exhibits exactly the same run-time behavior whether it is located inside or outside the loop, under the assumption that the code statements that access this data structure can be safely hoisted.

Note that this section introduces the notion of hoistable logical data structure only for formal development. In fact, instead of reporting only completely-hoistable data structures, our analysis identifies, for each logical data structure, its hoistable part (the part that is both disjoint and loop-invariant). The analysis eventually ranks all loop data structures based on their hoistability measurements, in order to quantify the likelihood of successful manual hoisting.

8.3 Computing Hoistability Measurements

In practice, we have observed that only a small number of data structures and statements in a large program can meet both of the hoisting criteria described in

Section 8.2. This is primarily due to their complex usage and the conservative nature of any static analysis algorithm, which must model this complex usage soundly. While

fully-automated transformations are desirable and sometimes possible, the usefulness

of the static analyses can be increased even further by generalizing them to provide

valuable support for programmer-driven manual code transformations.

Previous studies such as [156,159] have demonstrated that, in many cases, manual

performance tuning with developers’ insight can be much more effective than fully-

automated compiler optimizations. For instance, a programmer may quickly identify

that it is problematic to create a 100-field data structure in a loop with only 1

field being iteration-dependent, while the sound transformation would give up and

terminate silently. To enlist human effort, we must present to developers highly-relevant

information that can quickly direct them to a problematic area. In this section, we

present two metrics that we use to measure hoistability of data structures. These measurements are computed based on the points-to and dependence relationships described earlier in this chapter.

Dependence-based hoistability (DH) The first metric we consider measures the amount of data in a data structure that is constant during the execution of a loop

(i.e., the part that complies with rule 2 in Section 8.1). This dependence hoistability metric is simply defined as an exponential function DH = s^(n/s), which considers two

factors: the total number of fields s in a data structure and the number of its loop-

invariant (i.e., non-LID, discussed in Def. 8.2.7) fields n. The larger s is, the more

performance improvement could potentially be achieved by hoisting it. The larger

n/s is, the easier it is for a programmer to hoist this data structure. If s is 1, the

data structure contains a single field. Even though this field is invariant (i.e., n/s is

1), hoisting it may not have a large impact on performance. If n/s is a small number

(i.e., most of its fields are not invariant), the result of s^(n/s) can be very small (i.e.,

close to 1) regardless of how large s is, which also indicates it is not worth spending

time as the data structure may be too difficult to hoist. In addition, we choose

an exponential function instead of a polynomial function as the metric because the

exponential function “penalizes” cases where n is small, while a polynomial function

would be “fair” for all cases of n. For example, if n=s/2 (half the fields are invariant),

our exponential function will give the square root of s, while a polynomial function

may give a much larger number.

Structure-based hoistability (SH) Similarly to the first metric, the second

metric considers, for each data structure, how many objects in it are guaranteed to

be allocated in the same iteration (i.e., the part that complies with condition 1 in

Section 8.1). This structural hoistability metric is defined as SH = t^(m/t), where t is

the total number of objects in the data structure and m is the number of objects whose ICA is 1. SH considers both the size of the data structure and the size of its disjoint part. Note that when m/t is 1 for a data structure, this data structure is a

DDS (as discussed in Def. 8.2.5), as it is guaranteed to have disjoint instances in all loop iterations.

During our studies, we found that DH is much more useful than SH in distin- guishing data structures that are easy to hoist manually from those that are not.

First, s is a more accurate measurement of the size of a data structure than t, as the

data structure can still be large if it contains fewer objects but each object has more

fields. Second, we found that for a large number of data structures in our benchmarks,

their m/t is 1, which means they are all DDS. It would be quite labor-intensive to

inspect all of them and check if they are hoistable. To help the programmer quickly

identify truly optimizable data structures, we focus on DDS (whose m/t is 1) and

compute dependence-based hoistability (DH) only for these data structures. Finally,

only DDS are ranked (based on their DH measurements) and presented to the user

for inspection.

Incorporating dynamic information For performance tuning, dynamic frequency information is usually necessary to help programmers focus on “hot” entities

(e.g., calling contexts, data structures, etc.). For example, it could be more beneficial to hoist a small, but frequently-allocated data structure than a big, occasionally-occurring data structure. Frequency information can be easily incorporated into the two hoistability metrics. For example, for DH, its definition can be simply extended to DDH = (f ∗ s)^(n/s), where f represents the allocation frequency of the root object of the data structure.
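
The sketch below spells out the three formulas exactly as defined above; the class and parameter names are illustrative only and are not taken from our tool.

    // Sketch only: the hoistability metrics DH = s^(n/s), SH = t^(m/t),
    // and the frequency-weighted DDH = (f*s)^(n/s).
    final class Hoistability {

        // s: total number of fields; n: number of loop-invariant (non-LID) fields
        static double dependenceBased(int s, int n) {
            if (s == 0) return 0.0;
            return Math.pow(s, (double) n / s);
        }

        // t: total number of objects; m: number of objects whose ICA is 1
        static double structureBased(int t, int m) {
            if (t == 0) return 0.0;
            return Math.pow(t, (double) m / t);
        }

        // f: allocation frequency of the data structure's root object
        static double dependenceBasedDynamic(long f, int s, int n) {
            if (s == 0) return 0.0;
            return Math.pow((double) f * s, (double) n / s);
        }

        public static void main(String[] args) {
            // A 100-field structure with one iteration-dependent field scores much
            // higher than one in which half of the fields are iteration-dependent.
            System.out.println(dependenceBased(100, 99));  // ~95.6
            System.out.println(dependenceBased(100, 50));  // 10.0 (square root of 100)
        }
    }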

8.4 Implementation

The static analysis implementation has five logical components. The first component is a data structure analysis. In order to discover the data structure rooted at an object, this analysis employs a variation of the context-free-language (CFL)-reachability formulation of points-to analysis [133], which models both context sensitivity via method entries and exits, and heap accesses via object field reads and writes.

This analysis was proposed in Chapter 7 [159] to understand container structures and semantics. While this part (data structure analysis) is the adaptation of an existing technique, all other components are completely new and are novel contributions.

For each loop in an actual Java program, data structure root objects are first located. To find such root objects, we first consider objects that are created directly in the loop body. Objects that are created in a method (e.g., used as a factory)

invoked by a call site in the loop and that can be returned to the loop-containing

method are also considered.

Next, for each root object o, our analysis identifies the set of reachable objects and their points-to relationships. In particular, the analysis looks for chains of stores of the form a₀.f₀ = new X; a₁.f₁ = b₀; a₂.f₂ = b₁; …; aₙ.fₙ = bₙ₋₁; bₙ = o, such that (1) the two variables in each pair (aᵢ, bᵢ) for 0 ≤ i ≤ n are aliases and (2) the

CFL path for this chain contains balanced method entries and exits. If such a chain can be found, the object represented by new X is in the data structure rooted at o, because it could potentially be reached from o at run time through a sequence of

field dereferences fₙ.fₙ₋₁…f₁.f₀. Using this formulation, our hoisting analysis can

be performed on demand: it can work directly on each loop object, and performs only

the work necessary to detect its data structure and to check its hoistability.

The second part of the implementation is a form of data flow analysis that analyzes each loop to perform type inference. An abstract heap (points-to and data

alyzes each loop to perform type inference. An abstract heap (points-to and data

dependence) effect is the join of data flow facts on all valid paths from the loop entry

to the assignment that connects the two entities in the effect.

The third part is a form of data dependence analysis that detects loop-invariant

object fields. This analysis traverses backward the def-use chains from each store

that writes to a field of an object in a loop data structure, and checks whether the

two conditions in Definition 8.2.7 hold for the field. Similarly to other existing static

slicing algorithms, a key challenge to computing precise data dependences lies in

the handling of data flow via heap locations. Our analysis initially works on top

of a context-insensitive points-to analysis: for each load a = b.f, we find the set of

all assignments c.f = d such that c and d can alias context-insensitively. We next

                 (a) Analysis statistics                       (b) Dependence-based hoistability
    Benchmark    #M(K)  #L     #Obj   Time (s)  #DDS  #HDS     #SF    #SIF   #DF    #DIF
    jack         12.5   88     13     1224      5     3        797    62     797    62
    javac        13.4   270    89     1745      33    8        45     31     42     28
    soot-c       10.4   475    17     3043      7     3        56     36     56     36
    sablecc-j    21.4   202    228    7910      82    53       429    194    221    61
    jess         12.8   119    32     304       7     1        1135   51     1135   51
    muffin       21.4   318    96     10766     47    8        1503   198    -      -
    jflex        20.2   209    17     2325      9     0        55     17     55     17
    jlex         8.2    108    9      5549      4     0        36     6      36     6
    java-cup     8.4    99     19     474       4     0        107    57     107    57
    antlr        12.9   154    3      77        2     1        3      0      3      0
    bloat        10.8   562    141    3476      36    10       1536   136    674    46
    chart        17.4   482    102    12746     6     0        84     19     84     19
    xalan        12.8   17     8      63        6     0        78     24     78     24
    hsqldb       12.5   33     10     178       5     0        75     19     75     19
    luindex      10.7   14     5      163       5     0        65     15     65     15
    ps           13.5   117    21     1784      21    11       36     20     34     20
    pmd          15.3   594    30     168       15    2        127    68     127    68
    jython       27.5   614    48     423       24    3        77     25     190    26
    eclipse      41.0   3322   93     21557     80    52       1182   180    -      -

Table 8.1: Shown in the first seven columns of the table is the general statistics of the benchmarks and the analysis: the benchmark names, the numbers of methods (in thousands) in Soot’s Spark context-insensitive call graph (M ), the numbers of loops inspected (L), the numbers of loop objects considered (Obj ), the running times of the analysis (Time), the total numbers of disjoint data structures (DDS), and the total numbers of hoistable logical data structures (HDS). Columns SF and SIF in section “Dependence-based hoistability” show the total numbers of fields (SF ) and the numbers of loop-invariant fields (SIF ), averaged among the top 10 DDS that we chose to inspect. These data structures are ranked based on the dependence-based hoistability measurement (DH). Columns DF and DIF report the same measurements as SF and SIF, except that the inspected data structures are ranked using DDH that incorporates dynamic frequency information.

perform refinement on this candidate set using the CFL-reachability formulation of pointer-aliasing to find whether b and c can indeed alias, and if they can, the calling context for c.f = d under which the value flow occurs. This calling context is used to guide the future graph traversal to follow the appropriate entry/exit edges.

8.5 Evaluation

We implemented the analysis in the Soot analysis framework [132,147] and evaluated it on the 19 Java programs shown in Table 8.1. All experiments were conducted on a dual-core machine with an Intel Xeon 2.83GHz processor, running Linux

2.6.18. The Java library used for the analysis was Sun JDK 1.5.0_06, and 4GB of max

heap space was specified to run the analysis. Programs in the table were from the

SpecJVM98, Ashes, and DaCapo benchmark suites. We were able to run the analysis

on all the programs in these sets, but we included only applications that have objects

created in loops.

8.5.1 Static Analysis and Hoisting

Table 8.1 reports statistics of the benchmarks, the analysis, and the dependence-based hoistability measurements. For the GUI application muffin, we could not find an appropriate test case to perform profiling, and thus “-” is used to fill out the columns that report dynamic-information-based measurements. We could not perform profiling for eclipse either, as different plugins use their own class loaders, making it difficult for them to access our profiling library without modifying their customized class loaders. The cost of the analysis is generally proportional to the number of loop objects processed because of its demand-driven nature. The analysis running time can also be influenced by the size of the code base, as the analysis is context-sensitive and the number of contexts often grows significantly when the size of the program increases. It is clear that the analysis can scale to large applications, including the eclipse framework, which has millions of lines of code in its implementation.

Across all applications we observe large numbers of disjoint data structures (DDS)

and hoistable logical data structures (HDS). This is a strong indication of the exis-

tence of optimization opportunities that can be exploited by human experts, which

motivates our proposal of computing hoistability measurements to help manual tun-

ing. To compare the effectiveness of DH (dependence-based hoistability measurement proposed in Section 8.3) and DDH (the profile-based version of it) in finding opti-

mization opportunities, we inspected the top 10 data structures (or the total number

of data structures if it is smaller than 10) that appear in both reports. The total

numbers of fields divided by the numbers of loop-invariant fields for these inspected

data structures are shown in columns SF/SIF and DF/DIF, for these two kinds of re-

ports. Profiling is implemented by Soot-based bytecode instrumentation that records

the execution frequency for each loop. The goal of this comparison is to understand

how much impact the dynamic information can have on the interpretation of reports.

Specifically, can the use of run-time frequency f in DDH lower the ranks of data

structures that are highly-likely to be optimized (i.e., have larger n/s but smaller f)?

The ratio between SIF (or DIF ) and SF (or DF ) indicates, to a large degree, the

difficulty of hoisting data structures manually by inspecting the analysis reports. The

larger it is, the easier it may be for a performance expert to modify the data models

to hoist them. It is clear that the ratios of DIF /DF are generally close to those of

SIF /SF. In some cases, the former are even greater than the latter. This observation

indicates that DDH can expose not only hot spots (i.e., frequently-allocated objects),

but also optimizable data structures.

8.5.2 Case Studies

We have carefully inspected the generated analysis reports (with dynamic information incorporated) for these 19 benchmarks and found optimizable data structures in almost every one of them. This subsection presents five representative case studies, in which either large performance improvement was seen, or interesting bloat patterns were found. These applications are ps, xalan, bloat, soot-c, and sablecc-j, all with large code bases and a great number of loop objects. The performance problems we show in this chapter are new and have never been reported by any previous work.

Through these studies, we found that the analysis is quite useful in helping programmers find mostly-loop-invariant data structures and the execution inefficiencies due to these data structures. It took us about three days to find and fix problems in these five applications, among which we had studied only bloat before. Note that most of this time was spent on developing fixes rather than finding data structures that can be easily hoisted: for each benchmark, we looked at only the top 10 data structures in the reports (due to the limited time we had), and found that most of them were indeed hoistable. Even larger optimization opportunities could have been possible if we had inspected more warnings generated by the tool.

It is important to note that it would not be possible to find such problems by using any existing profiling tool: to detect loop-invariant data structures, a purely dynamic analysis has to perform whole-program value profiling, a task that is prohibitively expensive for large-scale, long-running Java programs. This is the reason why we have not compared our results with dynamic analysis reports.

ps ps is a PostScript interpreter. The top data structure in the list ranked by

DDH is rooted at a NameObject object created in a loop in method execute of class

makeDictOperatorDB. The loop is used to traverse a stack: for each stack element

(i.e., a NameObject object containing a key and a value), the goal is to remove the ‘/’

character in the key of the element. The way the loop is implemented is that it creates

another (backup) stack, pops the original stack, creates a new NameObject object with most values copied directly from the original object, and pushes this new object onto the backup stack. Eventually all the new objects in the backup stack are pushed back onto the original stack. The creation of such NameObject objects directs us to think about this implementation, and especially about the way the stack is operated. In fact, this process can be done entirely in place so that these objects do not even need to be created. A further inspection of the code found an even more interesting problem. The programmer seems to ignore the fact that class Stack is a subclass of List in the JDK and

uses push and pop to implement everything related to stack. For example, this same

pop-push pattern is used even for element retrieval. For almost each occurrence of

this problematic stack usage pattern, there is a corresponding (mostly loop-invariant)

data structure in our report. We removed only two occurrences of such a pattern (in

makeDictOperatorDB.execute and DictStack.getValueOf), and this resulted in a

reduction of 82.1% in running time (from 28.3s to 5.3s) and 70.8% reduction in the

total number of objects created (from 10,170,142 to 2,969,354).
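
The following sketch is a simplified, hypothetical reconstruction of the pattern described above (not the actual ps source code); Entry is a stand-in for NameObject. It contrasts rebuilding the stack through a backup stack and fresh objects with stripping the '/' character in place.

    // Sketch only: the rebuild-via-backup-stack pattern versus an in-place fix.
    import java.util.Stack;

    final class Entry {
        String key;
        Object value;
        Entry(String key, Object value) { this.key = key; this.value = value; }
    }

    final class StripSlash {
        // Bloated variant: a backup stack and a new Entry per element
        static void rebuild(Stack<Entry> stack) {
            Stack<Entry> backup = new Stack<>();
            while (!stack.isEmpty()) {
                Entry e = stack.pop();
                backup.push(new Entry(e.key.replace("/", ""), e.value));
            }
            while (!backup.isEmpty()) {
                stack.push(backup.pop());
            }
        }

        // In-place variant: no temporary stack and no temporary objects
        static void inPlace(Stack<Entry> stack) {
            for (Entry e : stack) {          // Stack is a List subclass, so it is iterable
                e.key = e.key.replace("/", "");
            }
        }
    }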

xalan xalan is an XSLT processor for XML documents. The problem we found

is in a test harness (XalanHarness) used by DaCapo to run the benchmark. This harness class creates multiple threads to transform input XML files. In method run,

there is a while(true) loop that assigns jobs to different threads. Our report shows that a data structure rooted at a Transformer object created in the loop is a HDS

(shown in Figure 8.1(b)). The same Transformer object is created every time the

loop iterates, and then used by different threads for transforming the input files. It

is highly unlikely that it would be possible to automatically hoist this data structure

because this object is created by using a transformer factory object, which is obtained

from a reflective call. After hoisting this allocation site, we observed a 10% reduction

in running time and 1% reduction in the number of objects created. This problem

has also been confirmed by the DaCapo maintainers [41] and will be handled in the

next release of the DaCapo benchmark set.
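
The sketch below is an illustrative, simplified version of the hoisting applied; it is hypothetical single-threaded code, not the actual XalanHarness class (which obtains the factory reflectively and dispatches jobs to worker threads). The loop-invariant Transformer is created once, before the loop, instead of once per iteration.

    // Sketch only: hoisting a loop-invariant Transformer out of the dispatch loop.
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerException;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    final class TransformLoop {
        static void run(StreamSource[] jobs, StreamResult[] outs) throws TransformerException {
            // Before: Transformer t = factory.newTransformer() inside the loop,
            // re-created for every assigned job.

            // After: the Transformer is hoisted out of the loop.
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer t = factory.newTransformer();
            for (int i = 0; i < jobs.length; i++) {
                t.transform(jobs[i], outs[i]);
            }
        }
    }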

bloat bloat is a program analysis tool designed for Java bytecode-level opti-

mizations. Many loop data structures reported by our analysis are objects of inner

classes that are declared exactly at the point where their objects are needed. In

bloat, most of these objects are created to implement visitor patterns. Hence, the

objects are used only for method dispatch and do not contain any data related to

the program context under which they are created. These objects commonly exist in

loops, and in many cases we found them even located in loops with many layers of

nesting. This problem exemplifies a typical object-oriented philosophy: the program-

mer should focus on patterns and abstractions when coding, and leave the mess to

the runtime system. By simply hoisting the reported objects (and the declarations

of their classes) out of the loops, we saw 11.1% running time reduction and 18.2%

reduction in the number of created objects.
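
A schematic example of this pattern is shown below; the code is hypothetical and not taken from bloat. A stateless visitor object that exists only for method dispatch is allocated inside the loop, and can instead be allocated once outside it.

    // Sketch only: hoisting a stateless, dispatch-only visitor out of a loop.
    import java.util.List;

    interface Node { void accept(Visitor v); }
    interface Visitor { void visit(Node n); }

    final class VisitAll {
        // Before: a fresh stateless visitor per loop iteration
        static void perIteration(List<Node> nodes) {
            for (Node n : nodes) {
                Visitor v = new Visitor() {               // allocated every iteration
                    public void visit(Node node) { /* dispatch-only logic */ }
                };
                n.accept(v);
            }
        }

        // After: the stateless visitor is hoisted out of the loop
        static void hoisted(List<Node> nodes) {
            Visitor v = new Visitor() {
                public void visit(Node node) { /* dispatch-only logic */ }
            };
            for (Node n : nodes) {
                n.accept(v);
            }
        }
    }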

soot-c soot-c is a part of the Soot analysis framework [132,147] benchmarked

in the Ashes benchmark suite [10]. One top data structure reported by our anal-

ysis is rooted at a StmtValueBoxPair object created in a loop (in a constructor of jimple.SimpleLocalUse) that builds def-use relationships as follows:

    while (defIt.hasNext()) {
        List useList = (List) stmtToUses.get(defIt.next());
        useList.add(new StmtValueBoxPair(s, useBox));
    }

For each statement s that uses a variable, the program finds a set of statements that define the variable, creates a StmtValueBoxPair object, and adds it to the list.

These StmtValueBoxPair objects, while containing the same values, are created for safety purposes: if one such pair is changed later, other pairs should not be affected.

Our tool cannot hoist this data structure, because it is not confined (i.e., it escapes to fields of objects created outside the loop). However, after inspecting the code, we found that the use list associated with each statement is never changed after the jimple statement chain is constructed for a program. Even if a client analysis could change it by inserting statements, Soot always creates a new object to represent this (newly-established) def-use relationship rather than change the original object. This problem shows a typical example of an over-protective implementation, where several different mechanisms are used simultaneously to enforce the same property while one (or a few) of them may be sufficient to do so. By sharing one StmtValueBoxPair object among multiple def statements, we achieved a 2.5% running time reduction and a 3.5% reduction in the number of created objects. In this example, we can see once again the advantage of tool-assisted manual tuning: this data structure can never be eligible for hoisting from the perspective of any fully-automated analysis. However, human insight made hoisting possible, because it is unnecessary to keep these distinct instances simultaneously.
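
The following sketch illustrates the manual fix described above using simplified stand-in types rather than Soot's actual classes (the Object[] pair is a hypothetical stand-in for StmtValueBoxPair): a single pair object is created per use statement and shared across all of its defining statements.

    // Sketch only: one shared pair instead of one pair per defining statement.
    import java.util.Iterator;
    import java.util.List;
    import java.util.Map;

    final class UseCollector {
        // Before: one new pair per def, all holding the same values
        static void perDef(Iterator<Object> defIt, Map<Object, List<Object>> stmtToUses,
                           Object s, Object useBox) {
            while (defIt.hasNext()) {
                stmtToUses.get(defIt.next()).add(new Object[] { s, useBox });
            }
        }

        // After: a single shared pair, added to every def's use list
        static void shared(Iterator<Object> defIt, Map<Object, List<Object>> stmtToUses,
                           Object s, Object useBox) {
            Object[] pair = new Object[] { s, useBox };   // created once, outside the loop
            while (defIt.hasNext()) {
                stmtToUses.get(defIt.next()).add(pair);
            }
        }
    }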

sablecc-j sablecc-j is a version of the Sable Compiler Compiler that produces the sablecc files (parser, lexer, etc.) for a preliminary version of the jimple grammar.

Similarly to the problems found for bloat, a large number of HDS reported are related to inner classes: two such classes are declared in sablecc.GenParser to perform depth-first traversal of syntax trees, and one such class is declared in sablecc.DFA to represent an interval in a char set. Creating multiple objects for each such class is completely unnecessary. Hoisting these class declarations and their objects resulted in 6.7% running time reduction and 2.5% reduction in the number of objects created.

Evaluation summary While all loop-invariant data should be hoisted out of loops, the tight data coupling in an object-oriented program makes it impossible to hoist all of it (either automatically or manually). To help programmers focus on data structures that are (1) easy to hoist and (2) worth hoisting, we propose to compute hoistability measurements. Through these case studies, we demonstrate that our measurements are effective in pinpointing such data structures. In fact, by inspecting reported data structures, we found many performance problems and achieved significant performance improvement. Some invariant data structures that we have managed to hoist are due to (deeper) design issues such as inefficient implementations of design patterns (e.g., visitors in bloat) or over-protective implementation strategies (e.g., soot-c). Our measurements were also helpful in revealing these issues by exposing their symptoms (i.e., mostly-invariant data structures).

8.6 Summary and Interpretation

This chapter presents a static technique that detects loop-invariant data structures. We focus on data models and look for logical data structures that can be

hoisted. Instead of transforming the program and hoisting data structures automat-

ically, we propose to measure the hoistability of a data structure: the dependence-

based hoistability metric measures the amount of loop-invariant data in a data struc-

ture, and the structure-based hoistability metric measures the size of its disjoint part.

We have implemented the analyses and presented an evaluation of the technique on

a set of 19 Java benchmarks. Our experimental results demonstrate that the analysis

can scale to large applications and the measurements can be useful in finding large

optimization opportunities.

Motivated by the same insight, the two static analyses presented in this chapter and the last chapter target two specific bloat patterns that we frequently observed in the execution of large-scale real-world applications. The success of these techniques shows that by identifying concrete patterns of inefficiencies, a static approach can precisely detect potential performance problems before these problems are observed during run-time execution. With the help of lightweight dynamic information, it is even possible for the static analysis to detect performance bottlenecks that are particularly hard for a dynamic analysis to detect (e.g., to find loop-invariant data structures, a dynamic analysis needs to perform expensive value profiling). It will be an interesting future topic to investigate more effective combinations of static and dynamic information to help programmers understand significant performance issues.

CHAPTER 9: Related Work

9.1 Dynamic Analysis

This section outlines work related to the dynamic analyses presented in this dissertation. The relevant existing work falls into the following categories: bloat detection, control- and data-based profiling, dynamic slicing, dynamic information flow tracking, dynamic memory leak detection, and heap assertions.

9.1.1 Bloat Detection

Dufour et al. propose dynamic metrics for Java [48], which provide insights by

quantifying runtime bloat. Many memory profiling tools have been developed to take

heap snapshots for understanding memory usage (e.g., [69]) and to identify objects of

suspicious types that consume a large amount of memory (e.g., [51, 109]). However,

none of these tools attempt to understand the underlying causes of memory bloat,

and thus cannot help programmers pinpoint the problematic areas of the application.

Mitchell et al. [97] structure behavior according to the flow of information, though using a manual technique. Their aim is to allow programmers to place judgments on whether certain classes of computations are excessive. Our copy profiling work is in this same spirit, and automates an important component of this approach. Their follow-up work [96] introduces a way to find data structures that consume excessive amounts of memory. Work by Dufour et al. finds excessive use of temporary data

structures [49, 50] and summarizes the shape of these structures. In contrast to the purely dynamic approximation introduced in our work, they employ a blended escape analysis, which applies static analysis to a region of dynamically collected calling structure with an observed performance problem. By approximating object effective lifetimes, the analysis has been shown to be useful in classifying the usage of newly created objects in the problematic program region. JOLT [124] is a VM-based tool that uses a new metric to quantify object churn and identify regions that make heavy use of temporary objects, in order to guide aggressive method inlining.

Our dynamic approaches differ from all existing bloat detection work in two dimensions. First, our work addresses the challenge of automatically detecting bloated computations that fall outside the purview of conventional JIT optimization strategies.

In general, existing bloat detection work can be classified into two major categories: manual tuning methods (i.e., mostly based on measurements of bloat) [49,50,96,97], and fully automated performance optimization techniques such as the entire field of

JIT technology [8] and the research from [124]. The work described in this dissertation sits in between: we provide sophisticated analyses to support manual tuning, guiding programmers to the program regions where bloat is most likely to exist, and then allowing human experts to perform the code modification and refactoring. By doing so, we hope to help the programmers quickly get through the hardest part of the tuning process—finding the likely bloated regions—and yet use their (human) insights to perform application-specific optimizations.

Second, we use different (non-conventional) symptom definitions to identify the bloated regions. For example, our copy profiling work profiles data flow based on the observation that bloat often manifests itself in the form of large volumes of

copies. In contrast, the JIT performs optimizations based on hot spots, which are

decided completely by profiling control flows. As shown in Chapter 1, performance

bottlenecks do not necessarily exist in frequently-executed regions, and in many cases,

they are more related to data flow, rather than control flow.

A significant difference between the cost-benefit analysis and existing bloat detection techniques is that an existing approach can usually find only one type of problem effectively. For instance, blended escape analysis [49, 50] is effective at detecting temporary objects while a container profiling technique [122,158] works only for container bloat. Our cost-benefit analysis detects operations that have high costs and low benefits. Performing such operations is the essence of bloat and is a common characteristic of a variety of performance problems, which, however, may show different symptoms on the surface. Hence, the cost-benefit analysis is potentially capable of identifying many different kinds of bloat, and thus can be more useful in practice to help a programmer perform the tuning task.

9.1.2 Control- and Data-Based Profiling

Lossy compression of profiles has been proposed for space efficiency. These techniques include dynamic dependence profiles [3], control flow profiles [14], and value profiles [30]. While lossy compression can provide sufficient precision for many applications, evidence shows that it is inadequate for many others. Lossless compression techniques have thus been developed to reduce space requirements and yet preserve the dynamically collected data. Research from [79,170] studies the compressed representations of control flow traces. Value predictors [28] are proposed to compress

value profiles, which can be used to perform various kinds of tasks such as code special-

ization [30], data compression [171], value encoding [163] and value speculation [85].

Research from [35] proposes a technique to compress an address profile, which is used

to help prefetch data [67] and to find cache conscious data layouts [119]. Zhang and

Gupta propose whole execution traces [167] that include complete data information

of an execution, to enable the mining of behavior that requires understanding of

relationships among various profiles.

Ammons et al. [7] develop a dynamic analysis tool to explore calling context trees

in order to find performance bottlenecks. Srinivas et al. [136] use a dynamic analysis technique that identifies important program components, also by inspecting calling context trees. Chameleon [122] is a dynamic analysis tool that profiles container behaviors to provide advice as to the choices of appropriate containers. The work in [110] proposes object ownership profiling to detect memory leaks in Java programs.

When profiling to find performance problems, existing techniques typically concentrate on control flow, rather than data flow, from path profiling [14,19,79,148] to feedback-directed profiling [8], all to identify heavily-executed paths for further optimization. The copy profiling technique described in Chapter 3 profiles data flow, and uses the copy profiles to determine the problematic program regions. The profiling technique in the cost-benefit analysis is similar in spirit to the dependence profiling in [3]. While both fall in the general category of lossy profile compression, our technique proposes to introduce analysis semantics into profiling. Hence, our approach is lossless in terms of the target analysis—as long as a target analysis can be formulated in our framework, the compressed profile provides all the information required by that

analysis. The summarization techniques described in [3] are analysis-neutral and it

is unclear what kinds of analyses can take advantage of them.

9.1.3 Dynamic Slicing

Since first being proposed by Korel and Laski [76], dynamic slicing has inspired a large body of work on efficiently computing slices and on applications to a variety of software engineering tasks. A general description of slicing technology and challenges can be found in Tip’s survey [146] and Krinke’s thesis [77]. The work by Zhang et al. [165–169] has considerably improved the state of the art in dynamic slicing. This work includes, for example, a set of cost-effective dynamic slicing algorithms [166,168], a slice-pruning analysis that computes confidence values for instructions to select those that are most related to errors [165], a technique that performs online compres- sion of the execution trace [167], and an event-based approach that reduces the cost by focusing on events instead of individual instruction instances [169]. We refer the reader to Zhang’s thesis [164] for a detailed description of these techniques. Sridharan et al. proposes thin slicing [134], a technique that improves the relevance of the slice

by focusing on the statements that compute and copy a value to the seed. Although

this technique is originally proposed for static analysis, it fits naturally in the work

on dynamic cost-benefit analyses.

Our work on abstract dynamic slicing is fundamentally different from these existing techniques in the following ways. Orthogonal to the existing profile summarization techniques such as [3, 14, 30, 167], abstract slicing achieves efficiency by introducing analysis semantics to profiling, establishing a foundation for solving a range of dynamic data flow problems. If an analysis can be formulated in our framework, the

profiled information is sufficiently precise for this particular analysis. Hence, although our approach falls into the general category of lossy compression, it is lossless for the specific analysis formulated. The work from [169] is more related to our work in that the proposed event-based slicing approach is a special case of abstract slicing with the domain D containing a set of pre-defined events. In addition, existing work on dynamic slicing targets its use for automated program debugging, whereas the goal of our work is to understand performance and to find bottlenecks.

9.1.4 Dynamic Information Flow Analysis

Dynamic taint analysis [38, 59, 103, 108, 162] tracks input data from untrusted channels to detect potential security attacks. Debugging, testing, and program understanding tools track dynamic data flow for other specialized purposes (e.g., [87]).

The work in [24] tracks the origins of undefined values to assist debugging. Research from [88] proposes to measure the strength of information flows and conducts an empirical study to better understand dynamic information flow analysis. Work from [31,32,101] describes approaches to enforcing information flow analysis in Java virtual machines.

Our dynamic analyses combine information flow tracking and profiling to efficiently form execution representations (e.g., graph Gcost) that are necessary for the client analyses. Because information flow analysis is expensive in general, approaches such as [108] have been developed to reduce its run-time cost. These techniques can also be employed in the future to make our techniques more scalable.

9.1.5 Dynamic Memory Leak Detection

Dynamic analysis [20,43,44,51,61,62,71,80,109] has typically been the “weapon of choice” for detecting memory leaks in real-world Java software. However, as described in Chapter 5, existing techniques have a number of deficiencies. The work in [43, 44, 51, 109] enables visualization of objects of different types on the heap, but does not provide the ability to directly identify the cause of the memory leak. Existing techniques use growing types [71,95] (i.e., types whose number of instances continues to grow) or object staleness [20] to identify suspicious data structures that may contribute to a memory leak. However, in general, a memory leak caused by redundant references is due to a complex interplay of memory growth, staleness, and possibly other factors. By considering a single metric which combines both factors, our technique could potentially improve the precision of leak identification. In addition, all existing dynamic-analysis-based leak detection approaches start by considering the leak symptoms (e.g., growing types or stale objects), and then attempt to trace back to the root cause of the leak. As discussed in the description of the JDK bugs from Chapter 5, the complexity of such bottom-up tracking makes it hard to generate precise analysis reports, and ultimately puts a significant burden on the programmer.

In contrast, the approach described in Chapter 5 is designed from a container-centric point of view—it automatically tracks the suspicious behavior in a top-down manner, by monitoring (1) the object graph reachable from a container, and (2) the container-level operations. Our second approach, described in Chapter 6, solves these problems by allowing programmers to explicitly express their interests (i.e., properties related to high-level semantics), based on the insight that developers' knowledge is essential for a leak detector to produce highly relevant reports. These higher levels of abstraction (e.g., container-centric view, transactions, etc.), compared to traditional low-level tracking of arbitrary objects, simplify the difficult task of identifying the sources of memory leaks.
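As an illustration of the class of problems targeted by the container-centric analysis, the hedged Java sketch below (hypothetical names, not code from the studied applications) shows a common container-induced leak: entries are repeatedly added to a long-lived map but never removed or read again, so they remain reachable even though they will never be used.

import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a container-induced memory leak: a long-lived cache that
// objects are added to on every request but never removed from or read again.
// All entries stay reachable from the static map, so the GC cannot reclaim them.
public class SessionRegistry {
    private static final Map<String, byte[]> CACHE = new HashMap<>();

    static void handleRequest(String sessionId) {
        // ADD: performed on every request
        CACHE.put(sessionId, new byte[1024]);
        // The entry is never retrieved (no GET) and never invalidated (no REMOVE),
        // so the map keeps growing and its entries become increasingly stale.
    }

    public static void main(String[] args) {
        for (int i = 0; i < 100_000; i++) {
            handleRequest("session-" + i);
        }
    }
}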

9.1.6 Heap Assertions

Closest to our LeakChaser work (described in Chapter 6) are heap property assertion frameworks such as GC Assertions [1, 112] and QVM [9, 149]. For example, one can explicitly specify at a certain program point that an object should be dead soon (assertDead), or that an object must be owned by another object in the object graph (assertOwns). While such assertions can be quite useful in helping diagnosis, they are limited in the following three important aspects related to leak detection. First, reachability information is used to approximate the liveness of objects, which may result in false positives. For example, assertOwns(a, b) asserts a reachability relationship between objects a and b, and it fails when b can be reached along a path that does not contain a in a certain GC run. However, it is possible that object b becomes unreachable from object a in one GC, while later b is owned by a again. Second, because a GC assertion predicts a future heap state (i.e., the state at the closest GC run) and the global reachability information evolves all the time, whether or not this assertion will fail depends significantly on when and where the next GC occurs, which may in turn be affected by many factors, such as the initial and maximum heap sizes, and specific GC implementation strategies. Third, these approaches are intended for programmers who have sufficiently deep program knowledge and insights. In real-world software development, only a handful of programmers have such knowledge, especially when a performance problem occurs in code that they did not write themselves.
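The following hedged Java sketch illustrates the first limitation. The GCAssertions class below is a local stand-in written for this example; the real GC Assertions API and its checking mechanism (reachability examined at the next GC) may differ in detail.

// Hypothetical sketch of how assertOwns(a, b) can produce a false positive when
// ownership is checked via reachability at the next GC.
final class GCAssertions {
    // Stand-in: records the intent; the real framework verifies it at GC time.
    static void assertOwns(Object owner, Object owned) { /* checked at GC time */ }
}

class Cache { Entry entry; }
class Entry { byte[] data = new byte[1024]; }

public class OwnershipExample {
    static Entry parked;  // temporarily holds the entry outside the cache

    public static void main(String[] args) {
        Cache a = new Cache();
        Entry b = new Entry();
        a.entry = b;
        GCAssertions.assertOwns(a, b);  // intent: a owns b for the rest of the run

        parked = b;        // b is now reachable through a path that bypasses a ...
        a.entry = null;    // ... and not reachable from a at all
        System.gc();       // a GC triggered here would see the assertion violated

        a.entry = parked;  // ownership is restored: b is owned by a again
        parked = null;
    }
}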

Our work solves the first and second problems by allowing programmers to specify object lifetime relationships instead of reachability properties. In order to tackle the third problem, we use a combination of assertion checking and transaction property inference to allow programmers with little application-specific knowledge to quickly identify the cause of the problem.

The GC Assertions [1] framework includes a block-structured assert-alldead assertion, which asserts that all objects allocated in the block must be dead by the end of the block. While it is related to the transaction abstractions proposed in Chapter 6, there are two important distinctions between them. First, assert-alldead does not allow objects in the specified structure to escape the structure, while our approach allows checking and inferring shared objects, providing more flexibility for diagnosing problems. Second, our transaction abstraction separates temporal and spatial scopes of the structure, while these scopes are combined in this earlier work.

Merlin [65] is an efficient algorithm that can provide precise time-of-death statistics for heap objects by computing when objects die using collected timestamps. LeakChaser could potentially exploit this technique in the future to capture assertion failures between GCs, as we currently report an assertion failure only when it is actually witnessed during a GC run.

9.2 Static Analysis

This section outlines static analysis work related to the techniques presented in this dissertation. The related work falls into four major categories: static memory leak detection, CFL-reachability-based analyses, object reachability analysis, and compiler optimization techniques that transform programs for performance improvement.

9.2.1 Static Memory Leak Detection

Static analysis can find memory errors such as double frees and missing frees for programs written in non-garbage-collected languages. For example, [34] reduces the memory leak analysis to a reachability problem on the program's guarded value flow graph, and detects leaks by identifying value flows from the source (malloc) to the sink (free). Saturn [154], taking another perspective, reduces the problem of memory leak detection to a Boolean satisfiability problem, and uses a SAT solver to identify potential bugs. Dor et al. [47] propose a shape analysis based on 3-valued logic to prove the absence of memory leaks in several list manipulation functions. Hackett and Rugina [58] use a shape analysis that tracks single heap cells to identify memory leaks. Orlovich and Rugina [106] propose an approach that starts by assuming the presence of errors, and performs a backward dataflow analysis to disprove their feasibility. Clouseau [63] is a leak detection tool that uses pointer ownership to describe variables responsible for freeing heap cells, and formulates the analysis as an ownership constraint system. Its follow-up work [64] proposes a type system to describe the object ownership for polymorphic containers and uses type inference to detect constraint violations. Although both this work and our technique focus on containers, the target of this previous effort is C and C++ programs whereas we are interested in a garbage-collected language. The analysis described in [64] does not help detect unnecessary references in a Java program.

More generally, all static approaches are limited by the lack of general, scalable, and precise reference/heap modeling. Despite a large body of work on such modeling, it remains an open problem for analysis of large real-world systems, with many challenges due to analysis scalability, modeling of multi-threaded behavior, dynamic class loading, reflection, etc. By profiling container behavior and finding memory leaks based on container-related heuristics, our dynamic memory leak detection work complements the existing static leak detection approaches, and can help a programmer quickly find the cause of a memory leak in large Java applications.

9.2.2 CFL-Reachability-Based Analyses

It is well known that method calls and returns can be treated as pairs of balanced parentheses using a context-free language [66,113,115–117]. Sridharan et al. propose a CFL-reachability formulation to precisely model heap accesses and calling contexts for computing a points-to solution for Java [133]. As an extension of this formulation, Zheng and Rugina [172] propose a CFL-reachability formulation for C/C++ alias analysis. Our previous work [160] proposes to use an offline must-not-alias analysis to reduce the amount of work that needs to be performed by the CFL-reachability computation, in order to speed up the actual points-to analysis. CFL reachability can also be used to implement polymorphic flow analysis [111] and shape analysis [114]. The work in [75,90] studies the connection between CFL-reachability and set constraints, shows the similarity between the two problems, and provides new implementation strategies for problems that can be formulated as CFL-reachability and set constraints. While based on the Sridharan-Bodik CFL-reachability formulation of points-to analysis, our static analysis for container operation detection takes into account container-specific structures, and thus can potentially serve a wide range of applications that need to reason about container behavior, including the generation of container summaries for more scalable static analyses, static container-based memory leak detection [63], and other techniques (e.g., thin slicing [134]) that need to track the flow of container elements and ignore all other objects constituting container structures.
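For concreteness, the balanced-parentheses condition can be written as a small context-free grammar; the sketch below is a generic formulation in the spirit of [113,115,116], not the exact grammar used by any one of the analyses cited above. Each call site i labels its call edge with an open parenthesis (_i and the matching return edge with )_i, and an interprocedural path is context-sensitively realizable only if its string of parenthesis labels is derivable from:

\begin{align*}
\mathit{realizable} &\rightarrow \mathit{matched}\ \mathit{realizable} \;\mid\; (_i\ \mathit{realizable} \;\mid\; \epsilon \\
\mathit{matched} &\rightarrow \mathit{matched}\ \mathit{matched} \;\mid\; (_i\ \mathit{matched}\ )_i \;\mid\; e \;\mid\; \epsilon
\end{align*}

Here e stands for an edge that is neither a call nor a return edge, matched strings correspond to call/return pairs that balance exactly, and the production (_i realizable permits calls that have not yet returned at the end of the path.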

9.2.3 Object Reachability Analysis

The work closest to the static analysis described in Chapter 7 is the disjoint reachability analysis proposed by Naik and Aiken [100] for eliminating false positives in their Java data race detector. This analysis is also flow-insensitive and takes into account loop information to distinguish instances created by the same allocation site. Unlike this analysis, which uses object sensitivity to compute reachability information, we employ a CFL-reachability formulation that is capable of filtering out information irrelevant to container objects, and thus our analysis may scale to larger programs.

Other reachability analysis algorithms range from flow-sensitive approximations of heap shape (e.g., [33, 121]) to decision procedures (e.g., [81, 89]). While our analysis is less precise in discovering the shape of data structures, it is more scalable and has been shown to be effective in detecting container problems.

There exists a large body of work on ownership types and their inference algorithms [5, 26, 37, 54, 63, 86]. Ownership types provide a way of specifying object encapsulation, and enable local reasoning about program correctness in object-oriented languages. Existing ownership type inference algorithms may not be able to provide precise container information, because containers are usually designed to have polymorphic object ownership—some objects of a container type may own their elements while others may not. In addition, containers are complicated by the fact that they can be nested: the elements in a container may themselves be containers. As a pointer to a container is passed around, ownership of the container may transfer from one pointer to another. While the work from [64] proposes an abstract object ownership model specifically for containers, it requires the tool users to specify correct interfaces for container implementation routines, which are then used in the ownership inference algorithm. Our approach is completely automated, and does not require any user annotations for detecting problems with the built-in Java collections.

9.2.4 Recency Abstraction

The loop iteration count abstraction used in Chapter 8 is inspired by the early work from [13, 100]. This abstraction was first proposed in [100] for computing their conditional must-not-alias properties. Our work extends this abstraction to allow objects guarded by conditionals to be selected for detecting hoistable data structures in real-world Java programs. Work from [13] presents recency abstraction, a technique that distinguishes the most-recently-allocated object (MRAO) from non-MRAOs for each allocation site in order to enable strong updates for a points-to analysis. While this is similar to our iteration abstraction, which distinguishes objects created in the current iteration from those created in previous iterations, our analysis uses such an abstraction for identifying loop-invariant data structures, instead of improving the precision of a points-to analysis.
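As a small illustration of the kind of hoistable structure targeted (hypothetical code, not drawn from the evaluated programs), the loop below rebuilds the same lookup table in every iteration; since the table does not depend on the loop variable, the whole data structure, including its initialization, could be hoisted above the loop.

import java.util.HashMap;
import java.util.Map;

// Hypothetical example of a loop-invariant data structure.
public class HoistingExample {
    static int score(String grade, Map<String, Integer> table) {
        Integer v = table.get(grade);
        return v == null ? 0 : v;
    }

    public static void main(String[] args) {
        String[] grades = {"A", "B", "C", "A", "B"};
        int total = 0;
        for (String g : grades) {
            // This map is identical in every iteration: its construction and
            // initialization could be hoisted above the loop.
            Map<String, Integer> table = new HashMap<>();
            table.put("A", 4);
            table.put("B", 3);
            table.put("C", 2);
            total += score(g, table);
        }
        System.out.println(total);
    }
}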

9.2.5 Compiler Optimizations for Performance Improvement

Pioneered by the commercial Smalltalk implementation, the Just-In-Time compiler [8, 11, 45] offers many sophisticated intraprocedural static analyses that can effectively improve the performance of an application by finding hotspots and performing optimizations at these hotspots. However, JITs can miss many optimization opportunities due to the lack of hotspots in large applications and the limitations of method inlining, which is necessary for the majority of optimization techniques. This dissertation advocates new approaches that develop automated program analyses to “simulate” (more powerful versions of) the JIT dataflow analyses that could not be done in a typical JIT compiler due to the natural limitations of run-time compilation. These analyses can help a programmer quickly find missed opportunities that are easy to understand and fix.

Object inlining [46,84] is a static technique that consists of finding sets of objects that can be efficiently fused into larger objects, and fusing them. While both object inlining and our hoisting analysis (Chapter 8) concern performance problems and need to find objects created in the same control flow region, we target a different class of performance problems. In addition, our analysis can assist a programmer in manual tuning, a task that is difficult for object inlining to perform.

In the literature of compiler optimization [4], loop optimization is an important class of techniques that improves performance by exploiting parallelism and data locality. A large body of work has been devoted to making the execution of loops faster; this set of techniques includes, for example, loop interchange, loop splitting, loop unrolling, loop fusion, and loop-invariant code motion.

In high-performance computing, loop optimizations play a key role in automated parallelization for exploiting potential parallelism. We refer the reader to the survey by Bacon et al. [12], Wolfe's book [152], and Kennedy and Allen's book [72] for a broader overview and more detailed description of these techniques.

As a more general technique, partial redundancy elimination (PRE) (e.g., [18, 27, 36, 74, 98]) performs common subexpression elimination and loop-invariant code motion at the same time. It is powerful in removing redundancy and has been used widely in optimizing compilers. As a step beyond these traditional loop optimization techniques, our work aims to find optimization opportunities by hoisting complex heap data structures that can be constructed by multiple instructions and across many method invocations, which traditional techniques cannot handle.
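The contrast can be illustrated with a small hedged Java sketch (hypothetical code): a PRE-style optimization or a JIT can hoist the loop-invariant scalar expression, but the loop-invariant array-based structure, built across a method boundary, is beyond what these traditional transformations handle and is the kind of opportunity our analysis reports to the programmer.

// Hypothetical contrast between scalar loop-invariant code motion and
// hoisting of a loop-invariant heap data structure.
public class PreVersusHoisting {
    static int[] buildTable(int base) {        // builds a small heap structure
        int[] t = new int[4];
        for (int i = 0; i < t.length; i++) t[i] = base * i;
        return t;
    }

    public static void main(String[] args) {
        int base = args.length + 2;
        long sum = 0;
        for (int i = 0; i < 1000; i++) {
            int scaled = base * 31;            // scalar loop-invariant expression:
                                               // PRE / code motion can hoist this
            int[] table = buildTable(base);    // loop-invariant heap structure built
                                               // across a method call: traditional PRE
                                               // does not hoist it, but the programmer
                                               // can move it above the loop
            sum += scaled + table[i % table.length];
        }
        System.out.println(sum);
    }
}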

CHAPTER 10: Conclusions

Software applications are now assembled from many abstractions, and programmers avoid low-level tuning of the implementation and composition of these abstractions, trusting that automated optimizations will take care of those details. As a result, questionable decisions are often made—for example, the use of an overly-general library to achieve a simple task, or the addition of yet another layer of delegation in the data model. The cost of one additional method call or one more allocated object seems insignificant. In reality, the effects of these decisions can accumulate, and the underlying compilers and runtime systems cannot eliminate these inefficiencies.

There are two key observations on which our work relies. First, runtime bloat is primarily a by-product of object-orientation, whose culture encourages a certain level of excess. Programmers are taught to follow standard principles (e.g., freely create objects, favor reuse and generality, etc.) and leave performance to compilers and runtime systems. Many inefficiencies in the execution are due to designs and development guided by such principles, and thus there is much high-level semantic information (expressing developers' intent) associated with these inefficiencies. It can be extremely difficult to develop a fully automatic optimization technique that can effectively remove these inefficiencies. We believe that understanding such semantic information and bringing it into tool design are necessary to develop useful performance analysis tools that can help programmers identify large optimization opportunities. Our second key observation is that code that contributes to runtime bloat can be identified by focusing not on the flow of control, but rather on the activities of data. Dynamic and static analyses that track the creation and usage of object-oriented data structures are at the heart of our work.

Based on these observations, this dissertation first presents a copy profiling technique that identifies program regions containing large volumes of data copying. However, different types of bloat may manifest themselves through different observable symptoms, including not only data copies, but also other signs such as temporary objects [50,124], highly-stale objects [20,105], and inappropriate collection choices [122, 158]. A common effect of these different performance problems is an imbalance between costs and benefits: the cost of forming a data structure, of computing and storing its values, is too high compared to the benefits gained over the uses of those values. The second analysis proposed in the dissertation profiles costs and benefits, and detects performance bottlenecks by locating operations that have unbalanced cost-benefit rates. The underlying analysis technique of abstract thin slicing serves as the foundation for the cost-benefit computation as well as for a number of other dynamic analyses that require backward traversals of the execution history. These dynamic cost-benefit analyses have been implemented in J9, IBM's commercial Java Virtual Machine, and have been shown to be effective in helping a programmer quickly find problematic code that needs to be further inspected and optimized.

Misuse of containers is an important source of runtime bloat. This dissertation presents novel algorithms targeting detection of container bloat. A special type of bloat that has severe impact on performance is a memory leak, caused by keeping references to objects that are no longer used. We propose a dynamic analysis that identifies container-induced memory leaks by profiling container behaviors. Concrete container functionality is abstracted into three basic operations: ADD, GET, and REMOVE. Using a novel leak confidence model that considers a combination of container staleness and container memory consumption, the tool can precisely pinpoint the containers that are not appropriately used (i.e., the causes of memory leaks). In this work, these operations are manually annotated. Once the interface of a container is modified, these annotations have to be modified accordingly. This may create a considerable amount of work for a tool user. The profiling would be easier if the process of container modeling could be (even partially) automated. A static analysis technique that can automate this process is proposed in Chapter 7. At the core of this analysis is a CFL-reachability formulation of container behavior. We also propose two options for employing these automatically-detected operations, one that instruments them and detects bloat by profiling their frequencies, and a second one that approximates their frequencies using the nesting of the loops where these operations are located. We show in Chapter 7 that (1) the static tool is useful in finding inefficiencies during coding, and (2) later, for performance tuning that requires precise identification of hot spots, large optimization opportunities can be found by statically analyzing hot containers identified through dynamic information.
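To show what the three-operation abstraction corresponds to at the code level, here is a hedged Java sketch (a hypothetical user-defined container; the mapping is shown as comments and is only illustrative of the manual annotation step described above, not the tool's actual annotation syntax). Concrete methods of the container class are mapped onto ADD, GET, and REMOVE.

import java.util.ArrayList;
import java.util.List;

// Hypothetical user-defined container whose concrete methods are mapped onto the
// abstract operations ADD, GET, and REMOVE; in the dynamic analysis this mapping
// is provided manually, and Chapter 7's static analysis infers it automatically.
public class OrderQueue {
    private final List<String> orders = new ArrayList<>();

    public void enqueue(String order) {   // ADD: places an element into the container
        orders.add(order);
    }

    public String peek(int i) {           // GET: retrieves an element without removing it
        return orders.get(i);
    }

    public String dequeue() {             // REMOVE: takes an element out of the container
        return orders.isEmpty() ? null : orders.remove(0);
    }
}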

Many memory leaks are caused not by the misuse of containers, but rather by cached references that the programmer forgot to invalidate. To help programmers quickly diagnose such general memory leak problems, we propose a specification-based technique, called LeakChaser (Chapter 6), that can not only capture precisely the unnecessary references leading to leaks, but can also explain, with high-level application semantics, why these references become unnecessary.

Another frequently seen bloat pattern is the creation and initialization of loop-invariant data structures. In Chapter 8 we propose a type and effect system that can help programmers find and hoist such data structures. By hoisting loop-invariant data structures early during development, the programmer is likely to prevent small performance issues from accumulating, and thus significantly simplify the task of subsequent performance tuning.

We have presented a series of case studies to show the effectiveness of our dynamic and static bloat detection analyses. With varying space and time overheads, these techniques are designed to target different types of performance problems, and can be used in different phases of software development. For example, the cost-benefit analysis and the copy profiling analysis can assist a programmer in making appropriate design and implementation choices related to modeling and design. The static container bloat detection tool can be used during coding to find (unnoticed) inefficiently-used containers that could potentially lead to bloat. Similarly, the static detection of loop-invariant data structures can be employed during coding to identify likely performance problems. All proposed tools can help performance diagnosis when a problem is observed during tuning.

We hope that with the help of the techniques we have developed, performance tuning can be made much easier and will no longer be a daunting task that requires special skills and experience. Developers should be able to easily understand performance and perform optimizations when they are assisted by good tools and do not need to focus on every low-level detail of the execution behavior and the analysis process. The productivity-performance gap between managed languages and unmanaged languages could be further reduced by using these techniques and tools, so that performance would no longer be an issue that stands in the way of using object-oriented languages to implement performance-critical systems. Furthermore, we hope that the examples and patterns discovered by this dissertation can be used to raise awareness of bloat in real-world software development. Developers should understand the performance impact of their implementation choices and should try to avoid these bloat patterns in order to have high-performance implementations.

CHAPTER 11: Future Work

Over the course of 17 years from 1986 to 2002, the performance of microprocessors improved at the rate of 52% per year. What grows even faster than the clock speed is the functionality and size of software (a phenomenon known as Myhrvold's Law). Myhrvold's premise that “software is a gas” describes the observation that no matter how much improvement has been achieved on hardware, developers always have the tendency to add functionality that makes their software push the performance boundaries. There is an ever-increasing demand for performance optimization in modern software despite the deployment of faster CPUs and larger memory systems. Future work will continue to address this demand by inventing new approaches to alleviate bloat across the entire software lifecycle: new language features, type systems, design models, and testing and analysis tools. It is also interesting to consider how to adapt this chain of techniques to the development of performance-critical software, such as real-time and mobile applications, to improve their functionality while reducing the required human labor. We believe there are larger opportunities than ever before that we, as programming language and software engineering researchers, can exploit in order to make software more efficient, and this can happen entirely at the application level, without the help of ever-increasing hardware capabilities.

This chapter describes future research opportunities, with a focus on both our own future directions and what the PL/SE community can do to address the ever-increasing levels of inefficiency in object-oriented applications.

Thin patterns While design patterns [53] have been extremely successful in helping programmers write highly-manageable code, they are the root causes of many types of runtime bloat. In many large applications, for instance, in order to implement a visitor pattern, the programmer uses inner classes to represent different kinds of visitors, which contain nothing but a visit method. Such a visitor class can be instantiated an extremely large number of times (e.g., the allocation sites are located in loops with many layers of nesting), and all objects created are identical: they have no data and are used only for dynamic dispatch. It is not free to create and deallocate these objects, and significant overhead reduction can be seen when we use only the method without creating objects.
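A hedged Java sketch of this situation (hypothetical classes): the stateless visitor is re-allocated on every iteration purely for dynamic dispatch, while a single shared instance would serve equally well.

import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a "fat" visitor: a stateless object allocated once
// per element purely to drive dynamic dispatch. A thin variant reuses one instance.
interface NodeVisitor { void visit(String node); }

public class ThinVisitorExample {
    // Thin alternative: one shared, stateless visitor instance.
    static final NodeVisitor PRINTER = new NodeVisitor() {
        public void visit(String node) { System.out.println(node); }
    };

    public static void main(String[] args) {
        List<String> nodes = new ArrayList<>();
        for (int i = 0; i < 3; i++) nodes.add("node-" + i);

        for (String n : nodes) {
            // Bloat pattern: a fresh, identical, data-free visitor per element.
            NodeVisitor v = new NodeVisitor() {
                public void visit(String node) { System.out.println(node); }
            };
            v.visit(n);

            // Thin pattern: reuse the single shared, stateless instance.
            PRINTER.visit(n);
        }
    }
}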

Future research on patterns may consider the creation of a few specialized versions for each existing pattern (i.e., thin patterns), which provide different tradeoffs between efficiency and modularity. On the compiler side, pattern-aware optimization techniques could be expected to remove inefficiencies and generate higher-quality code.

Performance-conscious modeling languages and tools While performance-aware design has been studied in the field of software performance engineering (e.g., [126,153]), this research focuses primarily on high-level architectures and processes, rather than low-level program inefficiencies. Hence, the problem is worth reconsidering in the future, and additional efforts should be focused on explicit bloat avoidance in state-of-the-art modeling languages (e.g., UML) and tools (e.g., EMF and Rational Software Modeler).

Careless design can lead to significant runtime bloat, especially when modeling tools are used to automatically generate code skeletons from the design. As an example, one cause of bloat is carelessly chosen associations. Consider several classes X1, X2, . . . together with the associations between them (e.g., as defined in object-oriented design and captured in UML class diagrams). The associations typically include directionality (uni- vs. bi-directional) and multiplicity (e.g., one-to-many). There are often many semantically-equivalent ways to implement these associations in code. The programmer may choose one of these possibilities without truly understanding the implications of her/his choice on the memory footprint of the application. Even worse, in many cases, the programmer does not make this choice at all—the default data model in the modeling tool is applied automatically behind the scenes. A performance-conscious design model will take performance requirements as an explicit parameter, and this will result in extended modeling languages and tools that incorporate various resource constraints.
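As a hedged illustration (hypothetical classes), the two fragments below implement the same one-to-many association between a customer and its orders. The first stores a per-customer hash map keyed by order id, the second a plain list; the choice can change the per-customer memory footprint considerably even though both encodings are semantically adequate for simple traversal.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical classes: two semantically-equivalent encodings of the same
// one-to-many Customer -> Order association with different memory footprints.
class Order { long id; }

class CustomerWithMap {
    // Heavier encoding: a HashMap per customer, with boxed keys and entry objects,
    // even when orders are only ever traversed in sequence.
    final Map<Long, Order> orders = new HashMap<>();
    void addOrder(Order o) { orders.put(o.id, o); }
}

class CustomerWithList {
    // Lighter encoding: a plain list suffices if lookups by id are not needed.
    final List<Order> orders = new ArrayList<>();
    void addOrder(Order o) { orders.add(o); }
}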

Unit testing/checking bloat It is important to avoid inefficiencies early during development, before they accumulate and become observable. This calls for novel program analysis and testing techniques that can work for incomplete programs.

While there exists a body of work on unit testing and component-level analysis, it is unclear how to adapt these techniques to verify non-functional properties. For example, it may not be easy to write assertions (i.e., test oracles) for unit testing, as redundancy at the unit level may not be obvious, and assertions based on explicit performance checks (e.g., running time and space) are likely to be insensitive at this level.
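To make the difficulty concrete, the plain-JDK sketch below (hypothetical code; the thresholds and the unit under test are invented for illustration) plays the role of a performance oracle for a unit. Choosing meaningful thresholds is exactly the open problem: at the unit level, running times and memory deltas are tiny and noisy, so such an oracle is insensitive to the bloat it is meant to catch.

// Hypothetical sketch of a performance "oracle" for a unit test, written with
// plain JDK calls rather than a real testing framework. The thresholds below are
// arbitrary, which illustrates why such assertions tend to be insensitive.
public class PerformanceOracleSketch {
    static String unitUnderTest(int n) {          // hypothetical unit under test
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) sb.append(i).append(',');
        return sb.toString();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long memBefore = rt.totalMemory() - rt.freeMemory();
        long t0 = System.nanoTime();

        unitUnderTest(1000);

        long elapsedNanos = System.nanoTime() - t0;
        long memDelta = (rt.totalMemory() - rt.freeMemory()) - memBefore;

        // Arbitrary thresholds standing in for a test oracle.
        if (elapsedNanos > 5_000_000 || memDelta > 1_000_000) {
            throw new AssertionError("unit exceeded its performance budget");
        }
    }
}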

This difficulty actually points to the more general problem of non-functional specifications. What can we assert about performance other than running time and space? Can any functional properties of a program be employed to specify performance requirements? Good specifications must be closely related to a certain bloat problem, and not simply describe the symptom that the problem exhibits. Significant improvements could be achieved in the research of performance analysis if such specifications were designed and evaluated.

Autonomous system and program synthesis Looking a bit farther into the future, the feedback-directed compilation techniques in a JVM may be powerful enough to handle the many layers of abstractions [93] during the execution. For example, dynamic object inlining may be an effective approach to reduce pointer overhead. In order to remove container inefficiency, the runtime system could automatically shrink the space allocated for a container if it observes that much of the space is not used. These techniques of course require sophisticated profiling techniques that are semantics-aware and incur sufficiently low overhead. Advances in program synthesis [127–131, 137] shed new light on solving the execution bloat problem. Given a user-defined specification, a program synthesis tool can automatically choose, from a space of algorithms, the most efficient one. This can apply naturally in the research of bloat detection to find efficient implementations, and may further be used to generate implementations for performance-critical tasks that are guaranteed to meet performance requirements.

Data-based profiling Existing profiling work focuses primarily on control-based profiling (e.g., path profiling, method profiling, trace profiling), under the assumption that substantial performance gains can be obtained by optimizing frequently-executed control-flow regions. However, we found that in many cases, a high frequency of certain data-manipulating activities (e.g., pure copies) is an even stronger indicator of optimization opportunities than that of control-flow regions. Are there any other bloat-indicating data activities? Can we incorporate such profiling work into a modern feedback-directed compiler to improve application performance? This is an entire area that is almost untouched by existing work, and it presents exciting opportunities for future developments.
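A hedged Java sketch (hypothetical code) of the kind of data activity meant by pure copies: a value is read from one object and written into another, across several layers, without any computation ever being performed on it. A high dynamic frequency of such copy chains is the data-based signal described above.

// Hypothetical copy chain: the customer name is copied through three layers
// without being computed on, only moved from one field or container slot to
// another. High frequencies of such pure copies indicate optimization
// opportunities that control-flow profiles do not reveal.
class CustomerRecord { String name; CustomerRecord(String n) { name = n; } }
class CustomerDTO { String name; }

public class CopyChainExample {
    static CustomerDTO toDto(CustomerRecord r) {
        CustomerDTO dto = new CustomerDTO();
        dto.name = r.name;                 // copy 1: field to field
        return dto;
    }

    public static void main(String[] args) {
        CustomerRecord record = new CustomerRecord("Alice");
        CustomerDTO dto = toDto(record);
        String[] row = new String[1];
        row[0] = dto.name;                 // copy 2: field to array slot
        String label = row[0];             // copy 3: array slot to local
        System.out.println(label);         // the value is finally used, unmodified
    }
}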

BIBLIOGRAPHY

[1] E. E. Aftandilian and S. Z. Guyer. GC Assertions: Using the garbage collector to check heap properties. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 235–244, 2009.

[2] Agile Software Development. http://www.agilealliance.com.

[3] H. Agrawal and J. R. Horgan. Dynamic program slicing. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 246–256, 1990.

[4] A. Aho, R. Sethi, and J. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley, 1986.

[5] J. Aldrich, V. Kostadinov, and C. Chambers. Alias annotations for program un- derstanding. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 311– 330, 2002.

[6] E. Altman, M. Arnold, R. Bordewekar, R. Delmonico, N. Mitchell, and P. Sweeney. Observations on tuning a Java enterprise application for perfor- mance and scalability. IBM Journal of Research and Development, 54(5):1–12, 2010.

[7] G. Ammons, J.-D. Choi, M. Gupta, and N. Swamy. Finding and removing performance bottlenecks in large systems. In European Conference on Object- Oriented Programming (ECOOP), pages 172–196, 2004.

[8] M. Arnold, S. Fink, D. Grove, M. Hind, and P. F. Sweeney. A survey of adaptive optimization in virtual machines. Proceedings of the IEEE, 92(2):449–466, 2005.

[9] M. Arnold, M. Vechev, and E. Yahav. QVM: An efficient runtime for detecting defects in deployed systems. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOP- SLA), pages 143–162, 2008.

[10] Ashes Suite Collection. http://www.sable.mcgill.ca/software.

[11] J. Aycock. A brief history of just-in-time. ACM Computing Surveys, 35(2):97– 113, 2003.

[12] D. F. Bacon, S. L. Graham, and O. J. Sharp. Compiler transformations for high-performance computing. ACM Computing Surveys, 26(4):345–420, 1994.

[13] G. Balakrishnan and T. Reps. Recency-abstraction for heap-allocated storage. In Static Analysis Symposium (SAS), pages 221–239, 2006.

[14] T. Ball and J. Larus. Efficient path profiling. In International Symposium on Microarchitecture (MICRO), pages 46–57, 1996.

[15] S. M. Blackburn, R. Garner, C. Hoffman, A. M. Khan, K. S. McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Z. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. E. B. Moss, A. Phansalkar, D. Stefanović, T. VanDrunen, D. von Dincklage, and B. Wiedermann. The DaCapo benchmarks: Java benchmarking development and analysis. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 169–190, 2006.

[16] S. M. Blackburn and K. S. McKinley. Immix: a mark-region garbage collec- tor with space efficiency, fast collection, and mutator performance. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 22–32, 2008.

[17] S. M. Blackburn, K. S. McKinley, R. Garner, C. Hoffmann, A. M. Khan, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Z. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. E. B. Moss, A. Phansalkar, D. Stefanović, T. VanDrunen, D. von Dincklage, and B. Wiedermann. Wake up and smell the coffee: Evaluation methodology for the 21st century. Communications of the ACM, 51(8):83–89, 2008.

[18] R. Bodík, R. Gupta, and M. L. Soffa. Complete removal of redundant expressions. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 1–14, 1998.

[19] M. D. Bond and K. S. McKinley. Continuous path and edge profiling. In International Symposium on Microarchitecture (MICRO), pages 130–140, 2005.

[20] M. D. Bond and K. S. McKinley. Bell: Bit-encoding online memory leak de- tection. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 61–72, 2006.

[21] M. D. Bond and K. S. McKinley. Probabilistic calling context. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 97–112, 2007.

[22] M. D. Bond and K. S. McKinley. Tolerating memory leaks. In ACM SIG- PLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 109–126, 2008.

[23] M. D. Bond and K. S. McKinley. Leak pruning. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 277–288, 2009.

[24] M. D. Bond, N. Nethercote, S. W. Kent, S. Z. Guyer, and K. S. McKinley. Tracking bad apples: reporting the origin of null and undefined value errors. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 405–422, 2007.

[25] G. Booch, J. Rumbaugh, and I. Jacobson. The Unified Modeling Language User Guide. Addison-Wesley, 1999.

[26] C. Boyapati, B. Liskov, and L. Shrira. Ownership types for object encapsula- tion. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 213–223, 2003.

[27] P. Briggs and K. D. Cooper. Effective partial redundancy elimination. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 159–170, 1994.

[28] M. Burtscher and M. Jeeradit. Compressing extended program traces using value predictors. In International Conference on Parallel Architectures and Compilation Techniques (PACT), pages 159–169, 2003.

[29] C. Calcagno, D. Distefano, P. O'Hearn, and H. Yang. Compositional shape analysis by means of bi-abduction. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 289–300, 2009.

[30] B. Calder, P. Feller, and A. Eustace. Value profiling. In International Sympo- sium on Microarchitecture (MICRO), pages 259–269, 1997.

[31] D. Chandra. Information Flow Analysis and Enforcement in Java Bytecode. PhD thesis, University of California, Irvine, 2006.

[32] D. Chandra and M. Franz. Fine-grained information flow analysis and enforce- ment in a Java virtual machine. In Annual Computer Security Applications Conference (ACSAC), pages 463–475, 2007.

[33] B. Chang and X. Rival. Relational inductive shape analysis. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 247–260, 2008.

[34] S. Cherem, L. Princehouse, and R. Rugina. Practical memory leak detection using guarded value-flow analysis. In ACM SIGPLAN Conference on Program- ming Language Design and Implementation (PLDI), pages 480–491, 2007.

[35] T. M. Chilimbi. Efficient representations and abstractions for quantifying and exploiting data reference locality. In ACM SIGPLAN Conference on Program- ming Language Design and Implementation (PLDI), pages 191–202, 2001.

[36] F. Chow, S. Chan, R. Kennedy, S.-M. Liu, R. Lo, and P. Tu. A new algorithm for partial redundancy elimination based on SSA form. In ACM SIGPLAN Con- ference on Programming Language Design and Implementation (PLDI), pages 273–286, 1997.

[37] D. Clarke and S. Drossopoulou. Ownership, encapsulation and the disjoint- ness of type and effect. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOP- SLA), pages 292–310, 2002.

[38] J. Clause, W. Li, and A. Orso. Dytan: A generic dynamic taint analysis frame- work. In ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA), pages 196–206, 2007.

[39] P. Cousot and R. Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixed points. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 238–252, 1977.

[40] DaCapo Benchmarks. http://www.dacapo-bench.org.

[41] DaCapo Bug Repository. Bloat Report. http://sourceforge.net/tracker/?func=detail&aid=2975679&group_id=172498&atid=861957.

[42] A. Deepak, J. Crupi, and D. Malks. Core J2EE Patterns: Best Practices and Design Strategies. Prentice Hall, 2003.

[43] W. DePauw, D. Lorenz, J. Vlissides, and M. Wegman. Execution patterns in object-oriented visualization. In USENIX Conference on Object-Oriented Technologies and Systems (COOTS), pages 219–234, 1998.

[44] W. DePauw and G. Sevitsky. Visualizing reference patterns for solving memory leaks in Java. Concurrency: Practice and Experience, 12(14):1431–1454, 2000.

[45] L. P. Deutsch and A. M. Schiffman. Efficient implementation of the Smalltalk-80 system. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 297–302, 1984.

[46] J. Dolby and A. Chien. An automatic object inlining optimization and its evaluation. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 345–357, 2000.

[47] N. Dor, M. Rodeh, and S. Sagiv. Checking cleanness in linked lists. In Static Analysis Symposium (SAS), pages 115–134, 2000.

[48] B. Dufour, K. Driesen, L. Hendren, and C. Verbrugge. Dynamic metrics for Java. In ACM SIGPLAN International Conference on Object-Oriented Pro- gramming, Systems, Languages, and Applications (OOPSLA), pages 149–168, 2003.

[49] B. Dufour, B. G. Ryder, and G. Sevitsky. Blended analysis for performance un- derstanding of framework-based applications. In ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA), pages 118–128, 2007.

[50] B. Dufour, B. G. Ryder, and G. Sevitsky. A scalable technique for characteriz- ing the usage of temporaries in framework-intensive Java applications. In ACM SIGSOFT International Symposium on the Foundations of Software Engineer- ing (FSE), pages 59–70, 2008.

[51] ej-technologies GmbH. JProfiler. http://www.ej-technologies.com.

[52] Extreme Programming. http://www.extremeprogramming.org.

[53] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.

[54] C. Grothoff, J. Palsberg, and J. Vitek. Encapsulating objects with confined types. In ACM SIGPLAN International Conference on Object-Oriented Pro- gramming, Systems, Languages, and Applications (OOPSLA), pages 241–255, 2001.

[55] S. Gulwani, S. Jain, and E. Koskinen. Control-flow refinement and progress invariants for bound analysis. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 375–385, 2009.

[56] S. Gulwani, K. Mehra, and T. Chilimbi. SPEED: Precise and efficient static estimation of program computational complexity. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 127–139, 2009.

[57] S. C. Gupta and R. Palanki. Java memory leaks – catch me if you can, 2005. http://www.ibm.com/developerworks/rational/library/05/0816_GuptaPalanki.

[58] B. Hackett and R. Rugina. Region-based shape analysis with tracked loca- tions. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 310–323, 2005.

[59] V. Haldar, D. Chandra, and M. Franz. Dynamic taint propagation for Java. In Annual Computer Security Applications Conference (ACSAC), pages 303–311, 2005.

[60] W. Halfond, A. Orso, and P. Manolios. WASP: Protecting web applications using positive tainting and syntax-aware evaluation. IEEE Transactions on Software Engineering, 34(1):65–81, 2008.

[61] R. Hastings and B. Joyce. Purify: A tool for detecting memory leaks and access errors in C and C++ programs. In Winter 1992 USENIX Conference, pages 125–138, 1992.

[62] M. Hauswirth and T. M. Chilimbi. Low-overhead memory leak detection us- ing adaptive statistical profiling. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 156–164, 2004.

[63] D. L. Heine and M. S. Lam. A practical flow-sensitive and context-sensitive C and C++ memory leak detector. In ACM SIGPLAN Conference on Program- ming Language Design and Implementation (PLDI), pages 168–181, 2003.

[64] D. L. Heine and M. S. Lam. Static detection of leaks in polymorphic containers. In International Conference on Software Engineering (ICSE), pages 252–261, 2006.

[65] M. Hertz, S. M. Blackburn, J. E. B. Moss, K. S. McKinley, and D. Stefanović. Generating object lifetime traces with Merlin. ACM Transactions on Programming Languages and Systems, 28(3):476–516, 2006.

[66] S. Horwitz, T. Reps, and M. Sagiv. Demand interprocedural dataflow analysis. In ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE), pages 104–115, 1995.

[67] Q. Jacobson, E. Rotenberg, and J. E. Smith. Path-based next trace prediction. In International Symposium on Microarchitecture (MICRO), pages 14–23, 1997.

[68] Java Development Blog, 2009. cld.blog-city.com.

[69] Java Heap Analyzer Tool (HAT). http://hat.dev.java.net.

[70] Jikes Research Virtual Machine. http://jikesrvm.org.

[71] M. Jump and K. S. McKinley. Cork: Dynamic memory leak detection for garbage-collected languages. In ACM SIGPLAN-SIGACT Symposium on Prin- ciples of Programming Languages (POPL), pages 31–38, 2007.

[72] K. Kennedy and J. R. Allen. Optimizing compilers for modern architectures: a dependence-based approach. Morgan Kaufmann Publishers Inc., 2002.

[73] B. Kent. Implementation Patterns. Addison-Wesley, 2007.

[74] J. Knoop, O. Rüthing, and B. Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117–1155, 1994.

[75] J. Kodumal and A. Aiken. The set constraint/CFL reachability connection in practice. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 207–218, 2004.

[76] B. Korel and J. Laski. Dynamic slicing of computer programs. Journal of Systems and Software, 13(3):187–195, 1990.

[77] J. Krinke. Advanced Slicing of Sequential and Concurrent Programs. PhD thesis, University of Passau, 2003.

[78] C. Larman. Applying UML and Patterns. Prentice Hall, 2nd edition, 2002.

[79] J. Larus. Whole program paths. In ACM SIGPLAN Conference on Program- ming Language Design and Implementation (PLDI), pages 259–269, 1999.

[80] LeakHunter. http://www.wilytech.com/solutions/products/LeakHunter.html.

[81] T. Lev-Ami, N. Immerman, T. Reps, M. Sagiv, and S. Srivastava. Simulating reachability using first-order logic with applications to verification of linked data structures. In International Conference on Automated Deduction (CADE), pages 99–115, 2005.

[82] O. Lhoták and L. Hendren. Scaling Java points-to analysis using Spark. In International Conference on Compiler Construction (CC), LNCS 2622, pages 153–169, 2003.

[83] O. Lhoták and L. Hendren. Scaling Java points-to analysis using Spark. In International Conference on Compiler Construction (CC), pages 153–169, 2003.

[84] O. Lhoták and L. Hendren. Run-time evaluation of opportunities for object inlining in Java. Concurrency and Computation: Practice & Experience, 17(5-6):515–537, 2005.

[85] M. H. Lipasti and J. P. Shen. Exceeding the dataflow limit via value prediction. In International Symposium on Microarchitecture (MICRO), pages 226–237, 1996.

[86] Y. Liu and A. Milanova. Ownership and immutability inference for UML-based object access control. In International Conference on Software Engineering (ICSE), pages 323–332, 2007.

[87] W. Masri and A. Podgurski. An empirical study of the strength of information flows in programs. In International Workshop on Dynamic Analysis (WODA), pages 73–80, 2006.

[88] W. Masri and A. Podgurski. Measuring the strength of information flows in pro- grams. ACM Transactions on Software Engineering and Methodology, 19(2):1– 33, 2009.

[89] S. McPeak and G. Necula. Data structure specifications via local equality axioms. In International Conference on Computer Aided Verification (CAV), pages 476–490, 2005.

[90] D. Melski and T. Reps. Interconvertibility of a class of set constraints and context-free-language reachability. Theoretical Computer Science, 248:29–98, 2000.

[91] A. Milanova, A. Rountev, and B. G. Ryder. Parameterized object sensitivity for points-to analysis for Java. ACM Transactions on Software Engineering and Methodology, 14(1):1–41, 2005.

[92] N. Mitchell. The runtime structure of object ownership. In European Conference on Object-Oriented Programming (ECOOP), pages 74–98, 2006.

[93] N. Mitchell. The big pileup. In IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pages 1–1, 2010.

[94] N. Mitchell, E. Schonberg, and G. Sevitsky. Four trends leading to Java runtime bloat. IEEE Software, 27(1):56–63, 2010.

[95] N. Mitchell and G. Sevitsky. Leakbot: An automated and lightweight tool for diagnosing memory leaks in large Java applications. In European Conference on Object-Oriented Programming (ECOOP), pages 351–377, 2003.

[96] N. Mitchell and G. Sevitsky. The causes of bloat, the limits of health. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 245–260, 2007.

[97] N. Mitchell, G. Sevitsky, and H. Srinivasan. Modeling runtime behavior in framework-based applications. In European Conference on Object-Oriented Pro- gramming (ECOOP), pages 429–451, 2006.

[98] E. Morel and C. Renvoise. Global optimization by suppression of partial redun- dancies. Communications of the ACM, 22(2):96–103, 1979.

[99] V. Nagarajan and R. Gupta. Architectural support for shadow memory in mul- tiprocessors. In ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE), pages 1–10, 2009.

[100] M. Naik and A. Aiken. Conditional must not aliasing for static race detection. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 327–338, 2007.

[101] S. K. Nair, P. N. Simpson, B. Crispo, and A. S. Tanenbaum. A virtual machine based information flow control system for policy enforcement. Electronic Notes in Theoretical Computer Science, 197(1):3–16, 2008.

[102] N. Nethercote and J. Seward. How to shadow every byte of memory used by a program. In ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE), pages 65–74, 2007.

[103] J. Newsome and D. Song. Dynamic taint analysis for automatic detection, analysis, and signature generation of exploits on commodity software. In Annual Network & Distributed System Security Symposium (NDSS), 2005.

[104] E. Nicholas. http://weblogs.java.net/blog/enicholas/archive/2006/04/leaking_evil.html.

[105] G. Novark, E. D. Berger, and B. G. Zorn. Efficiently and precisely locating memory leaks and bloat. In ACM SIGPLAN Conference on Programming Lan- guage Design and Implementation (PLDI), pages 397–407, 2009.

[106] M. Orlovich and R. Rugina. Memory leak analysis by contradiction. In Static Analysis Symposium (SAS), pages 405–424, 2006.

[107] F. Qin, S. Lu, and Y. Zhou. Safemem: Exploiting ECC-memory for detecting memory leaks and memory corruption during production runs. In International Symposium on High-Performance Computer Architecture (HPCA), pages 291– 302, 2005.

[108] F. Qin, C. Wang, Z. Li, H. Kim, Y. Zhou, and Y. Wu. Lift: A low-overhead practical information flow tracking system for detecting security attacks. In International Symposium on Microarchitecture (MICRO), pages 135–148, 2006.

[109] Quest Software. JProbe Memory Debugging. http://www.quest.com/jprobe.

[110] D. Rayside and L. Mendel. Object ownership profiling: A technique for finding and fixing memory leaks. In International Conference on Automated Software Engineering (ASE), pages 194–203, 2007.

[111] J. Rehof and M. Fähndrich. Type-based flow analysis: From polymorphic subtyping to CFL-reachability. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 54–66, 2001.

[112] C. Reichenbach, N. Immerman, Y. Smaragdakis, E. Aftandilian, and S. Z. Guyer. What can the GC compute efficiently? A language for heap assertions at GC time. In ACM SIGPLAN International Conference on Object-Oriented Pro- gramming, Systems, Languages, and Applications (OOPSLA), pages 256–269, 2010.

[113] T. Reps. Solving demand versions of interprocedural analysis problems. In International Conference on Compiler Construction (CC), pages 389–403, 1994.

[114] T. Reps. Shape analysis as a generalized path problem. In ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation (PEPM), pages 1–11, 1995.

[115] T. Reps. Program analysis via graph reachability. Information and Software Technology, 40(11-12):701–726, 1998.

[116] T. Reps, S. Horwitz, and M. Sagiv. Precise interprocedural dataflow analysis via graph reachability. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 49–61, 1995.

[117] T. Reps, S. Horwitz, M. Sagiv, and G. Rosay. Speeding up slicing. In ACM SIGSOFT International Symposium on the Foundations of Software Engineer- ing (FSE), pages 11–20, 1994.

[118] A. Rountev, A. Milanova, and B. G. Ryder. Fragment class analysis for testing of polymorphism in Java software. IEEE Transactions on Software Engineering, 30(6):372–387, 2004.

[119] S. Rubin, R. Bodik, and T. Chilimbi. An efficient profile-analysis framework for data-layout optimizations. In ACM SIGPLAN-SIGACT Symposium on Prin- ciples of Programming Languages (POPL), pages 140–153, 2002.

[120] M. Sagiv, T. Reps, and S. Horwitz. Precise interprocedural dataflow analysis with applications to constant propagation. Theoretical Computer Science, 167(1-2):131–170, 1996.

[121] M. Sagiv, T. Reps, and R. Wilhelm. Parametric shape analysis via 3-valued logic. ACM Transactions on Programming Languages and Systems, 24(3):217– 298, 1999.

[122] O. Shacham, M. Vechev, and E. Yahav. Chameleon: Adaptive selection of collections. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 408–418, 2009.

[123] R. Shaham, E. K. Kolodner, and M. Sagiv. Automatic removal of array memory leaks in Java. In International Conference on Compiler Construction (CC), pages 50–66, 2000.

[124] A. Shankar, M. Arnold, and R. Bodik. JOLT: Lightweight dynamic analysis and removal of object churn. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOP- SLA), pages 127–142, 2008.

[125] O. Shivers. Control-flow analysis in Scheme. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 164–174, 1988.

[126] C. Smith and L. Williams. Software performance engineering: A case study including performance comparison with design alternatives. IEEE Transactions on Software Engineering, 19:720–741, 1993.

[127] A. Solar-Lezama. Program Synthesis by Sketching. PhD thesis, UC Berkeley, 2008.

[128] A. Solar-Lezama, G. Arnold, L. Tancau, R. Bodik, V. Saraswat, and S. Seshia. Sketching stencils. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 167–178, 2007.

[129] A. Solar-Lezama, C. G. Jones, and R. Bodik. Sketching concurrent data struc- tures. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 136–148, 2008.

[130] A. Solar-Lezama, R. M. Rabbah, R. Bodík, and K. Ebcioglu. Programming by sketching for bit-streaming programs. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 281–294, 2005.

[131] A. Solar-Lezama, L. Tancau, R. Bodik, V. Saraswat, and S. A. Seshia. Combinatorial sketching for finite programs. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 404–415, 2006.

[132] Soot Analysis Framework. http://www.sable.mcgill.ca/soot.

[133] M. Sridharan and R. Bodik. Refinement-based context-sensitive points-to anal- ysis for Java. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 387–400, 2006.

[134] M. Sridharan, S. J. Fink, and R. Bodik. Thin slicing. In ACM SIGPLAN Con- ference on Programming Language Design and Implementation (PLDI), pages 112–122, 2007.

[135] M. Sridharan, D. Gopan, L. Shan, and R. Bodik. Demand-driven points-to anal- ysis for Java. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 59–76, 2005.

[136] K. Srinivas and H. Srinivasan. Summarizing application performance from a component perspective. In ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE), pages 136–145, 2005.

[137] S. Srivastava, S. Gulwani, and J. S. Foster. From program verification to pro- gram synthesis. In ACM SIGPLAN-SIGACT Symposium on Principles of Pro- gramming Languages (POPL), pages 313–326, 2010.

[138] SPECjbb2000. http://www.spec.org/jbb2000.

[139] SPECjvm98. http://www.spec.org/jvm98.

[140] Z. Su and D. Wagner. A class of polynomially solvable range constraints for interval analysis without widenings. Theoretical Computer Science, 345(1):122– 138, 2005.

[141] Sun Bug Database. http://bugs.sun.com/bugdatabase.

[142] Sun Java Forum. http://forums.java.net/jive/thread.jspa?messageID=180784.

[143] The Common Language Runtime. http://msdn.microsoft.com/en-us/library/ddk909ch(VS.71).aspx.

[144] The J9 Java Virtual Machine. http://wiki.eclipse.org/J9.

[145] The Java Virtual Machine Tool Interface. http://java.sun.com/j2se/1.5.0/docs/guide/jvmti/.

[146] F. Tip. A survey of program slicing techniques. Journal of Programming Languages, 3:121–189, 1995.

[147] R. Vallée-Rai, E. Gagnon, L. Hendren, P. Lam, P. Pominville, and V. Sundaresan. Optimizing Java bytecode using the Soot framework: Is it feasible? In International Conference on Compiler Construction (CC), pages 18–34, 2000.

[148] K. Vaswani, A. V. Nori, and T. M. Chilimbi. Preferential path profiling: Compactly numbering interesting paths. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 351–362, 2007.

[149] M. Vechev, E. Yahav, and G. Yorsh. PHALANX: Parallel checking of expressive heap assertions. In International Symposium on Memory Management (ISMM), pages 41–50, 2010.

[150] C. Wang and A. Roychoudhury. Dynamic slicing on Java bytecode traces. ACM Transactions on Programming Languages and Systems, 30(2):1–49, 2008.

[151] J. L. Whitten, L. D. Bentley, and K. C. Dittman. Systems Analysis and Design Methods. McGraw-Hill, 2005.

[152] M. Wolfe. High Performance Compilers for Parallel Computing. Addison-Wesley, 1996.

[153] M. Woodside, G. Franks, and D. C. Petriu. The future of software performance engineering. In Future of Software Engineering (FOSE), pages 171–187, 2007.

[154] Y. Xie and A. Aiken. Context- and path-sensitive memory leak detection. In ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE), pages 115–125, 2005.

[155] G. Xu, M. Arnold, N. Mitchell, A. Rountev, E. Schonberg, and G. Sevitsky. Finding low-utility data structures. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 174–186, 2010.

[156] G. Xu, M. Arnold, N. Mitchell, A. Rountev, and G. Sevitsky. Go with the flow: Profiling copies to find runtime bloat. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 419–430, 2009.

[157] G. Xu, M. D. Bond, F. Qin, and A. Rountev. LeakChaser: Helping programmers narrow down causes of memory leaks. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 270–282, 2011.

[158] G. Xu and A. Rountev. Precise memory leak detection for Java software using container profiling. In International Conference on Software Engineering (ICSE), pages 151–160, 2008.

[159] G. Xu and A. Rountev. Detecting inefficiently-used containers to avoid bloat. In ACM SIGPLAN Conference on Programming Language Design and Imple- mentation (PLDI), pages 160–173, 2010.

[160] G. Xu, A. Rountev, and M. Sridharan. Scaling CFL-reachability-based points-to analysis using context-sensitive must-not-alias analysis. In European Conference on Object-Oriented Programming (ECOOP), pages 98–122, 2009.

[161] G. Xu, D. Yan, and A. Rountev. Finding loop-invariant data structures. In Technical Report OSU-CISRC-8/11-TR24, CSE/OSU, 2011.

[162] W. Xu, S. Bhatkar, and R. Sekar. Taint-enhanced policy enforcement: A prac- tical approach to defeat a wide range of attacks. In USENIX Security, pages 121–136, 2006.

[163] J. Yang and R. Gupta. Frequent value locality and its applications. ACM Transactions on Programming Languages and Systems, 1(1):79–105, 2002.

[164] X. Zhang. Fault Localization via Precise Dynamic Slicing. PhD thesis, Univer- sity of Arizona, 2006.

[165] X. Zhang, N. Gupta, and R. Gupta. Pruning dynamic slices with confidence. In ACM SIGPLAN Conference on Programming Language Design and Imple- mentation (PLDI), pages 169–180, 2006.

[166] X. Zhang and R. Gupta. Cost effective dynamic program slicing. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 94–106, 2004.

[167] X. Zhang and R. Gupta. Whole execution traces. In International Symposium on Microarchitecture (MICRO), pages 105–116, 2004.

[168] X. Zhang, R. Gupta, and Y. Zhang. Precise dynamic slicing algorithms. In International Conference on Software Engineering (ICSE), pages 319–329, 2003.

[169] X. Zhang, S. Tallam, and R. Gupta. Dynamic slicing long running programs through execution fast forwarding. In ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE), pages 81–91, 2006.

[170] Y. Zhang and R. Gupta. Timestamped whole program path representation and its applications. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 180–190, 2001.

[171] Y. Zhang and R. Gupta. Data compression transformations for dynamically allocated data structures. In International Conference on Compiler Construction (CC), pages 14–28, 2002.

[172] X. Zheng and R. Rugina. Demand-driven alias analysis for C. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 197–208, 2008.
