Program Analyses for Understanding the Behavior and Performance of Traditional and Mobile Object-Oriented Software

Program Analyses for Understanding the Behavior and Performance of Traditional and Mobile Object-Oriented Software Dissertation Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University By Dacong Yan Graduate Program in Computer Science and Engineering The Ohio State University 2014 Dissertation Committee: Atanas Rountev, Advisor Feng Qin Michael D. Bond ABSTRACT The computing industry has experienced fast and sustained growth in the complexity of software functionality, structure, and behavior. Increased complexity has led to new challenges in program analyses to understand software behavior, and in particular to uncover performance inefficiencies. Performance inefficiencies can have significant impact on software quality. When an application spends a substantial amount of time performing redundant work, software performance and user experi- ence can deteriorate. Some inefficiencies can use up certain types of resources and lead to program crashes. In general, performance inefficiency is an important and challenging problem for modern software systems. It is also a shared problem for traditional and mobile object-oriented software. Static and dynamic analyses need to keep up with this trend, and this often requires novel technical approaches. One important symptom of performance inefficiencies is run-time bloat: exces- sive memory usage and work to accomplish simple tasks. Bloat significantly affects scalability and performance, and exposing it requires good diagnostic tools. As the first contribution of this dissertation, we present a novel analysis that profiles the run-time execution to help programmers uncover potential performance problems. The key idea of the proposed approach is to track object references, starting from object creation statements, through assignment statements, and eventually ending at statements that perform useful operations. An abstract view of reference propagation ii is provided with path information specific to reference producers and their run-time contexts. Several client analyses demonstrate the use of this abstract view to uncover run-time inefficiencies. Memory leaks, both for traditional and for mobile object-oriented software, present a significant problem for software quality. Static memory leak detection is challenging because it is extremely difficult to statically compute precise object liveness for large- scale applications. We bypass this difficulty by leveraging a common leak pattern. In many cases, severe leaks occur in loops where, in each iteration, some objects created by the iteration are unnecessarily referenced by objects external to the loop. These unnecessary references are never used in later loop iterations. Based on this insight, we shift our focus from computing liveness, which is very difficult to achieve precisely and efficiently for large programs, to the easier goal of identifying objects that flow out of a loop but never flow back in. We formalize this analysis using a type and effect system and present its key properties. This technique was applied on eight real-world programs, such as Eclipse, Derby, and log4j. It not only identified known leaks, but also discovered new ones whose causes were unknown beforehand, while exhibiting a false positive rate suitable for practical use. In addition to static analysis, performance testing is an effective approach to discover memory leaks. For example, sustained growth in memory usage during test execution can indicate potential memory leaks. However, performance testing to ex- pose leaks for arbitrary software is very difficult, because, similar to other dynamic approaches, it also requires specific leak-triggering program inputs. As the third contribution of this dissertation, we introduce LeakDroid, a novel and comprehensive approach for systematic testing of resource leaks in Android applications. At the core iii of the proposed testing approach is model-based test generation that focuses specif- ically on coverage criteria aimed at resource leak defects. These criteria are based on the novel notion of neutral cycles: sequences of GUI events that should have a \neutral" effect and should not lead to increases in resource usage. Several important categories of neutral cycles are considered in the proposed test coverage criteria. As demonstrated by experimental evaluation and case studies on eight Android applications, the proposed approach is very effective in exposing resource leaks. Model-based test generation such as LeakDroid depends critically on GUI models, which describe accessible GUI objects and corresponding user actions. GUI models ultimately determine the possible flow of control and data in GUI-driven applications. The ability to understand Android GUIs is critical for the reasoning of the semantics of an Android application. We introduce the first static analysis to model GUI-related Android objects, their flow through the application, and their interactions with each other via the abstractions defined by the Android platform. We first develop a formal semantics for the relevant Android constructs to provide a solid foundation for this and other analyses. Based on the semantics, we define a constraint-based reference analysis. The analysis employs a constraint graph to model the flow of GUI objects, the hierarchical structure of these objects, and the effects of relevant Android operations. Experimental evaluation on real-world Android applications strongly suggests that the analysis achieves high precision with low cost. The analysis enables static modeling of control/data flow that is foundational for compiler analyses, instrumen- tation for event/interaction profiling, static error checking, security analysis, test generation, and automated debugging. It provides a key component to be used by static analysis researchers in the growing area of Android software. iv GUI applications are usually organized as a series of GUI windows containing structures of GUI widgets. User interaction with these windows (e.g., navigating from one to another and then going back) drives the control flow of the application. In Android, an activity plays the role of a GUI window, and transitions between activities are managed with the help of an activity stack. To understand this additional aspect of Android semantics, we introduce the first static analysis to model the An- droid activity stack, the changes in this stack, and the related interactions between activities. The analysis is an important step toward fully modeling the control/data flow of an Android application. It can be leveraged by other researchers to prune infeasible control flow paths in static analysis for Android, or to discover more paths that would be missing without modeling of the activity stack. In conclusion, this dissertation presents several dynamic and static program analysis techniques to understand the behavior of object-oriented software systems, to uncover potential performance inefficiencies in them, and to locate the root causes of these problems. The programs studied by these techniques are all written in Java, but we believe the proposed techniques are general enough to also be applied to systems written in other object-oriented languages. With these techniques, we advocate the insight that a carefully-selected subset of high-level behavioral patterns and program semantics must be leveraged in order to perform practical program analyses for modern software. v To my parents vi ACKNOWLEDGMENTS I would like to thank my advisor Nasko Rountev for his support, patience and guidance throughout the duration of my Ph.D. study. He has always been ready to help, and has devoted enormous amount of time and effort in training me to become a computer science researcher. I would also like to thank members of the PRESTO group for all the discussions and collaborations. I especially want to thank Harry Xu, who has given me a lot of useful advice and insightful comments. I thank Prof. Feng Qin and Prof. Michael D. Bond for serving on my dissertation committee. I am grateful to Wei Huang and Yi Zhao for their mentoring during my internship at Google, which helped me become a better programmer. Finally, I would like to thank my parents for their unconditional support. The material presented in this dissertation is based upon work supported by the U.S. National Science Foundation under CAREER grant CCF-0546040 and grants CCF-1017204 and CCF-1319695, by an IBM Software Quality Innovation Faculty Award, and by a Google Faculty Research Award. Any opinions, findings, and con- clusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. vii VITA September 2009 { August 2014 . Graduate Teaching/Research Asso- ciate, The Ohio State University May 2013 . .M.S. Computer Science, The Ohio State University June 2009 . B.Eng. Software Engineering, Shang- hai Jiao Tong University PUBLICATIONS Research Publications Atanas Rountev and Dacong Yan. Static Reference Analysis for GUI Objects in Android Software. In International Symposium on Code Generation and Optimization (CGO'14), pages 143-153, February 2014. Dacong Yan, Guoqing Xu, Shengqian Yang, and Atanas Rountev. LeakChecker: Practical Static Memory Leak Detection for Managed Languages. In International Symposium on Code Generation and Optimization (CGO'14), pages 87-97, February 2014. Dacong Yan, Shengqian Yang, and Atanas Rountev. Systematic Testing for Resource Leaks in

Load more