Network Analysis of Large Scale Object Oriented Software Systems
Doctor of Philosophy
Anjan Pakhira
26 February 2013
SCHOOL OF COMPUTING SCIENCE

Abstract

The evolution of software engineering knowledge, technology, tools, and practices has seen progressive adoption of new design paradigms. Currently, the predominant design paradigm is object oriented design. Despite the advocated and demonstrated benefits of object oriented design, static software analysis techniques have known limitations for object oriented systems, and many current and legacy object oriented software systems are difficult to maintain using existing reverse engineering techniques and tools. Consequently, there is renewed interest in dynamic analysis of object oriented systems, and the emergence of large and highly interconnected systems has fuelled research into new scalable techniques and tools to aid program comprehension and software testing.

In dynamic analysis, a key research problem is the efficient interpretation and analysis of large volumes of precise program execution data to support software engineering tasks. Some of the techniques employed to improve the efficiency of analysis are inspired by empirical approaches developed in other fields of science and engineering that face comparable data analysis challenges. This research focuses on the application of empirical network analysis measures to dynamic analysis data of object oriented software. Its premise is that the methods that contribute significantly to the structural integrity of the object collaboration network are also important for the delivery of the software system's function.

This thesis makes two key contributions. First, a definition is proposed for the concept of the functional importance of methods of object oriented software.
Second, the thesis proposes and validates a conceptual link between object collaboration networks and the properties of a network model with a power law connectivity distribution.

Results from empirical software engineering experiments on JHotDraw and Google Chrome are presented. The results indicate that the five standard centrality based network measures considered can be used to predict functionally important methods with a significant level of accuracy. The search for functional importance of software elements is an essential starting point for program comprehension and software testing activities. The proposed definition and application of network analysis has the potential to improve the efficiency of post release phase software engineering activities by facilitating rapid identification of potentially functionally important methods in object oriented software. These results, with some refinement, could be used for change impact prediction and a host of other applications that improve software engineering techniques.

Acknowledgements

The last three years have been one of the most challenging, exciting, and stimulating periods of my academic life, and I owe a lot of thanks and gratitude to many who have contributed to my growth and celebrated in the process. I would like to express my sincere thanks to Peter Andras, who gave me the opportunity to pursue this research. Peter has been a constant source of inspiration, guidance, and motivation all through this research. I have had the opportunity to interact with Andrian Marcus and Wahab Hamou-Lhadj on the research topic and have benefitted from their years of experience and insight. I would like to thank my wife and toddler son, who have shown patience and understanding and provided me with the support without which this work would not have been possible. Finally, I would like to thank my father, who has been a friend, philosopher and guide, believing in me and facilitating this journey.
This work was made possible by an EPSRC funded PhD studentship and by support provided by the School of Computing Science, Newcastle University, for which I will always be grateful.

Contents

Abstract
Acknowledgements
Contents
List of Figures
List of Tables
Chapter 1. Introduction
  1.1 Motivation
  1.2 Aim, objectives and contributions
    1.2.1 Aim
    1.2.2 Objectives
    1.2.3 Contributions
  1.3 Outline of the thesis
Publications
Chapter 2. Background
  2.1 Introduction
    2.1.1 Scheme of review and tables of classifications
  2.2 Program analysis overview
    2.2.1 Static analysis
      Software inspection and defect detection
      Object oriented design
      Feature location
      Design patterns and method stereotypes
  2.3 Network analysis
    2.3.1 Network Graph theory basic definitions and measurements
    2.3.2 Complex networks
      Erdős-Rényi (ER): Random graphs
      Watts-Strogatz (WS): Small-world networks
      Barabási-Albert (SF): Scale-free networks
      Network analysis measures
    2.3.3 Use of Complex network modelling
      Social and socio-technical systems
      Computational and systems biology
      Complex networks and program analysis
  2.4 Use of the Cloud
Chapter 3. Review of Dynamic Analysis Literature
  3.1 Introduction
  3.2 Dynamic analysis
    3.2.1 Software testing and profiling
    3.2.2 Data collection
    3.2.3 Object oriented design quality
    3.2.4 Program comprehension
    3.2.5 Complex networks based modelling
  3.3 Mixed mode analysis
    3.3.1 Software testing and profiling
    3.3.2 Data collection
    3.3.3 Program comprehension