An Empirical Study on Encapsulation and Refactoring in the Object-Oriented Paradigm
Total Page:16
File Type:pdf, Size:1020Kb
Declaration of Authorship I certify that the work presented here is, to the best of my knowledge and belief, original and the result of my own investigations, except as acknowledged, and has not been submitted, either in part or whole, for a degree at this or any other University. Rajaa Najjar Date: 10 September 2008 An Empirical Study on Encapsulation and Refactoring in the Object-Oriented Paradigm by Rajaa Najjar A Thesis Submitted in Fulfilment of the Requirements for the Degree of Doctor of Philosophy in the University of London September, 2008 School of Computer Science and Information Systems Birkbeck, University of London 1 Abstract Encapsulation lies at the heart of the Object-Oriented (OO) paradigm by regulating access to classes through the private, protected and public declaration mechanisms. Refactoring is a relatively new software engineering technique that is used to improve the internal structure of software, without necessarily affecting its external behaviour, and embraces the OO paradigm, including encapsulation. In this Thesis, we empirically study encapsulation trends in five C++ and five Java systems from a refactoring perspective. These systems emanate from a variety of application domains and are used as a testbed for our investigations. Two types of refactoring related to encapsulation are investigated. The ‘Encapsulate Field’ refactoring which changes a declaration of an attribute from public to private, thus guarding the field from being accessed directly from outside, and the ‘Replace Multiple Constructors with Creation Methods’ refactoring; the latter is employed to remove code ‘bloat’ around constructors, improve encapsulation of constructors and improve class comprehension through the conversion of constructors to normal methods. Both types of refactoring have a strong bond with the need for proper encapsulation of classes and objects as well as with other OO constructs such as inheritance. Overall results demonstrate both quantitative and qualitative benefits in terms of code removal, improved encapsulation and better understanding of system traits in both C++ and Java. 2 In the name of God, the Most Merciful the Most Compassionate Acknowledgements In writing this Thesis, there have, of course, been many people who have inspired, guided and supported me in various ways and so deserve mention and credit. In particular, I am very grateful to my supervisor Professor George Loizou not only for his very insightful and encouraging comments and suggestions on the Thesis, but also for our long discussions at the beginning of the project right through the end. I also owe a great debt of gratitude to my supervisor Dr Steve Counsell for his great support, patience, and guidance that helped shape and inform my knowledge and understanding of this material. Both George and Steve remain for me an incomparable model of scholarly diligence and generosity. I cannot begin to express how much I have learned from them and how much I continue to learn. I would also like to thank Professor Martin Shepperd and Dr Tracy Hall for their constructive suggestions. I also owe special thanks to Dr Peter Sozou at the London School of Economics for his statistical expertise, and to Phil Gregg and the staff at Birkbeck College and all my friends for their support and assistance. My greatest thanks go to my husband Mataz for his support and to my daughter Sarah who is still too young to read this work, but made life such a pleasure despite its moments of frustration. It is to these two people that this Thesis is dedicated, with thanks, appreciation and love. To my Mum and Dad, brothers and sisters, I say I am very grateful for your moral support and sincere prayers. 3 TABLE OF CONTENTS TABLE OF CONTENTS................................................................................................. 4 LIST OF TABLES ........................................................................................................... 9 LIST OF FIGURES ....................................................................................................... 13 LIST OF ABBREVIATIONS........................................................................................ 14 CHAPTER 1 Introduction ............................................................................................ 15 1.2 Introduction................................................................................................. 15 1.3 Motivation................................................................................................... 18 1.4 Objectives and Contribution ........................................................................ 19 1.5 Application Domains................................................................................... 21 1.6 A Research Framework ............................................................................... 21 1.7 Overview of the Thesis................................................................................ 25 CHAPTER 2 A Survey of Related Work...................................................................... 27 2.1 Introduction................................................................................................. 27 2.2 Empirical Software Engineering .................................................................. 27 2.3 Software Metrics ......................................................................................... 31 2.4 Encapsulation.............................................................................................. 35 2.5 Refactoring.................................................................................................. 39 2.6 Patterns and Antipatterns............................................................................. 42 CHAPTER 3 Research Methodology ........................................................................... 44 3.1 Introduction................................................................................................. 44 3.2 Research Methods in Software Engineering................................................. 45 3.3 Research Design.......................................................................................... 46 4 3.3.1 Identifying research objectives .................................................................... 47 3.3.2 Generating hypotheses................................................................................. 48 3.3.3 Identifying testbed software systems ........................................................... 48 3.3.3.1 Sampling techniques and criteria of choosing the software systems ............. 49 3.3.3.2 Software systems description....................................................................... 50 3.3.4 The chosen refactorings............................................................................... 52 3.3.5 Software metrics definitions ........................................................................ 53 3.3.5.1 Criteria of choosing the software metrics..................................................... 54 3.3.6 Data collection ............................................................................................ 54 3.3.7 Data analysis ............................................................................................... 56 3.3.7.1 Statistical techniques ................................................................................... 56 3.3.7.1.1 Cohen’s kappa............................................................................................. 56 3.3.7.1.2 Cronbach’s alpha......................................................................................... 57 3.3.7.1.3 One proportion test...................................................................................... 58 3.3.7.1.4 Two proportions test.................................................................................... 58 3.3.7.1.5 Kruskal-Wallis test...................................................................................... 59 3.3.7.2 Drawing conclusions and the generalisation issue........................................ 60 3.4 Manual and Automatic Collection of Data................................................... 60 3.4.1 Motivation and related work........................................................................ 62 3.4.2 Empirical investigation................................................................................ 63 3.4.3 Data collected.............................................................................................. 64 3.4.3.1 Example using priPA................................................................................... 65 3.4.4 Three hypotheses......................................................................................... 65 3.4.5 Data analysis ............................................................................................... 66 3.4.6 Hypotheses re-visited .................................................................................. 71 5 3.4.6.1 Hypothesis one re-visited ............................................................................ 71 3.4.6.2 Hypothesis two re-visited ............................................................................ 72 3.4.6.3 Hypothesis three re-visited .......................................................................... 73 3.4.7 Cost versus accuracy ................................................................................... 74 3.4.8 Discussion..................................................................................................