Supporting Component-Based Software Development with Active Component Repository Systems

by
Yunwen Ye
B.Sc., Fudan University, China, 1987
M.S., Fudan University, China, 1990

A thesis submitted to the Faculty of the Graduate School of the University of Colorado in partial fulfillment of the requirements for the degree of Doctor of Philosophy
Department of Computer Science
2001

This thesis entitled:
Supporting Component-Based Software Development with Active Component Repository Systems
written by Yunwen Ye
has been approved for the Department of Computer Science

Gerhard Fischer
James Martin
Date

The final copy of this thesis has been examined by the signatories, and we find that both the content and the form meet acceptable presentation standards of scholarly work in the above mentioned discipline.

Ye, Yunwen (Ph.D., Computer Science)
Supporting Component-Based Software Development with Active Component Repository Systems
Thesis directed by Prof. Gerhard Fischer

It is widely believed and empirically proven that component reuse improves both the quality and productivity of software development. Before software components are reused, however, they must be located. Component repository systems provide a means to locate software components. Current component repository systems are designed to support the paradigm of development-with-reuse, which views reuse as a process independent of the whole software development process and relies on programmers to take the reuse initiative. Such systems fall short in supporting programmers who make no attempt to reuse because they do not know of the existence of reusable components or they perceive that reuse costs more than programming from scratch.
This dissertation advocates a paradigm shift from development-with-reuse to reuse-within-development, which views reuse as an integral part of software development, and component repository systems as information systems that augment programmers’ insufficient knowledge about reusable components and assist them in accomplishing their tasks. Active component repository systems (component repository systems equipped with active information delivery mechanisms) support reuse-within-development. They can be seamlessly integrated with programming environments. Through this integration, their active information delivery mechanism delivers task-relevant and user-specific components, without being given explicit reuse queries, to help programmers reuse unknown components and to reduce the cost of reuse.

An active component repository system, CodeBroker, has been developed and evaluated. CodeBroker runs continuously in the background of a programming environment and infers programmers’ needs for reusable components by monitoring their interactions with the environment. Potentially reusable components that match reuse queries extracted from comments and signatures in the programming environment are autonomously located and actively delivered to programmers. Formal evaluations of the CodeBroker system have indicated that it motivated programmers to reuse once relevant components were delivered, and that it was able to deliver components relevant to both the task and the background knowledge of programmers.

Acknowledgments

I feel very fortunate that my employer, Software Research Associates, Inc. (SRA), Tokyo, Japan, provided me with the time and financial support to complete this research. In particular, I thank Kouichi Kishida, executive vice president and technical director of SRA, for his lasting support and encouragement, without which I could not have finished this research.
Yoshitaka Matsumura, Kaoru Hayashi, and Yoshikazu Hayashi have been excellent managers who have gone to great lengths to provide the best conditions for me to complete my research. I also want to thank my colleague Tomohiro Oda for his help.

I am grateful to the members of my thesis committee. Gerhard Fischer, my advisor, is simply the best advisor I could have found. His conceptual frameworks on Domain-Oriented Design Environments and on learning have provided the foundations for this research. Without his excellent skills in challenging my ideas and motivating me to think deeper, I could not have finished the research in this manner. Kumiyo Nakakoji, my mentor and role model, has provided immeasurable support, both emotionally and intellectually. She has always been there when I needed help. Brent Reeves has spent much time patiently listening to my sometimes rough ideas and reading my immature manuscripts, and has provided frank, yet friendly, critical feedback. His constructive criticism has been invaluable in guiding me to frame the research problem, prioritize my resources, and present my ideas clearly. The support from other members of my thesis committee, Ken Anderson, James Martin, and Walter Kintsch, helped me to clarify my understanding, and their input is very much appreciated. In particular, I thank James Martin for his excellent course on Natural Language Processing, which introduced me to the research field of information retrieval. That was one of the best courses I have ever taken.

Members of the Center for LifeLong Learning and Design have been very supportive. I thank Taro Adachi for numerous, wide-ranging discussions that I have greatly enjoyed over the years. Jonathan Ostwald generously offered many times to listen to my thoughts and read my writings. His encouragement and feedback are greatly appreciated.
I was extremely delighted to have as an officemate Eric Scharff, who had an answer to every computer problem I had, no matter whether it was a Mac, Windows or Linux problem. Many discussions with Rogerio de Paula helped me structure my thoughts. I thank Gerry Stahl, Hal Eden, Andy Gorman, and Francesca Iovine for their support.

Finally, I would like to thank my family members. I thank my parents, who have taught me the joy of learning and have always urged me to do my best. I thank my eldest daughter, Hanlu, for understanding when she had to spend many weekends being bored because dad had to work, and my 5-month-old daughter, Hanlei, for her innocent and sweet smiles which provided the best comfort after a day’s hard work. Most of all, I wholeheartedly acknowledge the endless love, understanding, and support that my wife, Yonghong Pan, has given to me. In particular, I thank her for her unabated confidence in me, which has cheered me greatly at times of frustration.

Contents

Chapter

1 Introduction
  1.1 Motivation
  1.2 Goal of the Research
  1.3 Active Component Repository Systems
  1.4 The CodeBroker System
  1.5 Organization of the Dissertation

2 Roles of Reusable Components in Programming
  2.1 A Process Model of Programming
  2.2 Programming Knowledge
  2.3 Opportunistic Programming
  2.4 Benefits of Software Components in Programming

3 Challenges of Software Reuse
  3.1 Overview of Software Reuse
  3.2 General Issues of Component Reuse
  3.3 Creating Reusable Components
  3.4 Understanding the Cognitive Difficulties of Component Reuse

4 The Component Locating Problem
  4.1 No Attempt to Reuse
  4.2 Paradigm Shift: From Development-with-Reuse to Reuse-within-Development
  4.3 Information-Enriched Workspaces
  4.4 Active Component Repository Systems

5 Active Information Systems
  5.1 Basic Issues of Active Information Systems
  5.2 Acquiring Information of User Tasks
  5.3 Personalizing Information Delivery
  5.4 Dealing with Partial, Imprecise Queries
  5.5 Comparing Active Information Systems with an Example in the Real World
  5.6 The Spectrum of Support for Locating Information

6 Indexing and Retrieval Mechanisms in CodeBroker
  6.1 Indexing and Retrieval Mechanisms
  6.2 Creating the Component Repository

7 Locating and Delivering Components in CodeBroker
  7.1 System Architecture
  7.2 Listener
  7.3 Fetcher
  7.4 Presenter
  7.5 The Retrieval-by-Reformulation Mechanism
  7.6 Summary of CodeBroker

8 Evaluations of CodeBroker
  8.1 Evaluating the Retrieval Mechanisms
  8.2 Empirical Evaluations of the CodeBroker System
  8.3 Findings about the Usage of CodeBroker
  8.4 Other Findings about Programming in General
  8.5 Problems of CodeBroker and Needed Improvements
  8.6 Summary of Evaluations

9 Related Work
  9.1 Active Information Systems
  9.2 Component Repository Systems
  9.3 Intelligent Programming Environments

10 Future Work and Conclusions
  10.1 Future Work
  10.2 Summary
  10.3 Contributions

Bibliography

Appendix

A The List of Queries and Relevant Components
B Questions Asked in the Post-Experiment Interview
C Abbreviations
D Glossary

Tables

Table

1.1 The rapid growth of the Java Core API library
4.1 Relations between reuse mode, knowledge sources, and tool support
5.1 A comparison between plan recognition and similarity analysis
8.1 Average precision and recall values for LSA, Mixed (average of LSA and Okapi), and Okapi
8.2 Programming knowledge and expertise of subjects
8.3 Overall results of evaluation experiments with programmers
8.4 Subjective evaluations of the CodeBroker system
8.5 Experiment data regarding user models
8.6 Experiment data about discourse models

Figures

Figure

1.1 The location-comprehension-modification process of reusing components
1.2 Software reuse failure modes
1.3 Overview of CodeBroker
2.1 The process model of programming
2.2 A program and its program plans
2.3 Orthogonality between program plans and software components
2.4 The role of components in problem framing
3.1 A cognitive model of the component reuse process
4.1 Different levels of programmers’ knowledge about a component repository
4.2 The development-with-reuse paradigm
4.3 The reuse-within-development paradigm
5.1 Feedforward information delivery
5.2 Autocompletion in Internet Explorer
5.3 Feedback information delivery
5.4 Two assumptions of similarity analysis
