Design Information Recovery from Legacy System COBOL Source Code

Nova Southeastern University NSUWorks CEC Theses and Dissertations College of Engineering and Computing 1996 Design Information Recovery from Legacy System COBOL Source Code: Research on a Reverse Engineering Methodology Robert Lee Miller Nova Southeastern University, [email protected] This document is a product of extensive research conducted at the Nova Southeastern University College of Engineering and Computing. For more information on research and degree programs at the NSU College of Engineering and Computing, please click here. Follow this and additional works at: http://nsuworks.nova.edu/gscis_etd Part of the Computer Sciences Commons Share Feedback About This Item NSUWorks Citation Robert Lee Miller. 1996. Design Information Recovery from Legacy System COBOL Source Code: Research on a Reverse Engineering Methodology. Doctoral dissertation. Nova Southeastern University. Retrieved from NSUWorks, Graduate School of Computer and Information Sciences. (727) http://nsuworks.nova.edu/gscis_etd/727. This Dissertation is brought to you by the College of Engineering and Computing at NSUWorks. It has been accepted for inclusion in CEC Theses and Dissertations by an authorized administrator of NSUWorks. For more information, please contact [email protected]. Design Information Recovery from Legacy System COBOL Source Code: Research on a Reverse Engineering Methodology by Robert Lee Miller A Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy School of Computer and Information Sciences Nova Southeastern University 1996 We hereby certify that this dissertation, submitted by Robert L. Miller, confonns to acceptable standards and is fully adequate in scope and quality to fulfill the dissertation requirements for the degree of Doctor of Philosophy. ~ J~ Date Chairperson of Dissertation Committee Jacques Levin, Ph.D. Dissertation Committee Member Dissertation Committee Member Approved: ~---..... Edward Lieblein, Ph.D. Date Dean, School of Computer and Infonnation Sciences School of Compeer and Infonnation Sciences Nova Southeastern University 1996 Certification Statement I hereby certify that this dissertation constitutes my own product and that the words or ideas of others, where used, are properly credited according to accepted standards for professional publications. An Abstract of a Dissertation Submitted to Nova Southeastern University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy Design Information Recovery from Legacy System COBOL Source Code: Research on a Reverse Engineering Methodology by Robert L. Miller October 1996 Much of the software in the world today was developed from the mid-1960s to the mid- 1970s. This legacy software deteriorates as it is modified to satisfY new organizational requirements. Currently, legacy system maintenance requires more time than new system development. Eventually, legacy systems must be replaced. IdentifYing their functionality is a critical part of the replacement effort. Recovering functions from source code is difficult because the domain knowledge used to develop the system is not routinely retained. The source code is frequently the only reliable source of functional information. This dissertation describes functional process information recovery from COBOL source code in the military logistics system domain. The methodology was developed as an information processing application. Conceptual and logical models to convert source code to functional design information were created to define the process. A supporting data structure was also developed. The process reverse engineering methodology was manually applied to a test case to demonstrate feasibility, practicality, and usefulness. Metrics for predicting the time required were developed and analyzed based on the results of the test case. The methodology was found to be effective in recovering functional process information from source code. A prototype program information database was developed and implemented to aid in data collection and manipulation; it also supported the process of preparing program structure models. Recommendations for further research include applying the methodology. to a larger test case to validate findings and extending it to include a comparable data reverse engineering procedure. ACKNOWLEDGMENTS Completing a doctoral dissertation is a long and arduous process that requires extraordinary dedication, commitment, and perseverance. Throughout this effort, support was provided by a group of people to whom lowe a deep debt of gratitude. For without their support, this research project would have never been completed. I am indebted to my dissertation advisor, Dr. Junping Sun, who offered critical encouragement, advice, and criticism. I also wish to thank Dr. Jacques Levin and Dr. Marlyn Kemper Littman, dissertation committee members. They guided me in the transformation of a fuzzy concept into a worthwhile research project. I am grateful to the staff and faculty of the Nova Southeastern University School of Computer and Information Science. They are, without exception, dedicated to assisting the doctoral student. A co-worker, JoAnn Gody) Morley listened patiently and provided valuable ideas and suggestions during preliminary applications of the concepts upon which this research is based. Thanks for your confidence, jody. Many thanks to Heather Pierce, also a co-worker, who read the early drafts and made sure they were written in English. Her help was invaluable in keeping my writing concise and to the point. Finally, my deepest gratitude and appreciation to my wife, Fumiko, for her understanding, tolerance, and support. R.L.M. Table of Contents Abstract iii Acknowledgments IV Table of Contents V List of Tables xiii List of Figures xv Chapter L Introduction 1 Statement of the Problem 1 Barriers and Issues 2 Barriers to Reverse Engineering 2 Issues in Reverse Engineering 3 Importance of the Topic 4 Significance of the Research 7 Definition of Terms 8 Brief History of Reverse Engineering 9 Objectives of the Research 10 Scope of the Research 11 Research Questions Investigated 12 Research Methodology 13 Phase I: Approach Selection 13 Phase II: Methodology Development 13 Phase III: Case Problem Selection 14 Phase IV: Methodology Application 14 Phase V: Methodology Assessment 15 Limitations and Delimitations of the Research 15 Limitations of the Research 15 Delimitations of the Research 16 Contributions of the Research 17 Theoretical Contribution 17 Managerial Contribution 17 Criteria for Success 19 Documented Reverse Engineering Methodology 20 Validated Methodology 20 Program Information Database Conceptual and Logical Data Models 21 Work Load Estimation Metrics 21 Summary 21 n. Review of the Literature 23 Introduction 23 Programming Languages 25 Programming Language Syntax 26 Programming Language Semantics 27 Programming Language Components 28 Job Control Language (JCL) 29 The COBOL Language 30 COBOL ffistory 31 COBOL Structure 32 Nature of COBOL 41 COBOL Disadvantages 43 COBOL Advantages 49 Program Understanding 49 The Meaning of Program Understanding 52 Software Psychology 53 Factors Affecting Program Understanding 55 Domains and Program Understanding 59 Approaches to Program Understanding 61 The Need for Reverse Engineering 63 Legacy Systems 64 Software Aging 66 Business Changes 68 Business Process Reengineering 68 Client/Server Technology 69 Object-Oriented Technology 70 Software Maintenance 70 Reverse Engineering Economics 71 Reverse Engineering 74 Reverse Engineering Objectives 82 The Basis for Reverse Engineering 83 Reverse Engineering Problems 84 Design RecoverylInverse Engineering 90 Existing Reverse Engineering Procedures 91 Software Physical Structure 97 Knowledge-based Program Understanding Tools 98 Transformation Tools 99 Program Plans 99 Data Flow Diagrams 100 Functional Abstraction Tools 100 Computer Assisted Reverse Engineering (CARE) Tools 101 Summary 104 m. Methodology 109 Research Methods Employed 110 Specific Procedures Employed 111 Description of Phase 1 - Reverse Engineering Approach Selection 111 Description of Phase 2 - Reverse Engineering Methodology Development 113 Description of Phase 3 - Case Study Subject Selection 114 Description of Phase 4 - Reverse Engineering Methodology Application 114 Description of Phase 5 - Methodology Assessment 115 Execute the Reverse Engineering Synthesis Plan 115 Problem Definition 116 The Operational Environment 116 Operational Problems 117 COBOL Program Environment 118 A Forward Engineering Model 120 Essential Points in the Forward Engineering Model 121 Information Loss in Forward Engineering 128 A Model of the Reverse Engineering Process 131 Differences in Forward and Reverse Engineering 13 1 Problems Associated with Reverse Engineering 134 A Formal Method of Reverse Engineering - Clean-SpecifY-SimplifY 140 A Structural Approach to Reverse Engineering - Program Schematics 143 A Program Understanding Approach to Reverse Engineering - DESIRE 151 A Data-Oriented Reverse Engineering Technique - Component Extraction 155 A Data Repository Approach to Reverse Engineering - System Description Database 160 Reverse Engineering Approach for Air Force Logistics Systems 164 A Model On-line Program 166 A Model Batch Program 169 A Model Fourth Generation Language (4GL) Program 171 Reverse Engineering Process Output Products 173 Developing the Reverse Engineering Methodology 178 Purpose 178 Scope 178 Strategy 179 Goals 183 The Conceptual Process Model 184 The Conceptual Data Model

Load more