Rule-Based Detection of Design Patterns in Program Code

Software Tools for Technology Transfer manuscript No. (will be inserted by the editor) Rule-based Detection of Design Patterns in Program Code Awny Alnusair1?, Tian Zhao2, Gongjun Yan3 1 Indiana University, Kokomo, IN 46904, USA, e-mail: [email protected] 2 University of Wisconsin-Milwaukee, Milwaukee, WI 53201, USA, e-mail: [email protected] 3 University of Southern Indiana, Evansville, IN 47712, USA, e-mail: [email protected] Received: date / Revised version: date Abstract. The process of understanding and reusing was popularized by the Gang-of-Four (GoF) [15]. Since software is often time-consuming, especially in legacy then, patterns have been widely used for documenting code and open-source libraries. While some core code and structuring new software libraries 1 as well as com- of open-source libraries may be well-documented, it is prehending and re-engineering existing legacy software. frequently the case that open-source libraries lack in- During development, design patterns can be used to formative API documentation and reliable design infor- understand the libraries being reused and to speed-up mation. As a result, the source code itself is often the the development process by providing tested design tem- sole reliable source of information for program under- plates that can be used to solve design dilemmas. On the standing activities. In this article, we propose a reverse- other hand, during the process of refactoring and main- engineering approach that can provide assistance during tenance, design knowledge can provide a better glimpse the process of understanding software through the au- into design intent and thus establishes a better founda- tomatic recovery of hidden design patterns in software tion for the high-level view of the structure, the organi- libraries. Specifically, we use ontology formalism to rep- zation, and the interaction between the components of resent the conceptual knowledge of the source code and the system being maintained. semantic rules to capture the structures and behaviors of the design patterns in the libraries. Several software Unfortunately, for some legacy systems and large open- libraries were examined with this approach and the eval- source libraries, such design knowledge is not richly doc- uation results show that effective and flexible detection umented, which leaves the source-code itself as the only of design patterns can be achieved without using hard- source for capturing design intent. Therefore, the soft- coded heuristics. ware engineering community has been interested in providing effective reverse-engineering approaches to ana- lyze source code and to recover the lost design rationale that can be depicted in the form of design patterns. Key words: Design Patterns { Design Recovery { Soft- To that end, we propose an effective and usable reverse- ware Maintenance { Ontology Formalisms { Knowledge engineering approach that is based on semantic tech- Representation { Semantic Inference niques to detect design pattern instances from source code. This approach provides a formal, explicit, and semantic- based representation of the conceptual knowledge of source 1 Introduction code. It is solely reliant on ontology modeling and semantic- based techniques from the emerging field of Semantic Design patterns provide reusable solutions to common Web [4]. The fundamental hypothesis that we explore in object-oriented design problems. They are viewed as a this article is that representing software knowledge using natural way of decoupling the bindings between system ontology formalism and semantic techniques is effective components so that systems can be more understand- in improving the precision of recovering design pattern able, scalable, and adaptable [7]. The concept of de- information. sign patterns in the context of object-oriented design ? Corresponding author. Awny Alnusair was partially supported 1 Although they have a few distinguishing features, we use the by the Indiana University Kokomo fellowship research grant terms `library', `framework', and `software system' interchangeably 2 Awny Alnusair et al.: Rule-based Detection of Design Patterns in Program Code 1.1 Semantics-enabled Design Recovery { We present updated ontology models that include an enhanced source-code ontology as well as new design Semantic Web technologies have been a great asset in pattern ontologies. providing solutions for various domain specific problems. { We include additional support for detecting a wider Ontologies are the backbone technology for formal and range of design patterns. explicit representation of knowledge in the Semantic Web. { We enhance the process of knowledge population with Due to their formal reasoning foundation, ontologies can a new parser that captures additional aspects of source- play an important role in domain engineering. One can code that allows higher levels of source-code analysis. use ontologies to structure and build a source-code knowl- { We present a more detailed performance evaluation edge base that can be used by software agents and serve and empirical case studies using a new tool imple- as an effective base for semantic queries [13,26]. mented as a plugin for the Eclipse IDE. This research contributes to the proper linking of the established field of program understanding with the In the following section, we provide some background emerging field of Semantic Web by developing ontol- for understanding the proposed methodology. In partic- ogy models for the problem of design pattern recov- ular, we introduce the concepts, the ontology languages, ery and semantic knowledge-representation. Our ontol- and Semantic Web technologies that are employed in our ogy model includes a Source Code Representation Ontol- approach to program understanding and design recovery. ogy (SCRO). This ontology is created to provide an ex- Furthermore, this section provides a detailed discussion plicit representation of the conceptual knowledge struc- of our source-code ontology model and the process that ture found in source code. SCRO provides an effective we use to instantiate this model and generate the knowl- base model for understanding the relationships and de- edge base. pendencies among source-code artifacts in a software system. Therefore, SCRO serves as a basis for design 2 Ontology Modeling for Design Pattern pattern recovery and can be utilized by other applica- Recovery tions that require semantic knowledge at the source code level. Since SCRO is primarily used for building a software knowledge base, a set of parsing tools have been The research presented in this article seeks to leverage developed to instantiate the ontology with ontology in- program understanding and design recovery using on- stances that represent source code artifacts. tological modeling with the help of various Semantic In the context of design pattern recovery, we have Web techniques. Central to our approach is an ontol- also designed and developed a design pattern ontology ogy model that captures heterogeneous sources of con- sub-model. This sub-model extends SCRO's vocabulary ceptual knowledge found in source code by modeling the and includes an upper design-pattern ontology that is most important aspects of its internal structure. This further extended with a specific ontology for each indi- kind of modeling serves as the basis for providing ef- vidual design pattern. This setup provides flexibility and ficient and common access to relevant information re- transparency during the pattern detection process. The sources. Once the ontology model is formally defined and state-of-the-art tools in design pattern recovery hard- represented using an appropriate taxonomy, the model code the descriptions and roles of individual design-pattern needs to be automatically populated with ontological in- participants. This limits the usability and flexibility of stances. These instances are the basic building blocks for such tools since users have no control over these descrip- constructing and populating a knowledge base that can tions. In contrast, using ontology representation, design be accessed through semantic queries. The following sub- patterns can be specified externally within their respec- sections provide an overview of ontology modeling and tive ontologies and their participants are depicted us- discuss the process of structuring and building a seman- ing semantic rules that can be easily relaxed or for- tic knowledge base for source-code artifacts. tified by users according to their needs. Therefore, we further hypothesize that ontology representation of soft- 2.1 Ontologies and the Semantic Web ware knowledge enables a flexible mechanism of design recovery through rule relaxation, which may detect more Ontology is a term originally used in Philosophy. It de- pattern instances. scribes the nature of being or the kinds of existence and Our earlier work [1] explores the notion of using se- their basic categories and relationships. Computer scien- mantic techniques for design pattern recovery. This work tists have borrowed this term and used it in many differ- delves into the details and extends the previous work in ent areas, including Knowledge Engineering, Database the following primary directions: Theory, Artificial Intelligence, Software Engineering, and { We provide a refined description of the proposed ap- Information Retrieval and Extraction. More recently, this proach that further investigates the issue of ontology term has been exploited by the Semantic Web commu- modeling and how it can be properly used to

Load more