Domain Ontology Based Class Diagram Generation from Functional Requirements

International Journal of Computer Information Systems and Industrial Management Applications. ISSN 2150-7988 Volume 6 (2014) pp. 227 - 236 © MIR Labs, www.mirlabs.net/ijcisim/index.html. Dynamic Publishers, Inc., USA

Jyothilakshmi M S and Philip Samuel
Information Technology, School of Engineering, Cochin University of Science and Technology, Kochi-22, Kerala, India

Abstract: Domain ontology formally represents knowledge as a set of concepts within a domain, and the relationships among those concepts. This paper proposes a method to generate a class diagram from a functional requirement specification written in natural language, using domain ontology and natural language processing techniques. The major steps in this method are the identification of nouns and verbs in the requirement specification statements and linking them with the ontology. From the ontology we get information about core concepts and the relationships among those concepts. Thus domain ontology helps in the identification of classes, attributes and relationships for the particular domain for which the system is to be developed.

Keywords: Domain Ontology, Natural language processing, UML Class Diagram.

I. INTRODUCTION

Requirements are generally expressed in the form of natural language statements [1]. The analysis and design process covers two major phases of the software development life cycle. The two phases often overlap due to the recursive nature of requirements. The process usually starts with getting know-how of customer business processes and existing systems, understanding customer needs, expectations and constraints, and eliciting requirements. Requirements conflicts are removed, and issues and concerns are addressed to develop a clear understanding of customer requirements. These requirements are then used to define the functionality of the proposed system and to determine what the system is intended to do. System functionality and architecture are documented and, after internal review, communicated to the customer for validation. This document then matures as a result of customer feedback and serves as the basis for further development. In a manual requirement analysis and design process, software engineers spend a lot of time.

A class diagram in the Unified Modeling Language (UML) is a type of static structure diagram that describes the structure of a system by showing the system's classes, their attributes, operations (or methods), and the relationships among the classes. It is the main building block of object oriented modeling and design. In the diagram, classes are represented with boxes which contain three parts. The upper part holds the name of the class, the middle part contains the attributes of the class, and the bottom part gives the methods or operations the class can take or undertake. The relationship among the classes can be association, dependency, aggregation, generalization or composition. In the design of a system, a number of classes are identified and grouped together in a class diagram, which helps to determine the static relations between those objects.

The aim of this paper is to automate the extraction of class diagrams from functional requirement specifications written as natural language statements, and thus to reduce the problems associated with manual extraction. In this approach we utilize the benefits of domain ontology. An ontology is a “formal explicit specification of a shared conceptualization” [2]. It formally represents knowledge as a set of concepts within a domain and may be used to describe the domain. All the core concepts and their relationships are represented in the ontology in a formal way so that we can automate the process of understanding the domain specific terms. Formal languages used to construct ontologies are known as ontology languages. Web Ontology Language (OWL) is one of the most recent developments in standard ontology languages from the World Wide Web Consortium (W3C). In our method the ontology is provided in OWL format. We can identify the classes, attributes and inter-class relationships by comparing the terms in the ontology with those in the functional requirement specification. The ontology also helps us to identify certain domain specific terms which are not explicitly mentioned in the functional requirements. When the ontology cannot identify the concepts or terms in the natural language statements, we use natural language processing techniques to extract the classes, attributes and inter-class relationships.

Our paper is organized as follows. Section 2 analyzes previous works on class diagram generation from natural language requirements. Section 3 is the description of the proposed approach. Section 4 is the conclusion.

II. RELATED WORKS

A few approaches have already been developed for the extraction of class diagrams from natural language requirement specifications. In this section we survey those works.

A. Auto-generation of Class Diagram from Free-text Functional Specifications and Domain Ontology by Xiaohua Zhou and Nan Zhou

Zhou and Zhou [3] propose a methodology that uses NLP and domain ontology. In this method some initial processing of the natural language requirements is done with the help of some linguistic patterns. Also, in the domain ontology each concept is provided in a specific format as input, and it is used to refine the result.

The approach applies an NLP-based spider model to search for classes of interest, that is, they identify core classes of the domain as a starting point, and further find classes that are related to the identified ones. The major components of the approach are described below.

Domain Ontology serves as domain knowledge to improve the performance of conceptual modeling.

Candidate Class Identification module outputs Preliminary Candidates and Refined Candidates, which serve as input to the spider model. The part-of-speech (POS) tagger and sentence parser produce preliminary candidates. Further, a word sense disambiguation technique and a semantic network are employed to produce refined candidates, which are semantically related to the concepts defined in the ontology.

Relationship Identification module is the most important component of the spider model. It uses linkage distance to identify all concept pairs with strong semantic association within a sentence. It then assigns a different weight to each concept pair to indicate how strong the connection is, according to the constituents the concepts serve as in the sentence.

Attribute Identification module distinguishes attributes from classes. After two concepts are found to be strongly associated with each other, it identifies whether each concept stands for a class or an attribute of a class.

Naming Relationship module applies linguistic patterns to find aggregation and generalization relationships. It uses a numeric relationship understanding technique to distinguish one-to-one, one-to-many and many-to-many associations.

Parallel Structure uses the features of English sentence structure to help elicit more attributes of a class, more child classes of a super class, and more aggregation parts of a class.

Throughout this method NLP techniques are applied. A part-of-speech (POS) tagger and a sentence parser are used to produce preliminary class candidates, while word sense disambiguation (WSD) and a semantic network are employed to refine the preliminary candidates. Linkage distance produced by the Link Grammar Parser is used to find concept pairs with strong semantic association within a sentence. Different weights are assigned to each concept pair according to the linguistic patterns the concept pair belongs to. Linguistic patterns are used to evidence the class-attribute relationship, aggregation relationship and generalization relationship, and to distinguish three types of association relationships. Parallel structure serves as a clue to find more attributes of a class, more child classes of a super class, and more parts of a composite class.

Secondly, this approach integrates domain ontology with class generation for the first time. The domain ontology contains important or core domain knowledge. The ontology feeds the system important classes and their attributes. However, this method needs a set of linguistic patterns as input.

B. Class diagram extraction from textual requirements using Natural language processing (NLP) techniques by Mohd Ibrahim, Rodina Ahmad

Mohd Ibrahim and Rodina Ahmad [4] propose a methodology that also uses NLP and domain ontology. The workflow of this method is as follows. From the requirements document, stop words are identified and stored in a list. With the help of taggers, sentences are tagged, and thus nouns, noun phrases and verbs are identified. Then, by applying the RACE stemming algorithm, the root form of each word is found and stored. Finally, by applying Class Identification Rules, Attribute Identification Rules and Relationship Identification Rules, the components are identified. The rules are given below.

1) Class Identification Rules:

C-Rule1: If a concept occurs only once in the document and its frequency is less than 2 %, then ignore it as a class.

C-Rule2: If a concept is related to the design elements, then ignore it as a class. Examples: “application, system, data, computer, etc…”

C-Rule3: If a concept is related to Location name, People Dependency
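
As an illustration only, the first two class identification rules quoted above (C-Rule1 and C-Rule2) could be sketched in Python as follows. This is not the implementation from [4]. The function name `filter_class_candidates`, the `DESIGN_ELEMENTS` set and the reading of "frequency" as a concept's share of all concept occurrences in the document are assumptions made for this sketch. The input is assumed to be the list of concept occurrences already produced by tagging and stemming.

```python
from collections import Counter

# Hypothetical design-element vocabulary for C-Rule2, seeded only with the
# examples quoted in the text; a real list would be longer.
DESIGN_ELEMENTS = {"application", "system", "data", "computer"}


def filter_class_candidates(concepts, min_share=0.02):
    """Return the concepts that survive C-Rule1 and C-Rule2, sorted by name.

    `concepts` is a list with one entry per occurrence of a concept.
    "Frequency" is interpreted here (an assumption) as the concept's share
    of all concept occurrences in the document.
    """
    counts = Counter(c.lower() for c in concepts)
    total = sum(counts.values())
    kept = []
    for concept, n in counts.items():
        # C-Rule1: a concept seen only once, with a share below 2 %, is ignored.
        if n == 1 and n / total < min_share:
            continue
        # C-Rule2: concepts that name design elements are ignored.
        if concept in DESIGN_ELEMENTS:
            continue
        kept.append(concept)
    return sorted(kept)


if __name__ == "__main__":
    occurrences = ["customer"] * 30 + ["order"] * 25 + ["system"] * 4 + ["receipt"]
    print(filter_class_candidates(occurrences))
```

In this example "receipt" occurs once out of 60 occurrences (below the 2 % threshold) and is dropped by C-Rule1, while "system" is dropped by C-Rule2. In a very short document a single occurrence can still exceed the 2 % share, in which case C-Rule1 as worded does not remove the concept.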
