Using Relational Databases in the Engineering Repository Systems
Total Page:16
File Type:pdf, Size:1020Kb
USING RELATIONAL DATABASES IN THE ENGINEERING REPOSITORY SYSTEMS Erki Eessaar Department of Informatics, Tallinn University of Technology, Raja 15,12618 Tallinn, Estonia Keywords: Relational data model, Object-relational data model, Repository, Metamodeling. Abstract: Repository system can be built on top of the database management system (DBMS). DBMSs that use relational data model are usually not considered powerful enough for this purpose. In this paper, we analyze these claims and conclude that they are caused by the shortages of SQL standard and inadequate implementations of the relational model in the current DBMSs. Problems that are presented in the paper make usage of the DBMSs in the repository systems more difficult. This paper also explains that relational system that follows the rules of the Third Manifesto is suitable for creating repository system and presents possible design alternatives. 1 INTRODUCTION technologies in one data model is ROSE (Hardwick & Spooner, 1989) that is experimental data "A repository is a shared database of information management system for the interactive engineering about the engineered artifacts." (Bernstein, 1998) applications. Bernstein (2003) envisions that object- These artifacts can be software engineering artifacts relational systems are good platform for the model like models and patterns. Repository system contains management systems. ORIENT (Zhang et al., 2001) a repository manager and a repository (database) and SFB-501 Reuse Repository (Mahnke & Ritter, (Bernstein, 1998). Bernstein (1998) explains that 2002) are examples of the repository systems that repository manager provides services for modeling, use a commercial ORDBMS. ORDBMS in this case retrieving, and managing objects in the repository is a system which uses a database language that and therefore must offer functions of the Database conforms to SQL:1999 or later standard. Management System (DBMS) and also additional Relational data model was introduced by Codd functions. (1970) and has been extensively studied since then. The repository manager uses custom-built data One notable revision is the Third Manifesto (Date & manager or general-purpose DBMS in order to Darwen, 2000), (Date, 2003). Relational model and manage artifacts. System Arcadia (Taylor et al., relational systems are often criticized because they 1988) is an example of the system that uses custom- are arguably not powerful enough for the repository built object manager. Omega (Linton, 1984), The C system. Information Abstraction System (Chen et al., 1990), In this work we show that these problems are PARSE (Gray, 1997) and CommonKADS repository caused by the shortages of SQL standard and (Allsop et al., 2002) are examples of the repository systems that implement that standard. There exists systems that use a Relational DBMS (RDBMS). analyzes of SQL and comparisons of it with the RDBMS in this case is a system which uses a proposals of the Third Manifesto (Date & Darwen, database language that conforms to SQL:1992 2000), (Pascal, 2000), (Date, 2003) but they don't standard or is even earlier relational database discuss problems in the context of specific language. Object-Oriented DBMSs have been often application areas like engineering repositories. seen as a suitable platform for building repository In this paper we also explain how a relational systems (Bernstein, 1998) (Dittrich et al., 2000). system that follows the rules of the Third Manifesto Object-Relational DBMS (ORDBMS) combine can be used in order to create a repository database. features of the relational data model and object- In this case we wouldn't have problems that are oriented programming languages. An example of an present in current ORDBMSs. early attempt to combine relational and object 30 Eessaar E. (2006). USING RELATIONAL DATABASES IN THE ENGINEERING REPOSITORY SYSTEMS. In Proceedings of the Eighth International Conference on Enterprise Information Systems - DISI, pages 30-37 DOI: 10.5220/0002455800300037 Copyright c SciTePress USING RELATIONAL DATABASES IN THE ENGINEERING REPOSITORY SYSTEMS We adopt definition (Date, 2003) according to Table 1: Problems of the relational model. which the relational data model consists of an Problem Authors who mention extensible collection of scalar types, relation type that problem generators, facilities for defining relation variables It is not powerful, Hardwick and Spooner (relvars) of such types, assignment operations for flexible and expressive (1989),Constantopoulos assigning relation values (relations) to relvars and an enough. et al. (1995). extensible collection of generic relational operators Fragmentation. Data Liu et al. (1996), for deriving relation values from other relation about the object is in Gray (1997). values. the different relations. The rest of the paper is organised as follows. Performance problems Hardwick and Spooner Section 2 contains results of the literature study due to fragmentation. (1989), Gray (1997). about the problems of the relational model and Lack of powerful type Taylor et al. (1988), RDBMSs that hamper their usage in the repository system. Miguel et al. (1990), systems. Section 3 explains how a RDBMS which Emmerich et al. (1992), data model follows the rules of the Third Manifesto Liu et al. (1996), can be used in a repository system. Nowadays much Gray (1997). attention is paid to the ORDBMSs. Section 4 Poor support to data Hardwick (1984), describes problems of the current ORDBMSs that that represents graph Miguel et al. (1990), make their usage in the repository systems more structures. Lack of Katz (1990), difficult. Section 5 summarizes and points to the facilities for making Emmerich et al. (1992), future work with the current topic. queries based on such Gray (1997), data including finding Lange et al. (2001). transitive closure. 2 LITERATURE STUDY Inappropriate Hardwick and Spooner transaction models for (1989) Next we classify problems of the relational model the engineering and RDBMSs based on the literature study and systems. analyze these problems in terms of the Third Detailed semantics of Engle (2003) Manifesto. Researchers and developers often present the relvars have to be these problems as reasons why relational database is captured outside the not the best choice to use in the engineering systems. relational system. Table 1 presents problems of the relational model Lack of possibility to Zhang et al. (2001) that have been identified by the researchers. preserve semantics of Some issues that have been raised by the the relationships. researchers are actually orthogonal to the relational model. We adopt the approach taken by Date and Fragmentation increases complexity to the user Darwen (2000, p. 21): "The question as to what data of database according to Gray (1997). Virtual relvars types are supported is orthogonal to the question of (views) help to overcome this problem in the system support for the relational model." Even Codd (1970) that follows the rules of the Third Manifesto. View- acknowledged possibility of the nonsimple domains defining expression can join values of relvars that which permitted values are relations. One reason contain information about the object. It can have why he argues for eliminating nonsimple domains is relation-valued attributes which values are that they require more complicated data structures at calculated using relational operator GROUP that the storage level than simple domains. Transaction provides relation "nest" capability (Date & Darwen, model is also orthogonal to the relational model. 2000). Gray (1997) writes that fragmentation may Date and Darwen (2000) have requirement for cause performance problems. But "performance is nested transactions in the section of the Other fundamentally an implementation issue, not a model Orthogonal Prescriptions. issue." (Date, 2005) Hierarchic and networked information can be Semantics of the relvar that is understandable to represented relationally (Pascal, 2000, chap. 7) Issue the human user is specified by the external predicate of making queries based on data that represents of the relvar (Date & Darwen, 2000, p. 179). It could graph structure is addressed in the Third Manifesto. well be recorded in the catalogue of the database that Relational Model Very Strong Suggestion no. 6 must be part of the database that it describes. (Date & Darwen, 2000, p. 213) requires that Semantics of the data that is understandable to the relational language should provide shorthand for system is represented by the internal predicate of the expressing generalized transitive closure query. relvar (Date, 2003, p. 262). Constraints to the value 31 ICEIS 2006 - DATABASES AND INFORMATION SYSTEMS INTEGRATION of relvar specify internal predicate. Therefore 3 USING RELATIONAL DBMS IN RDBMS should provide means for defining tuple- THE REPOSITORY SYSTEM attribute-, relvar- and database constraints. Semantics of the relationship determine constraints that are used in order to implement this A repository system permits management of artifacts relationship and operations that may be performed that are created using some language that belongs to with the data that participate in the relationships the set of its supported languages. Repository system (Zhang et al., 2001). Relational language may should allow to add new languages to this set in contain shorthand statements that cause creation of order to be most useful. Abstract syntax of the the database objects that implement particular type language can be specified using metamodel