On Systemics, C~B Emetics and Informatics
Total Page:16
File Type:pdf, Size:1020Kb
The 6th World Mailti.conference on Systemics, C~b _emetics and Informatics July 14-18,2002 r Orlando, Florida, USA Volume VII Information Systems Development II Organized by IIIS EDITED BY International N agib Callaos 1 Institute of John Porter Informatics and N aphtali Rishe Systemics Member of International Federation of Systems Research IFSR Semantic Design and Oracle Implementation of the Vegetation Database for the Everglades National Park• Naphtali D. RISHE, Maxim V. CHEKMASOV, Marina V. CHEKMASOVA, Natalia Y. TEREKHOVA, Anatoliy A. ZHYZHKEVYCH, Gustavo PALACIOS High Perfonnance Database Research Center School of Computer Science Florida International University Miami, Florida 33199, U.S.A. ABSTRACf This dataset is closely related to the plant species classification dataset. The latter contains taxonomy Several environmental studies have been conducted in information on the vegetation: categories, families, South Florida, which are related to vegetation patterns in genus, and species of the plants to be found in South that area. These studies have accumulated valuable data Florida. As we see below the important feature of this on plant species with the goal to use this data in further dataset is introduction of a unique 16-character code, biological analysis and research. To facilitate this, we derived from the first four letters of the genus, specific have designed and implemented the vegetation database. epithet, and infraspecific name (if applicable) for each Semantic modeling was chosen as an efficient and plant species in classification. flexible approach to the database design. The Oracle relational database management system was chosen as The solution holes dataset is related to the study of land the platform to implement and maintain the system. In coverage with ·plants on several geographical locations the present paper we discuss the vegetation data sets, (sites) in the ENP. At each site several holes were database requirements, design and implementation. We artificially prepared for the study. The resulting dataset also describe the tool built in-house to automatically map contains physical parameters of the holes like soil depth a conceptual semantic database design into a physical as well as hole coverage information with live and dead relational database\. design. plants. Keywords: Environmental study, semantic schema, Length measurements for plants at the Dry Tortugas relational database. National Park is yet another dataset to be stored in the database. Several transects were prepared in the Park to conduct measurements. Each transect is divided into 1. VEGETATION DATASETS seven segments. The dataset records the starting/ending dates of the plant measurements and the heights of the The datasets to be loaded and maintained in the plants. vegetation database vary greatly in their periods of data collection, geography, methods used by the researchers, The study conducted on the plots in Taylor Slough and the ways the datasets were formally presented for resulted in a separate dataset. Unlike the previously data loading. We will briefly describe the major datasets. described dataset, the plots here were setup using other methodology. The plot records contain data on the The herbarium dataset contains information on the plants physical parameters of the local area like water depth as that are displayed in the Everglades National Park's well as percentage of coverage of the plots with the (ENP) herbarium. This collection of plants is constantly vegetation. growing. The idea behind having this dataset in the database is to facilitate fast and easy search of the A special study is conducted to monitor restoration specimens in the herbarium and to check the availability efforts in the ENP. The study produces the so called of a particular plant in the collection. At this time the 'hole-in-the-donut' dataset. The measurements are dataset has no digital photos of the herbarium plants, but conducted on a set of plots, each being precisely gee the photos may be added at a later stage. located and referenced to the neighboring plots. Record information includes physical parameters and plant 'This research was supported in part by NASA (under grants NAGS-9478, NAGW-4080, NAGS-5095, NASS-97222, and NAGS-6830), NSF (CDA-9711582, IRI-9409661, HRD-9707076, and ANI-9876409), ONR (N00014-99-1-0952), the Department oflnterior, and the FSGC. 432 conditions. Additionally plant phenology data and tree technically treated as subsets of the semantic model. islands' study results are included. Since the restoration Therefore, the concepts and languages of the semantic effort in the Park has a complex effect or the whole area, model may apply to them. the study also produced data on soil and water depth, as well as wildlife monitoring records. For our project, semantic modeling may be considered as a tool for database design in the relational database Finally, exotic plants treatment using herbicides is model. The top-down relational database methodology regularly conducted in the ENP. The data on site visits, may be presented as follows. First, we analyze and plant treatments, personnel, and herbicides used are specify our project semantically. This produces a concise, combined to form the exotic plants treatment dataset. flexible, user-oriented specification of the vegetation database, unconstrained by computer-oriented concerns. 2. USER REQUIREMENTS TO THE DATABASE In the second stage this specification is converted into a relational database schema with all its integrity From the previous section it is clearly seen that the constraints. The semantic description, which we call the vegetation studies differ greatly in the data objects semantic schema, remains as a high-level documentation collected and the methodology used to conduct the of the database. studies. One of the requirements for the database was to combine all the data in a single database. To create a Semantic modeling for the database project starts with the database design where all the data components are identification of the objects. An object may be any item logically related between themselves is a challenging task in the real world. It can be either a concrete object in this case. However we were allowed to choose the (value) or an abstract object (non-value). A category is methodology to create the design of the vegetation a concept of the database design that is a unary property database which proved to be helpful in accomplishing the of objects. In other words, a category describes a set of task. objects which possess the property. Abstract categories are categories whose objects are always abstract, Another requirement was to implement the system using concrete categories are the ones whose objects are Oracle RDBMS (relational database management always concrete. A binary relation is a concept of the system), (1]. Oracle was chosen because the clients have database design that is a binary property of objects. It has already used it for their database needs and possess a the meaning of relationship or connection between two license for this software package. The database server is objects. Attributes are binary rel~ions whose range is a installed under a UNIX operating system, whereas most concrete category. These concepts are used in the of the database users will use a Microsoft Windows semantic design of the database, which is represented by platform for their work. the semantic schema. We refer to [2] for a thorough discussion of the semantic modeling approach. Certain user requirements are related to the applications to be constructed to facilitate the users' needs in querying For visual presentation and printouts, the semantic the database, saving the query results, importing the schema may be depicted as a set of diagrams called sub results into third-party software tools, and making schemas of the semantic schema. A category is displayed modifications to the data. It is not expected that an on the schema as a box with the category name in average database user, being a specialist in biology and uppercase bold letters. Attributes appear inside the environmental sciences, will possess skills in database corresponding category boxes in italic letters. Relations query languages such as SQL (structured query language) between the categories are denoted by arrows. The names or programming. It was decided to create a set of intuitive of the relations appear in bold letters. Cardinality of the and easy-to-use forms to assist clients in their work with relations as well as additional constraint information is the data utilizing the software tools provided in the placed on the semantic schema in italics. Oracle package. Figure presents a sub-schema for taxonomic Some user requirements are related to data maintenance. classification of plant species fl)>m the semantic schema In particular, new data has to be regularly uploaded to the for the vegetation database. Categories CATEGORY, database, preferably automatically. FAMILY, GENUS, and SPECIES store the taxonomy structure for plant species. In particular, each species 3. SEMANTIC SCHEMA FOR THE DATABASE belongs to a certain genus, which in tum is part of tlte family, and families comprise a category. We also store The semantic modeling approach was chosen for the information on the subspecies in category SPECIES, database design. This approach allows the unification of with the relation is-part-of linking subspecies to the different database models into one framework. Most of corresponding species record. The relations, the concepts and languages may be presented in terms of corresponding to the above-mentioned categories, namely a unifying semantic model. The other models may be the-genus, the-family, the-category, are marked as total, 433 indicating mandatory status of the relationships the categories PLANT and SPECIES has to be changed infonnation. Thus each plant avoiding reload of the whole record for this plant. Finally, the category SYNONYMS was introduced to store the alternative names for the plant species used in FAMILY the works of some researchers.