Design, Construction and Implementation of a Web-Based Database System for Tumor Suppressor Genes
DESIGN, CONSTRUCTION AND IMPLEMENTATION OF A WEB-BASED DATABASE SYSTEM FOR TUMOR SUPPRESSOR GENES

By
YANMING YANG

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA
2003

Copyright 2003 by Yanming Yang

ACKNOWLEDGMENTS

I would like to express my gratitude to Dr. Li M. Fu, my major adviser, for his guidance in the establishment of the research project and advice on my study progress. My thanks also go to Dr. Mark Yang of the Department of Statistics and Dr. Donald McCarty of the Department of Horticultural Science for serving as my committee members and for their suggestions for finalizing the thesis. I would like to thank my wife, Xidan Zhou, and my daughters, JingRu and Kathleen, for their support in my personal life, their encouragement of my studies, and their sharing of frustration and happiness with me.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
TABLE OF CONTENTS
LIST OF FIGURES
ABSTRACT

CHAPTER

1 INTRODUCTION
  1.1 Internet and Database Development
  1.2 On-line Gene Databases
  1.3 Tumor Suppressor Genes in Human Cancer
  1.4 Necessity and Significance of TSGDB
  1.5 Thesis Organization

2 UNDERLYING TECHNOLOGIES FOR DBMS
  2.1 Distributed Database Systems
    2.1.1 Distributed Environment Architecture
    2.1.2 Distributed Database Implementing Approaches
    2.1.3 Pros and Cons of Distributed DBMSs
  2.2 Database Models
    2.2.1 Hierarchical Model
    2.2.2 Network Model
    2.2.3 Relational Model
    2.2.4 Object Model
    2.2.5 Object-relational Model
    2.2.6 Semistructured Model
    2.2.7 Associative Model
    2.2.8 Context Model
  2.3 Web-based DBMS Applications
    2.3.1 Client/Server Architecture
    2.3.2 Java and Web-based Application

3 DATA ACQUISITION AND WEB SITE CONSTRUCTION
  3.1 Data Acquisition
    3.1.1 Online Search for TSGs
    3.1.2 Gene Feature Selection
  3.2 Web Page Construction
    3.2.1 TSGDB Homepage Creation
    3.2.2 Construction of Individual Gene Web Pages

4 IMPLEMENTATION OF TSGDB WITH RELATIONAL MODEL
  4.1 Relational Model Concept
  4.2 Implementation of A Standalone Database
    4.2.1 Table Creation
    4.2.2 Data Treatment and Bulk-loading
    4.2.3 SQL Manipulation

5 IMPLEMENTATION OF A WEB-BASED DATABASE SYSTEM
  5.1 Building Web-based Information System
    5.1.1 System Architecture
    5.1.2 Using Servlet as A Middle-layer Application
  5.2 Query Mechanism and Result Formatting
    5.2.1 Transferring User Inputs to Queries
    5.2.2 Establishing Connections to TSGDB
    5.2.3 Querying Database and Formatting Results

6 CONCLUSIONS AND FUTURE WORK

APPENDIX UTILITY PROGRAMS AND HUMAN GENE FUNCTIONS
  A.1 ServletUtilities
  A.2 TSGDBUtilities
  A.3 Human TSGs and Their Functions

LIST OF REFERENCES
BIOGRAPHICAL SKETCH

LIST OF FIGURES

2-1 Three tier client/server architecture
3-1 Sample query result of the NCBI nucleotide database
3-2 TSGDB home page showing the major functions of the database
3-3 JavaScript and form method for pull-down window function
3-4 A sample Web page for a tumor suppressor gene
4-1 TSG schema
4-2 SQL statements defining the TSG schema
4-3 Bulk loader control file
4-4 Sample result of an SQL query to TSGDB
5-1 Architecture of Web-powered TSGDB
5-2 Partial code of the interface servlet program
5-3 A sample query input for the TSGDB
5-4 A sample Web page displaying the result of a typical query

ABSTRACT

Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

DESIGN, CONSTRUCTION AND IMPLEMENTATION OF A WEB-BASED DATABASE SYSTEM FOR TUMOR SUPPRESSOR GENES

By
Yanming Yang
May 2003

Chairman: Li M. Fu
Major Department: Computer and Information Science and Engineering

The rapid growth of Internet technology has revolutionized every area of the information world. The Web-based informatics system, as one of the most emphasized research and application fields, has experienced a booming development ever since the emergence of Internet technology. Information trafficked through the Web is involved in almost every aspect of modern society. One of the most remarkable benefits brought about by Internet information exchanges is the application of numerous databases. On the Web, many subject-specific databases have been developed, such as those for genome sequences and specific diseases.
However, a database as a comprehensive information source for tumor suppressor genes has not come into being, although those genes are extremely