
Geographical Bioinformatics Systems

Tariq Rahim Soomro, Fujairah College, Fujairah, UAE, [email protected]
Ghassan Al-Qaimari, Fujairah College, Fujairah, UAE, [email protected]
Hassan Wahba, Al-Ain University of Science & Technology, Al-Ain, UAE, [email protected]

Abstract

Bioinformatics refers to the use of computers to handle biological information, that is, the use of computers to store, compare, retrieve, analyze and predict the composition or structure of biomolecules. On the other hand, the term Geographic Information Systems (GIS) refers to computerized database management systems for the capture, storage, retrieval, analysis, and display of spatial data. GIS is no longer restricted to the fields of science, geography, cartography or environmental science; nowadays GIS tools and techniques are successfully used and implemented in many other fields of life, such as advertising, agribusiness, banking, entertainment, estate management, finance, food services, health services, insurance, manufacturing, pharmaceutical services, retail and utilities, social services, travel services and many more. This paper explores the merger of these two distinct fields, GIS and Bioinformatics, and also explores the prospects for research and development in these two areas.

1. Introduction

In this paper we use the term Geographical Bioinformatics Systems (GBS) to describe the merger between GIS and Bioinformatics. Geographical Information Systems (GIS) refer to a complex technology, starting with the digital representation of landscapes captured by cameras, digitizers or scanners, in some cases transmitted by satellite, and then, with the help of computer systems, storing, checking, manipulating and enhancing images to be analyzed and displayed as data referenced to the earth. GIS today is a well-established field with many practical applications. The power of GIS lies not only in the ability to visualize spatial relationships, but also, beyond space, in a holistic view of the world with its many interconnected subsystems and complex relationships [1]. Bioinformatics, on the other hand, is an exciting and rapidly growing field that uses computer technology to study biological information and structures. More specifically, it is the science of developing computer databases and algorithms to facilitate and expedite biological research, particularly in genomics [2]. In simple terms, it is an application of computer technology to the management of biological information. Computers here are used to gather, store, analyze and integrate biological and genetic information.

Geographical Bioinformatics Systems (GBS) represent the marriage between two distinct and well-established fields, GIS and Bioinformatics. The collaboration of these two fields was initiated by the Office of GIS and Remote Sensing (OGIS) and the Virginia Bioinformatics Institute (VBI) in 2001, when a conference focused on the interface between GIS and Bioinformatics was organized to study the prospects of research in these two fields. The participants of the conference strongly endorsed the merger between the two fields, as GIS and Bioinformatics have much in common, most notably digital maps, large databases, and research involving visualization, pattern recognition, and, most significantly, analysis of data [3].

This paper explores the future research in this new multidisciplinary field. The paper is organized as follows: section 2 reviews the applications of GBS; section 3 highlights future research trends in GBS; section 4 discusses future directions and concluding remarks.

2. Applications of GBS

The authors propose the term Geographical Bioinformatics Systems (GBS) to describe a new field of research and development that utilizes expertise from two well-established fields, GIS and Bioinformatics. The term GBS refers to the management of biological information using spatial and temporal maps and databases to collect and study complex patterns and analyze their effects. Researchers in [3] agreed to use GIS applications to find and track large patterns, for example, the geographic distribution of diseases in humans, animals and plants. They are also exploring the potential of using GIS to visualize and enhance the presentation of Bioinformatics data, and to identify the opportunities Bioinformatics presents to the GIS research community. From another perspective, they are also exploring the use of Bioinformatics methodologies in GIS, aiming at enhancing current GIS techniques and identifying new approaches to pattern recognition and data analysis that could be used specifically for GBS purposes.

The following is a short review of GIS applications that analyze and visualize Bioinformatics data to help Bioinformatics experts study biological phenomena. These applications represent the solution for various Bioinformatics databases from GIS perspectives.

Communications of the IBIMA Volume 5, 2008 238 Tariq Rahim Soomro, Ghassan Al-Qaimari and Hassan Wahba

READ-IT (Research, Education, and Dissemination via Information Technology) is a Bioinformatics/GIS program aimed at protecting chimpanzees. It is a system for the integration of technology, research and education, through a project designed to study and protect chimpanzees in Tanzania. This research is designed to develop “interesting and compelling content for integrated research and educational opportunities for undergraduate, graduate and professional students at Virginia Tech.” Some of those students are involved with the “Bush to Base Bioinformatics” and the GIS components of the program. In this program, field researchers gather physiological data on the chimps and code the data using handheld modules with Global Positioning Systems (GPS) capability. The modules are designed to interface with a Virginia Tech-based Web-enabled server designed to share information with authorized users [4].

Micro-Mar is a database designed especially for dynamic representation of marine microbial biodiversity. It was created to collect DNA diversity information from marine prokaryotes for bio-geographical and ecological analysis. The database aims to integrate molecular data and taxonomic affiliation with bio-geographical and ecological features, which will allow experts to have a dynamic representation of marine microbial diversity embedded in a user-friendly web interface. In this program, the GIS option provides an interface for selecting a particular location on the world map and getting all the sequences from that location, together with their details [5].

GenoSIS (Genome Data Interpretation Using GIS) is a genome spatial information system based on the idea of “spatial genomics”, applying concepts and tools of spatial analysis and GIS to the interpretation of genome data. It uses “off the shelf” GIS software, for example, ESRI’s ArcGIS and Oracle Spatial, to reuse existing spatial analysis, classification, querying, and visualization tools for genome data analysis [6].

GEOBASE is a simple geographical information system designed for personal computers. This system allows representation and elementary analysis of geographically coded information, and handles two kinds of data: maps and facts, where map data describe the basis on which the fact data are located [7].

Gene Information System is a biomedical text-mining system for gene information discovery. It is focused on four types of gene-related information: biological functions, associated diseases, related genes and gene-gene relations. The aim of this system is to provide researchers with a user-friendly bio-information service [8].

Fungi is designed at University of Michigan to explore spatial relationships between specimen data and other data, using a GIS as one of the bioinformatics methods. This project makes it possible to create maps showing the relationship between the location of specimens and political units (states and counties of the USA), major highways, major lakes and rivers, and federal land ownership [9].

Map Viewer is a tool designed at the National Center for Biotechnology Information (NCBI) that allows a user to view an organism’s complete genome, integrated maps for each chromosome (when available), and/or sequence data for a genomic region of interest [10].

MultiStore is a research infrastructure designed at University of Buffalo for supporting integrated research in specific targeted areas of computer science, including Multimedia, Visualization, GIS and Bioinformatics. The research objective is to develop computational theories and algorithms for storing, managing, analyzing, querying, and visualizing multi-dimensional data sets that are generated from the related fields [11].

ETI Projects: ETI Bioinformatics, in Amsterdam, is an NGO in operational relations with UNESCO, participating in two GIS projects funded by NWO, the Dutch national science foundation. The first project records and identifies the impact of global change on the biological diversity of the North Sea, and the second project monitors climate changes in the Indonesian coral reef biotas. For these projects, ETI is building databases, programming queries and preparing maps and map layers, using the MapIt software package [12].

The majority of the above applications in Bioinformatics are in the form of sequence analysis (e.g. DNA), genome annotation, computational evolutionary biology, measuring biodiversity, analysis of gene expression, analysis of regulation, analysis of protein expression, analysis of mutations in cancer, prediction of protein structure, comparative genomics, modeling biological systems, high-throughput image analysis and protein-protein docking; all these research areas involve analysis, prediction, comparative studies and modeling. GIS, on the other hand, is basically a spatial database software tool, which also has analysis, prediction, comparison and modeling capabilities. There are many cases where the spatial organization of genome features has been shown to have biological significance, and this leads researchers to consider extending GIS to handle Bioinformatics. Recent studies related to gene finding and genome annotation provide evidence of complex spatial interrelationships of genome features, such as genes being alternatively spliced, nested, or overlapping. In addition, conservation of gene order in microbes has been the subject of a great deal of analysis [6]. An example of this integration trend between the two fields is the recent work to use “predictive methods” for the analysis of complex medical data in relation to the environment [13].

3. Future of GBS

Bioinformatics is a rising field that seeks to apply the tools and techniques of computer science to the management and analysis of biological data. Bioinformatics involves the incorporation of computers, software tools, and databases in an effort to address biological questions. Bioinformatics approaches are often used for major initiatives that generate large data sets. Bioinformatics is a new science and a new way of thinking that could potentially lead to many relevant biological discoveries. The future of GBS is in the integration of multiple Bioinformatics databases and the utilization of the visualization techniques available in GIS for the purpose of identifying biological trends and relationships. For example, the integration of a wide variety of data sources, such as clinical and genomic data, will allow specialists to use disease symptoms to predict genetic mutations and vice versa. Another future area of research in GBS is large-scale comparative genomics. For example, the development of tools that can perform multiple kinds of comparisons between genomes will push forward the discovery rate in bioinformatics. Along these lines, the modeling and visualization of full networks of complex systems could be used in the future to predict, for example, how a system reacts to a drug [14].

There are many cases in which intelligent GIS techniques have played an important role when utilized in existing Bioinformatics systems to compute accurate, error-free, and fast results [15]. The trend will continue toward building intelligent GBS systems that incorporate collaborative agents and sophisticated visualization techniques to analyze, predict and present information more effectively.

Geographical Bioinformatics Systems (GBS) will allow researchers to do a far better job of monitoring, quantifying, and predicting the impact that environmental phenomena (such as climate change and emerging and resurgent infectious diseases) have on human health. In public health, GIS can play a vital role in resolving issues that require spatial analysis [16].

GIS, which emerged several decades ago, has matured to the point that the incorporation of its intelligent techniques in GBS will potentially help to effectively track diseases in populations. The integration of Bioinformatics data and genomics data will tell the story of what takes place at the cell and sub-cell level in diseased individuals, and will soon enable epidemiologists using GIS techniques to capture the how of disease outbreaks. Both fields rely heavily on mining, managing, accessing, and analyzing large amounts of data [3].

4. Conclusion

GIS and Bioinformatics share some common characteristics, most importantly that their applications relate to the environment, that they are characterized by large dedicated databases, and that most of their data manipulation takes place using widely adopted and well-supported commercial or open source applications [17]. Since disease mapping is a major focus of spatial epidemiology, we believe the future role of GIS will become more crucial in handling disease outbreak surveillance systems and in helping decision makers interpret and analyze both routine and outbreak-related health data. That is, it will help to provide insight into possible causes of diseases and into clusters of disease outbreaks across particular geographic areas. GIS will effectively help scientists interpret disease patterns with the help of built-in query capabilities, and they will be able to select different data layers to query for disease-related information by selecting geographical areas of interest [18].

The Bioinformatics arena is very broad and encompasses many problems, such as finding gene sequences, constructing molecular pathways, and predicting protein structure. Extending Bioinformatics applications with GIS interfaces, with their spatial tools and techniques, enables these applications to overcome their limitations and enhances their ability to visualize data. GBS is a hybrid and emerging field of research and development, capable of handling spatial biological queries and analyzing data for the purpose of modeling and predicting disease behavior and other biological phenomena.
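The layer-based query discussed in this paper, in which a user selects one or more data layers and a geographical area of interest and retrieves the records located inside that area, can be illustrated with a minimal sketch. The sketch is not taken from any of the systems reviewed here. The names Record, BoundingBox, query_layers and LAYERS, the layer names, and all coordinates are hypothetical and were made up for illustration, and a simple rectangular area that does not cross the antimeridian is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One geocoded observation in a layer (hypothetical structure)."""
    label: str
    lat: float  # latitude in degrees
    lon: float  # longitude in degrees


@dataclass(frozen=True)
class BoundingBox:
    """Rectangular area of interest; assumed not to cross the antimeridian."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, rec: Record) -> bool:
        # A record is inside when both coordinates fall within the box limits.
        return (self.south <= rec.lat <= self.north
                and self.west <= rec.lon <= self.east)


def query_layers(layers, layer_names, area):
    """Return {layer name: [records inside area]} for the selected layers."""
    return {name: [r for r in layers[name] if area.contains(r)]
            for name in layer_names}


# Made-up example data: two layers with a few geocoded points each.
LAYERS = {
    "outbreak_reports": [
        Record("report-1", 25.12, 56.33),
        Record("report-2", 24.21, 55.74),
        Record("report-3", 37.23, -80.41),
    ],
    "sequence_samples": [
        Record("sample-A", 25.30, 56.20),
        Record("sample-B", -6.80, 39.28),
    ],
}

if __name__ == "__main__":
    area = BoundingBox(south=24.0, west=55.0, north=26.0, east=57.0)
    hits = query_layers(LAYERS, ["outbreak_reports", "sequence_samples"], area)
    for name, records in hits.items():
        print(name, [r.label for r in records])
```

A production GIS would normally replace the linear scan with a spatial index and support arbitrary polygons; the rectangle is used here only to keep the idea of combining a layer selection with an area selection visible.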


5. References

[1] Designing Issues of Components-Based ZGisObjects, Tariq Rahim Soomro, Kougen Zheng and Yunhe Pan, 4th International Conference on Computational Intelligence & Multimedia Applications, 2001, Proceeding by IEEE Computer Society. http://csdl.computer.org/comp/proceedings/iccima/2001/1312/00/13120338abs.htm

[2] UNO Bioinformatics Home page, Retrieved April 2008, http://www.unomaha.edu/bioinformatics/

[3] VBI Virginia Tech Conference Puts New Research Area on the Map, GIS Expert Michael Goodchild Echos Its Value, 2001, Retrieved April 2008, www.vbi.vt.edu/pr/press_releases/press_20010628_news_gisexpert.htm

[4] Bioinformatics/GIS program aim is to protect chimpanzees, Science Blog, 2003, Retrieved April 2008, www.scienceblog.com/community/older/2003/A/20037439.html

[5] Micro-Mar: a database for dynamic representation of marine microbial biodiversity, Ravindra Pushker, Giuseppe D’Auria, Jose Carlos Alba-Casado and Francisco Rodrigues-Valera, Journal of BioMed Central (BMC) Bioinformatics 2005, Retrieved April 2008, http://www.biomedcentral.com/1471-2105/6/222

[6] GenoSIS: Genome Data Interpretation Using GIS, Mary E. Dolan, Constance Holden, M Kate Beard, and Carol J. Bult, 2002, Retrieved April 2008, http://gis.esri.com/library/userconf/proc02/pap0719/p0719.htm

[7] GEOBASE: a simple geographical information system on a personal computer, Philippe Blaise and Cesare Gessler, Oxford University Press, Oxford Journals of Bioinformatics 1990, Retrieved April 2008, http://bioinformatics.oxfordjournals.org/cgi/content/abstract/7/2/155

[8] GIS: a biomedical Text-mining system for gene information discovery, Jung-Hsien Chiang, Hsu-Chun Yu and Huai-Jen Hsu, Oxford University Press, Oxford Journals of Bioinformatics 2004, Retrieved April 2008, http://bioinformatics.oxfordjournals.org/cgi/content/abstract/20/1/120

[9] Fun facts about Fungi, Robert Fogel, Ivins, 2000, Retrieved April 2008, http://herbarium.usu.edu/fungi/FunFacts/bioinform.htm

[10] A Science Primer, Just the facts: a basic introduction to the science underlying NCBI resources, 2004, Retrieved April 2008, www.ncbi.nlm.nih.gov/About/primer/bioinformatics.html

[11] Overview, Research Group, University of Buffalo, Retrieved April 2008, www.cse.buffalo.edu/DBGROUP/Infrastructure/overview.html

[12] ETI BioInformatics – Geographical information systems (GIS), Retrieved April 2008, www.eti.uva.nl/Products/gis.php

[13] MPBA, Machine , GIS and Bioinformatics, Retrieved April 2008, http://mpa.itc.it/

[14] What is Bioinformatics, Joanne Fox (UBC Bioinformatics Centre), Retrieved April 2008, http://bioinformatics.ubc.ca/about/what_is_bioinformatics/

[15] GIS: A weapon to combat the crime, Tariq Rahim Soomro, Syed Mehmood Raza Naqvi and Zheng Kougen, 5th World Multiconference on Systemics, Cybernetics and Informatics (SCI-2001) and 7th International Conference on Information Systems Analysis and Synthesis (ISAS-2001), July 22-25, 2001, Orlando, USA. http://200.75.139.138:1081/ProceedingSCI/volumeI2001.htm

[16] Public Health GIS News and Information, 2007, Retrieved April 2008, http://www.cdc.gov/nchs/about/otheract/gis/gis_publichealthinfo.htm

[17] Cyberinfrastructure, Data, and Libraries, Part 2, Libraries and the Data Challenge: Roles and Actions for Libraries, Anna Gold, Massachusetts Institute of Technology, D-Lib Magazine, September/October 2007, Volume 13 Number 9/10, ISSN 1082-9873, http://www.dlib.org/dlib/september07/gold/09gold-pt2.html

[18] Real-Time Syndromic Surveillance, 2006, Retrieved April 2008, http://www.esri.com/news/arcuser/0206/geostat2of2.html


Copyright © 2008 by the International Business Information Management Association (IBIMA). All rights reserved. Authors retain copyright for their manuscripts and provide this journal with a publication permission agreement as a part of IBIMA copyright agreement. IBIMA may not necessarily agree with the content of the manuscript. The content and proofreading of this manuscript, as well as any errors, are the sole responsibility of its author(s). No part or all of this work should be copied or reproduced in digital, hard, or any other format for commercial use without written permission. To purchase reprints of this article please e-mail: [email protected].
