A Geospatial Web Approach to Exploring Online Epidemiological

A Geospatial Web Approach to Exploring Online Epidemiological Information

Thesis

Presented in Partial Fulfillment of the Requirements for the Degree Master of Arts in the Graduate School of The Ohio State University

By
Qian Hao, M.A.
Graduate Program in Geography

The Ohio State University
2011

Thesis Committee:
Ningchuan Xiao, Advisor
Mei-Po Kwan
Ola Ahlqvist

Copyright by
Qian Hao
2011

Abstract

The World Wide Web provides a tremendous amount of information about diseases and their environments, and much of this information has a geographic context. Effectively exploring such information, however, presents a significant challenge to GIScience research because the data are often poorly organized on the web. Commonly used search engines such as Google can only provide a list of raw web pages, which often do not contribute to discovering knowledge about the diseases. In this thesis, a geospatial web approach will be developed to explore online epidemiological information efficiently. A geospatial web organizes information based on geographic and ontological relationships rather than merely on key words. We will focus on news articles about foot-and-mouth disease and construct an ontology that specifies the relationships between relevant diseases, geographical terms, social and economic concepts, and cultural contexts. A prototype of this geospatial web approach will contain several components: a short list of news articles about foot-and-mouth disease, a map showing where these articles were reported or where the events occurred, an ontology graph of this domain, a list of news articles on topics related to the search term, and a list of news articles about events that happened nearby. This prototype not only allows people to explore closely related information in terms of semantics and location but also provides an effective way to visualize and analyze such information.
Key words: geospatial web, ontology, epidemiology, knowledge, foot-and-mouth disease

Dedication

This thesis is dedicated to my parents.

Acknowledgments

I would like to give many thanks to my advisor, Dr. Ningchuan Xiao, for his great help with my thesis and for the time he spent discussing this research with me and advising me on it. I also want to thank Dr. Mei-Po Kwan and Dr. Ola Ahlqvist for their comments and suggestions on my thesis. Special thanks go to Dr. Rebecca Garabed, Dr. Laura Pomero and Dr. Mark Moritz for their helpful input on the foot-and-mouth disease ontology used in this research. I am also grateful to my friends, Yanfei Yin and Rong Cong, for their support and help, and to my department colleague, Wei Chen, for his help with some technical issues. Special thanks go to my mother; without her support and love, I could not have finished my thesis and studies.

Vita

Jul. 2009: B.S. Geography, Central South University
Jan. 2011 to Aug. 2011: Graduate Research Associate, Department of Geography, The Ohio State University

Fields of Study

Major Field: Geography

Table of Contents

Abstract
Dedication
Acknowledgments
Vita
Table of Contents
List of Tables
List of Figures
Chapter 1: Introduction
Chapter 2: Methodology and Framework
Chapter 3: Implementation of the Geospatial Web Application
Chapter 4: Application Interface and Results
Chapter 5: Conclusions
References
Appendix A: User's Manual
Appendix B: Sample saved news

List of Tables

Table 1. The results of locations extraction methods' precision
Table 2. The summary of main location detection method
Table 3. Thesaurus for concepts in the ontology

List of Figures

Figure 1. A simple communication between client and server
Figure 2. An example of using CGI technique
Figure 3. A common communication between client and server
Figure 4. An example of how Ajax technique works
Figure 5. A framework of a geospatial web application
Figure 6. Workflow of developing a geospatial web application for exploring online epidemiological information
Figure 7. Code snippet of using Perl to make a news search request
Figure 8. Code snippet of saving the news as a text file
Figure 9. Code snippet of encoding process
Figure 10. Encoded html file on the server
Figure 11. Webpage table for storing attributes for every news
Figure 12. Html table for storing html source code of each news
Figure 13. Code snippet of saving news into database using Perl
Figure 14. Code snippet for saving location information into database
Figure 15. Code snippet for locations extraction
Figure 16. Locations extraction results for one particular news
Figure 17. Example of equal search criteria and sub-string search criteria
Figure 18. Records contains "Columbus"
Figure 19. Location table for storing locations in each news
Figure 20. A tentative ontology system for foot-and-mouth disease domain
Figure 21. Ontology graph drawn by HTML canvas element
Figure 22. Logical process for the implementation of the web application
Figure 23. The overview of the semantic geospatial web application for exploring epidemiological information
Figure 24. The section for active news
Figure 25. The section for showing map with locations
Figure 26. Example of exploring information by clicking the map
Figure 27. The section for list five geographic related news to the active news
Figure 28. Example of before clicking on the geographic related news
Figure 29. Example of after clicking on the geographic related news
Figure 30. Sections for showing ontologically related news and the ontology graph
Figure 31. Example of before clicking on the ontology graph
Figure 32. Example of after clicking on the ontology graph

Chapter 1: Introduction

1.1 Background

The World Wide Web (WWW) has evolved into an enormous data repository serving all kinds of purposes, and it has become increasingly difficult to retrieve desired information from the web efficiently. Traditional information retrieval methods, typically built around Internet search engines such as Google and Yahoo!, often do not satisfy users' information needs because they do not capture the syntactic and semantic aspects of words. Current methods for searching and presenting information are limited to key-word searches and present the results as lists of web pages, many of which are irrelevant to the searched concepts. With the growth and development of the Internet, more efficient information retrieval techniques and presentation methods are needed.

The problem with the current methods is that they treat the searched terms only as strings without meaning, so computers cannot filter out irrelevant information. To fill this gap, Tim Berners-Lee introduced the concept of the "semantic web" as the next generation of the current web and as an environment that enables computers and humans to cooperate based on well-defined meanings (Berners-Lee et al. 2001). In this way, the web will become a knowledge-based web that provides qualitative services with consideration of the underlying semantics of the data, web pages, and other web sources (Ding et al. 2002). The semantic web will be able to understand, identify, integrate, and filter different kinds of information from various sources and to return more relevant results than the current web searching method. For example, in a semantic web, when users purchase flight tickets or book a hotel, the system could compare their calendars and schedules and return only those results that have no time conflict.
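The difference between matching strings and matching concepts can be illustrated with a small sketch. It is not the implementation used in this thesis: the headlines, the synonym list, and the excluded look-alike phrase are all hypothetical, and a real ontology would be far richer. The sketch only shows why a literal key-word match can return an irrelevant article (about hand, foot and mouth disease, a different illness) while missing relevant ones that use a synonym or an abbreviation.

```python
import re

# Hypothetical concept entry (an assumption for illustration only):
# synonyms to accept and a look-alike phrase to reject.
CONCEPTS = {
    "foot-and-mouth disease": {
        "synonyms": ["foot and mouth disease", "hoof and mouth disease", "fmd"],
        "not": ["hand foot and mouth disease"],
    }
}

# Made-up headlines, not taken from the thesis data.
HEADLINES = [
    "FMD outbreak confirmed on cattle farm",
    "Hand, foot and mouth disease cases rise in kindergartens",
    "Hoof-and-mouth disease halts livestock exports",
]


def normalize(text):
    """Lowercase the text and replace punctuation runs with single spaces."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def keyword_search(query, docs):
    """Key-word search: keep documents containing the query as a literal substring."""
    q = query.lower()
    return [d for d in docs if q in d.lower()]


def concept_search(concept, docs):
    """Concept search: accept any synonym, reject known look-alike phrases."""
    entry = CONCEPTS[concept]
    hits = []
    for d in docs:
        padded = f" {normalize(d)} "  # padding gives whole-word matching
        if any(f" {p} " in padded for p in entry["not"]):
            continue
        if any(f" {s} " in padded for s in entry["synonyms"]):
            hits.append(d)
    return hits


if __name__ == "__main__":
    print(keyword_search("foot and mouth disease", HEADLINES))
    print(concept_search("foot-and-mouth disease", HEADLINES))
```

With these made-up headlines, the key-word search returns only the kindergarten article, whereas the concept search returns the two livestock articles. The synonym table here plays the role of a very small thesaurus; how concepts are actually related in the ontology is a separate design question.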
A geospatial semantic web is a kind of semantic web that pays particular attention to geospatial context (Egenhofer 2002). The spatial properties of objects can be considered an important
