Hyperlink Network System and Image of Global Cities: Webpages and Their Contents
HYPERLINK NETWORK SYSTEM AND IMAGE OF GLOBAL CITIES: WEBPAGES AND THEIR CONTENTS

by Jae Soen Son

A dissertation submitted to the faculty of The University of North Carolina at Charlotte in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Geography and Urban Regional Analysis

Charlotte 2014

Approved by:
Dr. Jean-Claude Thill
Dr. Qingfang Wang
Dr. Eric Delmelle
Dr. Min Jiang

© 2014 Jae Soen Son
ALL RIGHTS RESERVED

ABSTRACT

JAE SOEN SON. Hyperlink network system and image of global cities: webpages and their contents. (Under the direction of Dr. JEAN-CLAUDE THILL)

A distinctive trend of globalization research is a conceptual expansion that mirrors the penetration of globalization into various aspects of life. The World Wide Web has become the ultimate platform for creating and disseminating information in this era of globalization. Although the importance of web-based information is widely acknowledged, its use in global city research remains limited. Therefore, the purpose of this research is to extend the concept of globalization to the efficiency of information networks and the thematic dimensionality of the images conveyed by webpages. To this end, 264 global and globalizing cities are selected. The city hyperlink networks are constructed from the web crawling results for each city, and hyperlink network analysis measures the effectiveness of these hyperlink networks. The textual contents are also extracted from the crawled webpages, and the thematic dimensionality of the textual contents is measured by quantified content analysis and multidimensional scaling. The efficiency of the hyperlink network in information flow is confirmed to be a new consideration that shapes the globality of cities. Cities with highly efficient connections have faster and easier access, which means a better structure for city image formation. Specifically, social networking websites are the center of this information flow.
This means that social interactions on the Web play a crucial role in forming the images of cities. Apart from the positivity and the negativity of the city image, the dimensionality of cities in the thematic space denotes how they are expressed, discussed, and shared on the Web. The image status based on dimensions of globalization is an important starting point for city branding. It is concluded that a research framework handling information networks and images simultaneously deepens the understanding of how the structure and the contents of the Web affect the formation and maintenance of global city networks. Overall, this research demonstrates the usefulness of information networks and images of cities on the Web for overcoming data inconsistency and scarcity in global city research.

DEDICATION

This dissertation is dedicated to my brilliant and supportive wife, Kyeong Jin Lee, for all of her constant devotion and encouragement.

ACKNOWLEDGEMENTS

First and foremost, I would like to express my sincere gratitude to my advisor, Knight Distinguished Professor Jean-Claude Thill, for his continuous support of my PhD study and research, and for his profundity, enthusiasm, motivation, mentorship, and solicitude. His endless guidance helped me through all stages of researching and writing this dissertation. I could not have imagined having a better advisor and mentor for my PhD study.

Besides my advisor, I would like to thank the rest of my dissertation committee for their encouragement and insightful comments. Prof. Qingfang Wang stimulated me to draw a big picture of my research and to have confidence in it. Prof. Eric Delmelle advised me to elaborate on important methodological parts of my research. Prof. Min Jiang offered invaluable comments toward the publication of this research in the near future.
During my PhD program I received financial support from the UNC Charlotte Graduate School, the Department of Geography, and the Graduate and Professional Student Government (GPSG). The Graduate Assistant Support Plan (GASP) of the Graduate School and my department provided tuition waivers and assistantships. My department and GPSG also provided me with travel funding for conferences. With this financial support I could better concentrate on research.

My sincere thanks also go to Dr. Elizabeth Delmelle for her support as a friend, officemate, and instructor of the courses for which I was a TA. I thank my former and present colleagues in the Urban and Regional System Analysis Laboratory (URSAL): Dr. Diep Dao, Zhaoya Gong, Mona Kashiha, Pooya Najaf, Ran Tao, Kailas Venkitasubramanian, Daniel Yonto, and Yuhong Zhou, for the stimulating discussions, for the technical solutions for this research (especially Zhaoya and Mona), and for sharing the PhD life over the past six years. In particular, I am grateful to Emeritus Professor Young-Woo Nam at Korea University for guiding me through the master's program in human geography and encouraging me to keep studying at UNCC for my PhD.

Last but not least, I would like to thank my parents, Gap-Soo Son and Seong-Sook Hong, and my parents-in-law, Young-Hoon Lee and Hyun-Ja Jang, for their continuous spiritual and financial support of me and my family. I also thank my daughters Serene and Arin for their endless love for me. They motivated me to complete the PhD program.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
CHAPTER I: INTRODUCTION
  1.1. Statement of Research
  1.2. Structure of the Dissertation
CHAPTER II: LITERATURE REVIEW
  2.1. Globalization and Global Cities
    2.1.1. Defining Globalization
    2.1.2. Global City Where Globalization Occurs
  2.2. Development of Measurements (Indices) for the Global City
    2.2.1. Intrinsic Indices
    2.2.2. Relational Indices
    2.2.3. Qualitative Indices
CHAPTER III: IMAGE ON THE WEB FOR GLOBAL CITY STRATEGY
  3.1. Image and Cities
  3.2. The Web (the Internet)
  3.3. Branding, Image, and Global City
  3.4. Conceptualization of Image Extraction from Textual Contents
CHAPTER IV: RESEARCH QUESTIONS AND CONCEPTUAL FRAMEWORK
  4.1. General Characteristics of Hyperlink Networks
  4.2. Characteristics and Classification of Nodes
  4.3. Characteristics of the Quantified Text of Webpages
  4.4. Synthesis of the Hyperlink Network and the Quantified Contents
CHAPTER V: RESEARCH DESIGN
  5.1. Cities under Study
  5.2. Data
    5.2.1. Linguistic Characteristics
    5.2.2. Search Engines and Web Crawlers
    5.2.3. Data Extraction and Pre-processing
  5.3. Methods for Hyperlink Network and Contents of Webpages
    5.3.1. Graph Theory, Social Network, and Hyperlink Network Analysis
    5.3.2. Quantitative Content Analysis
  5.4. Research Objective 4.1 Methodology
    5.4.1. Network Connectivity
    5.4.2. Cluster Analysis
  5.5. Research Objective 4.2 Methodology
    5.5.1. Network Centrality
  5.6. Research Objective 4.3 Methodology
    5.6.1. Frequency Analysis
    5.6.2. Text Mining Analysis
  5.7. Research Objective 4.4 Methodology
CHAPTER VI: GENERAL CHARACTERISTICS OF HYPERLINK NETWORKS
  6.1. Hyperlink Network Measurements and Its Discrepancy
  6.2. Comparison to Established Ranking of City Globality
  6.3. Composite Global City Hyperlink Index
CHAPTER VII: CHARACTERISTICS AND CLASSIFICATION OF NODES
  7.1. Overview of High Centrality Websites
  7.2. Centrality of Nodes
  7.3. Premium Nodes
CHAPTER VIII: CHARACTERISTICS OF QUANTIFIED TEXT OF WEBPAGES
  8.1. Selection of Words for Code Scheme Design
  8.2. Distribution of Frequencies
  8.3. Similarity of Global Cities
CHAPTER IX: SYNTHESIS OF HYPERLINK NETWORKS AND THE QUANTIFIED CONTENTS
  9.1. Correlation between Similarities of HNA and QCA
  9.2. Distribution of Cities Considered Structure and Content
  9.3. Contribution of the Composite Approach
CHAPTER X: CONCLUSIONS, LIMITATIONS, AND FUTURE RESEARCH
REFERENCES
APPENDIX A: MYSQL SCRIPT FOR DMOZ DB AND SEED URL
APPENDIX B: MYSQL SCRIPT FOR RESEARCH DATA TABLE
APPENDIX C: LIST OF THE TOP CENTRALITY NODES
APPENDIX D: RANKINGS BASED ON PROPORTION OF EACH CATEGORY TO TOTAL WORDS
APPENDIX E: WORD FREQUENCY AND PROPORTION FOR QCA
APPENDIX F: CORRELATION COEFFICIENT FOR EACH CITY

LIST OF TABLES

TABLE 2.1: World views of globalization
TABLE 2.2: Thematic classifications of global city researches from 1981 to 1998
TABLE 2.3: Conceptualizing inter-city linkages: a typology
TABLE 2.4: Examples of indices
TABLE 3.1: Five major components of webpages
TABLE 3.2: Six components of CBI
TABLE 3.3: Categories and factors of European City Brand Barometer
TABLE 3.4: Comparison between corporate branding and city branding
TABLE 5.1: Study cities from past global city research
TABLE 5.2: Distribution of population by continent
TABLE 5.3: Top ten languages used in the Web
TABLE 5.4: Number of final seed URLs
TABLE 5.5: Settings for crawling
TABLE 5.6: Descriptive analysis of networks
TABLE 5.7: Questions for refining research design of content analysis
TABLE 6.1: Correlation coefficient matrix among network measurements
TABLE 6.2: Correlation between 2012 GCI and hyperlink network measurements
TABLE 6.3: Descriptive statistics of selected network measurements
TABLE 6.4: City ranking based on new hyperlink index (GHI)
TABLE 6.5: Comparison between GHI and 2012 GCI
TABLE 7.1: Top websites by in-degree centrality and in-closeness centrality
TABLE 7.2: Top website by out-degree centrality, out-closeness centrality,