Spatial Data Mining Deren Li · Shuliang Wang · Deyi Li
Total Page:16
File Type:pdf, Size:1020Kb
Spatial Data Mining Deren Li · Shuliang Wang · Deyi Li Spatial Data Mining Theory and Application 1 3 Deren Li Deyi Li Wuhan University Tsinghua University Wuhan Beijing China China Shuliang Wang Beijing Institute of Technology Beijing China ISBN 978-3-662-48536-1 ISBN 978-3-662-48538-5 (eBook) DOI 10.1007/978-3-662-48538-5 Library of Congress Control Number: 2016930049 © Springer-Verlag Berlin Heidelberg 2015 Translation from the Chinese language second edition: 空间数据挖掘理论与应用 (第二版) by Deren Li, Shuliang Wang, Deyi Li, © Science Press 2013. All Rights Reserved This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. Printed on acid-free paper This Springer imprint is published by SpringerNature The registered company is Springer-Verlag GmbH Berlin Heidelberg Foreword I Advances in technology have extended computers from meeting general comput- ing needs to becoming sophisticated systems for simulating human vision, hear- ing, thinking and other high-level intelligence needs. Developers who envisioned building computerized expert systems in the past, however, quickly encountered the problem of establishing a suitable expert knowledge base. Subsequently, they turned to machine learning in an effort to develop a computer with self-learning functions similar to humans. But they too faced a question: from what source does the computer learn? Meanwhile, large amounts of data were becoming available more and more rapidly, and the developers came to the realization that a lot of knowledge is hidden inside the data. The information can be retrieved from data- bases via database management systems, and the knowledge also can be extracted from databases via data mining (or knowledge discovery), which then might fur- ther constitute a knowledge base for a computerized expert system automatically. In 1993, Prof. Deren Li and his brother, Prof. Deyi Li, discussed the trends of data mining in the computer science community. At that time, Deren Li believed that the volume, size, and complexity of spatial databases were increasing rapidly. Spatial data for a specific problem or environment appeared to be an accumula- tion of primitive, chaotic, shapeless states, and using it as a source for creating rules and order was questionable. They decided to explore the possibilities of spa- tial data mining. At the Fifth Annual Conference on Geographical Information Systems (GIS) in Canada in 1994, Deren Li proposed the idea of knowledge dis- covery from a GIS database (KDG) for uncovering the rules, patterns, and outliers from spatial datasets for intelligent GIS. Subsequently, the brothers planned to co- supervise a doctoral candidate to study spatial data mining, which was achieved when Kaichang Di began to study under them in 1995. After Kaichang received his doctorate in 1999, both brothers further supervised Shuliang Wang to continue the research because he also had a strong interest and foundation in spatial data mining research. Shuliang’s doctoral thesis was awarded as the best throughout China in 2005. v vi Foreword I In the scientific community of China, an academician who is a member of the Academy of Science or the Academy of Engineering is considered the most reputable scientist. The number of academicians who are brothers is very small, and even fewer brother academicians from different subjects have co-supervised one doctoral student. Furthermore, it is rarer still for both academician brothers and their doctoral student to co-author a monograph. Deren Li and Deyi Li are an admirable example of all these achievements! They are leading and inspiring the entire spatial data mining community to move forward by their quick thinking, profound knowledge, farsighted vision, pragmatic attitude, persistent belief, and liberal treatment. The achievements from their enduring relentless study of spatial data mining have attracted more and more researchers (i.e., academicians, professors, associ- ate professors, post-doctoral assistants, and doctoral and master’s students from geospatial information science, computer science, and other disciplines). Their research team—comprised of experienced, middle-career, and young scientists— are studying and working together. Their problem-oriented theoretical research is being practically applied to serve society and benefit the people using their own innovative research results rather than simply using existing data mining strate- gies. Most of their research results have engendered invitations to make presen- tations at many international conferences to the acclaim of many international scholars. Also, their research is published in international journals, and respected institutions are utilizing their work. Thanks to their great efforts, spatial data min- ing has become a great crossover, fusion, and sublimation of geo-spatial informa- tion science and computer science. This monograph is the first summary of the collective wisdom from their research after more than ten years in the field of spatial data mining, as well as a combination of their major national innovative research and the application of horizontal outcomes. In this monograph, there are interesting theoretical contributions, such as the Deren Li Method for cleaning spatial observed data, the data field method for modelling the mutual interactions between data, the cloud model method for transforming between a qualitative concept and quantitative data, and the mining view method for depicting different user demands under different mining tasks. The methodology is presented as a spatial data mining pyramid, and the mining mechanism of rules and outliers are well explained. GIS and remote sensing data are highlighted in their applications along with a spatial data mining prototype sys- tem. Having engaged in theoretical studies and applications development for many years, the authors did their best to focus closely on the characteristics of spatial data, both in theory and practice, and solving practical problems. The interdiscipli- nary characteristics of this monograph meet the requirements of geo-spatial infor- mation science academics looking to expand their knowledge of computer science and for computer science academics seeking to more fully understand geo-spatial information science. To produce this excellent book, they invited many people from different fields to proofread the draft many times. All of the helpful comments provided to them were acknowledged as they revised and refined the manuscript. Their meticulous approach is commendable and produced a book worth learning. Foreword I vii In this era of data and knowledge economy, spatial data mining is attracting increasingly more interest from scholars and is becoming a hot topic. I wish suc- cess to the authors with this book as well their continuing research. May they attain even greater achievements! May more and more brightly colored flowers open upon the theory and application of spatial data mining. January 2005 Shupeng Chen A member of the Chinese Academy of Science A Scientist in Geography, Cartography, and Remote Sensing Foreword II Rapid advances in the acquisition, transmission, and storage of data are produc- ing massive quantities of big data at unprecedented rates. Moreover, the sources of these data are disparate, ranging from remote sensing to social media, and thus possess all three of the qualities most often associated with big data: volume, velocity, and variety. Most of the data are geo-referenced—that is, geospatial. Furthermore, internet-based geospatial communities, developments in volunteered geographic information, and location-based services continue to accelerate the rate of growth and the range of these new sources. Geospatial data have become essen- tial and fundamental resources in our modern information society. In 1993, Prof. Deyi Li, a scientist in artificial intelligence, talked with his older brother, Prof. Deren Li, a scientist in geographic information science and remote sensing, about data mining in computer science. At that time, Deren believed that the volume, variety, and velocity of geospatial data were increasing rapidly and that knowledge might be discovered from raw data through spatial data mining. At the Fifth Annual Conference on Geographic Information Systems (GIS) in Canada in 1994, Deren proposed the idea of knowledge discovery from GIS data- bases (KDG) for uncovering the rules, patterns, or outliers from spatial datasets in intelligent GIS. Subsequently, both brothers were farsighted in co-supervising