Predicting Future Locations and Arrival Times of Individuals
Ingrid E. Burbey

Dissertation submitted to the faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of the requirements for the degree of

Doctor of Philosophy
in
Computer Engineering

Thomas L. Martin, Chair
Mark T. Jones
Scott F. Midkiff
Manuel A. Perez-Quinones
Joseph G. Tront

26 April, 2011
Blacksburg, Virginia

Keywords: Context Awareness, Location Awareness, Context Prediction, Location Prediction, Time-of-Arrival Prediction

ABSTRACT

This work has two objectives: a) to predict people's future locations, and b) to predict when they will be at given locations. Current location-based applications react to the user's current location. The progression from location-awareness to location-prediction can enable the next generation of proactive, context-predicting applications.

Existing location-prediction algorithms predict someone's next location. In contrast, this dissertation predicts someone's future locations. Existing algorithms use a sequence of locations and predict the next location in the sequence. This dissertation incorporates temporal information as timestamps in order to predict someone's location at any time in the future. Sequence predictors based on Markov models have been shown to be effective predictors of someone's next location. This dissertation applies a Markov model to two-dimensional, timestamped location information to predict future locations.

This dissertation also predicts when someone will be at a given location. These predictions can support presence or understanding co-workers' routines. Predicting the times that someone is going to be at a given location is a very different and more difficult problem than predicting where someone will be at a given time.
A location-prediction application may predict one or two key locations for a given time, while there could be hundreds of correct predictions for the times of day that someone will be in a given location. The approach used in this dissertation, a heuristic model loosely based on Market Basket Analysis, is the first to predict when someone will arrive at any given location.

The models are applied to sparse WiFi mobility data collected on PDAs given to 275 college freshmen. The location-prediction model predicts future locations with 78-91% accuracy. The temporal-prediction model achieves 33-39% accuracy. If a tolerance of plus or minus twenty minutes is allowed, the prediction rates rise to 77-91%. This dissertation shows the characteristics of the timestamped location data that lead to the highest number of correct predictions. The best data cover large portions of the day, with fewer than three locations for any given timestamp.

Dedication

This dissertation is dedicated to my family.
To my parents, for sacrificing for my education and demonstrating perseverance,
To my husband, for standing with me through my many (many!) ups-and-downs of the doctoral process, and
To my children, who continue to delight and inspire me with their creativity, their insight and their joy.

Grant Information

Portions of this research were supported by the U.S. National Science Foundation under Grant DGE-9987586. Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Table of Contents

CHAPTER 1 INTRODUCTION
    1.1 CONTRIBUTIONS
    1.2 SUMMARY
CHAPTER 2 BACKGROUND
    2.1 MOTIVATION
        2.1.1 Single-User Applications for Prediction
        2.1.2 Privacy
        2.1.3 Predicting Time
    2.2 OVERVIEW OF PREDICTION TECHNIQUES
        2.2.1 Supervised Learning
        2.2.2 Unsupervised Learning
        2.2.3 Number of Outputs
    2.3 TEMPORAL AND SPATIO-TEMPORAL DATA MINING
        2.3.1 Taxonomies of Temporal Data Mining
        2.3.2 Enhancing a Sequence with Temporal Information
    2.4 LOCATION DETERMINATION
        2.4.1 Symbolic Location
    2.5 RELATED WORK IN LOCATION- AND TEMPORAL-PREDICTION
        2.5.1 MavHome
        2.5.2 Using GPS to Determine Significant Locations
        2.5.3 Dartmouth College Mobility Predictions
        2.5.4 Smart Office Buildings
        2.5.5 Reality Mining at MIT
        2.5.6 Other Related Work in Location-Prediction
        2.5.7 Summary
    2.6 REPRESENTATION
        2.6.1 Representing Music
    2.7 THE PREDICTION BY PARTIAL MATCH ALGORITHM
    2.8 SUMMARY
CHAPTER 3 PREDICTING FUTURE LOCATIONS
    3.1 LOCATION-PREDICTION PROBLEM STATEMENT
    3.2 THE DATA—FRESHMEN AT UCSD
        3.2.1 Preprocessing
        3.2.2 The MoveLoc (1-minute) Dataset
        3.2.3 The SigLoc (10-minute) Dataset
    3.3 THE EXPERIMENT
        3.3.1 Determining the Amount of Training Data
        3.3.2 Training and Testing the Model
        3.3.3 Results
        3.3.4 Entropy of Movement Patterns
        3.3.5 Analysis of Predictable Data
        3.3.6 Drawbacks of the Markov Model for Predicting Location
    3.4 SUMMARY
CHAPTER 4 PREDICTING ARRIVAL TIMES
    4.1 TEMPORAL-PREDICTION USING THE MARKOV MODEL
    4.2 THE "TRAVERSING THE SEQUENCE" MODEL
    4.3 THE TRAVERSING-THE-SEQUENCE ALGORITHM
        4.3.1 An Example of the SEQ Model
        4.3.2 Advantages and Disadvantages of the SEQ Model
    4.4 EXPERIMENT AND RESULTS
    4.5 SCORING INDIVIDUAL PREDICTIONS