Historical Vehicle Traffic Analysis and Commute Time Prediction Using Web Mining
University of Calgary
PRISM: University of Calgary's Digital Repository
Graduate Studies
The Vault: Electronic Theses and Dissertations

2015-06-16

Historical Vehicle Traffic Analysis and Commute Time Prediction Using Web Mining

Kaur, Charanjeet

Kaur, C. (2015). Historical Vehicle Traffic Analysis and Commute Time Prediction Using Web Mining (Unpublished master's thesis). University of Calgary, Calgary, AB. doi:10.11575/PRISM/26373
http://hdl.handle.net/11023/2302
master thesis

University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission.

Downloaded from PRISM: https://prism.ucalgary.ca

UNIVERSITY OF CALGARY

Historical Vehicle Traffic Analysis and Commute Time Prediction Using Web Mining

by

Charanjeet Kaur

A THESIS SUBMITTED TO THE FACULTY OF GRADUATE STUDIES IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

GRADUATE PROGRAM IN ELECTRICAL AND COMPUTER ENGINEERING

CALGARY, ALBERTA
MAY, 2015

© Charanjeet Kaur 2015

Abstract

Analyzing historical vehicle traffic data has many applications including urban planning and intelligent in-vehicle route prediction. A common practice to acquire this data is through roadside sensors. This approach is expensive because of infrastructure and planning costs and cannot be easily applied to new routes. A Web mining approach is proposed to address these limitations. The proposed system gathers information about vehicle commute times, accidents, and weather reports from heterogeneous Web sources. This information is combined to support vehicle traffic analytics.
Clustering analysis of the historical data investigates the traffic patterns of highways and arterial roads and the factors that have the most impact on commute time. A commute time prediction model is built on the historical vehicle traffic data analytics. The model is trained on traffic problems faced in the past and forecasts commute time, incorporating the impact of external factors such as weather and accidents.

Preface

Conference Proceeding: Kaur, C., Krishnamurthy, D., Far, B.H., Using Web Mining to Support Low Cost Historical Vehicle Traffic Analytics, 26th International Conference on Software Engineering and Knowledge Engineering, SEKE 2014.

Acknowledgements

I would like to take the opportunity to thank all the people who made this work possible. My deepest gratitude and appreciation go to my supervisor, Dr. Diwakar Krishnamurthy, for his remarkable guidance. I am sincerely grateful to my co-supervisor, Dr. Behrouz H. Far, for teaching and inspiring me over the past two years. I would like to thank both professors for their valuable time and efforts to make this research possible and also for providing financial support throughout my research. I would also like to thank my friend, Sukhpreet Dhaliwal, for helping me to refine the thesis and providing suggestions. Last but not least, I am grateful to my parents for supporting me spiritually throughout my life. I thank my husband for standing by me through the good and bad times.

Dedication

I dedicate this thesis to my parents for making me who I am, and my husband, Amandeep Sekhon, for supporting me all the way.

Table of Contents

Abstract
Preface
Acknowledgements
Dedication
Table of Contents
List of Tables
List of Figures and Illustrations
List of Symbols, Abbreviations and Nomenclature

CHAPTER ONE: INTRODUCTION
  1.1 Background and Motivation
  1.2 Research Objectives
  1.3 Contributions
  1.4 Thesis Organization

CHAPTER TWO: LITERATURE REVIEW
  2.1 Background
  2.2 Related Work
    2.1.1 Sensor Infrastructures for Traffic Monitoring
    2.1.2 Traffic Characterization
    2.1.3 Travel Time Prediction Models
  2.3 Summary

CHAPTER THREE: WEB DATA COLLECTION METHODOLOGY
  3.1 Data Collection Process
    3.1.1 Google Maps
    3.1.2 Twitter Search API
    3.1.3 Historical Weather Reports
  3.2 Data Overlaying
  3.3 Summary

CHAPTER FOUR: HISTORICAL VEHICLE TRAFFIC ANALYSIS OF DEERFOOT TRAIL
  4.1 Deerfoot Trail Commute Time Analysis
    4.1.1 Q1. What are the peak and off-peak traffic hours?
    4.1.2 Q2. What are the traffic conditions at peak hours?
    4.1.3 Q3. How are peak hours related to number of accidents?
    4.1.4 Q4. What are the bottleneck road segments during peak hours?
  4.2 Summary

CHAPTER FIVE: UNIQUE TRAFFIC PATTERNS WITH K-MEANS CLUSTERING
  5.1 Clustering Analysis
  5.2 Commute time patterns
  5.3 Summary

CHAPTER SIX: COMMUTE TIME PREDICTION MODEL
  6.1 Prediction Model
  6.2 Prediction Performance
  6.3 Summary

CHAPTER SEVEN: CONCLUSIONS AND FUTURE WORK
  7.1 Conclusions
  7.2 Future work

REFERENCES

List of Tables

Table 4.1: Deerfoot Trail - north to south commute time (minutes) statistical results at morning hours
Table 4.2: Deerfoot Trail - north to south commute time (minutes) statistical results at evening hours
Table 4.3: Deerfoot Trail - south to north commute time (minutes) statistical results at morning hours