Estimating Footfall from Passive Wi-Fi Signals

THERE IS NO LOGIC THAT CAN BE SUPERIMPOSED ON THE CITY; PEOPLE MAKE IT, AND IT IS TO THEM, NOT BUILDINGS, THAT WE MUST FIT OUR PLANS. JANE JACOBS, THE DEATH AND LIFE OF GREAT AMERICAN CITIES INFORMATION IS THE OIL OF THE 21ST CENTURY, AND ANALYTICS IS THE COMBUSTION ENGINE PETER SONDERGAARD,SVP, GARTNER ERRORS USING INADEQUATE DATA ARE MUCH LESS THAN THOSE USING NO DATA AT ALL CHARLES BABBAGE THE EDUCATION INDUSTRY BALAMURUGAN SOUNDARARAJ ESTIMATING FOOTFALL FROM PASSIVE WI-FI SIGNALS CASE STUDY WITH SMART STREET SENSOR PROJECT DOCTOR OF PHILOSOPHY UNIVERSITY COLLEGE LONDON - UCL doctor of philosophy department of geography, ucl I, Balamurugan Soundararaj confirm that the work presented in this thesis is my own. Where information has been derived from other sources, I confirm that this has been indicated in the thesis. .......................................................................... Submitted on, November 2019 5 Dedicated to my parents, S. Kalavathy and K. Soundararaj. Contents 1 Introduction 25 1.1 Challenges 26 1.2 Research Question & Methodology 28 1.3 Outline 29 1.4 Impacts & Applications 30 2 Review of Literature 33 2.1 Research Themes 35 2.2 Research Trends 42 2.3 Techniques and technology 43 2.4 Research Gaps and Opportunities 53 3 Collecting Wi-Fi Data 59 3.1 Wi-Fi as a Source of Data 60 3.2 Initial Experiments 63 3.3 Pilot Study 74 3.4 Smart Street Sensor Project 77 3.5 Discussion 80 4 Processing the Data into Footfall 87 4.1 Data Toolkit 89 4.2 Data processing 109 4.3 Data pipeline 130 8 5 Applications and Visualisations 133 5.1 Footfall Indices 133 5.2 Event Detection 140 5.3 Pedestrian Flows 141 6 Discussion and Conclusions 145 6.1 Summary of Findings 145 6.2 Research Question 147 6.3 Further Work 148 7 Appendix 151 7.1 Manual Counting 153 7.2 Pilot Study 160 7.3 Data Pipeline 166 7.4 Benchmarking Data Tookit 184 7.5 Sample Probe Request 188 7.6 Open-source Software Used 196 Bibliography 201 List of Figures 2.1 Growth of research in the area of ’Understanding distribution and dynamics of human activity’ since 1980 34 2.2 Tree-map showing the volume of research conducted under each major themes and their sub-themes. 36 2.3 The evolution of research since 1980 categorised based on their major theme. 42 2.4 Distribution of research across various techniques and technologies 43 2.5 The evolution of research since 1980 in terms of the the technology used in the research. 44 3.1 Structure of a probe request frame. 61 3.2 Number of probe requests collected every minute on 15 October 2017 65 3.3 Sequence numbers plotted against timestamps showing clear pat- terns corresponding to unique devices. 68 3.4 Sequence number patters in Samsung devices showing the diversity of MAC addresses showing that they are not randomised. 68 3.5 Illustration showing the configuration of the sensor at UCL south cloisters 69 3.6 Composition of probe requests in terms of the vendor names and their type 70 3.7 Comparison between the manual footfall count and aggregated counts from sensor collected data at UCL South Cloisters. 70 3.8 Comparison between the manual footfall count and aggregated counts from sensor collected data at UCL South Cloisters after filtering probes with low signal strength 71 3.9 Density distribution of the signal strengths of the probe requests collected at UCL South Cloisters along with class intervals. 71 3.10 Location and configuration of Wi-Fi data collection carried out in Oxford Street, London. 72 10 3.11 Comparison of the counts from aggregated probe requests and MAC addresses with manual counts at Oxford street, London. 73 3.12 Hardware setup used to collect data in the pilot studies. 74 3.13 Schematic diagram showing the data collection process in the pilot study. 75 3.14 Pilot study locations in London along with their corresponding sensor installation configurations. 76 3.15 Outline of the ‘Medium data toolkit’ devised to collect, process, visualise and manage the Wi-Fi probe requests data. 77 3.16 Hardware setup used to collect data in the pilot studies. 78 3.17 Distribution of locations with Smart Street Sensors installed. 78 3.18 Cross section showing a typical installation of Smart Street Sensor in a retail frontage. 79 3.19 The decay of signal strength (RSSI) with respect to distance. 82 3.20 Increase in the share of randomised MAC addresses compared to non-randomised original ones over the years. 83 3.21 Smartphone penetration by age group in United Kingdom. 84 4.1 Comparison of volumes of data across various disciplines. 94 4.2 Missing data from five locations at Tottenham Court Road, London on 15 January 2018 demonstrating the veracity of the data. 96 4.3 Number of probe requests collected for every five minute interval at Tottenham Court Road, London on the year 2018 showing the visual complexity of data in the time dimension. 97 4.4 Big data characteristics of the Wi-Fi probe request datasets in their corresponding dimensions 98 4.5 Characteristics of types of Wi-Fi data collection tools at each end of the spectrum compared to an ideal candidate 100 4.6 Exponential increase in the processing time when using traditional methods. 102 4.7 The increase in processing time with the Unix pipeline is linear thus improves the scalability compared to R based processing 105 4.8 The scalability of the processing pipeline could be further improved with parallelising it. 105 4.9 Outline of the ‘Medium data toolkit’ devised to collect, process, visualise and manage the Wi-Fi probe requests data 107 4.10 The long term effect of MAC randomisation on average weekly footfall estimated at sensors in Cardiff. 110 11 4.11 Thematic diagram showing the idea behind filtering using signal strength distribution. 111 4.12 Thematic diagram showing the idea behind grouping sensors using their sequence numbers. 114 4.13 Finding the optimum time threshold (s) a and sequence threshold b through trial and error. 118 4.14 Sample showing the result of sequence numbers based clustering algorithm on data collected at Oxford Circus, London. 119 4.15 A comparison of estimated footfall at Oxford Circus during various stages of filtering with the actual manual counts. 120 4.16 Distribution of signal strengths at locations covered under the pilot studies along with the corresponding configurations of the sensors. 122 4.17 A comparison of estimated footfall at pilot study locations during various stages of filtering with the actual manual counts. 124 4.18 Comparison between distribution of signal strengths in probe requests collected by 126 4.19 Comparison between the footfall estimates from Smart street sensor and pilot study after filtering probe requests of low signal strength along with manual footfall counts. 127 4.20 The result of the adjustment using device to probes ratio in non- randomising devices shown through average weekly footfall estimates for locations in Cardiff. 128 4.21 The complete data processing pipeline which takes in raw probe requests from Smart Street Sensor project and outputs footfall estimations. 130 5.1 A weekly footfall index for United Kingdom showing the change in footfall from 2017 to 18 134 5.2 The profiles can be tracked longitudinally to reveal nature and change. 135 5.3 The change (%) in average weekly footfall of towns across the UK in 2018 compared to 2017. 136 5.4 The change (%) in monthly average footfall in towns across the UK in April and May 2009. 137 5.5 Intra-day footfall profile of major cities in United Kingdom 137 5.6 Footfall calendar showing the profiles of daily volumes of footfall at Old Street, London. 138 5.7 Normalised weekly footfall index at locations across Cardiff from 2017 to 2018 140 12 5.8 The difference in footfall distribution at Leicester square, London after the FIFA World Cup quarterfinal and semifinal matches. Source: Oliver Uberti and James Cheshire 141 5.9 Illustration of transfer entropies between set of locations along Edgware Road, London. 142 List of Tables 2.1 Evaluation of different technologies or approaches that can be used for data collection. 53 3.1 Significant information included in a probe request 62 3.2 Number of unique values present in each field captured from the probe requests aggregated by the vendor names 67 3.3 Locations of data collection in the pilot study and the amount of data collected at each location. 77 3.4 Regional distribution of Smart Street Sensor locations across UK 78 3.5 Summary of the collected datasets. 81 4.1 Comparison of volume or size of the datasets of Wi-Fi probe requests. 93 4.2 Comparison of velocity or speed of the datasets of Wi-Fi probe requests. 94 4.3 Examples of different types of Wi-Fi based data collection solutions. 99 4.4 Various data storage approaches and their characteristics. 102 4.5 Various types of big data processing tools and corresponding examples. 103 4.6 Tasks in the processing pipeline, corresponding R libraries and equivalent Unix tools 104 4.7 Comparison of clustering algorithms with a sample of 40000 probe requests 117 4.8 Results of footfall estimation at each location as Mean Absolute Percentage Error (MAPE) after each step of the filtering process. 123 Glossary • Active vs Passive Collection - Active collection is where the data collection process involves the active participation from the study subjects. In passive data collection, no such participation is required.

Estimating Footfall from Passive Wi-Fi Signals

Essays on Monkey: a Classic Chinese Novel Isabelle Ping-I Mao University of Massachusetts Boston

Novella Salon

Contributors

2 “Nothing Is More Real Than Nothing” a Reading of Beckett's

Teaching the Short Story: a Guide to Using Stories from Around the World. INSTITUTION National Council of Teachers of English, Urbana

Posthuman Beckett English 9169B Winter 2019 the Course Analyzes the Short Prose That Samuel Beckett Produced Prior to and After

Samuel Beckett: the V Anishing Voice of Fiction

Collected Shorter Plays of Samuel Beckett Murphy

Bomb Threat Rocks Hullihen, but Fizzles Roselle Hopes Suspect Will Come Forward for Psychiatric Treatment

The Beckett Collection Ruby Cohn Correspondence 1959-1989

Samuel Beckett and Europe

Narrative Strategies in Beckett's Post-Trilogy Prose