Building Blocks for Tomorrow’s Mobile App Store
by
Justin G. Manweiler
Department of Computer Science
Duke University
Date:
Approved:
Romit Roy Choudhury, Supervisor
Jeffrey S. Chase
Landon P. Cox
Victor Bahl
Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Computer Science in the Graduate School of Duke University, 2012
An abstract of a dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Computer Science in the Graduate School of Duke University, 2012

Copyright © 2012 by Justin G. Manweiler
All rights reserved

Abstract
In our homes and in the enterprise, in our leisure and in our professions, mobile computing is no longer merely “exciting”; it is becoming an essential, ubiquitous tool of the modern world. New and innovative mobile applications continue to inform, entertain, and surprise users. But, to make the daily use of mobile technologies more gratifying and worthwhile, we must move forward with new levels of sophistication. The Mobile App Stores of the future must be built on stronger foundations. This dissertation considers a broad view of the challenges and intuitions behind a diverse selection of such new primitives. Some of these primitives will mitigate existing and fundamental challenges of mobile computing, especially relating to wireless communication. Others will take an application-driven approach, designed to serve a novel purpose and adapted to the unique and varied challenges of their disparate domains. However, all are related through a unifying goal: to provide a seamless, enjoyable, and productive mobile experience. This dissertation takes the view that, by bringing together nontrivial enhancements across a selection of disparate-but-interrelated domains, the impact is synergistically stronger than the sum of each in isolation. Through their collective impact, these new “building blocks” can help lay a foundation to upgrade mobile technology beyond the expectations of early adopters, and into seamless integration with all of our lives.
For Jane.
Contents

Abstract
List of Tables
List of Figures
List of Abbreviations and Symbols
Acknowledgements
1 Introduction
2 Transmission Reordering in Wireless Networks
2.1 Introduction
2.2 Verifying MIM
2.3 MIM: Optimality Analysis
2.3.1 Optimal Schedule with Integer Program
2.3.2 Results
2.4 Shuffle: System Design
2.4.1 Protocol Design
2.4.2 Design Details
2.4.3 Rate Control
2.4.4 Upload Traffic
2.4.5 Controller Placement
2.5 Shuffle: Implementation
2.5.1 Testbed Platform
2.5.2 Time Synchronization and Stagger
2.5.3 Coordination and Dispatching
2.6 Evaluation
2.6.1 Throughput with 2 Access Points
2.6.2 Throughput with 3 Access Points
2.6.3 Fairness
2.6.4 Performance on Larger Topologies
2.6.5 Complete Shuffle with Rate Control
2.6.6 Simulation Results
2.6.7 Impact of AP Density
2.6.8 Impact of Fading
2.7 Limitations and Discussion
2.7.1 External Network Interference
2.7.2 Latency
2.7.3 Client Mobility
2.7.4 Transport Layer Interactions
2.7.5 Compatibility
2.7.6 Small-scale Testbed
2.8 Related Work
2.8.1 Capture and MIM
2.8.2 Spatial Reuse
2.8.3 Enterprise Wireless LANs and Scheduling
2.8.4 Characterizing and Measuring Interference
2.9 Conclusion
3 Monitoring the Health of Home Wireless Networks
3.1 Introduction
3.2 RxIP Architecture
3.3 Hidden Terminal Diagnosis
3.3.1 Ensuring Hidden Terminals are the Cause
3.3.2 Isolating the Hidden Terminal
3.4 Recovery by Coordination
3.4.1 Coping with Internet Latencies
3.4.2 Multiple Partnerships
3.4.3 Provable Properties of Coordination
3.5 Additional Considerations
3.5.1 Coping with Token Loss
3.5.2 Address Translation
3.5.3 Upload Traffic
3.5.4 Incremental Deployability
3.6 Evaluation
3.6.1 Testbed Platform
3.6.2 Methodology
3.6.3 Hidden Terminal Diagnosis and Recovery
3.6.4 Microbenchmarks
3.6.5 Scalability of Partnership-based TDMA
3.7 Related Work
3.7.1 Enterprise Network Management
3.7.2 Hidden Terminal Mitigation
3.7.3 Network Measurement
3.7.4 Related Techniques
3.8 Conclusion
4 WiFi Energy Management via Traffic Isolation
4.1 Introduction
4.2 Background and Measurements
4.2.1 Choice of Device
4.2.2 Measurement Set-up
4.2.3 Terminology
4.2.4 PSM Energy Profiling
4.2.5 Impact of Network Contention on Energy
4.3 SleepWell Design
4.3.1 Basic SleepWell
4.3.2 Coping with Traffic Dynamics
4.3.3 Seamless Beacon Re-adjustment
4.3.4 Multiple Clients per AP
4.3.5 Compatibility with Adaptive-PSM Clients
4.4 Evaluation
4.4.1 Implementation
4.4.2 Methodology
4.4.3 Performance Results
4.5 Limitations and Discussion
4.5.1 Impact of Hidden Terminals
4.5.2 Incremental Deployability
4.5.3 Interactive Traffic
4.5.4 TSF Adjustment
4.6 Related Work
4.6.1 WiFi PSM Sleep Optimization
4.6.2 WiFi Duty Cycling
4.6.3 Sensor Network TDMA
4.7 Conclusion
5 A Matchmaking System for Multiplayer Mobile Games
5.1 Introduction
5.2 Motivation and Prior Work
5.2.1 Latency in Multiplayer Games
5.2.2 Matchmaking in Online Games
5.2.3 P2P Games Over Cellular Networks
5.2.4 Cellular Network Performance
5.2.5 Grouping
5.3 Estimating Cellular Latency
5.3.1 Predicting Future Latency Based on Current Latency
5.3.2 Using One Phone to Predict the Future Latency of a Different Phone
5.3.3 Predicting the Latency Between Phones
5.4 Switchboard
5.4.1 Architecture of Switchboard
5.4.2 Client API for Lobby Browser
5.4.3 Latency Estimator
5.4.4 Grouping Agent
5.4.5 Grouping Algorithm
5.5 Evaluation
5.5.1 Implementation
5.5.2 Evaluation of Grouping
5.5.3 End-to-end Evaluation
5.5.4 Summary of Evaluation Results
5.6 Conclusion
6 An Object Positioning System using Smartphones
6.1 Introduction
6.2 Motivation and Overview
6.2.1 Applications beyond Tagging
6.2.2 System Overview
6.3 Primitives for Object Localization
6.4 OPS: System Design
6.4.1 Extracting a Visual Model
6.4.2 Questions
6.4.3 Point Cloud to Location: Failed Attempts
6.4.4 The Converged Design of OPS
6.5 Discussion
6.5.1 Extending the Location Model to 3D
6.5.2 Alternatives for Capturing User Intent
6.6 Evaluation
6.6.1 Implementation
6.6.2 Accuracy of Object Localization
6.7 Room for Improvement
6.7.1 Live Feedback to Improve Photograph Quality
6.7.2 Improving GPS Precision with Dead Reckoning
6.7.3 Continual Estimation of Relative Positions with Video
6.8 Related Work
6.8.1 Localization through Large-Scale Visual Clustering
6.8.2 Aligning Structure from Motion to the Real World
6.8.3 Building Object Inventories
6.8.4 Applying Computer Vision to Infer Context
6.9 Conclusion
6.10 Reference Equations
6.10.1 Equation of Visual Trilateration
6.10.2 Equation of Visual Triangulation
7 Predicting Client Dwell Time in WiFi Hotspots
7.1 Introduction
7.2 Natural Questions
7.3 ToGo Prediction Engine
7.3.1 Design Overview
7.3.2 Components
7.4 BytesToGo: An Application of ToGo
7.4.1 Motivation
7.4.2 Natural Questions
7.4.3 Extending from ToGo
7.5 Implementation and Evaluation
7.5.1 Prototype Implementation
7.5.2 ToGo Performance: Dwell Prediction Accuracy
7.5.3 BytesToGo Performance: Offloading 3G to WiFi
7.6 Discussion
7.6.1 Considerations for all ToGo Systems
7.6.2 Considerations particular to BytesToGo
7.7 Related Work
7.7.1 WiFi and Cellular
7.7.2 Mobility Prediction
7.7.3 Activity Recognition
7.8 Conclusion
8 Encounter-Based Trust for Mobile Social Services
8.1 Introduction
8.2 Trust and Threat Model
8.2.1 Adversarial Capabilities
8.2.2 Adversarial Limitations
8.3 SMILE System Design
8.3.1 Encounter Detection
8.3.2 Missed-Connection Reestablishment
8.3.3 K-anonymity Preservation
8.3.4 Implementation Considerations
8.4 Decentralized Architecture
8.4.1 Distributed Operation
8.4.2 Identifier Set Selection
8.5 Evaluation
8.5.1 Key Advertisement Detection
8.5.2 Craigslist Classification
8.6 Related Work
8.6.1 Location Proofs
8.6.2 Location Privacy
8.6.3 Anonymous Messaging
8.7 Conclusion
9 Conclusion
Bibliography
Biography
List of Tables

2.1 Integer programming parameters and variables.
5.1 Network parameters observed at 6 different locations in each of the three experiments – (a) Seattle, (b) Redmond, (c) Durham. In all cases, the phones were connected to AT&T Wireless with MNC 410 and MCC 310, over HSDPA with 3-4 bars of signal strength. For the Seattle experiment, one phone was left at “S-home” while the other visited each of the 6 locations. For Redmond, the stationary phone was at “M-home”, and for Durham it was “R-home”.
5.2 Experimental parameters for end-to-end experiments.
6.1 Optimization for Triangulation.
6.2 OPS Optimization on GPS Error.
6.3 OPS Final Object Localization.
8.1 Analytical Model Parameters.
List of Figures

1.1 Building blocks by design approach. In the first half of the dissertation, we take a bottom-up perspective, seeking to optimize existing wireless communication for enhanced performance, reliability, and energy efficiency. In the second half, we take an application-driven top-down approach, considering possible future mobile applications and the supporting primitives required to enable them.
2.1 AP1→R1 must start before AP2→R2 to ensure concurrency. If AP2 starts first, R1 locks onto AP2 and cannot re-lock onto AP1 later.
2.2 Testbed confirms MIM capability. Rx receives from Tx (at 5 positions) in the presence of interference (Intf).
2.3 MIM can provide large concurrency gains. These graphs show the number of links that can meet SINR requirements with and without MIM enabled. Gains improve with increasing (a) number of clients and (b) number of APs.
2.4 Flow of operations in the Shuffle system. Data packets arrive from the network gateway and are enqueued at an AP. The AP notifies the controller of the waiting outbound packet. The controller inserts the corresponding AP-client pair into a network-wide link queue, and eventually schedules this link as part of a concurrent batch. The AP dequeues and transmits the packet according to the controller’s prescribed schedule, and subsequently notifies the controller of all failures. The controller utilizes this feedback for loss recovery and conflict diagnosis.
2.5 Per-link data structure maintained at the controller for scheduling transmissions. AP2 is the transmitter for link li.
2.6 Heuristics for MIM-aware scheduling.
2.7 Illustration of a scheduled batch of packets with the staggered transmission times. AP1 starts first, followed by AP3, then AP2.
2.8 AP-to-controller clock synchronization error and transmission deviance from the assigned schedule, relative to the local clock. AP and controller were separated by approximately 20 m of CAT-5 cable, 1 switch, and 1 hub. Margin of error ≤ 5 µs, attributable to 802.11 TSF inaccuracy.
2.9 Concurrency gains with only two links.
2.10 Multiple Shuffle orders provide higher throughput than both TDMA and 802.11.
2.11 Shuffle scheduling improves fairness.
2.12 (a) Example 10-link topology in our building; (b) throughput and (c) fairness on entire Shuffle testbed.
2.13 Throughput for Shuffle versus TDMA using 802.11g with 6-54 Mbps rate control enabled.
2.14 A classroom environment with 54 seats. Leaving the AP and one client fixed, we tested with a client placed on the desk in front of each chair.
2.15 CDF of throughput for classroom test.
2.16 Performance evaluation on real and synthetic topologies.
2.17 Throughput improvement under different channel fading conditions – Shuffle performs well under Rayleigh and Ricean fading.
3.1 TCP download throughput contour on two floors of the same apartment. A hidden terminal is placed in an adjacent apartment (not shown). Removal of the interference provides 8-12 Mbps in the living room (versus <1 Mbps shown).
3.2 As AP2 is moved away from the AP1→C1 link, graph shows decreasing and then increasing performance for AP1. AP2 becomes a hidden terminal at 5 m, causing significant losses up to 35 m away.
3.3 As C2 moves towards its AP, it becomes less susceptible to hidden terminal interference from AP1. TCP more-fully utilizes the channel, and correspondingly, C1 is severely impacted by AP2.
3.4 Hidden terminal conditions.
3.5 Timeline of wired token exchange and wireless timeslots. AP1 purchases timeslot t5 to t6 by giving the token to AP2. AP1 may not be able to transmit at t7 (due to some other partnership, not shown). AP1 abstains from a token pass at t7, allowing AP2 to transmit. However, AP1 silences AP2 at t8 instead.
3.6 Rotating channel access rights, established by token exchanges across multiple partnerships.
3.7 (a) With TCP, RxIP provides a median 57% gain over 802.11 under symmetric hidden terminals. (b) RxIP extracts the majority of available gain. (c) Despite the already-symmetric conditions, RxIP further improves fairness.
3.8 TCP throughput and fairness under asymmetric hidden terminals. (a) Coordination balances the asymmetry, closely approximating an ideal 50-50 channel share. (b) Fairness improves dramatically.
3.9 RxIP protects the AP1-C1 link from performance degradation regardless of AP2 position.
3.10 As C2 moves from position 0 to 20 m, its link strengthens, becoming less susceptible to hidden terminal interference from AP1. TCP more-fully utilizes the channel, and correspondingly, C1 is severely impacted by AP2. Coordination protects both links.
3.11 (a) RTT between APs across an apartment complex using 1.5 Mbps cable. (b) AP-to-client delivery latency exhibits a linear relationship to the Internet RTT between partnered APs (2x AP-to-AP delay).
3.12 (Inset) Intermediate APs relay clock offsets for time synchronization between hidden terminals. (Graph) Second-hop time synchronization error attributable to wired relay mechanism latency.
3.13 Scalability test, 30 random 6-link topologies. CDF of (a) throughput, (b) jitter, and (c) fairness.
4.1 Experimental setup with Nexus One phone connected to power meter via copper tape and DC leads. The phone is entirely powered by the power meter, using the lithium battery only as ground. The computer, connected via USB, records current and voltage at 5000 Hz.
4.2 (a) Screenshot from Monsoon power meter; (b) power draw over time for Pandora music streaming.
4.3 Energy consumed under bulk data transfer and YouTube replay with varying contention (i.e., increasing number of APs in the vicinity).
4.4 Proportion of time spent in each power level. (a) 8 MB TCP Iperf; (b) YouTube w/ Tcpreplay.
4.5 AP1 and AP3’s traffic maps during bootstrap (AP2’s map, not shown, is identical to AP1’s). The circle denotes one BEACON INTERVAL of 100 ms. The ticks on the circle denote when an AP has overheard beacons from other APs, as well as the time of its own beacon. The traffic maps clearly depend on the neighborhood.
4.6 APs 1, 2, and 3 migrate their traffic per the SleepWell heuristic. Over time, the beacons are spread in time, alleviating contention between APs.
4.7 SleepWell APs distributedly stagger their beacons to reduce contention. Each AP preempts its traffic to honor another AP’s schedule.
4.8 (a, b) Two SleepWell clients converge to non-overlapping activity cycles, one sleeping when the other is active. (c) Under the same experiment settings, an 802.11 client stays awake for the entire TCP download.
4.9 Adjustment rounds until a SleepWell AP reaches a converged beacon placement.
4.10 Overall energy performance of SleepWell.
4.11 8 MB Iperf TCP download. With higher contention, SleepWell spends a larger fraction of time in light-sleep, whereas 802.11 spends most of the time in the idle/overhear state (see Fig. 4.4a).
4.12 Proportion of time spent in each activity level with YouTube traffic. Compare to Figure 4.4.
4.13 (a) Iperf, (b) YouTube, (c) Pandora. CDF comparison of instantaneous power showing that SleepWell better matches the zero-contention curve.
4.14 CDF of instantaneous power consumption, YouTube with contention from YouTube clients.
4.15 Bulk data transfer on 4 AP/client testbed.
4.16 Performance of beacon adjustment: (a) CDF of beacon separation; (b) separation by network density; (c) CDF of proportion of an AP’s traffic that can be satisfied before the end of its beacon share.
4.17 TCP throughput on 4 AP/client testbed. Distribution reflects per-link goodput for all links.
4.18 Per-packet latency on 8 AP/client testbed. Latency measured as 10 ICMP pings per second on one link; 7 others contend with TCP.
4.19 SleepWell fairness: (a) TCP Jain’s fairness on 4 AP/client testbed. Note X-intercept at 0.9. (b) Jain’s fairness for simulated beacon shares with unbounded traffic.
4.20 SleepWell performance by AP density: (a) rounds until convergence at 90th percentile; (b) median beacon separation; (c) beacon separation at 5th percentile.
4.21 SleepWell performance by proportion of legacy APs: (a) rounds until convergence at 90th percentile; (b) median beacon separation; (c) beacon separation at 5th percentile.
5.1 CDF of ping latency between two phones on 3G HSDPA connectivity in Redmond, WA, either direct, or via a nearby University of Washington server, or via the best server offered by geo-distributed Bing Search. Horizontal axis cropped at 600 ms.
5.2 CDF of ping latency between two phones on 3G HSDPA connectivity in Durham, NC, either direct, or via a nearby Duke University server, or via a distant University of Washington server, or via the best server offered by geo-distributed Bing Search. Horizontal axis cropped at 600 ms.
5.3 Simplified architecture of a 3G mobile data network. RNC is a Radio Network Controller, and handles radio resource management. SGSN is a Serving GPRS Support Node, and it handles mobility management and authentication of the mobile device. GGSN is a Gateway GPRS Support Node, and interfaces with the general IP network that a mobile operator may have and the Internet.
5.4 RTT from a phone in Princeville, HI on AT&T Wireless to the FRH. Each point is the median latency over 15 seconds. Graph is zoomed into a portion of the data to show detail. Data from Redmond, Seattle, Durham, and Los Angeles are visually similar.
5.5 RTT from a phone in Redmond, WA on AT&T Wireless to the FRH. On the horizontal axis, we vary the length of the time window over which we calculate the latency at the various percentiles indicated by the different lines. On the vertical axis, we show the difference in ms between two consecutive time windows at the different percentiles, averaged over the entire trace. Data from Princeville, Seattle, Durham, Los Angeles for AT&T Wireless are visually similar.
5.6 RTT from a phone in Durham, NC on T-Mobile to the FRH. On the horizontal axis, we vary the length of the time window over which we calculate the latency at the various percentiles indicated by the different lines. On the vertical axis, we show the difference in ms between two consecutive time windows at the different percentiles, averaged over the entire trace.
5.7 For any given 15-minute time window, from how far back in time can we use latency measurements and still be accurate? The horizontal axis shows the difference in latency at the 95th percentile between a time window and a previous time window. The age of the previous time window is shown in the legend. The vertical axis shows the CDF across all the different 15-minute intervals in this trace. The horizontal axis is clipped on the right.
5.8 CDF of Kolmogorov-Smirnov (KS) test Goodness-of-Fit P-values for successive time windows by window size (in minutes) for a phone in Redmond, WA on AT&T Wireless. Each data point represents a two-sample KS test using 100 points from each of two successive time windows. The percentage of null hypothesis rejection is shown as the intersection of a distribution with the chosen significance level. A lower percentage of rejected null hypotheses is an indication of greater stability across successive time windows. The horizontal axis is clipped on the left. For clarity, a limited set of window sizes is shown. Data from Princeville, Seattle, Durham, Los Angeles are visually similar.
5.9 Impact of reducing the measurement sampling rate for a 15-minute window of latency from a phone in Durham, NC on AT&T Wireless to the FRH. The horizontal axis shows the difference in latency at the specified percentile between using a measurement rate of once per 1 second and using a measurement rate of once per 5 to 600 seconds as indicated. The vertical axis shows the CDF across all the different 15-minute intervals in this trace. Note that at a sampling rate of once per 90 seconds for 15 minutes, we have only 10 samples and hence we cannot calculate the 95th percentile. The horizontal axis is clipped on the right. Data from Princeville, Redmond, Seattle, Los Angeles are visually similar.
5.10 Maps showing measurement locations in (a) the Seattle area of Washington, (b) the Redmond area of Washington, (c) the Durham and Raleigh areas of North Carolina.
5.11 Difference in latency between a stationary phone at “S-home” and a phone placed at a variety of locations in Seattle. Each line is a CDF of ((xth percentile latency over a 15-minute interval from stationary phone at “S-home”) - (xth percentile latency over the same 15-minute interval for the other phone at the location in the legend)) computed for all possible 15-minute windows, in 1-minute increments. The xth percentile is 50th for the top graph, 90th for the middle, and 95th for the bottom. Horizontal axis is cropped on the right.
5.12 Difference in latency between a stationary phone at “M-home” and a phone placed at a variety of locations in Redmond. Each line is a CDF of ((xth percentile latency over a 15-minute interval from stationary phone at “M-home”) - (xth percentile latency over the same 15-minute interval for the other phone at the location in the legend)) computed for all possible 15-minute windows, in 1-minute increments. For conciseness, we present only the 50th percentile graph. Horizontal axis is cropped on the right.
5.13 Difference in latency between a stationary phone at “R-home” and a phone placed at a variety of locations in Durham. Each line is a CDF of ((xth percentile latency over a 15-minute interval from stationary phone at “R-home”) - (xth percentile latency over the same 15-minute interval for the other phone at the location in the legend)) computed for all possible 15-minute windows, in 1-minute increments. For conciseness, we present only the 50th percentile graph. Horizontal axis is cropped on the right.
5.14 CDF of RTT between a phone in Durham, NC and a phone in San Antonio, TX. Component latencies involving the respective FRH are also included. The FRH to FRH latency is calculated by the difference of pings. The horizontal axis is clipped on the right. Note that the phone-to-phone CDF is not a perfect sum of the other three CDFs due to small variations in latency in between traceroute packets issued at the rate of once per second.
5.15 Architecture of Switchboard.
5.16 C# client API of Switchboard as would be used in a hypothetical game called “Boom”. For brevity, base class definitions are not shown here.
5.17 CDF of number of players in each group after grouping 50,000 players split into buckets of 1,000 players each, with a latency limit of 250 ms. The top four lines show results from grouping players based on geographic proximity, while the bottom line uses latency proximity.
5.18 CDF of number of players in each group after grouping 50,000 players split into buckets of 1,000 players each, with a latency limit of 400 ms. The top four lines show results from grouping players based on geographic proximity, while the bottom line uses latency proximity.
5.19 CDF of number of players in each group after grouping 50,000 players split into buckets of varying sizes, with a latency limit of 250 ms. The “QT 500” line shows results with QT clustering on a bucket size of 500 players. The “Hier 1500” line shows results with hierarchical clustering on a bucket size of 1,500 players.
5.20 Runtime of grouping algorithms for grouping 50,000 players split into buckets of varying sizes, with a latency limit of 250 ms. The “QT” bars on the left show results with QT clustering, while the “Hier” bars on the right show results with hierarchical clustering.
5.21 Aggregate client-to-server bandwidth by client Poisson arrival rate for Switchboard running on Azure. The first 15 minutes reflects a warming period with elevated measurement activity as the server builds an initial history.
5.22 CDF of ICMP probes per client at different client Poisson arrival rates, as conducted by the Measurement Controller in Switchboard running on Azure. Data reflects hour-long experiments and excludes the warming period.
5.23 CDF of resulting group sizes at different client Poisson arrival rates. Grouping uses 500-client buckets. Data reflects hour-long experiments and excludes the warming period.
5.24 Client time spent in measurement and grouping. Measurement reflects the time from when a client joins a lobby until there is sufficient data for the client’s tower. Time required for grouping reflects the total time from when measurement data is sufficient until the client is placed into a viable group (one or more clustering attempts). Grouping performed with randomized buckets of up to 500 clients.
6.1 An architectural overview of the OPS system – inputs from computer vision combined with multi-modal sensor readings from the smartphone yield the object location.
6.2 Compass-based triangulation from GPS locations (x1, y1), (x2, y2) to object position (a, b).
6.3 The visual angle v relates the apparent size s of an object to distance d from the observer.
6.4 Visual Trilateration: unknown distances from GPS locations (x1, y1) and (x2, y2) to object position (a, b) are in a fixed ratio d2/d1.
6.5 Visual Triangulation: fixed interior angle from known GPS location (x1, y1) to unknown object position (a, b) to known GPS position (x2, y2).
6.6 Intersection of the four triangulation curves for known points (0, 0) and (10, −4), localized point (4, 8), distance ratio σ = 6√5/(4√5) = 1.5, and internal angle γ = 2·arctan(1/2) ≈ 53°.
6.7 OPS builds on triangulation and trilateration, each underpinned by computer vision techniques, and multi-modal sensor information. The noise from sensors affects the different techniques, and makes merging difficult.
6.8 Two vantage points of the same object of interest. The “thought-bubbles” show the two different perspective transformations, each observing the same four feature corner points.
6.9 Example of a 3D point cloud overlaid on one of the images from which it was created.
6.10 Sampled tests; circle denotes object-of-interest (top), Google Earth view (bottom): (a) smokestack of a coal power plant; (b) distant building with vehicles in foreground; (c) stadium seats near goal post.
6.11 CDF of error across all locations. Graph reflects four photos taken per location; 50 locations.
6.12 OPS and triangulation error at 50 locations. Graph reflects four photos taken per location.
6.13 Error from ground-truth GPS camera locations. X-axis shows the standard deviation of introduced Gaussian GPS errors. Bars show median error; whiskers show first and third quartiles.
6.14 Error from ground-truth GPS camera locations. X-axis shows the standard deviation of introduced Gaussian compass errors. Bars show median error; whiskers show first and third quartiles.
6.15 OPS error by photo resolution. Keypoint detection is less reliable below 1024×768 pixels.
7.1 Clients at a university cafe exhibit varied dwell times, reflecting multiple patterns of user behavior. Some long-dwell clients study for hours while more mobile users take a meal to-go.
7.2 Periodic Sensor-Feature matrices feed the SVM sub-predictors to generate short-term predictions. These time-indexed predictions form a growing sequence that is then used to predict the user’s long-term dwell time behavior. Sequences from other users are used as the training set.
7.3 Difference between WiFi and 3G TCP throughput at different hotspots. WiFi offers almost 6.5× throughput compared to 3G.
7.4 BTG prioritizes traffic of short-dwell mobiles. However, it compensates long-dwell laptops by exploiting slack periods.
7.5 ToGo synthesizes client sensor feedback to estimate dwell duration for associated clients. Applications such as BTG can leverage these predictions as necessary, for example, ensuring that multiplayer games will complete before one party leaves or providing prioritized access to cloudlets [163].
7.6 Cross-validation on 15 real-user traces at the Cafe. Despite only 14 SVM training points, ToGo correctly classified users within 2.5 minutes. Additional sensors reduce prediction error during convergence.
7.7 Diagram shows user behavior along a representative path. User (i) walks up to McDonald’s to examine wall-mounted menu and wait in queue line (10-60 seconds); (ii) places order, waits for food (1-2 minutes); (iii) takes condiments (2-15 seconds); (iv) sits and eats food (5-15 minutes); (v) discards trash (1-10 seconds); and (vi) exits to lobby.
7.8 Mean priority misprediction at 3 hotspots: (a) McDonald’s; (b) Library; (c) Cafe. All ToGo variants perform better than Naive. NoFeedback performs reasonably well when there is enough RSSI diversity, as in a large hotspot such as the library.
7.9 Prediction accuracy by priority class. Dwell duration (X-axis) is different for each class (increasing by class number). Naive requires substantially longer before convergence to the correct classification.
7.10 Relative overlap of emulated user behaviors, live experiments.
7.11 Performance of BTG with live traffic shaping in Cafe. Traffic shaping benefits clients with shorter dwell times.
7.12 3G data saved per hour by one AP. BTG prioritization improves WiFi utilization, providing substantial 3G network savings. Gains increase with larger HD files. Note that in some cases, the RSSI-based NoFeedback variant suffices to differentiate short-dwelling users.
8.1 An illustrated sequence of operations. Let H denote a cryptographic hash function and E_x(m) denote the encryption of message m with key x. Encounter keys x and y hash to the same value, leading the server to relay E_x(m) to participants in both encounters. However, only participants with key x can recover message m. A timestamp + nonce in the reply prevents replay attacks.
8.2 In online missed-connections posting services (such as Craigslist), posting subjects are forced to manually browse up to hundreds of unrelated postings. By directly routing messages to encounter participants, SMILE is more efficient and less error-prone.
8.3 Wireless encounter-key broadcasts provide co-located users with shared state that can later be used to prove participation in an encounter.
8.4 Classification of identity confirmation checks requested, among Craigslist posts requesting some check. Most checks rely on features observable to (and thus forgeable by) third parties, such as a personal description.
8.5 Estimated Craigslist encounter distance. Only ≈5% of encounters occur outside of Bluetooth range.
8.6 Estimated latency from time of encounter occurrence to Craigslist post.
8.7 Distributed scheme operation. During an encounter, each peer shares k identifiers and an encounter key. Messages are sent using onion routing or an anonymous remailer to preserve anonymity.
8.8 Estimated encounter duration implied by Craigslist posts, by geographic locale.
8.9 Encounter-key discovery. Each detection scan begins 15 seconds after the completion of the prior scan.
List of Abbreviations and Symbols
3G Third Generation; standard for cellular telecommunications
802.11 IEEE standard for WLAN