Quantifying and Mitigating Privacy Threats in Wireless Protocols and Services

Jeffrey Anson Pang

CMU-CS-09-145
July 2009
School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213
Thesis Committee:
Srinivasan Seshan, Chair
Adrian Perrig
Peter Steenkiste
David Wetherall, University of Washington
Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy.
Copyright © 2009 Jeffrey Anson Pang
This research was sponsored by the National Science Foundation under grant numbers CNS-0435382, CNS-0520192, and CNS-0721857 and the US Army Research Office under grant number DAAD19-02-1-0389.
The views and conclusions contained in this document are those of the author and should not be interpreted as representing the official policies, either expressed or implied, of any sponsoring institution, the U.S. government or any other entity.

Keywords: privacy, security, anonymity, wireless, mobility, tracking, Wi-Fi, 802.11

For my parents and Ah Mah.

Abstract
The ubiquity of mobile wireless devices greatly magnifies the threats of clandestine physical tracking, profiling, and surveillance. This is because these devices often reveal their identities and locations to third parties, either inadvertently to eavesdroppers nearby or in reports to location-based services. In this dissertation, we address the challenges in building practical wireless protocols and services that protect users from these threats. To understand the nature of the problem, we first quantify how easily eavesdroppers can track devices that use 802.11, the dominant local area wireless protocol for the foreseeable future. Using wireless traffic from hundreds of real devices, we show that eavesdroppers can track 802.11 devices accurately even if explicit identifiers, such as MAC addresses, are changed over time. This is because implicit identifiers, or identifying characteristics of 802.11 traffic, can still identify many users with high accuracy. We develop an automated procedure that can identify users even when countermeasures, such as pseudonyms and encryption, are employed.

In response to these shortcomings, we present the design and evaluation of an 802.11-like wireless link layer protocol that obfuscates all transmitted bits, rather than select fields, to increase privacy. By obscuring all bits, we greatly increase the difficulty of identifying or profiling users from their transmissions. Our design, called SlyFi, is nearly as efficient as existing schemes for discovery, link setup, and data delivery because transmission requires only symmetric key encryption and reception requires a table lookup followed by symmetric key decryption. Experiments using our implementation on Atheros 802.11 drivers show that SlyFi performs comparably with 802.11 using WPA.

Finally, we demonstrate how to build wireless service directories that cannot track users who submit location-aware reports.
This problem is increasingly relevant for 802.11 hotspot directories, which may rely on users that submit accurate information about hotspot location and characteristics but want to remain anonymous. We present Wifi-Reports, a location-based service that provides Wi-Fi clients with historical information about AP location, performance, and application support. Wifi-Reports addresses two conflicting goals: preserving the privacy of users' reports and limiting fraudulent reports. Our contributions demonstrate that future wireless protocols and services need not sacrifice users' privacy in order to be practical.

Acknowledgments
Useful research can only be done by standing on the shoulders of some and holding the hands of others. This dissertation is no exception and I will be forever indebted to the colleagues, mentors, friends, and family members who provided invaluable technical contributions, advice, and inspiration.

First and foremost, I thank my advisor, Srini Seshan, who both directed my research with a sure hand and let me have the freedom to pursue whatever most pleased me. I know that not every graduate student can say that their advisor provided a working environment that was generally free of stress and politics; I can about Srini. This dissertation would not have been possible without his guidance, inspiration, and criticisms.

Secondly, I thank those who contributed directly to some of the technical content in this dissertation. Ben Greenstein and Damon McCoy were constant collaborators on nearly every part of this thesis and contributed working code, experiments, and domain expertise. Furthermore, I thank Ramki Gummadi, Michael Kaminsky, Yoshi Kohno, Bryan Parno, and David Wetherall for contributing invaluable insights that influenced the technical direction of this dissertation. I also thank my other thesis committee members Adrian Perrig and Peter Steenkiste for giving me valuable feedback. Numerous other people have given me useful feedback, including Dave Andersen, Michael Buettner, Ramón Cáceres, Jason Flinn, Dongsu Han, Jaeyeon Jung, Xi Lu, Pratyusa Manadhata, Sergiu Nedevschi, Samir Sapra, Vyas Sekar, Anmol Sheth, Justin Weisz, and the anonymous reviewers from HotOS 2007, HotNets 2007, Mobicom 2007, Mobisys 2008, and Mobisys 2009.

Thirdly, I want to thank my main collaborators in my other research efforts for helping to shape my own research vision: Aditya Akella, Ashwin Bharambe, John Douceur, James Hendricks, Jay Lorch, Bruce Maggs, Anees Shaikh, and Haifeng Yu.
I want to thank Aditya and Ashwin in particular for letting me tag along on their research projects at the start of my research career.

Fourthly, I must acknowledge my fortune in being able to perform my work at Carnegie Mellon University (CMU). There are few other premier universities where I could have
done this work while being so oblivious to the day-to-day chores performed behind the scenes to ensure a graduate student's survival. For that, I am indebted to Sharon Burks, Deborah Cavlovich, Catherine Copetas, Barbara Grandillo, and the rest of the excellent staff and faculty at CMU.

In addition, much of my dissertation work was done while I was employed by Intel Research in Seattle. Therefore, it is incumbent on me to also thank Intel for paying me a generous salary and providing equipment during the months of the year when I was away from CMU.

Finally, I thank my family and friends for helping me get through my six years in Pittsburgh.
Contents
1 Introduction
1.1 Location Privacy Threats
1.1.1 A Taxonomy of Location Privacy Threats
1.1.2 Legal and Economic Deterrents
1.1.3 Desirable Technical Countermeasures
1.2 Thesis and Approach
1.3 Thesis Scope
1.4 Our Contributions
1.5 Dissertation Outline
2 Quantifying Tracking Threats
2.1 The Implicit Identifier Problem
2.2 Related Work
2.3 Experimental Setup
2.4 Wireless Traces
2.5 Implicit Identifiers
2.5.1 Identifying Traffic Characteristics
2.5.2 Evaluating User Distinctiveness
2.6 When can we be tracked?
2.6.1 Q1: Did this Sample come from User U?
2.6.2 Q2: Was User U here today?
2.7 Summary, Implications, and Limitations
2.7.1 Implications
2.7.2 Study Limitations
3 Mitigating Eavesdropping Threats
3.1 Related Work
3.2 Problem and Solution Overview
3.2.1 Threat Model
3.2.2 Security Requirements
3.2.3 Challenges
3.2.4 System Overview
3.3 Identifier-Free Mechanisms
3.3.1 Straw Man: Public Key Mechanism
3.3.2 Straw Man: Symmetric Key Mechanism
3.3.3 Discovery and Binding: Tryst
3.3.4 Data Transport: Shroud
3.3.5 Other Protocol Functions
3.4 Implementation Details
3.4.1 Tryst: Practical Considerations
3.4.2 Shroud: Practical Considerations
3.4.3 Prototype Implementation
3.5 Formal Security Analysis
3.5.1 Preliminaries
3.5.2 Security of Each Part
3.5.3 Security of Tryst and Shroud
3.6 Robustness Against Attacks
3.6.1 Replay and Active Attacks
3.6.2 Denial-of-Service Attacks
3.6.3 Logical Layer Side-Channel Attacks
3.6.4 Physical Layer Side-Channel Attacks
3.6.5 Tryst Key Compromise
3.6.6 Shroud Key Compromise
3.6.7 Adversarial Senders and Receivers
3.7 Performance Evaluation
3.7.1 Comparison Protocols
3.7.2 Setup
3.7.3 Discovery and Link Setup Performance
3.7.4 Data Transport Performance
3.8 Summary and Discussion
3.8.1 Discussion
4 Mitigating Threats from Crowd-sourced Location-based Systems
4.1 Measurement Study
4.1.1 Setup
4.1.2 Results
4.1.3 Discussion
4.2 Wifi-Reports Overview
4.2.1 Challenges
4.2.2 System Tasks
4.3 Location Privacy
4.3.1 Threat Model
4.3.2 Straw Man Protocols
4.3.3 Blind Signature Report Protocol
4.3.4 Discussion
4.4 Location Context
4.4.1 Geographic Positioning
4.4.2 Distinguishing Channel Conditions
4.4.3 Discussion
4.5 Evaluation
4.5.1 AP Selection Performance
4.5.2 Report Protocol Performance
4.5.3 Resistance to Fraud
4.6 Related Work
4.7 Summary and Discussion
4.7.1 Discussion
5 Conclusions and Future Work
5.1 Summary
5.2 Contributions
5.2.1 Implicit Identifier Tracking Threats in Practice
5.2.2 Practical Identifier-free Link Layer Protocols
5.2.3 Privacy-preserving Crowd-sourced Location-based Systems
5.3 Future Work
5.3.1 Private Key Distribution for Device Discovery
5.3.2 Concealing Higher-layer Explicit Identifiers
5.3.3 Concealing Physical Layer Implicit Identifiers
Bibliography
List of Figures
1.1 Entities that can detect the location and identity of Alice's devices (bold).
2.1 Mean TPR and FPR as the classifier threshold T is varied for fields.
2.2 CCDF of classifier thresholds T that achieve FPR = 0.01 for different users.
2.3 Classification accuracy using each feature. The top two graphs show the mean achieved TPR for (a) FPR = 0.01 and (b) FPR = 0.1. The line above each bar shows the maximum expected TPR given a perfect classifier on that feature. The bottom two graphs show a CCDF of the achieved TPR on sigcomm users for (c) FPR = 0.01 and (d) FPR = 0.1.
2.4 The mean achieved TPR and FPR for sigcomm users as we vary the classification threshold T using each feature alone. The x = y line shows how well random guessing would perform.
2.5 Sensitivity to the number of training samples for each feature. The mean TPR achieved for FPR = 0.01 for sigcomm users with different numbers of training samples. The error bars indicate 95% confidence intervals.
2.6 Accuracy over time. Normalized mean TPR on each day in the apt trace for FPR = 0.01. Each TPR value is normalized to the mean TPR for the entire period, evaluated over the users present during that day. The mean TPR for the entire period over all profiled users is 42%.
2.7 Classification accuracy for Question 1 if sigcomm users were in typical public, home, and enterprise networks.
2.8 CCDF of TPR for Question 1 if sigcomm users were in a typical public, home, or enterprise network for FPR = 0.01.
2.9 Limited dependence in the sigcomm trace. CDF of the maximum number of false positives (FPs) generated by any one user for each user.
2.10 Limited dependence in the sigcomm trace. CDF of how much more likely a true or false positive is given that one was observed recently.
2.11 The number of users detectable and the number of hours they must be active to be detected with (a) 90% accuracy and (b) 99% accuracy. The x-axis in each graph varies the population size. The top portion shows the number and percentage of users it is possible to detect. The bottom portion shows a box plot of the number of hours during which they must be active to be detected. That is, the thick line through the middle of each box indicates the median, the ends of each box demark the middle 50% of the distribution, and the whiskers indicate the minimum and maximum values.
3.1 The SlyFi protocol.
3.2 Tryst packet format.
3.3 Shroud packet format.
3.4 "Nonce-only" function NO[F].
3.5 Encapsulation function "combiner" FC[F1,...,Fm].
3.6 Association delay as the number of keys per AP varies. The client probes for 1 AP. Error bars indicate one standard deviation.
3.7 CDF of the number of unique network names probed for by each client in three empirical 802.11 traces.
3.8 Link setup time as the number of probes each client sends varies. The AP has 500 keys. Error bars indicate one standard deviation.
3.9 CDF of background probe and authentication messages observed each second where discovery was taking place in each of three 802.11 traces. We only count times when there was at least one probe (i.e., times when discovery was taking place).
3.10 Percentage of 100 link setup attempts that fail to complete within 30 seconds as we vary the rate of background probe traffic not destined for the target AP. The client probes for one AP and the AP has 500 keys.
3.11 Link setup time for successful attempts as we vary the rate of background probe traffic not destined for the target AP. Error bars indicate one standard deviation.
3.12 Throughput comparison of UDP packets when transmitting at 54 Mbps for 30 seconds. Each point is the average of 50 runs.
3.13 Comparison of round trip times of ICMP ping messages for variously sized packets. Each point is the average of 1000 pings; pings that experienced link-layer packet loss, or re-keying delays (in the case of wifi-wpa), were removed.
3.14 Effect of association set size on achievable throughput when exposed to 5 Mbps of background traffic for Shroud's and armknecht's software implementation and hardware simulation. Each run is 30 seconds; each point is the average of 50 runs.
3.15 Effect of background traffic on achievable throughput from a client to an AP. The AP's association set includes 50 keys. Each run is 30 seconds; each point is the average of 50 runs.
4.1 Measured hotspot locations near University Avenue, Seattle, WA.
4.2 (a) The success rate of different APs (i.e., how often we could connect and access the Internet when each AP was visible). Each point represents one AP visible at each location. (b) A box-plot of the measured TCP download throughput through each AP. Note the logarithmic scale. (c) A box-plot of the time to fetch http://www.google.com using each AP. The measurements for each AP are grouped by the hotspot location where they were taken, shown on the x-axis. The symbol above each box indicates whether the AP can be accessed for free (O) or not ($). The box for the official AP at each hotspot is a solid color and its symbol is in a larger font. The APs in all graphs are sorted by their median TCP download throughput. Most of the non-free APs at tullys 2 are University of Washington APs in a building across the street.
4.3 Wifi-Reports components and typical tasks.
4.4 Estimated 100%, intermediate, and 0% loss regions for three APs in our measurement study.
4.5 CDF of prediction accuracy for (a) TCP download throughput and (b) Google fetch time over all trials on all official APs at their respective hotspots. Note the logarithmic scale on the x-axis.
4.6 (a) Box-plot of achieved TCP download throughput when using each of five AP selection algorithms at each location. Note the logarithmic scale. Missing boxes for the best-open algorithm are at 0. (b) Box-plot of the achieved response time of http://www.google.com using each of five AP selection algorithms at each location. The whiskers that extend to the top of the graph actually extend to infinity (i.e., the fetch failed). Missing boxes for the best-open algorithm are also at infinity. Each group of boxes is ordered in the same order as the key at the top.
4.7 Time to acquire the right to report on all APs listed by JiWire in ten cities. The cities presented are the ten with the most APs, according to JiWire as of November 15, 2008.
4.8 Time to acquire the right to report on all APs listed by WiGLE in 32 km x 32 km squares centered on each of ten cities.
4.9 CDF of the number of APs listed by WiGLE in each 1 km² region of a 32 km x 32 km grid centered on each of ten cities.
4.10 CDF of prediction accuracy for TCP download throughput of all official APs at their respective hotspots. We vary the percentage of fraudulent reports that claim throughput is 54 Mbps. Note the logarithmic scale on the x-axis.
List of Tables
1.1 Summary of entities that pose location privacy threats, where and when they can track devices, and whether there are substantial non-technical deterrents against location trace collection.
1.2 How identity and location are revealed during different stages of wireless device usage. Each cell lists the entities that can observe such information and, in parenthesis, the primary reason why the information is revealed.
2.1 Summary of relevant workload statistics and parameters. The duration reports only hours with at least one active user.
2.2 A list of the most unique broadcast packets observed in the sigcomm trace. The third column shows the number of packet sizes that were emitted by at most 2 users.
2.3 The fraction of profiled users that we could train using each feature.
2.4 Stability of classifier threshold T across different validation sub-samples. The percentage of users that have FPR errors that are less than 0.01 away from the target FPR of 0.01.
3.1 Mean number of devices that send or receive 802.11 data packets at different time intervals at two conferences (SIGCOMM [138], OSDI [40]) and one office building (UCSD [43]). Intervals with no data packets are ignored. UCSD has observations from multiple monitors.
3.2 Cryptographic terminology used in Section 3.3.3–Section 3.4. All keys are 128 bits.
3.3 Breakdown of link setup time for a client that probes for 5 different networks and an AP with 500 keys. Times are in milliseconds. Each phase corresponds to request/response messages in Figure 3.1, except wpa-key, which involves 2 round trips after association to derive session keys in wifi-wpa.
3.4 The mean time to update Tryst addresses for a single time interval as we vary the number of keys.
3.5 Breakdown of processing times (see Section 3.3.4) for 1500 byte packets for shroud-sw (sw) and shroud-hw (hw). All times are in microseconds. Numbers in parentheses are numbers of address computations.
4.1 Microbenchmarks of cryptographic processing times. All keys are 1024-bit RSA keys and SHA-512 is used as the hash function. All values in milliseconds with a resolution of 10 microseconds. 1000 trials were executed.
Chapter 1
Introduction
Privacy is widely recognized as one of the most important unresolved issues associated with information technologies. For example, a recent CRA report identified “security that users can understand and privacy that they can control” as a grand challenge for Trustworthy Computing [44]. Similarly, the Federal Trade Commission identified privacy as central to its consumer protection mission in light of modern computer and communications technologies [54]. Meanwhile, significant public outcry has followed privacy incidents involving network devices and services (e.g., AOL's release of poorly anonymized Web search queries [15], Intel's introduction of Processor Serial Numbers [6], and Benetton's addition of RFID into their clothing [26]).

In the context of network architecture, the protection of information privacy [164] has primarily been achieved by maintaining the confidentiality of messages. That is, no entity other than the sender or intended recipient can access the contents of a message and the sensitive information therein. Much effort has concentrated on developing end-to-end and link-layer protocols that encrypt messages to provide this form of privacy (along with other security properties). The result has been many techniques that make the recovery of the contents of encrypted messages computationally infeasible, even under strong assumptions, such as an attacker with prior knowledge of the encryption algorithm. On the whole, these techniques have been extremely successful (despite occasional problems [30, 34]) and protocols that use these techniques, such as IPSEC, SSL, and WPA, are now in common use.

However, the emergence of ubiquitous computing devices raises new privacy concerns not addressed by these techniques. A sample of these devices includes headsets, game controllers, mobile phones, laptops, cameras, audio and video players, and specialized devices such as blood pressure monitors with wireless communications. These devices are
networked and are rapidly becoming integral parts of our daily lives — people, cars, and homes may now have many of these devices to provide rich and seamless network connectivity. Unfortunately, three properties of this class of devices raise additional threats:

Mobility of Devices. Unlike desktop computers, which served as the primary interfaces for network access during most of the previous 30 years, these devices are often carried with their users as they go about their daily lives. Moreover, many, such as smart phones and wireless sensors, are used to connect to the Internet and to each other in an opportunistic manner, often without requiring user input. This mode of network access implies that devices that identify themselves to third parties during communication enable those parties to track them as they move from place to place, even if the remainder of their communication is encrypted.

Wireless Communication. These devices typically connect to the Internet and to each other via a wide range of wireless links, such as Bluetooth, 802.11, WiMAX, Zigbee and GSM/UMTS. Wireless links are more exposed than their wired counterparts because transmissions are broadcast and can be received by anyone within radio range (e.g., about 250 meters with standard 802.11b radios [104]). Without sophisticated wiretapping hardware or access to network cables, third parties that are not intended recipients may eavesdrop on conversations using only commodity radios and off-the-shelf software. For example, any nearby observer can intercept unique low-level identifiers that are always sent unencrypted, such as Bluetooth and 802.11 device addresses. Therefore, third-party eavesdroppers can easily detect the presence of users' devices and follow them from place to place. Several tracking networks for Bluetooth and 802.11 have already been deployed in the wild [33, 102, 103].

Location-based Services.
The mobility of networked devices has also given rise to a large number of location-based services to which users may explicitly report their locations in order to obtain local information. For example, the iPhone and Android smart phones may send a user's location to Google to obtain driving directions; a number of smart phone applications report users' current locations in order to retrieve information about nearby restaurants and other physical establishments. These services, therefore, also have the ability to physically track clients that use them.

These three properties significantly magnify the threats of clandestine physical tracking and profiling because they give a number of entities the ability to collect information that ties a person to his or her past and current locations. Therefore, our use of ubiquitous computing devices poses threats to our location privacy even if the confidentiality of messages is maintained. A growing body of legal and social analyses argue that these threats are significant, as location privacy (as a form of contextual integrity) is an important foundation for freedom, human relationships, and liberal democracy [123]. The erosion of location privacy has, thus, drawn the concern of the popular media [29, 168], the United States government [109, 169], and technical standards bodies [81]. To protect location privacy, users must be able to control when and with whom their location information is shared.
1.1 Location Privacy Threats
The primary threat to location privacy comes from the collection of location traces. A location trace is a set of location samples known to be from the same device or user. In addition to revealing where a user has been, researchers have shown that outdoor location traces (e.g., from GPS samples) can be used to infer a person's mode of transportation [130] and predict where they will drive [59, 98, 99]. Indoor location traces can be used to infer group gatherings [120], and character traits [112] (e.g., age, work role, smoker, coffee drinker).

In order to collect a location trace about a device, an entity needs only the ability to record the device's location and the ability to assign a consistent identity to the location measurements that distinguishes them from those recorded for other devices. The precise definitions of location and identity will shape the nature of location privacy threats. In this dissertation we typically define a location to be a spatial point measured with an accuracy of 100 to 250 feet (the local area wireless transmission range). Knowledge of such a location is usually sufficient to answer questions about whether a person visited a physical establishment (e.g., “Was Alice at Bob's house,” “Was Alice at the clinic,” and “Was Alice at the 4th floor office?”). Since we are primarily concerned with location traces, we define identity to be any identifier, with respect to a set of location measurements, that is consistent and distinguishing over time. For example, the MAC address of wireless traffic collected at a location is an identity because it does not change over time and is globally unique.

It is important to note that the ability to capture location traces is concerning even if these identities are not explicitly tied to user identities (e.g., a person's name). This is because a small amount of location context can be used to implicitly connect device identifiers to user identities [27, 62, 64, 74, 75, 97].
For example, the identity of a device used in a user's home can be inferred to be tied to a person who lives there. Thus, device tracking effectively enables user tracking. Even if a device cannot be tied to a unique individual, the ability to track it still poses threats. For example, a thief might want to track all the devices that have been to a high-end retail store, regardless of the identity of their owners.
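The definition above makes trace collection mechanically simple: given timestamped location sightings, any consistent and distinguishing identifier (here, a MAC address) suffices to partition sightings into per-device location traces. The following is a minimal illustrative sketch, not code from this dissertation; the sightings and MAC addresses are hypothetical.

```python
from collections import defaultdict

def build_traces(sightings):
    """Group (timestamp, location, identifier) sightings into
    per-device location traces, ordered by time."""
    traces = defaultdict(list)
    for ts, loc, ident in sightings:
        traces[ident].append((ts, loc))
    return {ident: sorted(obs) for ident, obs in traces.items()}

# Hypothetical sightings recorded at three different monitors.
sightings = [
    (1000, "cafe",   "00:11:22:33:44:55"),
    (1100, "office", "aa:bb:cc:dd:ee:ff"),
    (2000, "clinic", "00:11:22:33:44:55"),
    (3000, "home",   "00:11:22:33:44:55"),
]
trace = build_traces(sightings)["00:11:22:33:44:55"]
# The consistent MAC links three separate sightings into one trace:
# [(1000, 'cafe'), (2000, 'clinic'), (3000, 'home')]
```

Note that nothing in this grouping requires knowing who owns the device; the consistent identifier alone yields the trace, which is exactly why implicit linking to a user identity afterwards is so damaging.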
Figure 1.1: Entities that can detect the location and identity of Alice's devices (bold).
1.1.1 A Taxonomy of Location Privacy Threats
The three properties of ubiquitous computing devices we discussed above imply that there are a number of entities that may collect location traces about us. In this section, we describe the types of entities that can obtain both location and identity information, how they can do so, and whether they pose significant location privacy threats. We illustrate this taxonomy of entities by considering an example involving a user Alice and her devices (Figure 1.1).

Internet Service Providers. When Alice wants to connect her laptop to the Internet, the laptop must rendezvous with a wireless access point nearby, such as a Wi-Fi access point (AP) or GSM base station. This access point may be controlled by an individual (e.g., the Wi-Fi AP in Alice's home) or by an access network Internet Service Provider (ISP) (e.g., the Wi-Fi APs on a campus network). Since local area wireless communications have a
range of only tens of meters, a Wi-Fi AP can infer that Alice's laptop is nearby, thereby learning its rough location. Since Alice's laptop typically must authenticate itself to access points before obtaining connectivity, the access point also learns her identity. Access network ISPs, thus, have the ability to collect location traces of their users. The scope of these traces will, of course, be limited to where ISPs have access points and when users connect.

Location information and identity may also be transmitted as part of Alice's communications subsequent to obtaining connectivity. However, end-to-end encryption that protects the confidentiality of messages can prevent ISPs from observing this information. Moreover, as we describe in the next section, economic incentives and legal deterrents often exist that are sufficient to prevent ISPs from revealing our location traces under normal circumstances, so technical countermeasures may not be necessary.

Ad-hoc devices. Alice's laptop may also communicate locally with other wireless devices such as her PDA without going through an access network, or to a private access point that itself is mobile [133]. We call this ad-hoc communication. Ad-hoc communication is typically used between Bluetooth devices and between some Wi-Fi devices. These devices can obviously detect that Alice's laptop is nearby due to the limited range of local area wireless connectivity. However, Alice will probably only authorize this communication if she owns all such devices or trusts their owners. Moreover, the scope of traces collected by ad-hoc devices is limited to when and where such devices communicate. Ad-hoc communication is typically only used in social settings (e.g., to exchange music files or play games) where device owners already have established social trust relationships. Therefore, technical countermeasures may not be necessary to prevent ad-hoc devices from collecting and abusing location traces.

Location-based services.
Once a connection is established, Alice's laptop will likely communicate with services on the Internet. These services can obtain the location of Alice's laptop in a number of ways: the IP address that the access ISP assigns to it may reveal coarse geographic location (typically to within tens of kilometers [92]) or the client may explicitly measure its own location (e.g., using GPS) and send it to the service as part of a query. The latter is typically the case with location-based services (LBSes), such as a service that tells Alice the restaurants that are nearby. If Alice identifies herself to these services before using them (e.g., by logging in), they then also obtain her identity. Therefore, services that record these two pieces of information also have the ability to collect location traces on their users. The scope of these traces may not be limited to a small number of locations because clients may use these services wherever they have network access. However, an LBS can only collect a location sample when a user makes a query to the service and they cannot ensure that all locations in queries are actually visited by
users. Nonetheless, as we describe in the next section, a class of LBSes that are also recommender systems may need to collect location samples more frequently and accurately in order to be effective. Therefore, technical countermeasures to protect location traces in these systems may be needed.

Wireless eavesdroppers. Independent of whether her devices communicate through an access point or in an ad-hoc fashion with each other, eavesdroppers nearby can overhear her devices' transmissions with proper radio equipment. These eavesdroppers, like access points and ad-hoc devices, can infer location due to the limited range of wireless transmissions. Furthermore, eavesdroppers can record device identities because they can observe low-level identifiers such as addresses and network names in transmissions. For example, it is trivial to track an 802.11 device today since each device advertises a globally unique and persistent MAC address with every frame that it transmits. More generally, 802.11 facilitates user tracking and inventorying attacks that are conceptually identical to RFID threats [88], which have prompted much public concern over privacy. The low cost of 802.11 and Bluetooth hardware and ease of access to network monitoring software—all that is required for someone to locate others nearby and eavesdrop on their traffic—enable anyone to track users, from government surveillance networks to ad hoc scanners set up by thieves and curious individuals (e.g., [33, 102, 103]). Although the scope of location traces collected by these entities is limited to the locations where they have monitors deployed, clients may be secretly tracked whenever they use a wireless device if a monitor is nearby.

Finally, we note that the collection of location traces by eavesdroppers need not be for malicious purposes. For example, it is often necessary to collect wireless traces for diagnostic purposes.
Such traces can effectively be combined to form location traces because device identities are exposed. Since eavesdroppers may have economic incentives to collect location traces (e.g., to profile users), we argue that technical countermeasures are needed to prevent eavesdroppers from collecting them.
1.1.2 Legal and Economic Deterrents
ISPs typically do not need to collect location traces on their users in order to operate. Since users enter into service level agreements with the network providers and services that they use, ISPs and LBSes have a strong incentive to protect their customers’ privacy, lest their customers switch to a competing ISP that protects user privacy better. Moreover, government actors are prevented from arbitrarily procuring this data by law in the United States [118, 157] and by the constitutions of many European countries. In addition, tight regulation of the licensed spectrum used by cellular networks has made it difficult for entities
other than ISPs to obtain eavesdropping equipment for cellular protocols such as GSM. Most LBSes only answer location-aware queries submitted by users (e.g., “Find the restaurants near this location”), and, thus, do not have an explicit need to record user identities and locations. We call any service that accepts location-based queries a query-only LBS. Since users explicitly decide to report their locations to an LBS, they can control when they allow these services to see their locations and can control the accuracy of the location reported. As with ISP location traces, location privacy threats can be further mitigated if location traces are anonymized or obfuscated (Section 1.3 surveys techniques). Limitations of legal and economic deterrents. Unfortunately, due to the unlicensed nature of the 802.11 and Bluetooth radio spectrum, existing legal deterrents against mobile phone tracking may not apply to the tracking of these devices. Even if this were not the case, it would be very difficult to enforce anti-eavesdropping measures because passive eavesdropping is not easily detectable. No specialized hardware is needed to eavesdrop on 802.11 or Bluetooth, so it is also impractical to regulate the sale of such devices. Furthermore, eavesdroppers do not enter into any contractual agreements with those that they are eavesdropping on, so they do not have any explicit legal incentive to protect users’ privacy. Indeed, the purpose of eavesdropping is to collect information about users’ traffic, so they have a disincentive to delete or anonymize location traces. In addition, while ISPs have incentives to protect their own customers’ privacy, their wireless networks may also act as large-scale eavesdropping networks that spy on the communications of other ISPs’ customers.
Unfortunately, an ISP has little incentive to protect the privacy of these users, since it has no contractual agreement with them and, indeed, may find it useful to collect location traces about them to profile competitors and develop ways to try to attract their customers. We categorize ISPs that behave in this manner as eavesdroppers. Some LBSes are Web 2.0 services that use “crowd-sourcing” (e.g., reports submitted by users) in order to populate their databases. We term such LBSes crowd-sourced LBSes; they are typically used to recommend services to their users. For example, there are hundreds of Wi-Fi service providers in major metropolitan areas, so AP directories, such as JiWire [85], Hotspotr [78], and WiGLE [167], are often used to locate Wi-Fi service. Some of these directories rely on crowd-sourcing to populate them because manually cataloging hotspots across many geographic areas is too burdensome. Because reports on wireless APs also implicitly reveal locations where users have been, these directories effectively collect location traces. These LBSes cannot easily discard or anonymize the location traces that they collect because they rely on their users for location-aware information. Unlike LBSes that just answer location-based queries, these LBSes need to hold users
entity type              where                    when                  non-technical deterrents
access network ISPs      access point locations   user uses ISP         legal, economic
ad-hoc devices           rendezvous locations     devices communicate   social
query-only LBSes         anywhere                 user makes query      economic
crowd-sourced LBSes      anywhere                 user records report   none
wireless eavesdroppers   monitor locations        user uses wireless    none
Table 1.1: Summary of the entities that pose location privacy threats, where and when they can track devices, and whether there are substantial non-technical deterrents against location trace collection.

accountable (i.e., by recording their identity) and record the locations of their reports in order to provide good service. Standard techniques to anonymize or obfuscate reports would severely degrade the utility of an AP directory service, for example, because it would be hard to distinguish fraudulent and inaccurate reports. Therefore, crowd-sourced LBSes actually have an incentive to collect identifiable and accurate location traces in order to improve their service. Indeed, these LBSes are most useful when their users submit reports frequently (e.g., for every Wi-Fi AP that they visit).
1.1.3 Desirable Technical Countermeasures
Table 1.1 summarizes the entities that pose potential threats to location privacy and their properties. This dissertation focuses on technical countermeasures to those threats that are currently not substantially mitigated by legal, economic, and social deterrents, namely, those posed by wireless eavesdroppers and by crowd-sourced LBSes. Technical countermeasures for other types of entities, such as query-only LBSes, may also be desired in some circumstances, but there is already a relatively complete body of work that addresses these measures. We describe this work in Section 1.3.
1.2 Thesis and Approach
The very use of wireless networks by mobile wireless devices poses risks to location privacy even if users trust the ISPs that they connect to and we disregard LBSes that are not needed for network communication. In this dissertation, we seek to mitigate these location privacy threats by improving the mechanisms necessary for network communication
by ubiquitous computing devices. We state the following thesis:
Existing protocols and techniques that wireless devices use to discover and communicate with each other pose risks to users’ location privacy. It is, however, possible to redesign these protocols and techniques to substantially mitigate location privacy threats without degrading their functionality or practicality.
That is, we show that adversaries need only expend a small amount of resources and effort in order to track and profile devices that use existing wireless protocols. For example, eavesdroppers can track and profile users with commodity hardware, and typical LBSes can track all users that log in and submit information. The ease and limited cost of these tracking and profiling attacks imply that almost anyone with the motivation can threaten our location privacy. The goal of this dissertation is to develop protocols that significantly raise the bar for adversaries that wish to carry out these attacks so that only a small minority of entities can carry them out in practice. In developing these protocols, however, we also want to maintain the same functions of and achieve comparable efficiency to existing protocols, so that adopters of these protocols do not have to make difficult trade-offs. In order to illustrate problems with and solutions for wireless protocols in general, this dissertation uses Wi-Fi (or 802.11) as a concrete case study. We use the terms Wi-Fi and 802.11 interchangeably. 802.11 is the most widely used local area wireless protocol today, is used for both infrastructure and ad hoc networks, and is deployed in devices ranging from laptops to cell phones. We examine empirical 802.11 networks and traffic to quantify location privacy threats and develop solutions that maintain the same functionality as existing 802.11 devices and networks. However, the qualitative conclusions we draw apply to the broader class of wireless protocols, networks, and services that we have discussed in this chapter. Users of 802.11 devices typically take the following steps in order to communicate with each other and the Internet. First, when connecting to the Internet, users must select APs to use.
Second, whether connecting to the Internet or to a nearby ad-hoc device, users’ devices rendezvous with relevant APs and devices (i.e., discover and authenticate those nearby). Finally, users’ devices communicate data to APs and to each other. Through their communication, user devices may report about local services and their environment (e.g., about the quality of the AP they are using). Select. Locating and selecting an appropriate AP to use is an important part of the 802.11 usage model. This is typically achieved via out-of-band means, such as by matching the AP’s network name to a trusted entity (e.g., a known home network). To facilitate AP
discovery in unknown locations, a number of hotspot directories exist that report their locations (e.g., [85, 78, 167]). Rendezvous. The process of rendezvous with an AP in a given area typically occurs by listening for beacons transmitted by APs nearby or by broadcasting wildcard query probes and listening for responses. Once an AP is discovered, the user obtains any necessary credentials to access the AP (e.g., by paying for access), typically through out-of-band means (e.g., by obtaining a password from a web service or the AP’s owner). Then, and after subsequent discovery attempts, the user’s device can automatically authenticate itself to the AP. Communicate. Once the device finishes setting up the connection with the AP, it can communicate with and through it. One threat to location privacy comes from eavesdroppers that can overhear the messages sent during rendezvous and communication. Report. Many of the hotspot directory services described above rely on reports submitted by users to populate them because manually cataloging hotspots across many geographic areas is too burdensome. Since reports contain the locations of APs that users have used in addition to user identities, another potential threat to location privacy comes from these directory services.
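To illustrate what a passive monitor observes during rendezvous, the following sketch parses a raw 802.11 probe request and recovers both the sender’s globally unique MAC address and the SSID it is searching for. The frame bytes are synthetic (the MAC address and SSID are hypothetical), and the parser handles only the minimal fields needed for the illustration.

```python
# Minimal parser for an 802.11 probe request: even before a device
# associates with any network, its probes expose a persistent MAC
# address (addr2) and the names of networks it is looking for.

def parse_probe_request(frame: bytes):
    """Extract (source MAC, SSID) from a raw 802.11 probe request frame."""
    if frame[0] != 0x40:  # type = management (00), subtype = probe request (0100)
        raise ValueError("not a probe request")
    src = frame[10:16]                       # addr2 = transmitter address
    mac = ":".join(f"{b:02x}" for b in src)
    pos = 24                                 # tagged parameters follow the header
    while pos + 2 <= len(frame):
        tag, length = frame[pos], frame[pos + 1]
        if tag == 0:                         # tag 0 = SSID element
            return mac, frame[pos + 2 : pos + 2 + length].decode(errors="replace")
        pos += 2 + length
    return mac, None

# Synthetic probe request searching for the network "djw".
frame = bytes([0x40, 0x00, 0x00, 0x00])                # frame control + duration
frame += bytes(6)                                       # addr1 (zeroed placeholder)
frame += bytes([0x00, 0x1b, 0x63, 0x01, 0x02, 0x03])   # addr2: source MAC
frame += bytes(6) + bytes([0x00, 0x00])                 # addr3 + sequence control
frame += bytes([0x00, 0x03]) + b"djw"                   # SSID tag (id=0, len=3)

print(parse_probe_request(frame))  # ('00:1b:63:01:02:03', 'djw')
```

A real monitor would capture such frames in radiotap/monitor mode; the point here is only that both identifiers sit in the clear in every probe.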
1.3 Thesis Scope
Section 1.1.3 outlined the three classes of entities that might obtain our location information. In this dissertation, we focus on techniques that mitigate threats from eavesdroppers and from those LBSes used for the discovery of wireless networks, as there are already incentives and mechanisms for ISPs and query-only LBSes to protect their customers’ location information. We also address other attacks that eavesdroppers and malicious directory services might carry out, such as user profiling and inventorying. Table 1.2 shows which entities can observe identity and location during each stage. Identity may be revealed by explicit or implicit identifiers in link-layer or higher-layer (IP or application layer) protocols. Location may be revealed by physical proximity (e.g., knowledge that a device is nearby due to its limited transmission range) or by a device explicitly reporting it. Most prior work on protecting location privacy has focused on preventing query-only LBSes from obtaining identity, location, or both.
Explicit identifiers in LBS queries. The traditional mechanism to protect privacy in location traces is to replace each device identity with a pseudonym [132] (i.e., an identifier that cannot be traced back to a device). For example, a user may sign into an LBS with a random name rather than with his real name. Unfortunately, a number of research studies have shown that using pseudonyms in location traces is often insufficient to protect location privacy because enough contextual information remains to tie a location trace to a specific individual. These inference attacks have been effective on indoor location traces [27] and outdoor GPS location traces [75, 97]. It has also been shown that a small number of location traces with high-frequency periodic samples can be disambiguated even if no two samples were linked by a pseudonym [64, 74, 166]. Self-reported location in LBS queries. As a consequence, there have also been a number of proposals to prevent query-only LBSes from collecting accurate location traces from their users by adding noise to self-reported location. Location privacy defenses include having users submit queries with lower fidelity [65, 28, 72, 114, 111], with lower frequency [27, 75, 76], with added noise [97, 74, 173], or that are fake [93] to make a collected location trace too inaccurate to be useful. Krumm [100] and Duckham and Kulik [53] present more complete surveys of the privacy issues surrounding query-only LBSes and location trace anonymization in general.
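Two of the defenses above, lower-fidelity reporting and added noise, can be sketched as simple transformations applied to a self-reported (latitude, longitude) pair before it is sent to a query-only LBS. The grid cell size and noise scale below are illustrative parameters, not values drawn from any of the cited systems.

```python
# Sketch of two location-obfuscation defenses for self-reported
# coordinates: spatial cloaking (lower fidelity) and Gaussian noise.
import math
import random

def cloak_to_grid(lat, lon, cell_deg=0.01):
    """Lower fidelity: snap a location to the center of a grid cell
    (roughly 1 km on a side at mid-latitudes), so many nearby users
    report the same point."""
    snap = lambda x: (math.floor(x / cell_deg) + 0.5) * cell_deg
    return snap(lat), snap(lon)

def add_noise(lat, lon, sigma_deg=0.005):
    """Added noise: perturb each report with Gaussian noise so a
    collected trace is too inaccurate to pinpoint the user."""
    return lat + random.gauss(0, sigma_deg), lon + random.gauss(0, sigma_deg)

exact = (40.4433, -79.9436)  # an example point in Pittsburgh
print(cloak_to_grid(*exact))
print(add_noise(*exact))
```

As the surrounding text notes, both defenses trade accuracy for privacy, which is precisely why they degrade the utility of crowd-sourced recommender LBSes.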
Work on protecting location privacy in query-only LBSes is important, but it only addresses one phase of the wireless usage model (locating service). Merely using wireless mobile devices enables undesirable parties such as eavesdroppers to collect location traces because we reveal identity and location to many other entities through our use of mobile devices. Furthermore, unlike query-only LBSes, which users can avoid if they do not provide sufficient privacy guarantees, the processes of network discovery and communication cannot be avoided without rendering all network-reliant applications useless. This dissertation focuses on this more fundamental problem:
How can we build wireless protocols and discovery services in ways that protect location privacy?
To mitigate location privacy threats, either identity or location must be concealed from the entities discussed in the previous section that are undeterred from collecting it: eavesdroppers and crowd-sourced LBSes, in particular. Therefore, we focus on the following three problems: Explicit identifiers in network protocols. Wireless transmissions have limited range. Therefore, it is difficult to conceal physical proximity from eavesdroppers because receiving a wireless transmission implies that a device is nearby. To prevent eavesdroppers from
locate service:
  identity (higher-layer): LBS (personalization/accounting)
  location (self-reported): LBS (filtering results)
rendezvous:
  identity (link-layer): eavesdroppers, ad-hoc devices, ISPs (device discovery)
  identity (higher-layer): ad-hoc devices, ISPs (device discovery)
  location (physical proximity): eavesdroppers, ad-hoc devices, ISPs (limited wireless range)
communication:
  identity (link-layer): eavesdroppers, ad-hoc devices, ISPs (message filtering)
  identity (higher-layer): ad-hoc devices, ISPs (message filtering)
  location (physical proximity): eavesdroppers, ad-hoc devices, ISPs (limited wireless range)
report on service:
  identity (higher-layer): crowd-sourced LBS (accountability)
  location (self-reported): crowd-sourced LBS (accurate results)
Table 1.2: How identity and location are revealed during different stages of wireless device usage. Each cell lists the entities that can observe such information and, in parentheses, the primary reason why the information is revealed.

tracking devices, a device’s transmissions must not contain any information that identifies the device. Unfortunately, nearly all wireless protocols today, including 802.11, Bluetooth, WiMAX, GSM, and RFID, transmit unique and persistent device addresses in many of the packets transmitted during rendezvous and communication (e.g., when associating in Bluetooth and GSM, and in all packets in 802.11). Implicit identifiers in network protocols. Even if explicit identifiers are removed, however, other information that remains exposed in packet transmissions may reveal a device’s identity. More generally, a large body of security research has demonstrated that side channels such as packet sizes and inter-packet timing can be used to infer hidden information about those packets, such as their contents, the type of device that sent them, etc. Although simple countermeasures to these side-channel attacks have been proposed, such as adding cover traffic and packet padding, they are heavyweight and substantially degrade performance. Explicit identifiers in crowd-sourced reports. The mechanisms that facilitate service location (e.g., finding an AP) may pose location privacy risks because they are often crowd-sourced LBSes (e.g., AP directories such as JiWire and Hotspotr). These systems are in the class of LBSes that are also recommender systems. That is, they take reports on localized service (e.g., reports on an AP) from user volunteers and summarize those reports to give recommendations to other users nearby.
The defenses against query-only LBSes discussed above cannot typically be applied to location-based recommender systems because they all substantially degrade the accuracy of the location information collected, rendering collected reports useless.
1.4 Our Contributions
We describe previous proposals that attempt to solve each of these three problems in subsequent chapters. Prior work has two main limitations that this dissertation redresses. First, previous solutions address the parts of the wireless usage model in a piecemeal fashion; therefore, applying one proposal in isolation will not sufficiently address location privacy threats in other stages of the usage model, and it is a non-trivial exercise to stitch various solutions together. Second, no previous proposal has been shown to be practical when applied as a replacement for a real wireless protocol such as 802.11. This dissertation studies the practical problems and solutions in building wireless protocols and discovery services by actually implementing them. We make three major contributions:
1. We show that 802.11 eavesdroppers can track users using only logical-layer implicit identifiers. Therefore, even if we conceal obvious device and service addresses, such as MAC addresses, using pseudonyms as much previous work has proposed, eavesdroppers can still carry out practical tracking attacks using commodity hardware [127].
2. We show how to build complete and practical wireless protocols that conceal all explicit identifiers, substantially mitigating eavesdroppers’ ability to detect implicit identifiers. In other words, we show that such protocols can have the same functionality and efficiency as 802.11 while concealing all transmitted bits in messages [63, 129].
3. We show that it is feasible to build recommender LBSes for wireless service discovery that protect the location privacy of users that contribute reports, yet are also resilient to malicious users that submit fraudulent reports. We show that such systems can be practical alternatives to non-private 802.11 hotspot directories today [128].
The following three chapters present our work for each contribution, respectively, through the analysis of empirical wireless traffic, novel protocol design, and detailed evaluation of prototype implementations.
1.5 Dissertation Outline
The remainder of this dissertation is organized as follows:
• Chapter 2 demonstrates that previously proposed countermeasures, when applied to existing protocols such as 802.11, are insufficient to prevent eavesdroppers from collecting accurate location traces, thereby necessitating more private protocols.
• Chapter 3 demonstrates how to build complete wireless protocols that protect location privacy from eavesdroppers by presenting the design and evaluation of a practical link-layer protocol that makes it impractical for eavesdroppers to carry out tracking attacks.
• Chapter 4 demonstrates how crowd-sourced LBSes can protect location privacy in a practical manner by presenting the design and evaluation of a privacy-preserving crowd-sourced LBS for the discovery of wireless networks.
• Chapter 5 summarizes the contributions of this dissertation and outlines some open problems.
Chapter 2
Quantifying Tracking Threats
The best practices for securing 802.11 networks, embodied in the 802.11i standard [80], provide user authentication, service authentication, data confidentiality, and data integrity. However, they do not provide anonymity, a property essential to prevent location tracking. It is trivial to track an 802.11 device today since each device advertises a globally unique and persistent MAC address with every frame that it transmits. In response, researchers have proposed applying pseudonyms [66, 84] (temporary, unlinkable names) to mask this identifier, i.e., by having users periodically change the MAC addresses of their 802.11 devices. In this chapter, we demonstrate that short-term pseudonyms are insufficient to prevent the tracking of 802.11 devices. Even without a unique address, other characteristics of users’ 802.11 traffic can identify them implicitly and track them with high accuracy. An example of such an implicit identifier is the IP address of a service that a user frequently accesses, such as his or her email server. In a population of several hundred users, this address might be unique to one individual; thus, the mere observation of this IP address would indicate the presence of that user. Of course, in a wireless network that employs link-layer encryption, IP addresses would not be visible to an eavesdropper. However, other implicit identifiers would remain, and these identifiers can be used in combination to identify users accurately. This chapter quantifies how well a passive adversary can track users with four implicit identifiers visible to commodity hardware. We thereby place a lower bound on how accurately users can be identified implicitly, as more implicit identifiers and more capable adversaries exist in practice.
More specifically, we identify four previously unrecognized implicit identifiers: network destinations, network names advertised in 802.11 probes, differing configurations of 802.11 options, and sizes of broadcast packets that hint at their
contents. We then develop an automated procedure to identify users. This procedure allows us to quantify how much information implicit identifiers, both alone and in combination, reveal about several hundred users in three empirical 802.11 traces. Our evaluation shows that users emit highly discriminating implicit identifiers, and, thus, even a small sample of network traffic can identify them more than half (56%) of the time in public networks, on average. Moreover, we will almost never mistake them for the source of other network traffic (1% of the time). Since adversaries will obtain multiple traffic samples from a user over time, this high accuracy in traffic classification enables them to track many users with even higher accuracy in common wireless networks. For example, an adversary can identify 64% of users with 90% accuracy when they spend a day at a busy hot spot that serves 25 concurrent users each hour. Chapter outline. In Section 2.1 we illustrate the power of implicit identifiers with several real examples. Section 2.2 discusses prior work on implicit identifiers. Section 2.3 explains our experimental methodology. Section 2.4 describes our empirical 802.11 traces. Section 2.5 analyzes how well 802.11 users can be identified using each implicit identifier individually. Section 2.6 examines how accurately an adversary can track people using these implicit identifiers in public, home, and enterprise networks. We conclude this chapter in Section 2.7.
2.1 The Implicit Identifier Problem
How significantly do implicit identifiers erode location privacy? Consider the seemingly innocuous trace of 802.11 traffic collected at the 2004 SIGCOMM conference, now anonymized and archived for public use [47]. Interestingly, hashing real MAC addresses to pseudonyms is also the best practice for anonymizing traces such as this. Unfortunately, implicit identifiers remain, and they are sufficient to identify many SIGCOMM attendees. For example: Implicit identifiers can identify us uniquely. One particular attendee’s laptop transmitted requests for the network names “MIT,” “StataCenter,” and “roofnet,” identifying him or her as someone probably from Cambridge, MA. This occurred because the default behavior of a Windows laptop is to actively search for the user’s preferred networks by name, or Service Set Identifier (SSID). The SSID “therobertmorris” perhaps identifies this person uniquely [137]. A second attendee requested “University of Washington” and “djw.” The last SSID is unique in the SIGCOMM trace and suggests that this person may be University of Washington Professor David J. Wetherall, one of our coauthors. More distressingly,
WiGLE [167], an online database of 802.11 networks observed around the world, shows that there is only one “djw” network in the entire Seattle area. WiGLE happens to locate this network within 192 feet of David Wetherall’s home. Implicit identifiers remain even when countermeasures are employed. Another SIGCOMM attendee transferred 512MB of data via BitTorrent (this user contacted hosts on the typical BitTorrent port, 6881). A request for the SSID “roofnet” [140] from the same MAC address suggests that this user is from Cambridge, MA. Suppose that this user had been more stealthy and changed his or her MAC address periodically. In this particular case, since the user had not requested the SSID during the time he or she had been downloading, the MAC address used in the SSID request would have been different from the one used in BitTorrent packets. Therefore, we would not be able to use the MAC address to explicitly link “roofnet” to this poor network etiquette. However, the user did access the same SSH and IMAP server nearly every hour and was the only user at SIGCOMM to do so. Thus, this server’s address is an implicit identifier, and knowledge of it enables us to link the user’s sessions together. Now suppose that the network employed a link-layer encryption scheme, such as WPA, that obscures network addresses. Even then, we could link this user’s sessions together by employing the fact that, of the 341 users that sent 802.11 broadcast packets, this was the only one that sent broadcast packets of sizes 239, 245, and 257 bytes and did so repeatedly throughout the entire conference. Furthermore, the identical 802.11 capabilities advertised in each session’s management frames improve our confidence in this linkage because these capabilities differentiate 802.11 cards and drivers. Prior research has shown that peer-to-peer file sharing traffic can be detected through encryption [172].
Thus, even if pseudonyms and link-layer encryption were employed, we could still implicate someone in Cambridge. In the remainder of this chapter, we examine how the shortcomings of wireless protocols impact the location privacy of a large number of users in different 802.11 networks and demonstrate that the examples described in this section are not isolated anomalies.
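The intuition behind these examples, that individually weak identifiers become highly discriminating in combination, can be illustrated with a toy matcher. This is not the classifier developed later in this chapter, and the user profiles and feature values below are hypothetical, but the mechanism is the same: each feature set narrows the candidate users, and matching on several features at once is far more discriminating than matching on any one.

```python
# Toy illustration: link a fresh traffic sample (seen under a new MAC
# address, i.e., a pseudonym) to the user profile it most resembles,
# using several implicit-identifier feature sets at once.

def similarity(profile, sample):
    """Average Jaccard similarity across implicit-identifier features."""
    scores = []
    for feature in profile:
        a, b = profile[feature], sample.get(feature, set())
        union = a | b
        scores.append(len(a & b) / len(union) if union else 0.0)
    return sum(scores) / len(scores)

profiles = {  # hypothetical users, built from earlier traffic samples
    "user_A": {"ssids": {"roofnet", "MIT"},
               "bcast_sizes": {239, 245, 257},
               "dests": {"imap.example.edu"}},
    "user_B": {"ssids": {"linksys"},
               "bcast_sizes": {120},
               "dests": {"mail.example.com"}},
}

# A fresh sample under a new MAC address is still linkable.
sample = {"ssids": {"roofnet"}, "bcast_sizes": {239, 257},
          "dests": {"imap.example.edu"}}

best = max(profiles, key=lambda u: similarity(profiles[u], sample))
print(best)  # user_A
```

No single feature here is unique, yet their combination separates the two users cleanly, which is exactly the effect the SIGCOMM examples exhibit.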
2.2 Related Work
The body of work that examines inferring information from implicit side-channels falls into three categories: using logical layer side-channels to fingerprint devices, using logical layer side-channels to infer message contents, and using physical layer side-channels to fingerprint devices.
Logical layer fingerprints. Seemingly innocuous information such as packet sizes, timing, and header information serve as fingerprints that identify individual devices or classes of devices. We call such fingerprints implicit identifiers because they can implicitly identify the device that transmitted a sequence of messages even when no explicit identifiers, such as MAC addresses, are exposed. For example, Kohno et al. [95] showed that devices could be fingerprinted using the clock skew exposed by TCP timestamps. Padmanabhan and Yang [126] explored fingerprinting users with “clickprints,” or the paths that users take through a website. Security tools like nmap [61] and p0f [125] leverage differences in network stack behaviors to determine a device’s operating system. Although this work demonstrates serious privacy risks with the release of IP or application-level network traces, the information these fingerprinting techniques rely on is concealed by link-layer encryption. Thus, they are less of a concern for properly secured wireless communications. There are also practical limitations that limit tracking attacks using these implicit identifiers. Kohno et al. note that one limitation of their work is that an adversary cannot passively obtain timestamps from devices running the most prevalent operating system, Windows XP. For example, we find that only 15%–32% of users send TCP timestamps in the wireless traces that we analyze. Padmanabhan and Yang’s techniques rely on data from many user sessions collected at actual web servers. In contrast to these IP and application layer implicit identifiers, Franklin et al. [56] showed that it is possible to fingerprint device drivers using the timing of 802.11 probes. However, it is not yet clear how well individual devices can be distinguished using this implicit identifier.
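As a concrete example of such a fingerprint, the clock-skew idea of Kohno et al. can be sketched as a least-squares fit: a device’s TCP timestamp clock drifts at a stable, device-specific rate relative to the monitor’s clock, and that rate is the slope of the observed timestamp offsets over time. The 50 ppm drift and the observations below are synthetic; this illustrates only the underlying estimator, not their full method.

```python
# Sketch of clock-skew fingerprinting: estimate a device's drift rate
# as the least-squares slope of (monitor time, timestamp offset) pairs.

def clock_skew_ppm(obs):
    """Least-squares slope of (time, offset) observations, in parts
    per million; a stable slope acts as a device fingerprint."""
    n = len(obs)
    mx = sum(t for t, _ in obs) / n
    my = sum(o for _, o in obs) / n
    num = sum((t - mx) * (o - my) for t, o in obs)
    den = sum((t - mx) ** 2 for t, _ in obs)
    return 1e6 * num / den

# Synthetic device whose clock runs 50 ppm fast relative to the monitor.
obs = [(t, 50e-6 * t) for t in range(0, 600, 10)]  # (seconds, offset in seconds)
print(round(clock_skew_ppm(obs), 1))  # 50.0
```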
By addressing these limitations in this chapter, we show that tracking wireless devices using implicit identifiers exposed in wireless protocols is practical. These three research efforts complement our work, since the procedure we develop for identifying users enables an adversary to use these implicit identifiers in combination with ours, yielding even more accurate user fingerprints. None of these previous efforts offer a formal method to combine multiple pieces of evidence. Moreover, to our knowledge, we are the first to evaluate how well users are identified by implicit identifiers observed in empirical wireless data. Logical layer message inference. Implicit identifiers also reveal sensitive information other than device identity. Keystroke dynamics have been shown to accurately identify users [115, 147]. The timing and sizes of Web transfers often uniquely identify websites, even when transmitted over encrypted channels [31, 150]. Finally, there has been a large body of research on identifying applications from implicit identifiers in encrypted traffic [90, 91, 116, 172, 177]. It is important to note that many of these side-channel attacks are partly enabled by
the presence of short-term connection (i.e., session) identifiers. This is because they rely on analyzing sequences of encrypted messages in individual connections. Thus, if an eavesdropper could not distinguish the messages sent in different connections, these side channels would be much noisier and harder to extract accurate information from. We exploit this fact in our solution presented in Chapter 3. Physical layer fingerprints. Physical layer information may also sometimes act as side channels that link messages together, but these fingerprints are typically less accurate than logical layer fingerprints or require expensive, non-commodity hardware to detect. For example, Tao et al. [153] and Bahl and Padmanabhan [12] show that signal strength measurements from multiple locations can be used to distinguish the rough locations where they originated. Therefore, when devices are stationary, they can approximately distinguish the messages sent by each one. Nonetheless, Bauer et al. [18, 19] find that, while using multiple signal strength measurements to distinguish message sources can be moderately accurate, the error rate is sufficiently high that certain side-channel attacks (e.g., [105]) have much lower success rates (e.g., 50% vs. 95%). Moreover, collecting sufficient physical layer fingerprints requires multiple monitoring points, and their accuracy can be reduced by varying a client’s transmit power [18, 19]. Patwari et al. [131] describe a similar location-distinguishing physical layer fingerprint based on measuring a device’s multipath signal propagation signature. However, it is not possible to collect these signatures with commodity hardware (software-defined radios were used). Differences in radio transients induced by manufacturing defects in different radios might also be used to fingerprint devices. Some work has shown this to be true at least for a small number of devices [16, 70, 143].
Other persistent signal and modulation differences induced by manufacturing defects can be used to accurately fingerprint devices [36]. However, all these fingerprinting techniques require expensive signal analyzers to perform, so only well-funded adversaries are likely to be able to deploy networks of these monitors at threatening scales.

This dissertation focuses on logical layer fingerprints and message inference and does not explicitly address physical layer implicit identifiers. Therefore, an eavesdropper with specialized equipment, such as expensive signal analyzers, may still be able to track users when our solutions are applied. However, it is much less likely that such expensive equipment would be deployed as part of surveillance networks rather than commodity hardware. Moreover, our solutions are a necessary first step in defending against such threats.
2.3 Experimental Setup
This section describes the evaluation criteria we use to determine how well several implicit identifiers can be used to track users.

The Adversary. Strong adversaries, such as service providers and large monitoring networks, obviously pose a large threat to our location privacy. However, the significance of the threat posed by 802.11 is that anyone who wishes to track users can do so. Therefore, we consider an adversary that runs readily available monitoring software, such as tcpdump [156], on one or more laptops or on less conspicuous commodity 802.11 devices [165]. We further restrict adversaries by assuming that their devices listen passively. That is, they never transmit 802.11 frames, not even to associate with a network. This means that the adversary cannot be detected by other radios. The adversary deploys monitoring devices in one or more locations in order to observe 802.11 traffic from nearby users. By considering a weak adversary, we place a lower bound on the accuracy with which users can be tracked, as stronger adversaries would be strictly more successful.

The Environments. An adversary's tracking accuracy will depend on the 802.11 networks he or she is monitoring. Since implicit identifiers are not perfectly identifying, it will be more difficult to distinguish users in more populous networks. In addition, different networks employ different levels of security, making some implicit identifiers invisible to an adversary. We consider the three dominant forms of wireless deployments today: public networks, home networks, and enterprise networks.

Public networks, such as hot spots or metro-area networks [117], are typically unencrypted at the link layer. Although many public networks employ access control (for example, to allow access to only a provider's customers), most do so via authentication above the link layer (e.g., through a web page) and by using MAC address filtering thereafter.
Very few use 802.11i-compliant protocols that also enable encryption. Hence, identifying features at the network, link, and physical layers would be visible to an eavesdropper in such an environment. Unfortunately, this is the most common type of network today due to the challenge of secure key distribution.

Home and small business networks are small, but detecting when specific users are present is increasingly challenging due to the high density of access points in urban areas [8]. In addition, these networks are more likely to employ link-layer encryption, such as WEP or WPA, because the set of authorized users is typically known and is small. In cases where link-layer encryption is employed, an eavesdropper will not be able to view the payloads of data packets. However, features that are derived from frame sizes or timing, which are not masked by encryption, or from 802.11 management frames, which are
always sent in the clear, remain visible.

Finally, security conscious enterprise networks are likely to employ link-layer encryption. Moreover, if the only authorized devices on the network are provided by the company, there will be less diversity in the behavior of wireless cards. For example, Intel Corporation issues similar corporate laptops to its employees. We consider an enterprise network where only one type of wireless card and configuration is in use, so users cannot be identified by differences in device implementation. However, features derived from the networks that users visit or the applications and services they run remain visible.

The Monitoring Scenario. We assume that users use different pseudonyms during each wireless session in each of these environments, as Gruteser et al. [66] propose. As a result, explicit identifiers cannot link their sessions together. Sessions can vary in length, so we assume that every hour, each user will have a different pseudonym. We define a traffic sample to be one user's network traffic observed during one hour. Although it is possible for users to change their MAC addresses more frequently, this is unlikely to be very useful in practice because other features, such as received signal strength, can link pseudonyms together at these timescales [13, 154]. Moreover, changing a device's MAC address forces a device to re-associate with the access point and, thus, disrupts active connections. In addition, it may require users to revisit a web page to re-authenticate themselves, since MAC addresses are tied to user accounts in many public networks. Users are unlikely to tolerate these annoyances multiple times per session. Of course, the ability to link traffic samples together does not help an adversary detect a user's presence unless the adversary is also able to link at least one sample to that user's identity.
                                  sigcomm               ucsd                  apt
                                  training  validation  training  validation  training  validation
Duration (hours)                      37        54          10        11         119       345
Total Samples                       1974      3391         587      1240         638      1473
Frames Per Sample (median)           289       284        1227      1128          57        92
Total Users                          377       412         225       371          97       196
Profiled Users                       337       337         153       153          39        39
Samples per Profiled User (mean)     5.5       9.1         3.1       4.7        14.7      32.2
Users per Hour (mean)                 53        64          59       113           5         4

Table 2.1: Summary of relevant workload statistics and parameters. The duration reports only hours with at least one active user.

In Section 2.1, we showed that identity can sometimes be revealed by correlating implicit identifiers with out-of-band information, such as that provided by the Wigle [167] location database. However, if the adversary knows the user he wishes to track, he can likely obtain a few traffic samples known to come from that user's device. For example, an adversary could obtain such samples by physically tracking a person for a short time. We assume the adversary is able to obtain this set of training samples either before, during, or after the monitoring period. Our results show that on average, only 1 to 3 training samples are sufficient to track users with each implicit identifier (see Section 2.5.2). The monitor itself collects samples that the adversary wants to test, which we call validation samples.

Evaluation Criteria. There are a number of questions an adversary may wish to answer with these validation samples. Who was present? When was user U present? Which samples came from user U? Essential to answering all these questions is the ability to classify samples by the user who generated them. In other words, given a validation sample, the adversary needs to answer the following question for one or more users U:
Question 1 Did this traffic sample come from user U?
Section 2.5 evaluates how well an adversary can answer this question with each of our implicit identifiers. To demonstrate how well implicit identifiers can be used for tracking, we also evaluate the accuracy in answering the following:
Question 2 Was user U here today?
This question is distinct from Question 1 because an adversary can observe many traffic samples at any given time, any one of which may be from the target user U. In addition, a single affirmative answer to Question 1 does not necessitate an affirmative answer to Question 2 because an adversary may want to be more certain by obtaining multiple positive samples. Section 2.6 details the interaction between these questions and evaluates how many users can be tracked with high accuracy in each of the 802.11 networks described above.
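The gap between the two questions can be illustrated with a back-of-the-envelope calculation (a sketch with illustrative numbers, not measurements from our traces): even a small per-sample false positive rate produces frequent false alarms when many users are observed per hour.

```python
# Probability that at least one of n samples from *other* users is
# misattributed to the target user U, given a per-sample FPR f.
# Illustrative numbers only, not measurements from the traces.

def p_false_alarm(n: int, f: float) -> float:
    """P[at least one false positive among n non-target samples]."""
    return 1.0 - (1.0 - f) ** n

# With 63 other users active in an hour and FPR = 0.01, an adversary who
# treats any single "Yes" as evidence of presence raises a false alarm
# almost half the time, which is why answering Question 2 reliably
# demands more evidence than a single affirmative answer to Question 1.
print(round(p_false_alarm(63, 0.01), 3))
```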
2.4 Wireless Traces
We evaluate the implicit identifiers of users in three 802.11 traces. We consider sigcomm, a 4-day trace taken from one monitoring point at the 2004 SIGCOMM conference [47], ucsd, a trace of all 802.11 traffic in U.C. San Diego's computer science building on November 17, 2006 [43], and apt, a 19-day trace monitoring all networks in an apartment
building, which we collected. All traces were collected with tcpdump-like tools and only contain information that can be collected using standard wireless cards in monitor mode. The ucsd trace is the union of observations from multiple monitoring points. IP and MAC addresses are anonymized but are consistent throughout each trace (i.e., there is a unique one-to-one mapping between addresses and anonymized labels). Link-layer encryption (i.e., WEP or WPA) was not employed in either the sigcomm or ucsd network and neither trace recorded application packet payloads. In our analysis, we show that implicit identifiers remain even when we emulate link-layer encryption and that we do not need packet payloads to identify users accurately. The apt trace only recorded broadcast management packets due to privacy concerns; hence, we only use it to study the one implicit identifier that is extracted from these packets.

We distinguish unique users by their MAC address since it is not currently common practice to change it. To simulate the effect of using pseudonyms, we assume that every user has a different MAC address each hour. Hence, we have one sample per user for each hour that they are active. To simulate the training samples collected by an adversary, we split each trace into two temporally contiguous parts. Samples from the first part are used as training samples and the remainder are validation samples. We choose a training period in each trace long enough to profile a large number of users. For the sigcomm trace, the training period covers the time until the end of the first full day of the conference. For the ucsd trace, the training period covers the time until just before noon. We skip one hour between the training and validation periods so user activities at the end of the training period are less likely to carry over to the validation period. For the apt trace, the training period covers the first 5 days.
We consider a user to be present during an hour if and only if she sends at least one data or 802.11 probe packet during that time; i.e., if the user is actively using or searching for a wireless network.1 Table 2.1 shows the relevant statistics about each trace. Note that since we can only compute accuracy for users that were present in both the training and validation data, those are the only users that we profile. Therefore, results in this chapter refer to 'Profiled Users' as the total user count and not 'Total Users.'
1We ignore samples that only contain other 802.11 management frames, such as power management polls. Including samples with these frames would not appreciably change the characteristics of the sigcomm workload, but would double the number of total "users" in the ucsd workload. This is because many devices observed in the ucsd trace were never actively using the network; we ignore these idle devices.
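The hourly-sample construction and train/validation split described above can be sketched as follows (a simplified sketch; the packet-record layout and function names are hypothetical, not taken from our tools):

```python
from collections import defaultdict

HOUR = 3600  # seconds

def build_samples(packets):
    """Group packets into per-(user, hour) traffic samples.

    `packets` is an iterable of (mac, timestamp) records. Each MAC
    address stands for one user; one user's packets within one hour form
    one sample, emulating a pseudonym change every hour.
    """
    samples = defaultdict(list)
    for mac, ts in packets:
        samples[(mac, int(ts) // HOUR)].append((mac, ts))
    return dict(samples)

def split_train_validation(samples, cutoff_hour):
    """Samples before the cutoff train the profiles; the cutoff hour
    itself is skipped so activity at the end of the training period is
    less likely to carry over into validation."""
    train = {k: v for k, v in samples.items() if k[1] < cutoff_hour}
    validation = {k: v for k, v in samples.items() if k[1] > cutoff_hour}
    return train, validation
```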
2.5 Implicit Identifiers
In this section, we describe four novel implicit identifiers and evaluate how much information each one reveals. Our results show that (1) many implicit identifiers are effective at distinguishing individual users and others are effective at distinguishing groups of users; (2) a non-trivial fraction of users are trackable using any one highly discriminating identifier; (3) on average, only 1 to 3 training samples are required to leverage each implicit identifier to its full effect; and (4) at least one implicit identifier that we examine accurately identifies users over multiple weeks.
2.5.1 Identifying Traffic Characteristics
Network Destinations. We first consider netdests, the set of IP pairs in a traffic sample, excluding pairs that are known to be common to all users, such as the address of the local network's DHCP server. There are several reasons to believe that this set is relatively unique to each user. It is well known that the popularity of web sites has a Zipf distribution [35], so many sites are visited by a small number of users. In fact, in the sigcomm and ucsd training data, each pair is visited by 1.15 and 1.20 users on average, respectively. The set of sites that a user visits is even more likely to be unique. In addition, users are likely to visit some of the same sites repeatedly over time. For example, a user generally has only one email server and a set of bookmarked sites they check often [155]. An adversary could obtain network addresses in any wireless network that does not enable link-layer encryption. Even if users sent all their traffic through VPNs, the case for several users in the sigcomm trace, the IP addresses of the VPN servers would be revealing. No application or network level confidentiality mechanisms, such as SSL or IPSec, would mask this identifier either.

SSID Probes. Next we consider ssids, the set of SSIDs in 802.11 probes observed in a traffic sample. Windows XP and OS X add the SSID of a network to a preferred networks list when the client first associates with the network. To simplify future associations, subsequent attempts to discover any network will try to locate this network by transmitting the SSID in a probe request. As we observed in Section 2.1, SSID names can be distinguishing.2 In addition, probes are never encrypted because active probing must be able to occur before association and key agreement.

2A recent patch [113] to Windows XP allows a user to disable active probing, but it remains enabled by default because disabling it would break association in networks where the access point does not announce itself. In addition, revealing probes or beacons are still required for devices to discover each other in ad hoc networks.

There are two practical issues that limit the use of ssids as an implicit identifier. First, the preferred networks list changes each time a user adds a network, and thus a profile may degrade over time. Second, clients transmit the SSIDs on their preferred networks lists only when attempting to discover service. Therefore, clients may not probe for distinguishing SSIDs very often. While this is true, our results show that when distinguishing SSIDs are probed for, they can often uniquely identify a user. Since all users in the monitoring area are likely to use the SSIDs of the networks being monitored, these SSIDs are not distinguishing and we do not include them in the ssids set.

Broadcast Packet Sizes. We now consider bcast, the set of 802.11 broadcast packet sizes in each traffic sample. Many applications broadcast packets to advertise their existence to other machines on the local network. Due to the nature of this function, these packets often contain naming information. For example, in our traces, we observed many Windows machines broadcasting NetBIOS naming advertisements and applications such as FileMaker and Microsoft Office advertising themselves. Since these packets vary in length, their sizes can reveal information about their content even if the content itself is encrypted. Packet sizes alone appear to distinguish users almost as well as
the packet contents themselves.

Application             Port   Number of Sizes
wireless driver or OS   NA     14
DHCP                    67     14
sunrpc                  111    1
NetBIOS                 138    7
groove-dpp              1211   1
Microsoft Office v.X    2222   1
FileMaker Pro           5003   7
X Windows               6000   1

Table 2.2: A list of the most unique broadcast packets observed in the sigcomm trace. The third column shows the number of packet sizes that were emitted by at most 2 users.

MAC Protocol Fields. Finally, we consider fields, the combination of capability and configuration fields that appear in 802.11 management frames, such as supported transmission rates. Some card configurations can be more or less likely to emit different values in each of these fields, so they can distinguish users with different wireless cards. Although this identifier is unlikely to distinguish users uniquely, it can be combined with others to add more evidence. Moreover, many of these fields are available in any 802.11 packet, so they can almost always assist in identification. Furthermore, the likelihood of any particular field combination is unlikely to change for a user unless she obtains a new wireless device or driver; thus, fields should remain identifying over long time periods.
2.5.2 Evaluating User Distinctiveness
To show how much information each identifier reveals, we now evaluate how accurately an adversary can answer Question 1 (see Section 2.3) using each implicit identifier.
Methodology
We construct a classifier CU for each user U in our traces. Given a traffic sample s, CU returns "Yes" if it believes the sample came from user U and "No" otherwise. We use a naïve Bayes classifier due to its effectiveness in application traffic classification [116, 172, 177]. More sophisticated classifiers exist, but this simple one is sufficient to demonstrate that implicit identifiers are a problem. Specifically, from each traffic sample, we extract a vector of features (f1, . . . , fm). In our case, m ≤ 4, one feature per implicit identifier present in the sample. Each of our features has a different source, so we assume that they are independent. For each feature fi, we estimate the conditional probability distribution Pr[s has fi | s is from U] and the prior probability distribution
Pr[s has fi] from training data. We are interested in

    Pr[s is from U | s has f1, . . . , fm]
        = ( ∏_{i=1..m} Pr[s has fi | s is from U] · Pr[s is from U] ) / ∏_{i=1..m} Pr[s has fi]

We classify a sample as being from U if and only if this value is greater than a threshold T. We also estimate the prior Pr[s is from U] from training data, though this could also be based on a priori knowledge of how frequently the adversary believes his target will be present.

Feature Generation. To compute these probabilities, we must convert each of our implicit identifiers into a categorical or real-valued feature. We treat the fields identifier as a categorical feature by having each field combination represent a different value. Each of the other three identifiers is defined as a set of discrete elements; e.g., netdests is a set of network addresses. The following procedure describes how this set is converted into a real-valued feature that measures how similar it is to the target user's expected set.
We first construct a profile set, Profile_U, comprising all the elements in the union of all training samples for user U. To obtain a numeric value from the set of elements in a sample s, Set_s, we use a weighted version of the Jaccard similarity index [159] of the profile and the sample sets. The Jaccard index of two sets is J(X, Y) = |X ∩ Y| / |X ∪ Y|. However, some elements in each set are more discriminating than others (i.e., those that we observe in fewer users' traffic). Hence, we weight each element e by w(e), the inverse of the number of users that accessed it. We learn these weights from the training data. Hence, given the profile Profile_U, the feature we compute for sample s is:

    feature_U(s) = ( Σ_{e ∈ Profile_U ∩ Set_s} w(e) ) / ( Σ_{e ∈ Profile_U ∪ Set_s} w(e) )

This value quantifies how similar the set seen in the sample is to the user's profile. Since this procedure computes a real-valued feature, we estimate the probability distributions using a density estimator. We use the default estimator in the R statistical package [135], which uses multiple Gaussian kernels.
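The feature-generation and classification steps above can be sketched end to end (a simplified sketch: the helper names are ours, probabilities are taken as given rather than estimated with a kernel density estimator, and the default weight of 1 for unseen elements is an assumption):

```python
def inverse_user_count_weights(training_sets):
    """w(e) = 1 / (number of users whose training traffic contains e)."""
    counts = {}
    for elements in training_sets.values():
        for e in elements:
            counts[e] = counts.get(e, 0) + 1
    return {e: 1.0 / c for e, c in counts.items()}

def weighted_jaccard(profile, sample, w):
    """Weighted Jaccard similarity between Profile_U and Set_s."""
    union = profile | sample
    if not union:
        return 0.0
    num = sum(w.get(e, 1.0) for e in profile & sample)  # unseen e: weight 1 (assumption)
    den = sum(w.get(e, 1.0) for e in union)
    return num / den

def nb_posterior(feature_values, likelihoods, feature_priors, prior_u):
    """Naive Bayes posterior Pr[s is from U | f1, ..., fm]:
    (prod_i Pr[fi | U]) * Pr[U] / prod_i Pr[fi]."""
    p = prior_u
    for name, value in feature_values.items():
        p *= likelihoods[name](value) / feature_priors[name](value)
    return p
```

A classifier CU then answers "Yes" exactly when `nb_posterior(...)` exceeds the threshold T.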
Accuracy Metrics
Implicit identifiers are not perfectly identifying. Therefore, to evaluate Question 1, we quantify the accuracy of our classifier. Accuracy has two components: (1) the true positive rate (TPR), or the fraction of validation samples that user U generates that we correctly classify, and (2) the false positive rate (FPR), or the fraction of validation samples that user U does not generate that we incorrectly classify. The former tells us how often U's traffic will identify her, while the latter tells us how often we will mistake U as the source of other traffic.

Figure 2.1: Mean TPR and FPR as the classifier threshold T is varied for fields.

Figure 2.2: CCDF of classifier thresholds T that achieve FPR = 0.01 for different users.

We measure accuracy with TPR and FPR instead of just precision (i.e., the fraction of all samples classified correctly) because the vast majority of samples are negative (i.e., not from the target user). Hence, classifiers that mark a larger fraction of samples as negative would score higher in precision even if they marked the same fraction of true positives incorrectly.

           Fraction of users trainable
           sigcomm   ucsd
netdests      0.89   0.84
ssids         0.81   0.55
bcast         0.70   0.65
fields        1.00   1.00

Table 2.3: The fraction of profiled users that we could train using each feature.

Trainable Users. When evaluating each identifier, we consider only those users that have at least one training sample that contains it, since we cannot build profiles for those with no such samples. Table 2.3 shows the fraction of profiled users that exhibit each feature in the training period. Each implicit identifier is exhibited by a different subset of users. In both workloads, each implicit identifier is exhibited by a majority of profiled users. The fraction of users that exhibited the ssids feature is lower in the ucsd workload (55% vs. 81%) because fewer users sent SSID probes to search for a network. This may be because many ucsd users already established a high preference for the UCSD network, having used it previously. sigcomm users were all new to the SIGCOMM network and initiated broader searches for their preferred networks before association.

Classifier Thresholds. We evaluate each classifier across several thresholds T in order to determine the trade-off between TPR and FPR. As T increases, FPR and TPR decrease because the classifier requires more evidence that a user is present in order to answer positively. This is exemplified in Figure 2.1 for the classifier using the fields feature. We assume that an adversary desires a target FPR, such as 1 in 100, and chooses a threshold T based on that target. Ideally, the target FPR would be low. Due to variance in each user's training data, an adversary may need to use different thresholds to achieve the same FPR for different users.
This is exemplified in Figure 2.2, which shows a complementary cumulative distribution function (CCDF) of thresholds that achieve FPR = 0.01 for each user’s classifier using the fields feature. An adversary would train a different classifier for each user that he is tracking. In practice, an adversary would have to select T without a priori knowledge of the FPR achieved on the validation data. In Section 2.6.1, we show that an adversary can select T to achieve a desired FPR without this knowledge when using multiple features in combination.
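The threshold selection the adversary performs can be sketched as follows (a simplified sketch: `scores` stand in for the classifier's posterior values on labeled training or held-out samples, and the function names are ours):

```python
def rates_at_threshold(scores, labels, t):
    """TPR and FPR of the rule `score > t` on labeled samples."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s > t)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s > t)
    pos = sum(1 for y in labels if y)
    neg = len(labels) - pos
    return tp / pos, fp / neg

def threshold_for_target_fpr(scores, labels, target_fpr):
    """Smallest observed-score threshold whose FPR meets the target."""
    for t in sorted(set(scores)):
        if rates_at_threshold(scores, labels, t)[1] <= target_fpr:
            return t
    return max(scores)
```

Raising T lowers both rates, which is the trade-off shown in Figure 2.1; a different T would be learned for each user's classifier.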
Figure 2.3: Classification accuracy using each feature. The top two graphs show the mean achieved TPR for (a) FPR = 0.01 and (b) FPR = 0.1. The line above each bar shows the maximum expected TPR given a perfect classifier on that feature. The bottom two graphs show a CCDF of the achieved TPR on sigcomm users for (c) FPR = 0.01 and (d) FPR = 0.1.
Results
In order to examine the characteristics of each individual implicit identifier, we now focus on the TPR achieved for different FPR targets using each identifier in isolation.

Mean Accuracy. Figures 2.3(a) and (b) show the mean TPR achievable with each implicit identifier in isolation for FPR = 0.01 and FPR = 0.1, respectively. For example, when using netdests, we can identify samples from the average user in both workloads about 60% of the time for FPR = 0.01. The line above each bar indicates the maximum expected TPR that a perfect classifier would achieve on that implicit identifier, i.e., a classifier that
always classifies a sample correctly if it has that implicit identifier, but guesses randomly otherwise. This line is below 1.0 because some validation samples do not contain a particular implicit identifier and, hence, even a perfect classifier on this identifier would not do better than random guessing on those samples. For example, many samples have no SSID probes and, thus, are missing the ssids identifier.

Figure 2.3(a) shows that the average user sometimes emits an implicit identifier that is highly distinguishing. netdests, ssids, and bcast all achieve moderate TPRs (about 60%, 18%, and 30%, respectively) even for a very low FPR (1%). The lower TPR for ssids is expected, since users usually only emit distinguishing SSIDs when they are searching for a network. Indeed, the theoretical maximum TPR achievable by a perfect classifier is only about 40%. Also, as expected, fields is not able to identify many samples on its own since it only distinguishes wireless cards and drivers.

Figure 2.3(b) shows that the TPR for fields improves to 40% and 60% when FPR = 0.1, for the sigcomm and ucsd workloads, respectively. Thus, the fields identifier is good at classifying users into groups, and can aid in identifying users in those cases when no unique identifier is observed. This is expected, since fields only distinguishes wireless cards and drivers. The TPR of the other three features improves much less dramatically when we increase the allowable FPR from 0.01 to 0.1. This is because most of the other implicit identifiers either uniquely identify a user, or are not identifying at all. Thus, the TPR gains observed when we increase FPR are mostly due to less conservative random guessing on the remaining samples. This effect can be seen in Figure 2.4, which shows the variation in mean TPR and FPR across classification thresholds for sigcomm users. The x = y line shows how well random guessing is expected to perform.
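The x = y baseline holds because a classifier that ignores the sample and answers "Yes" with probability p has expected TPR = FPR = p; a quick simulation with hypothetical labels confirms it:

```python
import random

def guessing_rates(labels, p, trials=2000, seed=7):
    """Empirical TPR/FPR of answering "Yes" with probability p while
    ignoring the sample itself (the random-guessing baseline)."""
    rng = random.Random(seed)
    tp = fp = 0
    pos = sum(1 for y in labels if y)
    neg = len(labels) - pos
    for _ in range(trials):
        for y in labels:
            if rng.random() < p:
                if y:
                    tp += 1
                else:
                    fp += 1
    return tp / (pos * trials), fp / (neg * trials)
```

Both rates converge to p regardless of how the labels are distributed, so any classifier whose (FPR, TPR) curve lies above the diagonal is extracting real information from the implicit identifier.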
The TPR of all the features except for fields grows roughly linearly toward 1.0 after the initial spike, which is the effect that progressively less conservative random guessing would have. For all features, users in the ucsd workload are slightly more identifiable than those in the sigcomm trace. This is probably because there are more total users in the sigcomm workload and, thus, a higher likelihood that two users exhibit the same traits. We examine the effect population size has on tracking in Section 2.6.2.

Figure 2.4: The mean achieved TPR and FPR for sigcomm users as we vary the classification threshold T using each feature alone. The x = y line shows how well random guessing would perform.

Variation Across Users. Accuracy for some users is better than for others. Thus, Figure 2.3(c) and (d) show a CCDF of achieved TPR over all users in the sigcomm workload, for FPR = 0.01 and FPR = 0.1, respectively. For example, consider netdests when FPR = 0.01. In this case, 65% of users achieve a TPR of at least 50%.

Each of the first three implicit identifiers distinguishes some users very often. Figure 2.3(c) shows that 65%, 11%, and 24% of users have samples that are identified at least half of the time with an FPR of only 0.01 using netdests, ssids, and bcast, respectively. This implies that a non-trivial number of users are trackable even if only one of these features is available. Nonetheless, when FPR = 0.1, 12%, 53%, and 29% of users have a TPR of at most 0.1 as well using netdests, bcast, and ssids, respectively (see Figure 2.3(d)). This means that our classifier does not perform any better than random guessing on these users. These users are simply not identifiable. For example, for the netdests feature, this means that these users only visited popular destinations during the training period or did not revisit any site in the subsequent days. This result also implies that the mean TPR shown in Figure 2.3(a) and (b) actually underestimates the TPR for the users that are identifiable at all, since this fraction of non-identifiable users drags the mean down. We conclude that there is a large variation in user distinctiveness.

Training Sample Sensitivity. To explore the variability in classifier accuracy for different users, we examine whether users observed more often during the training period are more identifiable. Figure 2.5 shows the mean TPR achieved for FPR = 0.01 for sets of sigcomm users with different numbers of training samples. The error bars show 95% confidence
intervals, which are negligible for most points. Figure 2.5 shows that the mean TPR noticeably increases with more training samples for netdests and bcast. For netdests, TPR stabilizes after 3 training samples. The TPR of ssids and fields does not change dramatically with more training samples, probably because these identifiers are generated without user interaction and, thus, are nearly always identical when emitted. Artifacts near the right hand side of each graph, such as large confidence intervals, are mostly due to small sample sizes for those points. We conclude that an adversary can build a more accurate classifier with more samples, but needs very few to build one that is useful.

Figure 2.5: Sensitivity to the number of training samples for each feature. The mean TPR achieved for FPR = 0.01 for sigcomm users with different numbers of training samples. The error bars indicate 95% confidence intervals.

Accuracy Over Time. One concern is that the accuracy of ssids may degrade over time since a user's preferred networks list can change. Figure 2.6 shows how the mean TPR varies over two weeks in the apt trace, the only trace of that duration, fixing FPR = 0.01. Each value is normalized by the mean TPR for the entire period. Even after two weeks, normalized values are close to 1, which suggests that the SSIDs that users emit are relatively stable over time.

Figure 2.6: Accuracy over time. Normalized mean TPR on each day in the apt trace for FPR = 0.01. Each TPR value is normalized to the mean TPR for the entire period, evaluated over the users present during that day. The mean TPR for the entire period over all profiled users is 42%.
2.6 When can we be tracked?
In this section, we evaluate how accurately an adversary can answer Question 1 and Question 2 in each of the wireless environments described in Section 2.3. The previous section evaluated how well an adversary could use implicit identifiers independently to determine whether a sample came from a given user, but in practice, an adversary would not be restricted to using identifiers in isolation. Without link-layer encryption, public networks reveal features both at the link and network layers. In contrast, home networks that employ encryption reveal only link-layer features. Encrypted enterprise networks comprised of homogeneous devices might reveal only link-layer features that vary due to application and user behavior; features that vary due to driver- and card-level differences provide no useful information since they would not vary. Therefore, we evaluate each environment with the following features visible to an adversary:
• Public network: netdests, ssids, fields, bcast.
% users with FPR error < 0.01

             median error   90th percentile error
Public            97%              82%
Home              80%              64%
Enterprise        79%              68%
Table 2.4: Stability of classifier threshold T across different validation sub-samples. The percentage of users that have FPR errors that are less than 0.01 away from the target FPR of 0.01.
• Home network: ssids, fields, bcast.
• Enterprise network: ssids, bcast.
Since measurements from these environments can be difficult to obtain due to legal and ethical restrictions, we use our analysis of the sigcomm trace to estimate answers to these questions. In all three scenarios, we consider users with devices that will have a different pseudonym each hour of the day as in our analysis in the previous section. Many users in both the sigcomm and ucsd traces expose implicit identifiers of all four types, so we conjecture that populations in other environments are unlikely to differ substantially beyond the identifiers available. The population sizes will differ, however, so we vary the population size in our experiments. Enterprise networks may be more homogeneous, but the identifiers we consider vary due to user behavior and the applications that they run. ssids will remain distinguishing as long as users visit other networks with their devices, and bcast will remain distinguishing as long as laptops run Windows and use or search for different names, since a large number of broadcast packets are due to NetBIOS.
2.6.1 Q1: Did this Sample come from User U?
First, we evaluate how well an adversary can answer Question 1 using features in combination. Since all profiled users had at least one training sample with each of the four features in our training sets, we can evaluate the accuracy on all profiled users, not just a fraction, as was the case when using individual features (see Table 2.3). Figure 2.7 shows how accurately we can answer Question 1 for the average user when varying the threshold T in each of our three environments. Figure 2.8 shows the CCDF of TPR achieved for users in public, home, and enterprise networks for FPR = 0.01.
Figure 2.7: Classification accuracy for Question 1 if sigcomm users were in typical public, home, and enterprise networks.
When more features are visible, classification accuracy is better. In public networks, user samples are identified 56% of the time with a very low FPR (1%), on average. This TPR is slightly lower than that observed for netdests in Figure 2.3(a) because here we are considering all users, not only the 89% that exhibited netdests in their training samples. The average TPR in home and enterprise networks is 31% and 26%, respectively, when FPR = 0.01. Figure 2.8 shows that when FPR = 0.01, 63%, 31%, and 27% of users are identifiable at least 50% of the time in public, home, and enterprise networks, respectively. As expected, users are more identifiable in environments with more features.

Selecting the Classifier Threshold. As mentioned in Section 2.5.2, an adversary would have to select a classifier threshold T to achieve a desired target FPR. In practice, he would have to select the threshold without knowing a priori the resulting FPR on the validation data. Instead, an adversary would have to choose a T that achieves the target FPR on previous samples he has collected (e.g., as part of training). Therefore, in order to achieve the desired accuracy, the adversary requires that a T chosen in this manner achieve approximately the target FPR on as-yet-unseen validation data. To test whether this requirement is met, we ran the following experiment on the sigcomm workload: An adversary selects T that achieves FPR = 0.01 on a random 20% subsample
Figure 2.8: CCDF of TPR for Question 1 if sigcomm users were in a typical public, home, or enterprise network for FPR = 0.01.

of the validation data and tests whether the same T achieves a similar FPR on a different random 20% subsample. We perform 10 trials of this experiment per user and measure the absolute FPR errors, i.e., the difference between the achieved FPR and the target FPR. Table 2.4 shows the percentage of users whose median and 90th percentile errors are less than 0.01 away from the target FPR: 79-97% of users in all scenarios have errors less than 0.01 away from the target most of the time. This suggests that an adversary would be able to select a T that achieves an FPR very close to a desired target in most circumstances.
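The threshold-selection experiment above can be sketched in a few lines. The scores below are synthetic stand-ins for per-sample classifier output, and all names are illustrative rather than taken from our implementation:

```python
import random

def choose_threshold(neg_scores, target_fpr):
    """Smallest threshold T whose empirical FPR on the given
    negative-sample scores does not exceed target_fpr."""
    s = sorted(neg_scores, reverse=True)
    k = int(target_fpr * len(s))       # allow at most k false positives
    return s[k] if k < len(s) else s[-1]

def fpr_at(neg_scores, T):
    """Fraction of negative samples scoring above T (the achieved FPR)."""
    return sum(1 for x in neg_scores if x > T) / len(neg_scores)

random.seed(0)
# Synthetic classifier scores for samples NOT from user U (negatives).
negatives = [random.random() for _ in range(5000)]
tune, hold = negatives[:1000], negatives[1000:2000]  # two disjoint subsamples
T = choose_threshold(tune, target_fpr=0.01)
err = abs(fpr_at(hold, T) - 0.01)      # absolute FPR error on held-out data
print(round(err, 3))
```

With well-behaved score distributions, the held-out FPR lands close to the 1% target, which is the stability property Table 2.4 measures on real traces.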
2.6.2 Q2: Was User U here today?
Now we consider Question 2. We consider an adversary that wants to accurately detect the presence of a user during a particular 8-hour work day. In this section, we answer the following two questions: (1) How many users can be detected with high confidence? (2) How often does a user have to be active in order to be detected?
Methodology
Accuracy Estimation. Consider an environment with N users present each hour during an eight hour day. User U operates a laptop during active different hours this day, and thus an adversary obtains active samples from U. The adversary also obtains up to N samples each hour from the other users. Suppose an adversary would like to determine whether U is present during this day with a TPR of at least TPRtarget and an FPR of no more than FPRtarget. In Section 2.5.2, it was shown that an adversary could use features in combination to answer whether a particular traffic sample came from U with a moderate TPR (tprQ1) and a very low FPR (fprQ1), on average. Unfortunately, even a very low fprQ1 could result in the misclassification of a sample because, during an eight hour day, there would be up to 8N opportunities to do so. Therefore, to boost his accuracy, the adversary could answer Question 2 affirmatively only when multiple samples are classified as being from U. Specifically, suppose the adversary only answers Question 2 affirmatively when at least one sample from belief different hours is classified as from U. That is, he believes U is present during at least belief different hours. If we assume that the observations made during each hour are independent, when U is active during at least active ≥ belief hours,
TPRtarget ≤ Pr[X ≥ belief],

where X is a binomial random variable with parameters n = active and p = tprQ1. In addition,

FPRtarget ≥ Pr[Y ≥ belief],

where Y is a binomial random variable with parameters n = 8 and p ≤ 1 − (1 − fprQ1)^N, the probability that at least one sample not from U during one hour is misclassified. We show below that the independence assumption is not unreasonable.
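Under the independence assumption, the adversary can search for the smallest active (and a compatible belief) that meets both targets directly from binomial tail probabilities. The sketch below is illustrative only, plugging in the public-network rates from Section 2.6.1 (tprQ1 = 0.56, fprQ1 = 0.01) and an assumed N = 10:

```python
from math import comb

def sf_binom(n, p, k):
    """Pr[X >= k] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Illustrative per-sample rates for the Question 1 classifier (public network).
tpr_q1, fpr_q1, N = 0.56, 0.01, 10    # N = devices present per hour (assumed)

def min_active_hours(tpr_target, fpr_target):
    """Smallest number of active hours for which some belief threshold meets
    both targets within an 8-hour day, or None if no setting works."""
    p_fp_hour = 1 - (1 - fpr_q1) ** N  # >= 1 misclassified sample in an hour
    for active in range(1, 9):
        for belief in range(1, active + 1):
            if (sf_binom(active, tpr_q1, belief) >= tpr_target and
                    sf_binom(8, p_fp_hour, belief) <= fpr_target):
                return active
    return None

print(min_active_hours(0.90, 0.10))
```

Raising belief suppresses false positives at the cost of requiring more active hours; this trade-off is what Figure 2.11 quantifies across population sizes.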
In order for an adversary to answer Question 2 with TPRtarget and FPRtarget, he would determine whether there exists a threshold T for U's classifier that satisfies these constraints. In the process, he would also determine the minimum number of hours that U would have to be active (active). For example, when all four features are available, we show that quite a few users can be detected when they are active for several hours even if an adversary desires 99% accuracy (i.e., TPRtarget ≥ 99% and FPRtarget ≤ 1%).

Dependence. The constraints above assume that the observations made during each hour are independent. That is, the likelihood of observing a true or false positive does not depend on the adversary's past observations. The following analysis of the sigcomm trace shows that there is some dependence in reality, but that it is small. There are two primary concerns.

Figure 2.9: Limited dependence in the sigcomm trace. CDF of the maximum number of false positives (FPs) generated by any one user for each user.

The first concern is that our classifier may often confuse user U with another user Q, so that if Q is active, the false positive rate will be high regardless of the number of hours the adversary samples. This concern is mitigated by two factors that add randomness to the sampling process: 1) users enter and depart from the environment, and 2) user behavior is variable to begin with. Consider our classifier on all features using a classification threshold T = 0.5. Figure 2.9 shows, for each user that exhibits any false positives during the second full day of the sigcomm trace, the maximum number of false positives contributed by any other single user. From this cumulative distribution function (CDF), we see that for 60% of users, no single other user is responsible for more than 1 false positive, and for over 95%, no single user is responsible for more than 3 false positives. Therefore, most of the time these two factors prevent a large number of false positives from being correlated with a single user. In addition, since the user set is relatively static at a conference, there is likely to be more churn in the population of most other environments, further reducing the dependence.

The second concern is that there may be temporal locality in either true or false positive
samples. For example, we might expect that a user is much more likely to exhibit a particular feature if he has done so in the recent past. If temporal correlation were substantial, then the ratio

Pr[positive | positive in the last t hours] / Pr[positive]

would be much larger or smaller than 1.

Figure 2.10: Limited dependence in the sigcomm trace. CDF of how much more likely a true or false positive is given that one was observed recently.

Figure 2.10 shows a CDF of this ratio for each user's true and false positives when t = 2, using the same classifier as above. For true positives, we only consider times during which the user is active. For false positives, we only consider the active 9 hours of the last 2 days of the conference, since false positives are obviously less likely to occur when fewer people are present. If there were no temporal correlation, we would obtain a vertical line at x = 1. We note that 60% and 70% of users' true and false positives, respectively, are within a factor of 2 of this line, meaning that if a true (false) positive was seen in the last two hours, we are no more than 2 times more or less likely to observe another true (false) positive than otherwise. Moreover, given the small number of positives for each user, much of this variation is probably due to randomness. Therefore, temporal dependence is small.
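The ratio above can be computed from a per-user timeline of positives. The function below is a simplified sketch, not our actual analysis code: it treats hours as integers and takes the set of hours containing a positive as input.

```python
def temporal_ratio(positives, hours, window=2):
    """Pr[positive | positive within the last `window` hours] / Pr[positive].
    `positives` is the set of (integer) hours in which a positive occurred."""
    base = len(positives) / len(hours)
    recent = [h for h in hours
              if any((h - d) in positives for d in range(1, window + 1))]
    if base == 0 or not recent:
        return None
    cond = sum(1 for h in recent if h in positives) / len(recent)
    return cond / base

# Positives clustered in consecutive hours inflate the ratio above 1.
print(temporal_ratio({3, 4, 5, 15}, range(24)))
```

A ratio near 1 indicates no temporal correlation; clustered positives push it above 1, which is the deviation Figure 2.10 shows is modest in practice.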
Figure 2.11: The number of users detectable and the number of hours they must be active to be detected with (a) 90% accuracy and (b) 99% accuracy. The x-axis in each graph varies the population size. The top portion shows the number and percentage of users it is possible to detect. The bottom portion shows a box plot of the number of hours during which they must be active to be detected. That is, the thick line through the middle of each box indicates the median, the ends of each box demark the middle 50% of the distribution, and the whiskers indicate the minimum and maximum values.
Results
Figure 2.11 shows the number of users detectable and the number of hours they must be active to be detected with (a) 90% accuracy and (b) 99% accuracy. The x-axis in each graph varies the number of users present each hour. The top half of each graph shows the number of users an adversary can detect and, above each bar, the percentage of profiled users that number represents. The bottom half of each graph shows a box plot of the number of hours during which these users must be active to be detected. That is, the thick line within each box shows the median number of hours a detectable user has to be active to be detected, while the ends of each box demark the first and third quartiles. The whiskers mark the minimum and maximum.
For example, part (a) shows the results if the adversary desires an accuracy of 90% (i.e., TPRtarget ≥ 90% and FPRtarget ≤ 10%). Consider the public networks figure. The fourth bar from the left in the top part shows that when there are 10 users present per hour, we can detect 71% of users if they are active during all 8 hours when present. The box and whiskers just below that in the bottom part show that most of these users do not need to be active all 8 hours to be detected. Of the 71% of users that can be detected, 75% need to be active for at most 4 hours to be detected, 50% for at most 3 hours, and 25% for at most 2 hours.

Conclusions. We draw two overall conclusions. First, an adversary can successfully combine multiple implicit identifiers from a few samples to detect many users in common networks with high accuracy. The majority of users can be detected with 90% accuracy when active often enough in public networks with 100 concurrent users or fewer. At least 27% of users are detectable with 90% accuracy in all of the networks when there are 25 concurrent users or fewer. This implies that many users can be detected with high confidence in small to medium sized networks, regardless of type, if they are active often enough. Even in large networks with 100 users, 12% to 52% remain detectable.

Second, some users are detectable with very high accuracy. Even if an adversary desires 99% accuracy, the fraction of detectable users is between 12% and 37% in all networks with 25 users when they are active often enough. Therefore, even applying existing best network security practices will fail to protect the anonymity of a non-trivial fraction of users. Indeed, several usage patterns in home and enterprise networks make detection more likely than the overall results suggest. In home networks, very few users are likely to be active during each hour.
For example, even when monitoring all the networks in our apt trace, we only observed 4 users per hour, on average. Therefore, the results closer to the left side of each graph are more representative of home environments. Since users of an enterprise network are probably employees, they are more likely to be active for the entire observation period. Thus, the top half of each graph is probably a good estimation of the fraction of users that an adversary can detect on a typical day.
2.7 Summary, Implications, and Limitations
This chapter demonstrated that users can be tracked using implicit identifiers, traffic characteristics that remain even when unique addresses and names are removed. Although we found that our technique's ability to identify users is not uniform—some users do not
display any characteristics that distinguish them from others—most users can be accurately tracked. For example, the majority of users can be tracked with 90% accuracy when active often enough in public networks with 100 concurrent users or fewer. Some users can be tracked with even higher accuracy. Therefore, pseudonyms are insufficient to provide location privacy for many users in 802.11 networks.

Moreover, our results showed that even a single implicit identifier, such as netdests, ssids, or bcast, can be highly discriminating and that an adversary needs only 1 to 3 samples of users' traffic to track them successfully, on average. We note that by considering a subset of all possible implicit identifiers and a weak, passive adversary, our results only place a lower bound on the accuracy with which users can be tracked.
2.7.1 Implications
We believe our study illustrates three inherent shortcomings of the 802.11 protocol beyond exposing explicit identifiers, none of which is trivially fixed. These shortcomings afflict not only 802.11 but many wireless protocols, including Bluetooth and ZigBee.

Identifying information exposed at higher layers of the network stack is not adequately masked. For example, even with encryption, packet sizes can be identifying. Padding, decoy transmissions, and delays may hide information exposed by size and timing channels, but they increase overhead. For example, Sun et al. [150] found that 8 to 16 KB of padding is required to hide the identity of web objects. The performance penalty due to this overhead would be especially acute in wireless networks due to the shared nature of the medium.

Identifying information during service discovery is not masked. 802.11 service discovery cannot be encrypted since no shared keys exist prior to association. This raises the more general problem of how two devices can discover each other in a private manner, which is expensive to solve [1]. This problem arises not only when searching for access points, but also when clients want to locate devices in ad hoc mode, such as when using a Microsoft Zune to share music or a Nintendo DS to play games with friends.

Identifying information exposed by variations in implementation and configuration is not masked. Each 802.11 implementation typically supports different 802.11 features (e.g., supported rates) and has different timing characteristics. This problem is difficult to solve due to the inherent ambiguity of human specifications and manufacturers' and network implementers' desire for flexibility to meet differing constraints.

Balancing the costs involved in rectifying these shortcomings with the incentives necessary for deployment is itself a challenge. Nonetheless, rectifying these flaws at the protocol level is important so that users need not limit their activities in order to protect their location privacy. By measuring the magnitude with which each flaw contributes to the implicit identifier problem, our study provides insight into the proper trade-offs to make when correcting these design flaws in future wireless protocols. We describe how future protocols should address these problems in the next chapter. In the short term, our study may give guidance to individuals who are willing to proactively hide their identity in existing wireless networks.
2.7.2 Study Limitations
In our measurement analysis, we relied on wireless traces with durations of at most several days. In addition, we did not examine wireless traffic from some common environments, such as hotspots. We do not believe our analysis would be substantially altered in these environments because most of the implicit identifiers we examined are generated "automatically" by applications running on a device rather than by particular user behavior. In addition, our longer-term study of SSIDs suggests that at least one implicit identifier is stable over time. This matches our intuition that re-configuration of user devices and applications, which can change the implicit identifiers a device exhibits, occurs rarely. Nonetheless, it would still be useful to complement our study with a longer-term study of wireless traffic from users in more diverse environments to validate these conjectures and to understand how implicit identifiers may evolve over time. Such a study would aid in understanding whether pseudonym schemes may be sufficient to mitigate tracking threats at longer time scales and different types of locations.
Chapter 3
Mitigating Eavesdropping Threats
Identifiers and addresses have always played important roles in network protocols. For example, the structure of IP addresses is critical to scalable IP routing. In addition, application layer discovery protocols such as DNS expose identifiers in order to support rendezvous between clients and services. Identifiers in IP, transport, and application layer protocols pose privacy threats to users when eavesdroppers can intercept messages being routed in the middle of networks, e.g., by ISPs or on LANs where link layer traffic is unencrypted. For example, these eavesdroppers can track when a user is online and determine which parties are communicating. A number of defenses have been developed to guard against these traffic analysis attacks, such as mix networks [49] and cover traffic [68, 121, 161], but they are heavy-weight and incur substantial performance penalties. Fortunately, wired networks can be protected with physical security and link layer encryption, and only powerful adversaries that collude with ISPs can carry out eavesdropping attacks in the middle of the network.

Wireless link-layer protocols, however, are much more vulnerable to weak adversaries because anyone can eavesdrop on communications from nearby devices using only commodity hardware. Although link-layer encryption such as WPA can obscure identifiers in IP, transport, and application layer protocols, link-layer protocols such as 802.11 and Bluetooth expose identifiers themselves. Two types of identifiers also play important roles in these protocols:
• device/service identifiers: Device and service identifiers typically persist over long time scales (months, years, or longer) and are transmitted when devices try to set up connections. The process of service discovery and rendezvous, such as with an available access point (AP), requires a service to announce its existence with an
explicit, recognizable identifier or for a client to probe for it (e.g., a network's SSID and/or BSSID in 802.11). Either way, the process relies on the explicit transmission of a service name.
• connection identifiers: Connection identifiers or addresses are used to identify messages sent by the two devices participating in a connection. Thus, they must persist for at least the duration of a connection. A destination address (802.11) or connection identifier (WiMAX) allows a device to decide whether it is the destination of a message by using a simple compare operation. Mechanisms such as ARP that translate between addresses at different layers rely on identifiers that persist for the duration of a connection.
In the previous chapter, we showed that users can often be tracked even when MAC addresses are changed periodically. This is because eavesdroppers can still observe temporary addresses and network names in transmissions, in addition to other traffic properties that serve as implicit identifiers. Concealing specific fields, such as MAC addresses, leaves open the possibility of tracking and inventorying by other fields that have not been protected. Furthermore, because sequences of encrypted packets within a session remain linked by a connection identifier, side-channels can reveal sensitive information about their contents. For example, a distinct pattern of packet sizes and timings is sometimes sufficient to identify the keys a user types [147], the web pages he views [150], the videos he watches [142], the languages he speaks [171], and the applications he runs [172].

In this chapter, we present the first design and prototype of a wireless protocol that conceals all explicit information that can be used as identifiers, including device identifiers, connection identifiers, and all other explicit message fields. The obvious difficulty with simply removing identifiers is that they play key roles in the efficient operation of existing protocols. For example, a connection identifier allows a device to decide whether it is the destination of a message by using a simple compare operation. Mechanisms such as ARP that translate between addresses at different layers rely on identifiers that persist for significant periods of time. And the process of service discovery and rendezvous, such as with an available access point (AP), requires a service to announce its existence with an explicit, recognizable identifier or for a client to probe for it. Either way, the process relies on the explicit transmission of a service name.
This chapter presents SlyFi, an 802.11-like protocol that encrypts entire packets to remove explicit identifiers while retaining efficiency comparable to 802.11 with WPA. No explicit information in SlyFi messages can be used by third parties to link them together. We show that all features that rely on identifiers—service discovery, packet filtering, and address binding—can be supported without exposing them. Different mechanisms are
used for service discovery and subsequent data transfers, but in both cases a device can determine whether it is the recipient of a message with lightweight table look-ups. We have implemented SlyFi on commodity 802.11 NICs, and our experiments show that SlyFi's performance impact is modest. In particular, we show that a SlyFi client can discover and associate with services even faster than 802.11 with WPA using PSK authentication. SlyFi's overhead results in a throughput degradation that is only slightly greater than that of WPA with CCMP encryption (10% vs. 3%).

Chapter outline. Section 3.1 discusses the limitations of previous pseudonym proposals. Section 3.2 presents the requirements of a solution and an overview of SlyFi. Section 3.3 presents the design of the two main mechanisms it uses, while Section 3.4 discusses practical details and our prototype implementation. Section 3.5 demonstrates SlyFi's security properties formally and Section 3.6 analyzes SlyFi's robustness to different types of attacks. Section 3.7 presents performance evaluation results. Section 3.8 concludes this chapter.
3.1 Related Work
A number of proposals have argued for obscuring identifiers in network protocols at different layers of the protocol stack. These proposals advocate for three types of pseudonyms: unilateral pseudonyms, negotiated pseudonyms, and cryptographically generated pseudonyms.

Unilateral pseudonyms. Some wireless protocol identifiers can be changed over time without altering the functionality of a wireless protocol. For example, in 802.11, MAC addresses serve as connection identifiers (source and destination addresses), but persist for the lifetime of a device's network interface hardware. These identifiers can be changed by a device without negotiating with any other party [67, 84, 170]. In other words, they can be implemented by a client without AP support. We analyzed the limitations of using unilateral MAC address pseudonyms for 802.11 in Chapter 2. This solution is insufficient because service identifiers (e.g., SSIDs) and implicit identifiers remain exposed. Moreover, each client has to establish a new connection with an AP each time it changes its MAC address, disrupting end-to-end connections at higher layers.

Negotiated pseudonyms. One way to change connection identifiers more frequently is to have both ends of a connection (e.g., the client and the AP) periodically negotiate a new pseudonym. For example, GSM's connection identifier for a client, the Temporary Mobile Subscriber Identity (TMSI), can be refreshed periodically by the base station. Negotiated pseudonyms can also be used to make device identifiers, such as a GSM mobile
phone's International Mobile Subscriber Identity (IMSI), more ephemeral. In GSM, a mobile phone typically must transmit its globally unique IMSI in the clear to identify itself to a base station before it can establish a connection. Once a mobile has been authenticated, however, it can negotiate a new IMSI pseudonym to identify itself the next time it tries to establish a connection with the GSM network [73].

Explicitly negotiated pseudonyms would still pose two problems for common wireless protocols such as 802.11 and Bluetooth. First, because both ends of a connection must reliably agree on the new pseudonym before it can be used, at least one pair of messages must be exchanged. This would add substantial overhead if very frequent pseudonym changes are desired (e.g., on a per-message basis). Second, when used in discovery and rendezvous, at least one party to the rendezvous may transmit the same pseudonym multiple times (e.g., [48]). This is because discovery must involve probes for a service or announcements from the service, and because neither the client nor the service can negotiate a new pseudonym until they rendezvous again, any probes or announcements that do not elicit a response must be repeated with the same pseudonym. This enables eavesdroppers to track at least one party. This is not a significant concern in GSM because base stations do not care about location privacy; thus, they can authenticate themselves to a mobile phone before the phone reveals its own identity. However, in 802.11 and Bluetooth, both the client and service may be personal mobile devices (e.g., a mobile phone communicating with a laptop), so both parties require location privacy.

One way to mitigate these problems is to negotiate multiple pseudonyms for each device pair and use one per message. A number of proposed RFID protocols use this solution. RFID protocols involve wireless readers and tags, where the role of tags is to identify themselves (e.g., with a serial number) to readers that query them. No other information except a tag's device identifier is transmitted, so RFID protocols are comparable to basic discovery protocols in 802.11 and Bluetooth. To protect a tag's identifier from unauthorized readers, researchers have proposed storing a pseudonym table on tags and authorized readers [124, 87], so that a tag can emit a different identifier from the table each time it is queried. One problem with this approach is that, once all pseudonyms in the table are exhausted, any new message must repeat a pseudonym. Thus, due to the overhead of storing a table per device pair, this approach is impractical when the number of messages sent between renegotiations is large (e.g., for high volume data transfers) or can be unbounded (e.g., for generic discovery messages).

Cryptographic pseudonyms. A number of cryptographic schemes have been proposed to generate pseudonyms without explicit negotiation for each pseudonym change. These schemes typically use pseudorandom sequences or public key cryptography.
Arkko et al. [10] and Lindqvist et al. [108] argue for replacing explicit identifiers at all layers of the network stack to improve privacy. Arkko et al. propose replacing identifiers in messages with cryptographically secure pseudorandom sequences, where each message contains a new element from the sequence. For example, for a sender and receiver that share a symmetric key, Arkko et al. show how identifiers in TCP can be replaced with pseudorandom sequences. Pseudorandom sequences, such as pseudorandom functions (PRFs) constructed using cryptographic hash functions and pseudorandom permutations (PRPs) constructed using symmetric block ciphers, play an important role in most cryptographic pseudonym schemes that use symmetric keys, including SlyFi. There are two main research questions in how to incorporate pseudorandom sequences into network protocols: how do senders and receivers synchronize these sequences, and, since a receiver must store one symmetric key per potential sender, how can they scalably maintain and look up these keys?

Pseudorandom sequence schemes have been proposed for use in a number of specific protocols. Some proposed RFID protocols [87, 163, 89] use pseudorandom sequences to compute the table of pseudonyms, mentioned above, on the fly. However, these protocols disagree on how synchronization of sequences on readers and tags should be done. Cox et al. [46] use pseudorandom sequences for an application-level discovery protocol. Singelée and Preneel [144] propose using pseudorandom sequences to replace connection identifiers in Bluetooth. Armknecht et al. [11] propose using them in encrypted 802.11 headers. Lindqvist et al. [107] propose a pseudonym-based discovery protocol that avoids the synchronization problem by having the client include a nonce value in its probe that is used to derive the cryptographic pseudonym in a service's response.
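To make the synchronization and lookup questions concrete, the following is a minimal sketch of a symmetric-key address sequence; it is not SlyFi's actual construction (Section 3.3 presents that), and all names are illustrative. Each address is a PRF of a shared key and a message counter (here HMAC-SHA256, truncated), and the receiver precomputes the next expected address for every key it holds:

```python
import hmac, hashlib

def addr(key: bytes, counter: int) -> bytes:
    """Next unlinkable address in the sequence: PRF(key, counter),
    truncated to 6 bytes (an address-sized field)."""
    return hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()[:6]

class Receiver:
    """Maps each sender's next expected address to (sender, counter), so
    reception is a single dict lookup rather than a trial per stored key."""
    def __init__(self, keys):             # keys: {sender_name: shared_key}
        self.keys = keys
        self.table = {addr(k, 0): (s, 0) for s, k in keys.items()}

    def receive(self, a: bytes):
        hit = self.table.pop(a, None)
        if hit is None:
            return None                   # not for us: no decryption attempted
        sender, c = hit
        # Slide the window forward: precompute this sender's next address.
        self.table[addr(self.keys[sender], c + 1)] = (sender, c + 1)
        return sender

keys = {"laptop": b"k1" * 8, "phone": b"k2" * 8}
rx = Receiver(keys)
print(rx.receive(addr(keys["laptop"], 0)))   # laptop
print(rx.receive(addr(keys["laptop"], 1)))   # laptop (table was advanced)
print(rx.receive(b"\x00" * 6))               # None: unrecognized address
```

Because the receiver indexes by expected address, packets destined for other devices cost only a failed dictionary lookup. The sketch keeps one outstanding address per sender, so counter desynchronization (e.g., a lost message) is exactly the case a complete protocol would additionally need to handle.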
While each of these proposals addresses particular components of wireless protocols, none is complete: each omits important functions such as service discovery, authentication, data transport, or higher-layer binding. Some also have performance deficiencies. For example, the scheme proposed by Cox et al. uses a hash chain that evolves with real time to compute a pseudorandom sequence. Thus, a device that is asleep for a while would have to compute every intermediate address before obtaining the current address to use. This application-layer protocol tolerates this extra expense because its addresses change very infrequently. A fundamental problem with Lindqvist et al.'s approach is that it requires one party to try every key it has to decode a message. Therefore it is inefficient when a receiver has many keys and receives packets not destined for it, e.g., when it sees competing background traffic. Similarly, the scheme proposed by Armknecht et al. requires a receiver to try every key it has to decode packets with no matching address. Therefore, this scheme can be inefficient in real environments.

Cryptographic schemes based on public key cryptography have also been proposed
              10 ms  100 ms  1 sec  1 min    1 hr
SIGCOMM 2004    1.4     3.2    7.6   24.7    80.1
OSDI 2006       4.6     9.0   20.6   60.8   221.3
UCSD 2006       2.4     7.1   17.9   76.6   176.6
Table 3.1: Mean number of devices that send or receive 802.11 data packets at different time intervals at two conferences (SIGCOMM [138], OSDI [40]) and one office building (UCSD [43]). Intervals with no data packets are ignored. UCSD has observations from multiple monitors.

to protect identifiers in discovery and rendezvous messages. For example, Abadi and Fournet [1] present a public key scheme for private authenticated discovery. Practical implementations of these schemes in wireless protocols are inefficient, however, due to the high overhead of public key cryptography. We compare pseudonym schemes based on symmetric and public key cryptography in greater technical depth in Section 3.3.1. In addition to the limitations enumerated above, very few of these cryptographic pseudonym proposals have been implemented (only [46], to our knowledge). Thus, SlyFi is the first to show that real devices can implement these proposals in a practical manner. Practical implementations are important for understanding economic feasibility and deployability.
3.2 Problem and Solution Overview
Our goal is to build a wireless link layer protocol that allows clients and services to communicate without exposing identifiers to third parties. This section outlines the threat model we consider. We then discuss our security requirements and the challenges in meeting them and present an overview of SlyFi, an efficient identifier-concealing link layer protocol based on 802.11.
3.2.1 Threat Model
Attack. The previous section outlined three types of attacks enabled by low-level identifiers not obscured by existing security mechanisms: the inventorying, tracking, and profiling of users and their devices. Users can be subjected to these attacks without their knowledge because an adversary can carry them out without being visibly or physically
present. In addition, users are vulnerable even when using the best existing security practices, such as WPA. Thus, these attacks violate common assumptions about privacy.

The effectiveness of these attacks depends on an adversary's ability to link packets sent at different times to the same device. The easiest way for adversaries to link packets is by observing the same low-level identifier in each. Thus, our goal is to limit two forms of linkability. First, information should not be provided in individual packets that explicitly links the packets to the identities of the sender or intended receiver. Second, to prevent the profiling, fingerprinting, and tracking of sequences of related packets, packets from the same sender should not be linkable to each other, irrespective of whether any one of them may be linked explicitly to its source. In other words, when there are k potential devices and an adversary observes a packet, he should only be able to infer that the packet is from (or to) one of those k devices, not which one. Profiling a device's packet sequences would be more difficult even at short timescales if many devices are active simultaneously. Table 3.1 shows the average number of active devices observed at different time intervals; in all three 802.11 traces, many devices are indeed active simultaneously.

Potential Victims. The aforementioned attacks are damaging to both wireless clients, such as laptops, and wireless services, such as APs, particularly since the distinction between client and service devices is becoming increasingly blurred; e.g., a client game station sometimes provides wireless service to others as an ad hoc AP. Thus, we want to limit the linkability of packets transmitted by both clients and services. We assume that clients and services have (possibly shared) cryptographic keys prior to communication. These keys can be obtained in the same way as in existing secure 802.11 and Bluetooth networks.
For example, devices can leverage traditional credentials from trusted authorities (e.g., for RADIUS authentication) or bootstrap symmetric keys using out-of-band pairing techniques [152]. We believe that most private services will be known beforehand (e.g., a home 802.11 AP) and can bootstrap keys using these methods. Nonetheless, in previous work [129] we also proposed methods to privately bootstrap keys with unknown services by leveraging transitive trust relationships.

The mere possession of cryptographic keys does not immediately yield satisfactory solutions, however, as clients and services have limited computational resources. As a consequence, solutions should not enable denial of service attacks that exploit this limitation. For example, simply encrypting the entirety of a packet is not sufficient if a receiver cannot quickly determine whether it is the intended recipient or not. This is because an adversary would then be able to exhaust a device's computational resources by broadcasting "junk" packets that the device would expend a non-trivial amount of resources to discard.
Adversary. We are concerned with limiting the packet linking ability of eavesdroppers, i.e., parties other than the original sender or intended recipient of those packets. For example, packets sent between an 802.11 client and an 802.11 AP are exposed to anyone within radio range, but only the client and service should be able to link them together. We are not concerned with preventing the service from linking together the client's packets (or vice versa), as techniques used to hide a client's identity from a service in wired networks (e.g., [50]) are also applicable in wireless networks. We assume adversaries have commodity 802.11 radios and are able to observe all transmitted packets, but they are not privy to the cryptographic keys that clients and services have prior to communication. As with most practical systems, we assume that adversaries are computationally bounded and thus cannot successfully attack standard cryptosystems such as RSA, ElGamal, and AES.

Limitations. SlyFi's removal of low-level identifiers makes it much more difficult for third parties to link packets together or to a particular user, thus improving privacy. Nonetheless, packet sizes, packet timings, and physical layer information may still sometimes act as side channels that link packets together. We briefly discuss SlyFi's resilience to these types of attacks in Section 3.6, but a thorough analysis is outside the scope of this dissertation. However, without explicit identifiers linking together packets, separating the transmissions of different sources becomes a more difficult probabilistic task. Such attacks are less accessible as they usually require sophisticated attackers [153] or non-commodity hardware [131]. We note that packet sizes and timing can be hidden using well-known packet padding and cover traffic techniques that make all packets appear uniform.
These techniques can be applied at the network layer, so SlyFi complements them by encapsulating their output packets in an identifier-free link layer. The primary disadvantage of these existing techniques is that they must add significant overhead to ensure that all side-channels are hidden (e.g., by padding all packets to the maximum packet length). We discuss these trade-offs and the circumstances under which such overhead might be needed in Section 3.6.
3.2.2 Security Requirements
We want to be able to deliver a message from A to B without identifiers, but still ensure that B can verify it was sent by A. More specifically, consider a procedure F that computes c ← F(A, B, p), where A and B are the identities of the sender and recipient, respectively, p is the original message payload, and c is the result which A transmits. (Shared cryptographic key state is an additional, implicit input to F, but we omit it here for brevity.) We
want F to have the following four properties. We denote security properties in this chapter using small caps.
STRONG UNLINKABILITY. To protect against tracking and profiling attacks, a sequence of packets should not be linkable. More formally, any party other than A or B that receives c1 = F(A, B, p1) and c2 = F(A, B, p2) should not be able to determine that the sender or receiver of c1 or c2 are the same. In particular, this implies that c1 and c2 must not contain consistent identifiers. We note that some packet types, such as discovery messages, are less vulnerable to short-term profiling and thus only need to be unlinkable at coarser timescales to prevent long-term tracking. Consequently, we outline a relaxed version of this property in Section 3.3.3 to efficiently handle these packets.
AUTHENTICITY. To restrict the discovery of services to authorized clients and prevent spoofing and man-in-the-middle attacks, recipients should be able to verify a message’s source. More formally, B should be able to verify that A was the author of c and that it was constructed recently (to prevent replay attacks).
CONFIDENTIALITY. No party other than A or B should be able to determine the contents of p. In contrast to existing wireless confidentiality schemes, not even fields and addresses in the header should be decipherable by third parties.
MESSAGE INTEGRITY. Finally, as with existing 802.11 security schemes, receivers should be able to detect if messages were tampered with by third parties. More formally, B should be able to derive p from c and verify that it was not altered after transmission.
We give more formal definitions for these properties in Section 3.5.
3.2.3 Challenges
The principal approach to concealing 802.11 client identities has been to use MAC address pseudonyms [67, 84]. Pseudonym proposals do not meet our strong unlinkability requirement because all packets sent under one pseudonym are trivially linkable. Moreover, the use of pseudonyms does not conceal other information in headers, such as capabilities, that can be used to link packets together [127]. Furthermore, the proposals focus on data delivery alone, and do not address important network functions, such as authentication and service discovery.

Prior approaches are limited because meeting all our security requirements while maintaining important wireless functionality is nontrivial. Consistent destination addresses allow devices to quickly filter messages intended for others, so efficient data transport is difficult without them. Moreover, cryptographic authenticity is difficult to provide without identifiers. Message recipients typically need to know which cryptographic key to use to verify a message, and it is hard to tell the recipient which one without explicitly identifying it. Finally, removing identifiers completely from the process of service discovery is hard because wireless clients and services typically rendezvous by broadcasting an agreed-upon identifier. A service might be willing to expose its identifier through announcements to save potential clients from having to expose it in probes. No such straightforward solution exists to conceal both client and service identities.
3.2.4 System Overview
In light of the shortcomings of existing solutions, we introduce the SlyFi protocol, which meets our security requirements using two identity-concealing mechanisms, Tryst and Shroud, while providing functionality similar to 802.11. Before describing these mechanisms, we first give an overview of SlyFi in this section.

The SlyFi link layer is designed to replace 802.11 for managed wireless connectivity between clients and APs. The privacy-protecting mechanisms of the protocol explicitly protect all bits transmitted by the link layer. A client wishing to join and send data to a SlyFi network sends a progression of messages similar to 802.11 (Figure 3.1). Instead of sending these messages in the clear, they are encapsulated by the two identity-hiding mechanisms we describe in Section 3.3.

A client first transmits probes, encapsulated by Tryst, to discover nearby APs it is authorized to use. A probe is encrypted such that: 1) only the client and the networks named in the probe can learn the probe's source, destination, and contents, and 2) messages encapsulated for a particular SlyFi AP sent at different times cannot be linked by their contents. An AP that receives a probe verifies that it was created by an authorized user and sends an encrypted reply, indicating its presence to that client. If the client wishes to establish a link to the AP, it sends an authentication request, also encapsulated by Tryst, containing session information including keys for subsequent data transmission, which are used to bootstrap Shroud. Obviously, SlyFi APs cannot send clear-text beacons if they wish to protect service identities. However, they may do so if they wish to announce themselves publicly. Such a public announcement could immediately be followed by a confidential authentication request from an interested client, and thus would not compromise client privacy.

After a link has been established by an authentication response, Shroud is used to conceal the addresses and contents of future messages delivered on the link. An eavesdropper cannot use the contents of any two messages protected by Shroud to link them to the same sender or receiver.

[Figure: side-by-side message ladders for 802.11 and SlyFi. The 802.11 exchange (Probe Request/Reply, Authentication Request/Reply, Association Request/Reply, Data, Acknowledgments) is sent in the clear; in SlyFi, probe and authentication messages are encapsulated by Tryst, and association, data, and acknowledgment messages by Shroud.]

Figure 3.1: The SlyFi protocol.

Both Tryst and Shroud essentially encrypt the entire contents of each message, including addresses normally found in the header. The essential differences between them arise due to the different requirements of discovery, link establishment, and data transfer.
3.3 Identifier-Free Mechanisms
Identifiers are used in wireless protocols for two general functions: 1) as a handle by which to discover a service and establish a link to it, and 2) to address packets on a link and allow unintended recipients to ignore packets efficiently. Tryst and Shroud address each of these functions, respectively. To motivate our mechanisms, we first describe two straw man mechanisms that meet our security requirements, but are inefficient. We then discuss Tryst and Shroud, which are enabled by minor relaxations of these requirements or additional assumptions made possible by their intended uses. We conclude the section by discussing how SlyFi can still support other protocol functions, such as higher layer binding.

To illustrate each mechanism we consider the scenario when A sends a message p to B. Each mechanism consists of three key elements: the bootstrapping of cryptographic keys that the sender and receiver require to compute the procedure F (described in Section 3.2.2); the construction of c ← F(A, B, p) by the sender; and the message filtering by a receiver to determine if c is intended for him.
3.3.1 Straw Man: Public Key Mechanism
We first sketch public key, a mechanism based on a protocol that Abadi and Fournet [1] prove meets the aforementioned security requirements.

Bootstrapping. This mechanism assumes that A and B each have a public/private key pair and each have the public keys of the other.

Construction. We sketch this mechanism here, but refer the reader to the first protocol discussed in [1] for details. To provide authenticity, A digitally signs the statement s = {A, B, T}, where T is the current time. A message header is constructed as an encryption of s and the digital signature, using B's public key. By using a public key encryption scheme that does not reveal which key is used, such as ElGamal [24], the identities of neither the sender nor the intended recipient are revealed.1 In addition, this achieves strong unlinkability because the ElGamal encryption scheme is randomized, so each encrypted header appears random. The payload can be encrypted via conventional means (e.g., as described later in Section 3.3.3).
1 In practice, we would still use RSA for faster signatures; we just require each party to have both ElGamal and RSA key pairs.
Message filtering. When B receives a message, he will attempt to decrypt this header. If the decryption fails (i.e., the result does not include the statement {j, B, T} for a known identity j), the message is not intended for B and can be discarded. If decryption succeeds, B then checks the signature and the time to verify that the message was recently generated by j before accepting it.
Although this protocol achieves the security properties we desire, it is slow because it uses public key cryptography. In particular, on AP and consumer electronics hardware, a single private key decryption can take over 100 milliseconds—several orders of magnitude greater than the time required to transmit the message (see Section 3.7). Since B must attempt to decrypt the header for every message he receives whether he is the intended recipient or not, he can be backlogged just by processing ambient background traffic. One way to avoid this overhead would be to use a public key protocol only for infrequently occurring messages, such as 802.11 probes. These messages can be used to exchange a symmetric key for future messages. One problem with this approach is that it still leaves receivers susceptible to denial-of-service attacks: anyone can broadcast "junk" probes and force receivers to become backlogged processing them. Moreover, we find that in practice even non-adversarial 802.11 probe traffic can be large enough to backlog receivers (see Figure 3.9).
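Back-of-the-envelope arithmetic, using the ~100 ms figure above and a hypothetical ambient traffic rate, shows why trial decryption cannot keep up:

```python
# Illustrative arithmetic only; the 100 ms/decryption figure is from the
# text above, and the ambient packet rate is a hypothetical example.
decrypt_time_s = 0.100                  # one private key decryption
max_filter_rate = 1 / decrypt_time_s    # packets/s the receiver can examine

# Even a modest 50 overheard packets per second (well within the activity
# levels of Table 3.1's busier traces) outpaces the receiver.
ambient_rate = 50.0
backlog_per_second = ambient_rate - max_filter_rate

assert abs(max_filter_rate - 10.0) < 1e-9   # at most ~10 packets/s
assert backlog_per_second > 0               # the queue grows without bound
```

Since every overheard packet must be trial-decrypted before it can be discarded, the receiver's queue grows whenever the ambient rate exceeds its filtering rate.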
3.3.2 Straw Man: Symmetric Key Mechanism
Next we sketch symmetric key, a similar mechanism based on symmetric keys that addresses this pitfall.

Bootstrapping. This mechanism assumes that A and B share a symmetric key.

Construction. Using symmetric keys shared only between A and B, we can use a construction intuitively similar to public key. A encrypts the statement s using symmetric encryption such as AES-CBC. We can omit A and B from s since it is implied by the use of their symmetric key. A then computes a message authentication code (MAC) over the encrypted value so B can verify its authenticity. A random initialization vector (IV) is used so that the resulting ciphertext and MAC appear random and thus are unlinkable to any other message.

Message filtering. Upon receipt of a message, B verifies the MAC in the header using the same key A used to construct the message. If the MAC does not verify, then this message is not for B and he can discard it. Of course, since the message does not indicate to B which key was used to generate
Symbol                                 Definition
I                                      The length of each Tryst time interval.
T, T0, Ti                              Respectively, the current time, the time Tryst keys were
                                       bootstrapped, and the start of time interval i: T0 + i · I.
kp                                     A one-time-use key for encrypting a payload.
k^Enc_AB, k^MAC_AB, k^addr_AB          Long-term keys to encrypt, MAC, and compute addresses
                                       for Tryst messages sent from A to B.
k^Enc_s:AB, k^MAC_s:AB, k^addr_s:AB    Session keys to encrypt, MAC, and compute addresses
                                       for Shroud messages sent from A to B.
AES_k(b)                               Encipher single 128-bit block b with key k using the AES
                                       cipher.
AES-CTR2_k(i, m)                       Encrypt m with symmetric key k and 128-bit IV i using
                                       the AES cipher in CTR2 mode [139].
AES-CMAC_k(m)                          Construct a 128-bit message authentication code (MAC)
                                       of m with key k using AES-CMAC [148].
SHA1_128(m)                            Return the first 128 bits of a cryptographic hash of m.
Table 3.2: Cryptographic terminology used in Section 3.3.3–Section 3.4. All keys are 128 bits.

the MAC—indeed it cannot, or it will no longer be unlinkable—and B has a symmetric key for each client from whom he can receive messages, B must try all these keys to verify the MAC. There is locality in when keys are used (e.g., A may know that he expects a reply from B after sending a message to him), so we can sort keys in most-recently-used order, but, for messages not intended for B, he must try all keys before discarding them. Thus, filtering is inefficient for clients or APs that have many keys.
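The filtering loop of this straw man can be sketched as follows; HMAC-SHA256 stands in for the MAC, and the key count and message sizes are hypothetical, but the cost of one MAC check per stored key per packet is the point:

```python
import hmac
import os

def filter_message(keys: list, header: bytes, mac: bytes):
    """Return the matching key, or None, by trying every key in MRU order.

    Because the header cannot name the key without becoming linkable,
    a message not intended for us costs one MAC verification per key.
    """
    for k in keys:  # O(number of keys) work for every received packet
        if hmac.compare_digest(hmac.new(k, header, "sha256").digest(), mac):
            return k
    return None

keys = [os.urandom(16) for _ in range(100)]  # one key per potential sender
header = os.urandom(32)

# A message built with a known key is eventually found...
mac = hmac.new(keys[42], header, "sha256").digest()
assert filter_message(keys, header, mac) == keys[42]

# ...but "junk" traffic forces all 100 MAC checks before it can be discarded.
assert filter_message(keys, header, os.urandom(32)) is None
```

Sorting the key list in most-recently-used order helps only for messages that do match; mismatches, which dominate in busy environments, always pay the full cost.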
3.3.3 Discovery and Binding: Tryst
We now describe Tryst, the mechanism we use for transmitting discovery and binding messages such as 802.11 probes and authentication messages. Tryst builds upon the symmetric key straw man, but leverages the following properties of these messages in order to enable efficient message processing:
s     = {addr^i_AB, AES_{k^Enc_AB}(kp)}     (32 bytes)
mac   = AES-CMAC_{k^MAC_AB}(s)              (16 bytes)
etext = AES-CTR2_{kp1}(0^128, p)            (variable)
emac  = AES-CMAC_{kp2}(etext)               (16 bytes)
Figure 3.2: Tryst packet format.
Infrequent Communication. Individual devices send discovery and binding messages infrequently. For example, 802.11 clients send probes only when they are searching for an AP and send authentication messages only at the beginning of a session or when roaming between APs.
Narrow Interface. Unlike data packets, which can contain arbitrary contents, there are very few different messages that are used for discovery and binding. Thus, it is unlikely that their evolution at short time scales exposes many sensitive side channels of information when individual messages are not decipherable. It is only the ability to link these messages together at long time scales (e.g., hours or days) that threatens location privacy.
Based on these two observations, we define a relaxed version of the strong unlinkability property:
LONG-TERM UNLINKABILITY. Let t(m) be the time a message m was generated. Any party other than A or B that receives c1 = F(A, B, p1) and c2 = F(A, B, p2) should not be able to determine that the sender or receiver of c1 or c2 were the same if |t(c2) − t(c1)| > I, for some time interval I. In practice, I would be several minutes and may be different for each client-service relationship.
Tryst achieves this relaxed form of unlinkability, which is sufficient for discovery and binding messages because very few are likely to be generated during any given interval I. Even if an adversary is able to force multiple discovery messages to be generated during one interval, e.g., by jamming the channel to force all clients to reassociate, the ability to link them together is unlikely to be threatening. For clarity, we list the cryptographic terminology we use in the subsequent description in Table 3.2.
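Ahead of the details, the temporary unlinkable addresses defined below (and the receiver's precomputed address table) can be sketched in a few lines. HMAC-SHA256 stands in for the AES-based PRF of Table 3.2, and the interval length, keys, and timestamps are all hypothetical:

```python
import hmac

INTERVAL = 10 * 60  # I: hypothetical 10-minute Tryst interval, in seconds

def address(k_addr: bytes, t_now: float, t0: float) -> bytes:
    """addr^i = PRF_{k_addr}(i) for interval i = floor((T - T0) / I)."""
    i = int((t_now - t0) // INTERVAL)
    return hmac.new(k_addr, i.to_bytes(8, "big"), "sha256").digest()[:16]

k = b"\x02" * 16    # hypothetical shared address key
t0 = 1_000_000.0    # hypothetical time the keys were bootstrapped

# Within one interval, sender and receiver compute the same address...
assert address(k, t0 + 5, t0) == address(k, t0 + INTERVAL - 1, t0)

# ...but addresses from different intervals are independent PRF outputs,
# unlinkable without the key.
assert address(k, t0 + 5, t0) != address(k, t0 + INTERVAL + 5, t0)

# Each interval, the receiver rebuilds a table mapping the current address
# of every known peer to that peer's identity: filtering is a table lookup
# rather than one MAC check per key.
table = {address(k, t0 + 5, t0): "A"}
assert table.get(address(k, t0 + 8, t0)) == "A"
```

This is what distinguishes Tryst from the symmetric key straw man: the per-packet cost of filtering drops from trying every key to a single hash-table lookup.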
Bootstrapping. Similar to symmetric key, A and B each have symmetric keys k^Enc_AB, k^MAC_AB, and k^addr_AB for constructing messages from A to B (and another set of keys for B to A). They also remember the time they exchanged these keys as T0.
Temporary unlinkable addresses. A client A and a service B that share a symmetric key can independently compute the same sequence of unlinkable addresses and thus will at any given time know which address to use to send messages to the other. Specifically:
addr^i_AB = AES_{k^addr_AB}(i), where i = ⌊(T − T0)/I⌋
In other words, addr^i_AB is a function of the i-th time interval after key negotiation. The crucial property we leverage is that, without knowledge of k, it is intractable for an adversary to distinguish a set of address values {AES_k(j1), AES_k(j2), ..., AES_k(jn)} from random bits.2 As a consequence, for any two values AES_{k1}(i1) and AES_{k2}(i2) where i1 ≠ i2, it is intractable for a third party to determine whether k1 = k2, even if i1 and i2 are known. Thus, these addresses are unlinkable without knowledge of k^addr_AB.

In practice, B computes addr^i_AB once at time Ti = T0 + i · I. B maintains a hash table containing the addresses for messages he might receive. At time Ti, he clears the table and inserts the key-value pair (addr^i_jB, j) for each identity j he has keys for, so that he can anticipate messages sent with these addresses and determine that he should use j's keys to process them. When A wants to send a message to B at time T, he also computes addr^i_AB. Section 3.4.1 discusses how we deal with clock skew.

Construction. Tryst(A, B, p) is computed as follows (Figure 3.2):
1. Generate a random key kp.

2. header ← {s, mac}, where:

   s = {addr^i_AB, AES_{k^Enc_AB}(kp)},
   mac = AES-CMAC_{k^MAC_AB}(s).

   header proves to B that A is the sender and B the receiver because only A and B have k^Enc_AB and k^MAC_AB. Moreover, it proves to B that it was constructed near the current time T because addr^i_AB is a cryptographic function of T. This provides authenticity.

   To third parties, mac appears to be random because it is computed over the encryption of random key kp, so neither it nor the encipherment of kp can link it to other messages. addr^i_AB is sent "in the clear" and will be used in all messages sent during time interval Ti, but addr^{i1}_AB and addr^{i2}_AB for any i1 ≠ i2 are unlinkable, thus providing long-term unlinkability.
2 We make the standard assumptions that AES is a good pseudorandom permutation and that n ≤ 2^64.
3. ctext ← {etext, emac}, where: