arXiv:1907.02142v1 [cs.CR] 3 Jul 2019

On Privacy Risks of Public WiFi Captive Portals

Suzan Ali, Tousif Osman, Mohammad Mannan, and Amr Youssef
Concordia University, Montreal, Canada
fa suzan,t osma,mmannan,[email protected]

Abstract. Open access WiFi hotspots are widely deployed in many public places, including restaurants, parks, coffee shops, shopping malls, trains, airports, hotels, and libraries. While these hotspots provide an attractive option to stay connected, they may also track user activities and share user/device information with third parties, through the use of trackers in their captive portal and landing websites. In this paper, we present a comprehensive privacy analysis of 67 unique public WiFi hotspots located in Montreal, Canada, and shed some light on the web tracking and data collection behaviors of these hotspots. Our study reveals the collection of a significant amount of privacy-sensitive personal data through the use of social login (e.g., Facebook and Google) and registration forms, and many instances of tracking activities, sometimes even before the user accepts the hotspot's privacy and terms of service policies. Most hotspots use persistent third-party tracking cookies within their captive portal site; these cookies can be used to follow the user's browsing behavior long after the user leaves the hotspots, e.g., up to 20 years. Additionally, several hotspots explicitly share (sometimes via HTTP) the collected personal and unique device information with many third-party tracking domains.

1 Introduction

Public WiFi hotspots are growing in popularity across the globe. Most users frequently connect to hotspots due to their free-of-cost service (as opposed to mobile data connections) and ubiquity.
According to a Symantec study [41] conducted among 15,532 users across 15 global markets, 46% of participants do not wait more than a few minutes before connecting to a WiFi network after arriving at an airport, restaurant, shopping mall, hotel or similar locations. Furthermore, 60% of the participants are unaware of any risks associated with using an untrusted network, and feel their personal information is safe. A hotspot may have a captive portal, which is usually used to communicate the hotspot's privacy and terms-of-service (TOS) policies, and collect personal identification information such as name and email for future communications, and authentication if needed (e.g., by asking the user to log in to their social media sites). Upon acceptance of the hotspot's policy, the user is connected to the internet and her web browser is often automatically directed to load a landing page (usually the service provider's webpage).

Several past studies (e.g., [7,40]) focus on privacy leakage from browsing the internet or using mobile apps in an open hotspot, due to the lack of encryption, e.g., no WPA/WPA2 support at the hotspot, and the use of HTTP, as opposed to HTTPS, for connections between the user device and the web service. However, in recent years, HTTPS adoption across web servers has increased dramatically, mitigating privacy exposure through plain network traffic. For example, according to the Google Transparency Report [17], as of Apr. 6, 2019, 82% of web pages are served via HTTPS for Chrome users on Windows. On the other hand, in recent years, there have also been several comprehensive studies on web tracking on regular web services and mobile apps with an emphasis on the most popular domains/services (see e.g., [14,4,3]).

In contrast to past hotspot and web privacy measurement studies, we analyze tracking behaviors and privacy leakage in WiFi captive portals and landing pages.
We design a data collection framework (CPInspector) for both Windows and Android, and capture raw traffic traces from several public hotspots (in Montreal, Canada) that require users to go through a captive portal before allowing internet access. Challenges here include: manual collection of captive portal data by physically visiting each hotspot; making our test environment separate from the regular user environment so that we do not affect the user's browsing profiles; ensuring that our tests remain unaffected by the user's past browsing behaviors (e.g., saved tracking cookies); and creating and monitoring several test accounts in popular social media or email services as some hotspots mandate such authentication. CPInspector does not include any real user information in the collected dataset, or leak such information to the hotspots (e.g., by using fake MAC addresses).

From each hotspot, we collect traffic using both Chrome and Firefox on Windows. In addition to the default browsing mode, we also use private browsing, and deploy two ad-blockers to check if such privacy-friendly environments help against captive portal trackers, leading to a total of eight datasets for each hotspot. We also use social logins (Facebook, LinkedIn, Google, Instagram, Twitter) if required by the captive portal, or provided as an option; we again use both browsers for social login tests (two to six additional datasets as we have observed at most three social login options per hotspot). Some hotspots also require the user to complete a registration form that collects the user's PII; in such cases, we collect two more datasets (from both browsers). Finally, some hotspots collect additional personal information as part of an optional survey. When reporting statistics on tracking domains and cookies, we accumulate the distinct trackers as observed in all the datasets collected for a given hotspot.
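
The per-hotspot dataset counts above can be reproduced with a short sketch. It assumes that each of the two ad-blockers is tested in its own run per browser, which is one reading consistent with the stated total of eight; the environment labels and the function names `baseline_runs` and `total_datasets` are illustrative and are not part of CPInspector.

```python
from itertools import product

# Browsers used on Windows, as stated in the text.
BROWSERS = ["Chrome", "Firefox"]

# Assumed breakdown of the four environments per browser. The ad-blocker
# labels are placeholders; this section does not name the ad-blockers.
ENVIRONMENTS = ["default", "private", "adblocker_1", "adblocker_2"]


def baseline_runs():
    """Return every (browser, environment) pair collected at each hotspot."""
    return list(product(BROWSERS, ENVIRONMENTS))


def total_datasets(social_logins=0, has_registration_form=False):
    """Count datasets for one hotspot under the assumptions above.

    Each social login option and the registration form (if present) are
    exercised once per browser, adding len(BROWSERS) datasets each.
    """
    total = len(baseline_runs())
    total += len(BROWSERS) * social_logins
    if has_registration_form:
        total += len(BROWSERS)
    return total
```

Under these assumptions, a hotspot with no login or form yields eight datasets, and one with three social login options adds six more, matching the "two to six additional datasets" range in the text.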
On Android, we collect traffic only from the custom captive portal app (as opposed to Chrome/Firefox on Windows) as the cookie store of this app is separate from browsers. Consequently, tracking cookies from the Android captive portal app cannot be used by websites loaded in a browser. Recent Android OSes also use dynamic MAC addresses, limiting MAC address based tracking. However, we found that cookies in the captive portal app may remain valid for up to 20 years, allowing effective tracking by hotspot providers.

We also design our framework to detect ad/content injection by hotspots; however, we observed no content modification attempts by the hotspots. Furthermore, we manually evaluate various privacy aspects of some hotspots, as documented in their privacy/terms-of-service policies, and then compare the stated policies against what happens in practice. Note: by default all our statistics refer to the measurements on Windows; we explicitly mention when results are for Android (mostly in Sec. 5).

Contributions and summary of findings.

1. We collected a total of 679 datasets from the captive portal and landing page of 80 hotspot locations between Sept. 2018 and Apr. 2019. 103 datasets were discarded due to some errors (e.g., network failure). We analyzed over 18.5GB of collected traffic for privacy exposure and tracking, and report the results from 67 unique hotspots (576 datasets), making this the largest such study to characterize hotspots in terms of their privacy risks.
2. Our hotspots include cafes and restaurants, shopping malls, retail businesses, banks, and transportation companies (bus, train and airport), some of which are local to Montreal, but many are national and international brands. 40 hotspots (59.7%) use third-party captive portals that appear to have many other business customers across Canada and elsewhere. Thus our results might be applicable to a larger geographical scope.
3. 27 hotspots (40.3%) use social login or a registration page to collect personal information (19 hotspots make this process mandatory for internet access). Social login providers may share several privacy-sensitive PII items; e.g., we found that LinkedIn shares the user's full name, email address, profile picture, full employment history, and the current location.
4. All but three hotspots employ varying levels of user tracking technologies on their captive portals and landing pages. On average, we found 7.4 third-party tracking domains per captive portal (max: 34 domains). 40 hotspots (59.7%) create persistent third-party tracking HTTP cookies (validity up to 20 years); 4.2 cookies on average on each captive portal (max: 34 cookies). Surprisingly, 26 hotspots (38.8%) create persistent cookies even before getting user consent on their privacy/TOS document.
5. Several hotspots explicitly share (sometimes even without HTTPS) personal and unique device information with many third-party domains. 40 hotspots (59.7%) expose the user's device MAC address; five hotspots leak PII via HTTP, including the user's full name, email address, phone number, address, postal code, date of birth, and age (despite some of them claiming to use TLS for communicating such information). Two hotspots appear to perform cross-device tracking via Adobe Marketing Cloud Co-op [2].
6. Two hotspots (3.0%) state in their privacy policies that they explicitly link the user's MAC address to the collected PII, allowing long-term user tracking, especially for desktop OSes with fixed MAC addresses.
7. From our Android experiments, we reveal that 9 out of 22 hotspots can effectively track Android devices even though Android uses a separate captive portal app and randomizes the MAC address visible to the hotspot.
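
To make the notion of a persistent third-party cookie concrete, the following is a minimal sketch of how such cookies could be flagged in a captured trace. It is not the authors' implementation: the `Cookie` record, the function names, and the `.example` hosts used with it are hypothetical, and the domain comparison is a naive suffix match, whereas a real analysis would more likely compare registrable domains using the Public Suffix List.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Cookie:
    """Hypothetical record for one cookie observed in a traffic trace."""
    name: str
    domain: str                   # domain the cookie is scoped to
    expires: Optional[datetime]   # None means a session cookie


def is_third_party(cookie_domain: str, portal_host: str) -> bool:
    """Naive check: neither host is a suffix of the other (assumption)."""
    cd = cookie_domain.lstrip(".").lower()
    ph = portal_host.lower()
    same_site = (
        ph == cd or ph.endswith("." + cd) or cd.endswith("." + ph)
    )
    return not same_site


def lifetime_years(cookie: Cookie, observed_at: datetime) -> float:
    """Remaining validity in years; 0.0 for session cookies."""
    if cookie.expires is None:
        return 0.0
    return (cookie.expires - observed_at).days / 365.25


def persistent_third_party(
    cookies: List[Cookie], portal_host: str, observed_at: datetime
) -> List[Cookie]:
    """Keep cookies that outlive the session and belong to another site."""
    return [
        c for c in cookies
        if c.expires is not None
        and c.expires > observed_at
        and is_third_party(c.domain, portal_host)
    ]
```

Given a cookie scoped to another site with an expiry date 20 years after the observation time, `persistent_third_party` would return it and `lifetime_years` would report roughly 20, which is the kind of validity period reported in item 4.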
2 Background and Related Work

In this section, we first provide an overview of public hotspots, and then briefly review related previous studies on hotspots, web tracking, and ad injection.

Hotspot access is usually deployed in three forms: captive portal, direct/open-access (no captive portals), or password-protected networks. In captive portal networks, users first go through a captive portal session before getting internet access. The captive portal webpage usually displays the privacy policy and/or the terms-of-service (TOS) document, along with some advertisements, and sometimes an option to select their preferred language (for viewing the portal content), and a social login or registration form.
