Cybercasing the Joint: on the Privacy Implications of Geo-Tagging
Total Page:16
File Type:pdf, Size:1020Kb
Cybercasing the Joint: On the Privacy Implications of Geo-Tagging Gerald Friedland 1 Robin Sommer 1,2 1 International Computer Science Institute 2 Lawrence Berkeley National Laboratory Abstract Sites like pleaserobme.com (see § 2) have started cam- paigns to raise the awareness of privacy issues caused This article aims to raise awareness of a rapidly emerging privacy threat that we term cybercasing: using geo-tagged in- by intentionally publishing location data. Often, how- formation available online to mount real-world attacks. While ever, users do not even realize that their files contain lo- users typically realize that sharing locations has some implica- cation information. For example, Apple’s iPhone 3G em- tions for their privacy, we provide evidence that many (i) are beds high-precision geo-coordinates with all photos and unaware of the full scope of the threat they face when doing videos taken with the internal camera unless explicitly so, and (ii) often do not even realize when they publish such switched off in the global settings. Their accuracy even information. The threat is elevated by recent developments exceeds that of GPS as the device determines its position that make systematic search for specific geo-located data and in combination with cell-tower triangulation. It thus reg- inference from multiple sources easier than ever before. In ularly reaches resolutions of +/- 1m in good conditions this paper, we summarize the state of geo-tagging; estimate the and even indoors it often has postal-address accuracy. amount of geo-information available on several major sites, in- cluding YouTube, Twitter, and Craigslist; and examine its pro- More crucial, however, is to realize that publishing grammatic accessibility through public APIs. We then present geo-location (knowingly or not) is only one part of the a set of scenarios demonstrating how easy it is to correlate geo- problem. The threat is elevated to a new level by the tagged data with corresponding publicly-available information combination of three related recent developments: (i) the for compromising a victim’s privacy. We were, e.g., able to find sheer amount of images and videos online that make even private addresses of celebrities as well as the origins of other- a small relative percentage of location data sufficient for wise anonymized Craigslist postings. We argue that the secu- mounting systematic privacy attacks; (ii) the availability rity and privacy community needs to shape the further devel- of large-scale easy-to-use location-based search capabil- opment of geo-location technology for better protecting users ities, enabling everyone to sift through large volumes of from such consequences. geo-tagged data without much effort; and (iii) the avail- ability of so many other location-based services and an- 1 Introduction notated maps, including Google’s Street View, allow- ing correlation of findings across diverse independent Location-based services are rapidly gaining traction in sources. the online world. With big players such as Google and In this article, we present several scenarios demon- Yahoo! already heavily invested in the space, it is not strating the surprising power of combining publicly surprising that GPS and WIFI triangulation are becom- available geo-informationresources for what we term cy- ing standard functionality for mobile devices: starting bercasing: using online tools to check out details, make with Apple’s iPhone, all the major smartphone makers inferences from related data, and speculate about a lo- are now offering models allowing instantaneously upload cation in the real world for questionable purposes. The of geo-tagged photos, videos, and even text messages to primary objective of this paper is to raise our commu- sites such as Flickr, YouTube, and Twitter. Likewise, nu- nity’s awareness as to the scope of the problem at a time merous start-ups are basing their business models on the when we still have an opportunity to shape further de- expectation that users will install applications on their velopment. While geo-tagging clearly has the potential mobile devices continuously reporting their current loca- for enabling a new generation of highly useful personal- tion to company servers. ized services, we deem it crucial to discuss an appropri- Clearly, many users realize that sharing location infor- ate trade-off between the benefits that location-awareness mation has implications for their privacy, and thus de- offers and the protection of everybody’s privacy. vice makers and online services typically offer different We structure our discussion as follows. We begin levels of protection for controlling whether, and some- with briefly reviewing geo-location technology and re- times with whom, one wants to share this knowledge. lated work in § 2. In § 3 we examine the degree to which 1 we already find geo-tagged data “in the wild”. In § 4 we tematic location-based search: the authors leveraged demonstrate the privacy implications of combining geo- Foursquare’s “check-ins” to identify users who are cur- tags with other services using a set of example cybercas- rently not at their homes. However, in this case loca- ing scenarios. We discuss preliminary thoughts on miti- tions were deliberately provided by the potential victims, gating privacy implications in § 5 and conclude in § 6. rather than being implicitly attached to files they upload. Also, geo-tagging is a hazard to a much wider audience, as even people who consciously opt-out of reporting any 2 Geo-Tagging Today information might still see their location become public through a third party publishing photos or videos, e.g., We begin our discussion with an overview of geo- from a private party. Besmer and Lipford [1] examine a tagging’s technological background, along with related related risk in the context of social networks that allow work in locational privacy. users to add identifying tags to photos published on their Geo-location Services. An extensive and rapidly profile pages. The authors conduct a user study that re- growing set of online services is collecting, providing, veals concerns about being involuntarily tagged on pho- and analyzing geo-information. Besides the major play- tos outside of a user’s own control. They also present a ers, there are many smaller start-ups in the space as well. tool for negotiating photo sharing permissions. Foursquare, for example, encourages its users to con- stantly “check-in” their current position, which they then To mitigate the privacy implications, researchers propagate on to friends; Yowza!! provides an iPhone ap- have started to apply privacy-preserving approaches plication that automatically locates discount coupons for from the cryptographic community to geo-information. stores in the user’s current geographical area; and Sim- Zhong et al. [4] present protocols for securely learning pleGeo aims at being a one-stop aggregator for location a friend’s position if and only if that person is actually data, making it particularly easy for others to find and nearby, and without any service provider needing to be combine information from different sources. aware of the users’ locations. Olumofin et al. [7] discuss In a parallel development, a growing number of sites a technique that allows a user to retrieve point-of-interest now provide public APIs for structured access to their information from a database server without needing to content, and many of these already come with geo- disclose the exact location. Poolsappasit et al. [8] present location functionality. Flickr, YouTube, and Twitter all a system for location-based services that allows specifi- allow queries for results originating at a certain location. cation and enforcement of specific privacy preferences. Many third-parties provide services on top of these APIs. From a different perspective, Krumm [5] offers a survey PicFog, for example, provides real-time location-based of ways that computation can be used to both protect and search of images posted on Twitter. Such APIs have also compromise geometrical location data. already been used for research purposes, in particular for GPS and WIFI Triangulation. Currently, it is mostly automated content analysis such as work by Crandall et high-end cameras that either come with GPS functional- al. [2], who crawled Flickr for automatically identifying ity or allow separate GPS receivers to be inserted into landmarks. their Flash connector or the so-called “hot shoe”. Like- Locational Privacy. Location-based services take wise, it is mostly the high-end smartphones today that different approaches to privacy. While it is common to have GPS built in, including the iPhone, Android-based provide users with a choice of privacy settings, the set of devices, and the newer Nokia N-series. An alternative options as well as their defaults tend to differ. YouTube, (or additional) method for determining the current lo- for example, uses geo-information from uploaded videos cation is WIFI access point or cell-tower triangulation: per default, while Flickr requires explicit opt-in. Like- correlating signal strengths with known locations allows wise, defaults differ across devices: Apple’s iPhone geo- a user or service to compute a device’s coordinates with tags all photos/videos taken with the internal camera un- high precision, as we demonstrate in § 4. If a device less specifically disabled; with Android-based phones, does not directly geo-tag media itself, such information the user needs to turn that functionality on. can also be added in post-processing, either by correlat- The privacy implications of recording locations have ing recorded timestamps with a corresponding log from seen attention particularly in the blogosphere. However, a hand-held GPS receiver; or manually using a map or these discussions are mostly anecdotal and rarely con- mapping software. sider how the ease of searching and correlating infor- Metadata. For our discussion, we are primarily con- mation elevates the risk.