Privacy vs. Convenience

Justin P. Mauss Department of Computer Science University of Wisconsin – Platteville Platteville, WI 53818 [email protected]

Abstract

The shift toward doing everything online, from social networking to banking, has created a lucrative business in aggregating user information. Online privacy issues are no longer focused on user-to-user exposure. The main privacy issue is the centralization of user information, which is aggregated, sold, and shared across thousands of businesses with few laws or regulations because the government is playing catch-up with rapid advancements in technology. However, it is also very convenient to bank online, have ads targeted to our interests, and instantly share updates with friends and family. Each user needs to decide on a comfortable balance between privacy and convenience. Before that balance is decided, it is important for each user to be informed about how and why companies collect, sell, and share users' data. Tools can be used in real time to view which companies are tracking users, giving users the option to block online tracking.

Introduction

Privacy is an ever-growing concern as advancements in technology and the web are outpacing most laws on what data analytics companies can collect, use, sell, and share. As of 2013, what information is disclosed on the web to the public and to businesses is mainly left to the users' discretion. This paper breaks down the current state of user information sharing on the web. The collection of data is not all bad. It helps make our lives more convenient through online banking, connecting with friends and family, shopping at home, obtaining deals and free services, and many other everyday aspects of life. It ultimately comes down to privacy vs. convenience. After seeing how information is aggregated and used by companies, it is up to the user to decide what balance they are comfortable with between privacy and convenience.

There is a great analogy that everyone can understand when discussing the balance between privacy and convenience. Think of a house: every home needs privacy. One can build a tall fence all around the house to make it more private, but then the view is restricted. One would also have a lot of privacy if there were no windows built into the house, but then it becomes an inconvenience to have no natural light and to go outside each time one wants to see something outside. Also, one could install only one door so that everyone has to come in and out through that one door. However, it is much more convenient to have multiple doors for easy access to the garage and backyard. So privacy and convenience become mutually exclusive concepts. If one increases, the other decreases. The balance is to be decided on an individual basis.

Privacy Study

A group of students from the Nelson Mandela Metropolitan University in Port Elizabeth, South Africa conducted a study called "Claimed vs. Observed Information Disclosure on Social Networking Sites." [3] 131 participants were observed at a Facebook component level. Participants were sent a questionnaire that asked them to indicate the different Facebook components that they had disclosed on their profiles. Data collection took place during September 2011. The researchers used three main categories for data: personal identifiable information (gender, hometown, birthday, photos), sensitive personal information (employer info, secondary school, relationship status), and potentially stigmatizing information (religious status, political views, and favorite movies, books, and music). [3]

The participants were complete strangers to the researchers, so all participants’ Facebook profiles were only viewed from the perspective of a “Public” view instead of a “Friend” view. When the researchers received the data from the questionnaires, they compared the percentage of people who claimed disclosure of certain data to the actual disclosure observed when searching for the participant on Facebook. The results from the questionnaires versus the observed disclosure are in Table 1:

Table 1: Personal Identifiable Information [3]

It turns out that many participants claimed public disclosure when posting information on Facebook, but the actual observed disclosure was much lower. This table is just one category of data from the study, but the other two categories follow the same pattern. The "Difference Between Behaviors" percentage is almost always positive and very high. This shows that the participants are actually more protected from sharing data with the public than they thought. Contrary to the popular perception of social networking, user-to-user privacy is higher than the user might think. The next section, Social Networking Sites, elaborates on the real privacy issue. [3]

Social Networking Sites

Social networking sites are some of the best examples of services for analyzing online privacy. Facebook will be the main focus as it leads the social networking world with 1.06 billion active users every month. [8] As mentioned earlier, there are few strict laws governing terms of service or privacy policies, leaving Facebook in complete control when establishing privacy policies. Facebook is a free service, but it comes with a different type of cost. When a user is on Facebook, they are the product and their information is being sold. Facebook can aggregate and mine all user data to, for example, create targeted ads for other companies' products and services. Users click on targeted ads more often than random ads, so Facebook can charge more for its ad space across the site because it can present ads that match the user's exact interests. The more "whole" a user's information or profile is, the more valuable it is. Facebook users have some controls over privacy, but these only govern privacy with respect to other users. For the most part, they are not privacy controls over what Facebook can do with user data. [2]

Online Tracking

Tracking Methods

Behind-the-scenes tracking is done using a number of methods. Some of the main methods include cookies, browsing history, beacons, and IP addresses.

Cookies

There are many different types of cookies. Some cookies are needed for a website's functionality. However, a number of cookies are used solely for tracking purposes, such as HTTP cookies and 3rd party cookies. HTTP cookies are placed on the computer by the visited site. The cookies gather all kinds of data, from the user's system information to clicks on the company's website. Persistent HTTP cookies stay on a computer until they expire, which can be years away, unless they are manually removed. 3rd party cookies are mainly placed by advertising companies gathering data. The advertising companies work with many other companies by placing their cookies on those companies' sites. That allows the advertising companies to track users across many different sites, creating a more complete profile of each web user. [4]
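The cross-site linking described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `AdNetwork` class, the `ad_id` cookie name, and the example site names are all hypothetical and do not describe any real advertiser's implementation.

```python
import uuid
from collections import defaultdict


class AdNetwork:
    """Hypothetical 3rd party advertiser embedded on many sites."""

    def __init__(self):
        # cookie ID -> list of sites where that browser was seen
        self.profiles = defaultdict(list)

    def serve_ad(self, browser_cookies, visited_site):
        # Reuse the ID already stored in the browser, or issue one on first contact.
        cookie_id = browser_cookies.get("ad_id")
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex
            browser_cookies["ad_id"] = cookie_id
        self.profiles[cookie_id].append(visited_site)
        return cookie_id


network = AdNetwork()
browser_cookies = {}  # cookies this browser holds for the ad network's domain

# The same browser visits three unrelated sites that all embed the ad network.
for site in ["news.example", "shop.example", "travel.example"]:
    network.serve_ad(browser_cookies, site)

profile = network.profiles[browser_cookies["ad_id"]]
print(profile)
```

Because the browser sends the same cookie back to the advertiser from every site that embeds it, the three visits end up under a single ID even though the sites themselves never exchange data.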

Browsing History

The default setting for most browsers allows a large amount of browsing history to be stored. Companies are able to extract browsing history using CSS, JavaScript, and other methods. The browsing history is valuable because it reveals the user's most visited sites. The user's site preferences are used to determine the type of consumer demographic they fit into. Although most up-to-date browsers block CSS and JavaScript history extraction, new ways to extract the history are always being created. [4]

Beacons

Another method for tracking web users is a beacon. Beacons are embedded in a company's website, and they track page views, the current time, the IP address, and browser and system specifications. All this information is used to keep building the user's profile so it becomes even more complete, which increases the value of the user's data. [4]
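As a rough illustration of the kind of record a beacon can produce, the sketch below builds a beacon URL and then reads it back the way a receiving server might. The URL `tracker.example/pixel.gif`, the parameter names, and both helper functions are assumptions made for this example only.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_beacon_url(base, page, timestamp, screen):
    """Build the URL of an invisible image that a page would load (hypothetical)."""
    query = urlencode({"page": page, "ts": timestamp, "screen": screen})
    return f"{base}?{query}"


def read_beacon(url, client_ip, user_agent):
    """Combine the URL parameters with what the connection itself reveals."""
    params = parse_qs(urlsplit(url).query)
    return {
        "page": params["page"][0],
        "time": params["ts"][0],
        "screen": params["screen"][0],
        "ip": client_ip,          # visible to any server the browser contacts
        "browser": user_agent,    # sent in the User-Agent request header
    }


url = build_beacon_url(
    "https://tracker.example/pixel.gif",
    "/products/42",
    "2013-02-16T10:30:00",
    "1280x800",
)
record = read_beacon(url, "203.0.113.7", "ExampleBrowser/1.0")
print(record)
```

The page view, time, and system details travel in the request itself, so the user does not have to click anything for the record to be created.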

IP Address

Companies can also obtain a user's IP address to determine the user's location down to the zip code. The location allows ads and services on websites to suggest local products, services, or events. The average internet user does not change their IP address. When an IP address does not change, companies can track all connections from that same IP address, so the user can be easily identified on future visits.

KISSmetrics Tracking

In 2011 researchers at U.C. Berkeley discovered a tracking service that could not be evaded. The researchers analyzed the tracking service as they turned off Flash storage, blocked cookies, and surfed in "Privacy" mode. The tracking service was still able to track users even when the common ways built into a browser to turn off tracking were in use. The tracking service mainly tracked the number of visitors to a site or page, what the visitors did on the site, and what other sites they were visiting. The tracking service could also track across multiple open browsers. [11]

This specific tracking service was created and used by KISSmetrics. KISSmetrics is a popular website analytics company that allows businesses to gather a wealth of information on their visitors and customers. The tracking service found by the researchers was on many of KISSmetrics' clients' websites, such as Hulu, Spotify, AOL, Groupon, and many others of the web's most popular sites. The tracking service would attach a unique ID to a visitor's computer. If the visitor navigated to other sites using KISSmetrics, KISSmetrics could aggregate the visitor's data from all those specific sites using the unique ID as the primary key. This is important because each website's data on the visitor is different. For instance, Hulu contains their movie and TV show interests, Spotify contains their music interests, Groupon contains their interests in vacation locations, restaurants, and sporting events, and so on across all the sites that are clients of KISSmetrics. This is important for KISSmetrics because a larger, more complete profile of each individual increases the value of its web analytics services, allowing it to charge more because its information on individuals can be more complete than that of other web analytics companies. [11]
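The idea of using the unique ID as a primary key can be sketched as a simple merge of per-site records. The record layout, field names, site labels, and values below are invented for illustration; nothing here reflects the actual data model of any analytics company.

```python
from collections import defaultdict

# Hypothetical per-site records, each tagged with the visitor's unique ID.
site_records = [
    {"uid": "abc123", "site": "video-site", "interests": ["sci-fi shows"]},
    {"uid": "abc123", "site": "music-site", "interests": ["jazz"]},
    {"uid": "abc123", "site": "deals-site", "interests": ["restaurants", "sporting events"]},
    {"uid": "xyz789", "site": "music-site", "interests": ["rock"]},
]


def aggregate_by_uid(records):
    """Merge records from different sites into one profile per unique ID."""
    profiles = defaultdict(lambda: {"sites": [], "interests": []})
    for rec in records:
        profile = profiles[rec["uid"]]
        profile["sites"].append(rec["site"])
        profile["interests"].extend(rec["interests"])
    return dict(profiles)


profiles = aggregate_by_uid(site_records)
print(profiles["abc123"])
```

Each site alone knows only one slice of the visitor's interests; the shared ID is what turns those slices into a single, more valuable profile.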

The tracking service also used almost all the tracking methods mentioned earlier and more. This way, if an anti-tracking service or setting was on, it might stop only a few aspects of the tracking service, which could then continue tracking as normal using the unblocked portion. The main issue is that this is unknown to users and never explained. There were some lawsuits, and in 2012 KISSmetrics agreed to settle a class-action lawsuit by promising to avoid controversial tracking methods and to notify users about the tracking methods so they have a choice in the matter. [10]
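A minimal sketch of why redundant storage defeats partial blocking is given below. The store names (`http_cookie`, `flash_storage`, `etag_cache`) and the `recover_id` helper are assumptions chosen for illustration and are not taken from the actual service; the point is only that one surviving copy of the ID is enough to restore the others.

```python
def recover_id(stores):
    """Return the first surviving ID and copy it back into every store."""
    surviving = next((value for value in stores.values() if value is not None), None)
    if surviving is not None:
        for name in stores:
            stores[name] = surviving
    return surviving


# The same ID is written to several hypothetical storage locations.
stores = {
    "http_cookie": "abc123",
    "flash_storage": "abc123",
    "etag_cache": "abc123",
}

# The user clears cookies and Flash storage, but one copy is left behind.
stores["http_cookie"] = None
stores["flash_storage"] = None

recovered = recover_id(stores)
print(recovered, stores)
```

Under this scheme, tracking stops only if every storage location is cleared at the same time, which is why blocking a single method gives limited protection.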

Social Networking Plugins

Many other companies employ the same methods that KISSmetrics uses to track consumers; the difference is that users must first click through privacy agreement terms before being able to use the site. However, clicking "agree" does not say much, since in almost all cases the privacy agreement terms are many pages of small print and obscure technical or legal language. One of the best examples of companies using legal methods to track user activity across multiple sites is social plugins, such as the Facebook, Google+, and Twitter plugins, which are integrated with sites all across the web.

Social plugins are used on other sites so users can click and share links, videos, pictures, and articles on their social media platform with friends, family, and co-workers. It is convenient for a user to be able to share anything with anyone instantly; however, there is a privacy issue that is not well known by the average user. When a user is logged into their social media account, Facebook for example, Facebook knows about any browsing activity the user does on websites that have the integrated social plugins. The user does not need to actually click the plugin for Facebook to know the user is visiting that page. Most users leave their Facebook accounts signed in while doing other browsing, and social plugins are integrated with many of the top sites. Official Facebook plugins are integrated in the homepage of 24.3% of the current top 10,000 websites globally. Even more so, 49.3% of the current top 10,000 websites have some form of Facebook link on their homepage. Twitter is at 10% and 41.7% respectively, and Google+ is at 13.3% and 21.5%. Also, as of 2/16/2013, Google is the most visited site in the world, Facebook is the second, and Twitter is the tenth. The number of users with accounts on these social media sites, along with the high percentage of their integration across the web, allows for a very lucrative business in data aggregation. All three of these services are free to the users, but the users' information is the product being sold to other businesses. [1] [2]
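The reason no click is needed can be sketched by looking at what a browser typically sends when it merely loads an embedded plugin. The function names, the `session` cookie, and the account data below are hypothetical, and whether a Referer header is actually sent depends on browser and site settings; this is an illustration, not a description of any network's real logging.

```python
def plugin_request_headers(page_url, browser_cookies):
    """Headers a browser typically attaches when it fetches an embedded plugin."""
    headers = {"Referer": page_url}  # the page that embeds the plugin
    if "session" in browser_cookies:
        headers["Cookie"] = "session=" + browser_cookies["session"]
    return headers


def record_visit(headers, logged_in_users, visit_log):
    """What a hypothetical social network could log from those headers."""
    session_id = headers.get("Cookie", "").partition("=")[2]
    user = logged_in_users.get(session_id)
    if user is not None:
        visit_log.setdefault(user, []).append(headers["Referer"])


logged_in_users = {"s3ss10n": "alice"}    # session ID -> account (made up)
browser_cookies = {"session": "s3ss10n"}  # the user stayed signed in
visit_log = {}

# The user only views these pages; the plugin button is never clicked.
for page in ["https://news.example/article", "https://shop.example/shoes"]:
    headers = plugin_request_headers(page, browser_cookies)
    record_visit(headers, logged_in_users, visit_log)

print(visit_log)
```

If the user had signed out first, the request would carry no session cookie and, in this sketch, the visit could not be tied to the account.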

eBay Ad Tracking

Some users may find it convenient that ads and suggestions are catered to them based on their information, instead of just seeing any random ad. eBay's approach is to be very straightforward with its tailored ads. When a user is on eBay, the tailored ads have a small icon near the bottom of the ad labeled "Ad Choice." A user can click the icon to get an explanation of why eBay uses the user's information to tailor ads, along with a link to the Ad Choice and privacy policy. If users turn off Ad Choice, eBay will no longer use their information to create tailored ads and will instead show a random ad where the tailored ones used to be. eBay's approach is a step in the right direction because it gives users an opt-out option for that service, but the information is still being collected, so ultimately eBay is still in control. [9]

Monitor and Block Tracking

Collusion Tool

The previous tracking examples may be overwhelming to some users, but there are easy-to-use tools that let users monitor and stop tracking. Before diving into the anti-tracking tools users can use, there is an important tool for monitoring websites and trackers. The tool is an experimental add-on called Collusion. The Collusion add-on is available for free in both the Firefox and Chrome browsers. Collusion provides a real-time visual of the connections between 3rd party tracking companies and businesses. Figure 1 below provides a sample of what Collusion shows the user. The blue outlined circles are sites the user actually visited, while the red circles are various 3rd party companies such as KISSmetrics. A circle is connected to another if information was exchanged between the two. The user can also hover the cursor over each circle to view more information.

Figure 1: Collusion Sample

I used Collusion to provide a visual comparison between an average day of web browsing without anti-tracking tools and an average day of web browsing with them. I visited ten sites that I use daily or weekly: uwplatt.edu, eBay, Discover, PayPal, Wired, Google, Wall Street Journal, Apple, Twitter, and Facebook. As mentioned before, when a user is signed into their account, it can assist the companies in connecting and aggregating the user's data. The two test cases are viewing Collusion before anti-tracking tools are used while signed into all website accounts, and after anti-tracking tools are used while signed into all website accounts. Figure 2 is the screenshot of the first case. As mentioned before, the blue glowing circles are the sites that were actually visited. The other circles are various tracking services and companies exchanging data with the visited websites. There are over 50 large tracking services and companies communicating with the 10 sites actually visited, and as a reminder, an account with a lot of information is signed in at each of those visited sites. The next case will be viewed after discussing the anti-tracking tools.

Figure 2: Screenshot before Anti-Tracking Tools and Signed into all Accounts

Anti-Tracking Tools

Next, the anti-tracking tools need to be discussed before viewing the difference through Collusion when tracking is prevented. Three main tools that work well together are Ghostery, Adblock Plus, and Priv3. All three tools are browser add-ons. Ghostery detects cookies, web bugs, beacons, and other tracking methods. Ghostery tracks over 1,200 trackers and allows the user to quickly set up filters to decide what tracking methods to block or allow.

When visiting a site, all the current trackers are populated in a window for a user to view if they choose to. The user can opt in and out of different trackers detected, and they can also view information of each tracker including purpose of tracking, what information is being collected, data retention, contact information, data sharing, privacy policy, and number of websites that specific tracker is found on. For my purposes, I had Ghostery block all tracking activity unless a website needed some solely for functional purposes.
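The block-or-allow decision described above can be sketched as a lookup against a list of known tracker domains, with exceptions for services a site needs in order to function. This is a simplified assumption about how such a filter could work, not Ghostery's actual implementation; the domain names and the `should_block` helper are invented for the example.

```python
from urllib.parse import urlsplit

# Hypothetical filter lists; a real tool ships a much larger, curated list.
BLOCKED_DOMAINS = {"tracker.example", "ads.example"}
ALLOWED_FOR_FUNCTION = {"login.tracker.example"}  # needed for the site to work


def should_block(request_url):
    """Block a request if its host is a listed tracker or one of its subdomains."""
    host = urlsplit(request_url).hostname or ""
    if host in ALLOWED_FOR_FUNCTION:
        return False
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)


requests_seen = [
    "https://shop.example/cart",
    "https://cdn.ads.example/banner.js",
    "https://tracker.example/pixel.gif",
    "https://login.tracker.example/widget.js",
]
decisions = {url: should_block(url) for url in requests_seen}
print(decisions)
```

The allow list reflects the choice made here of blocking all tracking activity except what a website needs solely for functional purposes.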

The next tool installed is Adblock Plus. Adblock Plus is an open source add-on that blocks pop-ups, video ads, and banners. Most tracking activity is for advertising purposes, so Adblock Plus takes away the possibility of accidentally clicking on an ad or being overwhelmed with ads on a website. The third tool is Priv3, which helps address tracking through social media plugins as mentioned before. It is a browser add-on created by researchers from the International Computer Science Institute in Berkeley. Priv3's main purpose is to block tracking by Facebook, Twitter, Google+, and LinkedIn. As of now, the team is focused on just those four major social media sites as they continuously tweak the Priv3 add-on to keep up with social networking tracking methods. Finally, below are the results in Collusion after the three anti-tracking tools are installed in a browser.

Collusion Results

Figure 3 shows the second case of Collusion after the anti-tracking tools were installed. The number of tracking services and companies exchanging information with the actual sites visited is down 40. Some of the tracking services still left are specifically for functional purposes only. Also, only 8 out of the 10 sites are even doing anything with information. This is because the other two sites did not have any functional purposes tied into their tracking services.

Figure 3: Screenshot after Anti-Tracking Tools and Signed into all Accounts

Conclusion

As discussed before, some tracking services are required because they are tied in with the website’s functionality. More companies are adopting methods that crash their own site if some specific services are stopped, so that users have to opt in to the service to use the website, which opens the door for tracking again. The realistic issue is that almost all companies are tracking and aggregating users' activity. As of June 30, 2012 there were around 2.4 billion internet users worldwide. Also, as of December 2012, only about 28 million internet users were actively using the top anti-tracking tools discussed. That means only about 1.2% of worldwide internet users block major trackers. [6] [7]
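The 1.2% figure follows directly from the two counts given above, as the short calculation below shows. The variable names are only labels for this example.

```python
internet_users = 2.4e9   # worldwide internet users as of June 30, 2012
blocking_users = 28e6    # users of the top anti-tracking tools, December 2012

share = blocking_users / internet_users * 100
print(f"{share:.1f}%")
```

Rounded to one decimal place, the share of internet users blocking major trackers comes out to about 1.2%.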

As mentioned a few times, not all tracking is bad, and it provides daily conveniences that most internet users are accustomed to. The issue is awareness and education about online privacy, so that each user can decide their own balance between privacy and convenience online.

References

[1] Vieira, M., Antunes, N., & Madeira, H. (2009). Using web security scanners to detect vulnerabilities in web services. Informally published manuscript, Department of Informatics Engineering, University of Coimbra, Portugal. Available from IEEE. (978-1-4244-4421-2).

[2] Bonneau, J., & Anderson, J. (2009). Prying data out of a social network. Informally published manuscript, Computer Laboratory, University of Cambridge. Available from IEEE. (978-0-7695-3689-7).

[3] Ntlatywa, P., Botha, R., & Haskins, B. (2012). Claimed vs observed information disclosure on social networking sites. Informally published manuscript, Institute for ICT Advancement and School of ICT, Nelson Mandela Metropolitan University, Port Elizabeth, South Africa. Available from IEEE. (978-1-4673-2159-4).

[4] Ayenson, Mika, Wambach, Dietrich James, Soltani, Ashkan, Good, Nathan and Hoofnagle, Chris Jay. (2011) Flash Cookies and Privacy II: Now with HTML5 and ETag Respawning. Available at SSRN: http://ssrn.com/abstract=1898390 or http://dx.doi.org/10.2139/ssrn.1898390.

[5] V K, N., Aliyar, L., & Ali, A. (2010). An overview of cryptographic solutions to web security. Informally published manuscript, Anna University of Technology, Coimbatore, India. Available from IEEE. (978-1-4244-5967-4).

[6] Miniwatts Marketing Group. (2012, June 30). Top 20 countries with highest number of internet users. Retrieved from http://www.internetworldstats.com/top20.htm.

[7] Alexa. (2013, Feb 17). Top 500 sites on the web. Retrieved from http://www.alexa.com/topsites.

[8] Facebook.com. (2013, January 30). Retrieved from http://investor.fb.com/releasedetail.cfm?releaseID=736911

[9] Using adchoice. (2013, March 19). Retrieved from http://pages.ebay.com/help/account/adchoice.html

[10] Kissmetrics agrees settlement in etags wiretap class action lawsuit. (2012, Oct 23). Retrieved from http://www.lawyersandsettlements.com/settlements/16962/kissmetrics-agrees-settlement-in-etags-wiretap-class.html

[11] Singel, R. (2011, July 29). Researchers expose cunning online tracking service that can’t be dodged. Retrieved from http://www.wired.com/business/2011/07/undeletable-cookie/