Trends in Data Exfiltration and Privacy Policies

Alexander Huang, Alex Kimn, Jesse Widner, Diane Zhou
May 13, 2020

Contents
1 Introduction
2 Background
  2.1 Companies
    2.1.1 Cisco Webex
    2.1.2 Discord
    2.1.3 Google Hangouts
    2.1.4 Skype
    2.1.5 Zoom
3 Methodology
  3.1 Metrics
    3.1.1 Vagueness/Specificity
    3.1.2 Responsibility
    3.1.3 Readability
4 Results: General Trends
  4.1 Personal Data Collection
  4.2 Third Party Data Sharing
  4.3 Personal Data Retention
  4.4 Audio/Video Recording
  4.5 Overall Readability
5 Results: Case Studies
  5.1 Zoom 2019 vs. Zoom 2020
  5.2 Jitsi Meet
6 Conclusion

1 Introduction

Recently, it has come to light that a tablet driver released by the graphical tablet manufacturer Wacom tracks the name of every application window opened by the user¹. Within the privacy agreement the user sees during the driver installation process, it is mentioned that Wacom sends "aggregate usage data, technical session information, and information about [the user's] hardware device" to Google Analytics. However, the agreement fails to state exactly what is being sent and why, using only vague and uninformative language to mask a blatant example of user data exfiltration.

Unfortunately, this form of data exfiltration is now a common occurrence. Under the (often correct) assumption that users will not thoroughly read privacy agreements, well-known and trusted companies like Wacom can obtain sensitive user information without the explicit knowledge or consent of the user. These data can then be sold to data companies or other third parties for a profit, implying that unbeknownst to the user, their information could be made available to a wide range of unknown actors.
Moreover, any security vulnerabilities present during the transfer of this exfiltrated data give potentially malicious actors an opportunity to obtain private user information. In this way, seemingly trustworthy actors can breach a user's privacy and data security by exploiting their naivete.

Therefore, it is critically important for software users to both (i) know whether a downloaded piece of software could expose them to data exfiltration and (ii) understand what the consequences of such data exfiltration are. Thus, in this work, we analyze the privacy policies of various videoconferencing services to determine potential data privacy vulnerabilities and avenues for data exfiltration. We then evaluate the privacy policies according to three metrics (specificity, responsibility, and readability) to determine how well each service accounts for these data privacy issues. Finally, we provide several recommendations for users wishing to protect their data privacy.

¹ https://www.theverge.com/2020/2/6/21126245/wacom-tablet-app-tracking-google-analytics

2 Background

The historic lack of standardization in software privacy policies has conditioned most users to automatically click "yes" and agree to everything in order to access the software. A study performed in 2017 showed that 74% (N=543) of individuals chose not to read the privacy policy when presented with the option, yet 97% agreed to it. Even when presented with a mandatory terms of service (TOS) policy, which should take roughly 15 minutes to read at a typical adult reading rate of 250-280 words per minute, users spent an average of 51 seconds reading it. Those who declined the TOS spent 90 seconds longer reading it [1]. This study highlights an important point: most users do not spend enough time reading privacy policies and terms of service, and given the obfuscated nature of data exfiltration, users are largely unaware that their data is being taken from them and do not know how that data will be used.
Furthermore, privacy policies contain technical jargon that is difficult for the average user to grasp, which lowers their readability. Even if users are proactive in reading the policies, there is a good chance they will not understand them. The readability of privacy policies is important because users tend to trust policies more once they have understood them [2]. We argue that privacy policies serve as a contract between users and service providers. Therefore, it is important to ensure that all parties, particularly the user, are fully aware of the implications of agreeing to the contract. We will assess the readability of all privacy policies examined in this project.

2.1 Companies

Given the recent increase in both usage and scrutiny of videoconferencing platforms due to the ongoing COVID-19 pandemic, we focus on five videoconferencing platforms for our analysis: Cisco Webex, Discord, Google Hangouts, Skype, and Zoom. We provide a brief introduction to and description of each of these services in the following sections.

2.1.1 Cisco Webex

Cisco Webex (originally Webex) is a videoconferencing and collaboration platform wholly owned by the technology conglomerate Cisco Systems. The primary videoconferencing service offered within the Webex platform is Webex Meetings. Thus, we primarily analyze the privacy policy of Webex Meetings [3], in addition to the general policies outlined in Cisco Systems' online privacy policy [4] where the latter is applicable.

2.1.2 Discord

Discord is a VoIP and general communication application that offers text, image, video, and audio communication over chat channels. It resembles a centralized version of IRC and is designed primarily for gamers, although it is also used by other communities, including in education. We analyze the company's privacy policy [5] as it applies to the platform and software distributed by Discord, Inc.
2.1.3 Google Hangouts

Google Hangouts is a communication application developed by Google that supports text chats, audio chats, and video conferencing. It was originally a feature of Google+, a discontinued social media platform, but has since become a standalone service. Hangouts does not have its own privacy policy and instead references the general Google privacy policy that is shared across many Google services.

2.1.4 Skype

Skype is a telecommunications application that allows for video chatting, voice calling, and instant messaging. Microsoft acquired Skype in 2011, so Microsoft's privacy policy now applies to Skype [6]. Skype can be used across various devices such as computers, tablets, mobile devices, the Xbox One console, smartwatches, and Amazon Echo. Specialized Skype products include Microsoft Teams (originally Skype for Business) and Skype for content creators, but we will focus on the general Skype product available for free to all consumers.

2.1.5 Zoom

Zoom is a cloud-based peer-to-peer platform for videoconferencing and online chatting developed by Zoom Video Communications. The service was originally launched in 2012 by a former executive at Cisco Webex, and has since grown to become one of the most popular videoconferencing platforms. We note that due to increased media scrutiny surrounding its data privacy policies, Zoom made significant updates to its privacy agreement on March 29th, 2020. As such, we analyze Zoom's current privacy agreement [7] alongside the other videoconferencing services, and we also compare it with its previous privacy agreement [8], dated December 31st, 2019, in a separate case study in Chapter 5.

3 Methodology

In analyzing the privacy policies of each company, we consider a few metrics as described in the next section. We find it important to distinguish between language and substance, so we include metrics for both.
Language reflects how easy it is for a consumer to understand the privacy policy, while substance reflects what companies can actually be held accountable for.

3.1 Metrics

3.1.1 Vagueness/Specificity

The privacy policies we analyzed vary in specificity, that is, in the amount of detail each company puts into its policy. Specificity applies to what data the company collects, when it collects it, what type of data is shared with which entities (if any), what the data is used for, and any guarantees on what the company does with the data. If a company were to list all of this information in full detail, i.e., without some form of catch-all clause, the company would be considered fully specific in how it handles user data. If a company does not list all of this information or includes a catch-all clause in its privacy agreement, then it may be considered more vague, depending on the amount of detail in the privacy policy. An example of a fully vague policy would be one that does not disclose in any detail what data is collected or how it is used.

3.1.2 Responsibility

Companies vary in how much responsibility they claim to take when it comes to privacy and data protection. We rate the responsibility of a company on a scale from full responsibility, i.e., the company proactively makes sure data is only collected with consent and used for certain functions, to no responsibility, i.e., it is entirely up to the user to consider what data they are sharing when they take certain actions, and the user should have no expectations about what that data might be used for. In the case that a privacy policy does not explicitly state what the company takes responsibility for, we assume the company falls on the "no responsibility" side of the spectrum.

3.1.3 Readability

Readability is another metric we will use in our analysis and evaluation of each company's privacy policies.
Quantifying readability is an ongoing research problem in its own right, with many viable methods for producing a readability score; in this project we use the Flesch–Kincaid readability test.
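To make the metric concrete, the following is a minimal sketch of the two standard Flesch–Kincaid formulas (Reading Ease and Grade Level). The text above does not say which of the two variants was used or how syllables were counted, so the vowel-group syllable heuristic, the sentence and word tokenization, and all function names here are illustrative assumptions rather than the procedure used in this project.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables by counting vowel groups (a rough heuristic, assumed here)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent ("made"), except for "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentence count, word count, syllable count) for the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    s, w, syl = text_stats(text)
    return 206.835 - 1.015 * (w / s) - 84.6 * (syl / w)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level: approximate U.S. school grade needed to read the text."""
    s, w, syl = text_stats(text)
    return 0.39 * (w / s) + 11.8 * (syl / w) - 15.59


if __name__ == "__main__":
    # Hypothetical policy-style sentences, written only for this illustration.
    plain = "We keep your name. We do not sell it."
    dense = ("Notwithstanding aforementioned provisions, personally identifiable "
             "information may be disclosed to affiliated subsidiaries.")
    for sample in (plain, dense):
        print(round(flesch_reading_ease(sample), 1),
              round(flesch_kincaid_grade(sample), 1))
```

Both scores depend only on average sentence length and average syllables per word, so they capture surface complexity rather than whether a policy's legal meaning is clear; a dictionary-based syllable counter would give more accurate counts than the heuristic assumed above.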
