
Proceedings on Privacy Enhancing Technologies; 2020 (2):129–154

Janus Varmarken†*, Hieu Le†, Anastasia Shuba, Athina Markopoulou, and Zubair Shafiq

The TV is Smart and Full of Trackers: Measuring Smart TV Advertising and Tracking

Abstract: In this paper, we present a large-scale measurement study of the smart TV advertising and tracking ecosystem. First, we illuminate the network behavior of smart TVs as used in the wild by analyzing network traffic collected from residential gateways. We find that smart TVs connect to well-known and platform-specific advertising and tracking services (ATSes). Second, we design and implement software tools that systematically explore and collect traffic from the top-1000 apps on two popular smart TV platforms, Roku and Amazon Fire TV. We discover that a subset of apps communicate with a large number of ATSes, and that some ATS organizations only appear on certain platforms, showing a possible segmentation of the smart TV ATS ecosystem across platforms. Third, we evaluate the (in)effectiveness of DNS-based blocklists in preventing smart TVs from accessing ATSes. We highlight that even smart TV-specific blocklists suffer from missed ads and incur functionality breakage. Finally, we examine our Roku and Fire TV datasets for exposure of personally identifiable information (PII) and find that hundreds of apps exfiltrate PII to third parties and platform domains. We also find evidence that some apps send the advertising ID alongside static PII values, effectively eliminating the user's ability to opt out of ad personalization.

Keywords: Smart TV; privacy; tracking; advertising; blocklists

DOI 10.2478/popets-2020-0021
Received 2019-08-31; revised 2019-12-15; accepted 2019-12-16.

1 Introduction

Smart TV adoption has steadily grown over the last few years, with more than 37% of US households owning at least one smart TV in 2018, a 16% increase over the previous year [1]. This growth is driven by two trends. First, over-the-top (OTT) streaming services such as Hulu and Netflix have become popular, with more than 28 million and 60 million subscribers in the US, respectively [2]. Second, smart TV solutions are available at relatively affordable prices: many of the external smart TV boxes/sticks are priced under $50, while built-in smart TVs cost only a few hundred dollars [3]. As a result, a diverse set of smart TV platforms have emerged and have been integrated into various smart TV products. For example, Apple TV integrates tvOS, and TCL and Sharp TVs integrate Roku.

Many of these platforms have their own respective app stores, where the vast majority of smart TV apps are ad-supported [4]. OTT advertising, which includes smart TV, is expected to increase by 40% to $2 billion in 2018 [5]. Roku and Fire TV are two of the leading smart TV platforms in number of ad requests [6]. Despite their increasing popularity, the advertising and tracking services ("ATSes") on smart TVs are currently not well understood by users, researchers, and regulators. In this paper, we present one of the first large-scale measurement studies of the emerging smart TV advertising and tracking ecosystem.

In the Wild Measurements (§3). First, we analyze the network traffic of smart TV devices in the wild. We instrument residential gateways of 41 homes and collect flow-level summary logs of the network traffic generated by 57 smart TVs from seven different platforms. The comparative analysis of network traffic by different smart TV platforms uncovers similarities and differences in their characteristics. As expected, we find that a substantial fraction of the traffic is related to popular video streaming services such as Netflix and Hulu. More importantly, we find generic as well as platform-specific ATSes. Although realistic, the in the wild dataset does not provide app-level visibility, i.e., we cannot determine which apps generate traffic to ATSes. To address this limitation, we undertake the following major effort.

*Corresponding Author: Janus Varmarken†: University of California, Irvine, E-mail: [email protected]
†Hieu Le: University of California, Irvine, E-mail: [email protected]. (†The first two authors made equal contributions and share the first authorship.)
Anastasia Shuba: Broadcom Inc. (The author was a student at the University of California, Irvine at the time the work was conducted), E-mail: [email protected]
Athina Markopoulou: University of California, Irvine, E-mail: [email protected]
Zubair Shafiq: University of Iowa, E-mail: [email protected]

Controlled Testbed Measurements (§4). We design and implement two software tools, Rokustic for Roku and Firetastic for Amazon Fire TV, which systematically explore apps and collect their network traffic. We use Rokustic and Firetastic to exercise the top-1000 apps of their respective platforms, and refer to the collected network traffic as our testbed datasets. We analyze the testbed datasets w.r.t. the top destinations the apps contact, at the granularity of fully qualified domain names (FQDNs), effective second level domains (eSLDs), and organizations. We use "domain", "endpoint", and "destination" interchangeably in place of FQDN and eSLD when the distinction is clear from the context. We further separate destinations as first, third, and platform-specific party, w.r.t. the app that contacts them.

First, we find that the majority of apps contact few ATSes, while about 10% of the apps contact a large number of ATSes. Interestingly, many of these more concerning apps come from a small set of developers. Second, we find what appears to be a segmentation of the smart TV ATS ecosystem across Roku and Fire TV as (1) the two datasets have little overlap in terms of ATS domains; (2) some third party ATSes are among the key players on one platform, but completely absent on the other; and (3) apps that are present on both platforms have little overlap in terms of the domains they contact. Third, we compare the top third party ATS domains of the testbed datasets to those of Android.

Evaluation of DNS-Based Blocklists (§5). Users typically rely on DNS-based blocking solutions such as Pi-hole [7] to prevent in-home devices such as smart TVs from accessing ATSes. Thus, we evaluate the effectiveness of DNS-based blocklists, selecting those that are most relevant to smart TVs. Specifically, we examine and test four popular blocklists: (1) Pi-hole Default blocklist (PD) [7], (2) Firebog's recommended advertising and tracking lists (TF) [8], (3) Mother of all Ad-Blocking (MoaAB) [9], and (4) StopAd's smart TV specific blocklist (SATV) [10]. Our comparative analysis shows that block rates vary, with Firebog having the highest coverage across different platforms and StopAd blocking the least. We further investigate potential false negatives (FN) and false positives (FP). We discover that blocklists miss different ATSes (FN), some of which are missed by all blocklists, while more aggressive blocklists can suffer from false positives that result in breaking app functionality. We discuss two ways to discover false negatives: through observing domains contacted by multiple apps ("app prevalence") and through keyword search (based on ATS related words like "ads" and "measure").

PII Exposures (§6). We further examine the network traces of our testbed datasets and find that hundreds of apps exfiltrate personally identifiable information (PII) to third parties and platform-specific parties, mostly for non-functional advertising and tracking purposes. Alarmingly, we find that many apps send the advertising ID alongside static PII values such as the device's serial number. This eliminates the user's ability to opt out of personalized advertisements by resetting the advertising ID, since the ATS can simply link an old advertising ID to its new value by joining on the serial number. We evaluate the blocklists' ability to prevent exposures of PII and find that they generally perform well for Roku, but struggle to prevent exfiltration of the device's serial number and the device ID to third parties and the platform-specific party for Fire TV.

Contributions. In this paper, we analyze the network behavior of smart TVs, both in the wild and in the lab. Our contributions include the following: (1) providing an in-depth comparative analysis of the ATS ecosystems of Roku, Fire TV, and Android; (2) illuminating the key players within the Roku and Fire TV ATS ecosystems by mapping domains to eSLDs and parent organizations; (3) evaluating the effectiveness and adverse effects of an extensive set of blocklists, including smart TV specific blocklists; (4) instrumenting long experiments per app to uncover approximately twice as many domains as [11]; and (5) making our tools, Rokustic and Firetastic, and our testbed datasets available [12].

Outline. The structure of the rest of the paper is as follows. Section 2 provides background on smart TVs and reviews related work. Section 3 presents the in the wild measurement and analysis of 57 smart TVs from seven different platforms in 41 homes. Section 4 presents our systematic testing approach and comparative analysis of the top-1000 Roku and Fire TV apps. Section 5 evaluates four well-known DNS-based blocklists and shows their limitations through analysis of missed ATSes and app breakage. Section 6 investigates exfiltration of PII. Section 7 concludes the paper, discusses limitations, and outlines directions for future work. The appendix provides additional details and results, as well as an evaluation and discussion of the limitations of our approach.
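The DNS-based blocking evaluated in Section 5 reduces, at its core, to matching each resolved FQDN against a list of known ATS domains. The sketch below is our own illustration (not the paper's tooling), assuming subdomain-inclusive matching, i.e., a rule for an eSLD also covers its subdomains; exact semantics vary by list type (hosts-style lists match exactly, wildcard rules match subdomains).

```python
def is_blocked(fqdn, blocklist):
    """Return True if fqdn or any of its parent domains is on the blocklist.

    Walking the label hierarchy means a rule for "doubleclick.net" also
    covers "pubads.g.doubleclick.net" (subdomain-inclusive matching).
    """
    labels = fqdn.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in blocklist:
            return True
    return False

blocklist = {"doubleclick.net", "scorecardresearch.com"}
print(is_blocked("pubads.g.doubleclick.net", blocklist))  # True
print(is_blocked("api-global.netflix.com", blocklist))    # False
```

A DNS forwarder would run such a check on every query and answer with NXDOMAIN (or a sinkhole address) on a match, which is why blocking a functional domain breaks app features outright.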

2 Background & Related Work

Background on Smart TVs. A smart TV is essentially an Internet-connected TV that has apps for users to stream content, play games, and even browse the web. There are two types of smart TV products in the market: (1) built-in smart TVs, and (2) external smart TV boxes/sticks, also referred to as over-the-top (OTT) streaming devices. Some TV manufacturers, such as Samsung and LG, offer TVs with built-in smart TV functionality, while several others provide external box/stick solutions, such as Roku (by Roku), Fire TV (by Amazon), and Apple TV (by Apple), that convert a regular TV into a smart TV. In addition, hybrid products exist, as some TV manufacturers now integrate external box/stick solutions directly into their smart TVs. For example, TCL and Sharp offer smart TVs that integrate Roku TV, while Insignia and Toshiba offer smart TVs that integrate Fire TV. We use "smart TV" as an umbrella term for TVs with built-in smart TV functionality and OTT streaming devices.

There is a diverse set of smart TV platforms, each with its own set of apps that users can install on their TVs. Many smart TVs use an Android-based operating system (e.g., Sony, AirTV, Philips) or a modified version of it (e.g., Fire TV). Regular Android TVs have access to apps from the Google Play Store, while Fire TV has its own app store controlled by Amazon. In both cases, applications for such TVs are built in a manner similar to regular Android applications. Likewise, Apple TV apps are built using technologies and frameworks that are also available for iOS apps, and both types of apps can be downloaded from Apple's App Store.

Some smart TV platforms are distinct as compared to traditional Android or iOS. For example, Samsung smart TV apps are built for their own custom platform called Tizen and are downloadable from the Tizen app store. Likewise, applications for the Roku platform are built using a customized language called BrightScript, and are accessible via the Roku Channel Store. Yet another line of smart TVs such as LG smart TV and Hybrid broadcast broadband TV (HbbTV) follow a web-based ecosystem where applications are developed using HTML, CSS, and JavaScript. Finally, some smart TV platforms, such as Chromecast, do not have app stores of their own, but are only meant to "cast" content from other devices such as smartphones.

As with mobile apps, smart TV apps can integrate third-party libraries and services, often for advertising and tracking purposes. Serving advertisements is one of the main ways for smart TV platforms and app developers to generate revenue [4]. Roku's advertising revenue exceeded $250 million in 2018 and is expected to more than double by 2020 [13]. Both Roku and Fire TV take a 30% cut of the advertising revenue from apps on their platforms [14]. The smart TV advertising ecosystem mirrors many aspects of the web advertising ecosystem. Most importantly, smart TV advertising uses programmatic mechanisms that allow apps to sell their ad inventory in an automated fashion using behavioral targeting [15, 16].

The rapidly growing smart TV advertising and associated tracking ecosystem has already warranted privacy and security investigations into different smart TV platforms. Consumer Reports examined privacy policies of various smart TV platforms including Roku, LG, Sony, and Vizio [17]. They found that privacy policies are often challenging to understand and that it is difficult for users to opt out of different types of tracking. For instance, many smart TVs use Automatic Content Recognition (ACR) to track their users' viewing data and use it to serve targeted ads [18]. Vizio paid $2.2 million to settle the charges by the Federal Trade Commission (FTC) that they were using ACR to track users' viewing data without their knowledge or consent [19]. While smart TV platforms now allow users to opt out of such tracking, it is not straightforward for users to turn it off [20]. Further, even with ACR turned off, users still must agree to a basic privacy policy that asks for the right to collect data about users' location, choice of apps, etc.

Related Work. While the desktop [21–23] and mobile [24–26] ATS ecosystems have been thoroughly studied, the smart TV ATS ecosystem has not been examined at scale until recently.

Three concurrent papers studied the network behavior and privacy implications of smart TVs [11, 27, 28]. Ren et al. [27] studied a large set of IoT devices, spanning multiple device categories. Their results showed that smart TVs were the category of devices that contacted the largest number of third parties, which further motivates our in-depth study of the smart TV ATS ecosystem. Huang et al. [28] used crowdsourcing to collect network traffic for IoT devices in the wild and showed that smart TVs contact many trackers by matching the contacted domains against the Disconnect blocklist. Finally, Moghaddam et al. [11] also instrumented the Roku and Fire TV platforms to map the ATS endpoints and the exposure of PII. Our work independently confirms the findings of these works w.r.t. the smart TV ATS ecosystem, both by analyzing seven different smart TV platforms in the wild and by performing systematic tests of two platforms (Roku and Fire TV) in the lab. In addition, we further contribute along two fronts. First, we show that even the same app across different smart TV platforms contacts different ATSes, which shows the fragmentation of the smart TV ATS ecosystem. Second, we evaluate the effectiveness of different sets of blocklists, including smart TV specific blocklists, in terms of their ability to prevent ads and their adverse effects on app functionality. We also suggest ways to aid blocklist curation through analysis of domain usage across apps and PII exposures.

Earlier work in this space includes [29] by Ghiglieri and Tews, who studied how broadcasting stations could track viewing behavior of users in the HbbTV platform. In contrast to the rich app-based platforms we study, the HbbTV platform studied in [29] contained only one HbbTV app that uses HTML5-based overlays to provide interactive content. Related to our work, they found that the HbbTV app loaded third-party tracking scripts from Google Analytics. Malkin et al. [30] surveyed 591 U.S. Internet users about their expectations on data collection by smart TVs. They found that users would rather enjoy new technology than worry about privacy, and users thus over-rely on existing laws and regulations to protect their data.

2.1 Labeling Methodology

Throughout this paper, we provide insight into the smart TV ATS ecosystems by labeling a domain according to (1) its purpose (ATS or non-ATS); (2) its parent organization (i.e., the domain owner); and (3) its relation to the app that uses it (first, third, or platform-specific party). We detail this methodology below.

ATS Domains. We identify ATS domains as follows. For figures that denote top domains, we check if the FQDN is labeled as ads or tracking by VirusTotal, McAfee, or OpenDNS [31–33], or if it is blocked by any of the blocklists considered in Sec. 5. For figures and tables that involve entire datasets, we only consider the blocklists due to the impracticality of manually labeling thousands of data points.

Parent Organizations. To understand the presence of different organizations on smart TV platforms, we map each FQDN to its effective second level domain (eSLD) using Mozilla's Public Suffix List [34, 35], and use Crunchbase [36]'s acquisition and sub-organization information to find the parent company of the eSLD. For example, hulu.com belongs to the Walt Disney Company and google.com belongs to Alphabet.

App-Level Party Categorization. The app-level visibility in our testbed experiments (Sec. 4) enables categorization of an Internet destination as a first party or a third party w.r.t. the app generating the traffic. We provide an overview of the technique here and defer details to Appendix C.1. We adopt a technique similar to prior work [24], and we augment it to also include a platform-specific party for traffic to platform-related destinations. We match tokenized eSLDs with tokenized package/app names and developer names. If the tokens match, we label the domain as first party. Otherwise, if the traffic originated from platform activity rather than app activity, we label it as platform-specific party: for Fire TV, AntMonitor [37] labels connections with the responsible process; for Roku, we check if the eSLD contains "roku". Otherwise, if the domain is contacted by at least two different apps from different developers, we label it as third party. Lastly, we resort to labeling it as other to capture domains that are only contacted by a single app.

3 Smart TV Traffic in the Wild

In this section, we study the network behavior of smart TV devices when used by real users by analyzing a dataset collected at residential gateways of tens of homes (we refer to this dataset as the in the wild dataset). We compare the number of flows and traffic volumes generated by smart TVs from seven different platforms. We analyze the most frequently used domains of each platform by identifying ATS domains and mapping each domain to its parent organization.

Data Collection. To study smart TV traffic characteristics in the wild, we monitor network traffic of 41 homes in a major metropolitan area in the United States. We sniff network traffic of smart TV devices at the residential gateways using off-the-shelf OpenWRT-capable commodity routers. We collect flow-level summary information for network traffic. For each flow, we collect its start time, the FQDN of the external endpoint (using DNS), and the internal device identifier. We identify smart TVs using heuristics that rely on DNS, DHCP, and SSDP traffic, and we also manually verify the identified smart TVs by contacting users. Our data collection covers a total of 57 smart TVs across 41 homes over the duration of approximately 3 weeks in 2018. Note that we obtained written consent from users, informing them of our data collection and research objectives, in accordance with our institution's IRB guidelines.
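The per-device statistics derived from these flow logs (reported in Table 1) amount to a simple aggregation. Below is a sketch of that computation, assuming a hypothetical simplified log of (device_id, platform, fqdn, byte_count) records; the actual log format and pipeline are not specified in the paper.

```python
from collections import defaultdict

def per_platform_stats(flows):
    """Aggregate flow-level records into per-platform averages.

    `flows` is an iterable of (device_id, platform, fqdn, byte_count)
    tuples -- a hypothetical simplification of the gateway logs.
    Returns {platform: (device_count, avg_flows, avg_gb, avg_fqdns)}.
    """
    per_device = defaultdict(lambda: [0, 0, set()])  # flows, bytes, fqdns
    platform_of = {}
    for device, platform, fqdn, nbytes in flows:
        rec = per_device[device]
        rec[0] += 1          # one more flow for this device
        rec[1] += nbytes     # accumulate traffic volume
        rec[2].add(fqdn)     # distinct endpoints contacted
        platform_of[device] = platform

    by_platform = defaultdict(list)
    for device, (n_flows, n_bytes, fqdns) in per_device.items():
        by_platform[platform_of[device]].append((n_flows, n_bytes, len(fqdns)))

    result = {}
    for platform, rows in by_platform.items():
        n = len(rows)
        result[platform] = (
            n,
            sum(r[0] for r in rows) / n,        # avg flow count / device
            sum(r[1] for r in rows) / n / 1e9,  # avg volume (GB) / device
            sum(r[2] for r in rows) / n,        # avg distinct FQDNs / device
        )
    return result
```

Averaging per device first, then across devices of a platform, keeps a single heavy streamer from dominating a platform's statistics less than pooling all flows would.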

[Fig. 1: four bar charts of top-30 FQDNs vs. number of flows — panels (a) Roku, (b) Apple, (c) Sony, (d) Samsung; domain lists omitted.]
Fig. 1. Top-30 fully qualified domain names in terms of number of flows per device for a subset of the smart TVs in the "in the wild" dataset. See Appendix C.2 for the other brands. Domains identified as ATS are highlighted with red, dashed bars.

Dataset Statistics. Table 1 lists basic statistics of the smart TV devices observed in our dataset. Overall, we note 57 smart TVs from 7 different vendors/platforms using a variety of technologies. Devices can be built-in smart TVs, such as Samsung and Sony; others, like Chromecast and Apple TV, are external stick/box solutions; and devices like Roku can have both forms. For example, 7 out of 9 Roku devices in our dataset were built-in Roku smart TVs, while the remaining two were external Roku sticks. Note that a smart TV platform such as Roku supports the same set of apps and a similar interface for both built-in and external smart TV devices. Thus, we do not differentiate between built-in vs. external Roku smart TV devices.

We expect smart TV devices to generate significant traffic because they are typically used for OTT video streaming [38]. Chromecast devices generate the highest number of flows (exceeding 200 thousand flows) on average, while Samsung, Apple, and Roku devices generate nearly 50 thousand flows on average. Roku devices generate the highest traffic volume (exceeding 80 GB) on average, with one Roku generating as much as 283 GB. Except for LG and Sony devices, all smart TV devices generate at least tens of GBs worth of traffic on average. Finally, we note that smart TV devices typically connect to hundreds of different endpoints on average.
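The endpoint and organizational analyses that follow group FQDNs by eSLD, as described in Sec. 2.1. A sketch of eSLD extraction is shown below, against a tiny illustrative suffix set; a real implementation would load Mozilla's full Public Suffix List, which also contains private entries such as cloudfront.net that make each CDN customer subdomain its own eSLD.

```python
def esld(fqdn, suffixes):
    """Return the effective second-level domain (eSLD) of fqdn.

    `suffixes` is a set of public suffixes (an illustrative subset here;
    Mozilla's Public Suffix List has thousands of entries). The eSLD is
    the longest matching public suffix plus one additional label.
    """
    labels = fqdn.lower().rstrip(".").split(".")
    # Scanning from the left finds the longest matching suffix first.
    for i in range(1, len(labels)):
        if ".".join(labels[i:]) in suffixes:
            return ".".join(labels[i - 1:])
    return fqdn  # no known suffix; fall back to the full name

SUFFIXES = {"com", "net", "co.uk", "cloudfront.net"}
print(esld("pubads.g.doubleclick.net", SUFFIXES))  # doubleclick.net
print(esld("www.bbc.co.uk", SUFFIXES))             # bbc.co.uk
```

Note why plain "take the last two labels" is insufficient: it would report co.uk as the eSLD of www.bbc.co.uk, and would merge all cloudfront.net customers into one owner.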

Apple Smart Device Average Average Average NBCUniversal Media comScore TV Count Flow Flow eSLD AT&T Amazon Platform Count Volume Count AEG /Device /Device /Device Roku Conviva (x 1000) (GB) Unknown/CDN Apple 16 49.3 46.6 536 Apple TV Samsung 11 62.6 33.2 369 Walt Disney Company

Chromecast 10 201.9 26.3 354 Roku Roku 9 48.1 83.0 543 Deezer Samsung Vizio 6 43.4 63.4 278 Samsung LG 4 10.9 0.9 1893 Netflix Sony 1 33.1 0.1 186 New Relic Vizio NFL

Table 1. Traÿc statistics of 57 smart TV devices observed across Vizio LG 41 homes (“in the wild” dataset). LG Kodi Sony Sony Localytics Flingo Chromecast Adobe Systems Endpoint Analysis. Fig. 1 plots the top-30 FQDNs in Time Warner terms of fow count for Roku, Apple, Sony, and Samsung Fox smart TV platforms. The plots for the remaining smart TV platforms are in Appendix C.2. Alphabet We note several similarities in the domains accessed by di˙erent smart TV devices. First, video streaming services such as Netfix and Hulu are popular across the Fig. 2. Mapping of platforms measured in the wild to the par- ent organizations of the endpoints they contact (for the top-30 board, as evident from domains such api-global.netfix. FQDNs of each platform). The width of an edge indicates the com and vortex.hulu.com. Second, cloud/CDN services number of distinct FQDNs within that organization that was ac- such as Akamai and AWS (Amazon) also appear for cessed by the platform. di˙erent smart TV platforms. Smart TVs likely connect zations such as Apple on the other end of the . to cloud/CDN services because popular video stream- Furthermore, it reveals the presence of organizations like ing services typically rely on third party CDNs [39, 40]. Conviva, comScore, and Localytics, whose main busi- Third, we note the prevalence of well-known adver- ness is in the advertising and tracking space. We note tising and tracking services (ATSes). For example, that Samsung, Deezer, Roku, LG, and Flingo have the *.scorecardresearch.com and *.newrelic.com are third majority of their domains labeled as ATSes while Net- party tracking services, and pubads.g.doubleclick.net is fix and Walt Disney Company have less than half of a third party advertising service. their domains labeled as ATSes. We notice several platform-specifc di˙erences in the domains accessed by di˙erent smart TV platforms. Takeaway & Limitations. Traÿc analysis of di˙er- For example, giga.logs.roku.com (Roku), time-ios.apple. 
ent smart TV platforms in the wild highlights interest- com (Apple), hh.prod.sonyentertainmentnetwork.com ing similarities and di˙erences. As expected, all devices (Sony), and log-ingestion.samsungacr.com (Samsung) generate traÿc related to popular video streaming ser- are unique to di˙erent types of smart TVs. In addition, vices. In addition, they also access ATSes, both well- we notice platform-specifc ATSes. For example, the fol- known and platform-specifc. While our vantage point lowing advertising-related domains are not in the top-30 at the residential gateway provides a real-world view of (and therefore not pictured in Fig. 1), but are unique to the behavior of smart TV devices, it lacks granular in- di˙erent smart TV platforms: p.ads.roku.com (Roku), formation beyond fows (e.g., packet-level information) ads.samsungads.com (Samsung), and us.info.lgsmartad. and does not tie traÿc to the app that generate it. An- com (LG). other limitation of this analysis is that the fndings may be biased by the viewing habits of users in these 41 Organizational Analysis. Figure 2 illustrates the mix households. It is unclear how to normalize to provide of di˙erent parent organizations contacted by the seven a fair comparison of endpoints accessed by the di˙er- smart TV platforms in our dataset. It shows the preva- ent smart TV platforms. We address these limitations lence of Alphabet in smart TV platforms like Chrome- cast, Sony, and LG, while revealing competing organi- The TV is Smart and Full of Trackers 135 next by systematically analyzing two popular smart TV libraries is still possible. For example, the Ooyala IQ platforms in a controlled testbed. SDK [46] provides various analytics services that can be integrated into a Roku app. Thus, such libraries can help ATSes learn the viewing habits of users by collect- 4 Systematic Testing of the Roku ing data from multiple apps. 
4 The Roku and Fire TV Platforms

In this section, we perform an in-depth, systematic study of two smart TV platforms, Roku and Amazon Fire TV, which we chose since they are popular [41], affordable ($25), and among the leading smart TV platforms in terms of number of ad requests [6]. Sections 4.1 and 4.2 present our measurement approach for systematically testing the top-1000 apps in each platform while collecting their network traffic. Since app exploration is automated, no real users are involved, thus no IRB is needed. Our measurement approach provides visibility into the behaviors of individual apps, which was not possible from the vantage point used for the in the wild dataset. In Section 4.3, we analyze the two testbed datasets, and compare them to each other and to the Android ATS ecosystem.

4.1 Roku Data Collection

In this section, we describe the Roku platform and our app selection methodology, and present an overview of Rokustic—our software tool that automatically explores Roku apps. We use Rokustic to explore and collect traffic from 1044 Roku apps. The resulting network traces are analyzed in Sec. 4.3.

Roku Platform. We start by describing the Roku platform, which has its own app store—the Roku Channel Store [42] (RCS)—that offers more than 8500 apps, called "channels". For security purposes, Roku sandboxes each app (apps are not allowed to interact with or access the data of other apps) and provides limited access to system resources [43]. Furthermore, Roku apps cannot run in the background. Specifically, app scripts are only executed when the user selects a particular app; when the user exits the app, the script is halted, and the system resumes control [44]. To display ads, apps typically rely on the Roku Advertising Framework, which is integrated into the Roku SDK [45]. The framework allows developers to use ad servers of their preference and updates automatically without requiring the developer to rebuild the app. Even though such a framework eliminates the need for third party ATS libraries, the development and usage of such libraries is not prohibited. In terms of permissions, Roku only protects microphone access with a permission and does not require any permission to access the advertising ID. Users can choose to reset this ID and opt out of targeted advertising at any time [45]. However, apps and libraries can easily use other IDs or use fingerprinting techniques to continue tracking users even after opt-out. We further elaborate on this in Sec. 6.

App Selection. The RCS provides a web (and on-device) interface for browsing the available Roku apps. To the best of our knowledge, Roku does not provide public documentation on how to programmatically query the RCS. We therefore reverse-engineer the REST API backing the RCS web interface by inspecting the HTTP(S) requests sent while browsing the RCS, and use this insight to write a script that crawls the RCS for the metadata of all (8515 as of April 2019) apps. To test the most relevant apps, we select the top-50 apps in 30 out of the 32 categories. We exclude "Themes" and "Screensavers" since these apps do not show up among the regular apps on the Roku and therefore cannot be operated using our automation software. We base our selection on the "star rating count", which we interpret as the review count. Roku apps can be labeled with multiple categories, thus some apps contribute to the top-50 of multiple categories. This places the final count of apps in our dataset at 1044.

Number of                                Roku    Fire TV   Both
Apps exercised                           1044    1010      128
Fully qualified domain names (FQDN)      2191    1734      578
FQDNs accessed by multiple apps          669     603       199
URL paths                                13899   240713    74

Table 2. Summary of the Roku and Fire TV testbed datasets. The rightmost column summarizes the intersection between the two testbed datasets. For example, there are 128 apps that are present both in the Roku dataset and the Fire TV dataset.

Automation (Rokustic). To scale testing of apps, we implement a software tool, Rokustic, that automatically installs and exercises Roku apps. Due to lack of space, we only provide a brief overview here and defer additional details to Appendix A.1. We run Rokustic on a Raspberry Pi that acts as a router and hosts a local wireless network that the Roku is connected to. Rokustic utilizes ECP [47], a REST-like API exposed by the Roku device, to control the Roku. Given a set of apps to exercise, Rokustic installs each app by invoking the ECP endpoint that opens up the on-device version of the RCS page for the app, and then sends a virtual key press to click the "Add Channel" button. To exercise apps, Rokustic first invokes the ECP endpoint that returns the set of installed apps. For each app, Rokustic then (1) starts tcpdump on the Raspberry Pi's wireless interface; (2) uses ECP to launch the app and invoke a series of virtual key presses in an attempt to incur content playback; (3) pauses for five minutes to let the content play; (4) exits the app and repeats from step 2 an additional two times; (5) terminates tcpdump. Since Roku apps cannot execute in the background (see "Roku Platform"), all captured traffic will belong to the exercised app and the Roku system. The total interaction time with each app is approximately 16 minutes. We do not attempt to decrypt TLS traffic as we cannot install our own self-signed certificates on the Roku.
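The install-and-exercise loop above can be sketched with a few HTTP calls. This is a minimal illustration, not Rokustic itself: ECP does expose endpoints such as /query/apps, /launch/<appID>, and /keypress/<key> on port 8060, but the key sequence and timing below are hypothetical, and the tcpdump capture is omitted.

```python
import time
import urllib.request

ECP_PORT = 8060  # Roku's External Control Protocol (ECP) listens on this port

def ecp_url(roku_ip, path):
    """Build the URL of an ECP endpoint on the Roku at roku_ip."""
    return f"http://{roku_ip}:{ECP_PORT}/{path}"

def ecp_post(roku_ip, path):
    """ECP commands (launch, keypress, install) are plain HTTP POSTs."""
    req = urllib.request.Request(ecp_url(roku_ip, path), data=b"", method="POST")
    urllib.request.urlopen(req, timeout=5).close()

def exercise_app(roku_ip, app_id, rounds=3, playback_wait=300):
    """Launch an app, press keys to try to trigger playback, wait, exit.

    Mirrors steps (2)-(4) of the loop described above; packet capture
    would be started before and stopped after this call.
    """
    for _ in range(rounds):
        ecp_post(roku_ip, f"launch/{app_id}")
        time.sleep(15)                                    # let the app load (illustrative)
        for key in ("Select", "Down", "Select", "Play"):  # hypothetical key sequence
            ecp_post(roku_ip, f"keypress/{key}")
            time.sleep(2)
        time.sleep(playback_wait)                         # step (3): let content play
        ecp_post(roku_ip, "keypress/Home")                # step (4): exit the app
```

The installed-apps inventory (step before the loop) would similarly be a GET to /query/apps, which returns XML listing each app's ID.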

4.2 Fire TV Data Collection

In this section, we describe the Fire TV platform and our app selection methodology, and present an overview of Firetastic—our software tool that automatically explores Fire TV apps. By using Firetastic to control six Fire TV devices in parallel, we explore and collect traffic from 1010 Fire TV apps within a one-week period. The resulting network traces are analyzed in Sec. 4.3.

Fire TV Platform. Although Fire TV is made by Amazon, its underlying operating system, Fire OS, is a modified version of Android. This allows apps for Fire TV to be developed in a similar fashion to Android apps. Therefore, all third-party libraries that are available for Android apps can also be integrated into Fire TV apps. Similarly, application sandboxing and permissions in Fire TV are analogous to those of Android, and any permission requested by the app is inherited by all libraries that the app includes. This allows third party libraries to track users across apps using a variety of identifiers, such as the Advertising ID, Serial Number, and Device ID. We further discuss tracking through PII exposure in Section 6.

App Selection. To test the most relevant apps, we pick the top-1000 apps from Amazon's curated list of "Top Featured" apps. We ignore apps that use a local VPN (as they would conflict with AntMonitor), apps that could not be installed manually, and utility apps that can change the device settings (which would affect the test environment). As a result, we ignore around 200 apps while including 1010 testable applications. Amazon's app store offers around 4,000 apps at the time of writing, thus our dataset covers approximately 25%.

Automation (Firetastic). We design and implement a software tool, Firetastic, that integrates the capabilities of two open source tools for Android: an SDK for network traffic collection and a tool for input automation. We provide a brief overview of Firetastic here, and defer additional details to Appendix A.2. We rely on AntMonitor [37, 48], an open-source VPN-based library, to intercept all outgoing network traffic from the Fire TV, and to label each packet with the package name of the application (or system process) that generated it. We enable AntMonitor's TLS decryption for added visibility into PII exposures. We analyze the success of TLS decryption in Appendix B. In summary, TLS decryption was generally successful, with 10% or fewer failures for 55% of all apps, and 20% or fewer failures for 80% of all apps.

For app exploration, we utilize DroidBot [49], a Python tool that dynamically maps the UI and simulates user inputs such as button presses using the Android Debug Bridge (ADB). To increase the probability of content playback, we configure DroidBot to utilize its breadth-first search algorithm to explore each app. The intuition is that the main content is often made available from top-level UI elements.

In summary, for each app, Firetastic: (1) starts AntMonitor; (2) explores the app for 15 minutes; (3) stops AntMonitor; and (4) extracts the .pcapng files that were generated during testing. We use Firetastic to explore apps on six Fire TV devices in parallel. Our test setup is resource-efficient and scalable, using only one computer to send commands to multiple Fire TVs.

4.3 Comparing Roku and Fire TV

In this section, we analyze the (ATS) domains accessed by the apps in the Roku and Fire TV testbed datasets. We first provide an overview of the datasets, and analyze how many (ATS) domains apps contact. We then look closer at the eSLDs and third party ATS domains that are contacted by the most apps, including which parent organizations they belong to. Furthermore, we compare the top third party ATS domains to those of Android. Finally, we compare the domains accessed by apps that are present on both testbed platforms.

[Fig. 3 panels: (a) Roku & Fire TV: distinct domains per app (CDF). (b) Roku & Fire TV: distinct ATS domains per app (CDF); a domain is considered an ATS if it is labeled as such by any of the blocklists considered in Sec. 5. (c) Roku: top-30 eSLDs. (d) Fire TV: top-30 eSLDs. (e) Roku: top-20 third party ATS domains. (f) Fire TV: top-20 third party ATS domains.]

Fig. 3. Analysis of domain usage per app and the top domains across all apps in the Roku and Fire TV testbed datasets.
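Panels (c) and (d) above rank eSLDs by how many apps contact any of their subdomains, i.e., by the "app penetration" statistic used in Sec. 4.3. A minimal sketch of that computation, assuming a mapping from app names to contacted FQDNs (the toy data is fabricated, and the last-two-labels eSLD heuristic is a simplification; a faithful analysis should derive eSLDs from the Public Suffix List, e.g., via the tldextract package):

```python
from collections import defaultdict

def to_esld(fqdn):
    """Naive eSLD: keep the last two DNS labels (use the Public Suffix List
    in practice, since e.g. *.co.uk domains would be collapsed incorrectly)."""
    return ".".join(fqdn.split(".")[-2:])

def app_penetration(app_to_fqdns):
    """Return {eSLD: fraction of apps that contact any subdomain of it}."""
    apps_per_esld = defaultdict(set)
    for app, fqdns in app_to_fqdns.items():
        for fqdn in fqdns:
            apps_per_esld[to_esld(fqdn)].add(app)
    total_apps = len(app_to_fqdns)
    return {esld: len(apps) / total_apps for esld, apps in apps_per_esld.items()}

contacts = {  # fabricated example input
    "app1": {"pubads.g.doubleclick.net", "api.example.com"},
    "app2": {"ad.doubleclick.net"},
    "app3": {"cdn.example.com"},
}
print(round(app_penetration(contacts)["doubleclick.net"], 2))  # 0.67
```

Grouping apps per eSLD (rather than counting FQDN hits) ensures an app contacting several subdomains of the same organization is counted once.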

Overview. The datasets collected using Rokustic and Firetastic are summarized in Table 2. For Roku, we discover 2191 distinct FQDNs, 699 of which are contacted by multiple apps. For Fire TV, we discover 1734 distinct FQDNs, 603 of which are contacted by multiple apps. We also find 578 FQDNs that appear in both datasets, 199 of which are contacted by multiple apps. Our automation uncovers approximately twice as many FQDNs as [11], possibly due to longer experiments and different app exploration goals. We further detail this in Appendix A.3. Leveraging the blocklists from Sec. 5, we identify 314 ATS domains that are unique to the Roku dataset, 285 that are unique to the Fire TV dataset, and an overlap of 227 between the two datasets. When considering the eSLDs of the ATS domains, we find 68 eSLDs that are unique to the Roku dataset, 100 that are unique to the Fire TV dataset, and an overlap of 138 eSLDs. These numbers suggest that the ATS ecosystems of the two platforms have substantial differences, which we analyze in further detail later in this section.

Number of (ATS) Domains Contacted. Figure 3a presents the empirical CDF of the number of distinct domains each app in the two testbed datasets contacts. Fire TV apps appear more "chatty"—most Fire TV apps contact about twice as many domains as the Roku apps. However, when we consider the number of ATS domains contacted per app in Fig. 3b, the vast majority (80%) of apps from the two platforms behave similarly. On the positive side, for both platforms, around 60% of the apps contact only a small handful of ATS domains. Yet, about 10% of the Roku and Fire TV apps contact 20+ and 10+ ATS domains, respectively. These concerning apps come from a small set of developers. For instance, for Roku, "Future Today Inc." [50], "8ctave ITV", and "StuffWeLike" [51] are responsible for 51%, 13%, and 11%, respectively, of these apps. On the Fire TV side, "HTVMA Solutions, Inc." [52] is responsible for 15% of the apps, and "Gray Television, Inc." [53] is responsible for 12% of the apps.

Key Players. Figures 3c (Roku) and 3d (Fire TV) present the top-30 eSLDs in terms of the number of apps that contact a subdomain of the eSLD. We define an eSLD's app penetration as the percentage of apps in the dataset that contact the eSLD.

Platform. The top eSLD of each platform has 100% app penetration and belongs to the platform operator. While these eSLDs also cover subdomains that provide functionality, we note that both platform operators are engaged in advertising and tracking, as shown in Fig. 9 in Appendix C.3, which separates traffic to the eSLDs in Figs. 3c and 3d by ATS and non-ATS FQDNs.

Third Parties. As evident from Figs. 3c and 3d, Alphabet has a strong presence in the ATS space of both platforms, with *.doubleclick.net, an ad delivery endpoint, achieving 58% and 35% app penetration for Roku and Fire TV, respectively. Its analytics services also rank high, with google-analytics.com in the top-20 for both platforms and crashlytics.com in the top-10 for Fire TV. To better understand additional key third party ATSes, we strip away the platform-specific endpoints and report the top-20 third party ATS FQDNs for Roku and Fire TV in Figs. 3e and 3f, respectively. We note that both platforms use distinct third party ATSes. For example, SpotX (*.spotxchange.com), which serves video ads, is a significant player in the Roku ATS space with 17% app penetration, but only maintains 1% app penetration for Fire TV. Even when considering the smaller players, we see little overlap between the two platforms, suggesting these players focus their efforts on a single platform. For example, Kantar Group's insightexpressai.com analytics service has 7% app penetration on the Roku platform, but only 0.01% on the Fire TV platform.

[Fig. 4: bipartite graph with edges from the Roku and Fire TV platforms to parent organizations, including Alphabet, Oracle, Amazon, Unity Tech, Numitas, Facebook, Adobe Systems, comScore, The Trade Desk, Barons Media, Pixalate, Bain Capital, Telaria, and RTL Group.]

Fig. 4. Mapping of platforms measured in our testbed environment to the parent organizations of the top-20 third-party ATS domains their apps contact. The width of an edge indicates the number of apps that contact each organization.

Parent Organization Analysis. We further analyze the parent organizations of Roku and Fire TV third party ATS endpoints in Fig. 4, using the method described earlier in Section 2.1. Interestingly, the set of top

third party organizations is rather diverse, with only a slight overlap in the shape of Adobe Systems and comScore, possibly suggesting that the remaining organizations focus their efforts on a single platform. Fire TV shows gaming and social media ATSes from Unity Tech and Facebook, whereas Roku exhibits more traffic to ATSes from companies that focus on video ads, such as The Trade Desk, Telaria, and RTL Group. Similar to the in the wild organization analysis in Fig. 2, we again note that Alphabet dominates the set of third party ATSes on both Roku and Fire TV.

Comparing to Android ATS Ecosystem. Next, we compare the top-20 third party ATS endpoints in our Roku and Fire TV datasets (Figs. 3e and 3f) with those reported for Android [24].

Roku vs. Android. The key third party ATSes in Roku (Fig. 3e) differ from those on the Android platform. For example, SpotX (*.spotxchange.com) and comScore (*.scorecardresearch.com) both have a strong presence on Roku, but are not among the key players for Android. In contrast, Facebook's graph.facebook.com is the second most popular ATS domain on Android, but insignificant on Roku. The set of top third party ATSes in Roku is also more diverse and includes smaller organizations such as Pixalate (adrta.com) and Telaria (*.tremorhub.com). While Alphabet has a strong foothold in both ATS ecosystems, it is less significant for Roku (9 out of 20 ATS FQDNs are Alphabet-owned, vs. 16 out of 20 for Android).

Fire TV vs. Android. In contrast to Roku, Fire TV is more similar to Android: we see an overlap of 9 FQDNs, 7 of which are owned by Alphabet. This is expected, given that Fire TV is based off of Android and thus natively supports the ATS services of Android. Facebook (graph.facebook.com) and Verizon (data.flurry.com) both have a strong presence on both Fire TV and Android. Some of the third party ATSes observed for Fire TV, which were not present for Android, include comScore, Adobe (dpm.demdex.net), and Amazon (applab-sdk.amazon.com).

Common Apps in Roku and Fire TV. Next, we compare the Roku and Fire TV datasets at the app level by analyzing the FQDNs accessed by the set of apps that appear on both platforms, referred to as common apps. Recall from Table 2 that the datasets collected using Rokustic and Firetastic contain a total of 128 common apps. We identified common apps by fuzzy matching app names, since they sometimes vary slightly for each platform (e.g., "TechSmart.tv" on Roku vs. "TechSmart" on Fire TV). We further cross-referenced with the developer's name to validate that the apps were indeed the same (e.g., both TechSmart apps are created by "Future Today"). The 128 common apps contact a total of 1248 distinct FQDNs. Out of these, 597 FQDNs are exclusively contacted by Roku apps, 496 are exclusively contacted by Fire TV apps, and only 155 FQDNs are contacted by both Roku and Fire TV apps.

[Fig. 5: per-app counts of Roku-only, Fire TV-only, and common hostnames for the top-60 common apps.]

Fig. 5. Top-60 common apps (apps present in both testbed datasets) ordered by the number of domains that each app contacts. Considering all 128 common apps, there are 597 domains which are exclusive to Roku apps, 496 domains which are exclusive to Fire TV apps, and 155 domains which are contacted by both the Roku and the Fire TV versions of the same app.

Figure 5 reports the amount of overlapping and non-overlapping FQDNs for the top-60 common apps (in terms of the number of distinct FQDNs that each app contacts). In general, the set of FQDNs contacted by both the Roku and the Fire TV versions of the same app is much smaller than the set of platform-specific FQDNs. From inspecting the common FQDNs for some of the apps in Fig. 5, we find that these generally include endpoints that serve content. For example, for Mediterranean Food, the only two common FQDNs are subdomains of ifood.tv, which belong to the parent organization behind the app. This makes intuitive sense, as the same app presumably offers the same content on both platforms and must therefore access the same servers to download said content. On the other hand, the platform-specific domains contain obvious ATS endpoints such as ads.yahoo.com and ads.stickyadstv.com for the Roku version of the app, and aax-us-east.amazon-adsystem.com and mobileanalytics.us-east-1.amazonaws.com for the Fire TV version of the app. In conclusion, our analysis of common apps reveals (to our surprise) little overlap in the ATS endpoints they access, which further suggests that the smart TV ATS ecosystem is segmented across platforms.

Takeaway. The ATS ecosystems of the Roku and Fire TV platforms seem to differ substantially: (1) the full sets of ATS domains contacted by apps in the two datasets have little overlap; (2) some organizations are key players on one platform, but almost absent on the other (e.g., SpotX has a significant presence on Roku, but is almost absent on Fire TV, whereas Facebook has a reasonable foothold on Fire TV, but almost zero presence on Roku); and (3) apps present in both datasets have little overlap in terms of the ATS domains they contact. Finally, we find that the key third party ATS players on Android have little overlap with Roku, but substantial overlap with Fire TV, which intuitively makes sense as Fire TV is built on top of Android.

5 Blocklists for Smart TVs

In this section, we evaluate four well-known DNS-based blocklists' ability to prevent smart TVs from accessing ATSes, as well as their adverse effects on app functionality. We further demonstrate how the datasets resulting from automated app exploration may aid in curating new candidate rules for blocklists.

5.1 Evaluating Popular DNS Blocklists

DNS-based blocking solutions such as Pi-hole [7] are used to prevent in-home devices, including smart TVs, from accessing ATS domains [54]. To block advertising and tracking traffic, they essentially "blackhole" DNS requests to known ATS domains. Specifically, they match the domain name in the DNS request against a set of blocklists, which are essentially curated hosts files that contain rules for well-known ATS domains. If the domain name is found in one of the blocklists, it is typically mapped to 0.0.0.0 or 127.0.0.1 to prevent outbound traffic to that domain [55].

Setup. We evaluate the following blocklists:
1. Pi-hole Default (PD): We test the blocklists included in Pi-hole's default configuration [56] to imitate the experience of a typical Pi-hole user. This set has seven hosts files, including Disconnect.me ads, Disconnect.me tracking, hpHosts, CAMELEON, MalwareDomains, StevenBlack, and Zeustracker. PD contains a total of about 133K entries.
2. The Firebog (TF): We test nine advertising and five tracking blocklists recommended by "The Big Blocklist Collection" [8], to emulate the experience of an advanced Pi-hole user. This includes Disconnect.me ads, hpHosts, a dedicated blocklist targeting smart TVs, and hosts versions of EasyList and EasyPrivacy. TF contains 162K entries total.
3. Mother of all Ad-Blocking (MoaAB): We test this curated hosts file [9] that targets a wide range of unwanted services, including advertising, tracking (cookies, page counters, web bugs), and malware (phishing, spyware), to again imitate the experience of an advanced Pi-hole user. MoaAB contains a total of about 255K entries.
4. StopAd (SATV): We test a commercial smart TV focused blocklist by StopAd [10]. This list particularly targets Android-based smart TV platforms such as Fire TV. We extract StopAd's list by analyzing its APK using Android Studio's APK Analyzer [57]. SATV contains a total of about 3K entries.

We applied these blocklists to both our in the wild and testbed datasets, and we report the results next.

                          Block Rate (%)
Platform      # Domains   PD    TF    MoaAB   SATV
Dataset obtained "in the wild"
Apple         3179        10%   13%   12%     5%
Samsung       1765        14%   19%   15%     8%
Chromecast    1576        9%    15%   15%     5%
Roku          2312        15%   19%   18%     7%
Vizio         942         16%   18%   16%     11%
LG            627         45%   54%   50%     27%
Sony          119         16%   24%   16%     7%
Dataset obtained in our testbed
Roku          2191        17%   22%   20%     9%
Fire TV       1734        22%   27%   22%     9%

Table 3. Block rates of the four blocklists when applied to the domains in our datasets.

Block Rates. We start our analysis by comparing how many FQDNs are blocked by the different blocklists. We define a blocklist's block rate as the number of distinct FQDNs that are blocked by the list, over the total

[Table 4: per-app results for the 10 sampled Roku apps and 10 sampled Fire TV apps under each blocklist; individual cells omitted.]

Table 4. Missed ads and functionality breakage for different blocklists when employed during manual interaction with 10 Roku apps and 10 Fire TV apps. For "No Ads", a checkmark (✓) indicates that no ads were shown during the experiment, a cross (✗) indicates that some ad(s) appeared during the experiment, and a dash (—) indicates that breakage prevented interaction with the app altogether. For "No Breakage", a checkmark (✓) indicates that the app functioned correctly, a cross (✗) indicates minor breakage, and a bold cross indicates major breakage.
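The block-rate metric defined in Sec. 5.1 reduces to exact-matching each distinct FQDN in a dataset against the merged hosts-style list, a simplification that ignores any wildcard or regex rules a real deployment might add. A minimal sketch with fabricated domains:

```python
def block_rate(fqdns, blocklist):
    """Fraction of distinct FQDNs appearing on the merged hosts-style list.
    Hosts files match exact hostnames; wildcard/regex rules are ignored here."""
    distinct = set(fqdns)
    return sum(1 for fqdn in distinct if fqdn in blocklist) / len(distinct)

# Fabricated example: 2 of the 4 distinct FQDNs are on the merged list.
dataset = {"ads.example.tv", "cdn.example.com", "track.example.net", "api.example.org"}
merged_list = {"ads.example.tv", "track.example.net"}
print(f"{block_rate(dataset, merged_list):.0%}")  # 50%
```

Computing this per list over a dataset's FQDN set yields the percentages of the kind reported in Table 3.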

number of distinct FQDNs in the dataset. Table 3 compares the block rates of the aforementioned blocklists on our in the wild and testbed datasets. Overall, we note that TF, closely followed by MoaAB and PD, blocks the highest fraction of domains across all of the platforms in both the in the wild and testbed datasets. SATV is a distant last in terms of block rate. It is noteworthy that TF blocks more domains than MoaAB despite being about one-third shorter. We surmise this is because TF includes a smart TV focused hosts file, and thus catches more relevant smart TV ATSes. This finding shows that the size of a blocklist does not necessarily translate to its coverage.

Blocklist Mistakes. Motivated by the differences in the block rates of the four blocklists, we next compare them in terms of false negatives (FN) and false positives (FP). False negatives occur when a blocklist does not block requests to ATSes and may result in (visually observable) ads or (visually unobservable) PII exfiltration. False positives occur when a blocklist blocks requests that enable app functionality and may result in (visually observable) app breakage.

We first systematically quantify visually observable false negatives and false positives of blocklists by interacting with a sample of apps from our testbed datasets while coding for ads and app breakage. We sample 10 Roku apps and 10 Fire TV apps, including the top-4 free apps, three apps that are present on both platforms, and an additional three randomly selected apps. We test each app five times: once without any blocklist and four times where we individually deploy each of the aforementioned blocklists. During each experiment, we attempt to trigger ads by playing multiple videos and/or live TV channels and fast-forwarding through video content. We take note of any functionality breakage (due to false positives) and visually observable missed ads (due to false negatives). We differentiate between minor and major functionality breakage as follows: minor breakage is when the app's main content remains available but the application suffers from minor user interface glitches or occasional freezes; major breakage is when the app's content becomes completely unavailable or the app fails to launch.

Missed Ads vs. Functionality Breakage. Table 4 summarizes the results of our manual analysis for missed ads and functionality breakage. Overall, we find that none of the blocklists are able to block ads from all of the sampled apps while avoiding breakage. In particular,
none of the blocklists are able to block ads in YouTube and Pluto TV (available on both Roku and Fire TV). Across the different lists, PD seems to achieve the best balance between blocking ads and preserving functionality.

For Roku, PD and TF perform similarly. While TF is the only list that blocks ads in Sony Crackle, both lists miss ads in YouTube and Pluto TV. TF majorly breaks three apps, while PD only majorly breaks one app. MoaAB is unable to block ads in four apps and majorly breaks only one app. SATV does not cause any breakage, but is unable to block ads in six apps.

For Fire TV, PD again seems to be the most effective at blocking ads while avoiding breakage, but is still unable to block ads in one app (Pluto TV) and majorly breaks two apps. TF is also unable to block ads in Pluto TV, but majorly breaks four apps. MoaAB is unable to block ads in three apps and majorly breaks three apps (plus one minor breakage). SATV is unable to block ads in four apps and majorly breaks one app (plus two minor breakages).

Takeaway. All blocklists suffer from a non-trivial amount of visually observable FPs and FNs. Some blocklists (e.g., PD and TF) are clearly more effective than others. Interestingly, SATV, which is curated specifically for smart TVs, did not perform well.

5.2 Identifying False Negatives

In this section, we demonstrate how datasets generated using automation tools such as Rokustic and Firetastic enable blocklist curators to identify potential false negatives in the blocklist. In particular, we observe that the more apps that contact a FQDN, the more likely it is that the FQDN is an ATS. This is intuitive and consistent with a similar observation previously made in the mobile ecosystem [24].

We first use simple keywords such as "ad", "ads", and "track" to shortlist obvious ATS domains in our datasets. While keyword search is not perfect, this simple approach identifies several obvious false negatives (we provide the full list in Appendix C.4). For example, p.ads.roku.com and adtag.primetime.adobe.com are advertising/tracking-related domains that are not blocked by any of the lists.

We observe that many of these false negatives (i.e., missed ATS domains) are contacted by multiple apps in our testbed datasets. For example, p.ads.roku.com is accessed by more than 100 apps in our Roku testbed dataset. To gain further insight into potential false negatives, we study whether the likelihood of being blocked is impacted by the number of apps that access a domain. Figure 6 plots the block rates for the union of the four blocklists as a function of FQDNs' occurrences across apps in our testbed datasets. We note that the block rate substantially increases for domains that appear across multiple apps. For example, the block rate almost doubles for domains contacted by two or more apps as compared to domains contacted by one or more apps. Domains that are contacted by multiple different apps are therefore more likely to belong to third party ATS libraries included by smart TV apps.

[Fig. 6: block rate (union of the four blocklists) vs. minimum number of apps ("1+" through "10+") contacting a FQDN, for Roku and Fire TV.]

Fig. 6. Block rates as a function of the number of apps that contact a FQDN. For the horizontal axis, "2+" represents the set of FQDNs that are contacted by 2 or more apps. For Roku, the more apps that contact an FQDN, the more likely it is that the FQDN is an ATS, according to the blocklists. The same is not true for Fire TV because platform services start to dominate the set of FQDNs that are accessed by many apps, and platform services are often not blocked.

6 PII Exposures in Smart TVs

In this section, we examine our testbed datasets from Sec. 4 for exposure of personally identifiable information (PII) and we evaluate the effectiveness of blocklists in preventing it. We define "PII exposure" as the transmission of any PII from the smart TV device to any Internet destination. We identify PII values (such as advertising ID and serial number) through the settings menus and packaging of each device. Since trackers are known to encode or hash PII [58], we compute the MD5 and SHA1 hashes for each of the PII values. We then search for these PII values in the HTTP header fields and URI path. Recall from Section 4 that we can analyze HTTP information even for encrypted flows in the Fire TV dataset due to AntMonitor's TLS decryption [37, 48], but can only analyze unencrypted flows in Roku. The number of PII exposures reported for Roku should therefore be considered a lower bound.
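The search procedure described above can be sketched as follows. This is a simplification of the actual pipeline: only the lowercase plaintext, MD5, and SHA1 forms of each value are checked against header values and the URI path, and all identifiers below are fabricated.

```python
import hashlib

def pii_variants(value):
    """A PII value may travel in plaintext or hashed (MD5/SHA1) form [58]."""
    raw = value.encode()
    return {value.lower(),
            hashlib.md5(raw).hexdigest(),
            hashlib.sha1(raw).hexdigest()}

def find_exposures(pii, uri_path, headers):
    """Return the names of PII items whose plaintext or hash appears in the
    request's URI path or header values (case-insensitive)."""
    haystack = (uri_path + " " + " ".join(headers.values())).lower()
    return {name for name, value in pii.items()
            if any(variant in haystack for variant in pii_variants(value))}

# Fabricated identifiers: an app leaking an MD5-hashed advertising ID.
pii = {"ad_id": "f3b0c442-98fc-1c14-9afb-f4c8996fb924", "serial": "X00ABCD123"}
path = "/GetAds?adid=" + hashlib.md5(pii["ad_id"].encode()).hexdigest()
print(find_exposures(pii, path, {"User-Agent": "ExampleApp/1.0"}))  # {'ad_id'}
```

Running such a check over every decrypted (or plaintext) HTTP request yields the per-app, per-destination exposure counts of the kind summarized in Table 5.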

Roku testbed dataset (apps / eSLDs / % distinct FQDNs blocked):
PII             1st Party    3rd Party     Other       Total
Advertising ID  4/4/25%      263/36/88%    6/3/0%      269/42/81%
Serial Number   48/17/5%     128/16/74%    2/2/0%      174/34/36%
Device ID       -            -             -           -
Username        4/4/0%       1/1/100%      -           5/5/20%
MAC             -            -             -           -
Location        -            42/2/100%     -           42/2/100%

Fire TV testbed dataset (apps / eSLDs / % distinct FQDNs blocked):
PII             1st Party    3rd Party     Platform Party   Other       Total
Advertising ID  17/7/25%     53/31/78%     715/4/71%        5/5/40%     725/39/71%
Serial Number   10/3/0%      51/4/33%      867/4/9%         2/2/0%      881/9/12%
Device ID       19/8/0%      153/27/36%    819/5/14%        10/11/21%   856/43/31%
Username        1/2/0%       2/2/100%      1/1/100%         -           4/5/40%
MAC             -            2/2/100%      -                -           2/2/100%
Location        -            27/7/90%      2/2/100%         -           28/7/90%

Table 5. Applications / eSLDs / % distinct FQDNs blocked. Number of apps that expose PII, number of distinct eSLDs that receive PII from these apps, and percentage of distinct subdomains of the eSLDs that are blocked by the blocklists. We further separate by party as defined in Sec. 2.1. The Roku platform column is omitted since we do not observe PII exposures to platform domains.

Overview. Table 5 reports the PII exposures for both testbed platforms. For Roku, the majority of the PII exposures are to third parties, whereas for Fire TV they are to the platform-specific party. The blocklists adequately prevent exfiltration of PII to third parties, blocking 74% or more of third party domains for all the PII values considered for Roku. For Fire TV, they mitigate the majority of advertising ID exposures, blocking 71% or more of the involved domains, but are not as effective in preventing exposures of serial number and device ID to third parties and platform destinations.

Differentiating PII Exposures. Inspired by [59], we adopt a simple approach for distinguishing between "good" and "bad" PII exposures that treats PII exposures to third parties as a higher threat to privacy than PII exposures to first parties. PII exposures to first parties are generally warranted, as they likely have a functional purpose such as personalization of content (e.g., keeping track of where the user paused a video). For example, the Roku app "Acacia Fitness & Yoga Channel" from "RLJ Entertainment" sends a request to the first party domain api.rlje.net with a URI path of "/cms/acacia/today/roku/content/browse.json" while including the device's serial number in an HTTP header field, suggesting that the serial number is used to personalize today's featured content. On the other hand, PII exposures to third parties are generally unwarranted, as they typically do not have a functional purpose. This extends to cases where the app retrieves its content through a third party CDN, as the personalization could be achieved by first sending the PII to the first party server, which could then respond to the app with the CDN URL for the content to be retrieved. For instance, the Roku app "ArmchairTourist" from "ArmchairTourist Video Inc." sends a request to the third party domain ads.adrise.tv with a URI path of "/track/impression..." that encodes the device's serial number, suggesting that the PII is used to track what ads have been shown to the user.

Exposures to Platform Party. For Fire TV, the majority of the exposures of serial number and device ID to platform destinations seem to be for advertising and tracking purposes. For example, 697 apps send the serial number and device ID (and advertising ID) to the platform endpoint aviary.amazon.com with a URI path of "/GetAds", and 53 apps send the serial number to dna.amazon.com, with a URI path of "/GetSponsoredTileAds". Judging from these paths, it would seem like the advertising ID alone would be sufficient and the more appropriate PII. On the other hand, some exposures seem to serve a functional purpose. For example, 67 apps send the serial number to atv-ext.amazon.com, with varying URI paths containing "/cdp/". We surmise that this domain serves as "Content Delivery Platform(s)" [60], allowing apps to personalize content without user login. Specifically, we see paths such as "/cdp/playback/GetDefaultSettings" coupled with an "x-atv-session-id" HTTP header field.

Joint Exposure of Static and Dynamic PII. We observe that some apps send the advertising ID alongside other static identifiers. This goes against recommended developer practices, where apps and ATSes should only rely on dynamic identifiers that can be refreshed, like the advertising ID, to give users the ability to opt out of being tracked. Aside from the 697 Fire TV apps that expose the advertising ID alongside serial number and device ID discussed earlier, we observe 10 Roku apps (including prominent ones such as Pluto TV and PBS) that send both the advertising ID and the serial number to third parties (subdomains of scorecardresearch.com and youboranqs01.com). Similarly, 12 Fire TV apps send the advertising ID alongside the device ID to third party destinations such as ads.adrise.tv and ctv.monarchads.com. Thus, these practices allow ATSes to link an old advertising ID to the new value by joining on the static identifiers.

Leveraging Missed PII Exposures to Improve Blocklists. The above indicates another direction for improving blocklist curation for smart TVs. By deploying tools such as Rokustic and Firetastic and searching the network traces for PII exposures, blocklist curators can generate candidate rules that can then be examined manually. Using this approach, we identified 38 domains in the Roku dataset and 30 in the Fire TV dataset that receive PII, but were not blocked by any list. These numbers are conservative, as we exclude location and account name, which are likely to be used for legitimate purposes such as logging in or serving location-based content. These domains include obvious ATSes such as ads.aimitv.com and ads.ewscloud.com. Another noteworthy mention is hotlist.samba.tv: Samba TV uses Automatic Content Recognition to provide content suggestions on smart TVs, but this comes at the cost of targeted advertising that even propagates onto other devices in the home network [61].

Takeaway. Hundreds of Roku and Fire TV apps expose PII, mostly to third parties and the platform-specific party. For Fire TV, we observe that most of the exposures to the platform-specific party seem to be for advertising and tracking purposes. We observe that many Roku and Fire TV apps send the advertising ID alongside a static identifier (e.g., serial number), which enables the ATS to relink a user profile associated with an old advertising ID to a new advertising ID, thus eliminating the user's ability to opt out. The blocklists generally do well at preventing exposure of the advertising ID on both platforms, but are less successful at preventing exposures of serial number and device ID on Fire TV.

7 Conclusion & Directions

Summary. In this paper, we performed one of the first comprehensive measurement studies of the emerging smart TV advertising and tracking service (ATS) ecosystem. To that end, we analyzed and compared: (i) a realistic but small in-the-wild dataset (57 smart TV devices from seven different platforms, with coarse flow-level information); and (ii) two large testbed datasets (top-1000 apps on Roku and Fire TV, tested systematically, with granular per-app and packet-level information). Our work establishes that the smart TV ATS ecosystem is fragmented across different smart TV platforms and is different from the mobile ATS ecosystem. We further evaluate popular DNS-based blocklists' ability to prevent smart TVs from accessing ATSes, and find that all lists suffer from missed ATSes and incur app breakage. Finally, we examine our testbed datasets for exposure of personally identifiable information (PII) and discover that hundreds of apps send PII to third parties and the platform-specific party, mostly for advertising and tracking purposes.

Limitations. Our methodology has its limitations. First, the automated app exploration may not always result in content (video/audio) playback, which may impact the (ATS) domains discovered. We evaluate the extent of this limitation in Appendix A.3. We find that Rokustic and Firetastic perform on par with concurrent work [11] in terms of playback success, and that they manage to discover a large fraction of the number of domains discovered during manual interaction. Second, apps may prevent TLS interception through use of certificate pinning, which may prevent Firetastic from observing PII exposures in encrypted traffic. We assess the decryption failures in Appendix B: we find that for 80% of the apps in our Fire TV testbed dataset, TLS interception only fails for 20% or fewer of an app's TLS connections. Third, our analysis of FPs and FNs in DNS-based blocklists in Sec. 5.1 does not account for DNS over HTTPS (DoH), nor static advertisements, thus it may overcount blocklist FNs for these cases.

Future Work. Our findings motivate more research to further understand smart TVs and to develop privacy-enhancing solutions specifically designed for each smart TV platform. For example, more research is needed to curate accurate, fine-grained (as opposed to DNS-based), and platform-specific blocklists. To foster further research along this direction, we plan to make our tools, Rokustic and Firetastic, and testbed datasets publicly available [12]. We intend to further improve our tools' ability to thoroughly explore smart TV apps along the directions discussed in Appendix A.3.

Acknowledgements

This work is supported in part by NSF Awards 1715152, 1750175, 1815131, and 1815666, Seed Funding by UCI VCR, and Minim. The authors would like to thank M. Hammad Mazhar and the team at Minim for their help with collecting and analyzing smart TV traffic in the wild. We would also like to thank our shepherd, Erik Sy, and the anonymous PETS reviewers for their feedback that helped significantly improve the paper.
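The PII-exposure search described above (scanning captured requests for identifier values, both for the Section 6 analysis and as a blocklist-curation aid) can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: the PII names and values are made up, and a real analysis would also reassemble requests from pcaps and check further encodings (e.g., Base64).

```python
import hashlib
from urllib.parse import quote

def pii_encodings(value: str) -> set[str]:
    """Common forms in which a PII value may appear on the wire:
    plaintext, URL-encoded, and hex digests of popular hash functions."""
    raw = value.encode()
    return {
        value,
        quote(value, safe=""),
        hashlib.md5(raw).hexdigest(),
        hashlib.sha1(raw).hexdigest(),
        hashlib.sha256(raw).hexdigest(),
    }

def find_exposures(request_text: str, pii: dict[str, str]) -> set[str]:
    """Return the names of PII values that appear (in some encoding)
    anywhere in the request text (URL, headers, or body)."""
    haystack = request_text.lower()
    return {
        name
        for name, value in pii.items()
        if any(enc.lower() in haystack for enc in pii_encodings(value))
    }

# Example: a synthetic request leaking an MD5-hashed (made-up) serial number.
req = "GET /GetAds?sn=" + hashlib.md5(b"XD00A1234567").hexdigest()
print(find_exposures(req, {"serial": "XD00A1234567"}))  # {'serial'}
```

Matching hashed forms is what catches "obfuscated" exposures that a plaintext grep would miss; domains whose requests match but are absent from all blocklists become candidate rules for manual review.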

[17] "Consumer Reports". Samsung and Roku Smart TVs Vul- References nerable to Hacking. https://www.consumerreports.org/ /samsung-roku-smart-tvs-vulnerable-to-hacking- [1] Rimma Kats. How Many Households Own a Smart consumer-reports-fnds/, 2018. [Online; accessed 2019-04- TV? https://www.emarketer.com/content/how-many- 22]. households-own-a-smart-tv, 2018. [Online; accessed 2019- [18] Sapna Maheshwari. How Smart TVs in Millions of U.S. 05-10]. Homes Track More Than What’s On Tonight. https:// [2] Hulu gained twice as many US subscribers as Netfix at the www.nytimes.com/2018/07/05/business/media/tv-viewer- start of 2019. https://www.cnbc.com/2019/05/01/hulu- tracking.html, 2018. [Online; accessed 2019-05-10]. gained-twice-as-many-subscribers-as-netfix-in-us.html, [19] "FTC". VIZIO to Pay $2.2 Million to FTC, State of New 2019. [Online; accessed 2019-05-10]. Jersey to Settle Charges It Collected Viewing Histories on [3] Amazon: Smart TVs. https://www.amazon.com/smart- 11 Million Smart Televisions without Users’ Consent. https: tv-store/b?ie=UTF8&node=5969290011, 2019. [Online; //www.ftc.gov/news-events/press-releases/2017/02/vizio- accessed 2019-05-10]. pay-22-million-ftc-state-new-jersey-settle-charges-it, 2017. [4] Connected TV Advertising is Surging. https://www. [Online; accessed 2019-04-22]. videonuze.com/article/connected-tv-advertising-is-surging, [20] Whitson Gordon. How to Stop Your Smart TV From Track- 2017. [Online; accessed 2019-05-10]. ing What You Watch. https://www.nytimes.com/2018/07/ [5] MAGNA ADVERTISING FORECASTS. https:// 23/smarter-living/how-to-stop-your-smart-tv-from-tracking- magnaglobal.com/magna-advertising-forecasts-fall-update- what-you-watch.html, 2018. [Online; accessed 2019-05-10]. executive-summary/, 2019. [Online; accessed 2019-05-10]. [21] J. R. Mayer and J. C. Mitchell. Third-Party Web Track- [6] Jump PR. Beachfront Releases 2018 CTV Ad Data, ing: Policy and Technology. 
In 2012 IEEE Symposium on Roku Still Leads, Amazon Growing Quickly. https: Security and Privacy, pages 413–427, May 2012. //www.broadcastingcable.com/post-type-the-wire/2018- [22] Phillipa Gill, Vijay Erramilli, Augustin Chaintreau, Balachan- ctv-ad-data-realeased-by-beachfront, 2018. [Online; ac- der Krishnamurthy, Konstantina Papagiannaki, and Pablo cessed 2019-05-10]. Rodriguez. Follow the Money: Understanding Economics of [7] Pi-Hole: A black hole for Internet advertisements. https: Online Aggregation and Advertising. In Proceedings of the //pi-hole.net/, 2019. [Online; accessed 2019-05-11]. 2013 conference on Internet measurement conference, pages [8] WaLLy3K. The Big Blocklist Collection. https://frebog.net, 141–148. ACM, 2013. 2019. [Online; accessed 2019-04-29]. [23] Steven Englehardt and Arvind Narayanan. Online Tracking: [9] MoaAB: Mother of All AD-BLOCKING. https://forum.xda- A 1-million-site Measurement and Analysis. In Proceedings developers.com/showthread.php?t=1916098, 2019. [Online; of the 2016 ACM SIGSAC Conference on Computer and accessed 2019-04-22]. Communications Security, CCS ’16, pages 1388–1401, New [10] Kromtech Alliance Corp. Stopad for tv. https://stopad.io/ York, NY, USA, 2016. ACM. tv, 2019. [24] Abbas Razaghpanah, Rishab Nithyanand, Narseo Vallina- [11] Hooman Mohajeri Moghaddam, Gunes Acar, Ben Burgess, Rodriguez, Srikanth Sundaresan, Mark Allman, Christian Arunesh Mathur, Danny Yuxing Huang, Nick Feamster, Kreibich, and Phillipa Gill. Apps, Trackers, Privacy, and Edward W. Felten, Prateek Mittal, and Arvind Narayanan. Regulators: A Global Study of the Mobile Tracking Ecosys- Watching You Watch: The Tracking Ecosystem of Over- tem. NDSS, 2018. the-Top TV Streaming Devices. In Proceedings of the 2019 [25] Jingjing Ren, Ashwin Rao, Martina Lindorfer, Arnaud ACM SIGSAC Conference on Computer and Communica- Legout, and David Cho˙nes. 
ReCon: Revealing and Con- tions Security, CCS ’19, pages 131–147, , NY, trolling PII Leaks in Mobile Network Traÿc. In Proceedings USA, 2019. ACM. of the 14th Annual International Conference on Mobile Sys- [12] UCI Networking Group. The TV is Smart and Full of Track- tems, Applications, and Services, pages 361–374. ACM, ers: Project Page. http://athinagroup.eng.uci.edu/projects/ 2016. smarttv/, 2019. [26] Anastasia Shuba, Athina Markopoulou, and Zubair Shafq. [13] Ross Benes. 10 Ways Roku Is Growing Its Ad Business. NoMoAds: E˙ective and Eÿcient Cross-App Mobile Ad- https://www.emarketer.com/content/10-ways-roku-is- Blocking. Proceedings on Privacy Enhancing Technologies, growing-its-ads-business, 2019. [Online; accessed 2019- 2018(4):125–140, 2018. 05-10]. [27] Jingjing Ren, Daniel J. Dubois, David Cho˙nes, Anna Maria [14] Garett Sloane. AMAZON IS NOW TAKING A 30 PER- Mandalari, Roman Kolcun, and Hamed Haddadi. Infor- CENT CUT OF AD SALES FROM FIRE TV. https: mation Exposure From Consumer IoT Devices: A Multidi- //adage.com/article/design/amazon-taking-30-percent-ad- mensional, Network-Informed Measurement Approach. In sales-fre-tv/315678, 2018. [Online; accessed 2019-05-10]. Proceedings of the Internet Measurement Conference, IMC [15] Roku, Inc. The Roku Advantage. https://advertising.roku. ’19, pages 267–279, New York, NY, USA, 2019. ACM. com/advertising-solutions, 2019. [Online; accessed 2019-05- [28] Danny Yuxing Huang, Noah Apthorpe, Gunes Acar, Frank 10]. Li, and Nick Feamster. IoT Inspector: Crowdsourcing La- [16] Amazon.com, Inc. Amazon DSP. https://advertising. beled Network Traÿc from Smart Home Devices at Scale, amazon.com/products/amazon-dsp, 2019. [Online; accessed 2019. 2019-05-10]. [29] M. Ghiglieri and E. Tews. A privacy protection system for HbbTV in Smart TVs. In 2014 IEEE 11th Consumer Com- The TV is Smart and Full of Trackers 146

munications and Networking Conference (CCNC), pages 2019-05-10]. 357–362, Jan 2014. [49] Yuanchun Li, Ziyue Yang, Yao Guo, and Xiangqun Chen. [30] Nathan Malkin, Julia Bernd, Maritza Johnson, and Serge DroidBot: a Lightweight UI-guided Test Input Generator for Egelman. “What Can’t Data Be Used For?” Privacy Expec- Android. In 2017 IEEE/ACM 39th International Conference tations about Smart TVs in the US. In European Workshop on Software Engineering Companion (ICSE-C), pages 23–26. on Usable Security (Euro USEC), 2018. IEEE, 2017. [31] VirusTotal. https://www.virustotal.com/. [Online; accessed [50] Future Today Inc. http://www.stu˙welike.com/, 2019. 2019-08-24]. [Online; accessed 2019-12-05]. [32] McAfee, LLC. Customer URL Ticketing System. https: [51] Stu˙WeLike. http://www.stu˙welike.com/, 2019. [Online; //www.trustedsource.org/. [Online; accessed 2019-08-24]. accessed 2019-12-05]. [33] OpenDNS Domain Tagging. https://community.opendns. [52] Manta Media Inc. Htvma Solutions, Inc. https://www. com/domaintagging/. [Online; accessed 2019-08-24]. manta.com/c/mhqfv38/htvma-solutions-inc, 2019. [Online; [34] Mozilla Foundation. Public Suÿx List. https://publicsuÿx. accessed 2019-12-05]. org/. [Online; accessed 2019-08-23]. [53] Gray , Inc. https://gray.tv/, 2019. [Online; ac- [35] John Kurkowsi. tldextract. https://github.com/john- cessed 2019-12-06]. kurkowski/tldextract. [Online; accessed 2019-08-23]. [54] telekrmor. Round 3: What Really Happens On Your Net- [36] Crunchbase. https://www.crunchbase.com/. [Online; ac- work? https://pi-hole.net/2017/07/06/round-3-what-really- cessed 2019-08-29]. happens-on-your-network/, 2017. [Online; accessed 2019- [37] Anastasia Shuba, Anh Le, Emmanouil Alimpertis, Minas 05-11]. Gjoka, and Athina Markopoulou. AntMonitor: A System for [55] Pi-hole LLC. Blocking Mode. https://docs.pi-hole.net/ On-Device Mobile Network Monitoring and its Applications. ftldns/blockingmode, 2018. arXiv preprint arXiv:1611.04268, 2016. 
[56] Customising Sources for Ad Lists. https://github.com/pi- [38] Je˙rey Erman, Alexandre Gerber, KK Ramadrishnan, Sub- hole/pi-hole/wiki/Customising-Sources-for-Ad-Lists, 2019. habrata Sen, and Oliver Spatscheck. Over The Top Video: [57] Google LLC. apkanalyzer. https://developer.android.com/ The Gorilla in Cellular Networks. In Proceedings of the 2011 studio/command-line/apkanalyzer, 2019. [Online; accessed ACM SIGCOMM conference on Internet measurement con- 2019-04-29]. ference, pages 127–136. ACM, 2011. [58] Steven Englehardt, Je˙rey Han, and Arvind Narayanan. [39] Timm Bttger, Felix Cuadrado, Gareth Tyson, Ignacio Cas- I never signed up for this! privacy implications of tro, and Steve Uhlig. Open Connect Everywhere: A Glimpse tracking. Proceedings on Privacy Enhancing Technologies, at the Internet Ecosystem through the Lens of the Netfix 2018(1):109–126, 2018. CDN. ACM SIGCOMM Computer Communication Review, [59] Georg Merzdovnik, Markus Huber, Damjan Buhov, Nick 48(1):28–34, 2018. Nikiforakis, Sebastian Neuner, Martin Schmiedecker, and [40] Vijay Kumar Adhikari, Yang Guo, Fang Hao, Volker Hilt, Edgar Weippl. Block me if you can: A large-scale study of and Zhi-Li Zhang. A tale of three CDNs: An active mea- tracker-blocking tools. In 2017 IEEE European Symposium surement study of Hulu and its CDNs. In 2012 Proceedings on Security and Privacy (EuroS&P), pages 319–333. IEEE, IEEE INFOCOM Workshops, pages 7–12. IEEE, 2012. 2017. [41] eMarketer. US Connected TV Users, by Brand, 2018 & [60] Antidot. Content Delivery Platform. https://www.antidot. 2022. https://www.emarketer.com/Chart/US-Connected- net/content-delivery-platform/, 2019. [Online; accessed TV-Users-by-Brand-2018-2022-of-connected-TV-users/ 2019-12-05]. 220767, 2018. [Online; accessed 2019-11-26]. [61] Sapna Maheshwari. How Smart TVs in Millions of U.S. [42] Roku, Inc. Roku Channel Store. https://channelstore.roku. Homes Track More Than What’s On Tonight. https:// com, 2019. 
[Online; accessed 2019-04-19]. www.nytimes.com/2018/07/05/business/media/tv-viewer- [43] Roku, Inc. Roku Developer Documentation: Security tracking.html, 2018. [Online; accessed 2019-05-11]. Overview. https://sdkdocs.roku.com/display/sdkdoc/ [62] Raspberry Pi Foundation. Setting up a Raspberry Pi as Security+Overview, 2019. [Online; accessed 2019-04-22]. an access point in a standalone network (NAT). https:// [44] Roku, Inc. Roku Developer Documentation: Development www.raspberrypi.org/documentation/confguration/wireless/ Environment Overview. https://sdkdocs.roku.com/display/ access-point.md, 2019. [Online; accessed 2019-03-04]. sdkdoc/Development+Environment+Overview, 2019. [On- [63] Android tcpdump. https://www.androidtcpdump.com/, line; accessed 2019-04-22]. 2019. [Online; accessed 2019-04-11]. [45] Roku, Inc. Roku Developer Documentation: Roku Advertis- ing Framework. https://sdkdocs.roku.com/display/sdkdoc/ Roku+Advertising+Framework, 2019. [Online; accessed 2019-04-22]. [46] Ooyala IQ SDK for Roku. https://github.com/ooyala/iq- sdk-roku, 2015. [Online; accessed 2019-04-22]. [47] Je˙ Bush, Kevin Cooper, and Linda Kyrnitszke. External Control API. https://sdkdocs.roku.com/x/K5cY, 2013. [Online; accessed 2019-03-04]. [48] AntMonitor open-source. https://github.com/UCI- Networking-Group/AntMonitor, 2018. [Online; accessed The TV is Smart and Full of Trackers 147

Appendix

In Appendices A and B, we discuss implementation details and limitations of our methodology, and outline directions for improvement. In Appendix C, we provide additional analysis of datasets that had to be omitted from the main text due to lack of space. Notably, the rest of the "in the wild" dataset is within Appendix C.2.

A Automatic App Exploration

In this section, we provide details on the implementation of our automatic app exploration tools, Rokustic for Roku (Sec. A.1) and Firetastic for Fire TV (Sec. A.2), in addition to what is described in the main paper (Sections 4.1 and 4.2, respectively). Then, in Section A.3, we discuss the limitations of Rokustic and Firetastic, including when the automation does not lead to video playback and how the automated exploration compares to manual testing of apps that require login. Ultimately, this discussion leads to directions for how to improve Rokustic's and Firetastic's automatic app exploration.

A.1 Roku Automation

To scale testing of apps, we implement a software system, Rokustic, that automatically installs and exercises Roku apps on a Roku Express 3900X (Roku for short).

Setup. We run Rokustic on a Raspberry Pi 3 Model B+ set up to host a standalone network as per the instructions given in [62]. The Pi's wireless interface (wlan0) is configured as a wireless access point with DHCP server and NAT, and the Roku is connected to this local wireless network. The Pi's wired interface (eth0) connects the Pi, and thus, in turn, the Roku, to the WAN. This setup enables us to collect all traffic going in and out of the Roku by running tcpdump on the Raspberry Pi's wireless interface. We do not attempt to decrypt TLS traffic, as we cannot install our own self-signed certificates on the Roku.

Rokustic utilizes the Roku External Control (ECP) API [47] to control the Roku. The ECP is a REST-like API exposed by the Roku to other devices on the local network. The ECP includes a set of REST endpoints that provide the ability to press keys on the Roku remote, query the Roku for various information such as the set of installed apps, programmatically browse the Roku Channel Store (RCS), etc.

App Installation. Given a set of apps to exercise, Rokustic installs each app by invoking the ECP endpoint that opens up the on-device version of the RCS page for the app, and then sends a virtual key press to click the "Add Channel" button that starts download and installation of the app. Rokustic then waits for five seconds, and then queries the Roku for the set of installed apps to check if the installation has completed; if so, it continues to install the next app, otherwise the wait-and-check is repeated (until a fixed threshold).

App Exploration. From manual inspection of a few apps (e.g., YouTube and Pluto TV), we find that playable content is often presented in a grid, where each cell is a different video or live TV channel. Generally, the user interface defaults to highlighting one of these cells (e.g., the first recommended video). Pressing "Select" on the Roku remote immediately after the app has launched will therefore result in playback of some content. From this insight, we devised a simple algorithm (see Listing 1) that attempts to cause playback of three different videos for each installed Roku app.

    for app in roku.installedApps:
        startPacketCapture(app.id);
        # Play default video
        launch(app); sleepSeconds(20);
        press("SELECT"); sleepMinutes(5);
        # Play some other video by selecting a
        # different cell in the grid
        relaunch(app);
        press("DOWN"); press("DOWN");
        press("RIGHT"); press("RIGHT");
        press("SELECT"); sleepMinutes(5);
        # Play a 3rd video
        relaunch(app);
        press("DOWN"); press("DOWN");
        press("SELECT"); sleepMinutes(5);
        # Quit the Roku app
        press("HOME");
        stopPacketCapture(app.id);

Listing 1. Algorithm for exercising Roku apps.

For each Roku app, the algorithm first starts a packet capture so as to produce a .pcap file for each Roku app, thereby essentially labeling traffic with the app that caused it: since Roku apps cannot execute in the background (see Sec. 4.1), all traffic captured during execution of a single app will belong to that app and the Roku system. The target app is launched, and the algorithm pauses, waiting for the app to load. Next, a virtual "Select" key press is sent to attempt to start video playback, and the algorithm subsequently pauses to let the content play. The app is then relaunched by returning to the Roku's home screen and then launching the app again. The purpose of this is to safely return to the app's home screen such that different content can be selected for playback. This is repeated two times, with slight variations in the sequence of navigational key presses, such that each app will (presumably) end up playing three different videos, making the total exploration time approximately 16 minutes per app. We note that the relaunch procedure internally performs short sleeps to let the app quit and launch again, and that we have omitted a one-second sleep after each navigational key press in the pseudocode in Listing 1.

Although the Roku remote has a "Back" button, which behaves similarly to the back button in Android, we purposefully avoid using it as a means to return to the app's home screen: if the video selected for playback is shorter than the sleep duration, the app will return to its home screen automatically, and pressing "Back" will therefore quit the app and return to the Roku's home screen. This would pollute our data, as the subsequent navigational key presses would cause a different app to be highlighted and then launched by the next "Select" key press.

A.2 Fire TV Automation

Since Fire TV is based on Android, we can use existing Android tools to capture network traffic. Although there are various methods for capturing traffic on Android on the device itself (e.g., Android tcpdump [63]), most of them require a rooted device. While it is possible to root a Fire TV, it may make applications behave differently if they detect root. Thus, to collect measurements that are representative of an average user, we use a VPN-based traffic interception method that does not require rooting the device [37, 48]. We discard incoming traffic because video content results in huge pcap files, which, due to a technical limitation of ADB (very slow transfer speeds for large files), slows down the automated experiments significantly. The outgoing traffic is sufficient for our domain and PII analysis.

To automatically explore each Fire TV application, we utilize DroidBot [49], as it treats each app as a tree of possible paths to explore instead of randomly generating events, which results in higher test coverage of the application. Furthermore, we deduce that developers would minimize the necessary clicks in order to reach the core sections of their applications, especially for playing video content. Thus, we configure DroidBot to utilize its breadth-first search algorithm to explore each application. The intuition is that this should cover more distinct UI paths of the app, thus increasing the chance of content playback (in contrast, the depth-first search algorithm may cause the automation to descend deep into a path that we do not care about, such as a settings menu). With some trial and error, we selected the input command interval as three seconds, which leaves enough time for applications to handle the command and load the next view during app exploration.

We summarize Firetastic's automation algorithm in Listing 2. For each app, Firetastic first starts the local VPN to capture (and decrypt) traffic. Next, it invokes DroidBot, which in turn launches the app and begins exploring it. When the 15-minute exploration completes, Firetastic stops the local VPN and extracts the .pcapng files that were generated during testing.

    device = "10.0.1.xx:5555"
    pcapng_dir = "/some/path/to/store/pcapng"
    apk_dir = "/some/path/to/apk/batch"
    for app in apk_dir:
        # Start AntMonitor on Fire TV
        start_antmonitor(device)
        # Ensure VPN connection up
        ensure_antmonitor_connected(device)
        # Run DroidBot command
        params = { duration: 15min,
                   policy: "bfs_naive",
                   install_timeout: 5min,
                   interval: 3sec }
        run_droidbot(app, params, device)
        # Stop AntMonitor on Fire TV
        stop_antmonitor_vpn(device)
        # Extract the pcap files
        extract_pcapng_files(app, pcapng_dir, device)
        # Clean up before testing next app
        remove_pcapng_files(device)

Listing 2. Algorithm for exercising Fire TV apps.
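To make the DroidBot step of Listing 2 concrete, the sketch below builds the corresponding command line. This is not Firetastic's actual code: the flag names (`-d`, `-a`, `-o`, `-policy`, `-interval`, `-timeout`) follow DroidBot's command-line interface as we recall it, so they should be verified against `droidbot -h` for the installed version; the device address and paths are placeholders.

```python
import subprocess

def droidbot_cmd(apk_path: str, out_dir: str, device_serial: str,
                 policy: str = "bfs_naive", interval_s: int = 3,
                 timeout_s: int = 15 * 60) -> list[str]:
    """Build a DroidBot invocation mirroring Listing 2's parameters:
    naive BFS exploration, 3 s between input events, 15 min per app."""
    return [
        "droidbot",
        "-d", device_serial,           # ADB device, e.g. "10.0.1.xx:5555"
        "-a", apk_path,                # app under test
        "-o", out_dir,                 # output directory for logs/UI traces
        "-policy", policy,             # breadth-first UI exploration
        "-interval", str(interval_s),  # seconds between input events
        "-timeout", str(timeout_s),    # total exploration budget in seconds
    ]

def explore(apk_path: str, out_dir: str, device_serial: str) -> None:
    """Run one exploration; raises if DroidBot exits with an error."""
    subprocess.run(droidbot_cmd(apk_path, out_dir, device_serial), check=True)
```

Keeping command construction separate from execution also makes the per-app loop easy to dry-run and test without a connected Fire TV.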

A.3 Limitations of App Exploration

Our automated app exploration has, admittedly, limitations. For example, it can miss video playback for some apps, and cannot fully explore apps that require login. In this section, we evaluate our automated exploration vs. a more realistic manual exploration by a real user, for 20 apps. We report the cases where our automatic exploration succeeds and fails, and we provide insights into the reasons why, and ideas for future improvements.

First, we evaluate our tools' ability to perform playback, because it affects our ability to capture traffic related to ads delivered at the start of or during content (video/audio) playback. We evaluate Firetastic's and Rokustic's content playback rates in Sec. A.3.1 by performing a case study of 15 apps. We find that the two tools perform similarly to the state-of-the-art [11], and we identify how we can further improve the tools to increase the chance of incurring content playback.

In Sec. A.3.2, we evaluate the automated approach's ability to discover an app's domain space by comparing the eSLDs and ATS domains discovered by the tools to the eSLDs and ATS domains discovered during manual interaction with the same 15 apps. We find that both tools generally do well at mapping a large fraction of the number of domains discovered during manual interaction.

Finally, a common limitation of app exploration in general (not just for smart TV apps) is that it is difficult to explore apps that require user login. In Sec. A.3.3 we compare the eSLDs and ATS domains discovered by Firetastic and Rokustic for 5 popular streaming apps (while logged out) to the eSLDs and ATS domains discovered during manual interaction (while logged in). As expected, we find that the automation misses parts of the domain space of apps that serve third-party ads as part of content that is only accessible after logging in. Interestingly, we also observe cases where the automation discovers more (ATS) domains than the manual experiments.

Takeaway. In summary, both tools work well when an app adheres to good UI design principles, such as reducing the number of user actions required to reach the main content. The content playback success rates are on par with state-of-the-art concurrent work [11], with Firetastic slightly ahead, possibly due to its dynamic approach to UI exploration (as opposed to the static, heuristic-based approach used in [11]), and with Rokustic slightly behind. While Firetastic leverages existing advanced Android tools (DroidBot), similar tools do not currently exist for Roku, thus we had to build them from scratch, and it is therefore natural that Rokustic falls slightly behind its Fire TV counterpart.

A.3.1 Video Playback

Setup. To evaluate Firetastic's and Rokustic's ability to incur playback of an app's main content (video/audio) for the set of apps that do not require login to access (parts of) their content, we run the full (~16 minutes) automation for 15 apps on each platform, while observing for content playback. We pick the 15 apps according to their popularity (as defined in Sec. 4): (1) the top-10 most popular apps, to capture the most influential apps; and (2) 5 additional randomly sampled apps, spread evenly across the popularity spectrum, to represent the dataset as a whole.

Results. We present the content playback results in Table 6. Firetastic manages to play content for 67% of the 15 apps (60% for the top-10 apps, and 80% for the random apps). A common characteristic of the 33% unsuccessful apps is that the majority of their content is locked, and that free content is deferred to the least accessible sections of the UI (e.g., the bottom row of a grid layout). This decreases the chance that Firetastic will discover a playable video during the 16 min experiment, especially since attempts to play locked content often redirect to login/activation screens where additional time is lost. We can improve on this in future work by mixing BFS and DFS exploration: the automation can explore each level up to a certain threshold before drilling down into the next nested view.

Rokustic manages to play content for 53% of the 15 apps (50% for the top-10 apps, and 60% for the random apps). Unsuccessful attempts are primarily due to nested menus, where additional "Select" key press(es) are necessary to start playback. Through manual interaction with a few apps prior to executing our top-1000 apps measurement, we observe that if an app starts playback on the first "Select" key press, additional key presses may result in pausing and/or exiting the video, or skipping past an ad. Therefore, we opt for a conservative approach that, when successful, will collect as much ad/video traffic as possible. In future work, we can further improve content playback by repeatedly sending "Select" key presses (with short pauses in between) until the network throughput for the Roku stays above a certain threshold for a short duration (e.g., the bitrate of 720p video for 5 seconds). This dynamic strategy can handle more nested menus and thus be resilient to future changes to an app's UI.

                                              Distinct eSLDs           Distinct ATS domains
          App Name                 Playback?  Auto (A)  Manual (M)  A/M   A    M    A/M
Roku (top-10)
          YouTube                  yes         8        15       53%      5    15    33%
          Sling TV                 no         10        21       48%      6    21    29%
          The Roku Channel         yes        16        13      123%      7     5   140%
          Crackle                  yes         9        34       26%      8    42    19%
          JW Broadcasting          no          3         4       75%      2     2   100%
          PBS KIDS                 yes         4         7       57%      4     6    67%
          ESPN                     no          6        16       38%      4    16    25%
          Tubi - Free Movies & TV  no          3        19    15.79%      5    34    15%
          DisneyNOW                no          6         9       67%      4     5    80%
          Pluto TV - It's Free TV  yes        24         9      267%     36     7   514%
Roku (random)
          Roku Newscaster          no          4         5       80%      3     2   150%
          ChuChu TV                yes        10        10      100%      9    17    53%
          Elvis                    no         20        38       53%     14    54    26%
          Mondo                    yes         4         5       80%      2     2   100%
          tik tok                  yes         1         3       33%      2     3    67%
          Total              8 of 15 (53%)    74       113       65%     64   122    52%
Fire TV (top-10)
          Pluto TV - It's Free TV  yes        15        30       50%     13    35    37%
          ABC                      yes        14        13      108%      5     6    83%
          Fox Now                  yes        22        24       92%     18    18   100%
          AMC                      no         18        28       64%      8    19    42%
          Fox Sports GO            no         12        19    63.16%      7    17    41%
          Kids for Youtube         yes        16        19       84%      7     5   140%
          PBS Kids                 yes        12        15       80%      7     9    78%
          CNN Go                   yes        31        34       91%     41    34   121%
          Sundance TV              no         13        23       57%      5    11    45%
          MTV                      no         56        38      147%     38    54    70%
Fire TV (random)
          Vimeo                    yes        12        11      109%      5     3   167%
          Dog TV Online            yes        10        14       71%      2     5    40%
          WCSC Live 5 News         no         13        12      108%      5     5   100%
          WFXG FOX 54              yes        18        18      100%      8     7   114%
          13abc WTVG Toledo, OH    yes        12        11      109%      5     2   250%
          Total             10 of 15 (67%)   125       117      107%    115   138    83%

Table 6. Content playback success for Rokustic and Firetastic, and a comparison of the number of domains discovered by Rokustic and Firetastic to the number of domains discovered during manual interaction with the same app, for 15 apps that do not require login. For each app, we perform approximately 16 minutes of automated and 16 minutes of manual interaction.

                               Distinct eSLDs           Distinct ATS domains
          App Name    Auto (A)  Manual (M)  A/M   A    M    A/M
Roku
          HBO NOW      3         7       43%      2     5    40%
          Hulu         6        18       33%      3    21    14%
          Netflix      3         4       75%      3     3   100%
          SHOWTIME     4         8       50%      3     6    50%
          STARZ        5         7       71%      3     5    60%
          Total       14        30       47%      6    26    23%
Fire TV
          HBO NOW      7         9       78%      2     2   100%
          Hulu         9        19       47%      4    17    24%
          Netflix     11         9      122%      4     3   133%
          SHOWTIME    10         8      125%      4     3   133%
          STARZ       18        14      129%     12     7   171%
          Total       27        42       64%     15    27    56%

Table 7. Comparison of the number of domains discovered by Rokustic and Firetastic to the number of domains discovered during manual interaction with the same five apps that require login. The automation was performed while logged out, and the manual interaction was performed while logged in. For each app, we perform approximately 16 minutes of automated and 16 minutes of manual interaction.

A.3.2 Automation vs. Manual Testing

Setup. Next, we evaluate Firetastic's and Rokustic's ability to successfully map an app's domain space. We manually interact with the 15 apps from Sec. A.3, and compare the network behavior observed during automated testing to the network behavior observed during manual testing. For a fair comparison, we interact with each app for approximately the same duration as in the automated experiments (~16 minutes). For consistency, we follow a protocol in which we attempt to play 7 different videos for approximately 2 minutes each, leaving a few minutes to navigate between videos. We compare the network behavior in terms of the number of eSLDs and ATS domains (as defined by the union of the blocklists from Sec. 5) contacted by each app. The results are presented in Table 6.

Results. Firetastic is successful in mapping the domain space for 10 out of 15 apps (67%), uncovering 0.8 times (or more) the number of eSLDs, and 0.7 times (or more) the number of ATS domains, discovered in the manual experiments. In fact, Firetastic even discovers more eSLDs and ATS domains than the manual experiment for 6 (40%) and 7 (47%), respectively, of the 15 apps. Rokustic is less successful, but still manages to uncover 0.67 times (or more) the number of eSLDs and ATS domains in the manual experiment for 7 (47%) and 8 (53%), respectively, of the 15 apps. Moreover, Rokustic even discovers 2.67 times as many eSLDs as the manual experiment for one of the apps (Pluto TV).

Intuition. The two tools have very different approaches to app exploration: Firetastic seeks to explore as much app functionality as possible, but is likely to exit content playback early, whereas Rokustic seeks to mimic a real user that sits through 3 videos of 5 minutes each. Each approach has its own merit: Firetastic is good at discovering many ATS domains for apps that present ads before content playback begins, whereas Rokustic is the more successful tool when it comes to discovering ATS domains for apps that defer ad delivery to later in the video/audio stream, as was the case for Pluto TV. Finally, we note that even in the worst case, i.e., when Firetastic and Rokustic do not manage to incur content playback, they still uncover several ATS domains.

Takeaway. This case study, and the fact that Firetastic and Rokustic uncovered approximately twice as many domains as the state-of-the-art [11] for the top-1000 apps measurement described in Sec. 4, show that our tools already provide sufficient means to automatically estimate a lower bound on the ATS domains of the two platforms. This lower bound should improve when we implement the changes suggested earlier.

A.3.3 Apps that Require Login

Setup. To understand how well the automation manages to map the domain spaces of apps that require login, we pick 5 of the top subscription-based streaming apps and run the automation without logging in, and also manually interact with the same apps while logged in (following the same protocol as in Sec. A.3.2). We compare the network behavior using the same metrics as in Sec. A.3.2. The results are presented in Table 7.

Results. Firetastic actually discovers more eSLDs than the manual experiments for 3 out of the 5 apps. We also observe that the STARZ app contacts more ATS domains when automatically tested than in the manual experiment. These findings are interesting, as they indicate that an app's domain space and ATS-related activity possibly change after the user logs in and are not necessarily tied to video playback. An ideal experiment would thus need to thoroughly exercise the app in both states. The results for Rokustic are more in line with what is to be expected: Rokustic discovers fewer eSLDs and ATS domains than the manual experiments (55% and 53%, respectively, on average). Finally, for both platforms, we observe that the number of ATS domains contacted by Hulu increases significantly for the manual experiments. This is to be expected, as Hulu is the only one of the 5 apps that delivers third-party ads during content playback.

Takeaway. As expected, Rokustic can only map parts of the (ATS) domain spaces of apps that require login. For Firetastic, we observe that some apps contact more ATS domains while logged out, and an ideal experiment would thus need to explore the apps in both states.

B Fire TV TLS Interception

Firetastic attempts to decrypt TLS traffic to facilitate detection of PII exposures in encrypted traffic. However, the decryption may fail: (i) for apps that attempt to mitigate TLS interception, for example through use of certificate pinning, or (ii) if the cipher suites used in the TLS connection are not supported by the TLS decryption library used in AntMonitor. To approximate the impact that such decryption failures may have on the PII exposure results, we evaluate the failure rate

of Firetastic's TLS interception across all apps in the Fire TV testbed dataset from Sec. 4.

Methodology. For each app in the Fire TV testbed dataset, we first identify the set of TCP connections, t, initiated by this app and labeled as TLS by tshark. Next, we identify the subset h of TCP connections in t that also contained at least one packet labeled by tshark as HTTP: these are the connections that are successfully decrypted. Finally, we compute the decryption failure rate of each app as (|t| − |h|) / |t|. We note that our methodology conservatively computes an upper bound compared to the actual failure rate, as any non-HTTP traffic over TLS (e.g., proprietary binary protocols) will be counted as decryption failures.

We note that in t, we only include TLS connections where the TLS handshake concluded successfully. We assume that an app will retry the connection if it rejects AntMonitor's certificate. AntMonitor stops intercepting an app's connections if it detects that the app rejects its certificate, thus the second TLS handshake should complete successfully. This restriction on t therefore also prevents double counting (i.e., the original failed connection does not contribute to the total number of TLS connections initiated by the app). Although we only recorded upstream data for Fire TV, we use the presence of an upstream TLS Application Data packet as a proxy for inferring that the TLS handshake concluded successfully.

Results. In Fig. 7, we show the empirical CDF of the TLS decryption failure rates for all apps in the Fire TV testbed dataset. We note that the decryption generally works well. For example, decryption fails for 1 out of 10 (or fewer) TLS connections for 55% of all apps, and 1 out of 5 (or fewer) TLS connections for 80% of all apps. Since TLS decryption was generally successful, this validates the PII exposure results.

Fig. 7. Empirical CDF of TLS decryption failures per app (x-axis: share of TLS decryption failures; y-axis: CDF of tested apps). The decryption generally worked well: decryption fails for 1 out of 10 (or fewer) TLS connections for 55% of all apps; 1 out of 5 (or fewer) TLS connections for 80% of all apps.

C Additional Analysis of Datasets

C.1 Labeling Datasets Continued

To complement Sec. 2.1, this section provides the full details of how we label each endpoint as either first party, third party, or platform-specific party, w.r.t. the app that contacts it:
1. We first tokenize app identifiers and the eSLD of the contacted FQDN (we obtain the eSLD using Mozilla's Public Suffix List [34, 35]). For Fire TV, we tokenize the package names and developer names. For Roku, we rely on app and developer names, since its apps do not have package names. For app/package tokens, we ignore common and platform-specific strings like “com”, “firetv”, “roku”, etc., while retaining all tokens from the developer names. We then match the resulting identifiers with the tokenized eSLD.
2. If the tokens match, we label the destination as first party. We note that since we keep all developer tokens, we will map “roku”-related eSLDs as first party if the app was developed officially by Roku.
3. Otherwise, we label a destination as platform-specific party if it originated from platform activity rather than app activity. For Fire TV, we rely on AntMonitor's [37] ability to label each connection with the responsible process. For Roku, we simply check if the eSLD contains “roku”.
4. Otherwise, if the destination is contacted by at least two different apps from different developers, we label it as third party.
5. Finally, if the destination does not fall into any of the other categories, we resort to labeling it as other, which thus captures domains that are only contacted by a single app and are not identified as a first party nor a platform-specific party.

We acknowledge that the variations in labeling methodology for platform-specific parties for Roku and Fire TV may impact comparability. However, we believe that our choice provides the more accurate platform-specific labeling for each testbed platform.
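The labeling steps in Sec. C.1 can be sketched as follows. This is a simplified sketch: the paper obtains eSLDs via Mozilla's Public Suffix List and uses AntMonitor's per-process labels on Fire TV, whereas here the stopword set, the `platform_initiated` flag, and the developer-count argument are illustrative stand-ins.

```python
import re

# Illustrative stopword set: common and platform-specific tokens ignored
# when matching app identifiers against eSLDs (Sec. C.1, step 1).
STOPWORDS = {"com", "net", "org", "tv", "app", "firetv", "roku"}

def tokenize(*names):
    """Split identifiers like "com.hbo.hbonow" or "HBO NOW" into
    lowercase tokens, dropping common/platform-specific strings."""
    tokens = set()
    for name in names:
        tokens.update(t for t in re.split(r"[^a-z0-9]+", name.lower()) if t)
    return tokens - STOPWORDS

def label_endpoint(app_tokens, esld, n_developers, platform_initiated):
    """Apply the labeling steps in order.

    app_tokens: tokens of the app/package and developer names.
    esld: effective second-level domain of the contacted FQDN (in the
      paper, derived via Mozilla's Public Suffix List).
    n_developers: number of distinct developers whose apps contact
      this destination.
    platform_initiated: whether the connection originated from platform
      activity (e.g., AntMonitor's per-process label on Fire TV, or a
      "roku" eSLD on Roku).
    """
    if app_tokens & tokenize(esld):
        return "first party"            # step 2: token match
    if platform_initiated:
        return "platform-specific party"  # step 3
    if n_developers >= 2:
        return "third party"            # step 4
    return "other"                      # step 5: fallback
```

Note that, as in the paper, the first-party check precedes the platform check, so an app officially developed by the platform operator keeps its first-party label.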

C.2 In the Wild Dataset Continued

Figure 8 complements Fig. 1 in Sec. 3, providing data for the remaining three devices.

Fig. 8. Continuation of Fig. 1 from Sec. 3: Top-30 fully qualified domain names in terms of number of flows per device for the remaining devices ((a) Vizio, (b) Chromecast, (c) LG) in the in the wild dataset. Domains identified as ATSes are highlighted with red, dashed bars.

C.3 Key Players Continued

Figure 9 provides further insight into the “Key Players” discussion in Sec. 4.3 by accumulating flows to subdomains of the top-30 eSLDs of each platform. Each eSLD is in turn mapped to its parent organization. Finally, all flows to domains under that parent organization are separated based on whether they are labeled as advertising and tracking or not. A large amount of the flows to the two platform operators are labeled as advertising and tracking.

Fig. 9. Left to right: (1) mapping of flows to subdomains of the top-30 eSLDs (see Figs. 3c and 3d) of each testbed platform; (2) mapping from eSLD to its parent organization; (3) separation of the flows to the organization by advertising and tracking flows and other flows. An edge's width represents the number of flows.

C.4 False Negatives for Blocklists

Table 8 provides a full list of false negatives to complement Sec. 5.2. We discover potential ATS domains by looking at whether multiple apps contact the domain and through keyword searches.
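The flow aggregation behind Fig. 9 (Sec. C.3) can be sketched as follows. This is a minimal sketch: the eSLD-to-organization map, the ATS test, and the sample flows are invented stand-ins for the paper's organization mapping and blocklist union.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from eSLD to parent organization; the paper
# derives this mapping from organization data, not from this table.
ORG_OF = {
    "doubleclick.net": "Alphabet Inc.",
    "youtube.com": "Alphabet Inc.",
    "roku.com": "Roku, Inc.",
}

def aggregate_flows(flows, is_ats):
    """flows: iterable of (fqdn, esld, n_flows) tuples.
    Returns {organization: Counter({"ats": ..., "other": ...})},
    i.e., per-organization flow counts split into ATS and other flows."""
    per_org = defaultdict(Counter)
    for fqdn, esld, n in flows:
        org = ORG_OF.get(esld, "Unknown")
        kind = "ats" if is_ats(fqdn) else "other"
        per_org[org][kind] += n
    return per_org

# Invented sample flows for illustration only.
flows = [
    ("googleads.g.doubleclick.net", "doubleclick.net", 120),
    ("www.youtube.com", "youtube.com", 300),
    ("p.ads.roku.com", "roku.com", 80),
]
# Stand-in ATS test; the paper uses the union of four DNS blocklists.
totals = aggregate_flows(flows, lambda fqdn: "ads" in fqdn)
```

Each resulting (organization, kind, count) triple corresponds to one edge bundle in the Sankey-style diagram of Fig. 9, with the count setting the edge's width.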

Hostname                                                  Missed by
p.ads.roku.com                                            all four lists
ads.aimitv.com                                            all four lists
adtag.primetime.adobe.com                                 all four lists
ads.adrise.tv                                             three of four lists
ads.samba.tv                                              three of four lists
tracking.sctv1.monarchads.com                             three of four lists
ads.ewscloud.com                                          all four lists
trackingrkx.com                                           all four lists
us-east-1-ads.superawesome.tv                             all four lists
track.sr.roku.com                                         all four lists
router.adstack.tv                                         all four lists
metrics.claspws.tv                                        all four lists
customerevents.netflix.com                                three of four lists
event.altitude-arena.com                                  three of four lists
ads.altitude-arena.com                                    three of four lists
myhouseofads.firebaseio.com                               all four lists
mads.amazon.com                                           all four lists
ads.aimitv.com.s3.amazonaws.com                           all four lists
analytics.mobitv.com                                      all four lists
events.brightline.tv                                      all four lists
ctv.monarchads.com                                        three of four lists
ads.superawesome.tv                                       three of four lists
adplatform-static.s3-us-west-1.amazonaws.com              all four lists
kraken-measurements.s3-external-1.amazonaws.com           all four lists
kinstruments-measurements.s3-external-1.amazonaws.com     all four lists
venezia-measurements.s3-external-1.amazonaws.com          all four lists
ad-playlistserver.aws.syncbak.com                         all four lists

Table 8. Examples of potential false negatives for the four DNS-based blocklists (PD, TF, MoaAB, SATV), found using app penetration analysis and keyword search (“ad”, “ads”, “analy”, “track”, “hb” (for heartbeat), “score”, “event”, “metrics”, “measure”).
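The heuristic used to surface the candidates in Table 8 (Sec. C.4) can be sketched as follows. The keyword list is the one given in the table caption; the `min_apps` threshold and the label-prefix matching are illustrative assumptions about how the penetration analysis and keyword search are combined.

```python
# Keywords from the Table 8 caption ("hb" stands for heartbeat).
KEYWORDS = ("ad", "ads", "analy", "track", "hb", "score", "event",
            "metrics", "measure")

def potential_ats(hostname, n_apps, min_apps=2):
    """Flag a hostname as a potential ATS domain (blocklist false
    negative candidate) if it is contacted by multiple apps
    (penetration analysis) or if any dot-separated label of the
    hostname starts with one of the keywords."""
    labels = hostname.lower().split(".")
    keyword_hit = any(label.startswith(k) for label in labels
                      for k in KEYWORDS)
    return n_apps >= min_apps or keyword_hit
```

This is a candidate generator, not a classifier: prefix matching on “ad” also catches benign labels (e.g., “adobe”), so flagged hostnames still require the manual inspection implied by the text.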