Data Intimacy and

Michael Kearns University of Pennsylvania [email protected]

AT&T Forum for Technology, Entertainment, and Policy Washington D.C. September 26, 2017

Accompanying whitepaper in preparation

Data Volume and Diversity

[Figure: logos of Google and Facebook products and services under the headings Utility, Entertainment/Social Media, and Operation. Legible names include Cardboard, Chrome, AdSense, Android, Allo, Google+, DoubleClick, YouTube, LinkNYC, Project Fi, Nest, Tilt Brush, Free Basics, DeepFace, WhatsApp, Marketplace, and Messenger, among others.]

Private Data vs. Intimate Data

• Private: social security number, credit cards/history, medical records, etc.
• Emphasis on objective “facts” and data, and keeping them locked down
• Intimate: opinions, attitudes, beliefs, moods, mental state, lifestyle, etc.
• May not be “written down” anywhere
• Intimate data is more valuable and actionable than private data

Data Intimacy: “As If Unobserved”

[Stephens-Davidowitz]

Data Intimacy: Drawing Inferences

[Kosinski, Stillwell, Graepel]

[Backstrom, Kleinberg]

The Machine Learning Pipeline

[Diagram labels: raw data; feature engineering/extraction; feedback/supervision]

Never Enough: Long Tails

Never Enough: Correlations

Implications

• Generic data: e.g. raw network traffic, packets; has no “meaning”
• Private data: sensitive and personal, but still “on the surface”
• Intimate data: not even explicit or tangible, requires inferences
• Intimate data is the most valuable, and cannot be measured in bits
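
The pipeline stages named on the Machine Learning Pipeline slide (raw data, feature engineering/extraction, feedback/supervision) and the point that intimate data "requires inferences" can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the activity logs, the labels, and the choice of a perceptron are hypothetical and do not come from the talk. The label stands in for some attribute that is never written in the logs themselves.

```python
from collections import Counter

# Hypothetical raw data: each record is an unstructured activity log.
RAW_LOGS = [
    "search gym search protein view running_shoes",
    "view running_shoes search marathon search gym",
    "search pizza view sofa search tv_series",
    "view tv_series search pizza search sofa",
]
# Hypothetical supervision: +1 / -1 labels from some feedback signal.
# The label is an attribute that appears nowhere in the logs.
LABELS = [1, 1, -1, -1]


def extract_features(log: str) -> Counter:
    """Feature extraction: turn a raw log into token counts."""
    return Counter(log.split())


def score(weights: Counter, feats: Counter) -> int:
    """Weighted sum of feature counts; unseen features contribute 0."""
    return sum(weights[f] * v for f, v in feats.items())


def train_perceptron(examples, labels, epochs: int = 10) -> Counter:
    """Supervised step: learn one weight per feature (perceptron rule)."""
    weights = Counter()
    for _ in range(epochs):
        for feats, y in zip(examples, labels):
            if y * score(weights, feats) <= 0:  # wrong or undecided: update
                for f, v in feats.items():
                    weights[f] += y * v
    return weights


def predict(weights: Counter, log: str) -> int:
    """Infer the unobserved attribute for a new raw log."""
    return 1 if score(weights, extract_features(log)) > 0 else -1


features = [extract_features(log) for log in RAW_LOGS]
model = train_perceptron(features, LABELS)
```

Calling `predict(model, "search gym view protein")` on a log the model has not seen applies the learned weights to its token counts. In this toy data the generic tokens `search` and `view` end up with zero weight, so only the content tokens drive the inference.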