
Successes of Differential Privacy

Cynthia Dwork

Pre-Modern Cryptography

[Cycle diagram: Propose → Break → Propose → Break → …]

Modern Cryptography

[Cycle diagram: Propose Definition → Propose algorithms satisfying the definition → Break Definition → Propose STRONGER Definition → …]

[And when the cycle reaches "No Algorithm?": Why? Provably no algorithm? → Propose WEAKER/DIFFERENT Definition]

Bad Definition

Scientific Launch

1. Methodology
2. Engaging with negative results
 Dinur-Nissim and the Fundamental Law of Information Recovery: "overly accurate" estimates of "too many" statistics destroy privacy
 Impossibility of semantic security (the Terry Gross example)
3. Algorithmic approach
 Privacy-preserving programming from a few primitives
 Randomized response (RR), symmetric noise, and the exponential mechanism (EM): the ORs and ANDs of DP
 The astonishing Blum-Ligett-Roth result
 Composition
 Analytical insights: sparse vector and PMW; the geometric view
4. Complexity

Fruitful Interplay with Other Fields
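As an aside, two of the primitives from the Scientific Launch slide, randomized response and symmetric (Laplace) noise, fit in a few lines. This is a minimal sketch with ε as the privacy parameter, not any particular deployed implementation:

```python
import math
import random

def randomized_response(bit, eps):
    """Report the true bit with probability e^eps / (1 + e^eps),
    otherwise flip it: eps-differentially private for one sensitive bit."""
    p_truth = math.exp(eps) / (1.0 + math.exp(eps))
    return bit if random.random() < p_truth else 1 - bit

def laplace_count(true_count, eps):
    """Counting query with symmetric (Laplace) noise of scale 1/eps,
    matching the query's sensitivity of 1."""
    u = random.random() - 0.5  # uniform on [-0.5, 0.5)
    noise = -(1.0 / eps) * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

def estimate_frequency(reports, eps):
    """Debias an aggregate of randomized responses to estimate the
    true population frequency of the sensitive bit."""
    p = math.exp(eps) / (1.0 + math.exp(eps))
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)
```

With n randomized responses, the debiased frequency estimate concentrates around the truth at roughly the 1/(ε√n) rate, the usual price of local privacy.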

 Learning theory, discrepancy theory, cryptography, geometry, complexity theory, mechanism design, pseudorandomness, communication complexity, machine learning, (robust) statistics, fingerprinting codes, coding theory

Rich Algorithmic Literature

 Counts, linear queries, histograms, contingency tables (marginals)
 Location and spread (e.g., median, interquartile range)
 Dimension reduction (PCA, SVD), clustering
 Support Vector Machines
 Sparse regression/LASSO, logistic and linear regression
 Gradient descent
 Boosting, Multiplicative Weights
 Combinatorial optimization, mechanism design
 Privacy under continual observation, pan-privacy
 Kalman filtering
 Statistical Queries learning model, PAC learning
 False Discovery Rate control
 …

Outreach

 Formative engagement with statistics
 Led to earliest public deployment

Social Science Research; Law, Economics, Medicine, …

 PLSC, Berkman, Brussels, Simons Foundation, EC, iDASH, …
 Omics: Stanford (past); IPAM (upcoming); Society for Epidemiologic Research

Policy

 CPUC hearings on the Energy Data Center, the ruling, the Southern California power company
 Podesta report, PCAST report
 Commission on Evidence-Based Policymaking
 Consumer Financial Protection Bureau
 …

Deployment

 RAPPOR, Google more generally, Apple, …
 A couple of startups (LeapYear, Privitar(?))
 Census: OnTheMap and upcoming
 Help wanted!

DP when Privacy is not a Concern

 Markets, economics, game theory
 Hartline, McSherry, Talwar; Roth; Pai and Roth; Lykouris, Syrgkanis, and Tardos
 Fairness in algorithmic classification
 Generalizability under adaptive analysis

Fairness Through Awareness

Dwork, Hardt, Pitassi, Reingold, Zemel 2012

Individual Fairness

 People who are similar with respect to a specific classification task should be treated similarly
 S + math ∼ Sᶜ + finance (a member of protected group S with a math background is comparable to a member of the complement Sᶜ with a finance background)
 "Fairness Through Awareness"
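A toy rendering of "similar people, similar treatment" as a Lipschitz condition on a randomized classifier; the score metric and the classifiers below are hypothetical illustrations, not from the paper:

```python
# A randomized classifier M: V -> Delta(O) is individually fair w.r.t.
# a task-specific metric d if  TV(M(x), M(y)) <= d(x, y)  for all x, y.

def tv_distance(p, q):
    """Total-variation (statistical) distance between two distributions
    given as dicts mapping outcome -> probability."""
    outcomes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in outcomes)

def is_individually_fair(M, d, individuals):
    """True iff the Lipschitz condition holds on every pair."""
    return all(
        tv_distance(M(x), M(y)) <= d(x, y) + 1e-12
        for x in individuals for y in individuals
    )

# Hypothetical task metric: applicants carry a score in [0, 1], and the
# distance is the score gap, so equally qualified applicants must be
# treated alike.
d = lambda x, y: abs(x["score"] - y["score"])

# Accepting with probability equal to the score is 1-Lipschitz, hence fair.
M = lambda x: {"accept": x["score"], "reject": 1.0 - x["score"]}

applicants = [{"score": 0.2}, {"score": 0.55}, {"score": 0.9}]
print(is_individually_fair(M, d, applicants))  # → True
```

A hard threshold at score 0.5 would fail the same check: two applicants with scores 0.45 and 0.55 are close under d but receive opposite outcomes with certainty.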

Classifier

M: V → O, mapping x to M(x)
 V: individuals
 O: classification outcomes

Individual Fairness: Lipschitz Classifier

M: V → Δ(O), with ‖M(x) − M(y)‖ ≤ d(x, y)
 [Diagram: x and y in V at tiny distance d map under M to nearby distributions over O]

Lipschitz Mappings

             Differential Privacy             Individual Fairness
Objects      Databases                        Individuals
Outcomes     Output of statistical analysis   Classification outcome
Similarity   General-purpose metric           Task-specific metric
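One piece of DP machinery that transfers across this table is the exponential mechanism of McSherry-Talwar [MT07]. A minimal sketch over a finite outcome set; the parameter names are illustrative:

```python
import math
import random

def exponential_mechanism(outcomes, quality, eps, sensitivity=1.0):
    """McSherry-Talwar exponential mechanism: sample outcome o with
    probability proportional to exp(eps * quality(o) / (2 * sensitivity))."""
    weights = [math.exp(eps * quality(o) / (2.0 * sensitivity)) for o in outcomes]
    return random.choices(outcomes, weights=weights)[0]
```

With small eps the choice is nearly uniform; with large eps it concentrates on high-quality outcomes, trading off privacy (or, in the fairness reading, Lipschitz smoothness) against loss.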

 Can use DP techniques for fairness
 Theorem: the exponential mechanism of [MT07] yields individual fairness and small loss when the metric has bounded doubling dimension.

Which is "Right"?

Statistical Validity in Adaptive Data Analysis

Dwork, Feldman, Hardt, Pitassi, Reingold, Roth

[Diagram: the data analyst adaptively sends queries q₁, q₂, q₃, … to a mechanism M held by the database curator and receives answers a₁, a₂, a₃, …]

 qᵢ depends on a₁, a₂, …, aᵢ₋₁
 Differential privacy neutralizes the risks incurred by adaptivity
 It becomes hard to find a query for which the data set is not representative

The Re-Usable Holdout

 Learn on the training set
 Check against the holdout set via a differentially private mechanism
 Future exploration does not significantly depend on the holdout H
 H stays fresh

3 Sides of the Same Coin
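The re-usable holdout above can be sketched in the spirit of Thresholdout; the class name, parameter values, and the use of Gaussian noise (the paper's variants use Laplace noise) are illustrative choices, not the exact published algorithm:

```python
import random

class ReusableHoldout:
    """Thresholdout-style re-usable holdout (sketch).

    For each query (an average over the data), return the training-set
    value unless it differs from the holdout value by more than a noisy
    threshold; only then is a noised holdout value released."""

    def __init__(self, train, holdout, threshold=0.04, sigma=0.01):
        self.train, self.holdout = train, holdout
        self.threshold, self.sigma = threshold, sigma

    def query(self, f):
        """Answer the query f, a function of a single record."""
        train_val = sum(f(x) for x in self.train) / len(self.train)
        hold_val = sum(f(x) for x in self.holdout) / len(self.holdout)
        noisy_threshold = self.threshold + random.gauss(0.0, self.sigma)
        if abs(train_val - hold_val) > noisy_threshold:
            # Training and holdout disagree: release a noised holdout value.
            return hold_val + random.gauss(0.0, self.sigma)
        return train_val
```

Because the answer is usually just the training value, the analyst's future exploration barely depends on H, which is why the holdout stays fresh under adaptive querying.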

 Fairness, Privacy, Generalizability

"Keep Up the Good Work" (by channeling)
 Let your research be fruitful and multiply
 Build the ε registry, formally or informally
 Build libraries, continue outreach efforts
 Confront the implications of the Fundamental Law
 Prioritization? Who decides? Which fields have the tools?
 Public understanding
 Generalization beyond the sample distribution / transfer learning?
 Strong relation to fairness

Thank You