Successes of Differential Privacy
Cynthia Dwork, Harvard University

Pre-Modern Cryptography
[Diagram: the pre-modern cycle – propose a scheme, break it, propose another, with no definitions in sight.]

Modern Cryptography
[Diagram: the definitional cycle. Propose a definition; propose algorithms satisfying the definition. If the definition is broken, propose a STRONGER definition. If provably no algorithm exists – why? – propose a WEAKER/DIFFERENT definition, or conclude it was a bad definition.]

Scientific Launch
Dinur-Nissim and the Fundamental Law of Information Recovery:
“Overly accurate” estimates of “too many” statistics destroy privacy.
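The Fundamental Law can be made concrete. Below is a minimal sketch, not from the talk, of a Dinur-Nissim-style linear reconstruction attack: when released subset counts are too accurate, least squares plus rounding recovers essentially the whole database, while heavier noise defeats the attack. The sizes and noise scales are illustrative assumptions, not calibrated to any particular ε.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 50, 500                        # n secret bits; m released subset-count statistics

secret = rng.integers(0, 2, size=n)   # the private database: one bit per person
A = rng.integers(0, 2, size=(m, n))   # each row: a random subset whose count is released

def reconstruct(noisy_counts):
    """Least-squares-and-round attack: solve A x ~ counts, round to bits."""
    est, *_ = np.linalg.lstsq(A, noisy_counts, rcond=None)
    return (est > 0.5).astype(int)

# "Overly accurate" answers: each released count is off by less than 1.
accurate = A @ secret + rng.uniform(-0.4, 0.4, size=m)
# Answers protected by much larger symmetric (Laplace) noise.
protected = A @ secret + rng.laplace(scale=25.0, size=m)

acc_attack = (reconstruct(accurate) == secret).mean()
prot_attack = (reconstruct(protected) == secret).mean()
print(f"bits recovered from accurate answers: {acc_attack:.0%}")
print(f"bits recovered from noisy answers:    {prot_attack:.0%}")
```

With overly accurate answers the attack recovers nearly every bit; with large noise it degrades toward guessing, which is exactly the trade-off the Fundamental Law describes.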
1. Methodology
2. Engaging with negative results: Dinur-Nissim; impossibility of semantic security (the Terry Gross example)
3. Algorithmic approach: privacy-preserving programming from a few primitives
   RR, symmetric noise, EM: the ORs and ANDs of DP
   The astonishing Blum-Ligett-Roth result
   Composition
   Analytical insights: sparse vector and PMW; the geometric view
4. Complexity

Fruitful Interplay with Other Fields
Learning theory, discrepancy theory, cryptography, geometry, complexity theory, mechanism design, pseudorandomness, communication complexity, machine learning, (robust) statistics, fingerprinting codes, coding theory

Rich Algorithmic Literature
Counts, linear queries, histograms, contingency tables (marginals)
Location and spread (e.g., median, interquartile range)
Dimension reduction (PCA, SVD), clustering
Support Vector Machines
Sparse regression/LASSO, logistic and linear regression
Gradient descent
Boosting, Multiplicative Weights
Combinatorial optimization, mechanism design
Privacy under continual observation, pan-privacy
Kalman filtering
Statistical Queries learning model, PAC learning
False Discovery Rate control
…

Outreach
Formative engagement with statistics led to the earliest public deployment
Social science research; law, economics, medicine, …
PLSC, Berkman, Brussels, Simons Foundation, EC, iDASH, …
Omics: Stanford (past); IPAM (upcoming); Society for Epidemiologic Research

Policy
CPUC hearings on the Energy Data Center, the ruling, the Southern California power company
Podesta report, PCAST report
Commission on Evidence-Based Policymaking
Consumer Financial Protection Bureau
…

Deployment
RAPPOR, Google more generally, Apple, …
A couple of startups (LeapYear, Privitar(?))
Census – OnTheMap and upcoming
Help wanted!

DP when Privacy is not a Concern
Markets, Economics, Game Theory
   Hartline, McSherry, Talwar; Roth; Pai and Roth; Lykouris, Syrgkanis, and Tardos
Fairness in Algorithmic Classification
Generalizability under Adaptive Analysis

Fairness Through Awareness
Dwork, Hardt, Pitassi, Reingold, Zemel 2012

Individual Fairness
People who are similar with respect to a specific classification task should be treated similarly.
S + math ∼ Sc + finance
Classifier
A classifier is a map M: V → O, taking each individual x in V to an outcome M(x) in O.
V: individuals; O: classification outcomes.

Individual Fairness – Lipschitz Classifier
[Diagram: individuals x and y at tiny distance d in V map under M to nearby outcomes in O.]

Lipschitz Mappings
M: V → Δ(O), with ‖M(x) − M(y)‖ ≤ d(x, y)
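One way to make the Lipschitz condition concrete (a sketch of my own, not from the talk): represent M(x) as a distribution over outcomes, measure distance between output distributions by total variation, and check the condition over all pairs. The toy classifier and task metric below are illustrative assumptions.

```python
def total_variation(p: dict, q: dict) -> float:
    """Statistical distance between two distributions over outcomes."""
    outcomes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in outcomes)

def is_lipschitz(M, d, individuals) -> bool:
    """Check ||M(x) - M(y)|| <= d(x, y) for every pair of individuals."""
    return all(
        total_variation(M(x), M(y)) <= d(x, y) + 1e-9  # small slack for float error
        for x in individuals for y in individuals
    )

# Toy example: two applicants the task metric deems similar must
# receive similar hiring lotteries.
applicants = ["x", "y"]
task_metric = lambda a, b: 0.0 if a == b else 0.25
classifier = lambda a: ({"hire": 0.6, "reject": 0.4} if a == "x"
                        else {"hire": 0.45, "reject": 0.55})
```

Here the two lotteries differ by 0.15 in total variation, within the allowed 0.25, so the classifier treats these similar individuals similarly.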
              Differential Privacy                Individual Fairness
Objects       Databases                           Individuals
Outcomes      Output of statistical analysis      Classification outcome
Similarity    General-purpose metric              Task-specific metric
Can use DP techniques for fairness.
Theorem: the exponential mechanism of [MT07] yields individual fairness and small loss when the metric has bounded doubling dimension.
Which is “Right”?

Statistical Validity in Adaptive Data Analysis
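The exponential mechanism of [MT07], the EM primitive referenced on the fairness slide, can be sketched in a few lines; the sampling code below is my generic illustration, not the talk's.

```python
import math
import random

def exponential_mechanism(outcomes, utility, epsilon, sensitivity=1.0):
    """Sample an outcome with probability proportional to
    exp(epsilon * utility(o) / (2 * sensitivity)), per [MT07]."""
    weights = [math.exp(epsilon * utility(o) / (2 * sensitivity)) for o in outcomes]
    r = random.uniform(0.0, sum(weights))
    for o, w in zip(outcomes, weights):
        r -= w
        if r <= 0:
            return o
    return outcomes[-1]  # guard against floating-point underrun
```

Higher-utility outcomes are exponentially preferred, yet no single individual's data can shift any outcome's probability by more than a factor of e^ε, which is the bridge between private selection and individually fair classification.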
Dwork, Feldman, Hardt, Pitassi, Reingold, Roth

[Diagram: a data analyst adaptively queries a curator M holding the database: q1 → a1, q2 → a2, q3 → a3, …]
푞푖 depends on 푎1, 푎2, …, 푎푖−1

Differential privacy neutralizes the risks incurred by adaptivity:
it is hard to find a query for which the data set is not representative.

The Re-Usable Holdout
[Diagram: the data is split into a “Training” set and a “Holdout” set H.]
Learn on the training set.
Check against the holdout via a differentially private mechanism.
Future exploration does not significantly depend on H, so H stays fresh.

3 Sides of the Same Coin
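The re-usable holdout above can be sketched in a simplified, Thresholdout-style form. This is my own illustration under stated assumptions: queries are means of per-record statistics in [0, 1], the threshold and noise scales are illustrative, and the full algorithm's noise-budget accounting is omitted.

```python
import random

def make_reusable_holdout(train, holdout, threshold=0.04, sigma=0.01):
    """Simplified Thresholdout-style re-usable holdout.

    A query `phi` maps one record to a value in [0, 1]; we answer with its
    mean. The holdout H is consulted only through a noisy comparison, so its
    answers remain statistically valid under adaptively chosen queries.
    """
    def answer(phi):
        mean = lambda data: sum(phi(x) for x in data) / len(data)
        t, h = mean(train), mean(holdout)
        # Noisy check: does the training estimate agree with the holdout?
        if abs(t - h) < threshold + random.gauss(0.0, sigma):
            return t                         # agree: release the training value
        return h + random.gauss(0.0, sigma)  # disagree: noisy holdout value
    return answer
```

Because each query touches H only through this differentially private comparison, the analyst's future exploration does not significantly depend on H, and H stays fresh.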
Fairness, Privacy, Generalizability

“Keep Up the Good Work” – Moni Naor (by channeling)
Let your research be fruitful and multiply
Build the ε registry, formally or informally
Build libraries, continue outreach efforts
Confront implications of the Fundamental Law
   Prioritization? Who decides? Which fields have the tools?
   Public understanding
Generalization beyond the sample distribution / transfer learning?
   Strong relation to fairness

Thank You