Composable and Versatile Privacy via Truncated CDP
Mark Bun (Princeton University, Princeton, NJ, USA), Cynthia Dwork (Harvard University, Cambridge, MA, USA), Guy N. Rothblum (Weizmann Institute of Science, Rehovot, Israel), and Thomas Steinke (IBM Research – Almaden, San Jose, CA, USA)

ABSTRACT

We propose truncated concentrated differential privacy (tCDP), a refinement of differential privacy and of concentrated differential privacy. This new definition provides robust and efficient composition guarantees, supports powerful algorithmic techniques such as privacy amplification via sub-sampling, and enables more accurate statistical analyses. In particular, we show a central task for which the new definition enables an exponential accuracy improvement.

CCS CONCEPTS

• Theory of computation → Design and analysis of algorithms;

KEYWORDS

differential privacy, algorithmic stability, subsampling

ACM Reference Format:
Mark Bun, Cynthia Dwork, Guy N. Rothblum, and Thomas Steinke. 2018. Composable and Versatile Privacy via Truncated CDP. In Proceedings of the 50th Annual ACM SIGACT Symposium on the Theory of Computing (STOC'18). ACM, New York, NY, USA, 13 pages. https://doi.org/10.1145/3188745.3188946

1 INTRODUCTION

Differential privacy (DP) is a mathematically rigorous definition of privacy, which is tailored to analysis of large datasets and is equipped with a formal measure of privacy loss [9, 12]. Differentially private algorithms cap the permitted privacy loss in any execution of the algorithm at a pre-specified parameter ε, providing a concrete privacy/utility tradeoff. A signal strength of differential privacy is the ability to reason about cumulative privacy loss under composition of multiple analyses.

Roughly speaking, differential privacy ensures that the outcome of any analysis on a dataset x is distributed very similarly to the outcome on any neighboring dataset x′ that differs from x in just one element (corresponding to one individual). That is, differentially private algorithms are randomized, and the max divergence between these two distributions (the maximum log-likelihood ratio for any event) is bounded by the privacy parameter ε. This absolute guarantee on the maximum privacy loss is now sometimes referred to as "pure" differential privacy.

A popular relaxation, "approximate" or (ε, δ)-differential privacy [10], roughly guarantees that with probability at least 1 − δ the privacy loss does not exceed ε. This relaxation allows a δ probability of catastrophic privacy failure, and thus δ is typically taken to be "cryptographically" small. Although small values of δ come at a price in privacy, the relaxation nevertheless frequently permits asymptotically better accuracy than pure differential privacy (for the same value of ε). Indeed, the central advantage of relaxing the guarantee is that it permits an improved and asymptotically tight analysis of the cumulative privacy loss incurred by composition of multiple (pure or approximate) differentially private mechanisms [15].

Composition is the key to differential privacy's success, as it permits the construction of complex – and useful – differentially private analyses from simple differentially private primitives. That is, it allows us to "program" in a privacy-preserving fashion. Optimizing for composition, and being willing to live with high-probability bounds on privacy loss rather than insisting on worst-case bounds, led to the introduction of concentrated differential privacy (CDP), a guarantee incomparable to the others, permitting better accuracy than both without compromising on cumulative privacy loss under composition [6, 14]. Intuitively, CDP requires that the privacy loss have small expectation and tight concentration around its expectation. There is no hard threshold for maximal privacy loss. Instead, the probability of large losses vanishes at a rate that roughly parallels the concentration of the Gaussian distribution. This framework exactly captures the type of privacy loss that occurs under composition of (pure and approximate) DP mechanisms.
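To make the composition arithmetic above concrete, here is a minimal Python sketch comparing the basic pure-DP bound kε with the high-probability bound of the advanced composition theorem [15]; the function names and example parameters are our own illustration, not from the paper.

    import math

    def basic_composition(eps, k):
        # Pure DP: composing k mechanisms, each eps-DP, yields (k * eps)-DP.
        return k * eps

    def advanced_composition(eps, k, delta):
        # Advanced composition [15]: composing k eps-DP mechanisms yields
        # (eps', delta)-DP for eps' = sqrt(2k ln(1/delta)) * eps + k * eps * (e^eps - 1).
        return math.sqrt(2 * k * math.log(1 / delta)) * eps + k * eps * (math.exp(eps) - 1)

    if __name__ == "__main__":
        eps, k, delta = 0.1, 100, 1e-6
        print("basic bound:   ", basic_composition(eps, k))             # 10.0
        print("advanced bound:", advanced_composition(eps, k, delta))   # about 6.3

For many small mechanisms (large k, small ε), the high-probability bound grows like √k rather than k, which is the asymptotic improvement described above.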
CDP improves on approximate DP by "cutting corners" in a way that has no privacy cost under high levels of composition. For certain approximate DP algorithms, such as the Gaussian mechanism, when a δ-probability failure occurs, there is no privacy catastrophe. Rather, the most likely situation is that the privacy loss is still bounded by 2ε. More generally, for any k ≥ 1, the risk that the privacy loss exceeds kε is δ^Ω(k²). In such situations, the √(log(1/δ)) multiplicative loss in accuracy one obtains in the analysis of this algorithm is a high price to pay to avoid a small increase in privacy loss. Under high levels of composition of approximate DP mechanisms, the privacy losses of each individual mechanism are forced to be "artificially" low in order to control the cumulative loss. This means that small lapses in these artificially low losses are not so consequential, and literally do not add up to much in the risk of exceeding the desired cumulative loss.

Finally, in addition to providing better accuracy than pure and approximate DP, CDP has a simple optimal composition theorem, computable in linear time. This stands in sharp contrast to the situation with approximate DP, for which computing optimal composition bounds is computationally intractable (#P-hard) [20].

With these important advantages, concentrated DP also has limitations:

(1) "Packing"-based lower bounds [6, 16] imply inherent limitations on the accuracy that can be obtained under CDP for certain tasks. These lower bounds do not apply to approximate DP (because of the δ-probability "escape hatch"), and there are tasks where the accuracy of CDP data analyses is worse than what can be obtained under approximate DP.

(2) Unlike both pure and approximate DP, CDP does not support "privacy amplification" via subsampling [17, 22]. This type of privacy amplification is a powerful tool, as demonstrated by the recent work of Abadi et al. on differentially private deep learning [1]. While their privacy analysis utilizes a CDP-like view of privacy loss, their use of subsampling for privacy amplification means that their algorithm cannot be analyzed within the framework of CDP.

With the above progress and challenges in mind, we propose the notion of truncated concentrated differential privacy (tCDP) to give us the best of all worlds.

A new framework. Loosely speaking, CDP restricts the privacy loss to be at least as concentrated as a Gaussian. tCDP relaxes this requirement: informally, it only requires Gaussian-like concentration up to a set boundary, specified by a number of standard deviations. Beyond this boundary, the privacy loss can be less concentrated than a Gaussian (but we still require subexponential concentration). The formal definition uses two parameters: ρ, which (roughly) controls the expectation and standard deviation of the privacy loss, and ω, which (roughly) controls the number of standard deviations for which we require Gaussian-like concentration. Setting ω = ∞ recovers the definition of ρ-zCDP. Clearly, decreasing ω relaxes the definition of (ρ, ω)-tCDP. See Section 1.3 for further discussion of the relationship between tCDP and prior definitions.

CDP: a recap. The addition of Gaussian noise drawn from N(0, (∆f/ε)²), where ∆f is the sensitivity of the function f being computed, is a universal mechanism for Concentrated Differential Privacy, yielding a privacy loss random variable Z that is ε-subgaussian, with expected privacy loss µ = ε²/2. Formally, we have the following bound on the moment generating function:

    ∀λ ∈ ℝ: E[exp(λ(Z − µ))] ≤ exp(ε²λ²/2).    (1)

This implies that Pr[Z − µ ≥ kε] ≤ exp(−k²/2), where the connection between bounds on high moments and tail bounds goes via an application of Markov's inequality: for any λ > 0,

    Pr[Z − µ ≥ kε] = Pr[exp(λ(Z − µ)) ≥ exp(λkε)] ≤ E[exp(λ(Z − µ))] / exp(λkε).    (2)

By (1), the right-hand side is at most exp(ε²λ²/2 − λkε). This is minimized when λ = k/ε, yielding a probability bounded by exp(−k²/2).
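As a numerical sanity check on this recap, the following self-contained Python sketch estimates the privacy loss of the Gaussian mechanism by Monte Carlo and compares its tails against the exp(−k²/2) bound; the sensitivity-1 query and the parameter values are illustrative assumptions on our part, not from the paper.

    import numpy as np

    # Assumed setup for illustration: a sensitivity-1 query q with q(x) = 0
    # on dataset x and q(y) = 1 on a neighboring dataset y.
    eps = 0.5
    sigma = 1.0 / eps     # Gaussian mechanism noise scale, Delta q / eps
    mu = eps ** 2 / 2     # expected privacy loss, per the recap

    rng = np.random.default_rng(0)
    o = rng.normal(0.0, sigma, size=1_000_000)   # mechanism outputs on x

    # Privacy loss Z = ln(p_x(o) / p_y(o)) for Gaussian densities centered at 0 and 1.
    z = ((o - 1.0) ** 2 - o ** 2) / (2 * sigma ** 2)

    print(f"mean of Z: {z.mean():.4f}   (theory: mu = {mu:.4f})")
    for k in (1, 2, 3):
        emp = (z - mu >= k * eps).mean()
        print(f"Pr[Z - mu >= {k}*eps] = {emp:.2e}   (bound exp(-k^2/2) = {np.exp(-k ** 2 / 2):.2e})")

The empirical mean lands near ε²/2, and every empirical tail sits below the Gaussian bound, consistent with the tail bound derived from (1) and (2).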
Formally, we have the following bound on the moment limitations: generating function: 2 2 (1) “Packing”-based lower bounds [6, 16] imply inherent limita- 8λ 2 RE»eλ¹Z −µº¼ ≤ eε λ /2: (1) tions on the accuracy that can be obtained under CDP for 2 certain tasks. These lower bounds do not apply to approxi- This implies that Pr»Z − µ ≥ kε¼ ≤ e−k /2, where the connection mate DP (because of the δ-probability “escape hatch”), and between bounds on high moments and tail bounds goes via an there are tasks where the accuracy of CDP data analyses is application of Markov’s inequality: worse than what can be obtained under approximate DP. (2) Unlike both pure and approximate DP, CDP does not support E»e¹λ/εº¹Z −µº¼ Pr»¹Z − µº ≥ kε¼ = Pr»e¹λ/εº¹Z −µº ≥ eλk ¼ ≤ : (2) “privacy amplification” via subsampling [17, 22]. This type of eλk privacy amplification is a powerful tool, as demonstrated by the recent work of Abadi et al. in differentially private deep This is minimized when λ = k/ε, yielding a probability bounded by − 2/ learning [1]. While their privacy analysis utilizes a CDP-like e k 2. view of privacy loss, their use of subsampling for privacy Subsampling under CDP and tCDP.. Given a dataset of size n, amplification means that their algorithm cannot be analyzed let us choose a random subset of size sn, for s 2 ¹0; 1º, as a pre- within the framework of CDP. processing step before applying a differentially private algorithm. With the above progress and challenges in mind, we propose the In analogy to the situation for pure and approximate differential notion of truncated concentrated differential privacy (tCDP) to give privacy, we would expect subsampling to decrease the privacy loss us the best of all words. roughly by a factor of s. More precisely we would hope that the A new framework. Loosely speaking, CDP restricts the privacy new standard should be O¹sεº, with the mean reduced by a factor loss to be at least as concentrated as a Gaussian. tCDP relaxes this re- of roughly s2. Let us see what actually happens. quirement. Informally, it only requires Gaussian-like concentration Let x be a dataset of n zeroes, and let y be a neighboring dataset up to a set boundary, specified by a number of standard deviations. of n − 1 zeroes and a single one. Consider the sensitivity 1 query Beyond this boundary, the privacy loss can be less concentrated q, “How many 1’s are in the dataset?” Let Z be a random variable 2 than a Gaussian (but we still require subexponential concentration). with distribution N¹0; 1 º. Let A be the event that the response The formal definition uses two parameters: ρ, which (roughly) con- ε −k 2/2 trols the expectation and standard deviation of the privacy loss, received is at least 1 + z, where Pr»Z ≥ z¼ ≤ e for a fixed 2 and ω, which (roughly) controls the number of standard deviations 2 ∆q 2 k > 0. Letting σ = ε = 1/ε , when the dataset is y the output for which we require Gaussian-like concentration.