Max-Information, Differential Privacy, and Post-Selection Hypothesis Testing
Ryan Rogers, Aaron Roth, Adam Smith, and Om Thakkar

Adaptive Data Analysis refers to the reuse of data to perform analyses suggested by the outcomes of previously computed statistics on the same data. In this work, we initiate a principled study of how the generalization properties of approximate differential privacy can be used to perform post-selection adaptive hypothesis testing. This substantially extends the existing connection between differential privacy and max-information, which previously was only known to hold for pure differential privacy. It also extends our understanding of max-information as a partially unifying measure controlling the generalization properties of adaptive data analyses.

Motivation

➢ A lot of existing theory assumes tests are selected independently of the data.
➢ In practice, data analysis is inherently interactive: experiments may depend on previous outcomes computed from the same dataset.
➢ Question: How can we provide statistically valid answers to adaptively chosen analyses?
➢ Answer: Limit the information learned about the dataset. [DFH+15a]
➢ Part of a line of work initiated by [DFH+15a, DFH+15b, HU14].

Differential Privacy [DMNS06]

➢ An algorithm $A: D^n \to T$ is $(\epsilon, \delta)$-differentially private if for all neighboring datasets $x, y \in D^n$, i.e., $\mathrm{dist}(x, y) = 1$, and for all sets of outcomes $S \subseteq T$, we have
$$\Pr[A(x) \in S] \le e^{\epsilon} \Pr[A(y) \in S] + \delta.$$
➢ If $\delta = 0$, we say $A$ satisfies pure DP; if $\delta > 0$, approximate DP.
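As a concrete illustration of the definition (not part of the poster), here is a minimal sketch of the Laplace mechanism, a standard $(\epsilon, 0)$-differentially private algorithm; the function names and parameters are our own.

```python
import numpy as np

def laplace_mechanism(x, query, sensitivity, epsilon, rng=None):
    """Release query(x) plus Laplace noise of scale sensitivity/epsilon.

    Calibrating the noise to `sensitivity` -- the maximum change in
    query(x) between neighboring datasets with dist(x, y) = 1 -- yields
    pure epsilon-DP [DMNS06].
    """
    rng = rng or np.random.default_rng()
    return query(x) + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Example: a counting query has sensitivity 1 on binary data.
x = np.array([0, 1, 1, 0, 1])
noisy_count = laplace_mechanism(x, query=np.sum, sensitivity=1.0, epsilon=0.5)
```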

Post-Selection Hypothesis Testing

➢ Hypothesis test: defined by a null hypothesis $H_0$ and a test statistic $t$.
➢ Purpose: reject $H_0$ if the data $X$ is not likely to have been generated from some distribution $Q$ such that $Q \in H_0$.
➢ Significance level of $t$ is $\alpha$ $\implies$ $\Pr_{X \sim Q^n}[t(X) = \mathrm{Reject}] \le \alpha$.
➢ This assumes the choice of $t$ is independent of the data $X$.
➢ Goal: for an adaptively chosen test $t_{A(X)}$, we want to bound $\Pr_{X \sim Q^n}[t_{A(X)}(X) = \mathrm{Reject}]$ for $Q \in H_0$.
➢ Problem: $t_{A(X)}$ can be tailored specifically to $X$.

Max-Information

➢ An algorithm $A$ with bounded max-information allows the analyst to treat the output $A(X)$ as if it were independent of the data $X$, up to a factor (a brute-force computation is sketched below):
$$I_\infty^\beta(X; A(X)) \le k \iff \Pr_{(x,a)}\!\left[\log \frac{\Pr[X = x, A(X) = a]}{\Pr[X' = x]\,\Pr[A(X) = a]} > k\right] \le \beta,$$
where $X'$ is an independent copy of $X$.
➢ Differentiate between general and product distributions:
$$I_\infty^\beta(A, n) = \sup_{S:\,X \sim S} I_\infty^\beta(X; A(X)), \qquad I_{\infty,\Pi}^\beta(A, n) = \sup_{P:\,X \sim P^n} I_\infty^\beta(X; A(X)).$$
➢ [RRST16]: If $I_{\infty,\Pi}^\beta(A, n) \le k$, then for $\gamma(\alpha) = \frac{\alpha - \beta}{2^k}$ (a numeric illustration is sketched below):
significance level of $t_{A(X)}$ is $\gamma(\alpha)$ $\implies$ $\Pr_{X \sim Q^n}[t_{A(X)}(X) = \mathrm{Reject}] \le \alpha$.

Technical Contributions

➢ Previous results [DFH+15a]: If $A: D^n \to T$ is $(\epsilon, 0)$-DP, then
  ➢ for $\beta > 0$, we have $I_{\infty,\Pi}^\beta(A, n) \le \tilde{O}(\epsilon^2 n)$;
  ➢ $I_\infty(A, n) \le \epsilon n$.
➢ Positive Result: If $A: D^n \to T$ is $(\epsilon, \delta)$-DP, then for $\beta \approx O\big(n\sqrt{\delta/\epsilon}\big)$ we have $I_{\infty,\Pi}^\beta(A, n) = O\big(\epsilon^2 n + n\sqrt{\delta/\epsilon}\big)$.
➢ Consequences:
  ➢ Over $k$ rounds of adaptivity, max-information grows as $\sim k$ rather than $k^2$.
  ➢ Generalizes and unifies previous work.
➢ Negative Result: $\exists$ an $(\epsilon, \delta)$-DP algorithm $A$ such that $I_\infty^\beta(A, n) \approx n$ for any $\beta \le \frac{1}{2} - \delta$.

[Diagram: Approx. DP $\Rightarrow$ Max-Information $\Rightarrow$ Post-selection Hypothesis Testing]
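To make the definition of $\beta$-approximate max-information concrete, here is a small brute-force sketch (ours, not from the poster) that computes $I_\infty^\beta(X; A(X))$ in bits for a finite joint distribution, applied to randomized response on a single uniform bit; all names are illustrative.

```python
import math
from itertools import product

def approx_max_information(joint, beta):
    """Smallest k with Pr[ log2(Pr[X=x, A=a] / (Pr[X=x] Pr[A=a])) > k ] <= beta.

    `joint[(x, a)]` holds Pr[X = x, A(X) = a] for a finite outcome space.
    """
    px, pa = {}, {}
    for (x, a), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        pa[a] = pa.get(a, 0.0) + p
    # Pair each outcome's log-ratio with its probability mass, largest first.
    outcomes = sorted(
        ((math.log2(p / (px[x] * pa[a])), p) for (x, a), p in joint.items()),
        reverse=True,
    )
    # The tail strictly above k may carry at most beta mass; walk down
    # until keeping one more outcome above k would exceed beta.
    mass = 0.0
    for log_ratio, p in outcomes:
        mass += p
        if mass > beta:
            return log_ratio
    return float("-inf")

# Randomized response on one uniform bit: report X w.p. e^eps / (1 + e^eps).
eps = 1.0
truth_prob = math.exp(eps) / (1 + math.exp(eps))
joint = {(x, a): 0.5 * (truth_prob if a == x else 1 - truth_prob)
         for x, a in product([0, 1], repeat=2)}

print(approx_max_information(joint, beta=0.05))  # ~0.55 bits
print(eps / math.log(2))  # the pure-DP bound eps * n, converted to bits (n = 1)
```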
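And here is a sketch of the resulting p-value correction from the [RRST16] statement above. The hidden constants in the $O(\cdot)$ bounds are not specified on the poster, so the numbers below are purely illustrative: we pretend the constants are 1.

```python
import math

def corrected_significance(alpha, k, beta):
    """gamma(alpha) = (alpha - beta) / 2**k from [RRST16].

    If A has beta-approximate max-information at most k bits over product
    distributions, running each adaptively chosen test t_{A(X)} at level
    gamma(alpha) keeps the post-selection false-rejection probability <= alpha.
    """
    if alpha <= beta:
        raise ValueError("need alpha > beta for a nontrivial correction")
    return (alpha - beta) / 2.0 ** k

# Illustrative scaling with the poster's (eps, delta)-DP bound,
# k = O(eps^2 n + n sqrt(delta/eps)) and beta = O(n sqrt(delta/eps)).
n, eps, delta = 100, 0.05, 1e-10
k = eps ** 2 * n + n * math.sqrt(delta / eps)
beta = n * math.sqrt(delta / eps)
print(corrected_significance(alpha=0.05, k=k, beta=beta))  # ~0.038
```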

Related Publications

• [BNS+16] Raef Bassily, Kobbi Nissim, Adam D. Smith, Thomas Steinke, Uri Stemmer, and Jonathan Ullman. In STOC, 2016.
• [DFH+15a] Cynthia Dwork, Vitaly Feldman, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Aaron Roth. In NIPS, 2015.
• [DFH+15b] Cynthia Dwork, Vitaly Feldman, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Aaron Roth. In STOC, 2015.
• [DMNS06] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. In TCC, 2006.
• [HU14] Moritz Hardt and Jonathan Ullman. In FOCS, 2014.
• [RZ16] Daniel Russo and James Zou. In AISTATS, 2016.

Supported in part by a grant from the Sloan Foundation and by NSF grants CNS-1253345, CNS-1513694, and IIS-1447700.