Homomorphic Encryption for Machine Learning in Medicine and Bioinformatics
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
ARTIFICIAL INTELLIGENCE and ALGORITHMIC LIABILITY a Technology and Risk Engineering Perspective from Zurich Insurance Group and Microsoft Corp
White Paper ARTIFICIAL INTELLIGENCE AND ALGORITHMIC LIABILITY A technology and risk engineering perspective from Zurich Insurance Group and Microsoft Corp. July 2021 TABLE OF CONTENTS 1. Executive summary 03 This paper introduces the growing notion of AI algorithmic 2. Introduction 05 risk, explores the drivers and A. What is algorithmic risk and why is it so complex? ‘Because the computer says so!’ 05 B. Microsoft and Zurich: Market-leading Cyber Security and Risk Expertise 06 implications of algorithmic liability, and provides practical 3. Algorithmic risk : Intended or not, AI can foster discrimination 07 guidance as to the successful A. Creating bias through intended negative externalities 07 B. Bias as a result of unintended negative externalities 07 mitigation of such risk to enable the ethical and responsible use of 4. Data and design flaws as key triggers of algorithmic liability 08 AI. A. Model input phase 08 B. Model design and development phase 09 C. Model operation and output phase 10 Authors: D. Potential complications can cloud liability, make remedies difficult 11 Zurich Insurance Group 5. How to determine algorithmic liability? 13 Elisabeth Bechtold A. General legal approaches: Caution in a fast-changing field 13 Rui Manuel Melo Da Silva Ferreira B. Challenges and limitations of existing legal approaches 14 C. AI-specific best practice standards and emerging regulation 15 D. New approaches to tackle algorithmic liability risk? 17 Microsoft Corp. Rachel Azafrani 6. Principles and tools to manage algorithmic liability risk 18 A. Tools and methodologies for responsible AI and data usage 18 Christian Bucher B. Governance and principles for responsible AI and data usage 20 Franziska-Juliette Klebôn C. -
Practical Homomorphic Encryption and Cryptanalysis
Practical Homomorphic Encryption and Cryptanalysis Dissertation zur Erlangung des Doktorgrades der Naturwissenschaften (Dr. rer. nat.) an der Fakult¨atf¨urMathematik der Ruhr-Universit¨atBochum vorgelegt von Dipl. Ing. Matthias Minihold unter der Betreuung von Prof. Dr. Alexander May Bochum April 2019 First reviewer: Prof. Dr. Alexander May Second reviewer: Prof. Dr. Gregor Leander Date of oral examination (Defense): 3rd May 2019 Author's declaration The work presented in this thesis is the result of original research carried out by the candidate, partly in collaboration with others, whilst enrolled in and carried out in accordance with the requirements of the Department of Mathematics at Ruhr-University Bochum as a candidate for the degree of doctor rerum naturalium (Dr. rer. nat.). Except where indicated by reference in the text, the work is the candidates own work and has not been submitted for any other degree or award in any other university or educational establishment. Views expressed in this dissertation are those of the author. Place, Date Signature Chapter 1 Abstract My thesis on Practical Homomorphic Encryption and Cryptanalysis, is dedicated to efficient homomor- phic constructions, underlying primitives, and their practical security vetted by cryptanalytic methods. The wide-spread RSA cryptosystem serves as an early (partially) homomorphic example of a public- key encryption scheme, whose security reduction leads to problems believed to be have lower solution- complexity on average than nowadays fully homomorphic encryption schemes are based on. The reader goes on a journey towards designing a practical fully homomorphic encryption scheme, and one exemplary application of growing importance: privacy-preserving use of machine learning. -
Information Guide
INFORMATION GUIDE 7 ALL-PRO 7 NFL MVP LAMAR JACKSON 2018 - 1ST ROUND (32ND PICK) RONNIE STANLEY 2016 - 1ST ROUND (6TH PICK) 2020 BALTIMORE DRAFT PICKS FIRST 28TH SECOND 55TH (VIA ATL.) SECOND 60TH THIRD 92ND THIRD 106TH (COMP) FOURTH 129TH (VIA NE) FOURTH 143RD (COMP) 7 ALL-PRO MARLON HUMPHREY FIFTH 170TH (VIA MIN.) SEVENTH 225TH (VIA NYJ) 2017 - 1ST ROUND (16TH PICK) 2020 RAVENS DRAFT GUIDE “[The Draft] is the lifeblood of this Ozzie Newsome organization, and we take it very Executive Vice President seriously. We try to make it a science, 25th Season w/ Ravens we really do. But in the end, it’s probably more of an art than a science. There’s a lot of nuance involved. It’s Joe Hortiz a big-picture thing. It’s a lot of bits and Director of Player Personnel pieces of information. It’s gut instinct. 23rd Season w/ Ravens It’s experience, which I think is really, really important.” Eric DeCosta George Kokinis Executive VP & General Manager Director of Player Personnel 25th Season w/ Ravens, 2nd as EVP/GM 24th Season w/ Ravens Pat Moriarty Brandon Berning Bobby Vega “Q” Attenoukon Sarah Mallepalle Sr. VP of Football Operations MW/SW Area Scout East Area Scout Player Personnel Assistant Player Personnel Analyst Vincent Newsome David Blackburn Kevin Weidl Patrick McDonough Derrick Yam Sr. Player Personnel Exec. West Area Scout SE/SW Area Scout Player Personnel Assistant Quantitative Analyst Nick Matteo Joey Cleary Corey Frazier Chas Stallard Director of Football Admin. Northeast Area Scout Pro Scout Player Personnel Assistant David McDonald Dwaune Jones Patrick Williams Jenn Werner Dir. -
The Limits of Post-Selection Generalization
The Limits of Post-Selection Generalization Kobbi Nissim∗ Adam Smithy Thomas Steinke Georgetown University Boston University IBM Research – Almaden [email protected] [email protected] [email protected] Uri Stemmerz Jonathan Ullmanx Ben-Gurion University Northeastern University [email protected] [email protected] Abstract While statistics and machine learning offers numerous methods for ensuring gener- alization, these methods often fail in the presence of post selection—the common practice in which the choice of analysis depends on previous interactions with the same dataset. A recent line of work has introduced powerful, general purpose algorithms that ensure a property called post hoc generalization (Cummings et al., COLT’16), which says that no person when given the output of the algorithm should be able to find any statistic for which the data differs significantly from the population it came from. In this work we show several limitations on the power of algorithms satisfying post hoc generalization. First, we show a tight lower bound on the error of any algorithm that satisfies post hoc generalization and answers adaptively chosen statistical queries, showing a strong barrier to progress in post selection data analysis. Second, we show that post hoc generalization is not closed under composition, despite many examples of such algorithms exhibiting strong composition properties. 1 Introduction Consider a dataset X consisting of n independent samples from some unknown population P. How can we ensure that the conclusions drawn from X generalize to the population P? Despite decades of research in statistics and machine learning on methods for ensuring generalization, there is an increased recognition that many scientific findings do not generalize, with some even declaring this to be a “statistical crisis in science” [14]. -
Individuals and Privacy in the Eye of Data Analysis
Individuals and Privacy in the Eye of Data Analysis Thesis submitted in partial fulfillment of the requirements for the degree of “DOCTOR OF PHILOSOPHY” by Uri Stemmer Submitted to the Senate of Ben-Gurion University of the Negev October 2016 Beer-Sheva This work was carried out under the supervision of Prof. Amos Beimel and Prof. Kobbi Nissim In the Department of Computer Science Faculty of Natural Sciences Acknowledgments I could not have asked for better advisors. I will be forever grateful for their close guidance, their constant encouragement, and the warm shelter they provided. Without them, this thesis could have never begun. I have been fortunate to work with Raef Bassily, Amos Beimel, Mark Bun, Kobbi Nissim, Adam Smith, Thomas Steinke, Jonathan Ullman, and Salil Vadhan. I enjoyed working with them all, and I thank them for everything they have taught me. iii Contents Acknowledgments . iii Contents . iv List of Figures . vi Abstract . vii 1 Introduction1 1.1 Differential Privacy . .2 1.2 The Sample Complexity of Private Learning . .3 1.3 Our Contributions . .5 1.4 Additional Contributions . 10 2 Related Literature 15 2.1 The Computational Price of Differential Privacy . 15 2.2 Interactive Query Release . 18 2.3 Answering Adaptively Chosen Statistical Queries . 19 2.4 Other Related Work . 21 3 Background and Preliminaries 22 3.1 Differential privacy . 22 3.2 Preliminaries from Learning Theory . 24 3.3 Generalization Bounds for Points and Thresholds . 29 3.4 Private Learning . 30 3.5 Basic Differentially Private Mechanisms . 31 3.6 Concentration Bounds . 33 4 The Generalization Properties of Differential Privacy 34 4.1 Main Results . -
Homomorphic Encryption Library
Ludwig-Maximilians-Universit¨at Munchen¨ Prof. Dr. D. Kranzlmuller,¨ Dr. N. gentschen Felde Data Science & Ethics { Microsoft SEAL { Exercise 1: Library for Matrix Operations Microsoft's Simple Encrypted Arithmetic Library (SEAL)1 is a publicly available homomorphic encryption library. It can be found and downloaded at http://sealcrypto.codeplex.com/. Implement a library 13b MS-SEAL.h supporting matrix operations using homomorphic encryption on basis of the MS SEAL. due date: 01.07.2018 (EOB) no. of students: 2 deliverables: 1. Implemenatation (including source code(s)) 2. Documentation (max. 10 pages) 3. Presentation (10 { max. 15 minutes) 13b MS-SEAL.h #i n c l u d e <f l o a t . h> #i n c l u d e <s t d b o o l . h> typedef struct f double ∗ e n t r i e s ; unsigned int width; unsigned int height; g matrix ; /∗ Initialize new matrix: − reserve memory only ∗/ matrix initMatrix(unsigned int width, unsigned int height); /∗ Initialize new matrix: − reserve memory − set any value to 0 ∗/ matrix initMatrixZero(unsigned int width, unsigned int height); /∗ Initialize new matrix: − reserve memory − set any value to random number ∗/ matrix initMatrixRand(unsigned int width, unsigned int height); 1https://www.microsoft.com/en-us/research/publication/simple-encrypted-arithmetic-library-seal-v2-0/# 1 /∗ copy a matrix and return its copy ∗/ matrix copyMatrix(matrix toCopy); /∗ destroy matrix − f r e e memory − set any remaining value to NULL ∗/ void freeMatrix(matrix toDestroy); /∗ return entry at position (xPos, yPos), DBL MAX in case of error -
Calibrating Noise to Sensitivity in Private Data Analysis
Calibrating Noise to Sensitivity in Private Data Analysis Cynthia Dwork1, Frank McSherry1, Kobbi Nissim2, and Adam Smith3? 1 Microsoft Research, Silicon Valley. {dwork,mcsherry}@microsoft.com 2 Ben-Gurion University. [email protected] 3 Weizmann Institute of Science. [email protected] Abstract. We continue a line of research initiated in [10, 11] on privacy- preserving statistical databases. Consider a trusted server that holds a database of sensitive information. Given a query function f mapping databases to reals, the so-called true answer is the result of applying f to the database. To protect privacy, the true answer is perturbed by the addition of random noise generated according to a carefully chosen distribution, and this response, the true answer plus noise, is returned to the user. Previous work focused on the case of noisy sums, in which f = P i g(xi), where xi denotes the ith row of the database and g maps database rows to [0, 1]. We extend the study to general functions f, proving that privacy can be preserved by calibrating the standard devi- ation of the noise according to the sensitivity of the function f. Roughly speaking, this is the amount that any single argument to f can change its output. The new analysis shows that for several particular applications substantially less noise is needed than was previously understood to be the case. The first step is a very clean characterization of privacy in terms of indistinguishability of transcripts. Additionally, we obtain separation re- sults showing the increased value of interactive sanitization mechanisms over non-interactive. -
Arxiv:2102.00319V1 [Cs.CR] 30 Jan 2021
Efficient CNN Building Blocks for Encrypted Data Nayna Jain1,4, Karthik Nandakumar2, Nalini Ratha3, Sharath Pankanti5, Uttam Kumar 1 1 Center for Data Sciences, International Institute of Information Technology, Bangalore 2 Mohamed Bin Zayed University of Artificial Intelligence 3 University at Buffalo, SUNY 4 IBM Systems 5 Microsoft [email protected], [email protected], [email protected]/[email protected], [email protected], [email protected] Abstract Model Owner Model Architecture 푴 Machine learning on encrypted data can address the concerns Homomorphically Encrypted Model Model Parameters 퐸(휽) related to privacy and legality of sharing sensitive data with Encryption untrustworthy service providers, while leveraging their re- Parameters 휽 sources to facilitate extraction of valuable insights from oth- End-User Public Key Homomorphically Encrypted Cloud erwise non-shareable data. Fully Homomorphic Encryption Test Data {퐸(퐱 )}푇 Test Data 푖 푖=1 FHE Computations Service 푇 Encryption (FHE) is a promising technique to enable machine learning {퐱푖}푖=1 퐸 y푖 = 푴(퐸 x푖 , 퐸(휽)) Provider and inferencing while providing strict guarantees against in- Inference 푇 Decryption formation leakage. Since deep convolutional neural networks {푦푖}푖=1 Homomorphically Encrypted Inference Results {퐸(y )}푇 (CNNs) have become the machine learning tool of choice Private Key 푖 푖=1 in several applications, several attempts have been made to harness CNNs to extract insights from encrypted data. How- ever, existing works focus only on ensuring data security Figure 1: In a conventional Machine Learning as a Ser- and ignore security of model parameters. They also report vice (MLaaS) scenario, both the data and model parameters high level implementations without providing rigorous anal- are unencrypted. -
Fundamentals of Fully Homomorphic Encryption – a Survey
Electronic Colloquium on Computational Complexity, Report No. 125 (2018) Fundamentals of Fully Homomorphic Encryption { A Survey Zvika Brakerski∗ Abstract A homomorphic encryption scheme is one that allows computing on encrypted data without decrypting it first. In fully homomorphic encryption it is possible to apply any efficiently com- putable function to encrypted data. We provide a survey on the origins, definitions, properties, constructions and uses of fully homomorphic encryption. 1 Homomorphic Encryption: Good, Bad or Ugly? In the seminal RSA cryptosystem [RSA78], the public key consists of a product of two primes ∗ N = p · q as well as an integer e, and the message space is the set of elements in ZN . Encrypting a message m involved simply raising it to the power e and taking the result modulo N, i.e. c = me (mod N). For the purpose of the current discussion we ignore the decryption process. It is not hard to see that the product of two ciphertexts c1 and c2 encrypting messages m1 and m2 allows e to compute the value c1 · c2 (mod N) = (m1m2) (mod N), i.e. to compute an encryption of m1 · m2 without knowledge of the secret private key. Rabin's cryptosystem [Rab79] exhibited similar behavior, where a product of ciphertexts corresponded to an encryption of their respective plaintexts. This behavior can be expressed in formal terms by saying that the ciphertext space and the plaintext space are homomorphic (multiplicative) groups. The decryption process defines the homomorphism by mapping a ciphertext to its image plaintext. Rivest, Adleman and Dertouzos [RAD78] realized the potential advantage of this property. -
Differential Privacy, Property Testing, and Perturbations
Differential Privacy, Property Testing, and Perturbations by Audra McMillan A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Mathematics) in The University of Michigan 2018 Doctoral Committee: Professor Anna Gilbert, Chair Professor Selim Esedoglu Professor John Schotland Associate Professor Ambuj Tewari Audra McMillan [email protected] ORCID iD: 0000-0003-4231-6110 c Audra McMillan 2018 ACKNOWLEDGEMENTS First and foremost, I would like to thank my advisor, Anna Gilbert. Anna managed to strike the balance between support and freedom that is the goal of many advisors. I always felt that she believed in my ideas and I am a better, more confident researcher because of her. I was fortunate to have a large number of pseudo-advisors during my graduate ca- reer. Martin Strauss, Adam Smith, and Jacob Abernethy deserve special mention. Martin guided me through the early stages of my graduate career and helped me make the tran- sition into more applied research. Martin was the first to suggest that I work on privacy. He taught me that interesting problems and socially-conscious research are not mutually exclusive. Adam Smith hosted me during what became my favourite summer of graduate school. I published my first paper with Adam, which he tirelessly guided towards a pub- lishable version. He has continued to be my guide in the differential privacy community, introducing me to the right people and the right problems. Jacob Abernethy taught me much of what I know about machine learning. He also taught me the importance of being flexible in my research and that research is as much as about exploration as it is about solving a pre-specified problem. -
Amicus Brief of Data Privacy Experts
Case 3:21-cv-00211-RAH-ECM-KCN Document 99-1 Filed 04/23/21 Page 1 of 27 EXHIBIT A Case 3:21-cv-00211-RAH-ECM-KCN Document 99-1 Filed 04/23/21 Page 2 of 27 UNITED STATES DISTRICT COURT FOR THE MIDDLE DISTRICT OF ALABAMA EASTERN DIVISION THE STATE OF ALABAMA, et al., ) ) Plaintiffs, ) ) v. ) Civil Action No. ) 3:21-CV-211-RAH UNITED STATES DEPARTMENT OF ) COMMERCE, et al., ) ) Defendants. ) AMICUS BRIEF OF DATA PRIVACY EXPERTS Ryan Calo Deirdre K. Mulligan Ran Canetti Omer Reingold Aloni Cohen Aaron Roth Cynthia Dwork Guy N. Rothblum Roxana Geambasu Aleksandra (Seša) Slavkovic Somesh Jha Adam Smith Nitin Kohli Kunal Talwar Aleksandra Korolova Salil Vadhan Jing Lei Larry Wasserman Katrina Ligett Daniel J. Weitzner Shannon L. Holliday Michael B. Jones (ASB-5440-Y77S) Georgia Bar No. 721264 COPELAND, FRANCO, SCREWS [email protected] & GILL, P.A. BONDURANT MIXSON & P.O. Box 347 ELMORE, LLP Montgomery, AL 36101-0347 1201 West Peachtree Street, NW Suite 3900 Atlanta, GA 30309 Counsel for the Data Privacy Experts #3205841v1 Case 3:21-cv-00211-RAH-ECM-KCN Document 99-1 Filed 04/23/21 Page 3 of 27 TABLE OF CONTENTS STATEMENT OF INTEREST ..................................................................................... 1 SUMMARY OF ARGUMENT .................................................................................... 1 ARGUMENT ................................................................................................................. 2 I. Reconstruction attacks Are Real and Put the Confidentiality of Individuals Whose Data are Reflected -
Calibrating Noise to Sensitivity in Private Data Analysis
Calibrating Noise to Sensitivity in Private Data Analysis Cynthia Dwork1, Frank McSherry1, Kobbi Nissim2, and Adam Smith3? 1 Microsoft Research, Silicon Valley. {dwork,mcsherry}@microsoft.com 2 Ben-Gurion University. [email protected] 3 Weizmann Institute of Science. [email protected] Abstract. We continue a line of research initiated in [10, 11] on privacy- preserving statistical databases. Consider a trusted server that holds a database of sensitive information. Given a query function f mapping databases to reals, the so-called true answer is the result of applying f to the database. To protect privacy, the true answer is perturbed by the addition of random noise generated according to a carefully chosen distribution, and this response, the true answer plus noise, is returned to the user. P Previous work focused on the case of noisy sums, in which f = i g(xi), where xi denotes the ith row of the database and g maps data- base rows to [0, 1]. We extend the study to general functions f, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function f. Roughly speak- ing, this is the amount that any single argument to f can change its output. The new analysis shows that for several particular applications substantially less noise is needed than was previously understood to be the case. The first step is a very clean characterization of privacy in terms of indistinguishability of transcripts. Additionally, we obtain separation re- sults showing the increased value of interactive sanitization mechanisms over non-interactive.