A Privacy Preserving ML System
Alexander Wu and Hank O'Brien
UC Berkeley

Abstract

It is generally agreed that machine learning models may contain intellectual property which should not be shared with users, while at the same time there should be mechanisms in place to prevent the abuse of sensitive user data. We propose a machine learning inference system which provides an end-to-end preservation of privacy for both a machine learning model developer and user. Our system aims to minimize its constraints on the expressiveness and accuracy of machine learning models. Our system achieves this by utilizing trusted computation, with a trust-performance trade-off which extends to a cryptographic proof that data is not tampered with.

1 Introduction

Deep learning inference often requires the use of sensitive user data. Because computation must be performed on this data, traditional cryptographic techniques used to protect data at rest cannot be easily applied to machine learning inference. The model weights, architecture, and code involved in inference are also often sensitive intellectual property which is not suitable for distribution to client devices. This creates a natural conflict of interest: application developers do not want to share their models, and users do not want to share their data. Traditional solutions to the problem of privacy preserving machine learning typically fall into 3 categories:

1. Cryptographic: Cryptographic approaches to privacy preserving machine learning typically leverage homomorphic cryptography. Homomorphic encryption encrypts the user's data $x$ with a key $k$ while preserving the structure of certain operations $f$ [20]:

   $D_k(f_q(E_k(x)))$

   Here $E_k$ is typically a symmetric homomorphic cipher. The user data $x$ is encrypted on a client machine. In the context of deep learning, these schemes typically only support addition and multiplication, thus $f_q$ is typically limited to models with activations that can be approximated by polynomials [12][3]. A toy illustration of the homomorphic property is sketched after this list.

   Once the data is encrypted, the ciphertext can be sent to a remote server, where $f_q(E_k(x))$ is calculated. This calculation is typically computationally expensive, with latencies greater than 1 second, even with hardware acceleration [3].

2. Differential Privacy: In differential privacy, a random perturbation is added to user data before it is used in a model. This perturbation is typically parameterized by $\varepsilon$ and $\delta$. In particular, a randomized algorithm $\mathcal{M}$ with domain $\mathbb{N}^{|X|}$ is $(\varepsilon, \delta)$-differentially private if for all $S \subseteq \mathrm{Range}(\mathcal{M})$ and all $x, y \in \mathbb{N}^{|X|}$ such that $\lVert x - y \rVert_1 \leq 1$ [9]:

   $\Pr[\mathcal{M}(x) \in S] \leq \exp(\varepsilon) \Pr[\mathcal{M}(y) \in S] + \delta$

   In the context of deep learning, the parameters of differential privacy methods greatly affect the accuracy of models and, in some cases, the architecture [1]. A minimal sketch of the standard Laplace mechanism is given after this list.

3. Hardware Enclaves: Hardware based approaches to privacy preserving machine learning typically rely upon secure enclaves such as Intel Software Guard Extensions (SGX). SGX utilizes software attestation to provide a cryptographic proof that a specific piece of software is running in a secure container on an otherwise untrusted machine; a simplified sketch of this check is given after this list. On modern chips, Intel SGX maintains 128MB of secure Processor Reserved Memory (PRM), of which 90MB is the Enclave Page Cache (EPC) [8]. Current Intel SGX implementations contain timing side channels due to speculative execution vulnerabilities [23]. In this work, we assume that a patch or similar enclave could provide the same guarantees that SGX advertises, without changing the programming interface.

   Notably, the 90MB EPC is smaller than many typical deep learning models. For example, a typical image classifier such as ResNet 50v2 [14] is 102MB. Applications with a larger resident set size require cryptographically secure demand paging mechanisms [11].

   In order to improve enclave performance, works exist which distribute untrusted machine learning applications [15] and which sandbox untrusted models [16]. Other works improve the performance of model inference by using untrusted accelerators, maintaining integrity but trading privacy for performance [22].
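To make the homomorphic property in approach 1 concrete, the toy sketch below uses textbook RSA, which is multiplicatively homomorphic: decrypting the product of two ciphertexts yields the product of the plaintexts. This is not one of the schemes used in the works cited above and it is not semantically secure; it only illustrates how a server can compute on ciphertexts such that decryption recovers the result of the corresponding computation on plaintexts.

// Toy multiplicative homomorphism with textbook RSA (insecure, illustrative only).
const p = 61n, q = 53n;           // toy primes
const n = p * q;                  // modulus 3233
const e = 17n, d = 2753n;         // public/private exponents, e*d = 1 mod phi(n)

// Square-and-multiply modular exponentiation.
function modPow(base: bigint, exp: bigint, mod: bigint): bigint {
  let result = 1n;
  base %= mod;
  while (exp > 0n) {
    if (exp & 1n) result = (result * base) % mod;
    base = (base * base) % mod;
    exp >>= 1n;
  }
  return result;
}

const encrypt = (m: bigint) => modPow(m, e, n); // E_k
const decrypt = (c: bigint) => modPow(c, d, n); // D_k

// The "server" multiplies ciphertexts without ever seeing a or b.
const a = 7n, b = 12n;
const productOfCiphertexts = (encrypt(a) * encrypt(b)) % n;
console.log(decrypt(productOfCiphertexts) === (a * b) % n); // true

Fully homomorphic schemes extend this idea to both addition and multiplication, which is what allows polynomial approximations of network activations to be evaluated under encryption.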
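The $(\varepsilon, \delta)$ guarantee in approach 2 is most easily illustrated with the Laplace mechanism, which achieves $(\varepsilon, 0)$-differential privacy for a numeric query with $L_1$ sensitivity $\Delta$ by adding noise drawn from $\mathrm{Laplace}(\Delta / \varepsilon)$. The deep learning methods cited in [1] perturb gradients or activations rather than a single query output, but the parameterization is the same. A minimal sketch:

// Laplace mechanism: release trueValue plus Laplace(sensitivity / epsilon) noise.
function sampleLaplace(scale: number): number {
  // Inverse-CDF sampling; u is uniform on (-0.5, 0.5).
  let u = Math.random() - 0.5;
  if (u === -0.5) u = 0; // guard the measure-zero endpoint
  return -scale * Math.sign(u) * Math.log(1 - 2 * Math.abs(u));
}

function laplaceMechanism(trueValue: number, sensitivity: number, epsilon: number): number {
  return trueValue + sampleLaplace(sensitivity / epsilon);
}

// Example: privately release a count query (sensitivity 1) with epsilon = 0.5.
const noisyCount = laplaceMechanism(42, 1, 0.5);

Smaller values of $\varepsilon$ give stronger privacy but larger noise, which is the accuracy trade-off noted above.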
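The attestation step in approach 3 can also be sketched, with the caveat that the report format and verification flow below are hypothetical simplifications; real SGX quotes carry considerably more structure and are verified against Intel's attestation infrastructure. The essential check is comparing the enclave's measurement against the expected value for audited software and verifying a signature from a trusted attestation key.

import { verify } from "crypto";

// Hypothetical, greatly simplified attestation report.
interface AttestationReport {
  measurement: Buffer; // hash of the enclave's initial code and data
  reportData: Buffer;  // data bound into the report, e.g. a fresh public key
  signature: Buffer;   // signature over (measurement || reportData)
}

// Accept the enclave only if it runs the audited software we expect and the
// report is signed by a key we already trust.
function verifyAttestation(
  report: AttestationReport,
  expectedMeasurement: Buffer,
  trustedAttestationKeyPem: string
): boolean {
  const runsExpectedSoftware = report.measurement.equals(expectedMeasurement);
  const signedPayload = Buffer.concat([report.measurement, report.reportData]);
  const signatureValid = verify("sha256", signedPayload, trustedAttestationKeyPem, report.signature);
  return runsExpectedSoftware && signatureValid;
}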
2 Motivation

Previous works have focused on providing a mechanism for performing machine learning inference on data in a privacy preserving manner. These works have typically had to sacrifice performance and/or use non-standard cryptography to ensure privacy. Nearly all existing works we are aware of make the trust assumption that all code run on the end user's device is trusted.

To motivate our system, consider a typical web application in which a user uploads an image on which a service performs remote inference. There are real-world instances of applications violating the assumption that code run on a user's device is trusted [6]. Such applications receive and upload user data in a non-privacy-preserving way, perform inference, then save the data, which can then potentially be abused.

It is also generally accepted that not all code run on a user's device should be trusted. Web browsers and even operating systems make the assumption that code run on a user's device is not necessarily trustworthy. While we focus our motivation and work on the case of a web application running on a desktop browser, we believe the motivation is equally applicable in the context of mobile applications.

The existing machine learning inference trust model in web applications is roughly all-or-nothing. In order to utilize machine learning inference with a web application, a user must reject using the application altogether, or trust it with all of the user's information.

While this naive model sounds reasonable, it is in opposition to a layered trust approach. Consider existing web applications such as an online marketplace. While a user may trust the site with some sensitive information, such as a name, address, etc., there are some degrees of sensitive information which users do not entrust most websites with. For example, while users may typically trust an online marketplace with their address, they typically do not trust the marketplace with payment information. Instead, users typically trust only a specific set of payment processors to handle their payment information.

Current works do not address how an application developer can go about using user data without requiring the user to give that information to the developer, in the same way that an application developer can use payment information to make a transaction without the user providing a credit card number, for example, to the developer.

Figure 1: The existing model. The application developer is untrusted by the user, yet the user must trust the untrusted code where the user is entering her data. The user must also trust that the computing environment chosen by the app developer is secure, but she has no way of validating this because she has no control over what computing environment the app developer has chosen. These two points are the essence of our motivation.

3 Design Goals

In order to enable applications to use user data without having direct access to it, we would like to design a system which meets the following goals:

1. Scalability: the proposed system should not limit the scale of web applications which seek to use it.

2. Flexibility: there is a clear trade-off in current approaches between performance and privacy. Not all applications have the same privacy concerns/requirements. For some applications, running inference on a trusted infrastructure provider (e.g. AWS, GCP, Azure) may be a sufficient amount of trust. Other, more sensitive applications, such as mobile banking or healthcare, may have stricter requirements and may require cryptographic proof of execution on trusted, tamper-free hardware. Applications with weaker privacy requirements should be able to utilize higher-performance techniques which may be less secure, and should not have to suffer the same performance degradation as applications with stricter requirements.

3. Usability: application developers should not be restricted in how they design applications. In particular, they should not be restricted in the user interfaces they design. Using a privacy preserving system should also ideally require minimal changes to existing applications.

4. Robustness: a privacy preserving inference system should have minimal limitations on the types of models an application developer should be able to support. There may be a trade-off between the types of models supported and the privacy mode, but it should be minimized. Robustness should also apply to model accuracy. Privacy preserving systems should minimize their effect on model accuracy whenever possible.
4 Proposed System

4.1 Overview

Our proposed system fundamentally treats machine learning inference as a pure function in a mathematical sense. In particular, we make the assumption/constraint that the machine is running software which the user and developer can audit.

4.2 UTE

The UTE consists of code which the user trusts. The UTE is implemented via browser iframes. Iframes provide a convenient trusted compute environment. Iframes are run in separate sandboxes and explicitly isolated from untrusted applications via browser isolation policy (e.g. same-origin policy, cookie policy, etc.). Iframes are hosted on a different origin than the application. For example, if

https://stanford.edu/application.html

is a web application, it would contain an iframe:

<iframe src="https://berkeley.edu/ute-core.html">
</iframe>

The core logic of the UTE is located within a 1x1 pixel transparent iframe. The iframe serves no graphical purpose, but must be rendered to create an execution environment.

The UTE also consists of graphical components which are separate iframes. Because they are from a different host than the application, the application cannot eavesdrop on the interaction between the user and the UTE.

UTE graphical components are designed to follow the same usage pattern as regular HTML/JavaScript graphical components.
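The excerpt above does not specify the UTE's messaging protocol, so the sketch below is hypothetical: it shows one way the application page could talk to the invisible UTE core iframe from the example above and request inference on data the user entered into a UTE graphical component, using window.postMessage with explicit origin checks. The message names and payload shapes are assumptions made for illustration.

// Hypothetical application-side glue for talking to the UTE core iframe.
const UTE_ORIGIN = "https://berkeley.edu";

const uteFrame = document.querySelector<HTMLIFrameElement>(
  'iframe[src="https://berkeley.edu/ute-core.html"]'
);

// Ask the UTE to run a model on data the user typed into a UTE graphical
// component (identified by a hypothetical component id). The application
// receives only the inference result, never the raw user input.
function requestInference(componentId: string, model: string): Promise<unknown> {
  return new Promise((resolve) => {
    const onMessage = (event: MessageEvent) => {
      // Accept replies only from the UTE origin; the browser's same-origin
      // policy already prevents the application from reading the iframe's DOM.
      if (event.origin !== UTE_ORIGIN) return;
      if (event.data && event.data.type === "inference-result") {
        window.removeEventListener("message", onMessage);
        resolve(event.data.result);
      }
    };
    window.addEventListener("message", onMessage);
    uteFrame?.contentWindow?.postMessage(
      { type: "run-inference", componentId, model },
      UTE_ORIGIN
    );
  });
}

This mirrors the payment-processor analogy from Section 2: the application obtains the outcome it needs without ever holding the sensitive input itself.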