SAFELearn: Secure Aggregation for private FEderated Learning (Full Version)*

Hossein Fereidooni1, Samuel Marchal2, Markus Miettinen1, Azalia Mirhoseini3, Helen Möllering4, Thien Duc Nguyen1, Phillip Rieger1, Ahmad-Reza Sadeghi1, Thomas Schneider4, Hossein Yalame4, and Shaza Zeitouni1
1 System Security Lab, Technical University of Darmstadt, Germany − {hossein.fereidooni, markus.miettinen, ducthien.nguyen, phillip.rieger, ahmad.sadeghi, shaza.zeitouni}@trust.tu-darmstadt.de
2 Aalto University and F-Secure Corporation, Finland − samuel.marchal@aalto.fi
3 Google, USA − [email protected]
4 ENCRYPTO, Technical University of Darmstadt, Germany − {moellering, schneider, yalame}@encrypto.cs.tu-darmstadt.de

Abstract—Federated learning (FL) is an emerging distributed paradigm which addresses critical data privacy issues in machine learning by enabling clients, using an aggregation server (aggregator), to jointly train a global model without revealing their training data. Thereby, it improves not only privacy but is also efficient as it uses the computation power and data of potentially millions of clients for training in parallel. However, FL is vulnerable to so-called inference attacks by malicious aggregators which can infer information about clients' data from their model updates. Secure aggregation restricts the central aggregator to only learn the summation or average of the updates of clients. Unfortunately, existing protocols for secure aggregation for FL suffer from high communication and computation overheads and many communication rounds.

In this work, we present SAFELearn, a generic design for efficient private FL systems that protects against inference attacks that have to analyze individual clients' model updates using secure aggregation. It is flexibly adaptable to the efficiency and security requirements of various FL applications and can be instantiated with MPC or FHE. In contrast to previous works, we only need 2 rounds of communication in each training iteration, do not use any expensive cryptographic primitives on clients, tolerate dropouts, and do not rely on a trusted third party. We implement and benchmark an instantiation of our generic design with secure two-party computation. Our implementation aggregates 500 models with more than 300K parameters in less than 0.5 seconds.

Index Terms—Federated Learning, Inference Attacks, Secure Computation, Data Privacy

* Please cite the version of this paper published at the IEEE Deep Learning and Security Workshop (DLS'21) [1].

I. INTRODUCTION

Federated Learning (FL) has become a distributed machine learning (ML) paradigm since it was introduced by Google in 2017 [2]. It aims at improving privacy by enabling data owners to efficiently train a model on their joint training data with the help of a central aggregator and without sharing their potentially sensitive data with each other or with the aggregator. FL offers efficiency and scalability as the training task is distributed between many participants and executed in parallel [3], [4]. Possible applications include, e.g., next word prediction for mobile keyboards from Google [5], the analysis of medical data [6], communication between vehicles [7], and intrusion detection systems [8]. While FL leverages the power of the massive amount of data available at edge devices nowadays, it improves data privacy by enabling clients to keep their data locally [5]. This becomes particularly relevant not only because of legal obligations such as the GDPR [9] and HIPAA [10], but also in general when working with personal and sensitive data, like in the health sector where ML gets increasing attention. In applications deployed on end-users' devices, FL helps to increase acceptance as the user's data never leaves their device, such that more users might be willing to contribute to a training. A real-world example for such a system is GBoard, a mobile keyboard for Android using FL for efficiently training word suggestions [5], which was downloaded more than a million times by the end of 2020 [11].

Despite these benefits, FL is vulnerable to adversarial attacks aiming at extracting information about the used training data. In these so-called inference attacks, the adversary can, for example, infer if a specific image was used in training an image classifier by analyzing the model updates [12], [13]. This violates the principal design goal of FL, i.e., protecting data privacy. Several secure aggregation protocols have been proposed to address this problem by hindering the aggregator from analyzing clients' model updates [3], [14]–[23]. However, existing approaches are inefficient, impractical, and/or rely on a trusted third party (TTP) [15]–[17]. In particular, they are computationally expensive [3], [14], [21], [24], increase the number of communication rounds [14], [22], and do not tolerate dropouts [15], [16]. Especially the increase in communication rounds is problematic, as FL is typically used in a mobile setting where mobile or edge devices are involved and the network tends to be unstable, slow, and with low bandwidth [2]. Mobile devices regularly go offline, such that dropouts must be tolerated. Most importantly, these aggregation schemes hinder the aggregator from accessing the local updates, therefore making it impossible to analyze these updates for malicious client behavior that sabotages the training [25].

Our Contributions and Outline. In this work, we introduce SAFELearn, an efficient secure aggregation system, prohibiting access to model updates to impede powerful inference attacks on FL. In particular, we provide the following contributions after giving the preliminaries in §II:
• We survey state-of-the-art secure aggregation protocols for FL and analyze their limitations (§III).
• We introduce a generic design called SAFELearn for secure aggregation for FL. It is adaptable to various security and efficiency requirements and multiple aggregation mechanisms (§IV).
• We implement and benchmark an instantiation of SAFELearn using secure two-party computation on multiple FL applications and datasets (§V). Our system aggregates 500 models with more than 300K parameters in less than 0.5 seconds. Our implementation is available as open source at https://github.com/TRUST-TUDa/SAFELearn.

II. PRELIMINARIES

In this section, we introduce FL and inference attacks on FL as well as the cryptographic building blocks we use.

A. Federated Learning (FL)

Federated Learning (FL) [2], [4] is a concept for distributed machine learning that links K clients and an aggregator A who collaboratively build a global model G. In each training iteration t, A chooses a subset of the K clients and sends the current global model G_{t-1} to them. Instead of sharing gradients after each training iteration as in standard distributed machine learning, each of these clients i ∈ K then trains G_{t-1} on multiple batches of its training data for multiple epochs before sending the resulting locally updated model W_i to the aggregator. Then, A aggregates the received updates W_i into the global model G_t. FL results in fewer global training iterations than in standard distributed machine learning and, hence, in less communication.

Several aggregation mechanisms have been proposed for FL: (1) Federated Averaging (FedAvg) [2], (2) Krum [26], (3) Adaptive Federated Averaging [27], and (4) Trimmed mean or median [28]. In this work, we focus on FedAvg, which is the original FL aggregation mechanism, because it is commonly applied in FL and related work on secure aggregation [3], [14], [19]–[21]. In FedAvg, the global model is updated by summing the models weighted by the number of training samples used to train them: G_t = Σ_{i=1}^{|K|} (s_i × W_i) / s, where K is the set of clients, s_i = ‖D_i‖ for training data D_i of a client i ∈ K, s = Σ_{i=1}^{|K|} s_i, and W_i is client i's update [2]. To hinder a malicious client from exaggerating its dataset's size to amplify the effect of its update, previous works employ equal weights (s_i = 1, s = K) for all clients' contributions [2], [25], [26]. We adopt this approach in §IV.
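For reference, the following minimal plaintext sketch implements the FedAvg rule above with equal weights (s_i = 1). The function name and the representation of client updates as PyTorch state_dicts are illustrative assumptions for this sketch, not part of the cited FL frameworks.

```python
import torch

def fedavg_equal_weights(client_updates):
    """Plaintext FedAvg with equal weights: G_t = (1/|K_t|) * sum_i W_i.

    client_updates: list of state_dicts (parameter name -> tensor),
    one per client selected in this training iteration.
    """
    num_clients = len(client_updates)
    global_model = {}
    for name in client_updates[0]:
        # Sum the parameter tensors of all clients, then divide by |K_t|.
        stacked = torch.stack([upd[name].float() for upd in client_updates])
        global_model[name] = stacked.sum(dim=0) / num_clients
    return global_model
```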
B. Inference Attacks on FL

In an inference attack, an adversary aims at learning information about the data used for training a ML model. Membership inference attacks determine whether certain samples were used for training [29], property inference attacks infer properties of training samples independent of the original learning task [13], distribution estimation attacks estimate the proportions of training labels in the data [30], and reconstruction attacks reconstruct training samples [30]. Another distinction can be made between black box attacks, which are restricted to interpreting the model predictions [30], and white box attacks, which use model parameters of either the trained model or of the clients' updates during the training [29], [30].

FL protects the privacy of the clients' data against inference attacks run by third parties, as they can only access the global model and cannot relate the information inferred from a global model to a specific client. Additionally, attacks on the global model tend to be weak and fail to achieve good attack performance [29]. However, the aggregator in FL has access to the local updates of each client, making FL vulnerable to strong inference attacks by a corrupted aggregator. Thus, in this work, we aim at hindering the aggregator from accessing clients' updates to prohibit powerful inference attacks that leverage individual local updates of clients while enabling efficient FL.

C. Secure Multi-Party Computation (MPC)

Secure Multi-Party Computation (MPC) enables the secure evaluation of a public function on private data provided by N mutually distrusting parties [31]. Secure two-party computation (STPC) [32]–[35], a special case of MPC with two parties (N = 2), allows two parties to securely evaluate a function on their private inputs. Thereby, the parties only have access to so-called secret shares of the inputs that are completely random and therefore do not leak any information. The real value can only be obtained if both shares are combined. STPC can be used in an outsourcing scenario [33], where an arbitrary number of weak and even malicious clients can secret-share their private inputs among two non-colluding but well-connected and powerful servers who then run the STPC protocol.
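As a minimal illustration of additive secret sharing in the outsourcing scenario described above, the following Python sketch splits a value into two shares for two non-colluding servers; the ring size and function names are illustrative choices for this sketch and not the STPC protocols [32]–[35] themselves.

```python
import secrets

RING = 2 ** 32  # additive secret sharing over the ring Z_{2^32} (illustrative choice)

def share(x):
    """Split x into two additive shares that are individually uniformly random."""
    r = secrets.randbelow(RING)
    return r, (x - r) % RING  # share for server 0, share for server 1

def reconstruct(share0, share1):
    """Only the combination of both shares reveals the secret value."""
    return (share0 + share1) % RING

# A client secret-shares its private input among two non-colluding servers.
x = 42
s0, s1 = share(x)
assert reconstruct(s0, s1) == x
# Each server can locally add the shares of different clients; the sum of the
# per-server results is a sharing of the sum of the clients' inputs.
```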
D. Homomorphic Encryption (HE)

Homomorphic Encryption (HE) enables computations on encrypted data. It allows to perform operations on a ciphertext, the decryption of which corresponds to algebraic operations on the plaintext. HE schemes can be classified into partially (PHE), somewhat (SHE), or fully homomorphic encryption (FHE). PHE [36] supports either multiplication or addition under encryption, SHE supports a bounded number of both, and FHE [37] supports both without limitations.

E. Differential Privacy

Although Differential Privacy (DP) [38] is not the focus of our work, we shortly summarize it here for the sake of completeness as it is used in some related works (cf. §III). Informally, DP [38] randomizes the result of an evaluation (e.g., by adding noise) to reduce information leakage. More formally, a randomized algorithm M with domain D satisfies (ε, δ)-DP if for all adjacent datasets d, d* ∈ D and for all S ⊆ Range(M) it holds that Pr[M(d) ∈ S] ≤ e^ε · Pr[M(d*) ∈ S] + δ.
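As a minimal illustration of how the related works in §III typically apply DP, the following sketch releases a clipped vector with Gaussian noise; the clipping bound and noise multiplier are illustrative assumptions, and the formal (ε, δ) accounting is omitted.

```python
import numpy as np

def gaussian_mechanism(update, clip_norm=1.0, noise_multiplier=1.1):
    """Release a vector with Gaussian-mechanism noise.

    Clipping bounds the L2 norm (and thus the sensitivity) of the released
    vector; noise_multiplier controls the strength of the resulting
    (epsilon, delta) guarantee. The concrete accounting is out of scope here.
    """
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    noise = np.random.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

# Example: noise a single (clipped) update vector before it is released.
noisy_update = gaussian_mechanism(np.random.randn(1000))
```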
III. RELATED WORK

Several works aim at improving data privacy in FL. They typically prevent access to the local updates using secret sharing techniques, encryption, and/or reduce information leakage by applying noise to achieve differential privacy (DP) [38]. We survey these works in the following. Tab. I shows a comparison between SAFELearn and previous works with respect to their usability and their privacy guarantees.

A. Secure Aggregation for FL with Secret Sharing

Bonawitz et al. [14] introduced secure aggregation for FL. Their protocol can tolerate client dropouts. They use blinding with random values, Shamir's Secret Sharing (SSS) [40], and symmetric encryption to prohibit access to local models. However, their aggregation requires at least 4 communication rounds between each client and the aggregator in each iteration. This causes a significant overhead for clients typically connected via WAN and with limited resources.

VerifyNet [17] and VeriFL [23] use the protocol of Bonawitz et al. [14]. VerifyNet and VeriFL add verifiability on top of [14] to guarantee the correctness of the aggregation, but these protocols rely on a trusted party to generate public/private key pairs for all clients.

Recently, the authors of [3] and [20] introduced secure aggregation protocols with polylogarithmic communication and computation complexity which reduce the overhead compared to [14]. Their key idea is to replace the star topology of the communication network in [14] by random subgroups of clients and to use secret sharing only for a subset of clients instead of for all client pairs. Both approaches require 3 rounds of interaction between the server and clients.

FastSecAgg [21] provides a secure aggregation based on Fast Fourier Transform multi-secret sharing. It is robust against adaptive adversaries where the clients can adaptively be corrupted during the execution of the protocol. FastSecAgg is a 3-round interactive protocol for private FL.

Turbo-Aggregate [22] reduces the communication and computation overhead of secure aggregation over [14] (cf. Tab. II) and uses a circular communication topology. The main bottleneck of Turbo-Aggregate is its O(n/log n) round complexity, where n is the number of updates/clients (cf. §IV).

SAFER [19] reduces communication costs in FL by compressing updates and combines this with a secure aggregation protocol based on arithmetic sharing. However, SAFER considers only training with less than 10 clients and no dropouts. Moreover, SAFER was only benchmarked on independent and identically distributed (IID) data such that it is unclear if it works with the typically non-IID data used in FL.

B. Secure Aggregation for FL with Encryption

Truex et al. [15] combine additively homomorphic encryption (HE) with DP but cannot tolerate client dropouts. Using HE results in a significant runtime overhead and their system also requires 3 rounds of communication. These aspects make it impractical for real-world FL.

EaSTFLy [18] uses either additively HE with packing or Shamir's secret sharing (SSS) [40] in combination with quantization. The clients share their gradients after each training iteration instead of using FL's FedAvg mechanism [14], which significantly increases the number of training iterations. FedAvg requires division, which is not possible with additively HE/SSS. Furthermore, in EaSTFLy's HE protocol all clients have to hold the same secret key such that if a client colludes with the aggregator all updates can be decrypted.

BatchCrypt [24] reduces the encryption and communication overhead of HE-based aggregation with a batch encryption technique and requires only a single round of communication. Again, using expensive HE (like [15], [18]) makes it unusable for real-world training with FL.

HybridAlpha [16] uses functional encryption and DP. With functional encryption, public keys for all clients are derived from a private/public master key pair. It improves [15]'s runtime by a factor of 2× and tolerates dropouts. However, HybridAlpha relies on a trusted party that holds the master keys and controls if the aggregator manipulates the aggregation weights.

POSEIDON [39] encrypts the complete FL process including the local training executed by the clients and, thus, adds significant computational overhead on each client's device. The authors suggest to reduce the clients' communication by combining them in a tree-like network instead of the classical star topology where each client directly communicates with the central aggregator. Additionally, a distributed bootstrapping efficiently refreshes ciphertexts and an alternating packing approach enhances the efficiency of neural network training under encryption. However, POSEIDON only supports clients' dropouts when the decentralized bootstrapping is not used.

Generally, all existing protocols for secure aggregation hinder the aggregators from deploying defenses against so-called backdooring or poisoning attacks [25], [41] that aim at injecting a "backdoor" into the ML model, i.e., the model is manipulated such that it misclassifies a small set of attacker-chosen inputs as attacker-chosen outputs. Bagdasaryan et al. [25] show, inter alia, how a single client can manipulate FL by injecting a backdoor that causes green cars to be misclassified as birds. Such targeted image misclassification can, for example, be dangerous for face recognition systems deployed at airports. FLGUARD [42] and BaFFLe [43] combine secure aggregation with defenses against backdoor injections.

C. General Approaches to Privacy-preserving ML

Other works such as [44]–[47] combine secure computation techniques and/or DP with ML. However, these designs are often tailored for specific ML algorithms and do not focus on FL such that they cannot handle client dropouts, do not scale to a large number of clients, use very expensive cryptography, and/or are not suitable for distributed training.

IV. PRIVATE FEDERATED LEARNING

TABLE I. Comparison of privacy-preserving FL frameworks. Our privacy analysis includes the involvement of a trusted third party and if the schemes are adaptable to active security. Our usability analysis includes the communication round-efficiency, robustness to dynamic client dropout, the usage of expensive cryptographic operations, and the availability of open-source code.
Criteria (Privacy / Usability): No Trusted Party | Extend-to-Active | Round-efficient | Dropout | No Expensive Operations | Open-Source.
Compared approaches: Truex et al. [15]*, HybridAlpha [16]*†, Bonawitz et al. [14]‡, BatchCrypt [24]¹, VeriFL [23], Choi et al. [20], FastSecAgg [21], SAFER [19], POSEIDON [39]?, Bell et al. [3], Turbo-Aggregate [22], and SAFELearn (this work)².
* Assumes a trusted third party for key distribution. † Considering the key distribution communication round. ‡ Requires an extra round. ? Can support dropouts when the decentralized bootstrapping is not used. ¹ https://github.com/marcoszh/BatchCrypt ² https://github.com/TRUST-TUDa/SAFELearn

In this section, we introduce SAFELearn for the secure aggregation of clients' updates in FL to hamper powerful inference attacks analyzing the clients' updates [29]. SAFELearn has three major design goals to overcome the limitations of previous work (cf. §III): (G1) prohibiting access to individual model updates to counter powerful inference attacks, (G2) efficiency, and (G3) tolerance to dropouts.

A. Adversary Model — Goals and Capabilities

The adversary is a semi-honest aggregator¹ such that we assume it follows the protocol honestly, but attempts to infer sensitive information about clients' data D_i from their model updates W_i [48], [49]. In the standard FL setting, the aggregator has access to all local model updates W_i, such that it can perform model inference attacks on each local model to extract information about the corresponding participant's data D_i used for training. Some existing attacks, e.g., [50], consider that the adversary (either a semi-honest aggregator or client) aims at inferring information about training data from the global model G_t. However, these attacks obtain only negligible aggregated information about the data (and cannot link it to individual clients), like the number of classes [50]. For example, Nasr et al. [29] show that the success of membership inference attacks on only the global model significantly degrades with an increasing number of clients. Therefore, our goal is to hide local models from the aggregator to impede powerful inference attacks while still enabling efficient and accurate FL.

¹ Can be extended to active security, cf. §V.

B. SAFELearn
The simple and generic design of SAFELearn that realizes the adapted FedAvg with equal weights (cf. §II) in a private manner is depicted in Alg. 1.

Algorithm 1 SAFELearn
1: Input: K, G_0, T   ▷ K: set of clients, G_0: initial global model, T: # iterations
2: Output: G_T   ▷ G_T is the global model after T iterations
3: for each training iteration t ∈ [1, T] do
4:   for each client i ∈ K_t ⊆ K do   ▷ K_t ⊆ K is randomly chosen in every iteration
5:     [W_i] ← CLIENTUPDATE([G_{t-1}])
6:   end for
7:   [G_t] ← Σ_{i=1}^{|K_t|} [W_i] / |K_t|
8: end for

It takes the set of clients K, the initial global model G_0, and the number of training iterations T as input. Then, in each iteration t ∈ [1, T], a random subset of clients K_t ⊆ K is chosen following the original design of FL [2]. Each client i ∈ K_t receives the encrypted/secret-shared global model [G_{t-1}] from the aggregator(s), which it decrypts to train a new updated local model W_i. The client sends the encrypted/secret-shared update [W_i] to the aggregator(s) for the aggregation of a new global model [G_t] in Line 7. Depending on the efficiency and security requirements of the concrete application, SAFELearn can be realized with fully homomorphic encryption (FHE), multi-party computation (MPC), or secure two-party computation (STPC). These secure computation techniques ensure that the aggregator cannot access clients' model updates and intermediate global models to effectively thwart powerful inference attacks (G1, cf. §IV).

1) FHE: If FHE with a single semi-honest server (acting as the aggregator) is used, [G_t] and [W_i] are the encryptions of the models' parameters. The clients use a multi-party encryption scheme [51] for encrypting their updates. The secret key is securely split among the clients such that each of them can decrypt and access the global updates in plaintext for local training. The clients then re-encrypt the resulting local model updates and return them to the server, who aggregates the encrypted data into the new global model [G_t] using the additive homomorphic properties of the encryption scheme. Such FHE is computationally expensive and should only be used if using non-colluding servers is not possible due to the application and its security requirements.

2) MPC/STPC: If MPC/STPC is used, [G_t] and [W_i] are a set of secret shares of the models created with the Arithmetic sharing technique [52] (cf. §V-A) and held by the N ≥ 2 non-colluding servers. They jointly run the secure aggregation on these shares (i.e., the N servers together compose the aggregator). After the secure aggregation, in which the division is realized with Boolean circuit-based protocols, namely Yao's garbled circuits for STPC [32] or BMR for MPC [31], that can efficiently compute non-linear operations, the servers send the secret shares of the new global model to the clients, who combine them to receive the plaintext global model for the next local training iteration. Afterwards, they again secretly share their updates and send one share to each server.
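To make the MPC/STPC instantiation above concrete, the following minimal two-server sketch aggregates fixed-point encoded, arithmetically shared updates. The ring size, fixed-point precision, and the plaintext division after reconstruction are illustrative simplifications for this sketch; SAFELearn evaluates the division inside the STPC protocol using a Boolean (Yao) circuit.

```python
import secrets
import numpy as np

RING = 2 ** 32          # ring for arithmetic sharing (illustrative)
FRACTIONAL_BITS = 16    # fixed-point precision (illustrative)

def encode(update):
    """Fixed-point encode a float vector into the ring."""
    return np.round(update * (1 << FRACTIONAL_BITS)).astype(np.int64) % RING

def share(encoded):
    """Additively share an encoded update between two servers."""
    r = np.array([secrets.randbelow(RING) for _ in encoded], dtype=np.int64)
    return r, (encoded - r) % RING

# Each client shares its (fixed-point) update; each server only sees one share.
clients = [np.random.randn(8) * 0.01 for _ in range(5)]
shares0, shares1 = zip(*(share(encode(w)) for w in clients))

# Each server locally sums the shares it holds (addition needs no interaction).
sum0 = np.sum(shares0, axis=0) % RING
sum1 = np.sum(shares1, axis=0) % RING

# Reconstruction of the summed update; the division by |K_t| happens in
# plaintext here only for illustration (SAFELearn performs it under STPC).
total = (sum0 + sum1) % RING
total = np.where(total >= RING // 2, total - RING, total)  # map back to signed values
global_update = (total / (1 << FRACTIONAL_BITS)) / len(clients)
assert np.allclose(global_update, np.mean(clients, axis=0), atol=1e-4)
```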
3) Secure Aggregation in SAFELearn: While we used FedAvg as an example in Line 7 of Alg. 1 given its popularity in the FL literature (cf. §II-A), different kinds of aggregation can also be realized with SAFELearn. Concretely, MPC and STPC support arbitrary computations expressed as Boolean circuits. Hence, also different aggregation mechanisms can be realized in a straightforward fashion. For example, the aggregation mechanism of Krum [26] consists of Argmin, multiplication, and addition operations. These can be realized by a combination of different MPC/STPC protocols: Boolean sharing for the secure evaluation of Argmin and Arithmetic sharing for the secure evaluation of multiplication and addition operations. Similarly, also other aggregation mechanisms (e.g., [27], [28]) can be realized by combining different MPC/STPC techniques. Moreover, we can also adopt other aggregation mechanisms for SAFELearn with FHE, as FHE supports an unlimited number of additions and multiplications and can, e.g., also realize the (approximated) determination of minimums² [53], [54].

² This typically has a higher computational overhead than with MPC/STPC.

C. Privacy & Usability of SAFELearn

SAFELearn needs only 2 rounds of communication per iteration (Line 5). It allows an arbitrary number of clients to drop out and rejoin as the aggregator(s) can simply adjust the division factor |K_t| by the number of clients that respond to CLIENTUPDATE(..). Thus, SAFELearn offers efficiency (G2, §IV) with respect to the number of communication rounds³ and tolerance to dropouts (G3, §IV).

The aggregating server(s) only learn the number of clients in each training iteration and, hence, SAFELearn effectively prevents the individual aggregation server(s) from running inference attacks on the clients' local model updates. Moreover, the aggregator(s) only hold secret-shared or encrypted global models such that they also cannot run inference attacks on the global model. Even when the adversary controls a client, who still has access to the plaintext global model parameters, the aggregation of the models averages the parameters of all contributing clients and, thus, makes inference attacks harder and less effective [29]. Additionally, information extracted from a global model cannot be linked to a single client. Therefore, SAFELearn supports the anonymisation of the individual contributions.

To summarize, SAFELearn is a generic system for secure aggregation in FL and supports a wide range of applications by choosing the number of servers based on the specific security and efficiency requirements. It addresses all design goals (G1-G3) from the beginning of §IV. FHE/MPC/STPC support many operations such as addition, multiplication, and comparison. Those atomic operations also enable to privately realize aggregation functions beyond FedAvg, e.g., Krum [26]. For our MPC/STPC-based version of SAFELearn, this only requires to create the corresponding Boolean/Arithmetic circuit, which can be automated with tools like HyCC [55]. Additionally, efficiency at the server side could be further improved by decrypting the intermediate global model in Line 7 of Alg. 1 after the aggregation and adding noise instead to achieve DP.

³ Its efficiency w.r.t. communication and computation heavily depends on its concrete instantiation and cannot be generally assessed. We benchmark it for an instantiation with STPC in §V to show its practicality.

V. EXPERIMENTAL EVALUATION OF SAFELEARN WITH STPC

Our framework SAFELearn is generic and can be instantiated with one (FHE) or multiple non-colluding servers (MPC/STPC). We implement one instantiation of SAFELearn using STPC as it is often a good trade-off between security and efficiency. We securely outsource the computation of the SAFELearn algorithm to two servers that are (1) non-colluding and (2) semi-honest. These properties and assumptions are described and justified next.

1) Non-colluding semi-honest servers: We assume that the two servers are non-colluding. In our envisioned FL applications mentioned in §I, the two non-colluding servers could, e.g., be run by two competing antivirus software companies for network intrusion detection or by two competing smartphone manufacturers for word prediction. These parties are assumed to not collude in order to protect their business secrets and customers' data. Moreover, the two servers are assumed to be semi-honest, meaning that they honestly follow the protocol, but seek to learn as much information as possible. Service providers, like antivirus companies or smartphone manufacturers, have an inherent motivation to follow the protocol because they want to offer a privacy-preserving service to their customers, and if cheating were detected, this would seriously damage their reputation, which is the foundation of their business models. This assumption of non-colluding semi-honest servers is also justified by legal regulations like the GDPR [9] that mandate companies to properly protect users' data. Moreover, it allows to build highly efficient STPC protocols. Hence, it is very common in private ML applications, e.g., [34], [46].
2) Two servers: We choose N = 2 servers, i.e., STPC, as a reasonable trade-off between security and efficiency: with only one server, we would not need the non-collusion assumption, but this requires very expensive cryptographic primitives like fully homomorphic encryption [37] or several rounds of interaction with the clients. As discussed in §I, it is beneficial to minimize the number of communication rounds because of the mobile setting with unstable and slow connections between clients and the aggregator in which FL is typically used. Using protocols such as [44] with three or even more non-colluding servers, of which at most one can be corrupted, allows to construct even more efficient protocols than with two servers, but this has a larger attack surface because an attacker can attack any of the N ≥ 3 servers and also more non-colluding parties have to be found to run these servers. However, our implementation can be easily extended to 3 semi-honest servers by using the ABY3 [44] framework, to N semi-honest servers by using the MOTION [31] framework, or to N malicious servers by using MP-SPDZ [56].

A. Benchmarks

For our instantiation, we use a combination of two STPC techniques, which are implemented with state-of-the-art optimizations in the ABY framework [33]: Boolean sharing using Yao's garbled circuits [32] for the secure evaluation of Boolean division circuits in a constant number of rounds, as well as Arithmetic sharing for the secure evaluation of additions using the GMW protocol of Goldreich-Micali-Wigderson [52]. We use the PyTorch framework [57] for neural network training. All STPC results are averaged over 10 experiments and run on two separate servers with Intel Core i9-7960X CPUs with 2.8 GHz and 128 GB RAM connected over a 10 Gbit/s LAN with 0.2 ms RTT.

B. Applications

We test SAFELearn on three datasets for typical FL applications:

1) Natural Language Processing (NLP): We use a recurrent neural network with 20M parameters built from two long short-term memory (LSTM) layers and one linear output layer [25]. In each iteration t, K_t = 100 clients are randomly chosen to train the model. We use the Reddit dataset from November 2017 [58] with 20.6M records. Each Reddit user with at least 150 posts and less than 500 posts is considered as a FL client. We generated a dictionary based on the most frequent 50 000 words.

2) Image Classification (IC): Following [25], we used the CIFAR-10 dataset [59] with 50 000 images and a lightweight version of ResNet-18 with 2.7M parameters consisting of 4 convolutional layers and a fully connected output layer. We split the dataset among 100 clients as done in [25]. In each training iteration, the clients locally update the model with a learning rate of 0.1.

3) Network Intrusion Detection System (NIDS): We use the IoT NIDS DÏoT [8] with three datasets by [8], [60] and one self-collected dataset from homes and offices located in Germany and Australia. Following [8], we extracted device-type-specific datasets capturing the communication behavior of a smart weather station. We simulate the FL setup by splitting the data among 106 clients, each having three hours of traffic measurements, and select 100 clients from them at random in each training iteration. The learning rate is 0.1.

C. Impact on the Accuracy of the Resulting Model

To measure SAFELearn's impact on the model's accuracy, we run experiments on all three datasets presented in §V-B. Tab. III shows our results as well as the experimental setup, including the number of local training epochs, the number of previously trained rounds, and the total number of clients for that dataset, from which a subset of 100 clients is randomly chosen to perform the training in each training iteration. The results show that SAFELearn has the same accuracy as plaintext FedAvg.

D. Efficiency of SAFELearn

The results of our efficiency evaluation of SAFELearn with STPC between the two servers for the three datasets with different numbers of clients per training iteration, ranging from 10 to 500⁴, are shown in Figs. 1 and 2. The runtime scales linearly with the number of clients and the communication is about constant. For NIDS, aggregating 500 models takes 0.5 seconds and the communication between the two servers is 8 MB. Even for the very large NLP model with more than 20M parameters, the aggregation takes less than 80 seconds and 316 MB between the two servers.

Fig. 1. Communication Costs per Server in SAFELearn (communication in bytes vs. number of clients for NLP, IC, and NIDS).

Fig. 2. Total Execution Time of SAFELearn (runtime in seconds vs. number of clients for NLP, IC, and NIDS).

⁴ We aim at assessing SAFELearn's communication and computation complexity here. Thus, this range is already sufficient to show the approximately constant communication costs and the linear scaling of the computation costs w.r.t. the number of clients.

E. Analytical Complexity

Tab. II shows the substantially improved complexities of SAFELearn over the five previous works on secure aggregation for FL that consider dropouts and are not based on computationally expensive HE [3], [14], [20]–[22].

Approach Computation (Server) Communication (Server) Computation (Client) Communication (Client) Rounds Turbo-Aggregate [22] O(m log n log2 log n) O(mn log n) O(m log n log2 log n) O(m log n) n/ log n Bonawitz et al. [14] O(mn2) O(mn + n2) O(mn + n2) O(m + n) 4 Bell et al. [3] O(mn log n + n log2 n) O(mn + n log n) O(m log n + log2 n) O(m + log n) 3 2 FastSecAgg [21] O(m log n) O√(mn + n ) O(m log√n) √O(m + n) 3 Choi et al. [20] O(mn log n) O(n n log n + mn) O(n log n + m n log n) O( n log n + m) 3 SAFELearn (This work) O(mn) O(mn) O(m) O(m) 2

TABLE III. Experimental setup and accuracy of FedAvg and SAFELearn for the FL applications Natural Language Processing (NLP), Image Classification (IC), and Network Intrusion Detection System (NIDS).

                          NLP       IC        NIDS
Local Epochs              250       2         10
Pretrained Rounds         5 000     10 000    10
Available Clients         80 000    100       106
FedAvg                    22.5%     91.7%     100.0%
SAFELearn (This work)     22.5%     91.7%     100.0%

VI. CONCLUSION

In this paper, we introduce SAFELearn, a generic private federated learning design that enables to efficiently thwart strong inference attacks that need access to clients' individual model updates. Moreover, SAFELearn tolerates dropouts and does not require expensive cryptographic operations, making it more efficient than previous works with respect to communication and computation. Furthermore, it does not rely on a trusted third party. Our evaluation shows that aggregating 500 models with more than 300K parameters takes less than 0.5 seconds on commodity hardware.

Future work can realize more instantiations of SAFELearn. Also the combination of privacy and security in FL, which was considered to be contradicting by Bagdasaryan et al. [25], can be investigated. SAFELearn's design could also enable to integrate a defense against manipulations of malicious clients.

Acknowledgements. This project received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (grant agreement No. 850990 PSOTI), was co-funded by the DFG — SFB 1119 CROSSING/236615297 and GRK 2050 Privacy & Trust/251805230, and by the BMBF and HMWK within ATHENE. It was partially funded by the European Commission through the SHERPA Horizon 2020 project under grant agreement No. 786641. It was partially funded by the Private AI Collaborative Research Institute (PrivateAI) established by Intel, Avast and Borsetta.

REFERENCES

[1] H. Fereidooni, S. Marchal, M. Miettinen, A. Mirhoseini, H. Möllering, T. D. Nguyen, P. Rieger, A.-R. Sadeghi, T. Schneider, H. Yalame, and S. Zeitouni, "SAFELearn: Secure Aggregation for private FEderated Learning," in DLS, 2021.
[2] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. Arcas, "Communication-Efficient Learning of Deep Networks from Decentralized Data," in International Conference on Artificial Intelligence and Statistics, 2017.
[3] J. H. Bell, K. A. Bonawitz, A. Gascón, T. Lepoint, and M. Raykova, "Secure Single-Server Aggregation with (Poly)logarithmic Overhead," in CCS, 2020.
[4] K. Bonawitz, H. Eichner, W. Grieskamp, D. Huba, A. Ingerman, V. Ivanov, C. Kiddon, J. Konecný, S. Mazzocchi, B. McMahan, T. V. Overveldt, D. Petrou, D. Ramage, and J. Roselander, "Towards Federated Learning at Scale: System Design," in SysML, 2019.
[5] B. McMahan and D. Ramage, "Federated Learning: Collaborative Machine Learning without Centralized Training Data," Google Research Blog, Google AI, 2017, https://ai.googleblog.com/2017/04/federated-learning-collaborative.html.
[6] T. S. Brisimi, R. Chen, T. Mela, A. Olshevsky, I. C. Paschalidis, and W. Shi, "Federated Learning of Predictive Models from Federated Electronic Health Records," International Journal of Medical Informatics, 2018.
[7] S. Samarakoon, M. Bennis, W. Saad, and M. Debbah, "Federated Learning for Ultra-Reliable Low-Latency V2V Communications," in GLOBECOM, 2018.
[8] T. D. Nguyen, S. Marchal, M. Miettinen, H. Fereidooni, N. Asokan, and A.-R. Sadeghi, "DÏoT: A Federated Self-learning Anomaly Detection System for IoT," in ICDCS, 2019.
[9] "General Data Protection Regulation," 2018, https://eur-lex.europa.eu/eli/reg/2016/679/oj.
[10] "Health Insurance Portability and Accountability Act," 1996, https://www.govinfo.gov/content/pkg/PLAW-104publ191/pdf/PLAW-104publ191.pdf.
[11] "Gboard - the Google Keyboard - Apps on Google Play." [Online]. Available: https://play.google.com/store/apps/details?id=com.google.android.inputmethod.latin
[12] N. Carlini, C. Liu, Ú. Erlingsson, J. Kos, and D. Song, "The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks," in USENIX Security, 2019.
[13] L. Melis, C. Song, E. De Cristofaro, and V. Shmatikov, "Exploiting Unintended Feature Leakage in Collaborative Learning," in S&P, 2019.
[14] K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal, and K. Seth, "Practical Secure Aggregation for Privacy-preserving Machine Learning," in CCS, 2017.
[15] S. Truex, N. Baracaldo, A. Anwar, T. Steinke, H. Ludwig, and Y. Zhou, "A Hybrid Approach to Privacy-preserving Federated Learning," in AISec, 2019.
[16] R. Xu, N. Baracaldo, Y. Zhou, A. Anwar, and H. Ludwig, "HybridAlpha: An Efficient Approach for Privacy-preserving Federated Learning," in AISec, 2019.
[17] G. Xu, H. Li, S. Liu, K. Yang, and X. Lin, "VerifyNet: Secure and Verifiable Federated Learning," in Transactions on Information Forensics and Security, 2020.
[18] Y. Dong, X. Chen, L. Shen, and D. Wang, "EaSTFLy: Efficient and Secure Ternary Federated Learning," in Computers & Security, 2020.
[19] C. Beguier and E. Tramel, "SAFER: Sparse Secure Aggregation for Federated Learning," 2020, https://arxiv.org/abs/2007.14861.
[20] B. Choi, J.-y. Sohn, D.-J. Han, and J. Moon, "Communication-Computation Efficient Secure Aggregation for Federated Learning," 2020, https://arxiv.org/abs/2012.05433.
[21] S. Kadhe, N. Rajaraman, O. O. Koyluoglu, and K. Ramchandran, "FastSecAgg: Scalable Secure Aggregation for Privacy-Preserving Federated Learning," ICML Workshop on Federated Learning for User Privacy and Data Confidentiality, 2020.
[22] J. So, B. Güler, and A. S. Avestimehr, "Turbo-Aggregate: Breaking the Quadratic Aggregation Barrier in Secure Federated Learning," Journal on Selected Areas in Information Theory, 2021.
[23] X. Guo, Z. Liu, J. Li, J. Gao, B. Hou, C. Dong, and T. Baker, "VeriFL: Communication-Efficient and Fast Verifiable Aggregation for Federated Learning," TIFS, 2020.
[24] C. Zhang, S. Li, J. Xia, W. Wang, F. Yan, and Y. Liu, "BatchCrypt: Efficient Homomorphic Encryption for Cross-Silo Federated Learning," in USENIX ATC, 2020.
[25] E. Bagdasaryan, A. Veit, Y. Hua, D. Estrin, and V. Shmatikov, "How to Backdoor Federated Learning," in AISTATS, 2020.
[26] P. Blanchard, E. M. El Mhamdi, R. Guerraoui, and J. Stainer, "Machine Learning with Adversaries: Byzantine Tolerant Gradient Descent," in NIPS, 2017.
[27] L. Muñoz-González, K. T. Co, and E. C. Lupu, "Byzantine-Robust Federated Machine Learning through Adaptive Model Averaging," 2019, https://arxiv.org/abs/1909.05125.
[28] D. Yin, Y. Chen, R. Kannan, and P. Bartlett, "Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates," in ICML, 2018.
[29] M. Nasr, R. Shokri, and A. Houmansadr, "Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning," in S&P, 2019.
[30] A. Salem, A. Bhattacharya, M. Backes, M. Fritz, and Y. Zhang, "Updates-Leak: Data Set Inference and Reconstruction Attacks in Online Learning," in USENIX Security, 2020.
[31] L. Braun, D. Demmler, T. Schneider, and O. Tkachenko, "MOTION - A Framework for Mixed-Protocol Multi-Party Computation," 2020.
[32] A. C.-C. Yao, "How to Generate and Exchange Secrets," in FOCS, 1986.
[33] D. Demmler, T. Schneider, and M. Zohner, "ABY - A Framework for Efficient Mixed-Protocol Secure Two-Party Computation," in NDSS, 2015.
[34] A. Patra, T. Schneider, A. Suresh, and H. Yalame, "ABY2.0: Improved Mixed-Protocol Secure Two-Party Computation," in USENIX Security, 2020.
[35] H. Yalame, H. Farzam, and S. Bayat-Sarmadi, "Secure Two-Party Computation Using an Efficient Garbled Circuit by Reducing Data Transfer," in Applications and Techniques in Information Security, 2017.
[36] P. Paillier, "Public-key Cryptosystems Based on Composite Degree Residuosity Classes," in EUROCRYPT, 1999.
[37] C. Gentry, "A Fully Homomorphic Encryption Scheme," Ph.D. dissertation, Stanford University, 2009.
[38] C. Dwork and A. Roth, "The Algorithmic Foundations of Differential Privacy," in Foundations and Trends in Theoretical Computer Science, 2014.
[39] S. Sav, A. Pyrgelis, J. R. Troncoso-Pastoriza, D. Froelicher, J.-P. Bossuat, J. S. Sousa, and J.-P. Hubaux, "POSEIDON: Privacy-Preserving Federated Neural Network Learning," in NDSS, 2021.
[40] A. Shamir, "How to Share a Secret," in Communications of the ACM, 1979.
[41] T. D. Nguyen, P. Rieger, M. Miettinen, and A.-R. Sadeghi, "Poisoning Attacks on Federated Learning-Based IoT Intrusion Detection System," in Workshop on Decentralized IoT Systems and Security @ NDSS, 2020.
[42] T. D. Nguyen, P. Rieger, H. Yalame, H. Möllering, H. Fereidooni, S. Marchal, M. Miettinen, A. Mirhoseini, A.-R. Sadeghi, T. Schneider, and S. Zeitouni, "FLGUARD: Secure and Private Federated Learning," 2021, https://ia.cr/2021/025.
[43] S. Andreina, G. A. Marson, H. Möllering, and G. Karame, "BaFFLe: Backdoor Detection via Feedback-based Federated Learning," in ICDCS, 2021.
[44] P. Mohassel and P. Rindal, "ABY3: A Mixed Protocol Framework for Machine Learning," in CCS, 2018.
[45] N. Dowlin, R. Gilad-Bachrach, K. Laine, K. Lauter, M. Naehrig, and J. Wernsing, "CryptoNets: Applying Neural Networks to Encrypted Data with High Throughput and Accuracy," in ICML, 2016.
[46] F. Boemer, R. Cammarota, D. Demmler, T. Schneider, and H. Yalame, "MP2ML: A Mixed-Protocol Machine Learning Framework for Private Inference," in ARES, 2020.
[47] S. Wagh, S. Tople, F. Benhamouda, E. Kushilevitz, P. Mittal, and T. Rabin, "FALCON: Honest-Majority Maliciously Secure Framework for Private Deep Learning," PETS, 2021.
[48] A. Pyrgelis, C. Troncoso, and E. De Cristofaro, "Knock Knock, Who's There? Membership Inference on Aggregate Location Data," in NDSS, 2018.
[49] R. Shokri, M. Stronati, C. Song, and V. Shmatikov, "Membership Inference Attacks Against Machine Learning Models," in S&P, 2017.
[50] L. Wang, S. Xu, X. Wang, and Q. Zhu, "Eavesdrop the Composition Proportion of Training Labels in Federated Learning," 2019, https://arxiv.org/abs/1910.06044.
[51] C. Mouchet, J. R. Troncoso-Pastoriza, and J.-P. Hubaux, "Multiparty Homomorphic Encryption from Ring-Learning-With-Errors," 2020, https://ia.cr/2020/304.
[52] O. Goldreich, S. Micali, and A. Wigderson, "How to Play ANY Mental Game," in STOC, 1987.
[53] J. H. Cheon, D. Kim, D. Kim, H. H. Lee, and K. Lee, "Numerical Method for Comparison on Homomorphically Encrypted Numbers," in ASIACRYPT, 2019.
[54] J. H. Cheon, D. Kim, and D. Kim, "Efficient Homomorphic Comparison Methods with Optimal Complexity," in ASIACRYPT, 2020.
[55] N. Büscher, D. Demmler, S. Katzenbeisser, D. Kretzmer, and T. Schneider, "HyCC: Compilation of Hybrid Protocols for Practical Secure Computation," in CCS, 2018.
[56] M. Keller, "MP-SPDZ: A Versatile Framework for Multi-Party Computation," in CCS, 2020.
[57] "PyTorch," 2019, https://pytorch.org.
[58] "Reddit dataset," 2017, https://bigquery.cloud.google.com/dataset/fh-bigquery:reddit_comments.
[59] A. Krizhevsky and G. Hinton, "Learning Multiple Layers of Features from Tiny Images," Tech. Rep., 2009, https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf.
[60] A. Sivanathan, H. H. Gharakheili, F. Loi, A. Radford, C. Wijenayake, A. Vishwanath, and V. Sivaraman, "Classifying IoT Devices in Smart Environments Using Network Traffic Characteristics," in Transactions on Mobile Computing, 2018.