Enhancing the Privacy of Federated Learning with Sketching

Zaoxing Liu, Tian Li, Virginia Smith, Vyas Sekar
Carnegie Mellon University
arXiv:1911.01812v1 [cs.LG] 5 Nov 2019

ABSTRACT

In response to growing concerns about user privacy, federated learning has emerged as a promising tool to train statistical models over networks of devices while keeping data localized. Federated learning methods run training tasks directly on user devices and do not share the raw user data with third parties. However, current methods still share model updates, which may contain private information (e.g., one's weight and height), during the training process. Existing efforts that aim to improve the privacy of federated learning make compromises in one or more of the following key areas: performance (particularly communication cost), accuracy, or privacy. To better optimize these trade-offs, we propose that sketching algorithms have a unique advantage in that they can provide both privacy and performance benefits while maintaining accuracy. We evaluate the feasibility of sketching-based federated learning with a prototype on three representative learning models. Our initial findings show that it is possible to provide strong privacy guarantees for federated learning without sacrificing performance or accuracy. Our work highlights that there exists a fundamental connection between privacy and communication in distributed settings, and suggests important open problems surrounding the theoretical understanding, methodology, and system design of practical, private federated learning.

1. INTRODUCTION

Modern Internet-of-things devices, such as mobile phones, wearable devices, and smart homes, generate a wealth of data each day. This user data is crucial for vendors and service providers, who use it to continuously improve their products and services through machine learning. However, the processing of user data raises critical privacy concerns, which have led to the recent interest in federated learning [31]. Federated learning explores training statistical learning models in a distributed fashion over a massive network of user devices, while keeping the raw data on each device local in order to help preserve privacy.

To date, although federated learning is motivated by privacy concerns, the privacy guarantees of current methods are limited. For example, the state-of-the-art method FedAvg, proposed in [31], only suggests that the raw user data remain local, but the individual model updates from users, such as the computed "gradients" in each round of the training task, are still readable by a third party (e.g., a remote server). Communicating model updates potentially reveals sensitive user information, such as a user's health data in a federated disease detection task (discussed in detail in §2.1). Meanwhile, current efforts that attempt to boost the privacy of federated learning build upon traditional cryptographic techniques, such as secure multi-party computation [10] and differential privacy [8, 25]. As federated learning is commonly deployed on low-powered, mobile user devices, these efforts add significant communication and computation cost, which makes them impractical to deploy without degrading the user experience.

Ideally, federated learning methods should offer privacy by privatizing individual user updates, performance via low communication cost, and accuracy by providing convergence guarantees similar to those of the vanilla learning process. Achieving privacy while retaining accuracy and performance has been an elusive goal in the machine learning [2, 3, 25, 32], systems [9], and theory [8, 33] communities.

In this work, our insight is a new connection between federated learning and sketching algorithms (sketches), a methodology that has been mainly limited in application to the areas of network measurement and databases. We find that sketches are a promising option to jointly optimize two sides of the same coin (performance and privacy in federated learning) for two reasons: (a) First, sketches attain high accuracy with a succinct data structure and have a well-studied trade-off between accuracy and memory [5, 19]. This feature is important for learning efficiency and user experience, as heavy wireless communication can cause battery drain and overheating from the 4G radio of mobile phones. (b) Second, we notice that sketches have inherent but little-explored privacy benefits. In fact, most canonical sketches (e.g., Count-Min Sketch [17] and Count Sketch [15]) do not naturally preserve data identities and require additional mechanisms to trace back [15, 37] to the pre-inserted identities. While this is a key limitation of sketches in classical measurement tasks, it interestingly turns into a strength in the federated learning case. Moreover, recent theoretical advances [4, 33] have shown that differential privacy is achievable on sketches with additional modifications. These properties make sketches an attractive alternative for helping federated learning meet the privacy, performance, and accuracy goals.

Our vision in this work is to bring these sketching techniques to practical federated learning systems and show a viable path towards a private federated learning system. The key challenge of realizing such a vision is in accounting for federated learning's unique distributed workflow and designing appropriate private communication mechanisms. Specifically, we consider the following questions: (1) How can sketches be used in federated learning frameworks to boost the overall privacy? (2) What current sketches are useful in terms of privacy and accuracy? (3) Can we further improve the privacy guarantees of current sketches?

As a concrete start toward answering these questions, we provide a preliminary design that incurs only small changes to current federated learning frameworks, as shown in Figure 1. Our approach leverages sketching on the updates sent between users and the central server to prevent the identities of private user information from being revealed. To verify the feasibility of using sketches to privatize individual model updates without significant accuracy impact, we implement a proof-of-concept federated learning simulation with Count Sketch [15] to train three representative models: Linear Regression, Multi-layer Perceptron (MLP), and Recurrent Neural Networks (RNN). Our simulation demonstrates that extra privacy can be added into federated learning for "free", as sketches privatize the original data and reduce the communication cost by 10× with small errors.

[Figure 1: Conceptual design of federated learning with sketching (S). Before: (1) updates, (2) aggregate at the server, (3) new model. After: (1) updates, (2) S(updates), (3) aggregate sketches S at the server, (4) S(new model), (5) query from S.]

Our vision, if successful, has the ability to dramatically mitigate the concerns of user privacy in federated learning with strengthened confidence in user experiences. In §5, we also identify further directions: exploring the theoretical understanding of the privacy features of sketches, studying the system and privacy requirements of various federated learning tasks, and designing an appropriate sketching-based federated learning framework based on user needs.

[...] main local on each device. This design benefits user privacy and reduces the communication of the training process [29].

Learning scenarios and their privacy issues. Federated learning is useful for many applications. Unfortunately, as we show in the following representative scenarios, user privacy can still be an issue even when user data remains local.

S1: Heart attack risk detection from wearable devices. Consider training a linear model via minimizing the empirical loss over historical health data distributed across wearable devices. The loss function of such a linear model can be defined as $F(w; x; y) = (w \cdot x - y)^2$, where $F$ measures the gap between the model's predictions and the ground-truth label $y$, $w$ denotes the learnable parameters, and $x$ represents user data (i.e., related health information). As shown in Figure 2, the gradient of $F(w; x; y)$ with respect to $w$, used by the de facto optimizer in machine learning training, (mini-batch) Stochastic Gradient Descent (SGD), is $c \cdot x$: a scalar $c = 2(w \cdot x - y)$ multiplied by the raw data $x$. Hence, sending gradients (or model updates) may be equivalent to sending raw data.

[Figure 2: Privacy leakage in S1: Inferring user health data from individual model updates. Panel text: model updates expose "[age, gender, weight, blood_pressure, max_heart_rate, …]".]

S2: Next word prediction on mobile phones. In order to enhance the user experience while typing on phones, some applications such as the Google keyboard (Gboard) leverage federated learning to jointly learn a next-word prediction model over users' historical text data and deploy the model on the virtual keyboard [26]. In this scenario, Recurrent Neural Networks (RNNs), a state-of-the-art neural network model, are used to handle sequence data such as text. RNNs are known to have the ability to memorize and expose sensitive, unique patterns in the training data [13]. For instance, if someone has previously typed a credit card number, the resulting model may contain that number in some form. As shown in [13], one can efficiently extract the secret sequences from the final model.

S3: Face recognition on cameras. Another application to consider is performing distributed face recognition on a calibrated camera sensor network [24]. In such settings, an in-
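The workflow in Figure 1 (users send S(updates), the server aggregates sketches, and values are queried from S) can be illustrated with a small Count Sketch. This is a minimal sketch under stated assumptions, not the authors' prototype: the class name `CountSketch`, its parameters (`rows`, `width`, `seed`), the table sizes, and the toy update vectors are all hypothetical, and hashing is simulated with seeded random tables that every party is assumed to share.

```python
import random
from statistics import median


class CountSketch:
    """Illustrative Count Sketch over vectors of a fixed dimension."""

    def __init__(self, dim, rows=5, width=16, seed=0):
        rng = random.Random(seed)  # assumption: all parties share this seed
        self.rows, self.width = rows, width
        # Per row: one bucket index and one +/-1 sign for every coordinate.
        self.bucket = [[rng.randrange(width) for _ in range(dim)] for _ in range(rows)]
        self.sign = [[rng.choice((-1, 1)) for _ in range(dim)] for _ in range(rows)]
        self.table = [[0.0] * width for _ in range(rows)]

    def insert(self, vector):
        """Fold a full update vector into the rows x width counter table."""
        for r in range(self.rows):
            for i, value in enumerate(vector):
                self.table[r][self.bucket[r][i]] += self.sign[r][i] * value

    def merge(self, other):
        """Add a sketch built with the same hashes (sketching is linear)."""
        for r in range(self.rows):
            for b in range(self.width):
                self.table[r][b] += other.table[r][b]

    def query(self, i):
        """Estimate coordinate i as the median of its signed counters."""
        return median(self.sign[r][i] * self.table[r][self.bucket[r][i]]
                      for r in range(self.rows))


DIM = 1000  # hypothetical model size

# Two made-up user updates, each with a single non-zero coordinate.
update_a = [0.0] * DIM
update_a[3] = 2.5
update_b = [0.0] * DIM
update_b[3] = -1.0

sketch_a = CountSketch(DIM)
sketch_a.insert(update_a)
sketch_b = CountSketch(DIM)
sketch_b.insert(update_b)

# Server side: aggregate the sketches; the raw update vectors are never sent.
server = CountSketch(DIM)
server.merge(sketch_a)
server.merge(sketch_b)

estimate = server.query(3)  # estimate of update_a[3] + update_b[3]
```

In this toy setting each user transmits 5 × 16 = 80 counters instead of 1000 values. Because the two updates share a single non-zero coordinate, no colliding values disturb the estimate; with dense updates the query returns an approximation whose error depends on the chosen `rows` and `width`.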
