Face-Off: Adversarial Face Obfuscation
In: Proceedings of the 21st Privacy Enhancing Technologies Symposium
arXiv:2003.08861v2 [cs.CR] 15 Dec 2020

Varun Chandrasekaran*, Chuhan Gao*‡, Brian Tang, Kassem Fawaz, Somesh Jha, Suman Banerjee
Microsoft‡, University of Wisconsin-Madison
*Authors contributed equally.

Abstract—Advances in deep learning have made face recognition technologies pervasive. While useful to social media platforms and users, this technology carries significant privacy threats. Coupled with the abundant information they have about users, service providers can associate users with social interactions, visited places, activities, and preferences–some of which the user may not want to share. Additionally, facial recognition models used by various agencies are trained on data scraped from social media platforms. Existing approaches to mitigate these privacy risks from unwanted face recognition result in an imbalanced privacy-utility trade-off for users. In this paper, we address this trade-off by proposing Face-Off, a privacy-preserving framework that introduces strategic perturbations to the user's face to prevent it from being correctly recognized. To realize Face-Off, we overcome a set of challenges related to the black-box nature of commercial face recognition services and the scarcity of literature on adversarial attacks against metric networks. We implement and evaluate Face-Off to find that it deceives three commercial face recognition services from Microsoft, Amazon, and Face++. Our user study with 423 participants further shows that the perturbations come at an acceptable cost for the users.

Fig. 1: Adversarial example of Matt Damon generated by Face-Off, recognized as Leonardo DiCaprio by Google image search.

I. INTRODUCTION

Enabled by advances in deep learning, face recognition permeates several contexts, such as social media, online photo storage, and law enforcement [1]. Online platforms such as Facebook, Google, and Amazon provide their users with various services built atop face recognition, including automatic tagging and grouping of faces. Users share their photos with these platforms, which detect and recognize the faces present in these photos. Automated face recognition, however, poses significant privacy threats to users, induces bias, and violates legal frameworks [2], [1]. These platforms operate proprietary (black-box) recognition models that allow for associating users with social interactions, visited places, activities, and preferences–some of which the user may not want to share.

Existing approaches to mitigate these privacy risks result in an imbalanced trade-off between privacy and utility. Such approaches rely on (a) blurring/obscuring/morphing faces [3], (b) having users utilize physical objects, such as eyeglass frames, clothes, or surrounding scenery with special patterns [4], [5], and (c) evading the face detector (the necessary condition for face recognition) [6]. These solutions, however, exhibit two main drawbacks to users. First, the user can no longer meet their original goal in using the social media platform, especially when various applications built atop face detection (such as face-enhancement features) are broken. Second, specially manufactured objects for physical obfuscation are not omnipresent and might not be desirable to the user.

Relying on insights from previous work [4], [5], [6], we propose a new paradigm to improve the trade-off between privacy and utility for such users. In adversarial machine learning, carefully crafted images with small and human-imperceptible perturbations cause misclassifications [7], [8], [9]. In this paper, we extend this approach from classification models to metric learning, as used in face recognition systems. In particular, we propose Face-Off, a system that preserves the user's privacy against real-world face recognition systems. By carefully designing the adversarial perturbation, Face-Off targets only face recognition (and not face detection), preserving the user's original intention along with the context associated with the image. Face-Off detects a user's face from an image to be uploaded, applies the necessary adversarial perturbation, and returns the image with a perturbed face.
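To make this flow concrete, the following is a minimal sketch of the detect-perturb-return step described above, not Face-Off's actual implementation. It assumes PyTorch, and the callables `detect_face`, `surrogate` (a local embedding model), and `target_emb` are hypothetical stand-ins introduced only for illustration.

```python
# Illustrative sketch only (assumed PyTorch); `detect_face`, `surrogate`, and
# `target_emb` are hypothetical stand-ins, not part of Face-Off's API.
import torch

def protect_image(image, detect_face, surrogate, target_emb,
                  eps=0.05, steps=40, lr=0.01):
    """Detect the face, perturb only that region so its embedding moves toward
    `target_emb` (i.e., away from the user's own cluster), and paste it back."""
    x0, y0, x1, y1 = detect_face(image)              # face bounding box (pixels)
    face = image[:, :, y0:y1, x0:x1].clone()         # NCHW crop of the face
    delta = torch.zeros_like(face, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        emb = surrogate(face + delta)                # embedding of perturbed face
        loss = torch.norm(emb - target_emb, dim=1).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        delta.data.clamp_(-eps, eps)                 # keep the perturbation small
    out = image.clone()
    out[:, :, y0:y1, x0:x1] = (face + delta.detach()).clamp(0.0, 1.0)
    return out
```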
However, the design of Face-Off faces the following challenges:

• Unlike classification networks, metric learning networks (used for face verification/recognition) represent inputs as feature embeddings [10], [11], [12]. Real-world face recognition maps the feature embedding of an input image to the closest cluster of faces. Existing approaches target classification networks and must be retrofitted for metric learning.

• As the models used by these organizations are proprietary, Face-Off needs to perform black-box attacks. This issue is already challenging in the classification domain [7], [9], [13]. Further, Face-Off cannot use the service provider's face recognition API as a black-box oracle to generate the adversarially perturbed face [4] for two reasons. First, generating an adversarial example requires querying the API extensively, which is not free and is often rate-limited. Second, querying the black-box model defeats our purpose of privacy protection, as it sometimes begins with releasing the original face [4].

We address the first challenge by designing two new loss functions specifically targeting metric learning. These loss functions aim to pull the input face away from a cluster of faces belonging to the user (in the embedding space), which results in incorrect face recognition. Both loss functions can be integrated with state-of-the-art adversarial attacks against classification networks [9], [15].

To meet the second challenge, we leverage transferability, where an adversarial example generated for one model is effective against another model targeting the same problem. We rely on surrogate face recognition models (to which we have full access) to generate adversarial examples. Then, Face-Off amplifies the obtained perturbation by a small multiplicative factor to enhance transferability. This property reduces the probability of metric learning networks correctly recognizing the perturbed faces. Further, we explore amplification beyond classifiers [16], [17], and show that it enhances transferability in metric learning and reduces attack run-time.
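As a rough illustration of this amplification step (the exact procedure and factor are described later in the paper and not reproduced here), the perturbation recovered on the surrogate model can be scaled before it is re-applied. The code below is a sketch under these assumptions; the factor shown is arbitrary.

```python
# Illustrative sketch of perturbation amplification (assumed PyTorch);
# the amplification factor is an arbitrary example value.
import torch

def amplify(x, x_adv, factor=1.5):
    """Scale the adversarial perturbation (x_adv - x) found on the surrogate
    model by a small multiplicative factor, then clip to the valid pixel range."""
    delta = x_adv - x
    return torch.clamp(x + factor * delta, 0.0, 1.0)
```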
We evaluate Face-Off across three major commercial face recognition services: Microsoft Azure Face API [18], AWS Rekognition [19], and Face++ [20]. Face-Off generates perturbed images that transfer to these three services, preventing them from correctly recognizing the input face. Our adversarial examples also transfer successfully, with the target labels, to Google image search (see Figure 1). Based on a longitudinal study (across two years), we observe that commercial APIs have not implemented defense mechanisms to safeguard against adversarial inputs. We show that using adversarial training [15] as a defense mechanism deteriorates natural accuracy, dropping the accuracy by 11.91 percentage points for a subset of the VGGFace2 dataset. Finally, we perform two user studies on Amazon Mechanical Turk with 423 participants to evaluate user perception of the perturbed faces. We find that users' privacy consciousness determines the degree of acceptable perturbation; privacy-conscious users are willing to tolerate greater perturbation levels for improved privacy.

In summary, our contributions are:

1) We propose two new loss functions to generate adversarial examples for metric networks (§ II-B). We also highlight how amplification improves transferability for metric networks (§ III-D).
2) We design, implement, and evaluate Face-Off, which applies adversarial perturbations to prevent real-world face recognition.
3) We perform two user studies (with 423 participants) to assess the user-perceived utility of the images that Face-Off generates (§ VII).

II. BACKGROUND

This section describes the machine learning (ML) notation required in this paper. We assume a data distribution $D$ over $\mathcal{X} \times \mathcal{Y}$, where $\mathcal{X}$ is the sample space and $\mathcal{Y} = \{y_1, \cdots, y_L\}$ is the finite space of labels. For example, $\mathcal{X}$ may be the space of all images, and $\mathcal{Y}$ may be the labels of the images.

Empirical Risk Minimization: In the empirical risk minimization (ERM) framework, we wish to solve the following optimization problem:

$$w^* = \min_{w \in \mathcal{H}} \mathbb{E}_{(x,y) \sim D}\, \mathcal{L}(w, x, y),$$

where $\mathcal{H}$ is the hypothesis space and $\mathcal{L}$ is the loss function (such as the cross-entropy loss [21]). We denote vectors in bold (e.g., x). Since the distribution is usually unknown, a learner instead solves the following problem over a dataset $S_{\text{train}} = \{(x_1, y_1), \cdots, (x_n, y_n)\}$ sampled from $D$:

$$w^* = \min_{w \in \mathcal{H}} \frac{1}{n} \sum_{i=1}^{n} \mathcal{L}(w, x_i, y_i)$$

Once the learner has solved the optimization problem given above, it obtains a solution $w^* \in \mathcal{H}$, which yields a classifier $F: \mathcal{X} \to \mathcal{Y}$ (the classifier is usually parameterized by $w^*$, i.e., $F_{w^*}$, but we omit this dependence for brevity).

A. Metric Embeddings

A deep metric embedding $f_\theta$ is a function from $\mathcal{X}$ to $\mathbb{R}^m$, where $\theta \in \Theta$ is a parameter chosen from a parameter space and $\mathbb{R}^m$ is the space of $m$-dimensional real vectors. Throughout this section, we sometimes refer to $f_\theta(x)$ as the embedding of $x$. Let $\phi: \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}^+$ be a distance metric on $\mathbb{R}^m$. Given a metric embedding function $f_\theta$, we define $d_f(x, x_1)$ to denote $\phi(f_\theta(x), f_\theta(x_1))$. $[n]$ denotes the set $\{1, \cdots, n\}$.

Loss Functions: Deep embeddings use different loss functions than typical classification networks. We define two of these loss functions: contrastive and triplet.

For a constant $\gamma \in \mathbb{R}^+$, the contrastive loss is defined over a pair $(x, y)$ and $(x_1, y_1)$ of labeled samples from $\mathcal{X} \times \mathcal{Y}$ as:

$$\mathcal{L}(\theta; (x,y); (x_1,y_1)) = \mathbb{I}_{y = y_1} \cdot d_f(x, x_1)^2 + \mathbb{I}_{y \neq y_1} \cdot [\gamma - d_f(x, x_1)]^2,$$

where $\mathbb{I}_E$ is the indicator function for event $E$ (equal to 1 if event $E$ is true and 0 otherwise).

The triplet loss is defined over three labeled samples $(x, y)$, $(x_1, y)$ and $(x_2, y_2)$, given a constant $\gamma \in \mathbb{R}^+$, as:

$$\mathcal{L}(\theta; (x,y); (x_1,y); (x_2,y_2)) = [d_f(x, x_1)^2 - d_f(x, x_2)^2 + \gamma]_+,$$

where $[x]_+ = \max(x, 0)$ and $y \neq y_2$.
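For concreteness, the distance $d_f$ and the two losses above can be transcribed directly into code. The sketch below assumes PyTorch, batched inputs, a Euclidean metric for $\phi$, and an `embed` callable standing in for $f_\theta$; the margin term of the contrastive loss is clipped at zero, following the standard formulation.

```python
# Sketch of the embedding distance d_f and the contrastive/triplet losses
# defined above (assumed PyTorch; Euclidean metric, batched inputs).
import torch

def d_f(embed, x, x1):
    """phi(f_theta(x), f_theta(x1)) with phi taken as the Euclidean distance."""
    return torch.norm(embed(x) - embed(x1), dim=1)

def contrastive_loss(embed, x, y, x1, y1, gamma=1.0):
    """Pulls same-label pairs together; pushes different-label pairs
    at least gamma apart (margin term clipped at zero)."""
    d = d_f(embed, x, x1)
    same = (y == y1).float()
    return same * d.pow(2) + (1.0 - same) * (gamma - d).clamp(min=0).pow(2)

def triplet_loss(embed, x, x1, x2, gamma=0.2):
    """x and x1 share a label; x2 carries a different label."""
    return (d_f(embed, x, x1).pow(2) - d_f(embed, x, x2).pow(2) + gamma).clamp(min=0)
```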