Am I a Real Or Fake Celebrity? Measuring Commercial Face Recognition Web Apis Under Deepfake Impersonation Attack

Am I a Real or Fake Celebrity? Measuring Commercial Face Recognition Web APIs under Deepfake Impersonation Attack Shahroz Tariq Sowon Jeon Simon S. Woo∗ Department of Computer Science Department of Computer Science Department of Applied and Engineering and Engineering Data Science Sungkyunkwan University Sungkyunkwan University Sungkyunkwan University Suwon, South Korea Suwon, South Korea Suwon, South Korea [email protected] [email protected] [email protected] ABSTRACT humanitarian groups for identifying and rescuing human traffick- Recently, significant advancements have been made in face recog- ing victims [3], and law enforcement agencies for protecting the nition technologies using Deep Neural Networks. As a result, com- public’s safety [36]. Moreover, Microsoft has released a mobile app panies such as Microsoft, Amazon, and Naver offer highly accurate named TwinsOrNot in 2015 which helps users find celebrities who commercial face recognition web services for diverse applications resemble them [40]. Naturally, these web APIs have a wide range of to meet the end-user needs. Naturally, however, such technologies clients. For example, Kyodo News MediaLab aims to build broader are threatened persistently, as virtually any individual can quickly image search services such as auto-tagging of celebrities and classi- implement impersonation attacks. In particular, these attacks can fication using custom labels from Amazon Rekognition APIs[42]; be a significant threat for authentication and identification services, the Marinus flagship software3 [ ] offers a tool named Traffic Jam, which heavily rely on their underlying face recognition technolo- which can be used by law enforcement agencies to investigate sex gies’ accuracy and robustness. Despite its gravity, the issue regard- trafficking. Using Amazon Rekognition API, investigators cantake ing deepfake abuse using commercial web APIs and their robustness appropriate actions by searching through millions of records in just has not yet been thoroughly investigated. This work provides a a few seconds to identify potential criminals [43]. measurement study on the robustness of black-box commercial On the other hand, the rapid development of synthetic image face recognition APIs against Deepfake Impersonation (DI) attacks and video generation methods has prompted significant concerns using celebrity recognition APIs as an example case study. We use over the authenticity of readily available contents on social media five deepfake datasets, two of which are created by us and planned and video hosting sites. Although fabrication and manipulation to be released. More specifically, we measure attack performance of digital images and videos are not new, the advent of so-called based on two scenarios (targeted and non-targeted) and further deepfake methods has made the process of creating convincing analyze the differing system behaviors using fidelity, confidence, fake videos much easier and faster. It is therefore now possible for and similarity metrics. Accordingly, we demonstrate how vulnera- anyone with a few images of a victim from the general public to ble face recognition technologies from popular companies are to create his/her fake videos, which can result in riots and social unrest, DI attack, achieving maximum success rates of 78.0% and 99.9% especially when combined with fake news [33, 39]. In fact, there are for targeted (i.e., precise match) and non-targeted (i.e., match with positive and creative use cases of deepfakes, such as Disney’s plan any celebrity) attacks, respectively. Moreover, we propose practical to replace traditional visual effects (VFX) with deepfakes [57] and defense strategies to mitigate DI attacks, reducing the attack suc- the creators of the HBO documentary ‘Welcome to Chechnya’ [22] cess rates to as low as 0% and 0.02% for targeted and non-targeted utilizing deepfakes to anonymize LGBTQ prosecution survivors. attacks, respectively. However, deepfakes are highly likely to be misused, for example, to shift political opinions using deepfake videos of Richard Nixon’s KEYWORDS moon disaster speech [12] and of Donald Trump’s regrets on climate change [35]. To make matters worse, 96% of the deepfake on the arXiv:2103.00847v2 [cs.CV] 2 Mar 2021 Deepfake, Impersonation Attack, Measurement, Recognition APIs, Internet are non-consensual deepfake pornography [34]. Most of Celebrity Face Recognition, Fake News and Media these deepfake pornography feature women, often celebrities, and are used to degrade, defame, and abuse them [38]. 1 INTRODUCTION In addition to privacy violations, we believe that this deepfake Deep learning algorithms have contributed significantly to the ad- technology can be further misused to carry out impersonation at- vancement of a wide range of applications, including computer tacks. The matter regarding deepfake abuse is not only of great vision, speech processing, and natural language processing. In par- significance from a security and privacy perspective, but alsoof ticular, due to the success of face recognition and identification grave urgency, which must be addressed before further exacerbating algorithms, companies such as Amazon, Google, Microsoft, Baidu, misinformation. More specifically, these days, the face recognition and Naver, have unveiled commercial face recognition web APIs for APIs have become more essentially integrated into daily life. There- various applications [5, 10, 41]. These services are widely adopted fore, in this work, we showcase a Deepfake Impersonation (DI) for face verification and identification in a number of areas, such attack with a scenario of deceiving commercial face recognition as the entertainment industry for indexing images and videos [42], APIs. We perform the attack on the well-known celebrity recognition APIs from Microsoft, Amazon, and Naver; because it is the ∗Corresponding Author Shahroz Tariq, Sowon Jeon, and Simon S. Woo quintessence of their face recognition capabilities. Nevertheless, methods such as face swaps are widely misused to generate revenge this attack can be easily generalized and extended to the masses by or celebrity porn. FaceSwap [17] is one widely used application attacking the general face recognition APIs. of replacement. Moreover, face synthesis is where the deepfake is Therefore, in this work, we aim to understand how different created by altering one or multiple reference images. FaceApp [14] celebrity recognition APIs respond to celebrity deepfakes. Based on is one example of synthesis, enabling users to alter their or morph extensive experiments with five deepfake datasets, we demonstrate different faces to make a new synthetic identity. We use facial re- that it is relatively easy to evade and fool celebrity recognition placement, reenactment, and synthesis as attack methods and also APIs, achieving up to a 99.9% attack success rate on the certain generate two new additional datasets in this work. We explain more commercial celebrity recognition API in a black-box setting. Among details of each dataset in Section 4. the five datasets used to carry out impersonation attacks, twoare Commercial Face Recognition Web APIs. Amazon Rekogni- publicly available benchmark datasets, one is collected online, and tion [41], Microsoft Azure Cognitive Services [5], Google Cloud the remaining two are created by ourselves. Lastly, we also propose Vision [4], Baidu AI [6], and Naver Clova [10] provide various im- defense strategies that can significantly reduce the impact of DI age analysis web APIs for the uses in object, scenes, faces detection attacks. Our main contributions are summarized as follows: and recognition as well as celebrity recognition. Many people use • Deepfake Impersonation Attack: We introduce a novel im- these APIs for visual search, image classification, and recognizing personation attack to deceive commercial face recognition APIs celebrities. Also, Baidu [6] and Naver [10] are the leading search using deepfakes. We also create two novel datasets, Celebrity engine and portal service providers in China and South Korea, re- First Order Motion (CelebFOM) and the Celebrity Blend (Celeb- spectively, where they both offer web APIs for celebrity recognition. Blend), for evaluation1. Naver Face Recognition web API can also detect various face image • Extensive Measurement and Evaluation of Commercial attributes, and Microsoft Azure also provides face image analysis Web APIs: We evaluate how popular major commercial celebrity information. recognition web APIs from Amazon, Microsoft, and Naver re- There are several commercial face recognition APIs. However, spond to celebrity deepfakes in a black-box setting, using four in this work, we choose Microsoft (MS), Amazon (AWS), and Naver different metrics. We demonstrate that celebrity deepfakes fool (NAV) celebrity recognition APIs for the following reasons: First, all three APIs with an attack success rate of up to 78.0% and these three web APIs all commonly offer a service to recognize 99.9% on targeted and non-targeted attacks, respectively. We celebrity faces, returning comparable face similarity scoring metric. also analyze the deepfakes and real images using the APIs’ face This allows us to evaluate APIs in a fair manner. Second, celebrity similarity module. face images are easily obtainable compared to ordinary people and • Defense Mechanism: We present a defense mechanism against can be easily spoofed to generate deepfakes and further perform the DI attack and compare several defense strategies. We show

Load more