<<

1

Deep Learning for Creation and Detection: A Survey Thanh Thi Nguyen, Quoc Viet Hung Nguyen, Cuong M. Nguyen, Dung Nguyen, Duc Thanh Nguyen, Saeid Nahavandi, Fellow, IEEE

Abstract— has been successfully applied I.INTRODUCTION to solve various complex problems ranging from big data analytics to and human-level control. Deep In a narrow definition, deepfakes (stemming from learning advances however have also been employed to cre- “deep learning” and “fake”) are created by techniques ate software that can cause threats to privacy, democracy that can superimpose face images of a target person onto and national security. One of those deep learning-powered a video of a source person to make a video of the target applications recently emerged is . Deepfake al- gorithms can create fake images and videos that humans person doing or saying things the source person does. cannot distinguish them from authentic ones. The proposal This constitutes a category of deepfakes, namely face- of technologies that can automatically detect and assess the swap. In a broader definition, deepfakes are artificial integrity of digital visual media is therefore indispensable. intelligence-synthesized content that can also fall into This paper presents a survey of used to create two other categories, i.e., lip-sync and puppet-master. deepfakes and, more importantly, methods proposed to Lip-sync deepfakes refer to videos that are modified to detect deepfakes in the literature to date. We present make the mouth movements consistent with an audio extensive discussions on challenges, research trends and recording. Puppet-master deepfakes include videos of a directions related to deepfake technologies. By reviewing target person (puppet) who is animated following the the background of deepfakes and state-of-the-art deepfake detection methods, this study provides a comprehensive facial expressions, eye and head movements of another overview of deepfake techniques and facilitates the devel- person (master) sitting in front of a camera [1]. opment of new and more robust methods to deal with the While some deepfakes can be created by traditional increasingly challenging deepfakes. visual effects or computer-graphics approaches, the re- Impact Statement—This survey provides a timely cent common underlying mechanism for deepfake cre- overview of deepfake creation and detection methods ation is deep learning models such as and presents a broad discussion of challenges, potential and generative adversarial networks, which have been trends, and future directions. We conduct the survey applied widely in the computer vision domain [2]–[4]. with a different perspective and taxonomy compared to existing survey papers on the same topic. Informative These models are used to examine facial expressions graphics are provided for guiding readers through the and movements of a person and synthesize facial images latest development in deepfake research. The methods of another person making analogous expressions and surveyed are comprehensive and will be valuable to the movements [5]. Deepfake methods normally require a

arXiv:1909.11573v3 [cs.CV] 26 Apr 2021 artificial intelligence community in tackling the current large amount of image and video data to train models challenges of deepfakes. to create photo-realistic images and videos. As public figures such as celebrities and politicians may have a Keywords: deepfakes, face manipulation, artificial intelli- large number of videos and images available online, they gence, deep learning, autoencoders, GAN, forensics, survey. are initial targets of deepfakes. Deepfakes were used to swap faces of celebrities or politicians to bodies in porn T. T. Nguyen and D. T. Nguyen are with the School of Information Technology, Deakin University, Victoria, Australia. images and videos. The first deepfake video emerged in Q. V. H. Nguyen is with School of Information and Communication 2017 where face of a celebrity was swapped to the face Technology, Griffith University, Queensland, Australia. of a porn actor. It is threatening to world security when C. M. Nguyen is with LAMIH UMR CNRS 8201, Universite´ Polytechnique Hauts-de-France, 59313 Valenciennes, France. deepfake methods can be employed to create videos D. Nguyen is with Faculty of Information Technology, Monash of world leaders with fake speeches for falsification University, Victoria, Australia. purposes [6]–[8]. Deepfakes therefore can be abused to S. Nahavandi is with the Institute for Intelligent Systems Research cause political or religion tensions between countries, to and Innovation, Deakin University, Victoria, Australia. Corresponding e-mail: [email protected] (T. T. fool public and affect results in election campaigns, or Nguyen). create chaos in financial markets by creating 2

[9]–[11]. It can be even used to generate fake satellite Deepfake Detection Challenge to catalyse more research images of the Earth to contain objects that do not really and development in detecting and preventing deepfakes exist to confuse military analysts, e.g., creating a fake from being used to mislead viewers [25]. Data obtained bridge across a river although there is no such a bridge from https://app.dimensions.ai at the end of 2020 show in reality. This can mislead a troop who have been guided that the number of deepfake papers has increased signif- to cross the bridge in a battle [12], [13]. icantly in recent years (Fig. 1). Although the obtained As the democratization of creating realistic digital numbers of deepfake papers may be lower than actual humans has positive implications, there is also positive numbers but the research trend of this topic is obviously use of deepfakes such as their applications in visual increasing. effects, digital avatars, snapchat filters, creating voices of those who have lost theirs or updating episodes of movies without reshooting them [14]. However, the number of malicious uses of deepfakes largely dominates that of the positive ones. The development of advanced deep neural networks and the availability of large amount of data have made the forged images and videos almost indistinguishable to humans and even to sophisticated computer algorithms. The process of creating those ma- nipulated images and videos is also much simpler today as it needs as little as an identity photo or a short video of a target individual. Less and less effort is required to produce a stunningly convincing tempered footage. Recent advances can even create a deepfake with just a still image [15]. Deepfakes therefore can be a threat Fig. 1. Number of papers related to deepfakes in years from 2016 to 2020, obtained from https://app.dimensions.ai at the end of 2020 affecting not only public figures but also ordinary people. with the search keyword “deepfake” applied to full text of scholarly For example, a voice deepfake was used to scam a CEO papers. The number of such papers in 2018, 2019 and 2020 are 64, out of $243,000 [16]. A recent release of a software 368 and 1268, respectively. called DeepNude shows more disturbing threats as it can transform a person to a non-consensual porn [17]. This paper presents a survey of methods for creating Likewise, the Chinese app Zao has gone viral lately as well as detecting deepfakes. There have been existing as less-skilled users can swap their faces onto bodies survey papers about this topic in [26]–[28], we however of movie stars and insert themselves into well-known carry out the survey with different perspective and taxon- movies and TV clips [18]. These forms of falsification omy. In Section II, we present the principles of deepfake create a huge threat to violation of privacy and identity, algorithms and how deep learning has been used to and affect many aspects of human lives. enable such disruptive technologies. Section III reviews Finding the truth in digital domain therefore has different methods for detecting deepfakes as well as their become increasingly critical. It is even more challeng- advantages and disadvantages. We discuss challenges, ing when dealing with deepfakes as they are majorly research trends and directions on deepfake detection and used to serve malicious purposes and almost anyone multimedia forensics problems in Section IV. can create deepfakes these days using existing deepfake tools. 
Thus far, there have been numerous methods II.DEEPFAKE CREATION proposed to detect deepfakes [19]–[23]. Most of them Deepfakes have become popular due to the quality are based on deep learning, and thus a battle between of tampered videos and also the easy-to-use ability of malicious and positive uses of deep learning methods their applications to a wide range of users with various has been arising. To address the threat of face-swapping computer skills from professional to novice. These ap- technology or deepfakes, the United States Defense plications are mostly developed based on deep learning Advanced Research Projects Agency (DARPA) initiated techniques. Deep learning is well known for its capability a research scheme in media forensics (named Media of representing complex and high-dimensional data. One Forensics or MediFor) to accelerate the development variant of the deep networks with that capability is of fake digital visual media detection methods [24]. deep autoencoders, which have been widely applied Recently, Inc. teaming up with Corp for dimensionality reduction and and the Partnership on AI coalition have launched the [29]–[31]. The first attempt of deepfake creation was 3

TABLE I SUMMARY OF NOTABLE DEEPFAKE TOOLS

Tools Links Key Features Faceswap https://github.com/deepfakes/faceswap - Using two encoder-decoder pairs. - Parameters of the encoder are shared. Faceswap-GAN https://github.com/shaoanlu/faceswap-GAN Adversarial loss and perceptual loss (VGGface) are added to an auto-encoder archi- tecture. Few-Shot Face https://github.com/shaoanlu/fewshot-face- - Use a pre-trained face recognition model to extract latent embeddings for GAN Translation translation-GAN processing. - Incorporate semantic priors obtained by modules from FUNIT [42] and SPADE [43]. DeepFaceLab https://github.com/iperov/DeepFaceLab - Expand from the Faceswap method with new models, e.g. H64, H128, LIAEF128, SAE [44]. - multiple face extraction modes, e.g. S3FD, MTCNN, , or manual [44]. DFaker https://github.com/dfaker/df - DSSIM loss [45] is used to reconstruct face. - Implemented based on library. DeepFake tf https://github.com/StromWine/DeepFake tf Similar to DFaker but implemented based on tensorflow. AvatarMe https://github.com/lattas/AvatarMe - Reconstruct 3D faces from arbitrary “in-the-wild” images. - Can reconstruct authentic 4K by 6K-resolution 3D faces from a single low-resolution image [46]. MarioNETte https://hyperconnect.github.io/MarioNETte - A few-shot face reenactment framework that preserves the target identity. - No additional fine-tuning phase is needed for identity adaptation [47]. DiscoFaceGAN https://github.com/microsoft/DiscoFaceGAN - Generate face images of virtual people with independent latent variables of identity, expression, pose, and illumination. - Embed 3D priors into adversarial learning [48]. StyleRig https://gvv.mpi-inf.mpg.de/projects/StyleRig - Create portrait images of faces with a rig-like control over a pretrained and fixed StyleGAN via 3D morphable face models. - Self-supervised without manual annotations [49]. FaceShifter https://lingzhili.com/FaceShifterPage - Face swapping in high-fidelity by exploiting and integrating the target attributes. - Can be applied to any new face pairs without requiring subject specific training [50]. FSGAN https://github.com/YuvalNirkin/fsgan - A face swapping and reenactment model that can be applied to pairs of faces without requiring training on those faces. - Adjust to both pose and expression variations [51]. Transformable https://github.com/kyleolsz/TB-Networks - A method for fine-grained 3D manipulation of image content. Bottleneck - Apply spatial transformations in CNN models using a transformable bottleneck Networks framework [52]. “Do as ” github.com/carolineec/EverybodyDanceNow - Automatically transfer the motion from a source to a target person by learning a Motion Trans- video-to-video translation. fer - Can create a motion-synchronized dancing video with multiple subjects [53]. Neural Voice https://justusthies.github.io/posts/neural- - A method for audio-driven facial video synthesis. Puppetry voice-puppetry - Synthesize videos of a talking head from an audio sequence of another person using 3D face representation. [54].

FakeApp, developed by a user using - decoder pairing structure [32], [33]. In that method, the autoencoder extracts latent features of face images and the decoder is used to reconstruct the face images. To swap faces between source images and target images, there is a need of two encoder-decoder pairs where each pair is used to train on an image set, and the encoder’s parameters are shared between two network pairs. In other words, two pairs have the same encoder network. This strategy enables the common encoder to find and learn the similarity between two sets of face images, which are relatively unchallenging because faces normally have similar features such as eyes, nose, mouth positions. Fig. 2 shows a deepfake creation process where the feature set of face A is connected with the Fig. 2. A deepfake creation model using two encoder-decoder pairs. decoder B to reconstruct face B from the original face A. Two networks use the same encoder but different decoders for training This approach is applied in several works such as Deep- process (top). An image of face A is encoded with the common FaceLab [34], DFaker [35], DeepFake tf (tensorflow- encoder and decoded with decoder B to create a deepfake (bottom). based deepfakes) [36]. 4

By adding adversarial loss and perceptual loss im- plemented in VGGFace [37] to the encoder-decoder architecture, an improved version of deepfakes based on the generative adversarial network (GAN) [38], i.e. faceswap-GAN, was proposed in [39]. The VGGFace perceptual loss is added to make eye movements to be more realistic and consistent with input faces and help to smooth out artifacts in segmentation mask, leading to higher quality output videos. This model facilitates the creation of outputs with 64x64, 128x128, and 256x256 resolutions. In , the multi-task convolutional Fig. 3. Categories of reviewed papers relevant to deepfake detection methods where we divide papers into two major groups, i.e. fake neural network (CNN) from the FaceNet implementation image detection and face video detection. [40] is introduced to make face detection more stable and face alignment more reliable. The CycleGAN [41] is utilized for generative network implementation. Popular about the critical need of future development of more deepfake tools and their typical features are summarized robust methods that can detect deepfakes from genuine. in Table I. This section presents a survey of deepfake detection methods where we them into two major cate- gories: fake image detection methods and fake video III.DEEPFAKE DETECTION detection ones (see Fig. 3). The latter is distinguished Deepfakes are increasingly detrimental to privacy, so- into two smaller groups: visual artifacts within single ciety security and democracy [55]. Methods for detecting video frame-based methods and temporal features across deepfakes have been proposed as soon as this threat was frames-based ones. Whilst most of the methods based on introduced. Early attempts were based on handcrafted temporal features use deep learning recurrent classifica- features obtained from artifacts and inconsistencies of tion models, the methods use visual artifacts within video the fake video synthesis process. Recent methods, on the frame can be implemented by either deep or shallow other hand, applied deep learning to automatically extract classifiers. salient and discriminative features to detect deepfakes [56], [57]. A. Fake Image Detection Deepfake detection is normally deemed a binary clas- Face swapping has a number of compelling applica- sification problem where classifiers are used to classify tions in video compositing, transfiguration in portraits, between authentic videos and tampered ones. This kind and especially in identity protection as it can replace of methods requires a large database of real and fake faces in photographs by ones from a collection of stock videos to train classification models. The number of fake images. However, it is also one of the techniques that videos is increasingly available, but it is still limited cyber attackers employ to penetrate identification or in terms of setting a benchmark for validating various authentication systems to gain illegitimate access. The detection methods. To address this issue, Korshunov use of deep learning such as CNN and GAN has made and Marcel [58] produced a notable deepfake dataset swapped face images more challenging for forensics consisting of 620 videos based on the GAN model using models as it can preserve pose, facial expression and the open source code Faceswap-GAN [39]. Videos from lighting of the photographs [66]. Zhang et al. 
[67] used the publicly available VidTIMIT database [59] were used the bag of words method to extract a set of compact to generate low and high quality deepfake videos, which features and fed it into various classifiers such as SVM can effectively mimic the facial expressions, mouth [68], random forest (RF) [69] and multi- percep- movements, and eye blinking. These videos were then trons (MLP) [70] for discriminating swapped face im- used to test various deepfake detection methods. Test ages from the genuine. Among deep learning-generated results show that the popular face recognition systems images, those synthesised by GAN models are probably based on VGG [60] and Facenet [40], [61] are unable to most difficult to detect as they are realistic and high- detect deepfakes effectively. Other methods such as lip- quality based on GAN’s capability to learn distribution syncing approaches [62]–[64] and image quality metrics of the complex input data and generate new outputs with with support vector machine (SVM) [65] produce very similar input distribution. high error rate when applied to detect deepfake videos Most works on detection of GAN generated images from this newly produced dataset. This raises concerns however do not consider the generalization capability 5 of the detection models although the development of general dataset is extracted from the ILSVRC12 [86]. GAN is ongoing, and many new extensions of GAN are The large scale GAN training model for high fidelity frequently introduced. Xuan et al. [71] used an image natural image synthesis (BIGGAN) [87], self-attention preprocessing step, e.g. and Gaussian GAN [88] and spectral normalization GAN [89] are noise, to remove low level high frequency clues of GAN used to generate fake images with size of 128x128. The images. This increases the pixel level statistical similarity training set consists of 600,000 fake and real images between real images and fake images and requires the whilst the test set includes 10,000 images of both types. forensic classifier to learn more intrinsic and meaningful Experimental results show the superior performance of features, which has better generalization capability than the proposed method against its competing methods such previous image forensics methods [72], [73] or image as those introduced in [90]–[93]. steganalysis networks [74]. On the other hand, Agarwal and Varshney [75] cast the B. Fake Video Detection GAN-based deepfake detection as a hypothesis testing problem where a statistical framework was introduced Most image detection methods cannot be used for using the information-theoretic study of authentication videos because of the strong degradation of the frame [76]. The minimum distance between distributions of data after video compression [94]. Furthermore, videos legitimate images and images generated by a particular have temporal characteristics that are varied among sets GAN is defined, namely the oracle error. The analytic of frames and thus challenging for methods designed to results show that this distance increases when the GAN detect only still fake images. This subsection focuses on is less accurate, and in this case, it is easier to detect deepfake video detection methods and categorizes them deepfakes. In case of high-resolution image inputs, an into two smaller groups: methods that employ temporal extremely accurate GAN is required to generate fake features and those that explore visual artifacts within images that are hard to detect. frames. 
1) Temporal Features across Video Frames: Recently, Hsu et al. [77] introduced a two-phase deep Based on learning method for detection of deepfake images. The the observation that temporal coherence is not enforced first phase is a feature extractor based on the common effectively in the synthesis process of deepfakes, Sabir et fake feature network (CFFN) where the Siamese net- al. [95] leveraged the use of spatio-temporal features of work architecture presented in [78] is used. The CFFN video streams to detect deepfakes. is encompasses several dense units with each unit including carried out on a frame-by-frame basis so that low level different numbers of dense blocks [79] to improve the artifacts produced by face manipulations are believed to representative capability for the fake images. The number further manifest themselves as temporal artifacts with of dense units is three or five depending on the validation inconsistencies across frames. A recurrent convolutional data being face or general images, and the number of model (RCN) was proposed based on the integration of channels in each unit is varied up to a few hundreds. Dis- the convolutional network DenseNet [79] and the gated criminative features between the fake and real images, recurrent unit cells [96] to exploit temporal discrepancies i.e. pairwise information, are extracted through CFFN across frames (see Fig. 4). The proposed method is learning process. These features are then fed into the on the FaceForensics++ dataset, which includes 1,000 second phase, which is a small CNN concatenated to the videos [97], and shows promising results. last convolutional layer of CFFN to distinguish deceptive images from genuine. The proposed method is validated for both fake face and fake general image detection. On the one hand, the face dataset is obtained from CelebA [80], containing 10,177 identities and 202,599 aligned face images of various poses and background clutter. Fig. 4. A two-step process for face manipulation detection where the Five GAN variants are used to generate fake images with preprocessing step aims to detect, crop and align faces on a sequence size of 64x64, including deep convolutional GAN (DC- of frames and the second step distinguishes manipulated and authentic face images by combining convolutional neural network (CNN) and GAN) [81], Wasserstein GAN (WGAN) [82], WGAN () [95]. with gradient penalty (WGAN-GP) [83], least squares GAN [84], and progressive growth of GAN (PGGAN) Likewise, Guera and Delp [98] highlighted that deep- [85]. A total of 385,198 training images and 10,000 fake videos contain intra-frame inconsistencies and tem- test images of both real and fake ones are obtained for poral inconsistencies between frames. They then pro- validating the proposed method. On the other hand, the posed the temporal-aware pipeline method that uses 6

CNN and long short term memory (LSTM) to detect above the threshold of 0.5 with duration less than 7 deepfake videos. CNN is employed to extract frame-level frames. This method is evaluated on a dataset collected features, which are then fed into the LSTM to create a from the web consisting of 49 interview and presentation temporal sequence descriptor. A fully-connected network videos and their corresponding fake videos generated is finally used for classifying doctored videos from real by the deepfake algorithms. The experimental results ones based on the sequence descriptor as illustrated in indicate promising performance of the proposed method Fig. 5. in detecting fake videos, which can be further improved by considering dynamic pattern of blinking, e.g. highly frequent blinking may also be a sign of tampering. 2) Visual Artifacts within Video Frame: As can be noticed in the previous subsection, the methods using temporal patterns across video frames are mostly based on deep recurrent network models to detect deepfake Fig. 5. A deepfake detection method using convolutional neural videos. This subsection investigates the other approach network (CNN) and long short term memory (LSTM) to extract that normally decomposes videos into frames and ex- temporal features of a given video sequence, which are represented plores visual artifacts within single frames to obtain via the sequence descriptor. The detection network consisting of fully- connected layers is employed to take the sequence descriptor as input discriminant features. These features are then distributed and calculate of the frame sequence belonging to either into either a deep or shallow classifier to differentiate be- authentic or deepfake class [98]. tween fake and authentic videos. We thus group methods in this subsection based on the types of classifiers, i.e. On the other hand, the use of a physiological , either deep or shallow. eye blinking, to detect deepfakes was proposed in [99] a) Deep classifiers: Deepfake videos are normally based on the observation that a person in deepfakes created with limited resolutions, which require an affine has a lot less frequent blinking than that in untampered face warping approach (i.e., scaling, rotation and shear- videos. A healthy adult human would normally blink ing) to match the configuration of the original ones. somewhere between 2 to 10 seconds, and each blink Because of the resolution inconsistency between the would take 0.1 and 0.4 seconds. Deepfake algorithms, warped face area and the surrounding context, this however, often use face images available online for process leaves artifacts that can be detected by CNN training, which normally show people with open eyes, models such as VGG16 [101], ResNet50, ResNet101 i.e. very few images published on the internet show and ResNet152 [102]. A deep learning method to detect people with closed eyes. Thus, without having access to deepfakes based on the artifacts observed during the face images of people blinking, deepfake algorithms do not warping step of the deepfake generation algorithms was have the capability to generate fake faces that can blink proposed in [103]. The proposed method is evaluated normally. In other words, blinking rates in deepfakes are on two deepfake datasets, namely the UADFV and much lower than those in normal videos. To discriminate DeepfakeTIMIT. The UADFV dataset [104] contains 49 real and fake videos, Li et al. 
[99] first decompose the real videos and 49 fake videos with 32,752 frames in videos into frames where face regions and then eye areas total. The DeepfakeTIMIT dataset [64] includes a set of are extracted based on six eye landmarks. After few low quality videos of 64 x 64 size and another set of high steps of pre-processing such as aligning faces, extracting quality videos of 128 x 128 with totally 10,537 pristine and scaling the bounding boxes of eye landmark points images and 34,023 fabricated images extracted from 320 to create new sequences of frames, these cropped eye videos for each quality set. Performance of the proposed area sequences are distributed into long-term recurrent method is compared with other prevalent methods such convolutional networks (LRCN) [100] for dynamic state as two deepfake detection MesoNet methods, i.e. Meso- prediction. The LRCN consists of a feature extractor 4 and MesoInception-4 [94], HeadPose [104], and the based on CNN, a sequence learning based on long short face tampering detection method two-stream NN [105]. term memory (LSTM), and a state prediction based on Advantage of the proposed method is that it needs not a fully connected layer to predict of eye to generate deepfake videos as negative examples before open and close state. The eye blinking shows strong training the detection models. Instead, the negative ex- temporal dependencies and thus the implementation of amples are generated dynamically by extracting the face LSTM helps to capture these temporal patterns effec- region of the original image and aligning it into multiple tively. The blinking rate is calculated based on the scales before applying Gaussian blur to a scaled image prediction results where a blink is defined as a peak of random pick and warping back to the original image. 7

This reduces a large amount of time and computational et al. [104] proposed a detection method by observing resources compared to other methods, which require the differences between 3D head poses comprising head deepfakes are generated in advance. orientation and position, which are estimated based on Nguyen et al. [106] proposed the use of capsule 68 facial landmarks of the central face region. The 3D networks for detecting manipulated images and videos. head poses are examined because there is a shortcoming The capsule network was initially introduced to address in the deepfake face generation pipeline. The extracted limitations of CNNs when applied to inverse graphics features are fed into an SVM classifier to obtain the tasks, which aim to find physical processes used to pro- detection results. Experiments on two datasets show the duce images of the world [107]. The recent development great performance of the proposed approach against its of capsule network based on dynamic routing competing methods. The first dataset, namely UADFV, [108] demonstrates its ability to describe the hierarchical consists of 49 deep fake videos and their respective real pose relationships between object parts. This develop- videos [104]. The second dataset comprises 241 real ment is employed as a component in a pipeline for images and 252 deep fake images, which is a subset detecting fabricated images and videos as demonstrated of data used in the DARPA MediFor GAN Image/Video in Fig. 6. A dynamic routing algorithm is deployed to Challenge [113]. Likewise, a method to exploit artifacts route the outputs of the three capsules to the output of deepfakes and face manipulations based on visual capsules through a number of iterations to separate features of eyes, teeth and facial contours was studied in between fake and real images. The method is evaluated [114]. The visual artifacts arise from lacking global con- through four datasets covering a wide range of forged sistency, wrong or imprecise estimation of the incident image and video attacks. They include the well-known illumination, or imprecise estimation of the underlying Idiap Research Institute replay-attack dataset [109], the geometry. For deepfakes detection, missing reflections deepfake face swapping dataset created by Afchar et and missing details in the eye and teeth areas are al. [94], the facial reenactment FaceForensics dataset exploited as well as texture features extracted from the [110], produced by the Face2Face method [111], and facial region based on facial landmarks. Accordingly, the the fully computer-generated image dataset generated by eye feature vector, teeth feature vector and features ex- Rahmouni et al. [112]. The proposed method yields the tracted from the full-face crop are used. After extracting best performance compared to its competing methods the features, two classifiers including logistic regression in all of these datasets. This shows the potential of the and small neural network are employed to classify the capsule network in building a general detection system deepfakes from real videos. Experiments carried out on that can work effectively for various forged image and a video dataset downloaded from YouTube show the best video attacks. result of 0.851 in terms of the area under the receiver operating characteristics curve. The proposed method however has a disadvantage that requires images meeting certain prerequisite such as open eyes or visual teeth. 
The use of photo response non uniformity (PRNU) analysis was proposed in [115] to detect deepfakes from authentic ones. PRNU is a component of sensor pattern noise, which is attributed to the manufacturing imperfec- tion of silicon wafers and the inconsistent sensitivity of pixels to light because of the variation of the physical characteristics of the silicon wafers. When a photo is Fig. 6. Capsule network takes features obtained from the VGG-19 taken, the sensor imperfection is introduced into the network [101] to distinguish fake images or videos from the real ones high-frequency bands of the content in the form of invisi- (top). The pre-processing step detects face region and scales it to the ble noise. Because the imperfection is not uniform across size of 128x128 before VGG-19 is used to extract latent features for the silicon wafer, even sensors made from the silicon the capsule network, which comprises three primary capsules and two output capsules, one for real and one for fake images (bottom). The wafer produce PRNU. Therefore, PRNU is often statistical pooling constitutes an important part of capsule network considered as the fingerprint of digital cameras left in that deals with forgery detection [106]. the images by the cameras [116]. The analysis is widely used in image forensics [117]–[120] and advocated to b) Shallow classifiers: Deepfake detection methods use in [115] because the swapped face is supposed mostly rely on the artifacts or inconsistency of intrinsic to alter the local PRNU pattern in the facial area of features between fake and real images or videos. Yang video frames. The videos are converted into frames, 8 which are cropped to the questioned facial region. The their sabotage strategy without using social media. For cropped frames are then separated sequentially into eight example, this approach can be utilized by intelligence groups where an average PRNU pattern is computed services trying to influence decisions made by important for each group. Normalised cross correlation scores are people such as politicians, leading to national and in- calculated for comparisons of PRNU patterns among ternational security threats [124]. Catching the deepfake these groups. The authors in [115] created a test dataset alarming problem, research community has focused on consisting of 10 authentic videos and 16 manipulated developing deepfake detection algorithms and numerous videos, where the fake videos were produced from the results have been reported. This paper has reviewed genuine ones by the DeepFaceLab tool [34]. The analysis the state-of-the-art methods and a summary of typical shows a significant statistical difference in terms of mean approaches is provided in Table II. It is noticeable that a normalised cross correlation scores between deepfakes battle between those who use advanced and the genuine. This analysis therefore suggests that to create deepfakes with those who make effort to detect PRNU has a potential in deepfake detection although a deepfakes is growing. larger dataset would need to be tested. Deepfakes’ quality has been increasing and the per- When seeing a video or image with suspicion, users formance of detection methods needs to be improved normally want to search for its origin. However, there accordingly. The inspiration is that what AI has broken is currently no feasibility for such a tool. Hasan and can be fixed by AI as well [125]. 
Detection methods are Salah [121] proposed the use of and smart still in their early stage and various methods have been contracts to help users detect deepfake videos based proposed and evaluated but using fragmented datasets. on the assumption that videos are only real when their An approach to improve performance of detection meth- sources are traceable. Each video is associated with a ods is to create a growing updated benchmark dataset smart contract that links to its parent video and each of deepfakes to validate the ongoing development of parent video has a link to its child in a hierarchical struc- detection methods. This will facilitate the training pro- ture. Through this chain, users can credibly trace back cess of detection models, especially those based on deep to the original smart contract associated with pristine learning, which requires a large training set [126]. video even if the video has been copied multiple times. On the other hand, current detection methods mostly An important attribute of the smart contract is the unique focus on drawbacks of the deepfake generation pipelines, hashes of the interplanetary file system, which is used i.e. finding weakness of the competitors to attack them. to store video and its metadata in a decentralized and This kind of information and knowledge is not always content-addressable manner [122]. The smart contract’s available in adversarial environments where attackers key features and functionalities are tested against several commonly attempt not to reveal such deepfake creation common security challenges such as distributed denial technologies. Recent works on adversarial perturbation of services, replay and man in the middle attacks to attacks to fool DNN-based detectors make the deepfake ensure the solution meeting security requirements. This detection task more difficult [127]–[131]. These are approach is generic, and it can be extended to other types real challenges for detection method development and of digital content, e.g., images, audios and manuscripts. a future research needs to focus on introducing more robust, scalable and generalizable methods. IV. DISCUSSIONSAND FUTURE RESEARCH Another research direction is to integrate detection DIRECTIONS methods into distribution platforms such as social me- Deepfakes have begun to erode trust of people in dia to increase its effectiveness in dealing with the media contents as seeing them is no longer commen- widespread impact of deepfakes. The screening or filter- surate with believing in them. They could cause distress ing mechanism using effective detection methods can be and negative effects to those targeted, heighten disin- implemented on these platforms to ease the deepfakes formation and hate speech, and even could stimulate detection [124]. Legal requirements can be made for political tension, inflame the public, violence or war. tech companies who own these platforms to remove This is especially critical nowadays as the technologies deepfakes quickly to reduce its impacts. In addition, for creating deepfakes are increasingly approachable and watermarking tools can also be integrated into devices social media platforms can spread those fake contents that people use to make digital contents to create im- quickly [123]. Sometimes deepfakes do not need to be mutable metadata for storing originality details such as spread to massive audience to cause detrimental effects. 
time and location of multimedia contents as well as their People who create deepfakes with malicious purpose untampered attestment [124]. This integration is difficult only need to deliver them to target audiences as part of to implement but a solution for this could be the use 9 of the disruptive blockchain technology. The blockchain Using detection methods to spot deepfakes is crucial, has been used effectively in many areas and there are but understanding the real intent of people publishing very few studies so far addressing the deepfake detection deepfakes is even more important. This requires the problems based on this technology. As it can create judgement of users based on social context in which a chain of unique unchangeable blocks of metadata, deepfake is discovered, e.g. who distributed it and what it is a great tool for digital provenance solution. The they said about it [132]. This is critical as deepfakes integration of blockchain technologies to this problem are getting more and more photorealistic and it is highly has demonstrated certain results [121] but this research anticipated that detection software will be lagging behind direction is far from mature. deepfake creation technology. A study on social context

TABLE II SUMMARY OF PROMINENT DEEPFAKE DETECTION METHODS

Methods Classifiers/ Key Features Dealing Datasets Used Techniques with Eye blinking [99] LRCN - Use LRCN to learn the temporal patterns of eye Videos Consist of 49 interview and presentation videos, blinking. and their corresponding generated deepfakes. - Based on the observation that blinking frequency of deepfakes is much smaller than normal. Intra-frame CNN and CNN is employed to extract frame-level features, Videos A collection of 600 videos obtained from multiple and temporal LSTM which are distributed to LSTM to construct sequence websites. inconsistencies descriptor useful for classification. [98] Using face VGG16 [101] Artifacts are discovered using CNN models based Videos - UADFV [104], containing 49 real videos and 49 warping artifacts ResNet50, on resolution inconsistency between the warped face fake videos with 32752 frames in total. [103] 101 or 152 area and the surrounding context. - DeepfakeTIMIT [64] [102] MesoNet [94] CNN - Two deep networks, i.e. Meso-4 and Videos Two datasets: deepfake one constituted from online MesoInception-4 are introduced to examine videos and the FaceForensics one created by the deepfake videos at the mesoscopic analysis level. Face2Face approach [111]. - Accuracy obtained on deepfake and FaceForensics datasets are 98% and 95% respectively. Capsule- Capsule - Latent features extracted by VGG-19 network [101] Videos/ Four datasets: the Idiap Research Institute replay- forensics [106] networks are fed into the capsule network for classification. Images attack [109], deepfake face swapping by [94], - A dynamic routing algorithm [108] is used to route facial reenactment FaceForensics [110], and fully the outputs of three convolutional capsules to two computer-generated image set using [112]. output capsules, one for fake and another for real images, through a number of iterations. Head poses [104] SVM - Features are extracted using 68 landmarks of the Videos/ - UADFV consists of 49 deep fake videos and their face region. Images respective real videos. - Use SVM to classify using the extracted features. - 241 real images and 252 deep fake images from DARPA MediFor GAN Image/Video Challenge. Eye, teach and Logistic - Exploit facial texture differences, and missing Videos A video dataset downloaded from YouTube. facial texture regression reflections and details in eye and teeth areas of [114] and neural deepfakes. network - Logistic regression and neural network are used for classifying. Spatio-temporal RCN Temporal discrepancies across frames are explored Videos FaceForensics++ dataset, including 1,000 videos features with using RCN that integrates convolutional network [97]. RCN [95] DenseNet [79] and the gated recurrent unit cells [96] Spatio-temporal Convolutional - An XceptionNet CNN is used for facial feature Videos FaceForensics++ [97] and Celeb-DF (5,639 deep- features with bidirectional extraction while audio embeddings are obtained by fake videos) [141] datasets and the ASVSpoof LSTM [140] recurrent stacking multiple modules. 2019 Logical Access audio dataset [142]. LSTM - Two loss functions, i.e. cross-entropy and network Kullback-Leibler divergence, are used. Analysis of PRNU - Analysis of noise patterns of light sensitive sensors Videos Created by the authors, including 10 authentic and PRNU [115] of digital cameras due to their factory defects. 16 deepfake videos using DeepFaceLab [34]. - Explore the differences of PRNU patterns be- tween the authentic and deepfake videos because face swapping is believed to alter the local PRNU patterns. 
Phoneme-viseme CNN - Exploit the mismatches between the dynamics Videos Four in-the-wild lip-sync deepfakes from mismatches of the mouth shape, i.e. visemes, with a spoken and YouTube (www.instagram.com/bill posters uk [133] phoneme. and youtu.be/VWMEDacz3L4) and others are cre- - Focus on sounds associated with the M, B and P ated using synthesis techniques, i.e. Audio-to- phonemes as they require complete mouth Video (A2V) [63] and Text-to-Video (T2V) [134]. while deepfakes often incorrectly synthesize it. 10

Methods Classifiers/ Key Features Dealing Datasets Used Techniques with Using ResNet50 - The ABC metric [137] is used to detect deepfake Videos VidTIMIT and two other original attribution- model [102], videos without accessing to training data. datasets obtained from the COHFACE based pre-trained on - ABC values obtained for original videos are greater (https://www.idiap.ch/dataset/cohface) and confidence VGGFace2 than 0.94 while those of deepfakes have low ABC from YouTube. datasets from COHFACE (ABC) metric [136] values. [138] and YouTube are used to generate two [135] deepfake datasets by commercial website https://deepfakesweb.com and another deepfake dataset is DeepfakeTIMIT [139]. Emotion audio- Siamese Modality and emotion embedding vectors for the Videos DeepfakeTIMIT [139] and DFDC [144]. visual affective network [78] face and speech are extracted for deepfake detection. cues [143] Using Rules defined Temporal, behavioral biometric based on facial ex- Videos The world leaders dataset [1], FaceForensics++ appearance based on facial pressions and head movements are learned using [97], /Jigsaw deepfake detection dataset and behaviour and behavioural ResNet-101 [102] while static facial biometric is [146], DFDC [144] and Celeb-DF [141]. [145] features. obtained using VGG [60]. FakeCatcher CNN Extract biological in portrait videos and use Videos UADFV [104], FaceForensics [110], FaceForen- [147] them as an implicit descriptor of authenticity because sics++ [97], Celeb-DF [141], and a new dataset of they are not spatially and temporally well-preserved 142 videos, independent of the generative model, in deepfakes. resolution, compression, content, and context. Preprocessing DCGAN, - Enhance generalization ability of deep learning Images - Real dataset: CelebA-HQ [85], including high combined with WGAN-GP models to detect GAN generated images. quality face images of 1024x1024 resolution. deep network and PGGAN. - Remove low level features of fake images. - Fake datasets: generated by DCGAN [81], [71] - Force deep networks to focus more on pixel level WGAN-GP [83] and PGGAN [85]. similarity between fake and real images to improve generalization ability. Bag of words SVM, RF, MLP Extract discriminant features using bag of words Images The well-known LFW face database [148], con- and shallow method and feed these features into SVM, RF and taining 13,223 images with resolution of 250x250. classifiers [67] MLP for binary classification: innocent vs fabricated. Pairwise learn- CNN concate- Two-phase procedure: feature extraction using CFFN Images - Face images: real ones from CelebA [80], and ing [77] nated to CFFN based on the Siamese network architecture [78] and fake ones generated by DCGAN [81], WGAN classification using CNN. [82], WGAN-GP [83], least squares GAN [84], and PGGAN [85]. - General images: real ones from ILSVRC12 [86], and fake ones generated by BIGGAN [87], self- attention GAN [88] and spectral normalization GAN [89]. Defenses VGG [60] and - Introduce adversarial perturbations to enhance Images 5,000 real images from CelebA [80] and 5,000 fake against ResNet [102] deepfakes and fool deepfake detectors. images created by the “Few-Shot Face Translation adversarial - Improve accuracy of deepfake detectors using Lips- GAN” method [149]. perturbations in chitz regularization and deep image prior techniques. 
deepfakes [127] Analyzing KNN, SVM, Using expectation maximization algorithm to extract Images Authentic images from CelebA and correspond- convolutional and linear local features pertaining to convolutional generative ing deepfakes are created by five different GANs traces [150] discriminant process of GAN-based image deepfake generators. (group-wise deep whitening-and-coloring transfor- analysis mation GDWCT [151], StarGAN [152], AttGAN [153], StyleGAN [154], StyleGAN2 [155]). Face X-ray CNN - Try to locate the blending boundary between the Images FaceForensics++ [97], DeepfakeDetection (DFD) [156] target and original faces instead of capturing the [146], DFDC [144] and Celeb-DF [141]. synthesized artifacts of specific manipulations. - Can be trained without fake images. Using common ResNet-50 Train the classifier using a large number of fake Images A new dataset of CNN-generated images, namely artifacts of [102] pre- images generated by a high-performing uncondi- ForenSynths, consisting of synthesized images CNN-generated trained with tional GAN model, i.e., ProGAN [85] and evaluate from 11 models such as StyleGAN [154], super- images [157] ImageNet [86] how well the classifier generalizes to other CNN- resolution methods [158] and FaceForensics++ synthesized images. [97]. of deepfakes to assist users in such judgement is thus and thus the experts’ opinions may not be enough to worth performing. authenticate these evidences because even experts are Videos and photographics have been widely used as unable to discern manipulated contents. This aspect evidences in police investigation and justice cases. They needs to take into account in courtrooms nowadays when may be introduced as evidences in a court of law by images and videos are used as evidences to convict digital media forensics experts who have background in perpetrators because of the existence of a wide range of computer or law enforcement and experience in collect- digital manipulation methods [159]. The digital media ing, examining and analysing digital information. The forensics results therefore must be proved to be valid development of machine learning and AI technologies and reliable before they can be used in courts. This might have been used to modify these digital contents 11 requires careful documentation for each step of the foren- [11] Guo, B., Ding, Y., Yao, L., Liang, Y., and Yu, Z. (2020). The sics process and how the results are reached. Machine future of false information detection on social media: new perspectives and trends. ACM Computing Surveys (CSUR), learning and AI algorithms can be used to support the 53(4), 1-36. determination of the authenticity of digital media and [12] Tucker, P. (2019, March 31). The newest AI-enabled have obtained accurate and reliable results, e.g. [160], weapon: ‘Deep-Faking’ photos of the earth. Available at [161], but most of these algorithms are unexplainable. https://www.defenseone.com/technology/2019/03/next-phase- ai-deep-faking-whole-world-and-china-ahead/155944/ This creates a huge hurdle for the applications of AI [13] Fish, T. (2019, April 4). Deep fakes: AI-manipulated in forensics problems because not only the forensics media will be ‘weaponised’ to trick military. Avail- experts oftentimes do not have expertise in computer able at https://www.express.co.uk/news/science/1109783/deep- algorithms, but the computer professionals also cannot fakes-ai-artificial-intelligence-photos-video-weaponised-china [14] Marr, B. (2019, July 22). 
The best (and scariest) explain the results properly as most of these algorithms examples of AI-enabled deepfakes. Available at are black box models [162]. This is more critical as https://www.forbes.com/sites/bernardmarr/2019/07/22/the- the most recent models with the most accurate results best-and-scariest-examples-of-ai-enabled-deepfakes/ [15] Zakharov, E., Shysheya, A., Burkov, E., and Lempitsky, V. are based on deep learning methods consisting of many (2019). Few-shot adversarial learning of realistic neural talking neural network parameters. Explainable AI in computer head models. arXiv preprint arXiv:1905.08233. vision therefore is a research direction that is needed to [16] Damiani, J. (2019, September 3). A voice deepfake was promote and utilize the advances and advantages of AI used to scam a CEO out of $243,000. Available at https://www.forbes.com/sites/jessedamiani/2019/09/03/a- and machine learning in digital media forensics. voice-deepfake-was-used-to-scam-a-ceo-out-of-243000/ [17] Samuel, S. (2019, June 27). A guy made a deepfake app to turn photos of women into nudes. It didn’t go well. Available at https://www.vox.com/2019/6/27/18761639/ai-deepfake- REFERENCES deepnude-app-nude-women-porn [18] (2019, September 2). Chinese deepfake [1] Agarwal, S., Farid, H., Gu, Y., He, M., Nagano, K., and Li, app Zao sparks privacy row after going viral. Available at H. (2019, June). Protecting world leaders against deep fakes. https://www.theguardian.com/technology/2019/sep/02/chinese- In Computer Vision and Workshops (pp. face-swap-app-zao-triggers-privacy-fears-viral 38-45). [19] Lyu, S. (2020, July). Deepfake detection: current challenges [2] Tewari, A., Zollhoefer, M., Bernard, F., Garrido, P., Kim, H., and next steps. In IEEE International Conference on Multime- Perez, P., and Theobalt, C. (2020). High-fidelity monocular dia and Expo Workshops (ICMEW) (pp. 1-6). IEEE. face reconstruction based on an unsupervised model-based [20] Guarnera, L., Giudice, O., Nastasi, C., and Battiato, S. (2020). face autoencoder. IEEE Transactions on Pattern Analysis and Preliminary forensics analysis of deepfake images. arXiv Machine Intelligence, 42(2), 357-370. preprint arXiv:2004.12626. [21] Jafar, M. T., Ababneh, M., Al-Zoube, M., and Elhassan, A. [3] Lin, J., Li, Y., & Yang, G. (2021). FPGAN: Face de- (2020, April). Forensics and analysis of deepfake videos. identification method with generative adversarial networks for In The 11th International Conference on Information and social robots. Neural Networks, 133, 132-147. Communication Systems (ICICS) (pp. 053-058). IEEE. [4] Liu, M. Y., Huang, X., Yu, J., Wang, T. C., & Mallya, A. [22] Trinh, L., Tsang, M., Rambhatla, S., and Liu, Y. (2020). (2021). Generative adversarial networks for image and video Interpretable deepfake detection via dynamic prototypes. arXiv synthesis: Algorithms and applications. Proceedings of the preprint arXiv:2006.15473. IEEE, DOI: 10.1109/JPROC.2021.3049196. [23] Younus, M. A., and Hasan, T. M. (2020, April). Effective [5] Lyu, S. (2018, August 29). Detecting ‘deepfake’ and fast deepfake detection method based on Haar videos in the blink of an eye. Available at transform. In 2020 International Conference on Computer http://theconversation.com/detecting-deepfake-videos-in- Science and Software (CSASE) (pp. 186-190). the-blink-of-an-eye-101072 IEEE. [6] Bloomberg (2018, September 11). How faking videos [24] Turek, M. (2019). Media Forensics (MediFor). Available at became easy and why that’s so scary. 
Available at https://www.darpa.mil/program/media-forensics https://fortune.com/2018/09/11/deep-fakes-obama-video/ [25] Schroepfer, M. (2019, September 5). Creating a [7] Chesney, R., and Citron, D. (2019). Deepfakes and the new data set and a challenge for deepfakes. Available at war: The coming age of post-truth geopolitics. https://ai.facebook.com/blog/deepfake-detection-challenge Foreign Affairs, 98, 147. [26] Tolosana, R., Vera-Rodriguez, R., Fierrez, J., Morales, A., & [8] Hwang, T. (2020). Deepfakes: A Grounded Threat Assessment. Ortega-Garcia, J. (2020). Deepfakes and beyond: A survey of Centre for Security and Emerging Technologies, Georgetown face manipulation and fake detection. Information Fusion, 64, University. 131-148. [9] Zhou, X., and Zafarani, R. (2020). A survey of fake [27] Verdoliva, L. (2020). Media forensics and deepfakes: an news: fundamental theories, detection methods, and overview. IEEE Journal of Selected Topics in Signal Process- opportunities. ACM Computing Surveys (CSUR), DOI: ing, 14(5), 910-932. https://doi.org/10.1145/3395046. [28] Mirsky, Y., & Lee, W. (2021). The creation and detection of [10] Kaliyar, R. K., Goswami, A., and Narang, P. (2020). Deepfake: deepfakes: A survey. ACM Computing Surveys (CSUR), 54(1), improving fake news detection using tensor decomposition 1-41. based deep neural network. Journal of Supercomputing, DOI: [29] Punnappurath, A., and Brown, M. S. (2019). Learning raw https://doi.org/10.1007/s11227-020-03294-y. image reconstruction-aware deep image compressors. IEEE 12

[30] Cheng, Z., Sun, H., Takeuchi, M., and Katto, J. (2019). Energy compaction-based image compression using convolutional autoencoder. IEEE Transactions on Multimedia. DOI: 10.1109/TMM.2019.2938345.
[31] Chorowski, J., Weiss, R. J., Bengio, S., and Oord, A. V. D. (2019). Unsupervised speech representation learning using WaveNet autoencoders. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 27(12), 2041-2053.
[32] Faceswap: Deepfakes software for all. Available at https://github.com/deepfakes/faceswap
[33] FakeApp 2.2.0. Available at https://www.malavida.com/en/soft/fakeapp/
[34] DeepFaceLab. Available at https://github.com/iperov/DeepFaceLab
[35] DFaker. Available at https://github.com/dfaker/df
[36] DeepFake_tf: Deepfake based on tensorflow. Available at https://github.com/StromWine/DeepFake_tf
[37] Keras-VGGFace: VGGFace implementation with Keras framework. Available at https://github.com/rcmalli/keras-vggface
[38] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., ... and Bengio, Y. (2014). Generative adversarial nets. In Advances in Neural Information Processing Systems (pp. 2672-2680).
[39] Faceswap-GAN. Available at https://github.com/shaoanlu/faceswap-GAN.
[40] FaceNet. Available at https://github.com/davidsandberg/facenet.
[41] CycleGAN. Available at https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix.
[42] Liu, M. Y., Huang, X., Mallya, A., Karras, T., Aila, T., Lehtinen, J., and Kautz, J. (2019). Few-shot unsupervised image-to-image translation. In Proceedings of the IEEE International Conference on Computer Vision (pp. 10551-10560).
[43] Park, T., Liu, M. Y., Wang, T. C., and Zhu, J. Y. (2019). Semantic image synthesis with spatially-adaptive normalization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2337-2346).
[44] DeepFaceLab: Explained and usage tutorial. Available at https://mrdeepfakes.com/forums/thread-deepfacelab-explained-and-usage-tutorial.
[45] DSSIM. Available at https://github.com/keras-team/keras-contrib/blob/master/keras_contrib/losses/dssim.py.
[46] Lattas, A., Moschoglou, S., Gecer, B., Ploumpis, S., Triantafyllou, V., Ghosh, A., and Zafeiriou, S. (2020). AvatarMe: realistically renderable 3D facial reconstruction "in-the-wild". In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 760-769).
[47] Ha, S., Kersner, M., Kim, B., Seo, S., and Kim, D. (2020, April). MarioNETte: few-shot face reenactment preserving identity of unseen targets. In Proceedings of the AAAI Conference on Artificial Intelligence (vol. 34, no. 07, pp. 10893-10900).
[48] Deng, Y., Yang, J., Chen, D., Wen, F., and Tong, X. (2020). Disentangled and controllable face image generation via 3D imitative-contrastive learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 5154-5163).
[49] Tewari, A., Elgharib, M., Bharaj, G., Bernard, F., Seidel, H. P., Pérez, P., ... and Theobalt, C. (2020). StyleRig: Rigging StyleGAN for 3D control over portrait images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 6142-6151).
[50] Li, L., Bao, J., Yang, H., Chen, D., and Wen, F. (2019). FaceShifter: Towards high fidelity and occlusion aware face swapping. arXiv preprint arXiv:1912.13457.
[51] Nirkin, Y., Keller, Y., and Hassner, T. (2019). FSGAN: subject agnostic face swapping and reenactment. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 7184-7193).
[52] Olszewski, K., Tulyakov, S., Woodford, O., Li, H., and Luo, L. (2019). Transformable bottleneck networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 7648-7657).
[53] Chan, C., Ginosar, S., Zhou, T., and Efros, A. A. (2019). Everybody dance now. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 5933-5942).
[54] Thies, J., Elgharib, M., Tewari, A., Theobalt, C., and Nießner, M. (2020, August). Neural voice puppetry: Audio-driven facial reenactment. In European Conference on Computer Vision (pp. 716-731). Springer, Cham.
[55] Chesney, R., and Citron, D. K. (2018). Deep fakes: a looming challenge for privacy, democracy, and national security. https://dx.doi.org/10.2139/ssrn.3213954.
[56] de Lima, O., Franklin, S., Basu, S., Karwoski, B., and George, A. (2020). Deepfake detection using spatiotemporal convolutional networks. arXiv preprint arXiv:2006.14749.
[57] Amerini, I., and Caldelli, R. (2020, June). Exploiting prediction error inconsistencies through LSTM-based classifiers to detect deepfake videos. In Proceedings of the 2020 ACM Workshop on Information Hiding and Multimedia Security (pp. 97-102).
[58] Korshunov, P., and Marcel, S. (2019). Vulnerability assessment and detection of deepfake videos. In The 12th IAPR International Conference on Biometrics (ICB), pp. 1-6.
[59] VidTIMIT database. Available at http://conradsanderson.id.au/vidtimit/
[60] Parkhi, O. M., Vedaldi, A., and Zisserman, A. (2015, September). Deep face recognition. In Proceedings of the British Machine Vision Conference (BMVC) (pp. 41.1-41.12).
[61] Schroff, F., Kalenichenko, D., and Philbin, J. (2015). FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 815-823).
[62] Chung, J. S., Senior, A., Vinyals, O., and Zisserman, A. (2017, July). Lip reading sentences in the wild. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 3444-3453).
[63] Suwajanakorn, S., Seitz, S. M., and Kemelmacher-Shlizerman, I. (2017). Synthesizing Obama: learning lip sync from audio. ACM Transactions on Graphics (TOG), 36(4), 1-13.
[64] Korshunov, P., and Marcel, S. (2018, September). Speaker inconsistency detection in tampered video. In 2018 26th European Signal Processing Conference (EUSIPCO) (pp. 2375-2379). IEEE.
[65] Galbally, J., and Marcel, S. (2014, August). Face anti-spoofing based on general image quality assessment. In 2014 22nd International Conference on Pattern Recognition (pp. 1173-1178). IEEE.
[66] Korshunova, I., Shi, W., Dambre, J., and Theis, L. (2017). Fast face-swap using convolutional neural networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 3677-3685).
[67] Zhang, Y., Zheng, L., and Thing, V. L. (2017, August). Automated face swapping and its detection. In 2017 IEEE 2nd International Conference on Signal and Image Processing (ICSIP) (pp. 15-19). IEEE.
[68] Wang, X., Thome, N., and Cord, M. (2017). Gaze latent support vector machine for image classification improved by weakly supervised region selection. Pattern Recognition, 72, 59-71.

[69] Bai, S. (2017). Growing random forest on deep convolutional neural networks for scene categorization. Expert Systems with Applications, 71, 279-287.
[70] Zheng, L., Duffner, S., Idrissi, K., Garcia, C., and Baskurt, A. (2016). Siamese multi-layer perceptrons for dimensionality reduction and face identification. Multimedia Tools and Applications, 75(9), 5055-5073.
[71] Xuan, X., Peng, B., Dong, J., and Wang, W. (2019). On the generalization of GAN image forensics. arXiv preprint arXiv:1902.11153.
[72] Yang, P., Ni, R., and Zhao, Y. (2016, September). Recapture image forensics based on Laplacian convolutional neural networks. In International Workshop on Digital Watermarking (pp. 119-128).
[73] Bayar, B., and Stamm, M. C. (2016, June). A deep learning approach to universal image manipulation detection using a new convolutional layer. In Proceedings of the 4th ACM Workshop on Information Hiding and Multimedia Security (pp. 5-10). ACM.
[74] Qian, Y., Dong, J., Wang, W., and Tan, T. (2015, March). Deep learning for steganalysis via convolutional neural networks. In Media Watermarking, Security, and Forensics 2015 (Vol. 9409, p. 94090J).
[75] Agarwal, S., and Varshney, L. R. (2019). Limits of deepfake detection: A robust estimation viewpoint. arXiv preprint arXiv:1905.03493.
[76] Maurer, U. M. (2000). Authentication theory and hypothesis testing. IEEE Transactions on Information Theory, 46(4), 1350-1356.
[77] Hsu, C. C., Zhuang, Y. X., and Lee, C. Y. (2020). Deep fake image detection based on pairwise learning. Applied Sciences, 10(1), 370.
[78] Chopra, S. (2005). Learning a similarity metric discriminatively, with application to face verification. In IEEE Conference on Computer Vision and Pattern Recognition (pp. 539-546).
[79] Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K. Q. (2017). Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4700-4708).
[80] Liu, Z., Luo, P., Wang, X., and Tang, X. (2015). Deep learning face attributes in the wild. In Proceedings of the IEEE International Conference on Computer Vision (pp. 3730-3738).
[81] Radford, A., Metz, L., and Chintala, S. (2015). Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434.
[82] Arjovsky, M., Chintala, S., and Bottou, L. (2017, July). Wasserstein generative adversarial networks. In International Conference on Machine Learning (pp. 214-223).
[83] Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., and Courville, A. C. (2017). Improved training of Wasserstein GANs. In Advances in Neural Information Processing Systems (pp. 5767-5777).
[84] Mao, X., Li, Q., Xie, H., Lau, R. Y., Wang, Z., and Paul Smolley, S. (2017). Least squares generative adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2794-2802).
[85] Karras, T., Aila, T., Laine, S., and Lehtinen, J. (2017). Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196.
[86] Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., ... and Berg, A. C. (2015). ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211-252.
[87] Brock, A., Donahue, J., and Simonyan, K. (2018). Large scale GAN training for high fidelity natural image synthesis. arXiv preprint arXiv:1809.11096.
[88] Zhang, H., Goodfellow, I., Metaxas, D., and Odena, A. (2018). Self-attention generative adversarial networks. arXiv preprint arXiv:1805.08318.
[89] Miyato, T., Kataoka, T., Koyama, M., and Yoshida, Y. (2018). Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957.
[90] Farid, H. (2009). Image forgery detection. IEEE Signal Processing Magazine, 26(2), 16-25.
[91] Mo, H., Chen, B., and Luo, W. (2018, June). Fake faces identification via convolutional neural network. In Proceedings of the 6th ACM Workshop on Information Hiding and Multimedia Security (pp. 43-47).
[92] Marra, F., Gragnaniello, D., Cozzolino, D., and Verdoliva, L. (2018, April). Detection of GAN-generated fake images over social networks. In 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR) (pp. 384-389). IEEE.
[93] Hsu, C. C., Lee, C. Y., and Zhuang, Y. X. (2018, December). Learning to detect fake face images in the wild. In 2018 International Symposium on Computer, Consumer and Control (IS3C) (pp. 388-391). IEEE.
[94] Afchar, D., Nozick, V., Yamagishi, J., and Echizen, I. (2018, December). MesoNet: a compact facial video forgery detection network. In 2018 IEEE International Workshop on Information Forensics and Security (WIFS) (pp. 1-7). IEEE.
[95] Sabir, E., Cheng, J., Jaiswal, A., AbdAlmageed, W., Masi, I., and Natarajan, P. (2019). Recurrent convolutional strategies for face manipulation detection in videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (pp. 80-87).
[96] Cho, K., van Merrienboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., and Bengio, Y. (2014, October). Learning phrase representations using RNN encoder–decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 1724-1734).
[97] Rossler, A., Cozzolino, D., Verdoliva, L., Riess, C., Thies, J., and Nießner, M. (2019). FaceForensics++: Learning to detect manipulated facial images. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 1-11).
[98] Guera, D., and Delp, E. J. (2018, November). Deepfake video detection using recurrent neural networks. In 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS) (pp. 1-6). IEEE.
[99] Li, Y., Chang, M. C., and Lyu, S. (2018, December). In ictu oculi: Exposing AI created fake videos by detecting eye blinking. In 2018 IEEE International Workshop on Information Forensics and Security (WIFS) (pp. 1-7). IEEE.
[100] Donahue, J., Anne Hendricks, L., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., and Darrell, T. (2015). Long-term recurrent convolutional networks for visual recognition and description. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2625-2634).
[101] Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
[102] He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770-778).
[103] Li, Y., and Lyu, S. (2019). Exposing deepfake videos by detecting face warping artifacts. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (pp. 46-52).

[104] Yang, X., Li, Y., and Lyu, S. (2019, May). Exposing deep fakes using inconsistent head poses. In 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 8261-8265). IEEE.
[105] Zhou, P., Han, X., Morariu, V. I., and Davis, L. S. (2017, July). Two-stream neural networks for tampered face detection. In 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (pp. 1831-1839). IEEE.
[106] Nguyen, H. H., Yamagishi, J., and Echizen, I. (2019, May). Capsule-forensics: Using capsule networks to detect forged images and videos. In 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 2307-2311). IEEE.
[107] Hinton, G. E., Krizhevsky, A., and Wang, S. D. (2011, June). Transforming auto-encoders. In International Conference on Artificial Neural Networks (pp. 44-51). Springer, Berlin, Heidelberg.
[108] Sabour, S., Frosst, N., and Hinton, G. E. (2017). Dynamic routing between capsules. In Advances in Neural Information Processing Systems (pp. 3856-3866).
[109] Chingovska, I., Anjos, A., and Marcel, S. (2012, September). On the effectiveness of local binary patterns in face anti-spoofing. In Proceedings of the International Conference of Biometrics Special Interest Group (BIOSIG) (pp. 1-7). IEEE.
[110] Rossler, A., Cozzolino, D., Verdoliva, L., Riess, C., Thies, J., and Nießner, M. (2018). FaceForensics: A large-scale video dataset for forgery detection in human faces. arXiv preprint arXiv:1803.09179.
[111] Thies, J., Zollhöfer, M., Stamminger, M., Theobalt, C., and Nießner, M. (2016). Face2Face: Real-time face capture and reenactment of RGB videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2387-2395).
[112] Rahmouni, N., Nozick, V., Yamagishi, J., and Echizen, I. (2017, December). Distinguishing computer graphics from natural images using convolution neural networks. In 2017 IEEE Workshop on Information Forensics and Security (WIFS) (pp. 1-6). IEEE.
[113] Guan, H., Kozak, M., Robertson, E., Lee, Y., Yates, A. N., Delgado, A., ... and Fiscus, J. (2019, January). MFC datasets: Large-scale benchmark datasets for media forensic challenge evaluation. In 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW) (pp. 63-72).
[114] Matern, F., Riess, C., and Stamminger, M. (2019, January). Exploiting visual artifacts to expose deepfakes and face manipulations. In 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW) (pp. 83-92). IEEE.
[115] Koopman, M., Rodriguez, A. M., and Geradts, Z. (2018). Detection of deepfake video manipulation. In The 20th Irish Machine Vision and Image Processing Conference (IMVIP) (pp. 133-136).
[116] Rosenfeld, K., and Sencar, H. T. (2009, February). A study of the robustness of PRNU-based camera identification. In Media Forensics and Security (Vol. 7254, p. 72540M). International Society for Optics and Photonics.
[117] Li, C. T., and Li, Y. (2012). Color-decoupled photo response non-uniformity for digital image forensics. IEEE Transactions on Circuits and Systems for Video Technology, 22(2), 260-271.
[118] Lin, X., and Li, C. T. (2017). Large-scale image clustering based on camera fingerprints. IEEE Transactions on Information Forensics and Security, 12(4), 793-808.
[119] Scherhag, U., Debiasi, L., Rathgeb, C., Busch, C., and Uhl, A. (2019). Detection of face morphing attacks based on PRNU analysis. IEEE Transactions on Biometrics, Behavior, and Identity Science, 1(4), 302-317.
[120] Phan, Q. T., Boato, G., and De Natale, F. G. (2019). Accurate and scalable image clustering based on sparse representation of camera fingerprint. IEEE Transactions on Information Forensics and Security, 14(7), 1902-1916.
[121] Hasan, H. R., and Salah, K. (2019). Combating deepfake videos using blockchain and smart contracts. IEEE Access, 7, 41596-41606.
[122] IPFS powers the Distributed Web. Available at https://ipfs.io/
[123] Zubiaga, A., Aker, A., Bontcheva, K., Liakata, M., and Procter, R. (2018). Detection and resolution of rumours in social media: A survey. ACM Computing Surveys (CSUR), 51(2), 1-36.
[124] Chesney, R., and Citron, D. K. (2018, October 16). Disinformation on steroids: The threat of deep fakes. Available at https://www.cfr.org/report/deep-fake-disinformation-steroids.
[125] Floridi, L. (2018). Artificial intelligence, deepfakes and a future of ectypes. Philosophy and Technology, 31(3), 317-321.
[126] Dolhansky, B., Bitton, J., Pflaum, B., Lu, J., Howes, R., Wang, M., and Ferrer, C. C. (2020). The deepfake detection challenge dataset. arXiv preprint arXiv:2006.07397.
[127] Gandhi, A., and Jain, S. (2020). Adversarial perturbations fool deepfake detectors. arXiv preprint arXiv:2003.10596.
[128] Neekhara, P., Hussain, S., Jere, M., Koushanfar, F., and McAuley, J. (2020). Adversarial deepfakes: evaluating vulnerability of deepfake detectors to adversarial examples. arXiv preprint arXiv:2002.12749.
[129] Carlini, N., and Farid, H. (2020). Evading deepfake-image detectors with white- and black-box attacks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (pp. 658-659).
[130] Yang, C., Ding, L., Chen, Y., and Li, H. (2020). Defending against GAN-based deepfake attacks via transformation-aware adversarial faces. arXiv preprint arXiv:2006.07421.
[131] Yeh, C. Y., Chen, H. W., Tsai, S. L., and Wang, S. D. (2020). Disrupting image-translation-based deepfake algorithms with adversarial attacks. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision Workshops (pp. 53-62).
[132] Read, M. (2019, June 27). Can you spot a deepfake? Does it matter? Available at http://nymag.com/intelligencer/2019/06/how-do-you-spot-a-deepfake-it-might-not-matter.html.
[133] Agarwal, S., Farid, H., Fried, O., and Agrawala, M. (2020). Detecting deep-fake videos from phoneme-viseme mismatches. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (pp. 660-661).
[134] Fried, O., Tewari, A., Zollhöfer, M., Finkelstein, A., Shechtman, E., Goldman, D. B., ... and Agrawala, M. (2019). Text-based editing of talking-head video. ACM Transactions on Graphics (TOG), 38(4), 1-14.
[135] Fernandes, S., Raj, S., Ewetz, R., Singh Pannu, J., Kumar Jha, S., Ortiz, E., ... and Salter, M. (2020). Detecting deepfake videos using attribution-based confidence metric. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (pp. 308-309).
[136] Cao, Q., Shen, L., Xie, W., Parkhi, O. M., and Zisserman, A. (2018, May). VGGFace2: A dataset for recognising faces across pose and age. In 2018 13th IEEE International Conference on Automatic Face and Gesture Recognition (pp. 67-74). IEEE.
[137] Jha, S., Raj, S., Fernandes, S., Jha, S. K., Jha, S., Jalaian, B., ... and Swami, A. (2019). Attribution-based confidence metric for deep neural networks. In Advances in Neural Information Processing Systems (pp. 11826-11837).

[138] Fernandes, S., Raj, S., Ortiz, E., Vintila, I., Salter, M., Urosevic, G., and Jha, S. (2019, October). Predicting heart rate variations of deepfake videos using neural ODE. In 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW) (pp. 1721-1729). IEEE.
[139] Korshunov, P., and Marcel, S. (2018). Deepfakes: a new threat to face recognition? assessment and detection. arXiv preprint arXiv:1812.08685.
[140] Chintha, A., Thai, B., Sohrawardi, S. J., Bhatt, K. M., Hickerson, A., Wright, M., and Ptucha, R. (2020). Recurrent convolutional structures for audio spoof and video deepfake detection. IEEE Journal of Selected Topics in Signal Processing, DOI: 10.1109/JSTSP.2020.2999185.
[141] Li, Y., Yang, X., Sun, P., Qi, H., and Lyu, S. (2020). Celeb-DF: A large-scale challenging dataset for deepfake forensics. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 3207-3216).
[142] Todisco, M., Wang, X., Vestman, V., Sahidullah, M., Delgado, H., Nautsch, A., ... and Lee, K. A. (2019). ASVspoof 2019: Future horizons in spoofed and fake audio detection. arXiv preprint arXiv:1904.05441.
[143] Mittal, T., Bhattacharya, U., Chandra, R., Bera, A., and Manocha, D. (2020). Emotions don't lie: A deepfake detection method using audio-visual affective cues. arXiv preprint arXiv:2003.06711.
[144] Dolhansky, B., Howes, R., Pflaum, B., Baram, N., and Ferrer, C. C. (2019). The deepfake detection challenge (DFDC) preview dataset. arXiv preprint arXiv:1910.08854.
[145] Agarwal, S., El-Gaaly, T., Farid, H., and Lim, S. N. (2020). Detecting deep-fake videos from appearance and behavior. arXiv preprint arXiv:2004.14491.
[146] Dufour, N., and Gully, A. (2019). Contributing data to deepfake detection research. Available at https://ai.googleblog.com/2019/09/contributing-data-to-deepfake-detection.html.
[147] Ciftci, U. A., Demir, I., and Yin, L. (2020). FakeCatcher: Detection of synthetic portrait videos using biological signals. IEEE Transactions on Pattern Analysis and Machine Intelligence, DOI: 10.1109/TPAMI.2020.3009287.
[148] Huang, G. B., Mattar, M., Berg, T., and Learned-Miller, E. (2007, October). Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical Report 07-49, University of Massachusetts, Amherst, http://vis-www.cs.umass.edu/lfw/.
[149] Shaoanlu's GitHub. (2019). Few-Shot Face Translation GAN. Available at https://github.com/shaoanlu/fewshot-face-translation-GAN.
[150] Guarnera, L., Giudice, O., and Battiato, S. (2020). Deepfake detection by analyzing convolutional traces. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (pp. 666-667).
[151] Cho, W., Choi, S., Park, D. K., Shin, I., and Choo, J. (2019). Image-to-image translation via group-wise deep whitening-and-coloring transformation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 10639-10647).
[152] Choi, Y., Choi, M., Kim, M., Ha, J. W., Kim, S., and Choo, J. (2018). StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 8789-8797).
[153] He, Z., Zuo, W., Kan, M., Shan, S., and Chen, X. (2019). AttGAN: Facial attribute editing by only changing what you want. IEEE Transactions on Image Processing, 28(11), 5464-5478.
[154] Karras, T., Laine, S., and Aila, T. (2019). A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4401-4410).
[155] Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., and Aila, T. (2020). Analyzing and improving the image quality of StyleGAN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8110-8119).
[156] Li, L., Bao, J., Zhang, T., Yang, H., Chen, D., Wen, F., and Guo, B. (2020). Face X-ray for more general face forgery detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 5001-5010).
[157] Wang, S. Y., Wang, O., Zhang, R., Owens, A., and Efros, A. A. (2020). CNN-generated images are surprisingly easy to spot... for now. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8695-8704).
[158] Dai, T., Cai, J., Zhang, Y., Xia, S. T., and Zhang, L. (2019). Second-order attention network for single image super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 11065-11074).
[159] Maras, M. H., and Alexandrou, A. (2019). Determining authenticity of video evidence in the age of artificial intelligence and in the wake of deepfake videos. The International Journal of Evidence and Proof, 23(3), 255-262.
[160] Su, L., Li, C., Lai, Y., and Yang, J. (2017). A fast forgery detection algorithm based on exponential-Fourier moments for video region duplication. IEEE Transactions on Multimedia, 20(4), 825-840.
[161] Iuliani, M., Shullani, D., Fontani, M., Meucci, S., and Piva, A. (2018). A video forensic framework for the unsupervised analysis of MP4-like file container. IEEE Transactions on Information Forensics and Security, 14(3), 635-645.
[162] Malolan, B., Parekh, A., and Kazi, F. (2020, March). Explainable deep-fake detection using visual interpretability methods. In The 3rd International Conference on Information and Computer Technologies (ICICT) (pp. 289-293). IEEE.

Thanh Thi Nguyen was a Visiting Scholar with the Computer Science Department at Stanford University, California, USA in 2015 and the Edge Computing Lab, Harvard University, Massachusetts, USA in 2019. He received a European-Pacific Partnership for ICT Expert Exchange Program Award from the European Commission in 2018, and an Australia–India Strategic Research Fund Early- and Mid-Career Fellowship awarded by the Australian Academy of Science in 2020. Dr. Nguyen obtained a PhD in Mathematics and Statistics from Monash University, Australia, and has expertise in various areas, including AI, deep learning, computer vision, cyber security, IoT, and data science. He is currently a Senior Lecturer in the School of Information Technology, Deakin University, Victoria, Australia.

Quoc Viet Hung Nguyen received a PhD degree from EPFL, Switzerland. He is currently a senior lecturer with Griffith University, Australia. He has published several articles in top-tier venues such as SIGMOD, VLDB, SIGIR, KDD, AAAI, ICDE, IJCAI, JVLDB, TKDE, TOIS, and TIST. His research interests include data mining, data integration, data quality, information retrieval, trust management, recommender systems, machine learning, and big data visualization.

Duc Thanh Nguyen was awarded a PhD in Computer Science from the University of Wollongong, Australia in 2012. Currently, he is a lecturer in the School of Information Technology, Deakin University, Australia. His research interests include computer vision and pattern recognition. He has published his work in highly ranked venues in computer vision and pattern recognition, such as the Pattern Recognition journal, CVPR, ICCV, and ECCV. He has also served as a technical program committee member for many premium conferences such as CVPR, ICCV, ECCV, AAAI, ICIP, and PAKDD, and as a reviewer for the IEEE Trans. Intell. Transp. Syst., the IEEE Trans. Image Process., the IEEE Signal Processing Letters, Image and Vision Computing, Pattern Recognition, and Scientific Reports.

Cuong M. Nguyen received the B.Sc. and M.Sc. degrees in Mathematics from Vietnam National University, Hanoi, Vietnam. In 2017, he received the Ph.D. degree from the School of Engineering, Deakin University, Australia, where he worked as a postdoctoral researcher and sessional lecturer for several years. He is currently a postdoctoral researcher in Autonomous Vehicles at LAMIH UMR CNRS 8201, Université Polytechnique Hauts-de-France, France. His research interests lie in the areas of Optimization, Machine Learning, and Control Systems.

Saeid Nahavandi received a Ph.D. from Durham University, U.K. in 1991. He is an Alfred Deakin Professor, Pro Vice-Chancellor (Defence Technologies), Chair of Engineering, and the Director of the Institute for Intelligent Systems Research and Innovation at Deakin University, Victoria, Australia. His research interests include modelling of complex systems, machine learning, robotics, and haptics. He is a Fellow of Engineers Australia (FIEAust), the Institution of Engineering and Technology (FIET), and IEEE (FIEEE). He is the Co-Editor-in-Chief of the IEEE Systems Journal, an Associate Editor of the IEEE/ASME Transactions on Mechatronics, an Associate Editor of the IEEE Transactions on Systems, Man and Cybernetics: Systems, and an IEEE Access Editorial Board member.

Dung Tien Nguyen received the B.Eng. and M.Eng. degrees from the People's Security Academy and the University of Engineering and Technology, Vietnam National University, in 2006 and 2013, respectively. He received his PhD degree in the area of multimodal emotion recognition using deep learning techniques from Queensland University of Technology, Brisbane, Australia in 2019. He is currently working as a research fellow at Deakin University in computer vision, machine learning, deep learning, image processing, and affective computing.