Cooperative Edge Deepfake Detection
Main Subject area: Computer Engineering
Authors: Enis Hasanaj, William Söder, Albert Aveler
Supervisor: Garrit Schaap
JÖNKÖPING June 2021

This final thesis has been carried out at the School of Engineering at Jönköping University within Computer Engineering. The authors are responsible for the presented opinions, conclusions, and results.

Examiner: Rachid Oucheikh
Scope: 15 hp (first-cycle education)
Date: 2021-06-12

Abstract
1 Introduction
  1.1 PROBLEM STATEMENT
    1.1.1 Deepfakes
    1.1.2 Computational power in machine learning
  1.2 LITERATURE REVIEW
    1.2.1 Edge training and federated learning
    1.2.2 Deepfake detection
  1.3 PURPOSE AND RESEARCH QUESTIONS
  1.4 SCOPE AND LIMITATIONS
  1.5 DISPOSITION
2 Method and implementation
  2.1 DATA COLLECTION
  2.2 DATA ANALYSIS
  2.3 VALIDITY AND RELIABILITY
  2.4 CONSIDERATIONS
  2.5 PROPOSED SOLUTION OF COOPERATIVE EDGE DEEPFAKE DETECTION
3 Theoretical framework
  3.1 DARKNET YOLOV2
    3.1.1 Basics
    3.1.2 Details
  3.2 COMBINING THE MODELS
  3.3 ENSEMBLE METHODS
    3.3.1 Bagging
    3.3.2 Boosting
    3.3.3 Stacking
  3.4 PROGRAMMING LANGUAGES AND PLATFORMS
  3.5 DEEPFAKES EXPLAINED
  3.6 DATASET
  3.7 TERMINOLOGY
    3.7.1 Bounding box
    3.7.2 True/False Positive/Negative
    3.7.3 Recall
    3.7.4 Precision
    3.7.5 Accuracy
    3.7.6 Confidence
    3.7.7 Aggregated vs. Non-Aggregated models
    3.7.8 Supervised vs. Unsupervised learning
    3.7.9 Overfitting
  3.8 CONVOLUTIONAL NEURAL NETWORK (CNN)
    3.8.1 Convolutional layers
    3.8.2 Pooling layers
    3.8.3 Fully connected layers
4 Results
  4.1 TRAINING WITH DIFFERENT SUBSETS
  4.2 TRAINING WITH DIFFERENT NUMBER OF ITERATIONS
  4.3 EDGE TRAINING RESULTS
  4.4 ENSEMBLE RESULTS
5 Discussion
  5.1 LIMITATIONS
  5.2 RESULT DISCUSSION
  5.3 METHOD DISCUSSION
6 Conclusions and further research
  6.1 CONCLUSIONS
    6.1.1 Practical implications
    6.1.2 Scientific implications
  6.2 FURTHER RESEARCH
    6.2.1 Android
    6.2.2 Apple support for object detection on the edge
    6.2.3 Federated learning
    6.2.4 Another ensemble method
7 References

Abstract

Deepfakes are an emerging problem on social media. For celebrities and political figures, the technology can be devastating to their reputation if it ends up in the wrong hands. Creating deepfakes is becoming increasingly easy. Attempts have been made at detecting whether a face in an image is real or not, but training these machine learning models can be a very time-consuming process. This research proposes a solution for training deepfake detection models cooperatively on the edge.
This is done to evaluate whether the training process, among other things, can be made more efficient with this approach. The feasibility of edge training is evaluated by training machine learning models on several different types of iPhone devices. The models are trained using the YOLOv2 object detection system. To test whether the YOLOv2 object detection system is able to distinguish between real and fake human faces in images, several models are trained on a computer. Each model is trained with either a different number of iterations or a different subset of the data, since these factors have been identified as important to the performance of the models. The performance of the models is evaluated by measuring their accuracy in detecting deepfakes. Additionally, the deepfake detection models trained on a computer are ensembled using the bagging ensemble method. This is done to evaluate the feasibility of cooperatively training a deepfake detection model by combining several models. Results show that the proposed solution is not feasible due to the time the training process takes on each mobile device. Additionally, each trained model is about 200 MB, and the size of the ensemble model grows linearly with each model added to the ensemble. This can cause the ensemble model to grow to several hundred gigabytes in size.

Keywords: Machine learning, deepfake, artificial intelligence, ensemble, convolutional neural networks, edge, YOLOv2

1 Introduction

In this study, deepfake detection models are trained locally on multiple iPhone devices using the YOLOv2 system (Redmon, 2016), which is part of the Darknet open-source machine learning framework. YOLOv2 is used to train models on parts of an image dataset, and the models are combined into an ensemble once they have been trained. A method is proposed for how this can be done, and the feasibility of the