Deep Learning on Raspberry Pi3 for Face Recognition

Deep Learning on Raspberry Pi3 for Face Recognition by Kollu Nimshi A thesis submitted in partial fulfillment of the requirements for the degree of Master of Engineering in Microelectronics and Embedded Systems Examination Committee: Dr. Mongkol Ekpanyapong (Chairperson) Prof. Matthew N. Dailey Dr. A.M. Harsha S. Abeykoon Nationality: Indian Previous Degree: Bachelor of Technology in Electronics and Communication Engineering Jawaharlal Nehru Technological University, Hyderabad, Telangana, India Scholarship Donor: AIT Fellowship Asian Institute of Technology School of Engineering and Technology Thailand December 2019 Acknowledgements My heartfelt thanks to my dear advisor Dr. Mongkol Ekpanyapong. I couldn't have done this without his direct guidance and technical support. He drived me in the right direction whenever I needed it. I would also like to thank my committee members Prof. Matthew N. Dailey and Dr. A.M. Harsha S. Abeykoon for their encouragement and insightful comments. My sincere thanks also go to Mr. Chatchai Pruetong, Mr. Amit Prasad Nayak, Mr Clifford, and Mr Sahuri Bond for the participation in my project and all other technical help. I express my profound gratitude to my family for their unfailing support and constant encouragement and my friends at AIT for being very nice to me during my stay here. Kollu Nimshi December 2019 ii Abstract In the present context, there is one big issue regarding intelligent security system using Face Recognition. In fact, it is a valid question-why do we need to implement only Face Recognition as intelligent security system? There is an effort to implement this on low power edge devices like Raspberry Pi3 and improve the accuracy of face recognition software. Even the smallest changes in the light or orientation could reduce the overall performance of recognition leading to more false positives. Though it can be implemented on powerful machines like CPU, GPU etc., yet it is not the best solution as it consumes large size and more power. It also increases the cost and complexity to maintain. Thus, bringing this application into embedded single board computers is very important .Edge computing by reducing deep learning model size is next coming future scope in Embedded System field and see how we can build intelligent system on low power devices. To increase recognition accuracy, the Deep Neural Networks (DNN) can play a vital role for the implementation of deep learning based computer vision tasks. Earlier such systems have been implemented in this area has been done in two factors: (i) end to end learning for the task using a Convolutional neural network (CNN), and (ii) the availability of large scale training datasets. After training the CNN on a desktop PC we employed a Raspberry pi, model B, for the image classification purpose. However, to utilize this CNN model with millions of free parameters on a low power embedded is much more complex and a challenging objective. This constitutes a challenge for embedded vision systems performing edge inference as opposed to cloud processing. Therefore, this led to the idea of using a Intel Neural Compute Stick as a edge inference for accelerating the performance on Raspberry Pi3.The Intel Neural Compute Stick (NCS) provides a possible route for running large - scale neural networks on a low cost, low power, portable unit. Computer vision has made it possible to acquire, process, analyze and extract high-level understanding for digital images and videos. Researchers are also looking at ways to apply the latest advances in facial-recognition technologies to uncontrolled environments, where success rate is maximum up to only 50% only. In this study, Facenet model using one shot learning algorithm is implemented for face recognition and verification on Raspberry Pi3. This system replaces the use of complete trained Facenet model on pi3 by converting this large model into Intel NCS graph and OpenVINO models format by Intel NCS SDK tools and OpenVINO Model Optimizer. With the advanced NCS API and Inference Engine API we are able to perform the inference on pi3 thereby, improving the speed of recognition of objects/ faces. The goal of this experiment is to describe a simple and easy hardware implementation of face recognition system on Raspberry pi3 that run the trained model which is trained on Custom datasets. This system is programmed using Python and is operated and controlled by Raspberry Pi3 with an USB Camera. Keywords: Intelligent Security System , Face Recognition, Facenet and Dlib, Deep Neural Networks , Convolutional Neural Networks, Embedded Vision Systems , Raspberry Pi3, Edge Inference, Intel Neural Compute Stick (NCS) , NCS SDK AND Inference Engine API , OpenVINO Model Optimizer, NCS Graph , OpenVINO Models, Python iii Table of Contents Chapter Title Page Title Page i Acknowledgments ii Abstract iii Table of Contents iv List of Figures vi List of Tables ix List of Abbreviations x 1 Introduction 1 1.1 Overview 1 1.2 Problem Statement 3 1.3 Objective 4 1.4 Limitations and Scope 4 1.5 Thesis Outline 5 2 Literature Review 6 2.1 Background 6 2.2 Challenges of Face Recognition Algorithm and 11 How Deep Learning Algorithms Can Solve It? Outline of Deep Face Architecture 3 Methodology 12 3.1 Overview 12 3.2 Data Collection 14 3.3 Data Pre-Processing 16 3.4 Main drawback to implement Facenet Model- 25 on Embedded Devices 3.5 Neural Compute Stick Platform 27 3.6 Model Optimizer 30 3.7 Procedure to Convert Facenet Model to 31 NCSDK Graph Format 3.8 Face Recognition on Raspberry Pi3 Using 38 OpenVINO Toolkit 3.9 Software 44 4 Experimental Results 46 4.1 Overview 46 4.2 Face Recognition Results on Raspberry Pi3 46 without Using Intel NCS 4.3 Face Recognition Results on Raspberry Pi3 50 Using Intel NCS and NCSDK iv 5 Conclusion, Recommendations and Future Works 80 5.1 Conclusion 80 80 5.2 Recommendations and Future Works References 81 v List of Figures Figure Title Page 2 1.1 A system set up on Raspberry Pi3 using Intel NCS 2.1 Open Face vs. earlier non-exclusive face recognition 8 Implementations 3.1 Workflow Representation of Methodology 13 3.2 Training image folder 14 3.3 Data collection corresponding to Nimshi label 15 3.4 Hyper parameters Values 16 3.5 Screenshot taken during pre-processing the Facenet Model 16 3.6 Face Detection Outputs after applying MTCNN Algorithm 17 3.7 Bounding Boxes for Face Detection of 4 Person’s 18 3.8 . .npy files 18 3.9 Pre-Trained CNN Model 19 3.10 Screenshot taken during training the Facenet model 20 3.11 Face Embedding Matrices values of 4 Persons 21 3.12 Facenet Classifier Model File 22 3.13 Face Recognition Outputs of 4 Persons 23 3.14 Flowchart representation of training and testing Facenet model 24 with custom data 3.15 Size of my trained model(.pb) and classifier model(.pkl) 25 3.16 Time Taken To load Trained Mode (.pb) l On Raspberry PI3 26 3.17 Time Taken To load Trained Mode (.pb) l On Raspberry PI3 27 3.18 Implementation of the Myriad2 VPU used within the Neural 28 Compute Stick (NCS) platform 3.19 Illustration of using Intel NCS to develop a DNN based Embedded 29 System 3.20 Live-Object Detection on Raspberry Pi 3 using Intel- NCS 30 3.21 Intel NCSDK And OpenVINO Architecture 31 3.22 Facenet Checkpoint Files 32 3.23 Facenet Graph file after Compiling and its Size 34 3.24 Simple Inference Code Flow 35 3.25 FACENET MODEL VIEW 36 3.26 OPENVINO MODEL SIZE PROPERTIES 37 3.27 NEURAL COMPUTE STICK AND NEURAL 38 COMPUTE STICK 2 3.28 MYRIAD X ARCHITECTURE 39 3.29 Command To Convert OpenVINO FP16 Format 40 3.30 Successful Conversion To FP16 OpenVINO IR 40 3.31 Models 41 3.32 Visualization Of Network Topology Of .xml File 42 42 3.33 Convolution to Pooling Layer Of Different Data Reshape to Normalization Layer Showing Different Data Size 43 3.34 44 3.35 OpenVINO .XML Model Structure Transferred OpenVINO Models To Raspberry PI3 vi 3.36 Flowchart For OpenVINO Face Recognition Algorithm 45 4.1 . Face Recognition results on Raspberry pi3 without using Intel 47 NCS 4.2 Counting no of times Face Recognized to generate Confusion 48 Matrix 4.3 Implementation on Raspberry Pi3 using Intel NCS 50 4.4 Face Recognition Results of 5 Persons under Lighting 51 4.5 Face Recognition Results of 5 Persons under Low Lighting 52 4.6 Face Recognition Results under different Emotions 53 4.7 Python Code for Calculating Difference between 2 images 54 4.8 Distance Calculation based on Threshold Value as shown in 54 RaspberryPi3 Shell 4.9 Implementation on Raspberry Pi3 using Intel OpenVINO 55 Inference Engine 4.10 Face Recognition Results on Raspberry Pi3 Using Intel 56 OpenVINO Method 4.11 Face Recognition Results on Raspberry Pi3 Under HAT 57 Conditions Using Intel OpenVINO Method 4.12 Multiple Face Recognition Results on Raspberry Pi3 Using 58 Intel OpenVINO Method 4.13 Inference Time Calculation Using OpenVINO deployed on 59 Intel NCS 4.14 Inference Time Calculation For Multiple Face Recognition 59 Using OpenVINO deployed on Intel NCS 4.15 Inference Time Calculation Using OpenVINO deployed 60 on Intel NCS2 4.16 Inference Time Calculation For Multiple Face Recognition 60 Using OpenVINO deployed on Intel NCS 4.17 Flowchart of how Images are passed through Intel NCS 61 4.18 Time for loading SVM Models 63 4.19 OPENVINO MODEL SIZE PROPERTIES 63 4.20 Time for reading IR Models 63 4.21 Time taken to generate Input and Output Blobs 64 4.22 Time taken to Create Executable Network 64 4.23 Time taken for Pre-Processing on Raspberry Pi3 64 4.24 Time taken for performing inference on my Trained 65 Model 66 4.25 Benchmark Tool Results on my trained .XML File 4.26 Benchmark Tool Results of my trained .XML File on 67 Raspberry PI3 ARM Processor vii 4.27 Facenet Prediction Graph 69 4.28 Facenet Frame Rate Graphs 70 4.29 Printing the Maximum Probability Prediction Confidence Value 77 Corresponding to Clifford Label 4.30 Accuracy for 4 different cases 78 viii List of Tables Table Title Page 2.1 .

Load more