PRUNING NEURAL NETWORK (SQUEEZENET) FOR

EFFICIENT HARDWARE DEPLOYMENT

A Thesis

Submitted to the Faculty

of

Purdue University

by

Akash S. Gaikwad

In Partial Fulfillment of the

Requirements for the Degree

of

Master of Science in Electrical and Computer Engineering

December 2018

Purdue University

Indianapolis, Indiana

THE PURDUE UNIVERSITY GRADUATE SCHOOL

STATEMENT OF COMMITTEE APPROVAL

Dr. Mohamed El-Sharkawy, Department of Electrical and Computer Engineering
Dr. Maher Rizkalla, Department of Electrical and Computer Engineering
Dr. Brian King, Department of Electrical and Computer Engineering

Approved by: Dr. Brian King, Head of the Graduate Program

This thesis is dedicated to my parents, Jayashree and Sunil Gaikwad, and my family, Alisha and Ragini.

ACKNOWLEDGMENTS

I would like to express my sincere gratitude to my advisor, Dr. Mohamed El-Sharkawy, for his patience, motivation, and constant guidance throughout my master's education and research. Besides my advisor, I would like to thank my fellow labmates, Durvesh, Dewant, Surya, Shreeram and Raghavan, for their valuable advice, stimulating sleepless discussions, and all the fun we had in the last two years. My sincere thanks to the IoT Collaboratory and the Department of Electrical and Computer Engineering, who provided access to the laboratory and research facilities. Special thanks to Sherrie and Dr. Brian King for their constant support and motivation throughout my time at IUPUI. Last but not least, I would like to thank and acknowledge the Government of Maharashtra, India, for sponsoring my education.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
SYMBOLS
ABBREVIATIONS
ABSTRACT
1 INTRODUCTION
  1.1 Motivation
  1.2 Contribution
2 BACKGROUND
3 CONCEPTS
  3.1 Introduction to Neural Network
  3.2 Network training
  3.3 Introduction to CNN
    3.3.1 Convolution
    3.3.2 Nonlinearity layers
    3.3.3 Normalization layer
    3.3.4 Dropout layers
    3.3.5 Pooling layers
    3.3.6 Fully connected layers (FC)
  3.4 Benchmarked CNN architectures
    3.4.1 LeNet
    3.4.2 AlexNet
    3.4.3 GoogLeNet
    3.4.4 VGGNet


    3.4.5 SqueezeNet
  3.5 Datasets
    3.5.1 MNIST
    3.5.2 CIFAR-10
    3.5.3 ImageNet
4 COMPARISON BETWEEN DIFFERENT DEEP LEARNING FRAMEWORKS
  4.1 TensorFlow
  4.2 Keras
  4.3 Caffe
  4.4 Theano
  4.5 PyTorch
5 TRAINING THE SQUEEZENET WITH THE CIFAR-10 DATASET ON PYTORCH
  5.1 Transfer learning
  5.2 Training the SqueezeNet
6 PRUNING METHODS
  6.1 Types of Pruning

7 PRUNING BASED ON L2 NORMALIZATION OF ACTIVATION MAPS
  7.1 Implementation algorithm
  7.2 Results
8 PRUNING BASED ON TAYLOR EXPANSION OF COST FUNCTION
  8.1 Implementation algorithm
  8.2 Results
9 PRUNING BASED ON A COMBINATION OF TAYLOR EXPANSION OF COST FUNCTION & L2 NORMALIZATION OF ACTIVATION MAPS
  9.1 Implementation algorithm
  9.2 Results


10 HARDWARE DEPLOYMENT OF PRUNED SQUEEZENET MODEL ON BLUEBOX USING RTMAPS
  10.1 NXP BlueBox
    10.1.1 S32V234
    10.1.2 LS2084A
  10.2 Real-Time Multisensor Applications (RTMaps)
    10.2.1 RTMaps Runtime Engine
    10.2.2 RTMaps Component Library
    10.2.3 RTMaps Studio
    10.2.4 RTMaps Embedded
  10.3 Hardware Implementation
11 SUMMARY
REFERENCES

LIST OF TABLES

1.1 Types of layers in modern Convolution Neural Networks (percentage-wise)
1.2 Number of parameters in modern Convolution Neural Network architectures
7.1 Sensitivity to pruning the SqueezeNet model on CIFAR-10 dataset
8.1 Pruning pattern per iteration of SqueezeNet model using Taylor expansion-based criterion with pruning ratio = 67%
8.2 Sensitivity to pruning the SqueezeNet model on CIFAR-10 dataset

LIST OF FIGURES

1.1 Limited resources for embedded devices
3.1 Biological neuron (left) and its mathematical model (right) [20]
3.2 Artificial Neural Network which has three layers: Input, Hidden and Output layers
3.3 Cost function of CNN
3.4 Comparison of NN and CNN
3.5 Fully Connected NN for Images
3.6 Locally Connected NN for Image
3.7 Convolution layer for Image
3.8 CNN classifier example [20]
3.9 Application of a single convolutional layer with N filters of size k × k × 3 with stride S = 1 to input data of size width × height with three channels
3.10 Kernels in convolution layer
3.11 Zero Padding
3.12 Convolution example with parameters W = 7, K = 2, F = 3, S = 2, P = 1 [21]
3.13 Rectified Linear Unit function (ReLU)
3.14 Dropout layer
3.15 Pooling layer
3.16 Maxpool with kernel size = 2 and stride = 2
3.17 Fully connected layer in CNN architecture
3.18 LeNet-5 architecture [22]
3.19 AlexNet architecture
3.20 GoogLeNet Network [2] (From Left to Right)
3.21 VGG16 Architecture (From Left to Right)
3.22 Fire layer of SqueezeNet Architecture [1]
4.1 Deep learning framework logos [23] [24] [25] [26] [27]
5.1 SqueezeNet model accuracy - trained from scratch on CIFAR-10 dataset
5.2 SqueezeNet model accuracy - (Pretrained model - ImageNet) on CIFAR-10 dataset
5.3 Training loss for SqueezeNet model - (Trained from scratch)
5.4 Training loss for SqueezeNet model - (Pretrained model)
5.5 Modified SqueezeNet architecture for CIFAR-10 dataset
6.1 Vehicle classification with small CNN
6.2 Vehicle classification with large CNN
6.3 Vehicle classification with large CNN and pruning
6.4 Fine Pruning
6.5 Coarse Pruning
6.6 Coarse Pruning
6.7 Pruning steps to reduce the model based on pruning criteria
7.1 Activation maps generated by convolution
7.2 Pruning steps to reduce the model based on L2 normalization of activation map
7.3 Pruning with L2 normalization of activation with pruning ratio - 67%
7.4 Pruning with L2 normalization of activation with pruning ratio - 75%
7.5 Pruning with L2 normalization of activation with pruning ratio - 80%
7.6 Pruning with L2 normalization of activation with pruning ratio - 85%
7.7 Pruning with L2 normalization of activation with pruning ratio - 90%
7.8 Pruning with L2 normalization of activation with pruning ratio - 95%
7.9 Accuracy vs pruning ratio for L2 normalization of activation map based pruning
8.1 Cost function value before pruning and after pruning
8.2 Pruning with Taylor expansion based criteria with pruning ratio - 67%
8.3 Pruning with Taylor expansion based criteria with pruning ratio - 70%
8.4 Pruning with Taylor expansion based criteria with pruning ratio - 80%
8.5 Pruning with Taylor expansion based criteria with pruning ratio - 85%
8.6 Pruning with Taylor expansion based criteria with pruning ratio - 90%
8.7 Pruning with Taylor expansion based criteria with pruning ratio - 95%
8.8 Accuracy vs pruning ratio for Taylor expansion based criteria
9.1 Pruning steps to reduce the model based on combination of Taylor expansion of cost function and L2 normalization of activation maps
9.2 Pruning based on a combination of Taylor expansion of cost function and L2 normalization of activation map with pruning ratio - 67%
9.3 Pruning based on a combination of Taylor expansion of cost function and L2 normalization of activation map with pruning ratio - 75%
9.4 Pruning based on a combination of Taylor expansion of cost function and L2 normalization of activation map with pruning ratio - 80%
9.5 Pruning based on a combination of Taylor expansion of cost function and L2 normalization of activation map with pruning ratio - 85%
9.6 Pruning based on a combination of Taylor expansion of cost function and L2 normalization of activation map with pruning ratio - 90%
9.7 Pruning based on a combination of Taylor expansion of cost function and L2 normalization of activation map with pruning ratio - 95%
9.8 Accuracy vs pruning ratio for combination of Taylor expansion of cost function and L2 normalization of activation map
9.9 Accuracy vs prune ratio comparison between three pruning methods
10.1 Hardware deployment overview with BlueBox 2.0 and RTMaps
10.2 Hardware Architecture for BlueBox 2.0 [Pic Courtesy NXP]
10.3 Hardware Architecture for BlueBox 2.0 [Pic Courtesy NXP]
10.4 Hardware Architecture for S32V234 [Pic Courtesy NXP]
10.5 Hardware Architecture for LS2084A [Pic Courtesy NXP]
10.6 System overview of NXP BlueBox 2.0 BLBX2
10.7 Hardware deployment of the pruned model using RTMaps and BlueBox 2.0
10.8 Model performances with PC and BlueBox 2.0

SYMBOLS

~      Convolution
r_i    Feature map
C(.)   Cost function of the network

ABBREVIATIONS

CNN      Convolution Neural Network
NN       Neural Network
DNN      Deep Neural Network
SGD      Stochastic Gradient Descent
LR       Learning Rate
GPU      Graphics Processing Unit
TPU      Tensor Processing Unit
CPU      Central Processing Unit
FPGA     Field Programmable Gate Array
RTMaps   Real-Time Multisensor Applications
TCP/IP   Transmission Control Protocol/Internet Protocol
ReLU     Rectified Linear Unit
CV       Computer Vision
FCNN     Fully Convolutional Neural Network
SIMD     Single Instruction, Multiple Data
BLBX2    BlueBox Version 2
RTOS     Real-Time Operating System
ISP      Image Signal Processor
LTS      Long Term Support
BSP      Board Support Package
SATA     Serial Advanced Technology Attachment
SoC      System on Chip
ROS      Robot Operating System

ABSTRACT

Gaikwad, Akash S. M.S.E.C.E., Purdue University, December 2018. Pruning Convolution Neural Network (SqueezeNet) for Efficient Hardware Deployment. Major Professor: Mohamed El-Sharkawy.

In recent years, deep learning models have become popular in real-time embedded applications, but there are many complexities in deploying them on hardware because of limited resources such as memory, computational power, and energy. Recent research in the field of deep learning focuses on reducing the model size of the Convolution Neural Network (CNN) through various compression techniques such as architectural compression, pruning, quantization, and encoding (e.g., Huffman encoding). Network pruning is one of the promising techniques for solving these problems. This thesis proposes methods to prune the convolution neural network (SqueezeNet) without introducing network sparsity in the pruned model. Three methods are proposed to prune the CNN and decrease its model size without a significant drop in the accuracy of the model. 1. Pruning based on Taylor expansion of change in cost function ∆C.

2. Pruning based on L2 normalization of activation maps.

3. Pruning based on a combination of method 1 and method 2.

The proposed methods use various ranking criteria to rank the convolution kernels and prune the lower-ranked filters; afterwards, the SqueezeNet model is fine-tuned by retraining. The transfer learning technique is used to train the SqueezeNet on the CIFAR-10 dataset. Results show that the proposed approach reduces the SqueezeNet model by 72% without a significant drop in the accuracy of the model (the optimal pruning efficiency result). Results also show that pruning based on a combination of Taylor expansion of the cost function and L2 normalization of activation maps achieves better pruning efficiency than either individual pruning criterion, and that most of the pruned kernels are from the mid- and high-level layers. The pruned model is deployed on BlueBox 2.0 using the RTMaps software, and the model performance is evaluated.

1. INTRODUCTION

1.1 Motivation

Deep Convolutional Neural Networks (CNN) have a variety of applications and are extensively used in the field of computer vision. In recent years, advancements in computation power and active research in the deep learning field have resulted in improved benchmark architectures such as SqueezeNet [1], AlexNet [2], and VGG16 [3] for feature extraction from images. Understanding an image, or extracting features from it, has been one of the difficult challenges for researchers in recent years. Previous approaches to feature extraction used traditional computer vision algorithms or hard-coded algorithms. Deep learning architectures are replacing these conventional computer vision feature extraction methods with higher accuracy (e.g., object detection with Canny or HOG features has been replaced by CNNs). As networks grow deeper, the model size and inference time of the model also increase. For embedded devices, resources such as computation power, memory, and energy are limited [4], as shown in Figure 1.1. To deploy these modern models on small edge-embedded devices such as smartphones, the Raspberry Pi, or the S32V234 [5], parameters such as computational efficiency, limited memory size, and power efficiency must be considered [6]. For distributed training on parallel servers, the overhead to train the model is directly proportional to the number of parameters and the complexity of the model [7]. Table 1.2 shows the number of parameters in various modern deep CNNs. The model size of a CNN can also play a vital role when transporting the model wirelessly in the form of firmware or software updates. Table 1.1 shows the percentage of convolution and fully connected layers in various Convolution Neural Networks. Many redundant parameters do not contribute towards the output in the

Fig. 1.1. Limited resources for embedded devices.

Table 1.1. Types of layers in modern Convolution Neural Networks (percentage-wise).

Network Architecture    Convolution layers    Fully Connected layers
VGG16                   99%                   1%
AlexNet                 89%                   11%
R3DCNN                  90%                   10%
SqueezeNet              100%                  0%

deep convolution neural network [8]. SqueezeNet [1] is chosen for pruning because of its modern architecture and fully convolutional nature (it consists of 2,292 convolution filters and no fully connected layers).

Table 1.2. Number of parameters in the modern Convolution Neural Network architecture.

Network Architecture    Number of Parameters (M = million)
VGG16                   138M
AlexNet                 61M
SqueezeNet              1.24M

The CNN consists of various layers such as the convolution layer, pooling layer, ReLU layer, and fully connected (FC) layer. The convolution layers take high computational power compared to the other layers because of complex matrix multiplications. Several network compression techniques are available, such as architectural compression, pruning [9], quantization, and encoding techniques (Huffman encoding). However, many of these methods reduce the model size but result in a significant accuracy drop of the model [10]. In the pruning method, there is no standard ranking method available for selecting the unimportant parameters or convolutional filters to prune from the model to reduce the model size. Moreover, in the case of fine pruning, there are many complexities in restructuring the model due to network sparsity [11].

1.2 Contribution

This master's thesis proposes efficient methods to prune a CNN without a significant drop in the accuracy of the model. The thesis mainly focuses on pruning entire convolution filters from the model, as they take a substantial amount of computational power during inference. For real-time applications on embedded and low-power devices, the model size and inference time play a vital role. Therefore, obtaining faster and smaller CNNs is essential for running these deep learning models on embedded devices. To compress the model, the proposed solutions use coarse pruning techniques to reduce the model size. The pruning method uses various ranking criteria to identify the unimportant or redundant parameters of the model. Three different ranking methods are proposed to prune the SqueezeNet architecture.

1. Taylor expansion of cost value based ranking.

2. L2 normalization of activation maps based ranking.

3. Combination of the above: Taylor expansion-based and L2 normalization of activation based ranking.

This thesis documents the comparison of these ranking methods for pruning the SqueezeNet model. The results show that the combination of the Taylor expansion-based and L2 normalization of activation based ranking methods achieves more compression without a significant drop in the accuracy of the original model. The transfer learning technique is used to train the SqueezeNet model on the CIFAR-10 dataset with the PyTorch framework. By deploying these ranking methods, the proposed approach achieves a 72% model reduction without a drop in the accuracy of the SqueezeNet model. Finally, the pruned SqueezeNet model is deployed on specialized embedded hardware, the BlueBox 2.0, and its performance is evaluated.

2. BACKGROUND

The concept of pruning neural networks has been in existence since the 1990s. Yann LeCun et al. discussed finding the non-essential and redundant parameters that do not contribute to the output of the network; the second-order derivatives of the Taylor expansion are used to find the importance of the neurons in their Optimal Brain Damage research [12]. For fully connected layers, Mariet and Sra (2016) discussed a method to reduce network redundancy by identifying the essential neurons that do not require retraining; however, this method works only for fully connected layers [13]. To reduce the overhead of convolution, Mathieu et al. (2013) used Fast Fourier Transform (FFT) based convolution [14], while Lavin and Gray (2016) discussed Winograd algorithms to reduce the convolution overhead [15]. Polyak and Wolf (2015) presented an idea to prune non-correlated parameters or filters based on the activation frequency of the activation maps [16]. Sajid Anwar et al. (2015) discussed a more complex pruning criterion in which a particle score is calculated based on network accuracy and a validation dataset; iterative pruning takes place based on the ranking of these particle scores [17]. The work "Pruning Filters for Efficient ConvNets" discusses pruning entire convolution filters, using the L1 norm of the feature maps as the pruning criterion [18]. The work on "Structured Pruning of Deep Convolutional Neural Networks" uses a comparatively complex pruning criterion: particle-based masking techniques are implemented to prune the network, each particle receives a score based on the validation set and network accuracy, and the filters are pruned iteratively based on the calculated masked particle scores [17]. Another approach, proposed by Pathak et al., reduces the number of parameters by squeezing the model using the fire module, thus making the architecture hardware deployable [19]. This thesis discusses ways to find the non-essential filter maps in each convolution layer and remove the entire feature map from the model to decrease the model size. To increase the performance of the architecture, non-correlated weights are removed from the CNN.

3. CONCEPTS

3.1 Introduction to Neural Network

Artificial Neural Networks (ANN) are inspired by the biological neural network system. They form a programming paradigm with which a system can learn from available observational data. A Neural Network (NN) is formed by arranging neurons in a directed acyclic graph to build a feed-forward neural network; these neurons are grouped into layers. Figure 3.1 shows biological neurons, the main computational units of the brain; these units are connected by synapses. In this system, the input signal comes in via the dendrite branches, and the output signals are sent out via the axon branches. The axon branches are further connected via synapses to the dendrites of other neurons.

Artificial neurons get the input signals x_i from previous neurons, and each signal is multiplied by a weight w_i to simulate the interaction of the dendrites. After that, the weighted input signals are summed, a fixed bias is added, and the result is fed to a non-linear activation function, which gives the output signal. The weights are adjusted according to the labels of the data to learn and approximate the inference:

y = f( Σ_i (x_i × w_i) + b )    (3.1)
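As a concrete illustration, the following minimal Python sketch implements Eq. 3.1 for a single artificial neuron. The input, weight, and bias values are made up for the example, and tanh stands in for the unspecified non-linearity f:

```python
import numpy as np

def neuron(x, w, b):
    """Artificial neuron from Eq. 3.1: weighted sum plus bias, through a non-linearity."""
    z = np.dot(x, w) + b      # sum of x_i * w_i, plus bias
    return np.tanh(z)         # f: any non-linear activation; tanh chosen here

# Example with made-up values: three inputs, three weights, one bias.
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.4, -0.2])
b = 0.05
print(neuron(x, w, b))
```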

Fig. 3.1. Biological neuron (left) and its mathematical model (right) [20].

Figure 3.1 shows the comparison between the mathematical model and the biological model. An Artificial Neural Network (ANN) is a system of interconnected neurons that can transfer messages from one node to another. The connections between neurons have weights, and these are fine-tuned in the training process. A Neural Network consists of many layers of feature-detector neurons. Figure 3.2 shows an example of a Neural Network with four fully connected layers, three inputs, and five outputs.

Fig. 3.2. Artificial Neural Network which has three layers: Input, Hidden and Output layers.

3.2 Network training

A Neural Network (NN) consists of a large number of parameters. In an NN, these parameters are neither hard-coded nor manually predefined; the network learns the parameter values based on the labels provided during training. There are various training approaches available: supervised learning, unsupervised learning, and reinforcement learning. In supervised learning, a function map is generated to achieve the desired output from the inputs, which together comprise the dataset (data and labels). Generally, at the start of training, the network parameters are initialized to random values which are close to zero but not exactly zero. In the training phase, the data is passed through the network (forward pass), and the output of the forward pass is compared with the provided labels of the data to compute the loss function. The loss can be defined as the deviation of the actual output from the ground truth or label. The aim of the training process is to minimize the loss value by optimizing the network parameters. There are many optimization techniques available, such as Stochastic Gradient Descent (SGD), Mini-Batch Gradient Descent, Adagrad, Adam, Momentum, and Adadelta. Stochastic Gradient Descent (SGD) optimization is used to train the SqueezeNet model. SGD uses the gradient descent algorithm, which calculates the gradient vector specifying the contribution of each weight towards the error. These gradients are calculated by backpropagating the output through the network (backward pass). SGD repeatedly passes the training data through the model (forward pass), calculates the gradient vector (backward pass), and, based on that, adjusts the weights by a small amount in the direction opposite to the gradient vector. This loop repeats until it converges to the global cost minimum. The magnitude of these update steps is called the learning rate. Figure 3.3 illustrates this,

where J(w) is the cost function and J_min(w) is the minimum of the cost function.
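The forward pass / backward pass / weight update loop described above can be sketched in PyTorch (the framework used later in this thesis) as follows. The tiny linear model and random batches are stand-ins, not the thesis's actual network or data; only the SGD pattern is the point:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                  # stand-in network
criterion = nn.CrossEntropyLoss()         # loss: deviation of output from labels
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

for step in range(100):
    inputs = torch.randn(32, 10)          # stand-in training batch
    labels = torch.randint(0, 2, (32,))
    outputs = model(inputs)               # forward pass
    loss = criterion(outputs, labels)
    optimizer.zero_grad()
    loss.backward()                       # backward pass: gradient vector
    optimizer.step()                      # small step against the gradient
```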

3.3 Introduction to CNN

Convolution Neural Networks are similar to general Neural Networks; they have learnable weights and biases, as shown in Figure 3.4. The main difference between the two is that CNN architectures take only images as input. A CNN is designed to extract the properties or features from the input images and encode them in the architecture, which results in a large parameter reduction compared to a normal neural network. The problem with a traditional fully connected neural network for an image is that each pixel of the image is mapped to one neuron, and it is the Fully

Fig. 3.3. Cost function of CNN

Connected (FC) layer that results in a large number of parameters and a large model size, as shown in Figure 3.5.

Fig. 3.4. Comparison of NN and CNN.

Consider an image of size 200 × 200 and 40,000 hidden neurons in a Fully Connected Neural Network (FCNN): this results in around two billion parameters. There are not enough resources for the computation of such a large number of parameters, and the resulting model is generally large in size. Even if a locally connected layer is used, there will still be a significant number of parameters to

Fig. 3.5. Fully Connected NN for Images.

train the model, and this will affect the model size as well as the computation cost. The solution to this problem is to use a convolution layer.

Fig. 3.6. Locally Connected NN for Image

Fig. 3.7. Convolution layer for Image

In a CNN, instead of finding the correlation of one pixel to all other pixels, the network tries to learn a hierarchy of local and spatially independent features (low-level independent features). This approach helps to reduce the computation cost and the number of parameters.

Fig. 3.8. CNN classifier example [20].

Convolution Neural Networks consist of various types of generic layers, which transform the input feature map (I_h × I_w × I_ch) into an output feature map (O_h × O_w × O_ch). Important building blocks of a CNN are as follows:

1. Convolution layer

2. Nonlinear layer

3. Pooling layer

4. Fully connected layers

5. Normalization layers

6. Dropout layers

3.3.1 Convolution layer

Convolutional layers take a set of feature maps as input and produce N feature maps as output, where N is the number of convolutional kernels in the convolution layer. The convolution layer has the following hyperparameters:

1. Kernel size (kw × kh × d)

2. Number of kernels N or depth of output

3. Stride S

4. Padding P

5. Activation function

1. Kernel size (kw × kh × d)

The kernels in convolution layers are learnable parameters, which are updated in the training phase based on the labels of the training dataset.

Fig. 3.9. Application of a single convolutional layer with N filters of size k × k × 3 with stride S = 1 to input data of size width × height with three channels.

Fig. 3.10. Kernels in convolution layer.

2. Number of kernels N

The number of kernels N defines the number of learnable kernels in a convolution layer. This value also corresponds to the depth of the output volume.

3. Stride S

The stride S describes the number of pixels by which the kernel slides over the image. For example, if the stride is one, then the kernel is moved one pixel at a time. It controls how the kernel convolves with the input features.

4. Padding P

Padding P is the process of symmetrically adding values around the input image matrix to preserve the features at the borders. It is a modification method to adjust the input size depending upon the requirement, e.g., zero padding.

Fig. 3.11. Zero Padding

5. Activation function

The activation function is used to introduce non-linearity into the model. It helps the model train faster and more accurately and also helps reduce overfitting on the training dataset. The Rectified Linear Unit (ReLU) is one of the most widely used activation functions in CNNs: it converts all negative numbers to 0 and keeps positive numbers unchanged. The output volume of a convolution depends on four parameters: the input size (W), kernel size (F), stride (S), and padding (P):

Convolution output volume = (W − F + 2P) / S + 1
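A quick sanity check of this formula in Python, using the parameters of the convolution example in Figure 3.12 (W = 7, F = 3, S = 2, P = 1):

```python
def conv_output_size(W, F, S, P):
    """Output width of a convolution: (W - F + 2P) / S + 1."""
    return (W - F + 2 * P) // S + 1

# Figure 3.12's example: (7 - 3 + 2) / 2 + 1 = 4.
print(conv_output_size(7, 3, 2, 1))  # 4
```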

3.3.2 Nonlinearity layers

These layers are used to introduce non-linearity into the model, which helps the model train faster and more accurately and reduces overfitting on the training dataset. The Rectified Linear Unit (ReLU) is one of the most popular activation functions used in CNNs. Figure 3.13 shows the ReLU function, which converts all negative numbers to 0 and keeps positive numbers unchanged. ReLU helps to reduce computational complexity and aids convergence during training.

3.3.3 Normalization layer

The normalization layer is used to normalize the responses across adjacent output channels. It normalizes the output distribution to zero mean and unit variance.

3.3.4 Dropout layers

These layers are very popular in the training phase. They regularize the model by randomly dropping a certain percentage of connections

Fig. 3.12. Convolution example with following parameters W = 7,K = 2,F = 3,S = 2,P = 1 [21]

during training. This method prevents the network from learning redundant filters and thus combats the overfitting of large CNNs.

3.3.5 Pooling layers

In a CNN, pooling is used to downsample the input features with a non-linearity. It reduces the resolution of the input feature map but keeps the dominant features intact. There are two popular choices of pooling:

1. Max pooling.

Fig. 3.13. Rectified linear Unit function (ReLU)

Fig. 3.14. Dropout layer

2. Average pooling.

In Figure 3.15, the input volume 224 × 224 × 64 is pooled with kernel size 2 × 2 and stride 2, which produces an output volume of size 112 × 112 × 64; in this case the output depth is preserved.
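This behavior can be checked directly in PyTorch; the random tensor below stands in for a real feature volume:

```python
import torch
import torch.nn as nn

# Max-pooling a 224 x 224 x 64 volume with kernel 2 and stride 2
# halves the spatial size and keeps the depth, as described above.
pool = nn.MaxPool2d(kernel_size=2, stride=2)
x = torch.randn(1, 64, 224, 224)   # (batch, channels, height, width)
print(pool(x).shape)                # torch.Size([1, 64, 112, 112])
```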

Fig. 3.15. Pooling layer

Fig. 3.16. Maxpool with kernel size = 2 and stride = 2

3.3.6 Fully connected layers (FC)

These layers are often used as the last layers in a CNN to compute the class scores for classification. This layer has full connections to all previous activations. FC layers act as a classifier on top of the high-level features.

Fig. 3.17. Fully connected layer in CNN architecture.

3.4 Benchmarked CNN architectures

In this section, some popular CNN architectures are discussed. The section also describes the reason for choosing the SqueezeNet architecture for pruning.

3.4.1 LeNet

LeNet is one of the first successful applications of convolution neural networks, used for the classification of handwritten digits to read zip codes.

Fig. 3.18. LeNet-5 architecture [22].

3.4.2 AlexNet

AlexNet is one of the most famous CNN architectures. In the 2012 ImageNet ILSVRC challenge, AlexNet outperformed all other CNN architectures, with a top-5 error of only 16%. AlexNet uses ReLU instead of tanh to add non-linearity, which accelerates training by up to six times. To deal with overfitting of the network, it uses dropout layers with a dropout rate of 0.5 instead of other regularization techniques.

Fig. 3.19. AlexNet architecture.

3.4.3 GoogLeNet

This CNN architecture is from Google and was the winner of the ILSVRC 2014 challenge. The main highlight of this architecture was the inception module, which dramatically reduced the number of parameters of the model: it has 4 million parameters, compared to AlexNet's 60 million. It uses average pooling instead of fully connected layers at the top of the network. The most recent version of GoogLeNet is Inception-v4.

3.4.4 VGGNet

This architecture, from Karen Simonyan and Andrew Zisserman, was the runner-up in ILSVRC 2014. This model proved that the depth of the network plays a vital role in the performance of the network. It contains 16 convolution layers which are very

Fig. 3.20. GoogLeNet Network [2] (From Left to Right)

deep, and it has a homogeneous architecture throughout the model: it only performs 3 × 3 convolutions and 2 × 2 pooling from beginning to end. It contains 140 million parameters, and it is computationally expensive [3].

Fig. 3.21. VGG16 Architecture (From Left to Right)

3.4.5 SqueezeNet

This is a small CNN architecture which achieves AlexNet-level accuracy on ImageNet with 50× fewer parameters [1]. This type of small architecture is beneficial for efficient distributed training, has very little overhead when exporting new models to clients, and is feasible for FPGA and embedded deployment. In the SqueezeNet paper [1], the authors outline three main strategies:

1. Replace 3 × 3 kernels with 1 × 1 kernels for a smaller network: to reduce the network parameters, 1 × 1 filters are used instead of 3 × 3, since a 1 × 1 kernel has 9× fewer parameters than a 3 × 3 kernel.

2. Decrease the number of input channels to the 3 × 3 kernels: 1 × 1 kernels are used as a bottleneck layer, called the squeeze layer, as shown in Figure 5.5, to reduce the depth of the model and the computation for the following 3 × 3 kernels.

3. Downsample late in the network to keep the activation maps large and preserve the feature maps: the intuition is that large activation maps, retained by delayed downsampling in the network, result in higher classification accuracy.

To implement the above strategies, Iandola et al. [1] used the fire module in the SqueezeNet architecture, which contains two types of layers:

1. Squeeze layer

2. Expand layer

The fire module comprises two types of convolution layers: one with 1 × 1 filters (the squeeze layer) and one with a mix of 1 × 1 and 3 × 3 filters (the expand layer). The 1 × 1 (point-wise) filters are used as a bottleneck layer instead of 3 × 3 filters to reduce computation [1].
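A minimal PyTorch sketch of a fire module along these lines is shown below. The channel counts are example values (taken from the fire2 module of the SqueezeNet paper), not a prescription:

```python
import torch
import torch.nn as nn

class Fire(nn.Module):
    """Sketch of a fire module: a 1x1 squeeze layer feeding parallel
    1x1 and 3x3 expand layers whose outputs are concatenated."""
    def __init__(self, in_ch, squeeze_ch, expand1x1_ch, expand3x3_ch):
        super(Fire, self).__init__()
        self.squeeze = nn.Conv2d(in_ch, squeeze_ch, kernel_size=1)   # bottleneck
        self.expand1x1 = nn.Conv2d(squeeze_ch, expand1x1_ch, kernel_size=1)
        self.expand3x3 = nn.Conv2d(squeeze_ch, expand3x3_ch, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.squeeze(x))
        return torch.cat([self.relu(self.expand1x1(x)),
                          self.relu(self.expand3x3(x))], dim=1)

# Example channel counts from the SqueezeNet paper's fire2 module:
fire = Fire(96, 16, 64, 64)
print(fire(torch.randn(1, 96, 55, 55)).shape)  # torch.Size([1, 128, 55, 55])
```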

3.5 Datasets

This section describes the datasets used to experiment with the network model. A dataset is essential to benchmark the performance of the network model after training.

Fig. 3.22. Fire layer of SqueezeNet Architecture [1].

3.5.1 MNIST

This dataset consists of 60,000 training images and 10,000 test images. Each image is a 28 × 28 black-and-white image with only one channel. The classes of this dataset are the handwritten digits 0 to 9.

3.5.2 CIFAR-10

This dataset consists of 50,000 training images and 10,000 testing images. Each image is a 32 × 32 RGB color image, and there are ten different classes: cat, deer, dog, frog, horse, ship, airplane, automobile, bird, and truck.
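For reference, CIFAR-10 can be loaded with torchvision as sketched below; the normalization statistics are a common choice, not values specified in this thesis:

```python
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),  # assumed stats
])
train_set = torchvision.datasets.CIFAR10(root='./data', train=True,
                                         download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(root='./data', train=False,
                                        download=True, transform=transform)
print(len(train_set), len(test_set))  # 50000 10000
```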

3.5.3 ImageNet

ImageNet is an image database organized according to the WordNet hierarchy. It contains on average five hundred images per node. It is designed for visual object detection and classification research. ImageNet has around 20 thousand different image classes. Generally, researchers use ImageNet as the standard dataset to benchmark any CNN architecture. The popular ImageNet Large Scale Visual Recognition Challenge (ILSVRC) uses ImageNet with 1000 classes.

4. COMPARISON BETWEEN DIFFERENT DEEP LEARNING FRAMEWORKS

In this chapter, different deep learning frameworks are discussed; they provide different levels of abstraction depending upon the programming language and the application. Each framework has its own advantages and purpose. PyTorch 0.4 on Python 2.7 is used to train and generate the SqueezeNet model on the CIFAR-10 dataset, and the same configuration is used for implementing the pruning algorithms.

Fig. 4.1. Deep learning framework logos [23] [24] [25] [26] [27].

4.1 TensorFlow

TensorFlow is one of the best-known deep learning frameworks. Developed by Google, it is used widely by researchers at Google, IBM, Twitter, etc. It is popular because of its highly flexible architecture. TensorFlow supports Python, C++, and R with wrapper libraries. It comes with two highly useful tools:

1. TensorBoard: mainly used for visualization of network architecture and performance.

2. TensorFlow: used for rapid prototyping of models, with various standard APIs for designing modular networks.

4.2 Keras

Keras is another widely used deep learning framework for fast prototyping. It supports only Python as the programming language, with TensorFlow or Theano as backend support. Keras is part of TensorFlow's core API, and it is primarily used for classification, text generation, and summarization.

4.3 Caffe

Caffe is famous for its high speed, transportability, and modularity. It was developed by Berkeley Artificial Intelligence Research. It is mainly used for CNNs and is well known for vision recognition algorithms. Caffe uses libraries written in low-level languages such as C++, and it is popular in research and industry for quick prototype deployment. It can process over 60 million images per day on a single NVIDIA K40 GPU. Caffe supports C, C++, Python, and MATLAB, as well as a command-line interface.

4.4 Theano

Theano was created in 2007 by Yoshua Bengio and the research team at the University of Montreal. It was one of the first widely used deep learning frameworks. It contains a highly optimized Python library which is extremely fast, but it is not very popular because of its complex and low-level interface.

4.5 PyTorch

PyTorch is an open-source deep learning framework developed for GPU-based algorithms. It was developed by Facebook and is based on Torch, a library originally built on the scripting language Lua; PyTorch is flexible and makes it easy to build models. PyTorch is popular because of its dynamic computational graph and efficient memory usage. Modular prototyping can be performed using object-oriented programming (OOP) concepts. Other frameworks use static graphs, which sometimes take time to compute, but PyTorch uses a dynamic graph, which helps with debugging and faster processing.

5. TRAINING THE SQUEEZENET WITH THE CIFAR-10 DATASET ON PYTORCH

5.1 Transfer learning

The common practice among researchers is not to train a Convolution Neural Network from scratch (with random initialization), because training an efficient and accurate model is a tedious task: it takes a significant amount of time and computation power to train any network from scratch. Another major issue is the non-availability of huge labeled datasets. The conventional approach is to use a pre-trained model for a specific application. This type of approach is called transfer learning. There are mainly two types of transfer learning approaches:

1. CNN as fixed feature extractor: In this approach, a pre-trained model is imported (for example, a CNN trained on ImageNet with 1000 classes). The preferred method is to freeze all layers of the CNN except the last layer; these frozen layers are treated as a fixed feature extractor. The last layer is modified depending upon the application and the dataset, then fine-tuned, and the model is deployed.

2. Fine-tune the pre-trained model: In this approach, a pre-trained model is imported and retrained with a different dataset. All the weights are fine-tuned by continuing the backpropagation with the new dataset.

Transfer learning is efficient because many low-level features are common to a lot of classification tasks. Transfer learning saves a lot of computation power and training time, and observation suggests that a pretrained model performs better than a model trained from scratch. There are a few things to take care of while doing transfer learning: a model pretrained on a totally different dataset cannot be used, because there will not be any common features between two totally different datasets, and the learning rate while retraining the model should be small.

5.2 Training the SqueezeNet

In this experiment, the SqueezeNet was first trained from scratch (with random initialization) on the CIFAR-10 dataset using the PyTorch framework. The accuracy of the model after 110 epochs was 64%, as shown in Figure 5.1. Figure 5.5 shows the modified SqueezeNet architecture used for training with the CIFAR-10 dataset. On the other hand, a pre-trained SqueezeNet model, trained on ImageNet with 1000 classes, was imported. The pre-trained model was modified by removing the last convolution layer and adding an application-specific convolution layer. The CNN-as-fixed-feature-extractor approach was used to retrain the model, and 74% network accuracy was achieved, as shown in Figure 5.2.
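A sketch of this fixed-feature-extractor setup in PyTorch is shown below. It assumes the torchvision SqueezeNet v1.1 variant and a dropout rate of 0.5; these details are illustrative and are not taken from this thesis:

```python
import torch.nn as nn
from torchvision import models

# Import the ImageNet-pretrained SqueezeNet and freeze the feature layers.
model = models.squeezenet1_1(pretrained=True)
for param in model.features.parameters():
    param.requires_grad = False            # fixed feature extractor

# Replace the 1000-class final convolution with a 10-class one for CIFAR-10.
model.classifier = nn.Sequential(
    nn.Dropout(p=0.5),                     # assumed dropout rate
    nn.Conv2d(512, 10, kernel_size=1),     # application-specific layer
    nn.ReLU(inplace=True),
    nn.AdaptiveAvgPool2d((1, 1)),
)
model.num_classes = 10
```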

Fig. 5.1. SqueezeNet model accuracy - trained from scratch on CIFAR-10 dataset.

Figure 5.3 shows the training loss of the SqueezeNet model trained from scratch on the CIFAR-10 dataset, and Figure 5.4 shows the training loss of the SqueezeNet model trained from the pretrained model on the CIFAR-10 dataset.

Fig. 5.2. SqueezeNet model accuracy - (Pretrained model- ImageNet) on CIFAR-10 dataset.

Fig. 5.3. Training loss for SqueezeNet model - (Trained from scratch)

Fig. 5.4. Training loss for SqueezeNet model - (Pretrained model)

Fig. 5.5. Modified SqueezeNet architecture for CIFAR-10 dataset

6. PRUNING METHODS

The previous chapters discuss the building blocks used to generate and train a CNN model. This chapter introduces methods to reduce the model size and computation cost without affecting the accuracy of the model; the implementation, experiments, and pruning results are then discussed. Pruning is one of the methods that can compress the model. The main idea of pruning is to find unimportant or redundant parameters that do not contribute towards the output of the model. After finding these parameters, pruning takes place and the model is restructured. Finally, the model is checked for any accuracy drop, and if there is a drop, the model is retrained using the same dataset to recover the lost accuracy. This chapter discusses a method to find the uncorrelated filters or kernels in each convolution layer and remove the entire feature map from the model to increase memory, computation, and power efficiency. Consider a case of CNN training, for example vehicle classification. In Figure 6.1, a small CNN architecture is used to classify the vehicle. Due to its limited number of parameters, the small CNN cannot extract enough features from the image, which results in low accuracy, although the model size is small. On the other hand, if a massive CNN architecture is used for the same classification, the accuracy of the model will be good due to the large number of features, but the model size will be huge, as shown in Figure 6.2. To solve this problem, the pruning compression technique can be used. This approach can help to keep a proper trade-off between accuracy and model size, as shown in Figure 6.3.

6.1 Types of Pruning

There are mainly two types of pruning:

1. Fine Pruning: As shown in Figure 6.4, this type of pruning is implemented by zeroing out entire feature maps or removing the connections between feature maps. This type of pruning may require software or hardware acceleration.

2. Coarse Pruning: As shown in Figures 6.5 and 6.6, the entire feature map is removed from the model, which increases the computation speed for faster inference and reduces the model size.

This thesis concentrates on coarse pruning, where the entire feature map or kernel is removed from the convolution layer to avoid network sparsity. Identifying the right uncorrelated parameters in the network and then pruning them is a crucial task for improving the computation and power efficiency of the model. To preserve the accuracy of the model with a reduced model size, three methods are proposed in this thesis:

1. Pruning based on Taylor expansion of the cost function.

2. Pruning based on L2 normalization of activation maps.

3. Pruning based on a combination of the two above mentioned pruning criteria.

Figure 6.7 shows the proposed steps for pruning the model:

1. Choose any standardized CNN architecture and dataset based on specific ap- plication.

2. Train the network either from scratch or use a pretrained model (a pretrained model is recommended). In this case, the pretrained SqueezeNet model, trained on ImageNet with 1000 classes, was used.

3. Find the important convolutional kernels based on proposed pruning criteria.

4. Rank the kernels according to proposed algorithm.

5. Remove these low-ranked, non-contributing kernels from the model.

6. Check for any accuracy drop and retrain the model to recover this accuracy drop, if necessary.

7. Check the trade-off between accuracy and the pruning objective and decide if further pruning is required.

Fig. 6.1. Vehicle classification with small CNN.

Fig. 6.2. Vehicle classification with large CNN.

Fig. 6.3. Vehicle classification with large CNN and pruning.

Fig. 6.4. Fine Pruning.

Fig. 6.5. Coarse Pruning.

Fig. 6.6. Coarse Pruning.

Fig. 6.7. Pruning steps to reduce the model based on pruning criteria.

7. PRUNING BASED ON L2 NORMALIZATION OF ACTIVATION MAPS

7.1 Implementation algorithm

This chapter discusses a method to find the uncorrelated filters or kernels in each convolution layer and remove the entire feature map from the model to increase memory, computation, and power efficiency. It focuses on coarse pruning, where the entire feature map or kernel of a convolution layer is pruned, to avoid network sparsity. As shown in Figure 7.2, this approach uses activation-based feature map pruning; the first step is to identify the weakly activated filters.

Fig. 7.1. Activation maps generated by convolution

Table 7.1. Sensitivity to pruning the SqueezeNet model on CIFAR-10 dataset.

Pruning ratio (%)    Accuracy of model (%)
0      74
5      74
10     74
15     74
20     74
25     74
30     74
35     74
40     74
45     74
50     74
55     74
60     74
65     74
67     73
70     70
75     72
80     63
85     58
90     52
95     42
100    0

In Figure 7.1, the activation map A_i is generated by convolving the input feature map with the kernel:

F_i ~ C_i → A_i

where F_i (h_i × w_i) is the input feature map with height h_i and width w_i, C_i (h_i × w_i) is the convolution kernel, and A_i is the activation map generated by convolving the input feature map with the convolution kernel.

This leads to the L2 normalization of each activation map in each layer:

||A_i||_2 = √( Σ_{j=1}^{n} A_{i,j}² )    (7.1)

The L2 norm of each activation map is calculated, and the corresponding convolutional filters are ranked based on it. Finally, the lowest-ranked filters or kernels are pruned iteratively.
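A minimal sketch of this ranking criterion, assuming the activations of one convolution layer have already been captured with shape (batch, filters, height, width):

```python
import torch

def rank_filters_by_l2(activations):
    """Rank filters by the L2 norm of their activation maps (Eq. 7.1),
    averaged over the batch; weakest (lowest-norm) filters come first."""
    norms = activations.pow(2).sum(dim=(2, 3)).sqrt().mean(dim=0)
    return torch.argsort(norms)

activations = torch.randn(8, 64, 32, 32)          # hypothetical layer output
prune_candidates = rank_filters_by_l2(activations)[:16]
print(prune_candidates)
```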

Figure 7.2 shows the steps for pruning, which are as follows:

1. The network is trained from scratch, or a pre-trained network is used (recommended). In this work, a pre-trained SqueezeNet, trained on ImageNet, was used; the last layer was removed, and a CIFAR-10-specific sequential layer was added.

2. Based on the L2 norm of each activation map, rank the corresponding convolutional filters or kernels.

3. Remove the non-contributing lower ranked filters.

4. Fine tune the model by retraining it with the CIFAR-10 dataset.

5. Check the trade-off between accuracy and the pruning objective and decide if further pruning is required.

Fig. 7.2. Pruning steps to reduce the model based on L2 normalization of activation map. 45

Fig. 7.3. Pruning with L2 Normalization of activation with pruning ratio - 67%

Fig. 7.4. Pruning with L2 Normalization of activation with pruning ratio - 75%

Fig. 7.5. Pruning with L2 Normalization of activation with pruning ratio - 80% 46

Fig. 7.6. Pruning with L2 Normalization of activation with pruning ratio - 85%

Fig. 7.7. Pruning with L2 Normalization of activation with pruning ratio - 90%

Fig. 7.8. Pruning with L2 Normalization of activation with pruning ratio - 95%

Fig. 7.9. Accuracy vs pruning ratio for L2 normalization of activation map based pruning.

7.2 Results

SqueezeNet architecture was chosen for pruning because of its small, modern, and fully convolutional nature. It was trained on the CIFAR-10 dataset, resulting in a model size of 2.9 MB and an accuracy of 74%. The proposed pruning algorithm was implemented on SqueezeNet with the following pruning objectives:

1. Number of filters to prune per iteration = 128

2. Model size to prune (in %) = 67%

The accuracy of the model at the different pruning ratios is presented in Figure 7.9.

Iterations = (2/3 × total number of convolution filters) / (number of filters to prune per iteration) = 11.92 ≈ 12 iterations

Then the L2 norm was calculated for each activation map (see Equation 7.1), and iterative pruning took place based on the lowest-ranked filters. Figures 7.3 to 7.8 show the total number of pruned filters for each convolution layer of the SqueezeNet model at different pruning ratios. These figures, together with Figure 5.5, show that many high-level features from the deeper layers of SqueezeNet are redundant or not critical for accurate inference. Figure 7.9 suggests that pruning based on the proposed solution works well up to a 70% pruning ratio without any significant drop in the accuracy of the model compared to the original model; if the network is pruned further, the accuracy of the model decreases drastically. In conclusion, to get the optimal pruning efficiency, there should be a proper trade-off between network accuracy and pruning ratio. For the SqueezeNet, this was a 67% pruning ratio, with only a 1% drop in accuracy compared to the original model.

8. PRUNING BASED ON TAYLOR EXPANSION OF COST FUNCTION

8.1 Implementation algorithm

In this chapter, the pruning criterion based on the Taylor expansion of the cost function is discussed, followed by experiments with different pruning ratios and the pruning results. Suppose there is a dataset D that consists of the training and testing data, where the training data consist of training images and labels:

D ⇒ { X = {x_0, x_1, ..., x_n},  Y = {y_0, y_1, ..., y_n} }    (8.1)

where X denotes the input images, Y the labels, and W the network parameters (filter weights).

The pretrained model has initial weights W_0, and C(.) is the cost function of the network, which can be any negative log-likelihood-type function (for example, the cross-entropy function). C(D/W) is the cost of the network on dataset D with network parameters W, and W′ denotes the network parameters changed by pruning, as shown in Figure 8.1.

The goal is to find the critical feature maps that have a strong correlation with the output. In this chapter, the pruning problem is treated as a combinatorial optimization problem: prune the filters so as to minimize the change in the cost function (∆C):

∆C = min |C(D/W) − C(D/W′)|    (8.2)

Implementing the above approach exactly would take 2^2292 evaluations (one per subset of the 2,292 convolution filters) to reach the optimal solution. Currently, there is not enough computation power to evaluate this

Fig. 8.1. Cost function value before pruning and after pruning.

kind of computation. The solution is to follow a greedy approach: prune the feature maps iteratively, minimizing the drop in accuracy at each step, until the pruning target is reached. The cost function relies on the parameters and the inference made from them. In this approach, the parameters are considered independent of each other; in practice these parameters are dependent on each other, but the same independence assumption already holds during each step of the gradient calculation in the training process.

Let the cost function value with pruning be C(D, r_i = 0) and without pruning C(D, r_i), where r_i is the feature map produced by the i-th parameter. To evaluate C(D, r_i = 0), a first-order Taylor expansion is used. For a function f(x), the Taylor expansion at point x = k is

f(x) = Σ_{m=1}^{M} [ f^(m)(k) / m! ] (x − k)^m + R_M(x)    (8.3)

|∆C| = min |C(D/W) − C(D/W′)|    (8.4)

where f^(m)(k) is the m-th derivative of f(x) evaluated at point k, and R_M(x) is the M-th order remainder.

Approximating C(D, r_i = 0) with a first-order Taylor polynomial near r_i = 0:

C(D, r_i = 0) = C(D, r_i) − (∂C/∂r_i) r_i + R_1(r_i = 0)    (8.5)

The remainder term can be calculated by the Lagrange formula:

R_1(r_i = 0) = (∂²C / ∂r_i²) |_{r_i = ξ} · (r_i² / 2)

where ξ ∈ (0, r_i). Finally, by substituting Eq. (8.5) into Eq. (8.4) and ignoring the remainder to reduce the complexity,

|∆C(r_i)| = | C(D, r_i) − (∂C/∂r_i) r_i − C(D, r_i) | = | (∂C/∂r_i) r_i |    (8.6)

This approach calculates ∆C for each feature map of each layer. The proposed algorithm ranks each filter based on its ∆C value; subsequently, the ranks in each layer are rescaled using the L2 norm of the ranks in that layer. First, the prune ratio per iteration is decided based on the pruning target. Both the activation and the gradient with respect to the activation are used to find the change in the cost function of the model; a high value of ∆C indicates that the feature map correlates strongly with the output. After that, the lowest-ranked filters or feature maps are pruned. When the network is pruned, its accuracy drops, and the network must be retrained to recover the accuracy. Finally, the trade-off between the pruning objective and the drop in accuracy is checked, and based on the results further pruning is decided.
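A minimal sketch of this Taylor-based ranking, assuming the activation of one layer and the gradient of the cost with respect to it have been captured during a forward and backward pass; the averaging and layer-wise rescaling details are one plausible reading, not the thesis's exact code:

```python
import torch

def taylor_rank(activation, gradient):
    """Rank filters by the Taylor criterion of Eq. 8.6: |(dC/dr_i) * r_i|.
    Both tensors are assumed to have shape (batch, filters, height, width)."""
    delta_c = (activation * gradient).sum(dim=(0, 2, 3)).abs()
    delta_c = delta_c / (activation.numel() / activation.shape[1])  # mean over points
    return delta_c / (delta_c.norm() + 1e-8)    # layer-wise L2 rescaling of ranks

activation = torch.randn(8, 64, 16, 16)         # hypothetical captured tensors
gradient = torch.randn(8, 64, 16, 16)
ranks = taylor_rank(activation, gradient)
print(torch.argsort(ranks)[:16])                # lowest-ranked filters to prune
```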

8.2 Results

The proposed pruning algorithm was experimented with at different pruning ratios, and an analysis of the sensitivity of the network accuracy to the pruning ratio was carried out.

Fig. 8.2. Pruning with Taylor expansion based criteria with pruning ratio - 67%

Fig. 8.3. Pruning with Taylor expansion based criteria with pruning ratio - 70%

Figure 8.8 shows the sensitivity of the SqueezeNet model's accuracy to the Taylor expansion based pruning criterion. Observation indicates that this pruning algorithm works well up to a 69% pruning ratio; if the network is pruned by more than 69%, the

Fig. 8.4. Pruning with Taylor expansion based criteria with pruning ratio - 80%

Fig. 8.5. Pruning with Taylor expansion based criteria with pruning ratio - 85%

accuracy of the model decreases drastically. The proposed Taylor expansion-based criterion prunes the SqueezeNet model by ∼69%. The pretrained SqueezeNet model was trained on the CIFAR-10 dataset, with an accuracy of 74% and a model size of 2.95-3.0 MB. After that, the proposed algorithm was applied iteratively to prune the network.

Fig. 8.6. Pruning with Taylor expansion based criteria with pruning ratio - 90%

Fig. 8.7. Pruning with Taylor expansion based criteria with pruning ratio - 95%

Pruning objectives decided are as follows:

1. Number of filters to prune per iteration = 128

2. Model size to prune (percentage-wise) = 67%

Number of iterations = (2/3 × total number of convolution filters) / (number of filters to prune per iteration) = 11.92 ≈ 12 iterations

Then, ∆C was calculated for each feature map based on the proposed Taylor expansion-based criterion:

|∆C(r_i)| = | (∂C/∂r_i) r_i |

Consider layers with indices 0-2 as the initial layers, 3-7 as the middle layers, and 9-12 as the last layers of the SqueezeNet architecture (see Table 8.1). It is observed that 77.47% of the pruned filters came from the last layers, 21.7% from the middle layers, and 0.7% from the initial layers. Most of the pruned feature maps were found to be from the last layers; this observation suggests that many high-level features do not contribute towards the inference. With this approach, the SqueezeNet model size is reduced from 2.9 MB to 984 KB, a total reduction of 69% in the model size, with an accuracy drop of only ∼1%, i.e., 73% on the CIFAR-10 dataset (refer to Table 8.2).

Fig. 8.8. Accuracy vs pruning ratio for Taylor expansion based criteria

Table 8.1. Pruning pattern per iteration of the SqueezeNet model using the Taylor expansion-based criterion with pruning ratio = 67%.

Layer (index)     It1  It2  It3  It4  It5  It6  It7  It8  It9  It10  It11  It12  Total
Convolution (0)     0    0    0    1    0    0    1    2    1     2     3     1     11
ReLU (1)            0    0    0    0    0    0    0    0    0     0     0     0      0
Pooling (2)         0    0    0    0    0    0    0    0    0     0     0     0      0
Fire (3)            1    1    1    2    3    4    7    8    5     9     4     7     52
Fire (4)            1    0    0    3    1    4    5    1    3     4     8     3     33
Pooling (5)         0    0    0    0    0    0    0    0    0     0     0     0      0
Fire (6)            2    4    8   12    9   15   17   10   10    14    14    19    134
Fire (7)            0    1    5   12   10   18   10    4   13    13    13    17    116
Pooling (8)         0    0    0    0    0    0    0    0    0     0     0     0      0
Fire (9)           28   11   15   18   13   25   20   22   27    20     8    19    226
Fire (10)          19   18   18   19   28   17   26   18   22    19    27    16    247
Fire (11)          27   33   33   37   34   37   29   38   26    23    39    21    377
Fire (12)          50   60   48   24   30    8   13   25   21    24    12    25    340
Total             128  128  128  128  128  128  128  128  128   128   128   128   1536

Table 8.2. Sensitivity to pruning the SqueezeNet model on CIFAR-10 dataset.

Pruning ratio (%)    Accuracy of model (%)
0      74
5      74
10     74
15     74
20     74
25     74
30     74
35     74
40     74
45     74
50     74
55     74
60     74
65     74
67     74
70     73
75     69
80     65
85     59
90     53
95     45
100    0

9. PRUNING BASED ON A COMBINATION OF TAYLOR EXPANSION OF COST FUNCTION & L2 NORMALIZATION OF ACTIVATION MAPS

9.1 Implementation algorithm

In this chapter, the combination of the two pruning criteria is discussed: pruning based on Taylor expansion of the cost function and L2 normalization of activation maps. Experiments with different pruning ratios are then discussed, followed by the pruning results. The experiment combines the ranking methods of the Taylor expansion of the cost function and the L2 normalization of activation maps, and is carried out to find the optimal pruning efficiency of the modified SqueezeNet architecture on the CIFAR-10 dataset. The goal of the proposed algorithm is to find, via the ranking methods, the convolution filters that are uncorrelated with the inference, and then to prune the low-ranked filters or kernels from the architecture. Pruning the kernels would introduce sparsity into the network; to combat this problem, the network architecture is restructured, and the pruning algorithm is then performed iteratively until the pruning target is reached. Figure 9.1 shows the pruning steps of the proposed algorithm.

1. Import the model (SqueezeNet):

2. Find the critical convolution kernels:

To find these essential filters towards the output, the first step is to calculate an individual ranking for every convolution filter in the model and then combine the two ranking methods:

(a) Ranking based on Taylor expansion of the cost function: This ranking method calculates ∆C for each feature map using the following formula.

$$|\Delta C(r_i)| = \left| C(D, r_i) - \frac{\partial C}{\partial r_i}\, r_i - C(D, r_i) \right| = \left| \frac{\partial C}{\partial r_i}\, r_i \right|$$

This algorithm ranks each convolution filter by its $|\Delta C|$ value; the ranks within each layer are then normalized by their L2 norm.

(b) Ranking based on L2 normalization of activation maps:

This ranking method calculates the L2 norm of each activation map and decides the ranking of the corresponding convolution filter based on the following formula.

$$\|A\|_2 = \sqrt{\sum_{i=1}^{n} A_i^2}$$

After that, the two individual rankings are combined to obtain the final ranking (a minimal sketch of this combination appears after this list).

3. The lowest-ranked filters are selected using a min-heap, and the model is pruned iteratively.

4. Fine-tune the model by retraining it on the CIFAR-10 dataset.

5. Check the trade-off between accuracy and the pruning objective, and decide whether further pruning is required.
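A minimal sketch of the combination and min-heap selection steps is given below, under the assumption that each criterion produces a dictionary mapping (layer, filter index) to a score; this layout is illustrative, not the thesis's data structure.

```python
import heapq
import torch

def l2_rescale(scores):
    """Rescale one criterion's scores by their overall L2 norm."""
    norm = torch.tensor(list(scores.values())).norm() + 1e-8
    return {k: float(v) / float(norm) for k, v in scores.items()}

def combined_ranking(taylor_scores, l2_scores):
    """Sum the two rescaled rankings filter by filter."""
    t, a = l2_rescale(taylor_scores), l2_rescale(l2_scores)
    return {k: t[k] + a[k] for k in t}

def lowest_ranked(ranking, n=128):
    """heapq.nsmallest maintains a min-heap internally."""
    return heapq.nsmallest(n, ranking.items(), key=lambda kv: kv[1])

# Tiny usage example with made-up scores:
taylor = {("fire9", 0): 0.30, ("fire9", 1): 0.01, ("fire12", 0): 0.50}
actl2  = {("fire9", 0): 1.20, ("fire9", 1): 0.10, ("fire12", 0): 2.00}
print(lowest_ranked(combined_ranking(taylor, actl2), n=2))
```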

Fig. 9.1. Pruning steps to reduce the model based on a combination of Taylor expansion of cost function and L2 normalization of activation maps.

Fig. 9.2. Pruning based on a combination of Taylor expansion of cost function and L2 normalization of activation map with pruning ratio - 67%

Fig. 9.3. Pruning based on a combination of Taylor expansion of cost function and L2 normalization of activation map with pruning ratio - 75%

9.2 Results

The SqueezeNet architecture is chosen for implementing this pruning algorithm because of its fully convolutional, modern architecture inspired by the Inception module. It was trained on the CIFAR-10 dataset using the transfer learning technique, with a model size of around 3 MB and an accuracy of 74%. The proposed pruning algorithm on

Fig. 9.4. Pruning based on a combination of Taylor expansion of cost function and L2 normalization of activation map with pruning ratio - 80%

Fig. 9.5. Pruning based on a combination of Taylor expansion of cost function and L2 normalization of activation map with pruning ratio - 85%

SqueezeNet is implemented. The pruning objectives decided are as follows:

1. Number of filters to prune per iteration = 128

2. Model size to prune (percentage-wise) = 67%

Fig. 9.6. Pruning based on a combination of Taylor expansion of cost function and L2 normalization of activation map with pruning ratio - 90%

Fig. 9.7. Pruning based on a combination of Taylor expansion of cost function and L2 normalization of activation map with pruning ratio - 95%

The number of iterations required to achieve the pruning target is calculated below. The accuracy of the model at different pruning ratios is presented in Figure 9.9.

$$\text{Number of iterations} = \frac{\frac{2}{3} \times \text{Total number of convolution filters}}{\text{Number of filters to prune per iteration}} = 11.92 \approx 12 \text{ iterations}$$

Fig. 9.8. Accuracy vs. pruning ratio for the combination of Taylor expansion of cost function and L2 normalization of activation maps.

Fig. 9.9. Accuracy vs. prune ratio comparison between the three pruning methods.

Finally, iterative pruning takes place based on the lowest-ranked filters. Figures 9.2 to 9.7 show the total number of pruned filters for each convolution layer of the SqueezeNet model at different pruning ratios. Together with Figure 5.5, they show that many mid- and high-level features from the deeper layers of SqueezeNet are redundant or not critical for accurate inference, so this pruning algorithm concentrates on these mid- and high-level filters. Figure 9.8 suggests that pruning based on the proposed solution works well up to a 72% pruning ratio without any significant drop in model accuracy compared to the original model; if the network is pruned further, the accuracy decreases drastically. In conclusion, achieving optimal pruning efficiency requires a proper trade-off between network accuracy and pruning ratio; for SqueezeNet, this is a 72% pruning ratio with only a 1% drop in accuracy compared to the original model, as shown in Figure 9.9. Comparing the three pruning algorithms, the results show that pruning based on the combination of Taylor expansion of the cost function and L2 normalization of the activation maps performs better than the two individual pruning criteria, as shown in Figure 9.9.

10. HARDWARE DEPLOYMENT OF PRUNED SQUEEZENET MODEL ON BLUEBOX USING RTMAPS

One of the main goals of this thesis is to deploy the pruned model on specialized hardware such as the BlueBox [28]. The previous chapters discussed model compression through pruning; this chapter focuses on hardware deployment of the pruned model. The modified SqueezeNet was trained and pruned on an Intel i7 processor with 32 GB RAM and an Nvidia GTX 1080 GPU using the PyTorch framework, and the final pruned model was generated.

Fig. 10.1. Hardware deployment overview with BlueBox 2.0 and RTMaps.

RTMaps provides support for deploying deep learning architectures; the PyTorch framework can be installed under both RTMaps desktop studio and RTMaps runtime studio. For this thesis, the software enablement on the LS2084A and S32V234 SoCs is deployed using the Linux board support package, which is built with the Yocto framework. The LS2084A and S32V234 SoCs run Ubuntu 16.04 LTS, a complete, developer-supported system that contains the full kernel source code, compilers, and toolchains, along with ROS Kinetic and a Docker package. The deployment overview is shown in Figure 10.1.

10.1 NXP BlueBox

The major challenge in porting an algorithm to a vision system is to optimally map it to the various units of the system in order to achieve an overall boost in performance. This requires detailed knowledge of the individual processors, their capabilities, and their limitations. For example, the APEX processors are highly parallel computing units with a Single Instruction Multiple Data (SIMD) architecture and handle data-level parallelism quite well. One of the significant requirements of this thesis is to analyze the capability of the NXP BlueBox 2.0 (BLBX2) [29] as an autonomous embedded system for real-time applications. The BlueBox is a development platform designed for advanced driver assistance system features in autonomous vehicles. It is an integrated package for automated driving comprising three independent systems on chip (SoCs: S32V234, LS2084A, S32R274). The BLBX2 runs an independent embedded Linux OS on both the S32V and LS2 processors, while the S32R typically runs bare-metal code or an RTOS. The BlueBox, shown in Figure 10.2, functions as the central computing unit (the brain) of the system, providing the capability to control the car through actions based on inputs collected from its surroundings. This section details the components incorporated within the BlueBox.

Fig. 10.2. Hardware Architecture for Bluebox 2.0 [Pic Courtesy NXP].

10.1.1 S32V234

The S32V234 [5] is a vision-oriented processor designed for computationally intensive vision and image processing applications. The processor includes an Image Signal Processor (ISP) on all MIPI-CSI camera inputs, providing image conditioning and allowing multiple cameras to be integrated. It also contains APEX-2 vision accelerators and a 3D GPU designed to accelerate computer vision functions such as object detection, recognition, surround view, machine learning, and sensor fusion. It further contains four ARM Cortex-A53 cores and an ARM Cortex-M4 core for embedded applications. The processor can run Linux board support packages (BSP), the Linux OS (Ubuntu 16.04 LTS), and the NXP Vision SDK, and it boots from the SD card interface available at the front panel of the BlueBox. A complete overview of the S32V234 processor is shown in Figures 10.3 and 10.4.

Fig. 10.3. Hardware Architecture for Bluebox 2.0 [Pic Courtesy NXP].

Fig. 10.4. Hardware Architecture for S32V234 [Pic Courtesy NXP].

10.1.2 LS2084A

The LS2 processor in the BLBX2 is designed as a general-purpose high-performance computing platform. The processor consists of eight ARM Cortex-A72 cores and 10 Gb Ethernet ports, supports a large total capacity of DDR4 memory, and features a PCIe expansion slot for additional hardware such as GPUs or FPGAs, making it especially suitable for applications that demand high performance, heavy computation, or support for multiple concurrent threads with low latency. In addition to being suitable for high-performance computing, the LS2 is also a convenient platform for developing ARMv8 code. The LS2 is connected to a Lite-On Automotive solid-state drive via SATA to provide ample storage for software installation; it also has an SD card interface that allows the processor to run Linux board support packages (BSP) and the Linux OS (Ubuntu 16.04 LTS).

Fig. 10.5. Hardware Architecture for LS2084A [Pic Courtesy NXP].

10.2 Real-Time Multisensor applications (RTMaps)

RTMaps is designed for the development of multimodal applications [30], providing the ability to incorporate multiple sensors such as camera, lidar,

Fig. 10.6. System overview of NXP BlueBox 2.0 BLBX2.

and radar. It has been tested for processing and fusing data streams in real time as well as in post-processing scenarios. The software architecture consists of several independent modules that can be used in different situations and circumstances. RTMaps is an asynchronous, high-performance platform for designing multisensor frameworks and prototyping sensor fusion algorithms. RTMaps consists of several modules: the RTMaps Runtime Engine, RTMaps Studio, the RTMaps Component Library, and the RTMaps SDK [37].

10.2.1 RTMaps Runtime Engine

The Runtime Engine is an easily deployable, multi-threaded, highly optimized module designed to be integrable with third-party applications. It is accountable for all base services such as component registration, buffer management, time stamping, threading, and priorities.

10.2.2 RTMaps Component Library

It consists of the software modules used to develop an application; these are easily interfaceable with automotive and other related sensors, as well as with packages such as Python, C++, Simulink models, and 3D viewers.

10.2.3 RTMaps Studio

It is the graphical modeling environment, with the functionality of programming using Python packages. The development interface is available for Windows- and Linux-based platforms. Applications are developed using the modules and packages available from the RTMaps Component Library.

10.2.4 RTMaps Embedded

It is a framework comprising the component library and the runtime engine, capable of running on an embedded x86- or ARM-based platform such as the NXP BlueBox, Raspberry Pi, or dSPACE MicroAutoBox. For this thesis, the RTMaps Embedded v4.5.3 platform is tested with the NXP BlueBox: it runs independently on the BlueBox, with RTMaps Remote Studio operating on a computer to provide the graphical interface for development and testing. The connection between the computer running RTMaps Remote Studio and the embedded platform is established over a static TCP/IP link, as shown in Figure 10.7.

10.3 Hardware Implementation

Figure 10.7 shows the process of hardware deployment of the pruned model using RTMaps and the BlueBox 2.0. Training and pruning take place on a desktop PC in the PyTorch framework, and after successful application of the pruning algorithms a final pruned model is generated. RTMaps desktop studio is used to load the model parameters, and a TCP/IP connection is established between RTMaps desktop studio and RTMaps runtime studio through the RTMaps runtime engine. Finally, the performance of the model is evaluated on the BlueBox 2.0. To compare the pruned model with the original model, two experiments were conducted. First, the original model (model size 2.9 MB) was deployed on the PC to run inference on a single image and the inference time was measured (2 ms); the same model was then deployed on the BlueBox 2.0 with the RTMaps software and the inference time was measured (1.8 ms). The same steps were repeated for the pruned model (model size 846 KB): on the PC the inference time was 1 ms, and on the BlueBox 2.0 it was 0.9 ms, as shown in Figure 10.8.
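An illustrative timing harness for the single-image experiment is sketched below. The model file name is a hypothetical placeholder, and the BlueBox numbers quoted above were measured inside RTMaps rather than with this script.

```python
import time
import torch

# Load the pruned model saved with torch.save(model, ...); the path is assumed.
model = torch.load("pruned_squeezenet.pth", map_location="cpu")
model.eval()

x = torch.randn(1, 3, 32, 32)        # one CIFAR-10-sized input image
with torch.no_grad():
    for _ in range(10):              # warm-up runs so timings stabilize
        model(x)
    t0 = time.perf_counter()
    model(x)
    dt = (time.perf_counter() - t0) * 1000
print(f"{dt:.2f} ms per image")
```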

Fig. 10.7. Hardware deployment of the pruned model using the RTMaps and BlueBox 2.0.

Fig. 10.8. Model performances with PC and BlueBox 2.0.

11. SUMMARY

Modern CNN architectures have a large number of parameters and considerable training time and inference cost. This thesis proposes algorithms to prune a convolutional neural network (SqueezeNet) based on three criteria:

1. Pruning based on Taylor expansion of the cost function.

2. Pruning based on L2 normalization of activation maps.

3. Pruning based on a combination of Taylor expansion of the cost function and L2 normalization of activation maps.

The proposed algorithms achieve an optimal pruning efficiency of 72% on the SqueezeNet architecture without a significant loss in accuracy compared to the original model on the CIFAR-10 dataset. These pruning algorithms compress the model without introducing network sparsity. The proposed pruning technique also identifies the sensitive and robust filters in the model, which helps in understanding and improving modern architectures. Results show that most of the pruned kernels come from the mid- and high-level layers, and that pruning based on the combination of Taylor expansion of the cost function and L2 normalization of activation maps achieves better pruning efficiency than either individual criterion. Finally, the pruned model was deployed on the BlueBox 2.0 using the RTMaps software and its performance was measured: the model size was reduced from 2.9 MB to 846 KB, a 72% reduction, and the inference time was reduced by 50% on the embedded target (BlueBox 2.0).

REFERENCES

[1] F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and K. Keutzer, "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and < 0.5 MB model size," arXiv preprint arXiv:1602.07360, 2016 (Accessed June 18, 2018). [Online]. Available: https://arxiv.org/abs/1602.07360

[2] P. Ballester and R. M. de Araújo, "On the performance of GoogLeNet and AlexNet applied to sketches," in AAAI, 2016, pp. 1124–1128.

[3] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014 (Accessed July 20, 2018). [Online]. Available: https://arxiv.org/abs/1409.1556

[4] X. Li, G. Zhang, H. H. Huang, Z. Wang, and W. Zheng, "Performance analysis of GPU-based convolutional neural networks," in 2016 45th International Conference on Parallel Processing (ICPP). IEEE, 2016, pp. 67–76.

[5] NXP, "Nxp-s32v234," 2018 (Accessed August 20, 2018). [Online]. Available: https://www.nxp.com/products///processors-and-microcontrollers/arm-based-processors-and-mcus/s32-automotive-platform:S32

[6] T.-J. Yang, Y.-H. Chen, and V. Sze, "Designing energy-efficient convolutional neural networks using energy-aware pruning," arXiv preprint arXiv:1611.05128, 2016 (Accessed June 1, 2018). [Online]. Available: https://arxiv.org/abs/1611.05128

[7] J. Gildenblat, "Pruning deep neural networks to make them fast and small," 2009 (Accessed August 18, 2018). [Online]. Available: https://jacobgil.github.io/deeplearning/pruning-deep-learning

[8] B. O. Ayinde and J. M. Zurada, "Building efficient convnets using redundant feature pruning," arXiv preprint arXiv:1802.07653, 2018 (Accessed September 21, 2018). [Online]. Available: https://arxiv.org/abs/1802.07653

[9] Y. He, X. Zhang, and J. Sun, "Channel pruning for accelerating very deep neural networks," in International Conference on Computer Vision (ICCV), vol. 2, no. 6, 2017.

[10] J.-H. Luo and J. Wu, "An entropy-based pruning method for cnn compression," arXiv preprint arXiv:1706.05791, 2017 (Accessed June 19, 2018). [Online]. Available: https://arxiv.org/abs/1706.05791

[11] P. Molchanov, S. Tyree, T. Karras, T. Aila, and J. Kautz, "Pruning convolutional neural networks for resource efficient transfer learning," CoRR, abs/1611.06440, 2016 (Accessed June 20, 2018). [Online]. Available: https://arxiv.org/abs/1611.06440

[12] Y. LeCun, J. S. Denker, and S. A. Solla, "Optimal brain damage," in Advances in Neural Information Processing Systems, 1990 (Accessed June 10, 2018), pp. 598–605.

[13] Z. Mariet and S. Sra, "Diversity networks: Neural network compression using determinantal point processes," arXiv preprint arXiv:1511.05077, 2015 (Accessed July 9, 2018). [Online]. Available: https://arxiv.org/abs/1511.05077

[14] M. Mathieu, M. Henaff, and Y. LeCun, "Fast training of convolutional networks through FFTs," arXiv preprint arXiv:1312.5851, 2013 (Accessed July 28, 2018). [Online]. Available: https://arxiv.org/abs/1312.5851

[15] A. Lavin and S. Gray, "Fast algorithms for convolutional neural networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 4013–4021.

[16] A. Polyak and L. Wolf, "Channel-level acceleration of deep face representations," IEEE Access, vol. 3, pp. 2163–2175, 2015.

[17] S. Anwar, K. Hwang, and W. Sung, "Structured pruning of deep convolutional neural networks," ACM Journal on Emerging Technologies in Computing Systems (JETC), vol. 13, no. 3, p. 32, 2017.

[18] H. Li, A. Kadav, I. Durdanovic, H. Samet, and H. P. Graf, "Pruning filters for efficient convnets," arXiv preprint arXiv:1608.08710, 2016 (Accessed July 21, 2018). [Online]. Available: https://arxiv.org/abs/1608.08710

[19] P. Durvesh, "Squeezed cnn: A real-time classifier with rtmaps and bluebox 2.0," 2018 (Accessed August 30, 2018). Unpublished paper, submitted to CCWC 2019: IEEE Annual Computing and Communication Workshop and Conference.

[20] S. Hijazi, R. Kumar, and C. Rowen, "Using convolutional neural networks for image recognition," Cadence Design Systems Inc.: San Jose, CA, USA, 2015.

[21] CS231n, "Convolutional neural networks for visual recognition," 2018 (Accessed September 1, 2018). [Online]. Available: http://cs231n.github.io/convolutional-networks/

[22] Y. LeCun et al., "LeNet-5, convolutional neural networks," p. 20, 2015 (Accessed August 26, 2018). [Online]. Available: http://yann.lecun.com/exdb/lenet

[23] PyTorch, "PyTorch," 2018 (Accessed August 21, 2018). [Online]. Available: https://pytorch.org/

[24] Keras, "Keras," 2018 (Accessed August 21, 2018). [Online]. Available: https://keras.io/

[25] Caffe, "Caffe," 2018 (Accessed August 21, 2018). [Online]. Available: http://caffe.berkeleyvision.org/

[26] Theano, "Theano," 2018 (Accessed August 21, 2018). [Online]. Available: http://deeplearning.net/software/theano/

[27] TensorFlow, "TensorFlow," 2018 (Accessed August 21, 2018). [Online]. Available: https://www.tensorflow.org/

[28] S. Venkitachalam and A. Gaikwad, "Realtime applications with rtmaps and bluebox 2.0," 2018 (Accessed August 28, 2018). [Online]. Available: https://csce.ucmss.com/cr/books/2018/LFS/CSREA2018/ICA4218.pdf

[29] Nxp.com, "Nxp bluebox autonomous driving—nxp," 2018 (Accessed June 2, 2018). [Online]. Available: https://www.nxp.com/products/processors-andmicrocontrollers/arm-based-processors-and-mcus/qoriq-layerscape-armprocessors/nxp-bluebox-autonomous-driving-developmentplatform:BLBX

[30] Intempora, "Intempora - rtmaps - a component-based framework for rapid development of multi-modal applications," 2018 (Accessed September 21, 2018). [Online]. Available: https://intempora.com/