Neural Networks in Hardware
UNLV Theses, Dissertations, Professional Papers, and Capstones
12-15-2019

Neural Networks in Hardware
Jiong Si

Follow this and additional works at: https://digitalscholarship.unlv.edu/thesesdissertations
Part of the Electrical and Computer Engineering Commons

Repository Citation
Si, Jiong, "Neural Network in Hardware" (2019). UNLV Theses, Dissertations, Professional Papers, and Capstones. 3845.
http://dx.doi.org/10.34917/18608784

This Dissertation is protected by copyright and/or related rights. It has been brought to you by Digital Scholarship@UNLV with permission from the rights-holder(s). You are free to use this Dissertation in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s) directly, unless additional rights are indicated by a Creative Commons license in the record and/or on the work itself. This Dissertation has been accepted for inclusion in UNLV Theses, Dissertations, Professional Papers, and Capstones by an authorized administrator of Digital Scholarship@UNLV. For more information, please contact [email protected].

NEURAL NETWORKS IN HARDWARE

By

Jiong Si

Bachelor of Engineering – Automation
Chongqing University of Science and Technology
2008

Master of Engineering – Precision Instrument and Machinery
Hefei University of Technology
2011

A dissertation submitted in partial fulfillment of the requirements for the

Doctor of Philosophy – Electrical Engineering

Department of Electrical and Computer Engineering
Howard R. Hughes College of Engineering
The Graduate College

University of Nevada, Las Vegas
December 2019

Copyright 2019 by Jiong Si
All Rights Reserved

Dissertation Approval

The Graduate College
The University of Nevada, Las Vegas

November 6, 2019

This dissertation prepared by Jiong Si entitled Neural Networks in Hardware is approved in partial fulfillment of the requirements for the degree of Doctor of Philosophy – Electrical Engineering, Department of Electrical and Computer Engineering.

Sarah Harris, Ph.D., Examination Committee Chair
Shahram Latifi, Ph.D., Examination Committee Member
R. Jacob Baker, Ph.D., Examination Committee Member
Evangelos Yfantis, Ph.D., Graduate College Faculty Representative
Kathryn Hausbeck Korgan, Ph.D., Graduate College Dean

Abstract

This dissertation describes the implementation of several neural networks built on a field programmable gate array (FPGA) and used to recognize a handwritten digit dataset, the Modified National Institute of Standards and Technology (MNIST) database. A novel hardware-friendly activation function called the dynamic ReLU (D-ReLU) function is proposed. Compared to traditional activation functions, this activation function decreases the chip area and power of neural networks at no cost to prediction accuracy.

Three FPGA implementations of neural networks are presented: a 2-layer online training fully-connected neural network, a 3-layer offline training fully-connected neural network, and two implementations of a Super-Skinny Convolutional Neural Network (SS-CNN).

The 2-layer online training fully-connected neural network was built on an FPGA with varying data width. Reducing the data width from 8 to 4 bits reduces prediction accuracy by only 11% while decreasing FPGA area by 41%.
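As a point of reference for the data-width reduction described above, the sketch below shows one common way a real-valued network weight can be quantized to a signed fixed-point code of a given bit width. This is a minimal illustration only; the Q-format split between integer and fractional bits, and the function names, are assumptions, not the dissertation's actual scheme.

```python
import numpy as np

def quantize(values, total_bits, frac_bits):
    """Round values to signed fixed-point with `total_bits` bits,
    `frac_bits` of which are fractional (assumed Q-format)."""
    scale = 2 ** frac_bits
    lo = -(2 ** (total_bits - 1))       # most negative representable code
    hi = 2 ** (total_bits - 1) - 1      # most positive representable code
    codes = np.clip(np.round(values * scale), lo, hi)
    return codes / scale                # map codes back to real values

weights = np.array([0.91, -0.37, 0.05, -0.72])
print(quantize(weights, 8, 4))   # 8-bit data width
print(quantize(weights, 4, 2))   # 4-bit data width: coarser, but smaller hardware
```

Narrower codes lose precision (the 4-bit version rounds 0.91 all the way to 1.0), which is the accuracy-versus-area trade-off the abstract quantifies.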
The 3-layer offline training fully-connected neural network was built on an FPGA with both the sigmoid and the proposed D-ReLU activation functions. Compared to networks that use the sigmoid function, networks using the proposed D-ReLU function use 24-41% less area with no loss in prediction accuracy. Further reducing the data width of the 3-layer networks from 8 to 4 bits decreases prediction accuracy by only 3-5% while reducing area by 9-28%.

The proposed sequential and parallel SS-CNN networks achieve state-of-the-art recognition accuracy (99%) with fewer layers and fewer neurons than prior networks such as LeNet-5. Using parameters with 8 bits of precision, the FPGA implementations of this SS-CNN show no recognition accuracy loss compared to the 32-bit floating-point software implementation. In addition to high recognition accuracy, both proposed FPGA implementations are low power and fit in a low-cost Cyclone IVE FPGA. Moreover, these FPGA implementations execute up to 145× faster than software implementations despite running at a 97× to 120× lower clock rate. Thus, FPGA implementations of neural networks offer a high-performance, low-power alternative to traditional software methods, and the proposed novel D-ReLU activation function offers additional performance and power savings. Furthermore, the hardware implementations of the proposed SS-CNN provide a high-performance, hardware-friendly, and power-efficient alternative to bulkier convolutional neural networks.
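The D-ReLU function itself is defined later in the dissertation (Section 3.6). Purely for orientation, the sketch below contrasts a standard ReLU with one plausible "dynamic" variant whose breakpoint can be adjusted at run time; the `dynamic_relu` form and its `threshold` parameter are illustrative assumptions, not the dissertation's actual D-ReLU definition.

```python
import numpy as np

def relu(x):
    """Standard rectified linear unit: max(0, x)."""
    return np.maximum(0.0, x)

def dynamic_relu(x, threshold=0.0):
    """Hypothetical ReLU variant with a movable breakpoint (e.g., set
    per layer at run time); NOT the exact D-ReLU of Section 3.6."""
    return np.where(x > threshold, x, 0.0)

x = np.linspace(-2.0, 2.0, 9)
print(relu(x))
print(dynamic_relu(x, threshold=0.5))
```

A piecewise-linear function like this needs only a comparator and a multiplexer in hardware, which is why ReLU-style activations are far cheaper in area and power than a sigmoid.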
Acknowledgements

I would like to thank my advisor, Dr. Sarah L. Harris, for her continuous guidance, support, and encouragement during my five years of study and work at UNLV. She shared much of her experience with me; we talked about research, being a Ph.D. student, careers, family, and play. She is my mentor, role model, elder sister, and good friend. I would also like to thank one of my idols, Dr. Evangelos Yfantis of the Computer Science department at UNLV. I enjoyed every minute of his class "Neural Networks and Genetic Algorithms," and it is always inspiring and exciting to talk with him. I hope one day I can fly a plane like him and enjoy my work as much as he does. I also thank my other committee members, Dr. Shahram Latifi, Dr. R. Jacob Baker, and Dr. Justin Zhan, for serving on my committee and giving me a great deal of advice on my research and dissertation. Dr. Yingtao Jiang, Dr. Grzegorz Chmaj, Ms. Jennifer Reff, and other professors and staff members in the ECE department have also been very supportive and helpful during my journey at UNLV. Outside of UNLV, I would like to thank my hiking friends; they are my family in Las Vegas. I always come back boosted and refreshed after hiking, scrambling, talking, and laughing with them.

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
Chapter 2 Background
  2.1 Fully-connected Neural Network
  2.2 Convolutional Neural Network
  2.3 Activation Functions
    2.3.1 Sigmoid Function
    2.3.2 ReLU Function
  2.4 Background Summary
Chapter 3 Methodology and Procedures
  3.1 MNIST Dataset
  3.2 2-layer Fully-connected Neural Network
    3.2.1 Network Architecture
    3.2.2 FPGA System
  3.3 3-layer Fully-connected Neural Network
    3.3.1 Network Architecture
    3.3.2 FPGA System
  3.4 Super-Skinny Convolutional Neural Network (SS-CNN)
    3.4.1 Network Design and Architecture
    3.4.2 FPGA System
  3.5 Sigmoid Function Approximation
  3.6 Dynamic ReLU Function
Chapter 4 Results and Discussion
  4.1 2-layer Fully-connected Neural Network on an FPGA