A Novel Architecture for a High Performance Low Complexity Neural Device

Richard Patrick Palmer

A thesis presented for the degree of Doctor of Philosophy, Department of Computer Science, University College London.

February 1992

ABSTRACT

This thesis presents a novel architecture on which to implement neural networks and simple digital signal processing models in a compact and low cost manner. The need for this architecture was identified from a particular application, which required a portable real-time device with which to perform neural network and simple digital signal processing models on speech signals. The development of the architecture was driven by this and other similar applications, and did not start from any preconceived ideas; this allowed complete freedom in its development, enabling a more novel architecture to emerge.

In this thesis the development of this architecture is reported. This extends from the analysis of current neural models, the identification of the necessary features from these, and the creation of a unified architecture to provide these features. The design of a silicon chip based on this architecture is then presented.

The architecture that has been developed combines the programmability and high efficiency of MIMD arrays with the low complexity, in both communications and control, of SIMD arrays. The device built using this architecture incorporates a single processing element that consists of only 5300 transistors. Projections indicate that a totally integrated array could be built, incorporating fourteen such processors with on-chip memory, A/D and D/A converters.

An analysis of the architecture, and of other architectures that fall into the same class, is then presented. This analysis identifies some common features of MIMD/SIMD machines, and concludes with some projections for the future.

Acknowledgements

I would like to thank the many people who helped me during my time working on this thesis. Mostly I would like to thank Peter Rounce for his help during the development of this thesis, and for his tireless reading, and rereading, of various drafts. I would also like to thank Chris Clack for volunteering to read this thesis over his Christmas break, and Mike Brent for allowing me to use the London VLSI Consortium PAD Frame utility.

Many thanks also to my parents, flat mates and Simone for putting up with me during the time spent working on my Ph.D.

Lastly thanks to Steve Wright and the afternoon posse, for keeping a smile on my face during the many afternoons spent working on this thesis.

Contents

Title Page 1
Abstract 2
Acknowledgements 3
Contents 4
List of Figures 9

Chapter 1 Introduction
1.1 The Need for Alternative Computational Methods 12
1.2 Putting Neural Networks to Work 13
1.3 Objectives of this Thesis 14
1.4 Research Contributions 15
1.5 Organisation of Thesis 15

Chapter 2 Neural Networks
2.1 Desirable Features of Neural Networks 18
2.2 Development of Neural Networks 19
2.2.1 Biological Neuron 19
2.2.2 Artificial Neuron 21
2.3 Analysis of Neural Networks 21
2.3.1 Single Layer Network 22
2.3.2 Multilayer Networks 24
2.4 Learning in Neural Networks 26
2.4.1 Unsupervised Learning 26
2.4.2 Supervised Learning 26
2.4.2.1 Single Layer Learning Rule (Delta Rule) 27
2.4.2.2 Multilayer Learning Rule (Backpropagation) 27
2.4.3 Other Learning Techniques 28
2.5 Summary of Neural Networks 29

Chapter 3 Application Areas
3.1 The Development of Speech Processing Systems 30
3.2 Problems to be Addressed 31
3.2.1 Speech Features 31
3.2.2 Phonemes 32
3.3 Difficulties Encountered by Deaf People 33
3.4 First Generation of Devices - SiVo 33


3.4.1 The 'Peakpicker' Chip 36
3.4.2 Results Obtained by using SiVo 37
3.4.3 Limitations of SiVo 38
3.5 Neural Networks and Speech Processing 39
3.5.1 Use of Neural Networks at UCL 41
3.5.1.1 Analysis and Resynthesis of Speech 41
3.5.1.2 Fundamental Period Extraction Network 43
3.5.2 Real-Time Implementation for Fx Extraction 45
3.5.3 Requirements for an Alternative Architecture 48
3.6 Conclusion 49

Chapter 4 Architectural Considerations
4.1 The Diversity of Neural Implementations 50
4.1.1 Neural Co-processors 51
4.1.1.1 NETSIM - Texas Instruments 52
4.1.1.2 DELTA Co-Processor - SAIC Corp. 52
4.1.2 Neurocomputers and ASIC Neural Devices 53
4.1.2.1 ETANN - Intel Corporation 54
4.1.2.2 DNNA - Neural Semiconductor Ltd. 56
4.1.2.3 Intelligent Memory Chips - Oxford Computer 57
4.1.2.4 Hitachi Neural Wafer 59
4.1.2.5 Digital Snake - Univ. Ancona, Italy 61
4.1.2.6 A Flexible WSI Neural Network - Inst. Nat. Poly. 62
4.2 Assessment of Neural Implementations 64
4.2.1 Analogue Implementations 65

4.2.2 Hybrid and Pulse Neural Implementations 66
4.2.3 Digital Implementations 67
4.2.3.1 MIMD Arrays 67
4.2.3.2 SIMD Arrays 69
4.3 Analysis of SIMD Arrays 72
4.3.1 Flexibility 73
4.3.2 Performance 73
4.3.3 Real-Time Constraints 75
4.4 Summary of Available Neural Implementations 76

Chapter 5 Proposed Architecture
5.1 Objectives for the Architecture 78
5.2 Development of Architecture 79


5.2.1 Desirable Features of MIMD Arrays 80
5.2.2 Desirable Features of SIMD Arrays 81
5.3 How to Combine these Features 82
5.3.1 Combining MIMD and SIMD Control 82
5.3.1.1 Storage of MIMD and SIMD Control Strategies 83
5.3.1.2 Control Switching & Virtual Neuron Definition 83
5.3.1.3 Proposed Communications Scheme 87
5.3.1.4 Synchronisation of Processors 88
5.4 Proposed Architecture 89
5.4.1 Architecture of Processing Element 91
5.5 Programming Strategy 93
5.5.1 SIMD Program 94
5.5.2 MIMD Program 96
5.6 Application Notes and Examples 97
5.6.1 Programming Techniques 97
5.6.1.1 Definition of Virtual Neuron 98
5.6.1.2 Initialisation and Synchronisation 101
5.6.1.3 Implementation of Input Window 102
5.6.1.4 Communication of Partial Sums 104
5.6.2 Application Examples 106
5.6.2.1 Fully Connected Network 107
5.6.2.2 State Feedback Network 108
5.6.2.3 Finite Impulse Response Filter 109
5.6.2.4 Infinite Impulse Response Filter 110
5.6.3 Tx Extraction Implementation 112
5.6.3.1 Silicon Efficiency 113
5.6.3.2 Reduced Sample Period 113
5.6.3.3 Reduced Response Time 114
5.6.4 Summary of Implementation Examples 115

Chapter 6 Detailed Description of Design
6.1 Design of CMOS Implementation 116
6.2 Algorithms used for the Arithmetic Functions 117
6.2.1 Pointer Manipulations 117
6.2.2 Partial Sum Additions 119
6.2.3 Multiplication Instructions 120
6.3 Construction 121
6.3.1 Timing Strategy 124


6.3.2 Arithmetic Unit 125
6.3.3 Saturated Addition 126
6.3.4 Double Length Shift 128
6.3.5 Execution of a Single Multiply Iteration 129
6.3.6 Simulation of Datapath 130
6.3.7 Datapath Performance 133
6.4 Generation of Instruction Decoder 134
6.4.1 PLA Template 135
6.4.2 PLA Personality Matrix 136
6.5 Development of Control Switching 137
6.5.1 Transition from MIMD to SIMD 140
6.5.2 Transition from SIMD to MIMD 141
6.5.3 Implementation of Control Switching 144

6.6 Development of SIMD Control 147
6.7 Implementation of Logic Functions 150
6.7.1 Modulo Address Generation 150
6.7.2 Sigma Table Bus 152

6.8 Full Simulation of Processor Array 155
6.9 Design Conclusion 155
6.10 Future Large-Scale Integration 158
6.11 Design Environment 164

Chapter 7 Analysis of Architecture
7.1 Classification of Parallel Machines 169
7.2 Other MIMD/SIMD Machines 172
7.2.1 Non-Von 4 - Columbia University 172
7.2.2 DADO - Columbia University 173
7.2.3 XIMD - Carnegie Mellon University 174
7.3 MIMD/SIMD Machines in the Future 175
7.4 Postscript 177

Chapter 8 Conclusion
8.1 Summary of Thesis 178
8.2 Research Contribution 178
8.3 Literature Published 180

Plates
P.1 Confocal Light Microscope 181


P.2 Operation of a Confocal Light Microscope 181

Appendix
1 LOCAL Finite State Machine 190
2 GLOBAL Finite State Machine 195
3 Instruction Decoder 200
4 Netlist Generation 214
i) NET 215
ii) LABELS 216
iii) NETLIST 217
5 ESP Program Listing 220

6 PAD Frame Request File 223
7 Performance Analysis 227

8 Literature Published 231
i) R. Matthews 232
ii) P. Write 233
iii) R. Palmer 234

References 244

List of Figures

Chapter 2 Neural Networks
Diagram 2.1 Biological and Artificial Neurons 19
Diagram 2.2 AND, OR and XOR Pattern Space 22
Diagram 2.3 Single, Two and Three Layer Networks 24
Diagram 2.4 Separation Functions for Various Networks 25

Chapter 3 Application Areas
Diagram 3.1 Tx and Fx Extraction 34
Diagram 3.2 Signal Flow Through SiVo 35
Diagram 3.3 Waveforms from the Peakpicker Chip 37
Diagram 3.4 Input Windows for Different Extraction Levels 40
Diagram 3.5 State Feedback Network 41
Diagram 3.6 'Y' Model 42
Diagram 3.7 Tx Extraction Network 44
Diagram 3.8 Sample Window Implementations 47

Chapter 4 Architectural Considerations
Diagram 4.1 The Diversity of Neural Implementations 51
Diagram 4.2 ETANN Synapse 55
Diagram 4.3 DNNA Synapse and Neural Function 56
Diagram 4.4 Intelligent Memory Chips 59
Diagram 4.5 Hitachi WSI Structure 60
Diagram 4.6 Data Access in Hitachi Neurons 61
Diagram 4.7 Digital Snake Bus Structure 62
Diagram 4.8 Flexible WSI Structure 63
Diagram 4.9 Feedforward Network Structure 64
Diagram 4.10 Communication Strategies for SIMD Arrays 72
Table 4.1 SIMD Array Performance 74

Chapter 5 Proposed Architecture
Table 5.1 MIMD Array Properties 78
Table 5.2 SIMD Array Properties 79
Diagram 5.1 Methods Proposed for MIMD and SIMD Switching 85
Diagram 5.2 Outline of Array Architecture 90


Diagram 5.3 Overview of Processor Architecture 92
Listing 5.1 SIMD Synapse Update Program 94
Table 5.3 Execution of SIMD Program Across Array 95
Listing 5.2 Virtual Neuron Definition 98
Table 5.4 Arbitration of LD_SIGMA Instructions 99
Table 5.5 Execution of SIMD Program Across an Array 100
Listing 5.3 Initialisation and Synchronisation 101
Diagram 5.3 Input Window Implementation 102
Listing 5.4 Input Window Implementation 104
Listing 5.5 Partial Sum Communications 105
Table 5.6 Synchronisation of Processors 106
Diagram 5.4 Fully Connected Network 107

Listing 5.6 Fully Connected Network 107
Diagram 5.5 State Feedback Network 108
Listing 5.7 State Feedback Network 109
Diagram 5.6 Finite Impulse Response Filter 110
Listing 5.8 Finite Impulse Response Filter 110
Listing 5.9 Infinite Impulse Response Filter 111
Diagram 5.7 Infinite Impulse Response Filter 112
Table 5.7 Silicon Efficiency 113
Table 5.8 Reduced Sample Period 113
Table 5.9 Reduced Response Time 114
Table 5.10 Summary of Solutions 115

Chapter 6 Detailed Description of Design
Table 6.1 Multiplication Shift Modes 123
Diagram 6.1 Datapath Schematic 124
Diagram 6.2 Clocking Strategy 126
Table 6.2 Input Selection 126
Diagram 6.3 Arithmetic Unit 127
Table 6.3 Saturated Addition Control 127
Diagram 6.4 Saturated Addition Logic 128
Diagram 6.5 Double Length Shift 129

Diagram 6.6 Multiplication Flow for $1 131
Diagram 6.7 Multiplication Flow for $2 132
Table 6.4 MIMD Two Stage Pipeline 138
Table 6.5 SIMD Three Stage Pipeline 139

Diagram 6.8 PERT Chart Showing SIMD and MIMD Control 139


Diagram 6.9 SIMD and MIMD Control Selection in a Processor 145

Diagram 6.10 Local Finite State Machine 146
Diagram 6.11 SIMD Control Block 148
Diagram 6.12 Global Finite State Machine 149

Table 6.6 Modulo Buffer Sizes 151
Diagram 6.13 Address Generation Control 151
Diagram 6.14 Sigma Function Threshold Levels 152
Diagram 6.15 Sigma Threshold Function Logic 154
Listing 6.1 MIMD Program Simulated on Processor Array 156
Diagram 6.16 Cascade of Processor Chips 157
Diagram 6.17 Graph of Die Size for Different Array Configurations 160
Diagram 6.18 Peak Performance of Device 162
Table 6.7 Summary of Performance Efficiency 162
Diagram 6.19 Projected Performance of Device 163
Diagram 6.20 Design Environment 165

Chapter 7 Analysis of Architecture
Table 7.1 Flynn Classification Scheme 169
Diagram 7.1 High-Level Taxonomy of Parallel Computer Architectures 170

Plates
Diagram P.1 Confocal Microscope Principle 181
Plate Ia Datapath Layout (MAGIC Layout) 183
Plate Ib Adder Layout

i) MAGIC Layout 184
ii) Photomicrograph 184
Plate Ic B Input Selection (XOR Function)

i) MAGIC Layout 185
ii) Photomicrograph (Metal 2) 185
iii) Photomicrograph (Poly & Diff) 186
iv) Photomicrograph (Metal 1) 186
Plate IIa Template for Decoder PLA (MAGIC Layout) 187
Plate IIb PLA Layout (MAGIC Layout) 188
Plate III Photomicrograph of Single Processor 189

Chapter 1

Introduction

This chapter presents the main motivating factors for research into neural networks, and looks at how neural networks are being introduced into real applications. An overview of this thesis is then given, which includes a breakdown of the contents of each chapter.

1.1 The Need for Alternative Computational Methods

Ever since Von Neumann's work in the 1940s, computers have been developed along similar lines. The two key features that Von Neumann introduced were:

•Internal Program
•Serial Operation

Today's technology has allowed the computational power of these machines to be fully realised, with increasingly powerful processors being regularly introduced.

Compilers and high-level languages are also developing, further simplifying the task of program writing and allowing increasingly complex problems to be solved. Advances in expert systems have further simplified this task by allowing computers to be programmed by sets of rules. These rules can be generated by an expert in the relevant field without requiring any knowledge about programming.

Despite these advances the two main features of Von Neumann machines remain, and these features now pose limitations to some of the problems that computers are attempting to solve.

•Internal Program The requirement of a program involves the development of an algorithm, or set of rules, that very precisely defines the operations required. This can present major problems in many tasks such as pattern recognition, signal processing and optimisation,


where algorithms are difficult to determine due to the imprecise nature or knowledge of the task.

•Serial Operation The requirement of serial operation, although slightly relaxed in array and pipeline processors, has limited the performance of computers. Only small sections of hardware can be kept active at any one moment, so limiting the performance that could be obtained with today's large scale integration.

The development of neural networks has been inspired by these drawbacks in conventional computing, and by the way that biological systems process information. Neural networks try to capture the processing techniques used by these biological systems, and to exploit their benefits:

•Learning By using learning rules inspired by biological systems, computers can be made that adapt to their environment, and learn specialised functions. This operation requires no programming; instead a series of examples is presented along with examples of the required response. This eliminates the requirement for algorithms to be devised before a problem can be solved.

•Parallel Operation The algorithms used in neural networks are naturally parallel, and rely upon localised communications. This allows arrays of processors to be built with an almost limitless expandability, thus making much better use of the processing power that can be obtained from large scale integrated circuits.

1.2 Putting Neural Networks to Work

Neural networks have been the subject for research since the 1940s, although most of the major breakthroughs have been more recent. In the last few years neural networks have begun to emerge from the research institutes and be put to use.


The first commercial systems to make use of neural networks were software packages to run on conventional hardware. One such system was the NESTOR Development System[1]. An early project tackled by the NESTOR Development System was bond-trading prediction; a network was taught on recent transactions and their outcomes, and it quickly learnt to replicate these transactions, and even improved upon them.

Further increasing the usefulness of neural networks required specialised hardware to speed up their computations. SAIC (Science Applications International Corporation) was one of the first sources of specialised hardware on which neural networks could be implemented. SAIC developed several products built around the DELTA processor[2]; these act as co-processors to a conventional host computer, or work in a stand-alone environment.

This increased processing power means that neural networks no longer have to run off-line with pre-prepared data; instead they can work with fast incoming data in real-time systems. An example of this is the TNA explosive detection system[3]. This system is now being installed at major airports to monitor luggage for explosives. Neural networks are used to perform an analysis on the absorption spectrum of low energy neutrons, from which many different types of explosives can be detected.

1.3 Objectives for This Thesis

The need for still faster hardware, and for a larger network capacity, has continued. To address this problem many ASIC (Application Specific Integrated Circuit) designs have been proposed and built. A common aim in these designs has been to exploit the natural parallelism and localised communications found in neural networks. This has enabled neurocomputers to be developed with an almost limitless expandability.

The digital architectures that have been developed fall into two classes: fine grained SIMD (Single Instruction Multiple Data) arrays, and coarse grained MIMD (Multiple Instruction Multiple Data) arrays.

SIMD machines tend to show low flexibility, and can also have problems in obtaining a high efficiency if the network topology does not fit


optimally to the array.

MIMD machines offer increased flexibility, and can achieve close to linear speed up with increased numbers of processors. However MIMD arrays achieve these characteristics by the use of complex communications and control mechanisms, adding considerably to the hardware requirements for such a machine.

This thesis highlights these problems, and proposes an alternative. The solution proposed brings together features from SIMD and MIMD arrays, and develops a unified architecture on which to exploit the benefits from each. This architecture is examined in detail, including both its programming, and the design of a CMOS device built implementing it.

1.4 Research Contributions

The main research contributions of this thesis are to the area of neural architectures and their implementation. The key contributions presented are:

• Critical study of the current state in neural implementations

•Development of an hypothesis out of which a new architecture could be developed

•Development of a software methodology combining SIMD and MIMD control

•Development of an architecture to support this software methodology

•Design of a silicon chip incorporating this architecture

•The identification of other similar work

1.5 Organisation of Thesis

The remainder of this thesis covers seven chapters, the organisation of


which is shown below.

Chapter 2 Neural Networks

This chapter looks at biological neurons and artificial neural networks. It covers the development of the multilayer perceptron model, with particular emphasis on the pattern recognition ability of different network topologies. Simple learning rules are then presented, with emphasis on the problems encountered by these algorithms.

Chapter 3 Application Areas

This chapter looks at an application area to which neural networks have been applied. One specific application is chosen, with the various techniques that have been used in its development being evaluated. This highlights the problems that were encountered, and shows how neural networks were used in solving them.

The problem that remained for this application was how to efficiently implement the neural model that was developed; this problem is further complicated since the proposed use for this network is in a portable real-time device. The problems involved in this implementation are highlighted, and are used as the design criteria for the architecture that has been developed.

Chapter 4 Architectural Considerations

In this chapter a survey of current neural architectures is carried out. This survey takes a few specific examples, each chosen because it highlights a particular technique used to implement neural networks. From the analysis of these examples it was possible to identify features from various architectures that would be beneficial for the architecture being developed. An hypothesis is proposed at the end of this chapter: this states that if the beneficial properties of both MIMD and SIMD arrays could be combined in a unified architecture, this would result in a suitable architecture for the application given in Chapter 3.


Chapter 5 Proposed Architecture

This chapter covers the development of an architecture to test the hypothesis presented in the previous chapter. This covers some of the design decisions that were made, and presents the overall architecture for an array and for an individual processor. Programming and application examples are then given; these illustrate the flexibility of the architecture, and give some ideas as to how it can be programmed.

Chapter 6 Detailed Description of Design

This chapter outlines an implementation using the presented architecture. Major emphasis in this chapter is put upon how both MIMD and SIMD control strategies are used, and how switching between them is performed.

Future work is also covered in this chapter; this investigates the possible development of a fully integrated neural signal processor. This device would operate in a stand-alone environment, with memory, A/D and D/A converters being held on-chip.

Chapter 7 Assessment of Architecture

After presenting an implementation using this architecture, a more general look at the architecture is made. This chapter first classifies this architecture in a number of ways, and then looks at other architectures that are similar. This chapter concludes with the possible directions in which this class of architecture might lead.

Chapter 8 Conclusion

This chapter summarises the key points presented in this thesis, and highlights the main research contributions that have been made. A list of publications is given, including those which the author is hoping to complete before the submission of this thesis.

Chapter 2

Neural Networks

Neural networks are biologically inspired forms of computation, that represent a departure from conventional forms of computing. Neural processing involves the use of very simple computational units that are highly interconnected and work in parallel.

The human brain is composed of 10¹¹ neurons connected by 10¹⁵ interconnections. The brain is made up of 99% interconnections, with only 1% computational units, and works in an analogue manner. The brain is believed to be purely asynchronous, with no central control. There is instead a homogeneous distribution of information, thought to be almost entirely held in the interconnections.

2.1 Desirable Features of Neural Networks

Neural networks try to capture a number of features from the brain in order to provide artificial processing systems that exhibit marked differences from conventional computer systems. The main properties that are sought for these artificial processing systems are:

•Learning Learning is probably the most important feature that the brain exhibits. It allows an individual to experiment with the environment, and using feedback, to modify its behaviour. This process is very complex in the human brain, with both short and long term memories playing important roles. Simplifications of these learning methods can allow learning to be incorporated into neural networks.

•Generalisation Generalisation allows a system to respond to uncertain data in a predictable manner. It involves using past knowledge to formulate the best response to a new situation or uncertain data. It is this feature that enables humans to carry out many tasks on


uncertain data that computers find extremely difficult. Simple examples of such a task would include speech and visual recognition.

•Redundancy Redundancy enables a system to continue functioning, even when parts are damaged. This feature can also be called graceful degradation, as it ensures a slow reduction in the performance as more parts become damaged. This property would be beneficial for computer systems, where at present a single defect can be disastrous.

2.2 Development of Neural Networks

An artificial neuron is a mathematical entity developed by McCulloch and Pitts in 1943[1], Hebb in 1949[2] and Rosenblatt in 1959[3] to model the functioning of a biological neuron. Diagram 2.1 shows the structure of a biological neuron and its artificial counterpart.

[Diagram 2.1: a) a biological neuron (cell body, dendrites, synapses); b) an artificial neuron with inputs 1..n, weights 1..n, and Output = fn( Σ(i=1..n) Input_i × Weight_i )]

Diagram 2.1 Biological and Artificial Neurons

2.2.1 Biological Neuron

A biological neuron is composed of two main sections, the cell body and axon. The cell body receives and processes inputs from other neurons, while the axon transmits the output from the cell body to other neurons.


At the end of each axon lie the dendrites; these fan-out allowing each neuron to transmit its output to many other neurons. The mass connectivity obtained from these dendrites is seen to be crucial to the functioning of the brain.

Each neuron has many thousand inputs, coming from the dendrites of other neurons, or from sensory organs. These inputs are called synapses, and consist of a 20-30nm gap between the incoming nerve terminal and the input receptor on the cell body of the neuron. It is through the synapses that the incoming information is processed.

Nerve signals are transmitted as variable frequency pulses through the axon. These are integrated by the nerve ending, resulting in a variable quantity of 'Neurotransmitter' being released across the synapse. This neurotransmitter is picked up by the input receptor on the cell body, and results in a change in the membrane potential at that point on the cell membrane. This change can occur in two ways, either as an increase in membrane potential, excitatory, or as a decrease in membrane potential, inhibitory. The combined effect of the many thousands of input receptors results in overall changes to the cell membrane potential.

At equilibrium a potential exists across this cell membrane, which is the result of charged ions flowing through the membrane. Due to the membrane having varying permeability to the different ions this potential is about -60mV at equilibrium. This is called the resting potential.

The effect of the different neurotransmitters results in the membrane potential varying across the cell body. The amount of change in this membrane potential is dependent upon the strengths of the excitatory and inhibitory incoming impulses, and their locality.

If at any point on the cell body the membrane potential changes by as much as +20mV, a change occurs in the structure of the membrane. This action is called depolarisation, and affects the membrane by suddenly allowing certain ions to pass more freely. This sudden movement of ions creates an action potential (+40mV), which has the effect of


depolarising the surrounding area. The outcome of this is an impulse that spreads across the cell membrane, and down the axon. This impulse is then transmitted along the axon to the dendrites, resulting in this signal being transferred to further neurons. After each impulse has passed the resting potential is quickly reinstated, and this whole action can then be repeated.

Although the impulses transmitted down each axon are of fixed amplitude, and vary only in frequency, the operation of the biological neuron is fundamentally analogue. The cell body's main processing task is the temporal and spatial summation of the input pulses across the cell body.

2.2.2 Artificial Neuron

The important features of the above description have been used to formulate the model most widely used for the artificial neuron. Structurally the artificial neuron can be made to look very similar, with several inputs, a neural function and an output that feeds to many other neurons. The mathematical function of an artificial neuron is shown in Diagram 2.1b, and consists of applying a function to the weighted sum of the inputs.

The function most commonly used is a sigma function, since this solves many problems encountered by early research. It allows small signals to pass reasonably unaffected, while stopping large signals saturating the neuron's output. This is especially important in multilayer networks, where the outputs from one layer provide the inputs to the next.
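As a minimal sketch of this model (the logistic curve is assumed here for the sigma function, since the thesis does not specify its exact form):

    import math

    def sigma(x):
        # Logistic curve: one common, assumed choice of "sigma" function.
        return 1.0 / (1.0 + math.exp(-x))

    def neuron_output(inputs, weights):
        # Weighted sum of the inputs, passed through the neural function.
        return sigma(sum(i * w for i, w in zip(inputs, weights)))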

This very simplistic model of a biological neuron does not attempt to imitate every feature. However it does mimic very well the important feature of mass connectivity, and distributed processing found in biological systems.

2.3 Analysis of Neural Networks

The following sections look at the properties of various network types, and consider both their beneficial features and their limitations.


2.3.1 Single Layer Network

The simplest network that can be used is a single layer of neurons. This type of network was widely researched by the developers of the artificial neuron, and simple learning rules were developed. The Delta Rule was developed by Rosenblatt in 1959[3] and is the most common.

After extensive research into this class of network, Minsky and Papert showed in 1969[4] that these networks lacked any real power in pattern recognition. The problem used to show this was the XOR, or parity, problem. This problem is discussed below; in this example a single neuron is used for the analysis. Minsky and Papert were able to extend this simple analysis to any single layer network, and hence showed the fundamental limitations of this type of network.

Diagram 2.2 AND, OR and XOR Pattern Space

The XOR Problem

By taking a single neuron with two inputs (A & B) there are 4 possible binary patterns that can be represented. These can be plotted in a two dimensional pattern space, with the four input patterns being represented by the corners of a square. Several functions can be performed on these input patterns: these are the Boolean logic functions. Diagram 2.2 shows three such Boolean functions: AND, OR and XOR.


Minsky and Papert used a series of simple equations to show the possible separations that a single layer network could perform. The first two Boolean functions can be easily expressed by using the following equations, where:

Wa = Weight for Input A
Wb = Weight for Input B

Neural Function is a Threshold at 0.5

AND
0·Wa + 0·Wb < 0.5
0·Wa + 1·Wb < 0.5
1·Wa + 0·Wb < 0.5
1·Wa + 1·Wb > 0.5

Solution: 0.25 < Wa < 0.5, 0.25 < Wb < 0.5

OR
0·Wa + 0·Wb < 0.5
0·Wa + 1·Wb > 0.5
1·Wa + 0·Wb > 0.5
1·Wa + 1·Wb > 0.5

Solution: Wa > 0.5, Wb > 0.5

However the third function, XOR, cannot be expressed so easily. The four equations below describe the line used to determine the XOR pattern space. These four equations are clearly inconsistent.

XOR
0·Wa + 0·Wb < 0.5
0·Wa + 1·Wb > 0.5
1·Wa + 0·Wb > 0.5
1·Wa + 1·Wb < 0.5

This example shows the limitations of a single neuron. The only pattern recognition functions that can be performed by a single neuron are those that can be separated by a single line. This class of functions is termed as having the property of linear separability.
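The inconsistency can be made concrete with a brute-force search; the weight grid below is illustrative and is not part of Minsky and Papert's analysis:

    # Search a coarse grid of weights for a neuron with a 0.5 threshold.
    def separable(truth_table):
        grid = [i / 10.0 for i in range(-20, 21)]
        return any(
            all((a * wa + b * wb > 0.5) == t
                for (a, b), t in truth_table.items())
            for wa in grid for wb in grid
        )

    AND = {(0, 0): False, (0, 1): False, (1, 0): False, (1, 1): True}
    XOR = {(0, 0): False, (0, 1): True, (1, 0): True, (1, 1): False}
    print(separable(AND), separable(XOR))  # prints: True False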


Linear separability can be extended into many dimensions, with hyperplanes being used instead of lines to determine the separation. Minsky and Papert showed that although many neurons can be used in a single layer network, only linearly separable functions can be expressed.

2.3.2 Multilayer Networks

Minsky and Papert concluded that the only useful class of neural networks was that incorporating two or more layers; this class is known as multilayer networks.

However the problem that remained was in teaching these multilayer networks. Unless the weights can be taught, the complex patterns that these networks can represent can never be realised. It was this problem of teaching multilayer networks that caused research in neural networks to die back. It was not until 1986, when Rumelhart and McClelland[5] developed a learning rule for multilayer networks, that research expanded and developed rapidly.

Diagram 2.3 shows a single layer, two layer and a three layer network. The connectivity between the layers in multilayer networks is normally 100%, that is each neuron is connected to every neuron in the preceding layer (fan in), and is also connected to every neuron in the following layer (fan out). The layers are conventionally labelled as:


Diagram 2.3 Single, Two and Three Layer Networks


•Input Layer This layer does no processing, and is only used to represent the number of inputs to the network.

•Hidden Layers The layers inside the network are labelled hidden layers, due to their outputs being hidden. If there is more than one hidden layer then they are numbered, starting with the layer nearest the input layer.

•Output Layer This layer provides processing, and represents the output from a network. In a single layer network the output layer represents the only processing layer in the network.

The power that multilayer networks provide depends upon the number of layers in the network. The addition of each layer can be likened to the ability to draw more separation lines in the pattern space. Diagram 2.4 shows the possible separations with single layer, two layer and three layer networks, each with reference to a two-input network.


Diagram 2.4 Separation Functions for Various Networks


2.4 Learning in Neural Networks

There are two basic classes of learning algorithm used in neural networks: unsupervised and supervised learning.

2.4.1 Unsupervised Learning

In unsupervised learning the network is encouraged to experiment, and to discover for itself the required response. This form of learning is particularly suitable for cases where no model answers are available, since the network will determine its own set of rules to suit the application.

To teach the network it has to be presented with an input set of conditions. At the start of learning the network produces a random response, which is judged according to its suitability. The score produced helps the network to determine whether the response was good or bad. Learning is then carried out by adjusting the connections to reinforce the response after a good score, or to weaken it after a bad score. By repeating this a number of times an acceptable performance may be achieved, and the network can be put to work.
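A minimal sketch of this score-driven adjustment, assuming a scalar score where positive means a good response (the step size and the respond and score functions are illustrative, caller-supplied assumptions):

    def unsupervised_step(weights, inputs, respond, score, step=0.05):
        # Produce a response, score it, then strengthen or weaken
        # the connections according to the sign of the score.
        response = respond(weights, inputs)
        direction = 1.0 if score(response) > 0 else -1.0
        return [w + direction * step * x for w, x in zip(weights, inputs)]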

While the network is being used the connections can either be frozen to stop future learning, or learning can be allowed to continue. This continuation of learning can be beneficial as it allows the network to adjust to changes in the environment.

2.4.2 Supervised Learning

In this form of learning a network is provided with an input set of conditions, and is shown the required response to this input set. By examining the difference between the network's output and the required response, a set of error values can be calculated. The network then modifies its connections to minimise these error values. By repeating this process through many input and output pairs a network will be taught how to respond to a variety of conditions.


The following algorithms have been developed for supervised learning in single and multilayer networks.

2.4.2.1 Single Layer Learning Rule (Delta Rule)

The teaching of a single layer network is fairly simple, since each neuron's error can be directly calculated. Learning is done by repeating the following two stage procedure until an acceptable performance is obtained.

•Forward Stage The forward stage involves presenting an input set to the network, and allowing the network to iterate. It is this phase of the network that is used in recognition mode when learning is complete.

•Reverse Stage During the reverse stage the weights are altered so that the network output will better match the required output set. A δ value is calculated for each neuron by taking the product of the error and the derivative of the sigma function. Each weight for that neuron is then updated by adding the product of the neuron's δ term and the corresponding input value to the relevant weight, as sketched below.
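A minimal sketch of one Delta Rule step for a single neuron, assuming the logistic sigma function (whose derivative is output × (1 − output)) and an illustrative learning rate:

    import math

    def delta_rule_step(weights, inputs, target, rate=0.5):
        net = sum(w * x for w, x in zip(weights, inputs))
        output = 1.0 / (1.0 + math.exp(-net))                 # forward stage
        delta = (target - output) * output * (1.0 - output)   # error x sigma derivative
        # Reverse stage: add delta x input to each weight.
        return [w + rate * delta * x for w, x in zip(weights, inputs)]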

2.4.2.2 Multilayer Learning Rule (Backpropagation)

This learning rule is an extension of the Delta Rule and is used in multilayer networks. As stated above the main problem encountered in teaching multilayer networks is in generating error terms for the hidden units. This is obtained by the following procedure:

•Forward Stage The forward stage involves presenting an input set to the network, and allowing the network to iterate. It is this phase of the network that is used in recognition mode when learning is complete.


•Reverse Stage This reverse stage is executed one layer at a time, taking the output layer first. The same operation is carried out as in the Delta Rule to calculate a δ term for each neuron in the output layer, and to adjust the weights in this layer.

To continue this operation for the hidden layers requires the generation of the error terms for each hidden layer neuron. This is achieved by using the δ terms calculated in the output layer, and backpropagating these to the preceding hidden layer. In this operation the weights in the output layer work in reverse. Summation is then performed over all backpropagated errors in each hidden layer neuron, providing an error term for that neuron. Each weight is then modified by taking this error term and using the Delta Rule in the normal manner.

Many hidden layers can be trained using this algorithm. By repeating this operation back through the layers, error terms can be derived for every neuron.
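A sketch of one backpropagation step for a two layer network (one hidden layer); the layer shapes and learning rate are illustrative, and the logistic sigma is again assumed:

    import numpy as np

    def sigma(x):
        return 1.0 / (1.0 + np.exp(-x))

    def backprop_step(W_hid, W_out, x, target, rate=0.5):
        h = sigma(W_hid @ x)                       # forward stage: hidden layer
        y = sigma(W_out @ h)                       # forward stage: output layer
        d_out = (target - y) * y * (1.0 - y)       # delta terms for the output layer
        e_hid = W_out.T @ d_out                    # errors backpropagated through the weights
        d_hid = e_hid * h * (1.0 - h)              # delta terms for the hidden layer
        W_out = W_out + rate * np.outer(d_out, h)  # Delta Rule applied to each layer
        W_hid = W_hid + rate * np.outer(d_hid, x)
        return W_hid, W_out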

2.4.3 Other Learning Techniques

There are other features that have been introduced to improve the speed and efficiency of learning. These are as summarised below:

•Training Constant A training constant can be used to modify the speed of learning. This is a value (typically 0.1 to 1.0) that is multiplied with the δ term before modifying the weights, hence varying the rate at which the network learns. This value can be continuously changed during learning to optimise the rate of learning.

•Neuron Bias To speed the convergence of a network it can be beneficial to add a bias to each neuron. This bias shifts the sigma function to increase the gain where it is most needed. To calculate this offset an additional input is made to each neuron: this is fixed to the highest positive value (+1). Learning is then carried out


on this input in the normal manner. After learning, this weight's value is used as the neuron's bias, and is added to the partial sum for each iteration.

•Momentum This is another method used to speed up learning in multilayered networks. By adding a momentum value to the offset of each weight it is possible to speed the convergence of the network. This momentum value is determined by taking the last weight offset and multiplying it by some factor; a sketch combining the training constant and momentum follows.
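A sketch of a weight update combining the training constant and momentum terms; the constants shown are typical illustrative choices, not values from the thesis:

    def update_weight(w, delta, inp, prev_step, rate=0.25, momentum=0.9):
        # New step = training constant x delta x input, plus a fraction
        # of the previous step (the momentum term).
        step = rate * delta * inp + momentum * prev_step
        return w + step, step   # the step is kept for the next iteration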

2.5 Summary of Neural Networks

This chapter has outlined some of the problems and techniques encountered in single layer and multilayer networks. The capabilities of these networks are extensive, with theoretical studies showing very high performance in pattern recognition. However, as previously mentioned, the real power of networks can only be realised by teaching these networks to represent complex pattern spaces. It is this task of teaching networks that presents one of the major challenges in neural network research.

Chapter 3

Application Areas

The application areas for neural networks are extensive[1]. A list of just a few successful examples would include optimisation, forecasting, control, and signal processing. To obtain a greater knowledge of how neural networks are being used, and of the work involved in developing working neural systems, this chapter takes one specific application. The application chosen is taken from work that has been carried out by the Phonetics Department at University College, London.

This chapter shows the alternative methods that have been used in developing a speech processor that can be used to simplify a speech signal, and then to re-present it in this simplified manner to a deaf person. The discussion covers the traditional methods that have been used in solving this problem, and presents a neural network solution that has been developed. The neural network developed for this application provides a more robust and elegant solution than the alternative techniques that had previously been used.

This chapter finishes with a discussion on a real-time implementation that had been built to execute this neural network. This implementation highlights the problems that exist in developing real-time neural systems, with particular reference to the small size, and portability requirements of this application. This chapter concludes with some requirements that I have isolated for an architecture that would provide an alternative, and possibly better implementation for this class of problem.

3.1 The Development of Speech Processing Systems

A group of about 10 members from the Phonetics Department at University College London, Guy's Hospital and Cambridge University have joined together to research in this area. They have formed what they call the EPI (External Pattern Input) Group. The objective of this group is to provide aids to help deaf people understand speech. Their view is


that it is possible to represent speech as several separable patterns. Their aim is to extract these basic patterns, and to reconstitute them in a form suitable for those people who find difficulty in understanding normal speech. This project is aimed at profoundly deaf people, and even the totally deaf. Work has been done to provide the totally deaf with electrical implants, which give an electrical stimulus to the auditory nerve, resulting in some sensation of hearing.

Both the profoundly deaf, and the totally deaf with electrical implants, suffer from the same problems when attempting to understand speech. The cause of this is that their hearing can only occur in a small frequency band, usually at the lower end of the audio frequency range. The frequency analysis carried out by the ear also becomes less accurate, making it difficult to distinguish between similar frequencies. These hearing problems result in these people being able to hear noises, but not with a wide enough spectrum to understand speech fully.

The objective for the EPI group is to develop a system that can identify various speech patterns, and then re-present these to the deaf person, using only the limited frequency ranges available and in a much simplified form.

3.2 Problems to be Addressed

Before examining the project in detail we must first study the basic properties of speech. Since speech exhibits a highly complex arrangement of signals, a hierarchy is used in its analysis. This helps to classify the signals, and to provide a uniform representation.

This hierarchy consists of three levels:

•Speech Features
•Phonemes
•Words and Sentences

3.2.1 Speech Features

Speech can be said to be represented by several basic features, which


can be attributed to the differing methods used in their production. They do not have clear cut properties, and differ between different speakers. Pitch variation, relative power and time duration all modify these features slightly. The three main speech features are:

•Frication Frication is produced by passing air over the tongue and lips. This produces the high frequencies as used in /s/ and /z/.

•Voicing Voicing is produced by the vocal cords, which provide a full spectrum of excitation. This is then subjected to selective filtering as it passes up through the vocal chambers, producing a wide variation of sounds. For each sound the excitation is the same; it is the shape of the throat and the mouth which causes the variations in the sounds. The voicing feature is used to produce all the vowels.

•Nasal Nasal sound is produced by vocal cord excitation, and the forcing of this through the nasal passage and out through the nose. This can be done either by pressing the tongue to the roof of the mouth, or by closing the lips. The sounds with the nasal feature are /n/ and /m/. Variations to these sounds are caused by the shape of the nasal passage, and tend to be higher in pitch than vocal sounds, due to the smaller size of the nasal chambers.

3.2.2 Phonemes

Speech features are combined to produce the next level in the speech hierarchy. This hierarchical level is represented by phonemes.

When attempting to recognise phonemes several problems occur: the same phoneme can sound very different depending upon its context, and phonemes exhibit wide speaker-to-speaker variations. The effect of the context on the sound of a phoneme is due to neighbouring phonemes 'blurring' together; this is called coarticulation. Coarticulation occurs between any two adjacent phonemes, and even across word boundaries. This causes


major problems when attempting to recognise speech: word boundaries become difficult to pinpoint, and the same word will sound different depending on the preceding and following words.

These problems combine to make speech recognition a highly complex task. Very subtle distinctions separate similar sounding words, with the whole meaning of the speech relying upon fine discriminations between these patterns.

3.3 Difficulties Encountered by Deaf People

To help understand speech, it is useful to distinguish the relative strengths of each of these features. A high level of separation is required as many sounds may differ in just one speech feature. For example the difference between /p/ and /b/ is made only by the presence or absence of voicing: voicing being present in /b/ and not in /p/.

The EPI Group concentrated on the voicing feature first. This was decided because voicing is almost invisible to a lipreader, as all the movements are carried within the throat. The voicing feature is also very useful since it provides information about timing and stress, and it can be used as feedback to the deaf person while he or she is speaking.

3.4 First Generation of Devices - SiVo

From their analysis of these problems the EPI Group were able to develop their first generation of aids: Microstim and SiVo (Sinusoidal Voice Aid)[2,3]. These are very similar in design: Microstim was designed for the totally deaf, while SiVo was for the partially deaf. SiVo is described below; what is said applies just as well to Microstim.

The basic aim of SiVo is to extract the peaks from the voiced fundamental frequency, and to construct a sine wave to represent this information. This sine wave is constructed so that its frequency maps that of the fundamental frequency peaks, giving the patient a representation of the speaker's pitch contour.


[Diagram 3.1 traces: speech waveform; voice fundamental period (Tx); voice fundamental frequency (Fx); SiVo output]

Diagram 3.1 Tx and Fx Extraction

Diagram 3.1 shows the abstraction of speech, first to provide the fundamental period (Tx), and then the fundamental frequency (Fx). Finally the sinusoidal output is provided, with its peaks corresponding with those of the original voice waveform, and the Tx instances.

SiVo operates by picking out the peaks in the audio waveform, which are used to represent the voiced fundamental period (Tx). This peak picking is done by a threshold function on the speech waveform. Diagram 3.2 shows the signal flow through the various stages of SiVo.

The speech signal is first preprocessed to filter out the higher frequencies in the waveform; this has the effect of removing the background noise and the unwanted voiceless speech sounds.

The peaks that are extracted are then digitally processed to estimate the voiced fundamental frequency (Fx). An optional frequency mapping can be applied to give either a 50 Hz or an 80 Hz frequency drop; this feature is selected by the 'MAPITCH' switch. The MAPITCH switch ensures that the output from the device falls within the audio frequency range of the listener.

The amplitude of the required sine wave is calculated by use of a lookup table; this ensures that a comfortable listening volume is achieved at all frequencies that the device can offer (31 Hz to 707 Hz). The amplitude lookup is necessary since different patients have different hearing abilities at varying frequencies. If uniform amplitude were


provided, considerable discomfort would be experienced in some frequency bands where the patient may have good hearing, and no hearing may be present in particularly bad frequency bands. The resultant waveform is generated in an analogue fashion, and presented to the patient via an earphone.

[Diagram 3.2 Signal Flow Through SiVo: microphone → amplifier (sensitivity control) → low pass filter → extract voice fundamental frequency → apply frequency mapping (MAPITCH switch) → look up output amplitude → generate output sine wave (volume control) → earphone]
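As a sketch of this output stage: map the extracted Fx down, look up a comfortable amplitude for the resulting frequency, and synthesise the sine wave. The sample rate, the clamp to the 31-707 Hz range, and the flat default amplitude are illustrative assumptions, not the device's actual values:

    import math

    SAMPLE_RATE = 12800.0  # Hz; assumed, matching the 12.8 kHz Peakpicker clock

    def sivo_sample(fx_hz, n, mapitch_drop_hz=50.0, amplitude=lambda f: 0.5):
        # Apply the MAPITCH frequency drop, keep the result inside the
        # device's output range, then generate one sine-wave sample.
        f_out = min(707.0, max(31.0, fx_hz - mapitch_drop_hz))
        return amplitude(f_out) * math.sin(2.0 * math.pi * f_out * n / SAMPLE_RATE)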

The difference between SiVo and Microstim devices is only in the resultant waveform. A square wave is produced in Microstim, which is applied, via an ear moulded electrode, directly to the inner ear.

The first implementation of SiVo uses a mixture of analogue and digital techniques. The preprocessing and the threshold function are carried out by analogue techniques, while the actual sinusoidal output is produced digitally through a D-A converter. The amplitude lookup table is held in an EPROM, which enables each device to be customised to the patient by the use of a PC at the clinic.

This device is constructed from individual components, and is about the size of a packet of cigarettes. A large amount of this volume is taken


up with the batteries that are required to make the device portable.

This original analogue version suffers from the large number of discrete components that are required, and from the large power consumption of these analogue devices. To cut down on the number of components a digital version of the threshold function has been developed on a silicon chip. This is called the Peakpicker, since its main function is to indicate the position of each peak in the speech waveform.

3.4.1 The 'Peakpicker' Chip

The 'Peakpicker' chip was designed with the help of the Electrical Engineering Department at University College London[4]. It provides a digital implementation of the threshold function and the low pass filter, hence cutting down on the number of individual components that are required. An A-D converter is needed to work with this chip, but future versions may have one fully integrated into the Peakpicker chip.

The Peakpicker algorithm is based on a comparator, which compares the present input with the previous output from the Peakpicker algorithm. This allows the Peakpicker to pick out the first peak. To enable subsequent peaks to be picked out, a decay is used on the Peakpicker output. The rate of this decay determines how closely the peaks can be detected, and is adjusted by external control on the chip. The output from the peakpicker algorithm is a saw tooth waveform, which is then passed through a high pass filter to accentuate the peaks. This whole algorithm is then repeated to ensure maximum performance.
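A sketch of this comparator-with-decay idea; the decay constant is illustrative (on the chip the decay rate is set by an external control):

    def peakpick(samples, decay=0.995):
        # Track the input when it exceeds the previous output (a new peak),
        # otherwise let the output decay so that later peaks can be detected.
        out, level = [], 0.0
        for x in samples:
            level = x if x > level else level * decay
            out.append(level)   # saw-tooth shaped output
        return out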

Diagram 3.3 shows the various waveforms that are produced by this algorithm. The preprocessing is performed by applying a logarithmic amplifier, and then low pass filtering. To generate the Tx instances spikes are required - these are produced by use of a comparator and a monostable after the Peakpicker algorithm.

The Peakpicker was built around a single bit adder, with data being multiplexed through it. This single bit adder can implement all the required stages, including the peakpicker algorithm, filter and comparator.


[Diagram 3.3 traces: preprocessed waveform; Peakpicker algorithm No. 1; high pass filter No. 1; Peakpicker algorithm No. 2; high pass filter No. 2; comparator; monostable]

Diagram 3.3 Waveforms from the Peakpicker Chip

The chip was designed using SOLO 1200, purchased from European Silicon Structures. The standard 2 micron CMOS cell library was used throughout, resulting in a fully static synchronous circuit. The circuit area is 20 mm² including the pads, with the core of the design occupying 12 mm². The total number of transistors in the design is 2500.

The chip is packaged in a 40 pin device, using a CMOS compatible PAD library. Full use of CMOS is possible since most modern A/D and D/A converters are CMOS compatible, ensuring that the power consumption is as low as possible. The resulting power consumption of the device is 10 µW when running at 12.8 kHz.

3.4.2 Results Obtained by using SiVo

The results obtained from testing the first generation of SiVo aids can be separated into three groups:


•Stone Deaf These people had such bad hearing that they could not hear the output from the device. The Microstim research project is aimed at helping these people, so this was as expected.

•Profoundly Deaf These people found a great improvement with the use of the SiVo. In conjunction with lipreading, patients could distinguish between voiced and voiceless consonants with about 70% accuracy, compared with the 49% accuracy obtained when tested with a conventional aid.

•Lightly Deaf These people found that not enough information was provided, and preferred normal aids.

The results from this design were very encouraging for the EPI group. Of the 14 patients provided with the SiVo aid, 7 found it a real benefit and started using the device regularly.

It is envisaged that the performance of SiVo and Microstim will be very similar to the above when the Peakpicker chip is utilised. This is because the basic algorithm remains the same; only the computational style is changed.

3.4.3 Limitations of SiVo

The main limitation to SiVo is in dealing with excessive background noise. The device is not able to extract accurately the fundamental frequency from noisy signals; many spurious signals are given, and correct signals are often missed.

A further limitation is in the specificity of the device. It is designed for this one model only, and cannot be easily extended to handle other models to extract further speech features. These problems led the EPI group to look at alternative methods to extract speech features. Neural networks were then chosen, and studies were carried out into their


possible use.

3.5 Neural Networks and Speech Processing

Much recent research into speech recognition has used neural networks[5]. Neural networks have been used to recognise, and label, the various patterns which occur in speech. There have been several approaches, each using different types of networks, and achieving a variety of results.

The first attempts to use neural networks in speech recognition were carried out using Boltzmann Machine implementations. Early work by Bridle and Moore in 1984[6] introduced this model to the area of speech processing. They were able to recognise simple static patterns by the use of simulated annealing. However, they soon realised that to recognise real speech, some form of time context would have to be incorporated into their models.

Later work done at Cambridge University in 1986[7] extended this Boltzmann Machine model with the introduction of state feedback. With this temporal context, they were able to recognise some short sentences of continuous speech.

The use of the Boltzmann Machine in speech recognition has not developed much since. This is mainly due to the slow speed of the simulated annealing, and the uncertain response time for the recognition of patterns. Other neural models, such as the multilayer perceptron, have taken over, providing much faster implementations[8] and allowing real-time systems to be developed.

To include temporal context into multilayer perceptron networks, the use of time windows was introduced by the TRACE model in 1986[9]. Each window is made up from several time slices, built up from the information extracted from the previous extraction layer.

Diagram 3.4 shows a typical arrangement of four such extraction levels: frequency spectrum, speech features, phonemes and words. Each of these extraction levels is composed of a number of units - with each unit


representing a particular item in the relevant extraction level. For example in the speech feature extraction level an individual unit is required for each feature to be recognised.

[Diagram 3.4 Input Windows for Different Extraction Levels: the four levels - spectrum (high to low frequency), features (e.g. silence, voiced), phonemes (e.g. /s/) and words (e.g. HELLO, HELP) - with a recognition stage between each level.]

By building up the number of slices composed of these units a dynamically changing signal can be represented, allowing both forward and backwards context to be utilised when recognising patterns.
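To make the window mechanism concrete, the C sketch below maintains such a window of time slices in software; the slice and feature counts are illustrative values only, not those of the TRACE model.

    #include <string.h>

    #define SLICES   8   /* illustrative number of time slices */
    #define FEATURES 4   /* illustrative units per slice */

    /* The window holds the most recent SLICES feature vectors; on each
       new slice the older slices recede one place in time, giving both
       forward and backward context to the layer reading the window. */
    static float window[SLICES][FEATURES];

    void push_slice(const float *slice)
    {
        /* shift every slice one place back in time (t-1 becomes t-2 ...) */
        memmove(&window[1], &window[0], (SLICES - 1) * sizeof window[0]);
        /* the newest slice enters at the front of the window (time t) */
        memcpy(&window[0], slice, sizeof window[0]);
    }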

Further work was also carried out at Cambridge University during 1987, extending their state feedback model onto multilayer perceptron networks[10]. In this model part of a network's output is fed back through a time delay, and added to the network's input pattern. This feedback creates some internal storage in a network, allowing it to recognise dynamic patterns. Such a network can be made to exhibit Finite Impulse Response (FIR) and Infinite Impulse Response (IIR) features, opening up new areas for neural signal processing. Diagram 3.5 shows the simplest form of a state feedback network.


[Diagram 3.5 State Feedback Network: the input at time t and the delayed feedback enter a dynamic network; part of the output is passed through a time delay to form the feedback.]
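A minimal sketch of one update of this arrangement is given below. It assumes the delayed feedback is appended to the input pattern; the network itself is abstracted as a routine, and the widths are illustrative.

    #include <string.h>

    #define N_IN 4   /* illustrative input width */
    #define N_FB 2   /* illustrative number of fed-back outputs */

    /* One step of the state feedback network of Diagram 3.5: part of
       the output at time t-1 is passed through a time delay (the fb
       array) and combined with the input pattern at time t. */
    void feedback_step(const float in[N_IN], float fb[N_FB],
                       void (*network)(const float *, float *))
    {
        float pattern[N_IN + N_FB];
        float out[N_FB];

        memcpy(pattern, in, sizeof(float) * N_IN);
        memcpy(pattern + N_IN, fb, sizeof(float) * N_FB); /* delayed feedback */
        network(pattern, out);                            /* forward pass */
        memcpy(fb, out, sizeof out);                      /* becomes the t-1 state */
    }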

3.5.1 Use of Neural Networks at University College London

The first work carried out using neural networks by the Phonetics Department at University College London was during 1986. This work was a comparison between the performances of a multilayer perceptron network and a Bayes Classifier[11] in speech classification. The Bayes Classifier is a conventional method of classifying pattern vectors, and is based on statistical decision theory, relying upon normal distribution of pattern clusters for optimum classification. The tests performed showed a considerable advantage in the use of a multilayer perceptron algorithm over that of the Bayes Classifier. This result is due primarily to the non-normal distribution of the pattern clusters in speech signals.

Since this time considerable work has been carried out using the multilayer perceptron model in a variety of projects[12,13]. Two of the main projects that have been worked on are discussed below.

3.5.1.1 Analysis and Resynthesis of Speech

The objective of this work is to look into the possibility of forming an internal representation of speech[14]. This internal representation would be used both as a target for speech pattern recognition, and as a starting point for speech resynthesis. It can also be used to compress speech signals, with analysis and resynthesis of the speech being carried out at the transmitter and receiver.


Previous attempts at performing speech analysis and resynthesis have had difficulties. These attempts have analysed the frequencies and energies of signals instead of considering the inherent pattern property of speech: the latter can be used to form a better representation of speech. The use of phonetic knowledge in the construction of an internal representation can be viewed as a 'Y' model. Diagram 3.6 shows schematically the interaction of this phonetic knowledge in the creation of an internal representation (I.R.).

[Diagram 3.6 'Y' Model: phonetic description at the head of the 'Y', linked by production to natural and synthetic speech through the internal representation.]

The ideal properties of such an internal representation are that it should be easily obtainable from speech analysis, and that it should contain sufficient information to enable natural sounding speech to be produced. Ideally some form of speaker independence should also be possible - this would then show that a true internal representation has been made.

A multilayer perceptron network had been developed for these experiments. This network has 45 units in the analysis hidden layer, 38 units in the resynthesis hidden layer, and 15 internal representation units between these two layers. These internal representation units are used to detect the degree of presence of a selection of speech features, such as:

SIL    Absence of speech
FRIC   Presence of frication
VOC    Presence of voicing
NAS    Presence of nasality
VFRIC  Presence of voicing and frication
S      Presence of /s/ quality frication

The allocation of internal representation units allows for some redundancy in the representation, enabling generalisation to occur on


new untaught speech, and providing a more robust performance.

The results of this work concluded that, with a limited set of taught data (digits 0-9), a good internal representation had been produced. The speech resynthesised was intelligible, if a little simplified, and the network was also able to generalise onto new untaught data.

3.5.1.2 Fundamental Period Extraction Network

A Fundamental Period Extraction Network has been developed; this was taken up by the EPI Group to provide a basis for the next generation of hearing devices, aimed at superseding SiVo and Microstim. The design of this network was to enable an estimate of the voiced fundamental period (Tx) to be made, and for it to work under non-ideal situations, with good background noise rejection[15].

This has been carried out using the multilayer perceptron algorithm, with a network being made up of four layers: an input layer, two hidden layers and an output layer. Diagram 3.7 shows this topology, and the input window arrangement used.

This input window is used to form the temporal context required for accurate Tx extraction. It is 20.5ms wide, and is constructed from 41 0.5ms wide slices. Each slice is made up of six energy values, corresponding to six different frequency bands, resulting in a total of 246 input units.

The two hidden layers both consist of six units. This was decided upon by simulation: ten units had first been used, but it was found that six units showed a very similar performance. This method of deciding upon the number of units to use in a network shows up the rather ad hoc methods still employed in this area of neural networks.

Only one unit is required in the output layer - this provides the Tx information. This information is then further processed to generate the required sinusoidal output for the device.
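The topology just described is small enough to write out directly. The C sketch below performs one forward pass of the 246-6-6-1 network; the weight values are naturally not reproduced here, and a standard sigmoid is assumed for the unit function.

    #include <math.h>

    #define N_IN  246   /* 41 slices x 6 frequency bands */
    #define N_H   6
    #define N_OUT 1

    static float w1[N_H][N_IN + 1];    /* +1 for bias; values not shown */
    static float w2[N_H][N_H + 1];
    static float w3[N_OUT][N_H + 1];

    static float unit(const float *w, const float *x, int n)
    {
        float sum = w[n];                   /* bias term */
        for (int i = 0; i < n; i++)
            sum += w[i] * x[i];
        return 1.0f / (1.0f + expf(-sum));  /* assumed sigmoid activation */
    }

    /* One forward pass: input window -> two hidden layers -> Tx value. */
    float tx_forward(const float in[N_IN])
    {
        float h1[N_H], h2[N_H];
        for (int j = 0; j < N_H; j++) h1[j] = unit(w1[j], in, N_IN);
        for (int j = 0; j < N_H; j++) h2[j] = unit(w2[j], h1, N_H);
        return unit(w3[0], h2, N_H);
    }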

The training for this network was done on speech, with the fundamental


period being recorded by use of a Laryngograph. A Laryngograph is a pick-up that is attached to the throat, which enables a recording to be made of the vocal cord activity. This recording of the vocal cord activity was presented to the output layer during learning.

[Diagram 3.7 Tx Extraction Network: an input window 41 frames wide by 6 frequency bands (600-900Hz, 900-1200Hz, 1200-2000Hz and 2000-3000Hz among those shown), fully connected to two hidden layers of 6 units each and a single output unit giving the fundamental period.]

The initial results obtained showed good recognition, and great robustness when external noise was introduced to the input. It was discovered that the best noise rejection occurs when noise is added to learning data, since this enables a better generalisation of a noisy signal to be constructed.

The network proved to be so robust that it was even able to improve on the Laryngograph recording of the initial speech. When tested, using the same data it was taught on, the network could pick out extra vocal cord closures that were originally missed on the Laryngograph recording.


3.5.2 Real-Time Implementation for Tx Extraction Network

Work has been carried out to provide a real-time implementation of this multilayer perceptron network[16]. The system developed consists of a TMS320C25 (with on-chip RAM), EPROM, A/D converter and various support components. The device fits on a printed circuit board measuring 7cm by 6cm. The board runs at 32 MHz and has a power consumption of 400mW. This gives about 12 hours' life with lithium batteries.

The key points in the choice of the processor are defined as below:

•Single cycle Multiply-Accumulate and Data-Move instruction
•Integer processing (to cut down on power consumption)
•Must be CMOS for low power consumption
•As much on-chip memory as possible

These points need no further explanation, with the exception of the Single cycle Multiply-Accumulate and Data-Move instruction. This instruction is used to implement the input window that is required in the Tx Extraction Network.

Most digital signal processing chips use a variety of different addressing modes to simplify the implementation of signal processing models, and to ensure the high throughput of data through the processing unit. Since digital signal processors now incorporate multiply and accumulate instructions that can occur in a single clock cycle, they require this processing power to be matched with the speed of the operand fetches. This puts a very high demand on the address generation that is required for many signal processing models that are commonly used.

For the implementation of input windows, two methods are widely used:

Modulo Address Generation Fetch - Operate - Data Move Instructions

Both these methods use specialised hardware in their implementation,


thus ensuring the maximum throughput of data through the processing units. These two methods can be seen on the two most common families of digital signal processors:

Motorola DSP56000

In the Motorola DSP56000[17] a separate hardware block is used for address generation; this is termed the Address Generation Unit (AGU). This unit operates in parallel with the multiplier, and can be used to implement a variety of address modes. The address mode that would be most suitable for the implementation of the Tx Extraction Input Window is Modulo Address Generation.

•Modulo Address Generation This address mode allows circular buffers to be created inside a memory block. These buffers must fall on a binary base address, and be of a binary divisible length. This mode allows FIFO buffers, sample windows and delays to be easily programmed.

Diagram 3.8.a shows the implementation of an input window using this address mode. It involves the setting up of a modulo buffer within the memory and the use of an input pointer to keep track of the most recent data sample. The pointer increments each time a new data sample is entered - with wrap around occurring if it increments beyond the end of the modulo buffer. This new data sample is labelled T0 in the diagram, with the previous data samples being labelled as T-1, T-2 ... As each new sample is entered the remaining samples will recede back in time (T-1 will become T-2 etc.), with the least recent being over-written by the new data sample.

This method of creating an input window has a drawback in that the value that represents a particular time (ie T0) will be continuously moving through memory as new samples are entered. This requires special pointers to always keep track of the position of the input window within the modulo buffer.
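In software the same addressing mode can be mimicked as follows; the DSP56000 performs the wrap-around in the AGU hardware, whereas this C sketch makes it explicit with a power-of-two mask, matching the binary length restriction noted above.

    #define WINDOW 256          /* buffer length: a power of two here */

    static short buf[WINDOW];
    static unsigned in_ptr;     /* index of the most recent sample, T0 */

    void modulo_insert(short sample)
    {
        in_ptr = (in_ptr + 1) & (WINDOW - 1);   /* wrap around the buffer */
        buf[in_ptr] = sample;                   /* overwrite the least recent */
    }

    /* Sample at time T-k: because the window moves through memory,
       every access must be computed relative to the input pointer. */
    short modulo_sample(unsigned k)
    {
        return buf[(in_ptr - k) & (WINDOW - 1)];
    }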


[Diagram 3.8 Sample Window Implementations: a) Motorola DSP56000, where the input pointer moves through the fixed modulo buffer as time slices advance; b) Texas Instruments TMS320, where the samples themselves are shifted through a fixed buffer.]

Texas Instruments TMS320

The Texas Instruments digital signal processor range TMS320[18] uses a fundamentally different method in its address generation. By incorporating RAM on-chip the processor can achieve much faster operations on this memory, allowing special instructions to be used with no time overheads. One such instruction is the MACD instruction.

•MACD (Multiply-Accumulate and Data-Move) This instruction fetches the next operand, performs the multiplication and accumulation, and moves the operand value to the next memory location. This data move has the effect of implementing FIFO buffers, sample windows and delays within a fixed memory area.

Diagram 3.8.b shows the implementation of an input window. It involves creating a buffer the same size as the input window. Using the MACD instruction once in a program loop performing the network update results in the whole input window being shifted one place back in time (the value at T0 will be moved to T-1 etc.). This operation results in a space being created at the beginning of the input window, where the next sample can be placed.


The benefit of using this method to implement the input window is that each sample will remain fixed at the same location (ie T0 will always be at the same position, likewise T-1, T-2 ...). This removes the necessity of having to keep track of where in memory each sample is positioned.
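The effect of the MACD instruction on such a window can be sketched in C as below; on the TMS320 the multiply, accumulate and data move all happen inside a single-cycle instruction, whereas here they are written out explicitly.

    #define TAPS 246            /* e.g. one value per input unit */

    static int x[TAPS];         /* x[0] is T0, x[1] is T-1, ... fixed positions */
    static int w[TAPS];         /* coefficient (weight) table */

    /* Multiply-accumulate with data move: each tap is used and then
       shifted one place back in time, freeing x[0] for the next sample. */
    long macd_update(int new_sample)
    {
        long acc = 0;
        for (int k = TAPS - 1; k > 0; k--) {
            acc += (long)w[k] * x[k];
            x[k] = x[k - 1];        /* the data move: T-(k-1) becomes T-k */
        }
        acc += (long)w[0] * x[0];
        x[0] = new_sample;          /* space created at the start of the window */
        return acc;
    }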

3.5.3 Requirements for an Alternative Architecture

This thesis studies an alternative architecture that, it is hoped, will provide a better solution for the implementation of the Tx Extraction Network and other similar tasks. This architecture is to build on the features seen in the TMS320, while being optimised for minimal size and power consumption.

To achieve this objective I have isolated the following criteria on which to judge any prospective architecture.

•High Flexibility The architecture must exhibit enough flexibility to enable a variety of network types to be implemented. However the trade-off between flexibility and hardware complexity also needs considering, to ensure that an overly flexible architecture does not incur an undue cost in complexity.

•Soft-Programmability Soft-programmability allows the full flexibility of an architecture to be exploited in a convenient manner. In the following chapter it will be seen that many architectures achieve a very low complexity by a trade-off against soft-programmability. These architectures will be examined to discover whether this trade-off is always valid.

•High Efficiency To ensure that the two main objectives of minimal size and power consumption are achieved requires the efficient use of all sections of a particular architecture. This criterion is used vigorously in the analysis of any prospective architecture.


These three criteria are used in the critical analysis of currently available neural architectures presented in the following chapter. Following this analysis a new architecture is developed - again taking these criteria as design goals, and as a means of measuring its success.

3.6 Conclusion

This chapter has introduced a typical application area to which neural networks have been applied, and has studied the implementation of a specific problem related to speech processing.

This study has shown the benefits in using neural networks for this task; neural networks not only provide a more robust performance, but offer a more general solution that can be mapped onto a multitude of different problems.

The remainder of this thesis is devoted to developing an architecture that can implement the style of networks presented in this chapter. The criteria laid down in the previous section are used during this development, and are discussed in more detail in the later chapters.

Chapter 4

Architectural Considerations

This chapter considers the requirements that were identified in Chapter 3, and attempts to match these to an existing neural architecture. The chapter concludes that, at present, there is no single architecture that meets all these requirements. However it is possible to formulate an hypothesis that could result in a suitable architecture; this hypothesis is taken up by the remainder of the thesis, which presents a suitable architecture, and an analysis of its success.

4.1 The Diversity of Neural Implementations

The hardware on which to implement neural networks is developing as neural networks become larger and require faster execution. Originally conventional computers were used to simulate neural networks, with the first specialised neural hardware being developed to act as neural co-processors to speed up the neural operations.

To achieve even greater speed, and higher network capacity, special purpose computers are now being developed[1]. These machines are called neurocomputers. The technology required to build these machines provides an almost limitless expandability by using highly specialised neural ASIC devices.

Diagram 4.1 gives some indication as to the power, in both speed and network capacity, that a variety of implementations can achieve. This diagram also includes conventional computers and some low intelligence animals, so that comparisons can be made[2]. The values given in this diagram are for single devices only, the performance of many of these implementations can be increased by utilising arrays of these devices.

The rest of this chapter looks at the different architectures that have been proposed for neural co-processors and neural ASICs. After presenting these architectures some analysis is carried out, with particular reference to the criteria laid down in Chapter 3. From this


analysis it is possible to see that there is no single architecture, from those currently available, that meets all these requirements.

[Diagram 4.1 The Diversity of Neural Implementations: speed (connections/sec) plotted against storage (connections) for biological brains (bee, fly, worm, leech), workstations (SUN 3, PC/AT), co-processors (DELTA, NETSIM, Transputer) and neural ASICs (ETANN, Intelligent Memory Chips, Hitachi Wafer, DNNA).]

4.1.1 Neural Co-processors

Neural co-processors are a fairly mature technology with several products now being commercially available. Co-processors can be obtained for several hosts - PCs, SUNs and other UNIX workstations. They tend to be board based, and consist of memory, input/output buffers and fast computational units. The technology used in these co-processors ranges from digital signal processors to bit-slice technology, and even custom built devices. The parallelism incorporated in most co-processor boards is low: the speed is obtained by using extremely fast multipliers and pipelined architectures.


4.1.1.1 NETSIM - Texas Instruments

NETSIM[3] is a board-based co-processor built around two custom designed chips. These are a 'Solution Engine' and a communications chip.

The Solution Engine has just three instructions, these being specially designed for neural operations. These are:

Repeat-Multiply-Sum (Update Synaptic Value)
Read-Write
Repeat-Multiply-Sum-Write (Update Weight Values)

The speed obtainable from the Solution Engine is 4x10^6 interconnections per second, or 1.3x10^6 learning interconnections per second.

NETSIM boards can be connected into arrays, of either two or three dimensions. The communications chip performs all the necessary data transfers, at a rate of 10 Mbit per second. The theoretical maximum number of NETSIM boards that can be connected in one array is 27000, with up to 256 neural nodes per board.

A major feature of this design is the development route, from a single board to eventual expansion into arrays of NETSIM boards. For development purposes a replica NETSIM board is inserted into a PC host. High-level languages can be used to develop and debug applications using this replica NETSIM board. Upon completion the software can be downloaded onto an array, with the host PC providing input and output as well as control over the array.

NETSIM provides a versatile implementation for neural networks. The speed achieved is adequate for many applications, but this may require complex and bulky arrays of NETSIM boards to be built.

4.1.1.2 DELTA Co-Processor - SAIC Corp.

The DELTA[4] co-processor provides a solution to a wide range of applications. It allows the full development and eventual production of systems built around DELTA processor boards, and the DELTA operating


system.

The development tools are standard across the range of products, based on the DELTA and ANSim software. This enables the development of software to be carried out with a consistent interface. The range of processor boards available allows a variety of systems to be built: they can be used in conjunction with a host processor, or in a stand-alone environment. The stand-alone processors include A/D and D/A converters so real-time control systems can be built with minimal support hardware.

The DELTA processor architecture consists of a 4 stage pipeline, providing maximum parallelism for the operations required:

Operand Fetch (4 Address Modes)
Multiply
Accumulate
Write Result

The multiplication and addition are performed by BIT's ECL floating point multiplier and ALU chips. The performance achieved is 10^7 interconnections per second with a capacity of 10^6 connections.

Input and output is performed by FIFO Buffer links capable of 40MB/sec, allowing up to 32 devices to be connected using a shared bus.

The DELTA processor and ANSim development tools provide a commercial means of incorporating neural networks into many applications. By providing a range of compatible hardware and software products, a variety of solutions can be achieved from this system.

4.1.2 Neurocomputers and ASIC Neural Devices

The next generation of neural implementations is composed of neurocomputers using neural ASIC devices. These implementations can offer a 10^3 increase in performance over neural co-processors by utilising the latest technologies and by building specialised devices. This area of neural device is still undergoing considerable research,


with very few commercial products out on the market yet: most products now available are Neural ASICs out of which neurocomputers can be made. Hitachi is one of the first companies to produce a prototype neurocomputer, but this will not be marketed for the next couple of years.

Below several neural ASICs are discussed. The key points of each design are mentioned, giving a quick review of each implementation, and not necessarily a full description of its operation. The examples chosen indicate the variety of possible solutions, and the different goals and objectives used by the respective research groups. After presenting these examples, assessments are made of the technologies used, and possible drawbacks to these architectures are highlighted.

4.1.2.1 ETANN - Intel Corporation.

By the development of 'Floating Gate Non-Volatile Analogue Memories' Intel have been able to produce a true analogue neural device, the Electrically Trainable Artificial Neural Network (ETANN) [5]. The architecture consists of 64 neurons, and two 4096 synaptic matrices. The first matrix provides an input synapse array, and the second provides a feedback synapse array. This arrangement allows both Hopfield and two layer feedforward networks to be implemented. Sample/hold units are provided to enable the analogue signals to be clocked through the circuitry.

Each synapse is composed of two EEPROM memory cells, an excitatory and an inhibitory cell. The difference between the floating gate voltages (Vfg) of the two EEPROM cells represents the weight value. Diagram 4.2 shows a synaptic cell.

The operation of each synapse is as below:

Weight = (Vfg+ - Vfg-)
Input  = (Vin+ - Vin-)
Output = (Vin+ - Vin-) * (Vfg+ - Vfg-) = (Iout+ - Iout-)


[Diagram 4.2 ETANN Synapse: a pair of EEPROM cells with bias and output connections.]

Summation for each neuron is performed by current summation. The synaptic outputs from each synapse in a neuron are connected, resulting in the total current being the sum of the post-synaptic currents (Kirchhoff's First Law). The neural function is performed by a non-linear amplifier on the summated output current.
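A simple behavioural model of this differential scheme is sketched in C below; the units are arbitrary and the amplifier shape (tanh is assumed here) is illustrative rather than Intel's characterised circuit.

    #include <math.h>

    #define N_SYN 64   /* synapses feeding one neuron; illustrative */

    /* Behavioural model of an ETANN neuron: each synapse contributes a
       differential current (Vin+ - Vin-) * (Vfg+ - Vfg-); the currents
       sum on a shared wire (Kirchhoff's First Law), and a non-linear
       amplifier forms the neural output. */
    float etann_neuron(const float vin_p[N_SYN], const float vin_m[N_SYN],
                       const float vfg_p[N_SYN], const float vfg_m[N_SYN])
    {
        float i_sum = 0.0f;
        for (int k = 0; k < N_SYN; k++)
            i_sum += (vin_p[k] - vin_m[k]) * (vfg_p[k] - vfg_m[k]);
        return tanhf(i_sum);   /* assumed amplifier characteristic */
    }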

Learning is performed in a two cycle operation. The ETANN executes a forward iteration, and external hardware then calculates the error terms and modifies the weights. The weights are programmed by a technique known as Fowler-Nordheim Tunnelling. By sending a pulse of 10-12 Volts to the floating gate, tunnelling of electrons occurs between the floating gate and the diffusions. The charge deposited is then used by the transistor to produce the floating gate voltage. By using two memory cells each weight can be increased or decreased by sending pulses to either the excitatory or inhibitory memory cell.


This device provides a low accuracy - 4 bit accuracy - high speed analogue neural implementation. Its speed is 10^10 interconnections/sec, and capacity 8x10^3 connections.

4.1.2.2 DNNA - Neural Semiconductor Ltd.

The DNNA Chip[6] (Digital Neural Network Architecture) is based on digital pulse trains, allowing very simple operating units, and high speed. The architecture used is very similar to analogue devices with the network topology being hardwired in the form of a matrix. However the use of pulse trains does avoid some of the limitations presented by analogue devices - the stability is improved, and the performance is more predictable. Most important is the facility of inter-chip communications. This poses very large problems with analogue devices since board level capacitance and resistance are very hard to match with on-chip circuitry.

[Diagram 4.3 Pulse Trains in DNNA]

Diagram 4.3 shows the operating style of this pulse train implementation. Each synapse is composed of an AND gate, requiring only 4 transistors. The summation is performed by a wired-OR; this functions in a manner similar to analogue devices with their current summation. The effect of this wired-OR function also acts as a neural function, as the two cases below show (a small simulation sketch follows them):


•Low Activation Levels The post-synaptic pulses overlap infrequently, so the OR function performs a simple addition.

•High Activation Levels The post-synaptic pulses occur more often, resulting in more overlaps. The OR function now no longer performs the sum of these pulses, but instead saturates in a manner similar to a sigma function.
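The saturating behaviour of the wired-OR can be seen in a one-line simulation. In the C sketch below each synapse is the AND of an input pulse and a weight pulse, and the row output is their OR; integrated over many time steps, the output pulse density approximates the sum at low activity and saturates at high activity.

    /* One time step of a DNNA-style synapse row: AND synapses feeding
       a wired-OR summation. Simultaneous pulses overlap, so the "sum"
       saturates as activity rises, acting as the neural function. */
    int dnna_or_row(const int in_pulse[32], const int w_pulse[32])
    {
        int out = 0;
        for (int k = 0; k < 32; k++)
            out |= (in_pulse[k] & w_pulse[k]);   /* 4-transistor AND synapse */
        return out;
    }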

Due to the almost limitless expandability of pulse trains the design is packaged as a cascadable synaptic array of 32 x 32 elements. Neuron chips are also supplied which consist of 32 neurons that can be wired to a synaptic array. These neuron chips operate by taking the pulse train, integrating it over a period of time, and producing an output pulse train with the use of a stochastic pulse generator.

DNNA chips provide a hardwired solution to neural implementations. The speed obtained can be traded off against accuracy - using more pulses for each iteration gives greater accuracy, at a cost of slower performance. Using 256 pulses per iteration a performance of 2x10^8 interconnections per second can be achieved on a single synapse array.

4.1.2.3 Intelligent Memory Chips - Oxford Computer

The Intelligent Memory Chip[7] presents a very neat solution to matrix and vector operations. These chips can be used in a number of applications that require such operations, since they are fully programmable and configurable.

Conventional memory chips require the transfer of data to and from memory to let the processor carry out operations on the data. This operation creates a bottleneck between the memory and processor. The Intelligent Memory Chip has made a radical step by moving the processing from the processors into the memory chips themselves.

Intelligent Memory Chips can perform matrix/vector multiplication on their contents. Each Intelligent Memory Chip consists of two dual-ported


256 x 256 memory arrays - the matrix array and the vector array. The two blocks of data to be multiplied are loaded into these arrays.

For neural network implementations the weights are loaded into the matrix array - with each intelligent memory chip holding a single bit for each weight value. Up to 64Kbits can be held in each intelligent memory chip. The weights for different neurons should be held in different rows, since multiplication for each neuron occurs across a whole row at the same time.

The input values are loaded into the vector array; this array is dual-ported so that loading can overlap the execution of the intelligent memory chips. Each input value is loaded so that it occupies the same column in memory, with the different bits being held consecutively over a number of rows.

The multiplication is performed simultaneously between two entire rows, one taken from each memory core. This requires 256 single bit multipliers (AND gates). The product terms are then summated by an adder with 256 inputs, before being accumulated by the single bit partial product accumulator. To carry out a multiplication between two rows requires n cycles (where n = no. of bits in the vector data), and m memory chips (where m = no. of bits in the matrix data).

Over these n cycles each intelligent memory chip presents a single row of the matrix array to n successive rows in the vector array. After n cycles a single bit partial product for each input value in the vector array multiplied by each weight value is obtained in each intelligent memory chip. To obtain the complete product each single bit partial product is added together. Before this addition can occur the bit position for each single bit partial product has to be taken into consideration. This relies upon the weighting of each single bit partial product according to its bit position (ie 2^1, 2^2, 2^3 ...). These weighted partial products are then added to produce a single product value for that matrix/vector multiplication.
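The arithmetic of this scheme can be checked with the small C model below. It is a functional sketch only: the outer loop over weight bit-planes stands in for the m intelligent memory chips, the loop over vector bit rows for the n cycles, and the final shift for the weighted adder.

    /* Bit-serial product of one weight row with the input vector,
       modelled on the Intelligent Memory scheme: single bit partial
       products are formed by AND gates, summed across the row, then
       weighted by their bit positions and added together. */
    long bitserial_dot(const unsigned char w[], const unsigned char x[],
                       int len, int wbits, int xbits)
    {
        long total = 0;
        for (int wb = 0; wb < wbits; wb++)         /* one "chip" per bit-plane */
            for (int xb = 0; xb < xbits; xb++) {   /* n cycles over vector rows */
                long partial = 0;
                for (int i = 0; i < len; i++)      /* row of single bit products */
                    partial += ((w[i] >> wb) & 1) & ((x[i] >> xb) & 1);
                total += partial << (wb + xb);     /* weight by bit position */
            }
        return total;
    }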

Diagram 4.4 shows the internal operation of both the Intelligent Memory Chip, and the Weighted Summer.


[Diagram 4.4 Intelligent Memory Chips: the 256 x 256 matrix array (1 bit of weight data per chip) and the dual-ported 256 x 32 vector array (8 bit input data) feed 256 processor slices with single bit partial sum accumulators; a weighted adder and rounder combines the partial products P0-P255 into the 8 bit output.]

The chip set supplied by Oxford Computers consists of the Intelligent Memory Chips, Weighted Adder, controllers and input/output buffers. These can be used in a variety of configurations and are fully programmable. The operation speed obtained varies according to the arrangement used - for 8 bit data 4x10^9 connections/sec can be achieved. This configuration would require 8 Intelligent Memory Chips, one for each bit in the weights table.

4.1.2.4 Hitachi Neural Wafer

As mentioned above, Hitachi is one of the first companies to propose a full scale commercial neurocomputer[8]. This neurocomputer is based on wafer scale integration, with each wafer holding 49 neural chips connected by a hierarchical bus structure. Each chip holds 12 neurons, giving a total of 588 neurons. Redundancy is built in: out of these 588 neurons only 576 are used. This increases the yield since on average there are


about 10 defects per wafer.

The bus structure is shown in Diagram 4.5. This is hierarchical in structure to allow distribution and buffering, but is logically connected and must be shared by all neurons. This sharing limits the bus bandwidth to a single value per cycle. This allows high performance to be obtained for Hopfield networks since all neurons can operate on the same data. Feedforward networks and reduced connectivity networks however will show a reduced performance since not all neurons will be able to operate with the data on the bus.

[Diagram 4.5 Hitachi WSI Structure: the chips of a wafer, each holding 12 neurons, linked by a hierarchy of local buses.]

The bus arbitration is performed by a central control, which signals the address of the next neuron to output its partial sum onto the bus. This partial sum is output onto the bus with the address of the source neuron. The sigma function is performed on the main bus using shared hardware, before the neural output is sent as input to all neurons.

Each neuron has a local memory of 64 weights only. After learning, which occurs off-chip, only the largest 64 weight values for each neuron are used. To identify that the input value on the bus is relevant for a neuron, special data access hardware is used in each neuron. This is shown in Diagram 4.6. This hardware takes the address of the source neuron for each input value, and by use of a lookup table decides if there is a weights value corresponding to it. This weight is output if


required, otherwise zero is used to indicate that the input value should be ignored.
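In outline the data access hardware behaves like the following C lookup; the table layout here is an assumed software equivalent, not Hitachi's actual circuit.

    #define N_WEIGHTS 64

    /* Per-neuron weight store: after learning only the 64 largest
       weights survive, each tagged with the address of its source. */
    struct entry { int source; float weight; };
    static struct entry table[N_WEIGHTS];

    /* Returns the stored weight for the value broadcast on the bus,
       or zero so that the input is effectively ignored. */
    float weight_for(int source_addr)
    {
        for (int i = 0; i < N_WEIGHTS; i++)
            if (table[i].source == source_addr)
                return table[i].weight;
        return 0.0f;
    }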

This WSI (Wafer Scale Integration) implementation shows the ease of expandability of neural devices, and how redundancy can be built in to ensure a high yield. However the communication structure in this design is limited in complexity, which will severely reduce the performance in certain network types. The maximum performance from a wafer is 2x10^9 interconnections per second with a capacity of 3x10^3 connections.

[Diagram 4.6 Data Access in Hitachi Neuron: the address of the source neuron is passed through a comparator and selector to a weight data pointer, which yields either the stored weight or zero.]

4.1.2.5 Digital Snake - Univ. Ancona, Italy.

The Digital Snake Implementation[9] is typical of many research level implementations. It presents a device that can be customised to a particular network, thus bringing about a reduction in the complexity and size of the implementation. This implementation is intended for feedforward networks, and can result in a very high performance if the network topology is of a regular structure. A study of the performance


of this type of architecture is carried out in greater detail toward the end of this chapter.

In this example each neuron is represented by a single processor. These processors are linked by linear buses that can be used to shift input and output values through an array.

Three buses are provided, input, output and error propagation. These are linked as shown in Diagram 4.7. Input is provided at one end of the linear array, and is shifted through the array as a computational wavefront. When each processor has processed all the input data, the outputs can then be made available. These are shifted through the output bus and, by a crossover, are made available as input to the lower layers. The operations of both inputs and outputs are overlapped to ensure high performance.

[Diagram 4.7 Digital Snake Bus Structure: the layers from the first hidden layer to the output layer linked by input and output buses, with a crossover returning each layer's outputs as inputs to the next.]

4.1.2.6 A Flexible WSI Neural Network - Inst. National Polytechnique

This research project[10] has worked on silicon compilation methods that are used to cascade a number of elementary blocks to build a customised neural processor. The individual processors in this design are generated according to a set of parameters, which are then 'tiled' in a regular manner to build the required size array on a wafer.

The global structure of a wafer containing these processors is shown in Diagram 4.8. From this diagram it is possible to see how the use of a


regular structure allows any number of such processors to be tiled together. According to the precision used, up to 400 - 800 such processors can be placed on a 4 inch wafer.

[Diagram 4.8 Flexible WSI Structure: processing elements (PE) tiled in a grid, linked by bus segments and bus switches.]

To enable a variety of network topologies to be mapped onto a wafer, special bus switches are placed on each of the buses. These switches are defined before fabrication, and will be implemented as either software controlled switches or hard wired switches. These switches allow the feedforward of input values and the backpropagation of errors to be implemented for a wide range of network topologies.

Diagram 4.9 shows how the switches would be arranged for a two layer feedforward network. The hard switches define each layer, and allow the input values for that layer to be shifted through and presented to each processor. Once each layer has completed its computation the soft switches between each layer close, allowing the outputs to feed forward from one layer to the next.


[Diagram 4.9 Feedforward Network Structure: input and output buses with hard wired switches (closed) defining the two layers.]

This architecture allows the fast customising of an array of neural processors. Using silicon compilation techniques ensures that each implementation matches the required network topology, thus providing an optimum solution.

4.2 Assessment of Neural Implementations

This section carries out an analysis of each type of neural architecture that has been presented. Particular attention is paid to the criteria given in Chapter 3 when selecting a suitable architecture. The discussion starts with analogue architectures, and then follows with hybrid architectures. It will be seen that the properties found in these two are very similar; it is only their computational style that differs.

Digital architectures are then examined; these are taken in more depth since they represent a more probable solution to the criteria given. The


digital architectures are split into two groups, the MIMD arrays and the SIMD arrays. The analysis shows the very different properties of these two groups, and their respective advantages and disadvantages.

4.2.1 Analogue Implementations

Analogue implementations can provide very high speeds in their computations; however they do exhibit several limitations[11]. Programmability is hard to include in analogue devices without large overheads[12]; this leads to most implementations having a fixed connectivity, onto which a network must be mapped. Device variations, electrical instability to both noise and heat, and low accuracy in weights storage are other problems found.

Power consumption and amplifier stability are major problems that limit the scaling of analogue implementations[13]. To ensure the stability of a device, the neural function amplifiers should be kept in their active region. This ensures that the power consumption is minimised, and that saturation does not occur in the network. To keep the amplifiers in their active regions, the sum of the input currents must be kept below the threshold level. This becomes more difficult to ensure as the number of synapses entering a neural node increases. To ensure that the sum of these currents does not put the amplifier into saturation, the input currents must be scaled down by the same factor as the increase in the number of synapses. With this reduction in the signal current the stability and accuracy of the system falls, as it becomes more susceptible to temperature, noise and device variations.

Analogue implementations of neural networks are difficult to simulate due to the device variations, and complex characteristics of the analogue components. This can result in off-chip learning becoming inaccurate, requiring the incorporation of on-chip learning[14], which in turn adds considerable complexity to the device. This learning problem is neatly tackled by Intel in the ETANN[5]. Here the analogue device is used only for the forward phase, with external logic performing the reverse phase, and adjusting the weights. This allows accurate calculations of the errors to be made, increasing the network's performance.


Other limitations of analogue devices are the difficulties in matching off-board capacitance and resistances. This tends to limit a network to a single chip, restricting the size of network that can be implemented.

It can be concluded that analogue neural implementations provide good solutions in small, noise-free applications. Here their lower accuracy can be counteracted by their high speeds and asynchronous nature[15]. Visual pre-processing is a good application area for such devices[16,17].

These factors indicate that an analogue implementation would not be suitable as an architecture to meet the requirements given in Chapter 3. The main objection is the lack of the soft-programmability that is required to allow a device to be configured for a variety of different networks. Other problems relate to the difficulties encountered in off-chip learning, making it difficult to mass produce large numbers of such devices.

A final point to mention is the technology required to fabricate analogue VLSI devices. Two polysilicon layers are required to build the capacitors. This was not available to me, since the ES2 process that the London VLSI Consortium uses only incorporates a single polysilicon layer. This would limit any proposed design to metal/metal or metal/poly capacitors. These have a very small unit capacitance, resulting in the use of large chip areas on which to build the capacitors.

4.2.2 Hybrid and Pulse Neural Implementations

To solve some of the problems encountered in analogue implementations, hybrid and pulse implementations have been proposed. These offer very similar properties to analogue implementations, but with some considerable advantages.

Many earlier hybrid implementations used digital memory to store the weights[18]. This was primarily due to analogue memories being unavailable at the time, and also to their increased stability, and the possibility of using conventional fabrication processes.


The most common hybrid implementations now are pulse train devices [6,19,20]. These devices are constructed in a very similar manner to analogue implementations, and differ only in the computation and connection style. Instead of using variable voltages or currents, variable frequency pulses are used, in a manner very similar to biological nervous impulses. These pulse trains avoid the noise and temperature problems encountered in analogue implementations. They also enable expandability, since digital pulses can be communicated between chips without affecting their values.

This class of neural architectures was also eliminated from the possible solutions for the target application area. The reason again is the lack of the soft-programmability that was seen to be crucial in the requirements for the target architecture.

4.2.3 Digital Implementations

Digital implementations can solve many of the problems associated with analogue techniques. Digital devices benefit from increased stability, the arbitrary accuracy that can be achieved, and most importantly their soft-programmability. All of these features point to the benefits of using digital techniques for the application areas that this device is proposed for.

When considering a possible architecture with which to build a neural processor, a look at the different ways parallel arrays are constructed is required. The key features of any parallel array are how the different processors are controlled, and how the communications between the processors occur. Both these features weigh very heavily in the design, and in the resulting properties that different arrays exhibit. The two main control types for parallel arrays are considered below.

4.2.3.1 MIMD Arrays

The neural co-processors discussed in this chapter [NETSIM and DELTA] both incorporate MIMD processors. These processors make use of parallel multipliers and pipelined architectures to obtain a very high


performance.

Due to their high performance it is possible to time multiplex the execution of a network across a number of such processors. The ability to do this can considerably reduce the number of processors required for an implementation. By careful calculations the number of processors can be estimated to achieve the required run time performance for any network. In MIMD arrays the speed up obtained from the use of multiple processors can be almost linear with the number of processors used, provided that care is taken to ensure that the communication network

does not become overloaded[21].

The technique of mapping neurons onto an arbitrary size array I have termed in this thesis as 'Virtual Neurons', since each neuron is not represented by a physical processor, but instead by a logical mapping onto the array. It is this factor which gives MIMD arrays their versatility, and their ability to implement almost any network topology.

The communications carried out in MIMD arrays involve message

passing[21]. This requires the packaging of each value to be transmitted: these packets contain the source address, the destination address and the actual value (a minimal packet layout is sketched after the two strategies below). This packet is then transmitted from the source onto the communications bus. The technique used in actually transmitting this packet of data varies, and will depend upon the connectivity between the processors. For neural processing there are two suitable forms of connectivity, which are mentioned below.

•Shared Bus [4] The DELTA co-processor is an example of a shared bus communication strategy. In this case up to 30 processors can be connected onto a high bandwidth bus. The message passing on this bus is carried out by an asynchronous communications manager for each processor. This monitors the activity on the bus, transmitting a packet of data only when the bus is free, and picking up any packets addressed to it. These packets are then held in a buffer until required by the processor.


•Multi-Dimension Array[22,23] This type of communication strategy is seen in the NETSIM co-processor. In this case the processors are arranged in two or three dimensional arrays, with each processor providing links to its neighbours. Messages are passed from processor to processor until they reach their destination. This form of communication strategy can match that required by neural networks very efficiently, since local connections are predominant in neural networks.

To obtain high performance, complex routing algorithms can be used to ensure the most efficient route is taken for a packet. These algorithms also ensure that bottlenecks do not occur by rerouting packets around blocked channels.
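A minimal C rendering of such a packet is given below; the field widths are illustrative assumptions only.

    /* Message-passing packet as described above: every value sent on
       the communications network carries its source and destination. */
    struct packet {
        unsigned short source;        /* address of the sending processor */
        unsigned short destination;   /* address of the receiving processor */
        short          value;         /* the neural value being transmitted */
    };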

The key point to note in both these communications techniques is the fact that any processor can communicate with any other processor. This property I have termed in this thesis as being a single level communication strategy. This has the effect of totally freeing the network mapping, allowing almost limitless possibilities in the assignment of neurons to processors. It will be seen that this feature of MIMD arrays plays a large part in giving MIMD arrays their flexibility and high performance.

One drawback to this style of communications strategy is the requirement for some form of host processor to provide the input and output from the array. This host must fit into the communication strategy, and must use the same communications protocol as that used by the array. This requirement for a host processor can add to the overall complexity of a final implementation.

4.2.3.2 SIMD Arrays

SIMD arrays use very large numbers of processors to produce a fine granularity array. This approach can be seen to be very similar to that used in biological systems, with very small individual processing elements working in unison.

There are two basic methods used in assigning the workload of a network across an array. These two methods I have termed as:

Variable Network Mapping
Fixed Network Mapping

Variable Network Mapping

In variable network mapping no fixed mapping of neurons to processors is carried out. Examples of this are the Hitachi Wafer and Intelligent Memory Chips. Both implementations use a fixed number of processors, capable of different mappings of a network across this pool of processors.

In the Hitachi Wafer each processor is used for a single neuron, with the mapping of each neuron being defined by the weights in each processor's weights block. In this example external global control is required to arbitrate the common communications bus.

The Intelligent Memory Chips provide an array of very low granularity processors. In this case there is no assignment of an individual processor to each neuron, but instead a time multiplexing of each neuron through the array. The control of this process is by software, which defines the vector and matrix blocks that are to be multiplied, and where the results are to be placed.

Both these implementations require external hardware. The Hitachi Wafer requires a host processor and bus arbitration unit, while the Intelligent Memory Chips require Weighted Summers, micro-controllers and vector input/output interfaces. These neural implementations provide good solutions for neural ASIC building blocks, out of which neurocomputers can be made. They are highly versatile, offer very high speed, and can be expanded to an arbitrary size. However their suitability for a small size device is considerably inhibited by the requirement for the control and arbitration logic.

Fixed Network Mapping

This class of neural implementation can provide a very compact and efficient device. Network mapping is defined by dedicated bus structures, which removes the requirement for external arbitration or global control. Each processor can work autonomously with a minimal control strategy, exploiting the distributed nature of neural networks and their local interconnections through these dedicated bus structures. This considerably reduces the device complexity, and can bring down the size of an implementation to a single chip.

The following array topologies have been proposed for SIMD neural arrays.

•Shift Arrays[9] The processors are arranged in a linear array, with the output from one layer being shifted, and used as input to the following layer.

•Broadcast Bus[24] The processors are arranged in layers with the outputs from one layer being connected to all inputs in the following layer via a broadcast bus.

•Single and Multi Dimension arrays[25] Communications are performed by message passing via processor-to- processor links. Such a strategy requires the incorporation of routing hardware within each processor.

Diagram 4.10 shows an outline of how each of these communication methods would be used to implement a network of 4, 3 and 4 units in the 1st hidden, 2nd hidden and output layers respectively.

The communication strategies in these examples tend to mimic the network topology fairly closely, so that each neuron is mapped onto a single processor. This mapping of neurons to processors is determined by the processors' physical position within the communication structure. For example in the broadcast bus communication structure, each layer of


processors will be used to represent a layer of neurons, with each processor implementing a single neuron.

[Diagram 4.10 Communication Strategies for SIMD Arrays: a) Shift Array, b) Broadcast Bus, c) Processor Array.]

The simple communications style of SIMD arrays opens up the possibility of running an array in a stand alone configuration. By the use of simple logic the input and output to the array can be interfaced directly to external hardware. For example A/D and D/A converters could be used to provide direct input and output without the need for an intermediate host processor. Such a scheme could dramatically reduce the overall complexity of an implementation.

4.3 Analysis of SIMD Arrays

Two good studies have been carried out on the implementation of neural networks on SIMD arrays[26,27]. These studies show a small variation in performance between different communication strategies, with some strategies being better suited to certain network topologies than others. I will not repeat the work done in these papers, but will instead generalise all SIMD implementations into a single group. This generalisation is valid since it uses the best theoretical performances of these SIMD arrays.

When used to implement a multilayer network a SIMD array will operate in a manner similar to a pipeline. This is because each


layer of processors uses the previous layer's output for its input. This arrangement will exhibit all the properties of a pipeline processor with regard to throughput (pipeline iteration time) and latency (time from input affecting output).

Using this assumption it is now possible to isolate various problems that SIMD arrays present, and also to predict their performance in implementing specific networks.

4.3.1 Flexibility

The first problem found in SIMD arrays is due to the inflexibility of these implementations. The requirement for the bus structure in each implementation to match the network topology results in a customised device having to be built for each network. This factor seriously limits the flexibility of these implementations, and means that the soft programming of a network onto a device is not possible.

4.3.2 Performance

As with any pipeline, the time required between each stage in the pipeline is determined by the slowest stage. This can cause major problems in some network topologies.

If the number of units in the different layers varies, then the time taken to compute the layer with largest number of inputs will determine the pipeline iteration time. Idle cycles will have to be inserted into the shorter layers to keep them in synchronisation. This problem is particularly acute in signal processing applications, where the input layer tends to be extremely wide. This width is required to capture time and frequency components of the signal being processed.

The method that I have used in the analysis of the performance of different architectures involves looking at the number of cycles required to implement each neuron, and the number of idle cycles that a particular architecture will impose.

For this analysis I shall take the Tx Extraction Network presented in


Chapter 3 as an example. The key points in this network are:

246 Input Units
6 Neurons in 1st Hidden Layer
6 Neurons in 2nd Hidden Layer
1 Neuron in Output Layer

When implemented on a SIMD array each layer of the network is treated as a pipeline stage, requiring a three stage pipeline for the network. The pipeline iteration time is 246 Time Units, this representing the widest layer (246 Units in Input Layer). To synchronise the array 240 idle cycles are inserted for the 2nd hidden layer and the output layer. The processors in both these layers operate at a considerably reduced efficiency.

Table 4.1 SIMD Array Performance

             1st Hidden Layer   2nd Hidden Layer   Output Layer
Mapping      6 PEs              6 PEs              1 PE
Iteration    1                  2                  3
Cycles       246                6 + 240 Idle       6 + 240 Idle

From this analysis it is possible to compute the average efficiency of the processor array:

1st Hidden Layer:   246 Synapses   100%
2nd Hidden Layer:   6 Synapses     (6/246)*100 = 2.4%
Output Layer:       6 Synapses     (6/246)*100 = 2.4%

The total efficiency can be calculated by taking the sum of the efficiency of each neuron, and finding the mean of this value.

1st Hidden Layer:   6 Neurons   (6*100) = 600
2nd Hidden Layer:   6 Neurons   (6*2.4) = 14.4
Output Layer:       1 Neuron    (1*2.4) = 2.4


Total: 616.8

Average for Whole Network: 13 Neurons   (616.8/13) = 47.5%

This calculation shows the low efficiency obtained from an array of SIMD neural processors arranged in a fixed network mapping. The mean utilisation is 47.5%, and, more importantly, 7 out of the 13 processors operate at only 2.4% utilisation.
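This first-order calculation is simple enough to express in a few lines of code. The following is a minimal sketch, assuming one processor per neuron and a pipeline period fixed by the widest layer; the function name is illustrative only, not part of any thesis toolchain:

def simd_efficiency(synapses_per_layer, neurons_per_layer):
    period = max(synapses_per_layer)             # pipeline iteration time
    total = 0.0
    for syn, n in zip(synapses_per_layer, neurons_per_layer):
        total += n * (100.0 * syn / period)      # per-neuron utilisation
    return total / sum(neurons_per_layer)        # mean over all neurons

# Tx Extraction Network: 246 synapses per 1st hidden layer neuron,
# 6 synapses per neuron in the later layers
print(simd_efficiency([246, 6, 6], [6, 6, 1]))   # ~47.5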

4.3.3 Real-Time Constraints

Achieving the required real-time performance can also present problems for such an implementation. Since the speed of operation is determined by the pipeline iteration time and the number of layers, there are limited methods that can be used to adjust the performance of a network implementation. In any real-time system there are two primary constraints that have to be met[28]:

•Sample Period This determines the rate at which data is input to the system. The sample period can be critical in many digital signal processing applications. If it is not met then frequency folding (aliasing) will occur in the input frequency spectrum, causing large inaccuracies in the models being implemented.

•Response Time This represents the time between a change on the input and its effect on the output. This can often be critical; for example, in speech processing the delay must be short to avoid any mismatch between audio and visual signals.

It is important to achieve both these constraints as closely as possible[29]. If the device is too slow then problems such as those indicated above will occur. If it is too fast then the device must be more complex than necessary, representing wasted silicon area and power consumption. Therefore it is important to be able to adjust the real-time performance of an implementation to match that which is required.


4.4 Summary of Available Neural Implementations

From the assessments carried out in this chapter, it has been possible to highlight the relevant advantages and disadvantages of a variety of neural implementations. For the very specialised requirements laid down in Chapter 3, none of the above neural implementations is entirely suitable. It is however possible to suggest that digital architectures present the best solution. The properties of the two main classes of parallel digital architectures have been discussed, and form an important feature of the remainder of this thesis. These features are summarised below:

•MIMD Arrays MIMD arrays rely upon a software mapping of a network across an array. Each processor is programmed individually to enable this, with the number of processors in the array being considerably less than the number of neurons in the network. This scheme gives MIMD arrays considerable flexibility, with the benefit of soft programming of a network onto an array.

However these benefits come at a cost in increased complexity. The requirement for each processor to have its own control, local memory and communications support all add considerably to the size, cost and power consumption of an implementation.

•SIMD Arrays SIMD arrays can offer much simpler methods in the implementation of neural networks. The devices seen in this chapter make use of hard wiring to define network topologies, eliminating the need for complex control and communications, but at a cost in reduced flexibility, and no soft programmability.

By considering the beneficial features of each of these types of architecture a hypothesis can be formed. This hypothesis states:

If an architecture could be developed that combines the properties exhibited by both MIMD and SIMD arrays, then a device could be


built that exhibits all the following properties:

High Flexibility
Low Complexity
Soft Programmable
Low Power Consumption
High Performance
Low Cost of Implementation

This hypothesis drove the development of an architecture, which is presented in the remainder of this thesis.

Chapter 5

Proposed Architecture

This chapter takes the hypothesis presented in Chapter 4, and expands the ideas behind it further. By looking at MIMD and SIMD arrays it has been possible to extract the architectural features that are attributed to their respective properties, and from this study to develop an architecture to combine these features.

5.1 Objectives for the Architecture

The development of the proposed architecture was initiated by the requirements for the target application presented in Chapter 3. A study of this, and related applications, allowed the identification of the desirable properties that an architecture should exhibit to support these applications efficiently. From the study of current neural architectures none were identified as having these properties. However this study was able to identify the beneficial and detrimental properties from the two main classes of digital neural architectures - MIMD and SIMD arrays. These are summarised in Table 5.1 and Table 5.2.

Table 5.1 MIMD Array Properties

Advantages                Disadvantages

High Flexibility          High Complexity
Software Programmable     Complex Communications
High Efficiency           Requires Host Processor

The conclusion from this study was that if the beneficial features from both types of arrays could be combined, the resulting device would represent an ideal solution for the application areas discussed in Chapter 3.


Table 5.2 SIMD Array Properties

Advantages                          Disadvantages

Low Complexity                      Low Flexibility
Simple Communications               Hard Programmability
Stand Alone Operation Possible      Low Efficiency

5.2 Development of Architecture

It was this combination of the advantageous features from both MIMD and SIMD arrays that represented the main objective for the work presented in this thesis. The following sections in this chapter outline the steps that were taken in the development of an architecture that could fulfil this. A summary of these steps is presented below:

Step 1 •Extract Desirable Features from MIMD and SIMD Arrays This involved looking at MIMD and SIMD arrays, and deciding what features in their respective architecture could be attributed to each one of their beneficial properties.

Step 2 •Develop Methods to Implement these Features Methods had to be derived to enable the features from step 1 to be implemented in the same architecture. This stage was crucial in the development of the architecture, since the features isolated from MIMD and SIMD arrays tended to be inconsistent with each other.

Step 3 •Develop an Architecture to Combine these Features The final step was the development of a unified architecture to include all these features.


5.2.1 Desirable Features of MIMD Arrays

The three main beneficial properties of MIMD arrays that this architecture is attempting to capture are:

•High Flexibility
•Soft Programmability
•High Efficiency

All three of these properties can be attributed to the same feature in MIMD arrays, that is the ability to perform a logical mapping of a network onto an array, by the use of Virtual Neurons.

The use of Virtual Neurons allows the assignment of neurons onto the MIMD array to be done in a multitude of ways - with either several Virtual Neurons to a processor or the splitting of a Virtual Neuron across several processors. This arbitrary assignment of Virtual Neurons to each processor has several benefits:

•High Flexibility Virtual Neurons allow much greater flexibility in the mapping of neurons onto processors. It is possible to implement many different network topologies on the same device by simply reconfiguring the network mapping.

•Soft Programmability Since each processor has its own control, and individual program, the network mapping is fully programmable. This factor considerably simplifies the development of a network implementation, and allows greater versatility to be obtained.

•High Efficiency Virtual Neurons allow the workload of a network to be more evenly distributed across the available processors, ensuring that the number of idle cycles in an implementation is kept to a minimum. Virtual Neurons can also be used in several ways to help achieve the real-time constraints that an application demands. They enable close to linear speed-up with the number of processors, thus


allowing the number of processors to be altered to achieve the required performance. (A sketch of such a load-balanced mapping is given below.)
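A minimal sketch of this load balancing follows. It assumes that workload is measured in synapses and ignores the splitting of a single neuron across processors; the function name is illustrative only:

def map_virtual_neurons(synapse_counts, n_processors):
    loads = [0] * n_processors
    assignment = [[] for _ in range(n_processors)]
    # Greedy: place the heaviest neurons first on the least-loaded PE
    for idx, syn in sorted(enumerate(synapse_counts), key=lambda t: -t[1]):
        p = loads.index(min(loads))
        assignment[p].append(idx)
        loads[p] += syn
    return assignment, loads

# Tx network: six 246-synapse neurons, seven 6-synapse neurons, 6 PEs
print(map_virtual_neurons([246] * 6 + [6] * 7, 6)[1])

With six processors the workloads come out near-equal (one heavy neuron plus one or two light ones per processor), in contrast to the fixed mapping analysed in Chapter 4.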

The implementation of Virtual Neurons requires two special features in an architecture:

•MIMD Control MIMD control provides the ability for each processor to execute a different program, enabling different virtual neurons to be implemented on each processor.

•Single Level Communications The use of a single level communications scheme allows complete flexibility to be exploited in the network mapping.

5.2.2 Desirable Features of SIMD Arrays

The main beneficial properties of SIMD arrays that this architecture is attempting to capture are:

•Low Complexity
•Simple Communications
•Stand Alone Operation

It is more difficult to attribute these properties in SIMD arrays to any single feature of their architecture. Probably the most important single feature in these arrays is that they are controlled by a single central controller, to which all these beneficial properties can be attributed:

•Low Complexity Removing program control hardware and complex communications controllers considerably reduces the complexity of SIMD arrays.

•Simple Communications SIMD control can bring about considerably reduced complexity in the communications when compared to MIMD control. This is achieved because each processor is synchronised by the SIMD control, eliminating the requirement for buffering and synchronisation.


•Stand-Alone Operation The use of SIMD control, and simple communications strategies, allows such a device to operate without a host processor. It is possible to provide input and output directly from I/O devices, for example an A/D and D/A converter can be directly connected to an array with the minimum of additional hardware.

5.3 How to Combine these Features

The architectural features extracted from MIMD and SIMD arrays that it is intended to combine are:

•MIMD Control
•Single Level Communications Strategy
•SIMD Control
•SIMD Communication

The next stage in the development of this architecture was to find a means to combine these features into a single device. It was envisaged that the bulk of the processing, and communications, would be carried out under SIMD control, with MIMD control being associated with the scheduling of virtual neurons onto each processor.

5.3.1 Combining MIMD and SIMD Control

The combination of both MIMD and SIMD control strategies presents several problems:

•Where to store the two different control strategies
•How to select between the control strategies
•How to represent each virtual neuron
•How to implement the communications scheme
•How to synchronise between processors
•How to do all the above with the minimum of complexity

The processes involved in solving these problems, and the options that were available, are discussed in the sections below. The solutions


eventually used are then presented as an overall architecture, bringing all the ideas presented together.

5.3.1.1 Storage of MIMD and SIMD Control Strategies

The solution proposed for the storage of MIMD and SIMD control strategies involves the use of two different program stores. This is necessary since the hardware requirements for each of these memory blocks are different.

The SIMD program has to be broadcast to the array, adding extra complexity to the SIMD program store. This is due to the additional sequencing and synchronisation that needs to be performed during the broadcasting of the SIMD program.

The requirements for the MIMD program store are simpler since each processor controls its own program flow, allowing conventional memory to be used. It was decided to hold the MIMD program local to each processor; an alternative method would have involved a central MIMD store with bus arbitration and memory protection schemes. However the use of this central store was rejected due to bus bandwidth problems.

5.3.1.2 Control Switching and Virtual Neuron Definition

Once the storage requirements of the two control strategies were defined, it was possible to consider how to control the switching between the control types, and how to define each virtual neuron. These two problems are related, and are considered together.

The representation of each virtual neuron involves loading the various pointers, the definition of the weights, computation of the sigma function, and the writing of the neuron's output. It was decided to store the weights together with the MIMD instructions, minimising the number of memory blocks, and pointers required.

The technique of combining data and instructions is not a new idea in neural processors - for example it is seen in the Phonetic Typewriter[1], where a TMS320 Digital Signal Processor is used. This


processor uses fast on-chip memory, which allows an immediate operand to be used without any reduction in speed. This requires each weight to be defined as an immediate operand in the multiply instruction:

MPYK weight    #Multiply Immediate

Such a technique enables both data and program instructions to share the same memory without the use of any additional control logic. However it does represent a considerable overhead in the memory requirement; this being due to each weight having to be defined as an opcode followed by a data value.

First Proposed Solution

The method first considered for this architecture used a similar scheme to this immediate operand addressing mode. However in this architecture there is no need to define the multiply opcode for each weight since this operation is performed under SIMD control. All that would be required in this case would be some means to distinguish between the following:

Use this byte as a Weights Value
Use this byte as a MIMD Instruction

To distinguish between these two, a single flag bit is required, which would be added to each byte in the MIMD program store. From this flag bit each processor could decide whether to execute a byte as a MIMD instruction, or whether to switch to SIMD control, and use the byte as a Weights Value.

This method is shown in Diagram 5.1(a). Here the flag bit from the memory is used to select which instruction to execute. If the flag bit is high SIMD control is used, and if it is low the byte is used as a MIMD instruction.

This scheme allows the weights and MIMD instructions to co-exist in memory with a fairly low overhead (9 bits required to hold each byte of data). It also provides a neat solution to the selection between the two control strategies; the flag bit can be used directly to select between SIMD and MIMD control.

Diagram 5.1 Methods Proposed for MIMD and SIMD Switching
[(a) Flag Bit Selection   (b) Weights Block   (c) Actual Solution Used]
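The selection logic of scheme (a) is trivial to model. The sketch below assumes the flag occupies the ninth (top) bit of each stored word; the exact field layout is an assumption made for illustration:

def decode_word(word9):
    flag = (word9 >> 8) & 1          # the additional flag bit
    byte = word9 & 0xFF
    if flag:
        return ("SIMD", byte)        # byte is a weight value
    return ("MIMD", byte)            # byte is a MIMD instruction

print(decode_word(0x13A))   # ('SIMD', 58): a weight, 0h3A
print(decode_word(0x00E))   # ('MIMD', 14): an instruction opcode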

This scheme was eventually discarded when it was decided that external memory was going to be used for the first implementation of the architecture. The use of 9 bits to store each byte is not convenient with standard external memories; this was seen as a major obstacle to the technique. What was then sought was a more conventional method that could be used to distinguish between the weights and MIMD instructions.

Second Proposed Solution

The second method proposed the use of a special instruction at the beginning of a block of weights to switch between MIMD and SIMD control. This scheme allows conventional memories to be used, since the need for the additional flag bit has been removed.

One possible implementation using this scheme is shown in Diagram 5.1(b). In this arrangement the instruction selection is controlled by a status register in the datapath: this status register acts as a zero flag for a counter register. The counter register is loaded with a value upon the start of a block of weights, resulting in the zero flag being reset, and a switch being made to SIMD control. This counter register is then decremented each time a weight value is read, resulting


in the zero flag becoming set when the last weight value is read. This will then switch the control for that processor back to MIMD control.

This scheme provides a fairly conventional method of controlling operations on blocks of data. The use of block processing instructions can be seen in many processors. The Z80[2] has a series of block processing instructions, which use general purpose registers to control how many times an instruction is to be repeated; an example is CPDR (ComPare, Decrement and Repeat).

The use of a counter to control the number of iterations for an instruction presents some overheads. An additional register is required to hold the counter, and additional cycles are required to decrement the counter and to test the zero flag. These extra cycles were seen as a major drawback in the operation of the scheme, so it too was rejected.

Final Solution Used

The solution finally proposed is shown in Diagram 5.1(c). Here a special instruction at the start of a block of weights is used to select SIMD control. This instruction is decoded conventionally by the instruction decoder, which sends a signal to a finite state machine that actually controls the instruction multiplexer.

The end of a block of weights is not determined by a counter, but instead by a special instruction that is positioned at the end of each weights block. This instruction is distinguishable from the weights since it represents an invalid weight value, and is detected by special logic built into the instruction multiplexer.

This final method offers a very low overhead in time and hardware requirements. The addition of the two instructions at the start and end of each weights block offers low memory overheads for the majority of networks, and the logic required to detect these instructions is minimal.
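The control switching can be summarised by the following sketch. The opcode values here are invented for illustration; only the structure (LOCAL_OFF entering SIMD control, a reserved escape value returning to MIMD control) reflects the scheme described above:

LOCAL_OFF = 0x0E    # illustrative encodings only; the real values
ESCAPE    = 0xFF    # are fixed by the chip design in Chapter 6

def run_weights_table(store):
    mode, pc, weights = "MIMD", 0, []
    while pc < len(store):
        byte = store[pc]; pc += 1
        if mode == "MIMD":
            if byte == LOCAL_OFF:
                mode = "SIMD"         # following bytes are weights
            # ... other MIMD instructions would be decoded here ...
        else:
            if byte == ESCAPE:
                mode = "MIMD"         # rogue value ends the block
            else:
                weights.append(byte)  # consumed by the SIMD update
    return weights

print(run_weights_table([LOCAL_OFF, 0x3A, 0x23, 0xFA, ESCAPE]))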


5.3.1.3 Proposed Communication Scheme

Before considering the communications strategy, it is worth mentioning the various values that are to be transmitted between each processor. The most obvious of these are the neural inputs and outputs; these are transmitted between each virtual neuron, and form the bulk of the communications workload. A single-level communications scheme should be used for this, since this ensures maximum flexibility in the network mapping.

The other values to be transmitted are the partial sums for those virtual neurons that are spread across more than one processor. The performance of this operation is less critical, since this is an infrequent event.

Communications of Neural Inputs and Outputs

As previously discussed most of the processing is carried out under SIMD control, ensuring that each processor remains synchronised. This enables a very simple communications scheme to be used for the transmission of neural input and output values.

The communication scheme used is a shared memory with a common bus. This is one of the simplest schemes available, offering very low hardware overheads.

The arbitration for the common bus is performed by spreading out the SIMD instructions across the array. This has the effect of pipelining the instructions in the SIMD program through the array, thus ensuring that no two processors operate on the same instruction simultaneously.

Such a simple communications scheme does present a limit to the number of processors that can be used in a single array. In this case the limit is determined by the number of instructions in the SIMD program. However since this architecture is aimed at a small-scale device, this lack of expandability is not seen as a major disadvantage.
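This arbitration-by-pipelining is easily modelled. The sketch below assumes a one cycle delay unit per processor (see Section 5.4.1); because each bus-using instruction appears exactly once in the 14 instruction SIMD program, no two processors ever hold the same bus-using instruction on the same cycle:

SIMD = ["LD_WEIGHT", "LD_INPUT", "RESET_ACC"] + ["MULTI"] * 8 + \
       ["MULTT", "ADD_PAR_LSW", "ADD_PAR_MSW"]

def instruction_at(processor, time):
    # Processor p sees the broadcast stream delayed by p cycles
    return SIMD[(time - processor) % len(SIMD)]

for t in range(3):
    print(t, [instruction_at(p, t) for p in range(4)])

The maximum number of processors in an array follows directly: it is bounded by the 14 instructions in the SIMD program, as noted above.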


Communications of Partial Sums

To be able to implement a virtual neuron on more than one processor requires the ability to add the partial sums across these processors. To achieve this a bus is required that can transmit the partial sum from one processor to another. It will be seen that this bus is also used for the sigma function lookup table.

A complication in the partial sum communications is that they are performed under MIMD control, requiring some form of specialised synchronisation. This synchronisation is performed at compile time, thus minimising the hardware requirements that would have been necessary for run-time synchronisation. The synchronisation scheme used for this is shown in more detail in Section 5.6.1.4.

5.3.1.4 Synchronisation of Processors

As in any array of processors, it is important to be able to synchronise processors in the array for certain events. In this architecture there are two main synchronisation events that need to be considered:

•Synchronise SIMD Control between Processors
•Synchronise Each Network Update

The first of these ensures that when a processor switches to SIMD control, it will only start to execute the SIMD program at the first instruction in the SIMD sequence. The second of these events enables the array to be synchronised at the start of each network update. This is important since input and output from the device must occur between each network update, requiring each processor to wait while this occurs.

Synchronise SIMD Control between Processors

To synchronise the execution of the SIMD control requires each processor to start the execution of the SIMD program on the first instruction in the sequence. To achieve this a synchronisation signal is broadcast with the first instruction of each SIMD program iteration, with each


processor only commencing SIMD control when this signal is present.

Synchronise Each Network Update

The synchronisation of each network update requires first an indication of when each processor has completed all its virtual neurons, and then an indication of when the next network update can commence.

To indicate the completion of each network update a signal is generated: this is controlled by an instruction in the MIMD program. This signal is daisy-chained between the processors to indicate when all processors have completed their virtual neurons. This signal can then be used to control the input and output from the device. To indicate when this input and output is complete, an externally generated signal is broadcast to each processor, this results in the next network update commencing.

5.4 Proposed Architecture

Now that the key points have been discussed it is possible to present the overall architecture. An outline of this architecture is shown in Diagram 5.2.

Processor Array This array is composed of custom built processors that operate on a special instruction set optimised for neural models. These processors are each connected to four buses:

•Sigma Bus The Sigma Bus connects each processor to the Sigma Function lookup table. This bus is also used for transferring partial sum values between processors. This bus is multiplexed between a 32 bit partial sum, which is used as a lookup address, and an 8 bit data value.

Diagram 5.2 Outline of Architecture
[SIMD control with I/O control; an array of PEs, each with its own weights table; the modulo address and sigma threshold logic; the input table with its I/O; and the sigma function table]

•SIMD Bus The SIMD Bus is a bi-directional bus that is used to broadcast the SIMD program to the array, and to monitor the status of the array.

•Input Bus The Input Bus is shared between the processors, and is used to access the Input Table. This bus is multiplexed between a 16 bit address and an 8 bit data value.

•Weight Bus The Weight Bus is owned individually by each processor, and is multiplexed between a 16 bit address and an 8 bit data value.

The other features of this architecture are required to implement the global control, to provide storage for the input values and weights, and to implement the sigma function. These blocks are described below:


•SIMD Control The SIMD control operates by broadcasting the SIMD program to the array; this program updates a single synapse upon each iteration. The SIMD control is also used to synchronise the processor array and the external input and output devices.

•Weights Table The Weights Tables hold both the weights and the MIMD instructions, with each processor having exclusive access to its own weights table.

•Input Table The Input Table provides the communications between each virtual neuron and the input and output from the array.

•Modulo Address The Modulo Address generation is performed under control from each processor, and is used in the implementation of input windows. This method of implementing an input window follows a similar technique to that used by the Motorola DSP56000. (See Chapter 3, Section 3.5.2.1.)

•Sigma Function The Sigma Function provides a lookup table with which to compute the sigma function. The use of a lookup table allows the sigma function to be altered for different network characteristics.

•Sigma Threshold The Sigma Threshold is used to reduce the size of the sigma lookup table required. Simulations have shown that a 2K byte lookup table is adequate for this type of application; the threshold function compresses large partial sums into this 2K byte table.
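The threshold and lookup can be sketched as follows. The table contents and the fixed-point scaling are assumptions made for illustration, not the values used on the chip:

import math

TABLE_SIZE = 2048
SCALE = 128.0   # fixed-point scaling of the partial sum (an assumption)
SIGMA_TABLE = [int(255 / (1 + math.exp(-(i - TABLE_SIZE // 2) / SCALE)))
               for i in range(TABLE_SIZE)]

def sigma(partial_sum):
    # Threshold: saturate the 32 bit partial sum into the table's range
    idx = max(-TABLE_SIZE // 2, min(TABLE_SIZE // 2 - 1, partial_sum))
    return SIGMA_TABLE[idx + TABLE_SIZE // 2]    # 8 bit neuron output

print(sigma(-100000), sigma(0), sigma(100000))   # 0 127 254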

5.4.1 Architecture of Processing Element

The architecture of each processor follows that used by conventional processors incorporating RISC techniques[3]. These techniques minimise

the complexity of the processor by using a restricted instruction set, and by ensuring that all instructions are executed in a single cycle.

Diagram 5.3 gives an overview of the processor architecture. This processor incorporates a 16 bit datapath, instruction decoder and status register, and the specialised hardware to perform the selection between the two control types.

Diagram 5.3 Overview of Processor Architecture
[The datapath with its control, the local FSM and instruction decoder, the instruction multiplexer, the delay unit passing the SIMD stream to the next processor, and the connections to the weight table, input table and sigma table]

•Datapath A sixteen bit datapath incorporating special features to implement byte and long word operations.

•LOCAL FSM This finite state machine controls the next instruction selection, and is used to switch and synchronise between the two control types.


•Instruction Multiplexer This multiplexer selects the next instruction to be executed, and is controlled by the LOCAL FSM.

•'ESCAPE' Logic This block is positioned on the weight bus, and is used to detect the instruction that indicates the end of each block of weights.

•Delay Unit This block is positioned on the SIMD control bus, and imposes a single cycle delay to the SIMD program. This delay has the effect of pipelining the SIMD program throughout the array, thus performing the SIMD arbitration.

5.5 Programming Strategy

The requirements that are imposed on the programming strategy have been introduced earlier in this chapter; the main requirement being the ability for both SIMD and MIMD control types to be incorporated in the design. When considering the programming strategies for these control types it was decided to use a single instruction set, which simplifies the programming, and allows both control types to be decoded on the same hardware. The instructions are:-

LD_WEIGHT          #Load Weight Value and Increment Pointer
LD_INPUT           #Load Input Value and Increment Pointer
LD_SIGMA           #Perform Sigma Function on Partial Sum
MULTI              #Multiply Iterate
MULTT              #Multiply Terminate
RESET              #Reset Processor
ADD_PAR_LSW        #Add Partial Sum LSWord
ADD_PAR_MSW        #Add Partial Sum MSWord
RESET_ACC          #Reset Accumulator
LOCAL_OFF          #Start of Weights
STOP               #Stop at End of Network Iteration
BIAS_LLSW 0hxx     #Load Bias LSByte LSWord
BIAS_HLSW 0hxx     #Load Bias MSByte LSWord
BIAS_LMSW 0hxx     #Load Bias LSByte MSWord
BIAS_HMSW 0hxx     #Load Bias MSByte MSWord
MODULO_ON          #Turn Modulo Address Generation On
MODULO_OFF         #Turn Modulo Address Generation Off
LD_PAR_LSW         #Load Partial Sum LSWord from Sigma Bus
LD_PAR_MSW         #Load Partial Sum MSWord from Sigma Bus
INPUT_PTR_L 0hxx   #Load Linear Input Pointer LSByte
INPUT_PTR_H 0hxx   #Load Linear Input Pointer MSByte
ADD_MOD_STEP       #Add Modulo Step to Modulo Input Pointer
OUTPUT             #Output Neuron Result
RETURN             #Jump to Weight Table 0h0000
NOP                #No Operation

Instruction Set


5.5.1 SIMD Program

The SIMD control is fixed, and is generated by dedicated hardware on-chip. This program consists of a sequence of instructions that performs a single synaptic update with each iteration, this being continuously broadcast to the array.

The SIMD program is given in Listing 5.1. This program consists of 14 instructions that will fetch an input value and its associated weight, increment the pointers, multiply the weight and input value and add the product to the partial sum.

# SIMD Synaptic Update Program
# This Sequence of Instructions is Continuously Broadcast to the Array
# Each Iteration Updates a Single Synapse

LD_WEIGHT      #Load Weight Value and Increment Weight Pointer
LD_INPUT       #Load Input Value and Increment Input Pointer
RESET_ACC      #Reset Accumulator
MULTI          #Multiply (Serial Multiplication)
MULTI
MULTI
MULTI
MULTI
MULTI
MULTI
MULTI
MULTT
ADD_PAR_LSW    #Add Partial Sum LSWord
ADD_PAR_MSW    #Add Partial Sum MSWord

Listing 5.1 SIMD Synaptic Update Program

This program utilises 9 multiply cycles for each synaptic update. This is required since 8 bit data values are used, with the ninth cycle being required to ensure that the sign bit of the product does not become corrupted. (See Section 6.2.3.)
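The hardware multiplier itself is described in Section 6.2.3; purely as an illustration, a software reconstruction of an eight-plus-one cycle signed multiply is sketched below. Eight add/shift cycles form the unsigned partial products, and a ninth corrects for the two's complement signs so that the sign bit of the product is preserved. This is a generic shift-add scheme, not the chip's exact datapath:

def serial_multiply(w8, x8):
    acc = 0
    for i in range(8):                # 8 MULTI cycles: shift-and-add
        if (x8 >> i) & 1:
            acc += w8 << i            # unsigned partial products
    # Ninth cycle (MULTT): correct for the two's complement signs
    if w8 & 0x80:
        acc -= x8 << 8
    if x8 & 0x80:
        acc -= w8 << 8
    if (w8 & 0x80) and (x8 & 0x80):
        acc += 1 << 16
    return acc

print(serial_multiply(0xFD, 0x05))    # -3 * 5 = -15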

The distribution of the SIMD program across the array is performed by pipelining the instructions through the array, as previously described. This operation is shown diagrammatically in Table 5.3, where the execution of the SIMD program across four processors is shown.

The bold type in the table highlights the instructions that require arbitration, and the italics are used to indicate SIMD control.

The continuous broadcasting of the SIMD program allows the autonomous

updating of each virtual neuron. All that is required from each processor is the scheduling of each virtual neuron. These operations are performed by the MIMD program.

Table 5.3 Execution of SIMD Program Across an Array

TIME      PROCESSOR 1    PROCESSOR 2    PROCESSOR 3    PROCESSOR 4

TIME 0    LD_WEIGHT      ADD_PAR_MSW    ADD_PAR_LSW    MULTT
TIME 1    LD_INPUT       LD_WEIGHT      ADD_PAR_MSW    ADD_PAR_LSW
TIME 2    RESET_ACC      LD_INPUT       LD_WEIGHT      ADD_PAR_MSW
TIME 3    MULTI          RESET_ACC      LD_INPUT       LD_WEIGHT
TIME 4    MULTI          MULTI          RESET_ACC      LD_INPUT
TIME 5    MULTI          MULTI          MULTI          RESET_ACC
TIME 6    MULTI          MULTI          MULTI          MULTI
TIME 7    MULTI          MULTI          MULTI          MULTI
TIME 8    MULTI          MULTI          MULTI          MULTI
TIME 9    MULTI          MULTI          MULTI          MULTI
TIME 10   MULTI          MULTI          MULTI          MULTI
TIME 11   MULTT          MULTI          MULTI          MULTI
TIME 12   ADD_PAR_LSW    MULTT          MULTI          MULTI
TIME 13   ADD_PAR_MSW    ADD_PAR_LSW    MULTT          MULTI
TIME 14   LD_WEIGHT      ADD_PAR_MSW    ADD_PAR_LSW    MULTT
TIME 15   LD_INPUT       LD_WEIGHT      ADD_PAR_MSW    ADD_PAR_LSW
TIME 16   RESET_ACC      LD_INPUT       LD_WEIGHT      ADD_PAR_MSW

5.5.2 MIMD Program

The MIMD program is used to define each Virtual Neuron, and to implement a variety of other functions that may be required by the network. A few application examples are given at the end of the chapter to show the variety of techniques that can be incorporated into the MIMD program.

In addition to these functions, the MIMD program must also synchronise and control each processor. The instructions used for these operations are given special consideration below:

•LOCAL_OFF This instruction is used to indicate the start of a block of weights values. It has the effect of switching control from MIMD to SIMD for that individual processor. The following bytes in the weights table will then be treated as weights data for the next virtual neuron.


•ESCAPE This instruction is used to mark the end of each block of weights values, and results in a switch back to MIMD control. This instruction acts in a similar manner to a 'rogue value' at the end of a block of data. This rogue value is detected by special logic built into the weights table bus controller. When this instruction is detected the processor stops executing the SIMD instruction stream, and uses the following bytes in the weights table as MIMD instructions.

•STOP This instruction is used to indicate that the processor has completed its virtual neurons for a network update. The processor will then wait until all the other processors have completed their virtual neurons, and the external input and output has been carried out, before commencing with the next network update.

The execution of each instruction in the MIMD program requires two cycles: a fetch and an execute cycle. Since both these cycles require access to the weights table it is not possible to overlap their execution, resulting in MIMD control not being as efficient as SIMD control. However since the time spent in MIMD control is small when compared to SIMD control, it was decided that this factor did not represent a serious drawback. It will be seen later in this chapter that the synchronisation required during MIMD control again impairs its efficiency, with NOP instructions being frequently inserted to achieve this. However it is felt that the minimisation of the hardware required in this synchronisation counteracts this reduced efficiency.

The following sections show how various programming techniques can be used, and give examples of various applications. After presenting these examples, three different implementations of the Tx Extraction Network are presented.


5.6 Application Notes and Examples

This section shows how this architecture can be used to implement a variety of networks. It starts by introducing a few techniques that can be used to implement various functions, and then shows the full implementations for some different applications. This includes fully connected networks, state feedback networks and conventional digital filters. (Both finite impulse response and infinite impulse response filters are considered.) Finally the full implementation of the Tx Extraction Network is considered. This implementation experiments with three different network mappings, showing how a network can be optimised for different criteria.

This section then concludes with a comparison between the performance of this architecture, and that of a typical SIMD array architecture. The three different Tx Extraction Network mappings are used in this comparison. These comparisons show the improved performance and flexibility of this architecture over a conventional SIMD array.

5.6.1 Programming Techniques

Before considering the full implementation of any network, a few programming techniques are presented. These techniques can be used in many ways to implement a wide variety of networks. The examples that follow develop these techniques further to demonstrate the flexibility that can be obtained from this architecture.

The techniques shown below are the definition of individual virtual neurons, the initialization and synchronisation between processors, the programming of an input window and the addition of the partial sums over several processors.

5.6.1.1 Definition of Virtual Neuron

The definition of each virtual neuron involves several functions. This includes the setting up of the input pointer, loading the neuron bias, updating each synapse, computing the sigma function and writing the neuron output back into the input table. This series of functions can

be represented as the algorithm below:

LOAD BIAS
LOAD INPUT POINTER
DO UNTIL ESCAPE
    UPDATE EACH SYNAPSE
ENDDO
SIGMA FUNCTION (partial_sum)
LOAD INPUT POINTER
WRITE NEURON OUTPUT

Listing 5.2 shows a virtual neuron with four synapses; this virtual neuron takes its inputs from 0h0100 - 0h0103 and its output is written to 0h0200.

# Virtual Neuron Definition
# Input Data 0h0100 - 0h0103
# Output 0h0200

BIAS_LLSW 0h53     #Load Bias into Partial Sum (0hfffff753)
BIAS_HLSW 0hf7
BIAS_LMSW 0hff
BIAS_HMSW 0hff
INPUT_PTR_L 0h00   #Set Input Pointer for Virtual Neuron (0h0100)
INPUT_PTR_H 0h01
LOCAL_OFF          #End of Local Instructions
0h3A               #Weight Values for Neuron
0h23
0hfA
0hA5
ESCAPE             #End of Weights
LD_SIGMA           #Calculate Neuron Output by Sigma Lookup Table
INPUT_PTR_L 0h00   #Set Input Pointer for Neuron's Output (0h0200)
INPUT_PTR_H 0h02
OUTPUT             #Write Neuron Output into Input Table

Listing 5.2 Virtual Neuron Definition
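Functionally, the listing computes nothing more than a weighted sum passed through the sigma function. The sketch below is a behavioural model only, with the sigmoid standing in for the chip's lookup table, and with the bias and weights of Listing 5.2 read as signed values (an assumption):

import math

def sigma(s):                          # stand-in for the lookup table
    return 1.0 / (1.0 + math.exp(-s / 256.0))

def virtual_neuron(bias, weights, inputs):
    partial_sum = bias                 # loaded by the BIAS_* instructions
    for w, x in zip(weights, inputs):  # one SIMD iteration per synapse
        partial_sum += w * x           # LD_WEIGHT, LD_INPUT, multiply, add
    return sigma(partial_sum)          # LD_SIGMA; OUTPUT writes the result

# Bias 0hfffff753 = -2221; weights 0h3A, 0h23, 0hfA, 0hA5 as signed
print(virtual_neuron(-2221, [58, 35, -6, -91], [10, 20, 30, 40]))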

The bus arbitration required in any section of MIMD code must be carefully implemented. Two instructions in this section of code require consideration:

LD_SIGMA The LD_SIGMA instruction only has to be arbitrated with other processors running under MIMD control, since the LD_SIGMA instruction only occurs during MIMD control. To ensure that this instruction does not create any conflicts a simple convention is used:

The LD_SIGMA instruction should only be used in the instruction following the ESCAPE instruction.

This convention ensures that the LD_SIGMA arbitration follows the same scheme as used by the SIMD control. This arbitration scheme is shown in


Table 5.4. Here the start of MIMD control follows the pipelining of instructions across the array as seen under SIMD control, thus automatically pipelining the LD_SIGMA in a similar manner.

OUTPUT The arbitration for the OUTPUT instruction has two cases that need consideration. These two cases depend upon whether all the processors are running under MIMD control, or whether some are under MIMD control and some are under SIMD control.

•All Processors under MIMD Control When all the processors are running under MIMD control a similar convention to the LD_SIGMA arbitration can be used. This ensures that all OUTPUT instructions occur in the same position in each MIMD program. This scheme can be also seen in Table 5.4, where all the OUTPUT instructions are again distributed across the array, in a similar manner to the LD_SIGMA instructions. In this case the OUTPUT instruction is positioned four instructions after the ESCAPE instruction.

Table 5.4 Arbitration of LD_SIGMA Instructions

TIME      PROCESSOR 1    PROCESSOR 2    PROCESSOR 3    PROCESSOR 4

TIME 0    LD_WEIGHT      ADD_PAR_MSW    ADD_PAR_LSW    MULTT
TIME 1    ESCAPE         LD_WEIGHT      ADD_PAR_MSW    ADD_PAR_LSW
TIME 2    FETCH          ESCAPE         LD_WEIGHT      ADD_PAR_MSW
TIME 3    LD_SIGMA       FETCH          ESCAPE         LD_WEIGHT
TIME 4    FETCH          LD_SIGMA       FETCH          ESCAPE
TIME 5    INPUT_PTR_L    FETCH          LD_SIGMA       FETCH
TIME 6    FETCH          INPUT_PTR_L    FETCH          LD_SIGMA
TIME 7    INPUT_PTR_H    FETCH          INPUT_PTR_L    FETCH
TIME 8    FETCH          INPUT_PTR_H    FETCH          INPUT_PTR_L
TIME 9    OUTPUT         FETCH          INPUT_PTR_H    FETCH
TIME 10                  OUTPUT         FETCH          INPUT_PTR_H
TIME 11                                 OUTPUT         FETCH
TIME 12                                                OUTPUT
TIME 13   ...

•Some Processors Running under MIMD Control It is expected that some processors will operate under SIMD control, while others are under MIMD control. In this case, the


method of bus arbitration discussed above is no longer safe. The best method to use in this instance is to insert NOP instructions to delay the OUTPUT instruction. By calculating the number of NOP instructions that are required, the OUTPUT instruction can be executed during the same cycle that an LD_INPUT instruction would have been executed had the processor remained in SIMD control.

This operation is shown in Table 5.5. In this table Processor 2 and 4 are operating under MIMD control, while processor 1 and 3 are still under SIMD control. By inserting three NOP instructions before the OUTPUT instruction, processors 2 and 4 have become synchronised with the SIMD control, allowing the OUTPUT instruction to be executed without any conflicts.

Table 5.5 Execution of SIMD Program Across an Array

TIME      PROCESSOR 1    PROCESSOR 2    PROCESSOR 3    PROCESSOR 4

TIME 0    LD_WEIGHT      ADD_PAR_MSW    ADD_PAR_LSW    MULTT
TIME 1    LD_INPUT       LD_WEIGHT      ADD_PAR_MSW    ADD_PAR_LSW
TIME 2    RESET_ACC      ESCAPE         LD_WEIGHT      ADD_PAR_MSW
TIME 3    MULTI          FETCH          LD_INPUT       LD_WEIGHT
TIME 4    MULTI          LD_SIGMA       RESET_ACC      ESCAPE
TIME 5    MULTI          FETCH          MULTI          FETCH
TIME 6    MULTI          INPUT_PTR_L    MULTI          LD_SIGMA
TIME 7    MULTI          FETCH          MULTI          FETCH
TIME 8    MULTI          INPUT_PTR_H    MULTI          INPUT_PTR_L
TIME 9    MULTI          FETCH          MULTI          FETCH
TIME 10   MULTI          NOP            MULTI          INPUT_PTR_H
TIME 11   MULTT          FETCH          MULTI          FETCH
TIME 12   ADD_PAR_LSW    NOP            MULTI          NOP
TIME 13   ADD_PAR_MSW    FETCH          MULTT          FETCH
TIME 14   LD_WEIGHT      NOP            ADD_PAR_LSW    NOP
TIME 15   LD_INPUT       FETCH          ADD_PAR_MSW    FETCH
TIME 16   RESET_ACC      OUTPUT         LD_WEIGHT      NOP
TIME 17   MULTI          FETCH          LD_INPUT       FETCH
TIME 18   MULTI          BIAS_LLSB      RESET_ACC      OUTPUT
TIME 19   MULTI          FETCH          MULTI          FETCH

The techniques presented above for processor synchronisation may appear rather contorted. However it is not expected that this will be done manually: a compiler would be developed to manage these synchronisation functions, freeing the network developer of this task.


5.6.1.2 Initialisation and Synchronisation

The processor initialisation is performed by a reset, which loads zero into all the registers and pointers. This action requires that the MIMD program starts at a base address of 0h0000 with the code required to initialise the first virtual neuron.

After this first virtual neuron has been initialized each processor is synchronised before commencing with the first network update. This synchronisation is performed by the STOP instruction, and by the externally generated start signal.

Upon completion of a network update each processor must be reset ready for the next network update. To perform this the RETURN instruction is used, which loads 0h0000 into the weights table pointer, thus starting the initialisation sequence again. Repeating this process for each network update ensures that the processors remain synchronised with each other, and with the external input and output devices.

An example of this initialisation and synchronisation is shown in Listing 5.3.

# Virtual Neuron Definition with Synchronisation

BIAS_LLSB 0h53     #Load Bias into Partial Sum (0hfffff753)
BIAS_HLSB 0hf7
BIAS_LMSB 0hff
BIAS_HMSB 0hff
INPUT_PTR_L 0h00   #Set Input Pointer for Virtual Neuron (0h0000)
INPUT_PTR_H 0h00
STOP               #Wait for Synchronisation from SIMD Control
0h34               #Weight Values for Neuron

ESCAPE             #End of Weights
LD_SIGMA           #Calculate Neuron Output by Sigma Lookup Table
INPUT_PTR_L 0h00   #Set Input Pointer for Neuron's Output (0h0300)
INPUT_PTR_H 0h03
OUTPUT             #Write Neuron Output into Input Table
RETURN             #Return to Start of MIMD Program

Listing 5.3 Initialisation and Synchronisation


5.6.1.3 Implementation of Input Window

The use of an input window to represent the input layer of a signal processing network was presented in Chapter 3, with two different approaches given to its implementation. (The method used in the TMS320 required specialised memory, while the Motorola DSP56000 used a dedicated address generation unit.) In this architecture an approach similar to the Motorola DSP56000 is used, with the Modulo Address Generator performing the necessary functions.

To make use of this Modulo Address Generator some specialised programming techniques are required. Before describing these, the general concept of how the input window is to be implemented will be covered.

Diagram 5.3 Input Window Implementation
[The pointer moves through the 246 byte input window during a network iteration; ADD_MOD_STEP then moves the pointer back towards the start of the window (mod step -240 bytes) ready for the next network iteration]

Diagram 5.3 shows how the input window for the Tx Extraction Network would be implemented. This input window uses forty-one time slices, each composed of six frequency bands. These samples are labelled as below:


t0, t-1 .. t-40    Most recent to least recent time slice
f1, f2 .. f6       Six frequency bands

At the start of each network update the input pointer should be pointing at the sample representing t-40, f6, since this is the first input value to be processed. As the 1st hidden layer is processed the input pointer will increment through this input window, and upon completion will point to the memory location just after sample t0, f1.

Before the next network update can commence this input pointer has to be moved back within the input window. This requires an offset to be added to the input pointer, so that the pointer is set to the sample representing t-39, f6. The reason for the pointer returning to this sample is that during the next network update this sample will represent t-40, f6.

This action is performed by the ADD_MOD_STEP instruction. This instruction takes the offset, which should have been loaded by the previous instructions, and adds it to the input pointer. The value used for the mod step should be calculated as below:

(Number of Time Slices - 1) x (Number of Samples in each Time Slice)

The effect of this action is that the input window will slowly move through the input table as each network iteration occurs. The use of the modulo address generation unit limits this movement to a modulo sized buffer within the input table.
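The pointer behaviour can be sketched as a circular buffer. For the Tx network the mod step is (41 - 1) x 6 = 240, applied as -240 so that the window advances by exactly one time slice per network update; the class below is illustrative only:

WINDOW = 246   # input window size in bytes

class ModuloPointer:
    def __init__(self, base):
        self.base, self.offset = base, 0
    def next(self):                      # LD_INPUT: read and increment
        addr = self.base + self.offset
        self.offset = (self.offset + 1) % WINDOW
        return addr
    def add_mod_step(self, step):        # ADD_MOD_STEP
        self.offset = (self.offset + step) % WINDOW

p = ModuloPointer(0x0000)
for _ in range(WINDOW):                  # one pass over the input window
    p.next()
p.add_mod_step(-240)                     # back to the new t-40, f6
print(p.offset)                          # 6: window advanced one slice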

One last function is required to implement this input window fully. This is required since the modulo address generation is only necessary for the 1st hidden layer, while the remaining layers require fixed locations for their inputs and outputs. To enable this a software switch is used to turn the modulo address generation on and off as required. This switch not only selects between modulo address generation and linear address generation, but also swaps between two different input table pointer registers. These two registers are termed:

Linear Input Pointer Register


Modulo Input Pointer Register

The reason that two registers are required is so that the current position of the input window, which will be held in the modulo input pointer register, is not overwritten when the following layers are updated.

Listing 5.4 shows the implementation of the input window that is used in the Tx Extraction Network.

# TX Extraction Network Description
# Processor 1 Weight Table
# First Hidden Layer Neuron 1
# Input Data from Input Window
# Output 0h0100

BIAS_LLSB 0ha3     #Load Bias into Partial Sum (0h000000a3)
BIAS_HLSB 0h00
BIAS_LMSB 0h00
BIAS_HMSB 0h00
MODULO_ON          #Turn Modulo Address Generation On for 1st Hidden Layer
STOP               #Stop at End of Network Iteration
0h02               #Weight Values for First Layer Neuron
0h32
0hf3               #246 Weight Values for 1st Hidden Layer

ESCAPE             #End of Weights
LD_SIGMA           #Calculate Neuron Output by Sigma Lookup Table
MODULO_OFF         #Turn Modulo Address Generation Off for Remaining Layers
INPUT_PTR_L 0h00   #Set Input Pointer for Neuron's Output (0h0100)
INPUT_PTR_H 0h01
OUTPUT             #Write Neuron's Output into Input Table
INPUT_PTR_L 0h10   #Load Modulo Step into Input Pointer (0hff10 or -240 decimal)
INPUT_PTR_H 0hff
ADD_MOD_STEP       #Move Modulo Input Pointer to Start of Input Window

Listing 5.4 Input Window Implementation

5.6.1.4 Communication of Partial Sums

The communications of partial sums between two processors requires special consideration, primarily due to the synchronisation that is required to achieve this. To synchronise this transfer, the two processors must transmit and receive the data during the same clock cycles; NOP instructions are again used to achieve this.

The listings of two processors that transfer a partial sum between them are given in Listing 5.5. Here the first processor has a NOP instruction bringing it into synchronisation with the second processor.


The transfer of the partial sum relies upon the transmitting processor executing the LD_SIGMA instruction. This instruction puts the 32 bit partial sum onto the Sigma Bus, where it can be read by the receiving processor. To allow the partial sum to be added into the receiving processor's partial sum, this instruction is executed twice, with the receiving processor adding the LSWord during the first cycle, and the MSWord during the second cycle. Two different instructions are used to distinguish between these cycles on the receiving processor: LD_PAR_LSW and LD_PAR_MSW.

The execution sequence in Table 5.6 shows how four processors can be synchronised. In this table the partial sum from processor 3 is transferred to processor 1, and from processor 4 to processor 2.

#Processor to Read Partial Sum

0h32
0hf3
ESCAPE        #End of Weights
NOP           #Synchronise Processors
LD_PAR_LSW    #Add Partial Sum LSWord
LD_PAR_MSW    #Add Partial Sum MSWord

#Processor to Output Partial Sum

0h8A
0ha5
ESCAPE
LD_SIGMA      #Output Partial Sum
LD_SIGMA      #Output Partial Sum

Listing 5.5 Partial Sum Communications

There is one small difficulty in achieving this inter-processor synchronisation for partial sum communications. This is due to each MIMD instruction requiring two cycles for its execution. The effect of this is that each NOP instruction inserts a two cycle delay, resulting in the first processor in the array becoming synchronised with the third processor. In fact any odd numbered processor can become synchronised with any other odd numbered processor, but it is impossible to synchronise between an odd and an even processor. This limitation is only related to partial sum communications, and as demonstrated in the following examples, it has very little effect on the flexibility of the network mappings.
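The parity restriction follows directly from the two cycle NOP: processor p lags the broadcast stream by p cycles, while each inserted NOP delays a MIMD program by two cycles (fetch plus execute), so only even offsets can be bridged. A small check, with illustrative naming:

def can_synchronise(i, j):
    # 2-cycle NOPs can only bridge an even difference in pipeline offset
    return (j - i) % 2 == 0

print(can_synchronise(1, 3))   # True:  odd with odd
print(can_synchronise(1, 2))   # False: odd with even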


Table 5.6 Synchronisation of Processors

TIME      PROCESSOR 1    PROCESSOR 2    PROCESSOR 3    PROCESSOR 4

TIME 0    LD_WEIGHT      ADD_PAR_MSW    ADD_PAR_LSW    MULTT
TIME 1    ESCAPE         LD_WEIGHT      ADD_PAR_MSW    ADD_PAR_LSW
TIME 2    FETCH          ESCAPE         LD_WEIGHT      ADD_PAR_MSW
TIME 3    NOP            FETCH          ESCAPE         LD_WEIGHT
TIME 4    FETCH          NOP            FETCH          ESCAPE
TIME 5    LD_PAR_LSW     FETCH          LD_SIGMA       FETCH
TIME 6    FETCH          LD_PAR_LSW     FETCH          LD_SIGMA
TIME 7    LD_PAR_MSW     FETCH          LD_SIGMA       FETCH
TIME 8                   LD_PAR_MSW                    LD_SIGMA
TIME 9    ...

5.6.2 Application Examples

This section will take several different applications, and show how they would be programmed on this architecture. The examples are chosen as they represent the far ends of the spectrum of the different models that can be implemented by this architecture.

The first of these examples is a fully connected Hopfield style network, and the second is a state feedback network. The remaining two examples show how this architecture can be used to implement conventional digital signal processing models. Two different filters are shown: finite impulse response (FIR) and infinite impulse response (IIR).

5.6.2.1 Fully Connected Network

Fully connected networks represent a different approach to the use of neural networks than those considered in Chapter 2 and Chapter 3. These networks can be used as associative memories, that is they can recall patterns from partial inputs. These networks are primarily used in visual, and other pattern recognition applications, for pre-processing.

They are constructed from a single layer of neurons, with each output being used to provide full feedback to the input. A diagram of a three neuron, fully connected network is shown in Diagram 5.4.

The operation of this type of network is usually asynchronous in nature, and analogue techniques are commonly used to implement it. However this architecture is able to implement these networks, although it does so in a synchronous manner. Listing 5.6 shows how the network in Diagram 5.4 would be implemented.

Diagram 5.4 Fully Connected Network
[A single layer of three neurons (N:1 - N:3), with inputs at 0h0000 - 0h0002 and outputs at 0h0100 - 0h0102, each output providing full feedback to the input]

# Fully Connected Network

# Definition for First Neuron
# Input Data 0h0000 and 0h0100 - 0h0102
# Output 0h0100

BIAS_LLSB 0h00     #Reset Partial Sum (0h00000000)
BIAS_HLSB 0h00
BIAS_LMSB 0h00
BIAS_HMSB 0h00
INPUT_PTR_L 0h00   #Set Input Pointer for Input Value (0h0000)
INPUT_PTR_H 0h00
STOP               #Wait for Synchronisation
0h34               #Weight Values for Neuron
ESCAPE
INPUT_PTR_L 0h00   #Set Input Pointer for Feedback Values (0h0100)
INPUT_PTR_H 0h01
LOCAL_OFF          #End of MIMD Instructions
0h85
0hf4
0hf6
ESCAPE             #End of Weights
LD_SIGMA           #Calculate Neuron Output by Sigma Lookup Table
INPUT_PTR_L 0h00   #Set Input Pointer for Neuron's Output (0h0100)
INPUT_PTR_H 0h01
OUTPUT             #Write Neuron Output into Input Table
RETURN

Listing 5.6 Fully Connected Network


A single neuron (N:1) is shown; this takes its input from 0h0000, and uses the feedback values from 0h0100 - 0h0102. The output from this neuron is written to 0h0100.

5.6.2.2 State Feedback Network

The use of state feedback networks was presented in Chapter 3. This technique enables a few output units to be fed back to the input layer, so providing some means of holding the previous state of the network. Diagram 5.5 shows a state feedback network, with two output units providing feedback to the input layer. In this network a two cycle delay is imposed on the feedback.

Diagram 5.5 State Feedback Network
[An input layer (0h0000, 0h0001) and a four-neuron output layer (N:1 - N:4), with two output units fed back to the input layer through delay units (D:1,1, D:1,2, D:2,1, D:2,2) imposing a two cycle delay]

Listing 5.7 shows how this network would be implemented. This listing shows one of the neurons (N:1), and one of the delay units (D:1,1). The important technique used in this implementation is the delay unit. This is implemented by a single input virtual neuron, with a weight value of


1. The only special technique used here is that the partial sum is used as the output, with the sigma function not being performed. This will have the effect of passing the input to the output unchanged.

# State Feedback Network
# Definition for First Neuron
# Input Data 0h0000 - 0h0001 and 0h0106 - 0h0107
# Output 0h0100

BIAS_LLSB 0h00     #Reset Partial Sum (0h00000000)
BIAS_HLSB 0h00
BIAS_LMSB 0h00
BIAS_HMSB 0h00
INPUT_PTR_L 0h00   #Set Input Pointer for Input Value (0h0000)
INPUT_PTR_H 0h00
STOP               #Wait for Synchronisation
0h3A               #Weight Values for Neuron
0hf6
ESCAPE
INPUT_PTR_L 0h06   #Set Input Pointer for Feedback Values (0h0106)
INPUT_PTR_H 0h01
LOCAL_OFF          #End of MIMD Instructions
0h85
0hfA
ESCAPE             #End of Weights
LD_SIGMA           #Calculate Neuron Output by Sigma Lookup Table
INPUT_PTR_L 0h00   #Set Input Pointer for Neuron's Output (0h0100)
INPUT_PTR_H 0h01
OUTPUT             #Write Neuron Output into Input Table

# Definition for Delay Unit 1:1
# Input from 0h0102
# Output 0h010A

BIAS_LLSB 0h00     #Reset Partial Sum (0h00000000)
BIAS_HLSB 0h00
BIAS_LMSB 0h00
BIAS_HMSB 0h00
INPUT_PTR_L 0h02   #Set Input Pointer for Input Value (0h0102)
INPUT_PTR_H 0h01
LOCAL_OFF
0h01               #Use Weight Value 1 for Delay Unit
ESCAPE
INPUT_PTR_L 0h0A   #Set Input Pointer for Delay Unit Output (0h010A)
INPUT_PTR_H 0h01
OUTPUT             #Write Delay Unit Output into Input Table
RETURN

Listing 5.7 State Feedback Network

5.6.2.3 Finite Impulse Response Filter

A simple model of a finite impulse response filter is shown diagrammatically in Diagram 5.6. Here there are four time samples (t0, t-1, ...), each weighted by a constant (h0, h-1, ...). The filter output is represented by the sum of these products (Yt).

It is easy to see how such a filter can be implemented on this architecture. The filter operates in the same way as a neuron does, except that no sigma function is used on the output. Listing 5.8 shows how this filter would be implemented.

Diagram 5.6 Finite Impulse Response Filter
[Time samples t0, t-1, t-2, ... are weighted by the constants h0, h-1, h-2, ... and summed to give the filtered output Yt, written to 0h0100]

# Finite Impulse Response Filter
# Input from Input Window
# Output 0h0100

BIAS_LLSB 0h00     #Reset Partial Sum (0h00000000)
BIAS_HLSB 0h00
BIAS_LMSB 0h00
BIAS_HMSB 0h00
MODULO_ON          #Use Modulo Address Generation for Input Window
LOCAL_OFF
0h31               #Filter Constants
0h9c
0h9a
0h83
ESCAPE
MODULO_OFF         #Turn Modulo Address Generation Off for Filter Output
INPUT_PTR_L 0h00   #Set Pointer for Filter Output (0h0100)
INPUT_PTR_H 0h01
OUTPUT             #Write Filter Output to Input Table
INPUT_PTR_L 0hfd   #Load Mod Step (0hfffd or -3 decimal)
INPUT_PTR_H 0hff
ADD_MOD_STEP       #Move Back to Start of Input Window
RETURN

Listing 5.8 Finite Impulse Response Filter
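As a behavioural reference, the four-tap filter of Listing 5.8 computes Yt = h0*x(t) + h-1*x(t-1) + h-2*x(t-2) + h-3*x(t-3). A minimal sketch with illustrative coefficients and input:

def fir(h, x):
    # y[t] is defined once len(h) - 1 samples have arrived
    return [sum(h[k] * x[t - k] for k in range(len(h)))
            for t in range(len(h) - 1, len(x))]

print(fir([1, 2, 3, 4], [1, 0, 0, 0, 0, 1]))   # impulse response terms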

5.6.2.4 Infinite Impulse Response Filter

To provide an infinite impulse response filter requires the use of some feedback; this allows the filter to hold a representation of the previous inputs. The feedback is provided by taking a function of the output, and providing this as an additional input. This is shown in Diagram 5.7.

The implementation of this type of filter is again possible on this architecture. The interesting point to note is how the input to the filter must swap between using modulo address generation and using linear address generation. This is required as the input samples require an input window, while the feedback value requires a fixed location. This implementation is shown in Listing 5.9.

# Infinite Impulse Response Filter
# Input from Input Window and 0h0100
# Output 0h0100

BIAS_LLSB   0h00    #Reset Partial Sum (0h00000000)
BIAS_HLSB   0h00
BIAS_LMSB   0h00
BIAS_HMSB   0h00
MODULO_ON           #Select Modulo Address for Input Samples
LOCAL_OFF
0h31                #Filter Constants
0h9c
0h9a
0h83
ESCAPE
INPUT_PTR_L 0hfd    #Load Mod Step (0hfffd or -3 decimal)
INPUT_PTR_H 0hff
ADD_MOD_STEP        #Move back to Start of Input Window
MODULO_OFF          #Turn Modulo Off for Feedback Value
INPUT_PTR_L 0h00    #Set Pointer for Filter Output (0h0100)
INPUT_PTR_H 0h01
LOCAL_OFF
0h82
ESCAPE
INPUT_PTR_L 0h00    #Set Pointer for Filter Output (0h0100)
INPUT_PTR_H 0h01
OUTPUT              #Write Filter Output to Input Table
RETURN

Listing 5.9 Infinite Impulse Response Filter
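The corresponding functional model for the IIR case simply adds the single feedback term to the window sum. Again this is only an illustrative C sketch, with all names assumed.

#include <stdint.h>

#define TAPS 4  /* assumed number of filter constants */

/* Functional model of the IIR filter of Diagram 5.7: the previous output
   is one extra weighted input alongside the modulo-addressed window. */
int32_t iir_output(const int8_t window[TAPS], const int8_t h[TAPS],
                   int8_t fb_weight, int8_t prev_output)
{
    int32_t yt = 0;
    for (int k = 0; k < TAPS; k++)
        yt += (int32_t)h[k] * window[k];     /* input window terms           */
    yt += (int32_t)fb_weight * prev_output;  /* feedback from fixed location */
    return yt;
}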

5.6.3 Tx Extraction Implementation

This section will take the Tx Extraction Network that was presented in Chapter 3, and show how this architecture could be used to implement it. Three different implementations are given; these show how the mapping of a network can be varied to achieve different objectives. The three examples target silicon efficiency, reduced sample period and reduced response time.

This section uses the same technique that was used in Chapter 4 to represent the network mapping. It should be stressed that these measurements are only a first order approximation. The actual efficiency of the architecture must take into account the proportion of time spent in SIMD and MIMD control; this is dealt with in more detail in Section 6.10.1.


[Diagram 5.7 Infinite Impulse Response Filter: the input window samples are weighted by h0, h-1, h-2 and summed with the weighted previous output yt-1 to give the filtered output Yt]

At the end of this section a comparison between each of these mappings can be made to the SIMD implementation shown in Chapter 4. These comparisons are then used to form some conclusions on the success of this architecture.

5.6.3.1 Silicon Efficiency

In this network mapping six processors are used in a non-pipelined manner. This mapping results in a trade-off between the slightly longer sample period (258 time units) and the reduced number of processors that are used (6 processors). However the response time is fairly low (258 time units) due to the use of a non-pipelined mapping. This mapping can be summarised in Table 5.7.


Table 5.7 Silicon Efficient

            1st Hidden Layer    2nd Hidden Layer    Output Layer
Mapping     ................... 6 PEs ...................
Iteration   246                 6                   6 (5 PEs Idle)

5.6.3.2 Reduced Sample Period

In this arrangement 13 processors are used; these are arranged to optimise the sample period. This is achieved by scheduling each 1st hidden layer neuron across two processors (requiring 12 processors and 123 time units). The 2nd hidden layer and output layer can all be implemented by the 13th processor (requiring 42 time units). This arrangement requires a two stage pipeline for each network update, with the processing of the 1st hidden layer being overlapped with that of the 2nd hidden layer and output layer. By using this arrangement the sample period is considerably reduced (123 time units) with the response time remaining fairly slow (246 time units). This mapping is summarised in Table 5.8.

Table 5.8 Reduced Sample Period

              1st Hidden Layer    2nd Hidden Layer    Output Layer
Mapping       12 PEs              .......... 1 PE ..........
Iteration 1   123
Iteration 2                       36 + 6 + 81 Idle

5.6.3.3 Reduced Response Time

In this arrangement 12 processors are used in a non-pipelined manner. Each neuron in the 1st and 2nd hidden layers will be scheduled across 2 processors, so halving the time for each neuron (123 time units). The output layer is then implemented by 6 processors, which reduces the time for this layer to 1 time unit. Because this mapping is implemented sequentially the response time is considerably reduced (127 time units), as is the sample period (127 time units). Table 5.9 shows this mapping.

Table 5.9 Reduced Response Time

            1st Hidden Layer    2nd Hidden Layer    Output Layer
Mapping     ................... 12 PEs ...................
Iteration   123                 3                   1 (6 PEs Idle)

5.6.4 Summary of Implementation Examples

The application examples given in this last section have shown a wide range of possibilities for this architecture. The ability of this architecture to implement different networks, and even conventional signal processing models, has highlighted this flexibility.

The three examples used in the different mappings of the Tx Period Extraction Network were used to demonstrate the variations obtained using a flexible mapping of a network onto an array. These examples used a range of different mappings, to obtain the required real-time performance, or to minimise the hardware.

To be able to form a more solid conclusion, some comparisons are made between this architecture and a conventional SIMD array as presented in Chapter 4. To form a comparison several measures are used. These measures were chosen as they show the performance, efficiency and cost of each implementation. The measures used were as below:

•No. Processors The number of processors required by an implementation is used as a measure of the silicon cost, and also can be used to represent the power consumption of each implementation.

•Sample Period This represents the pipeline iteration time. This period determines the speed that input can be accepted by the network.


•Response Time This represents the time taken for a change at the input to affect the outputs. This measure is calculated by taking the product of the pipeline iteration time and the number of pipeline stages.

•Efficiency This is the percentage of active synaptic update time units out of the total number of synaptic update time units taken by an implementation.

Table 5.10 summarises these measures for all the implementations shown. This summary also includes the SIMD implementation given in Chapter 4 so that it can be used for comparison.

Table 5.10 Summary of Solutions

                 SIMD      Silicon      Reduced     Reduced
                 Array     Efficient    Sample      Response

No. Processors   13        6            13          12
Sample Period    246       258          123         127
Response Time    738       258          246         127
Efficiency       47.5%     98%          95%         99.6%

From these examples it is now possible to formulate some conclusions on how successful this architecture has been in achieving its objectives. Before doing this it is worth considering the beneficial properties from MIMD arrays that this architecture attempted to capture.

•High Flexibility The examples shown above indicate the flexibility this architecture offers. The range of different networks, and even the ability to implement conventional digital signal processing models, has highlighted this feature.

•Software Programmable The fact that all the above implementations were defined by the MIMD programs shows that this architecture is fully software programmable.


•High Efficiency The efficiency figures given in Table 5.10 show the considerably higher efficiency of this architecture over conventional SIMD array techniques. Figures of well over 90% show that this architecture has been able to make full use of its potential processing power.

To be able to state whether this architecture has achieved the remaining objectives requires an in-depth look at an implementation using this architecture. The next chapter covers such an implementation, gives the details of a CMOS device, and discusses future developments for this architecture.

Chapter 6

Detailed Description of Design

An architecture has been presented that attempts to achieve high performance in small-scale neural implementations. This architecture has been developed to combine beneficial properties from both MIMD and SIMD arrays.

A series of application examples have been presented to demonstrate the flexibility and the performance efficiency that this architecture can achieve. The conclusions from this are that the architecture presented does implement the desirable properties of MIMD architectures that it was attempting to achieve. The purpose of this chapter is to further illustrate the architecture in order to assess how successful it has been in capturing the desirable properties of SIMD arrays as outlined in Chapter 4.

6.1 Design of CMOS Implementation

The most important property that this architecture is attempting to capture from SIMD arrays is low complexity. This factor would ensure both a low silicon area and low power consumption: key requirements in any small-scale portable device.

To ensure that a low complexity is achieved, a full custom layout was decided upon. The tool used for this custom design was MAGIC. This VLSI layout tool comes with the University of California at Berkeley VLSI Design Tools, and was the only VLSI layout tool available at UCL at the time of development. (For a full description of the design environment, and the interaction between the different tools used, the reader should consult Section 6.10.)

The decision to use a full custom layout tool presented the daunting task of designing a full datapath and control logic. This chapter will discuss the design of these sections, and the steps involved in the development of a device using the presented architecture. This discussion will follow a similar order to the chronological events that took place in this design.

The first section to be developed was the datapath. This involved first finding suitable algorithms with which to implement the arithmetic functions, and then building the required functional units to implement these algorithms.

Once the datapath was developed, and the arithmetic algorithms had been fully tested on the datapath, the instruction decoding logic could be developed. Following from this the instruction selection hardware was built, and fully tested.

The design of an individual processor was then fully complete, allowing for the development of the SIMD control block to be carried out, and remaining bus control functions to be implemented. The final simulations were then performed; this took the form of a full gate level simulation of two processors, each with its own weights table and a shared input and sigma function look-up table.

6.2 Algorithms used for the Arithmetic Functions

Before considering the algorithms that were developed, the representation of the different data values had to be decided upon. In this design there are four basic data types:

•Input Value and Weight Value During the development of the Tx Extraction Network it was decided to use an 8 bit 2's complement integer to represent the weights and the input values. Simulations had been carried out on various data lengths, concluding that 8 bits provided a safe margin of accuracy in these models for the recognition phase.

•Address Pointers The size of the address pointers has a direct effect on the maximum size of network that can be implemented. The size was chosen to be 16 bits; this provides an adequate address range for most networks, and a convenient size for the storage of the pointers.


•Synaptic Product The size of this value is dependent upon the length of both the input and weight values. In this implementation 8 bits are used to represent both these values, resulting in the synaptic product requiring 16 bits for its representation.

•Partial Sum The size of the partial sum is again fairly arbitrary. All that is required is that it should be large enough to ensure that overflows do not become too frequent. The size chosen for this value was 32 bits; this again represented a convenient length for its storage.

After considering the different data representations it was decided to use a 16 bit datapath, with special instructions being included to process the inputs and weights, and for the double length partial sum. The size of the datapath was chosen primarily to ensure that each pointer could be updated in a single clock cycle: this is important since pointer manipulations form a large part of the processor's operations.

The arithmetic operations to be performed on the datapath were then considered. These operations were broken down into three groups, and considered separately.

6.2.1 Pointer Manipulations

One pointer manipulation is the increment of the relevant pointer during each memory fetch. This requires a conventional adder unit, with the ability to set one of the inputs to zero, and the carry_in high. The only other pointer manipulation function is the ADD_MOD_STEP instruction, which requires the use of a conventional adder with no additional facilities.

6.2.2 Partial Sum Addition

The partial sum additions require additional control and selection logic on the datapath to cater for the different length operands: the synaptic product being 16 bits wide, while the partial sum is 32 bits wide.

As mentioned previously any operation that is carried out on the partial sum is performed over a double length instruction. Between these two instructions the carry generated during the first addition instruction must be carried into the second addition, and the sign of the synaptic product has also to be carried through to an operand in the second addition. The sign extension is needed to convert the 16 bit synaptic product into a 32 bit value. This requires either a 0 extension (0h0000) for a positive synaptic product, or a -1 extension (0hffff) for a negative synaptic product. The algorithm used for this addition of the synaptic product to the partial sum can be described thus:

LSWord Partial Sum := LSWord Partial Sum + Synaptic Product
IF Sign of Synaptic Product is +ve
    MSWord Partial Sum := MSWord Partial Sum + 0h0000 + Carry
ELSE
    MSWord Partial Sum := MSWord Partial Sum + 0hffff + Carry
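The same double length addition can be modelled in C with the two 16 bit cycles made explicit. This is a sketch only (saturation is ignored here, and all names are illustrative):

#include <stdint.h>

/* Model of the two cycle partial sum addition: the first cycle adds the
   synaptic product into the LSWord, the second adds the sign extension
   word (0h0000 or 0hffff) plus the carry into the MSWord. */
void add_synaptic_product(uint16_t *ls_word, uint16_t *ms_word, int16_t product)
{
    uint32_t first  = (uint32_t)*ls_word + (uint16_t)product;
    uint16_t extend = (product < 0) ? 0xffff : 0x0000;  /* sign extension */
    uint16_t carry  = (uint16_t)(first >> 16);          /* carry through  */

    *ls_word = (uint16_t)first;
    *ms_word = (uint16_t)(*ms_word + extend + carry);   /* second cycle   */
}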

The other complication in this partial sum addition is the requirement for saturated arithmetic; this ensures that the sign does not alter if an overflow occurs. Saturated arithmetic operates during the second addition cycle, and requires special logic to be placed on the output of the addition unit. This logic performs the following algorithm:

If No Overflow
    MSWord Partial Sum := Output of Adder
If Overflow from +ve to -ve
    MSWord Partial Sum := 0h7fff
If Overflow from -ve to +ve
    MSWord Partial Sum := 0h8000

6.2.3 Multiplication Instructions

The multiplication algorithm requires special consideration since it represents the main function for each synaptic update. To ensure maximum throughput, the algorithm should not require any more cycles than the minimum required for the loading of the operands and the computing of the product. This factor plays heavily in the choice of the algorithm since it must cater for:

•Unsigned Multiplication (+ve x +ve)
•Mixed Multiplication (+ve x -ve and -ve x +ve)
•Signed Multiplication (-ve x -ve)

The algorithm chosen was taken from the Texas Instruments 74888/74890 Bit Slice Processor[ ]. This algorithm can perform all the above multiplication types, with no time overheads in sign correction after the product has been computed.

The algorithm requires three registers for the multiplication, two of these being general purpose registers (acc and reg_b), and the third being a special shift register (reg_sh). Before a multiplication can start, the registers should be loaded as below:

acc     zero
reg_b   multiplicand
reg_sh  multiplier

The multiplicand holds the negative integer during mixed and signed multiplication, allowing it to be adjusted during the last multiplication iteration if required. This operation is determined by the 'Mode Bit' in the multiplication algorithm. To ensure that the operands are loaded into the correct registers the following algorithm is used:

Load Weight into reg_b and reg_sh
If Sign of Weight is -ve
    Load Input into reg_b
Else
    Load Input into reg_sh

This algorithm is implemented in hardware, thus reducing the number of instruction cycles required to load the operands.


Each multiplication requires N iterations to provide the 2N bit result (where N = size of operands). Upon completion the result is held in the registers as below:

acc     MSByte of product
reg_sh  LSByte of product
reg_b   Unchanged

The multiplication operation can be expressed as the following recursion; this recursion repeats for J = 0 to N. Iterations J = 0 to N-1 are termed iterative cycles (MULTI), while iteration J = N is termed the terminating cycle (MULTT):

P(J+1) = ( P(J) + M( Multiplicand x Multiplier[J] ) ) / 2

Where:
    P(J) = Partial Product for Jth Iteration
    J    = Iteration Number [Repeat for J = 0 to N]
    [J]  = Bit at position J
    /2   = Right Shift [Varies according to iteration type and sign of operands]
    M    = Mode Bit [Varies according to iteration type]

•Right Shift The right shift operates as a double length shift on the partial product. The type of shift mode used varies according to the signs of the operands, and whether it is an iterative cycle or a terminating cycle. A selection mechanism switches between an arithmetic shift, which shifts in the MSBit of the partial sum, and a logical shift which uses the carry out from the adder unit. The selection between these different shift modes is shown in Table 6.1.

•Mode Bit The mode bit is used to perform the sign adjustment on the multiplicand, and is dependent only upon the type of multiplication iteration. During an iterative cycle this mode bit is set to 1, allowing the multiplicand to be added to the partial product unaffected. During the terminating cycle this mode bit is set to -1, selecting the 2's complement of the multiplicand for the addition. This 2's complementation is performed by the negation of the multiplicand, and the setting of the carry in for the adder to high.

Table 6.1 Multiplication Shift Modes

Type of Multiply          Normal Iteration    Final Iteration
Signed Multiplication     MSBit Sum           MSBit Sum
Mixed Multiplication      Carry Out           MSBit Sum
Unsigned Multiplication   Carry Out           MSBit Sum

One last point to mention in this algorithm is the problem encountered during the simulations of large positive and negative integers. It was found that the sign of the final product became corrupt when multiplying large negative integers. To counter this problem the use of a sign guard bit was introduced; this requires the extension of both the multiplier and the multiplicand from 8 bits to 9 bits. This is performed by copying the MSBit from each operand into the sign guard bit. The multiplication is then performed on the extended 9 bit multiplier and multiplicand, resulting in an 18 bit product. The top two bits of this 18 bit product represent two sign guard bits, and can be truncated. The 16 bit product obtained after this truncation can then be used to represent the synaptic product.
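The complete multiplication, including the sign guard extension, can be checked against the following C model. This is a functional sketch of the arithmetic rather than the bit exact shift sequence of the datapath: the mode bit appears here as the subtraction performed on the terminating cycle.

#include <stdint.h>

/* Model of the N iteration multiply: both operands are sign extended from
   8 to 9 bits, iterative cycles (MULTI) add the weighted multiplicand when
   the multiplier bit is set, and the terminating cycle (MULTT) subtracts
   it, corresponding to the mode bit of -1. The two guard bits are then
   truncated from the 18 bit product to leave the 16 bit synaptic product. */
int16_t multiply_model(int8_t weight, int8_t input)
{
    const int N = 9;                                  /* size with guard bit */
    int32_t  multiplicand = weight;                   /* sign extended       */
    uint32_t multiplier   = (uint32_t)(int32_t)input & 0x1FF; /* 9 bit form  */
    int32_t  product = 0;

    for (int j = 0; j < N; j++) {
        if ((multiplier >> j) & 1) {
            if (j == N - 1)
                product -= multiplicand * (1 << j);   /* terminating cycle */
            else
                product += multiplicand * (1 << j);   /* iterative cycle   */
        }
    }
    return (int16_t)product;  /* truncate the two sign guard bits */
}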

6.3 Datapath Construction

The datapath was constructed using conventional bit-slice techniques, and was built to incorporate the functional units required to implement the algorithms developed above. Diagram 6.1 shows the organisation of the datapath and how the data flows through the various functional units.


[Diagram 6.1 Datapath Schematic: Bus A and Bus B link the register block, accumulator, shift register, bus shift, arithmetic unit with saturation control and the vertical bus to the weight, input and sigma address and data buses]

The main functional sections of this datapath are:

•Bus A and Bus B Two sixteen bit wide buses connected to all the functional units, and the external bus controllers.

•Arithmetic Unit Full 16 bit adder, with input conditioning for each operand. The input selection can select between four different conditions: the current bus value, its inverse, 0h0000 and 0hffff.

•Saturated Arithmetic This provides saturated output control on the result from the arithmetic unit.

•Bus Shift and Shift Register These two functional units occupy only half the width of the datapath, and are used to provide a double length shift for the multiplications.

•Register Block and Accumulator These two blocks provide an accumulator, and six general purpose dual-ported registers.

A key point to note on this diagram is how the functional units that only occupy the lower half of the datapath operate. These functions are used in the multiplications and consist of the bus shift, the shift register, and the 9 LSBits of the accumulator. Upon completion of each multiplication the 18 bit product is held with the 9 MSBits in the lower half of the accumulator, and the 9 LSBits in the shift register.

To output this 18 bit product as a 16 bit synaptic product requires the use of the vertical bus. This bus connects the lower 7 bits of the accumulator to the upper 7 bits of bus B. This bus operates simultaneously with the shift register outputting the lower half of the product to the 9 LSBits of bus B. This transfer has the effect of truncating the 18 bit product to a 16 bit synaptic product, with the top two sign guard bits from the accumulator being ignored. This synaptic product can then be added into the lower word of the partial sum. The whole of the above operation is performed in a single clock cycle since this vertical bus performs all the necessary data truncations and transfers between the lower and upper halves of the datapath.

6.3.1 Timing Strategy

Each operation executed on the datapath is completed in a single clock cycle, after which all registers, and status registers, hold their contents in static latches. This clocking scheme allows the device to be single stepped during testing and debugging.

Diagram 6.2 shows the timing strategy used throughout the design. This strategy involves a conventional two phase non-overlapping clock, with an additional phase. This extra phase is required to allow for the hold time of the static latches, and is similar to $2 but with a delay being added to the falling edge. Both $1 and $2 are externally generated, while $3 is generated internally.

6.3.2 Arithmetic Unit

The arithmetic unit consists of a 16 bit parallel adder with conditioning on both inputs. The carry is propagated through the adder without any carry lookahead. This was decided to simplify the design, and because the processor would be memory bound in its operation due to there being no on-chip RAM.


[Diagram 6.2 Clocking Strategy: waveforms of the two phase non-overlapping clock $1 and $2, with the internally generated phase $3]

The input selection to the arithmetic unit allows four possible conditions to be selected. These conditions are required by the arithmetic operations discussed in the previous sections. Table 6.2 shows these four conditions, and the signals that control them.

Table 6.2 Adder Input Selection

02a & 02b    Inva & Invb    Input to Adder
Low          Low            Value on Bus
Low          High           Inverse of Bus
High         Low            0h0000
High         High           0hffff

The arithmetic unit and input selection is shown in Diagram 6.3. This diagram shows the logic used in the input selection, and how the clocking is performed. During $1 the operands are clocked from left to right into the arithmetic unit, while the outputs are clocked from right to left during $3.

6.3.3 Saturated Addition

To perform saturated addition, without any time overheads, relies upon specialised logic added to the output of the arithmetic unit. This logic is controlled by the states of the carry in and carry out from the MSBit in the arithmetic unit. An additional signal is also provided from the instruction decoder to select whether saturated addition is required: saturated arithmetic is only used during the second cycle of partial sum additions.

[Diagram 6.3 Arithmetic Unit: the input selection logic for each operand and its clocking]

The status of the carry in (msb_c_in) and carry out (msb_c_out) from the MSBit are used to select one of three possible outputs; these are shown in Table 6.3. The logic used to evaluate these conditions is designed to operate very quickly: this is required since its output must be evaluated and established before $3 becomes high.

Diagram 6.4 shows the layout of the saturated addition, and how it fits into the datapath. This saturated arithmetic control only operates during $3. This allows the operands to pass unaffected from left to right during $1, and to only be subject to control during $3.

Table 6.3 Saturated Addition Control

msb_c_out    msb_c_in    Saturated Addition Output
Low          Low         Sum from Arithmetic Unit
Low          High        Highest Positive
High         Low         Lowest Negative
High         High        Sum from Arithmetic Unit
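The control of Table 6.3 follows from a standard property of two's complement addition: overflow has occurred exactly when the carries into and out of the MSBit differ. A minimal C sketch of the selection (names illustrative):

#include <stdint.h>

/* Select the saturated MSWord output from the adder sum and the MSBit
   carries, as in Table 6.3. */
uint16_t saturated_output(uint16_t adder_sum, int msb_c_in, int msb_c_out)
{
    if (msb_c_in == msb_c_out)
        return adder_sum;        /* no overflow: pass the sum through    */
    return msb_c_in ? 0x7fff     /* +ve overflow: highest positive value */
                    : 0x8000;    /* -ve overflow: lowest negative value  */
}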


[Diagram 6.4 Saturated Addition Logic: the saturation control, active during $3, selects between the adder sum, 0h7fff and 0h8000]

6.3.4 Double Length Shift

The double length shift is provided by the bus shift and the shift register (reg_sh). These are both positioned to the left of the arithmetic unit in the datapath.

The input to the MSBit of the bus shift is determined by the multiplication algorithm, and is selected by the shift multiplexer, the control for which comes from the instruction decoder. Between these two shift units a link is provided, so that a double-length shift can occur, with the LSBit from the arithmetic unit being used as an input to the shift register (reg_sh).

Diagram 6.5 shows the double length shift, and how the data is clocked through it. During $1 the operands for the arithmetic unit pass unaffected on Bus A and Bus B from left to right. Meanwhile the contents of the shift register (reg_sh) are output via Q bar, and shifted one place to the right. The value shifted into the shift register (reg_sh) is latched during $1 with the LSBit of the sum; this operation is possible since this value becomes stable long before the end of $1.

[Diagram 6.5 Double Length Shift: the shift multiplexer selects the input to the MSBit of the bus shift, with the LSBit of the arithmetic unit output linked into the shift register reg_sh]

During $2 the bus shift operates, resulting in the output from the arithmetic unit being shifted one place to the right.

6.3.5 Execution of a Single Multiply Iteration

To help clarify how the datapath works, and how the various units fit together, Diagram 6.6 and Diagram 6.7 show the data flow for a single multiply iteration. The bold lines in these diagrams show which signals are active during each part of the multiplication operation.

The breakdown of the operations carried out during the two clock phases is as below:


•Operations during $1 During $1 the operands from the accumulator (acc) and from reg_b are output to the Bus A and Bus B respectively. Conditional summation then occurs, with the LSBit of the shift register (reg_sh) acting as the condition flag. Simultaneously with this operation the shift register (reg_sh) is shifted to the right, with the LSBit from the arithmetic unit being shifted into the MSBit.

•Operations during $2 During $2 the result of the conditional summation is made available. This is shifted right, with the selected value from the shift multiplexer being used as the input to the MSBit. The resulting value on Bus A is then latched into the accumulator (acc). Simultaneously with this operation the now shifted value for the shift register (reg_sh) is also latched.

6.3.6 Simulation of Datapath

After the completion of the datapath layout, a full simulation was performed. This simulation was performed using an extracted netlist from the physical layout. This simulation allowed a very accurate analysis of the design to be obtained. Not only could the algorithms be tested, but different configurations in the datapath could be compared. Using this technique several different layout styles were tested, with the most efficient in terms of speed, and silicon area, being chosen.

To carry out this simulation required the generation of extensive test vectors; these were used to simulate the control signals that would eventually be generated by the instruction decoder. To produce these test vectors a microassembler was used, allowing small programs to be written.

These simulations were all performed using RNL. RNL is the switch level simulator available at UCL, and comes with the Northwest Laboratory for Integrated Systems at University of Washington Tool Set. (Section 6.10 discusses the model used by this simulator, and how it fits into the design environment.)

[Diagram 6.6 Multiplication Flow for $1]

[Diagram 6.7 Multiplication Flow for $2]

6.3.7 Datapath Performance

The simulations carried out on the datapath enabled an estimate to be made of its maximum clocking speed. This estimate considers the slowest period required for each clock phase, and the minimum non-overlap time.

Minimum $1 Period The minimum period for $1 is determined by the time for an addition with a carry propagation through the entire length of the addition unit. The time required for this is: 33.9 ns

Minimum $2 Period The minimum period for $2 is determined by the shift operation, followed by the write back of the results into the register block. The time required for this is: 4.8 ns

The non-overlap period between each phase must also be carefully calculated. The two non-overlap periods require different considerations.

Non-overlap between $1 and $2 The non-overlap period between $1 high to $2 high is determined by the operation of the saturated arithmetic control. The time required for this is: 4 ns

Non-overlap between $2 and $1 The non-overlap period between $2 high to $1 high is determined by the hold time of the static latches. This period is built into the delay between the falling edges of $2 and $3, and was extensively simulated to obtain a minimum period of: 4.5 ns.

The non-overlap period between $3 high to $1 high represents a safe margin during which no operations occur. To ensure that this safe margin is always present an additional 2.5 ns delay should be added to the non-overlap period between $2 high to $1 high. This gives a non-overlap period between $2 high to $1 high of: 7 ns.

By taking these minimum periods into consideration an estimate of the fastest clock period can be made: 50 ns (33.9 ns + 4 ns + 4.8 ns + 7 ns = 49.7 ns, rounded up). This will allow the datapath to operate at 20 MHz.

Further simulation using the instruction control, and decoding hardware, showed that the datapath represents the slowest section of the processor. Therefore the clock period, and frequency, determined above are the values that can be used for the complete processor.

The complete datapath operated as required, and had a final size of 2000 µm x 1400 µm using 2 micron CMOS. Plate I at the end of the thesis shows the layout of the datapath, and various functional units.

6.4 Generation of Instruction Decoder

The instruction decoder required some thought as to the type of control methodology to be used. There are two distinct approaches that can be taken in developing the control section[ ], the choice of which can be critical in certain instances. These two approaches are as below:

•Moore Approach This approach to the control relies upon generating a set of commands as determined by the flow of the algorithm. To implement conditional instructions requires complex branching mechanisms, and condition code evaluation.

•Mealy Approach This approach generates a set of commands as determined by the flow of the algorithm, and a set of input conditions. This approach eliminates the need for conditional branching, and can reduce the number of cycles required to implement an algorithm. However these benefits come at the cost of increased complexity in the instruction decoder, as it not only has to decode the opcode, but must also consider all the relevant conditions for each instruction.


The choice was made to use the Mealy approach, since this eliminates the need for complex sequencing, and conditional branching. This is crucial since the use of conditional branches is impossible for processors running under SIMD control.

Once this choice was made, the development of the instruction decoder could commence. This development made use of MPLA, which is the PLA generator that comes with the University of California at Berkeley VLSI Tools.

The use of MPLA requires two distinct steps, first a template file has to be created, and then a personality matrix for the PLA has to be generated. The purpose and execution of these two steps are considered below.

6.4.1 PLA Template

The PLA layout style is defined by a template, which is designed in MAGIC. This template is used to define the tiles that make up the complete PLA. Each tile represents a different section of the PLA layout; for example, one tile is used to define the layout for the bottom of each AND plane, and represents the input to the AND plane in the template style used.

Special customising of the PLA template was carried out to allow the use of clocked static registers on the PLA inputs, and the conditioning of the PLA outputs. By incorporating these features into the PLA template the layout size, and routing requirements for these functions were reduced.

The template used for the instruction decoder is shown in Plate IIa. The important features to note on this template are the input and output control tiles, shown on the bottom of the template. The functions carried out by these tiles are:

•Input tile This tile represents the input to the PLA, and has been customised to include a latched input, and a static storage element. This allows the instruction register, and condition statuses, to be latched into static registers during $2.

•Output tiles These two tiles are functionally similar, and are used to select the output from the PLA. There are two signals used to select this output, providing either the OR plane output, or setting all the outputs to low.

The other tiles to note are the control for these sections:

•<11-and> This tile is used to input and buffer the clock signal. The use of a large buffer stage in this tile removes the need for additional buffer stages to be inserted along the inputs to the PLA.

•Output control This tile provides the output control for the PLA. The signals used to control this tile are derived from a finite state machine, and are used to select between executing the next instruction, or inserting a NOP.

Care has been taken in the development of these tiles to ensure that all the relevant control signals abut. This allows the PLA to be generated to an arbitrary size, with the control signals being stretched to the required length. Plate IIb shows a PLA constructed using this technique. (The PLA shown in this plate is not the instruction decoder.)

6.4.2 PLA Personality Matrix

The second requirement for the PLA generation is the personality matrix; this expresses the function that the PLA is required to implement. Various methods can be used to generate this personality matrix; in this case a special program was developed that could convert a textual description of the PLA into the required format. This program is considered in more detail in Section 6.10.

The description that was used to generate the personality matrix for the instruction decoder is given in Appendix A.3. This description includes every instruction in the instruction set, and considers all the related conditions for each instruction. The penalty for using the Mealy approach can be seen in the large number of terms needed to define some of these instructions. For example the MULTI instruction requires 8 different terms to define the operations for all the relevant conditions.

6.5 Development of Control Switching

The method that is used to control the switching between SIMD and MIMD control was presented in Chapter 5. This scheme uses special instructions in the MIMD program to determine control switching. These instructions are:

LOCAL_OFF   #Select SIMD Control
ESCAPE      #Select MIMD Control
STOP        #Select SIMD Control upon Initiation of Next Network Update

The main point to consider in this discussion is how each of these instructions is decoded, and the effects of these methods. The reason for the different decoding techniques is the different control modes in which each will be executed. The two techniques used are considered below:

•LOCAL_OFF and STOP The LOCAL_OFF and STOP instructions are decoded conventionally by the instruction decoder. This is made possible since the processor will already be executing the MIMD instruction stream, so these instructions will automatically be loaded into the instruction decoder.

•ESCAPE To detect the ESCAPE instruction is more complex since this instruction is read as a weight value, while the processor is under SIMD control. To detect it, special decoding logic is added to the instruction multiplexer.

This operation is made possible by using an exclusive bit pattern for the ESCAPE instruction (0b10000000). This pattern was chosen since it represents the largest 2's complement negative integer, which is redundant in this design.
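The decode itself therefore reduces to a single comparison on the weights bus, sketched below (the function name is illustrative):

#include <stdint.h>

/* The reserved pattern 0b10000000 (-128) never occurs as a weight value,
   so matching it is sufficient to detect the ESCAPE instruction. */
int is_escape(uint8_t weights_bus)
{
    return weights_bus == 0x80;
}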

The effect of these two different decoding techniques is that each control type requires a different pipeline strategy. This is due to the SIMD control requiring an additional cycle in which to detect an ESCAPE instruction. A summary of the two different pipeline schemes is given below:

•MIMD Control The MIMD control requires a two stage pipeline, decode and execute. This pipeline arrangement is fairly conventional, and is shown in Table 6.4.

Table 6.4 MIMD Two Stage Pipeline

Stage 1
    $1  Select Next Instruction
    $2  Decode Instruction

Stage 2
    $1  Output Operands and Execute
    $2  Shift and Write Result

•SIMD Control For SIMD control a three stage pipeline is used. The requirement for this additional stage in the pipeline is to allow time for the ESCAPE instruction to be detected upon each LD_WEIGHT instruction. The other two stages follow the conventional decode and execute pipeline strategy. This three stage pipeline is shown in Table 6.5.


Table 6.5 SIMD Three Stage Pipeline

Stage 1
    $1  Latch SIMD Instruction
    $2  Decode Weights Bus for ESCAPE

Stage 2
    $1  Select Next Instruction
    $2  Decode Instruction

Stage 3
    $1  Output Operands and Execute
    $2  Shift and Write Result

To summarise the pipeline strategy for both SIMD and MIMD control, and how the switching between them occurs, a PERT[2] chart is shown in Diagram 6.8. This shows the pipelining for both MIMD and SIMD control, and how the transition between these two pipeline schemes occurs.

[Diagram 6.8 PERT Chart Showing SIMD and MIMD Control: the three stage SIMD pipeline and two stage MIMD pipeline drawn as loops, with the ESCAPE and LOCAL_OFF or STOP transitions switching between them]

This diagram shows the SIMD control, represented by a dashed line, as three concentric loops. These loops represent the three stage pipeline used for SIMD control. The MIMD control, represented by the dotted line, only has two loops, representing the two stage MIMD pipeline. The transitions between the two pipeline schemes are represented by the two lines labelled 'ESCAPE' and 'LOCAL_OFF or STOP'. The diagram shows how these lines are used to jump between the two pipeline schemes. The 'LOCAL_OFF or STOP' transition results in a change from the two stage pipeline to the three stage pipeline, while the 'ESCAPE' transition changes from the three stage to the two stage pipeline.

Both transitions require special control to synchronise, and to adjust between the different pipelining schemes. The problems found in implementing these transitions are discussed below, and the solutions finally used are presented.

6.5.1 Transition from MIMD to SIMD

This transition is performed by either the LOCAL_OFF or STOP instructions. These two instructions require synchronisation to occur before SIMD control can commence. The different requirements for these two instructions are as below:

•LOCAL_OFF This instruction is used to switch to SIMD control to execute the next virtual neuron. To synchronise this transition relies upon waiting for the synchronisation signal that is sent with the first instruction in the SIMD program. NOP instructions are inserted until this synchronisation signal arrives.

•STOP This instruction is used to synchronise the start of each network update, and requires the processor to wait until the start signal is generated by the external input and output hardware. NOP instructions are again inserted until this signal is set. Once this signal is generated, the processor synchronises with SIMD control using the technique given above.

6.5.2 Transition from SIMD to MIMD

The transition between SIMD and MIMD control presents a problem that is very similar to the execution of conditional branch instructions in conventional processors. Such conditional branch instructions present problems when they are performed on a pipelined architecture[ ]. This problem is highlighted below, and possible solutions to this are presented.

The problem that occurs in conditional operations on a pipeline is that the condition only becomes valid after the instruction that sets the condition has been executed. This results in the condition being set too late for the next instruction to use. This problem is outlined below:

FETCH instr n       FETCH instr n+1     FETCH instr n+2
EXEC  instr n-1     EXEC  instr n*      EXEC  instr n+1

* Condition for instr n valid here.

In this case instr n+1 cannot use the conditions set by instr n. The first instruction that can use these conditions is instr n+2.

Unless special attention is paid, the instruction following the branch instruction (instr n+1) may be incorrectly executed. To ensure this does not occur one of several techniques can be used:

•Page-Mode Access Processors that use on-chip microcode can make use of a technique called page-mode access. An example of such a processor is the Motorola 68000.

The operation of this mode relies upon organising the microcode so that all possible conditional outcomes are held in the same row in memory. This then allows the condition to be evaluated as normal, and the row to be selected at the start of the next clock phase. This scheme eliminates any latency in the conditional branching.

In the example below the row select for instr n+1 is made before the condition is established. Once the condition is established the column select and execution occurs together in the same clock cycle.

ROW SEL instr n      ROW SEL instr n+1    ROW SEL instr n+2
COL SEL instr n-1    COL SEL instr n      COL SEL instr n+1
EXEC    instr n-1    EXEC    instr n*     EXEC    instr n+1

* Condition for instr n valid here.

•Insert a NOP By inserting a NOP after the conditional instruction, an extra cycle is made available during which the conditions can be correctly evaluated. This allows the conditional branch to occur without the accidental execution of the instruction following the conditional branch.

In the example below a NOP instruction is executed before instr n+1. This NOP ensures that all conditions are valid before instr n+1 is executed, thus enabling a conditional branch to occur correctly.

FETCH instr n       FETCH nop           FETCH instr n+1
EXEC  instr n-1     EXEC  instr n*      EXEC  nop

* Condition for instr n valid here.

The use of NOP instructions in this manner can be likened to introducing 'bubbles' into the pipeline scheme. These 'bubbles' can seriously impair the performance of a pipelined processor, and many techniques have been proposed to remove them. These techniques rely upon complex compiler algorithms to resolve any dependencies, hence minimising the cost of conditional branches.


To achieve this, re-ordering of instructions is performed. Compilers for many RISC processors move instructions from either before a conditional branch, or from one of the targets of the conditional branch, to the position just after the branch (instr n+1). The MIPS R3000 does this using a technique called delayed branches, while both the SUN SPARC and MIPS R4000 use a technique called annulling branches.

•Delayed Branch Moving an instruction from before a branch to just after it results in a delayed branch. This instruction is always executed, with the branch occurring on its completion.

In this example the instruction at instr n+1 has been moved from before the conditional branch to just after it. This instruction will always be executed, allowing time for the conditions for instr n to be evaluated. In this example instr n+2 represents the first instruction from the target of the branch.

FETCH instr n       FETCH instr n+1     FETCH instr n+2
EXEC  instr n-1     EXEC  instr n*      EXEC  instr n+1

* Condition for instr n valid here.

•Annulling Branches Shifting an instruction from one of the branch targets to just after the branch instruction, and conditionally executing it, results in an annulling branch.

In the example below instr n+1 is moved from one of the branch targets, and will only be executed if that target proves to be correct. If the branch target is incorrect the instruction is 'annulled' and a NOP is performed.


FETCH instr n       FETCH instr n+1     FETCH instr n+2
EXEC  instr n-1     EXEC  instr n*      EXEC  instr n+1 or NOP

* Condition for instr n valid here.

•Branch Linking

To improve the likelihood of this instruction being executed correctly, the SUN SPARC processor and the MIPS R4000 use two specialised instructions:

Conditional Branch and Link Likely
Conditional Branch and Link Unlikely

These two conditional branch instructions state which of the two branch targets is most common, thus allowing the first instruction from the most frequent branch target to be selected.

•Selected Solution

From these options the annulling branch technique was chosen. This was primarily because there is no convenient instruction that can be moved from before the branch to just after it - so a delayed branch was not possible - and because there is a low probability of the branch being taken.

This solution results in a zero overhead for the branch if control remains under SIMD mode, while a single cycle delay is imposed when a switch is made to MIMD control.

6.5.3 Implementation of Control Switching

The hardware that is used to control the execution of each control type, and to perform the switching between them, is implemented by a finite state machine. This finite state machine controls the instruction multiplexer and the instruction decoder, and provides all the signals required to synchronise and switch between the two different control types. The integration of this finite state machine, for a single processor, is shown in Diagram 6.9.

[Diagram 6.9 SIMD and MIMD Control Selection in a Processor: the local FSM drives the instruction multiplexer, instruction decoder and fetch control, with logic on the weights table bus detecting ESCAPE, and the sync, stop and escape signals chained to the preceding and succeeding processors]

This finite state machine is shown in Diagram 6.10. This diagram shows the states that control the execution of both SIMD and MIMD control, and how the transitions between these occur. This finite state machine uses the Mealy approach, which means that the output from the finite state machine depends upon the current state, and upon the input conditions.

To represent this diagrammatically a standard format is used[3]. In this format each state is represented by a circle, and each transition by a line segment. The states are labelled inside each circle, with the transitions being labelled on each line segment. The transitions are labelled in a format that represents the input conditions that result in the transition, and the output signals that are set by the finite state machine. This format uses a slash '/' to separate the input conditions and the output signals.


[Diagram 6.10 Local Finite State Machine: states GLOBAL, WAIT, FETCH, LOCAL, FINISH and NEXT_IT, with transitions labelled in the condition/output format (for example escape/sel_fetch, sync/sel_simd, local_off/nop, stop/nop and start/nop); reset enters the GLOBAL state asynchronously]

A brief description of each state is given below:

•GLOBAL This state selects SIMD control, and the FSM will remain in this state until the ESCAPE instruction is detected.

•FETCH and LOCAL These two states control MIMD execution, and the FSM will alternate between them. This alternation provides a means of interleaving each MIMD instruction with a FETCH instruction. The FSM will remain in these two states until the LOCAL_OFF or STOP instruction is executed.

•FINISH This state represents that the processor has completed its virtual neurons. The finite state machine will remain in this state until the succeeding processors have also completed their virtual neurons. When this occurs the FSM will move to the NEXT_IT state ready for the next network iteration.

•NEXT_IT This state represents that the processor is ready for the next network iteration. It will remain in this state until the start signal indicates that the next iteration can commence.

•WAIT This state is used to synchronise the start of SIMD control. A move to the GLOBAL state only occurs upon the sync signal going high.
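The next state behaviour described above can be collected into a single function. The following C sketch is assembled from the state descriptions only: the output signals are omitted, and the exact arcs (in particular which of FETCH and LOCAL is entered first) are inferred rather than taken from the design.

enum state { GLOBAL, WAIT, FETCH, LOCAL, FINISH, NEXT_IT };

/* Next state sketch of the local FSM; the condition names follow the text. */
enum state next_state(enum state s, int escape, int local_off, int stop,
                      int chain_done, int start, int sync)
{
    switch (s) {
    case GLOBAL:  return escape ? FETCH : GLOBAL;       /* SIMD until ESCAPE  */
    case FETCH:   return LOCAL;                         /* fetch then execute */
    case LOCAL:   if (stop)      return FINISH;         /* end of update      */
                  if (local_off) return WAIT;           /* rejoin SIMD        */
                  return FETCH;                         /* next instruction   */
    case FINISH:  return chain_done ? NEXT_IT : FINISH; /* wait for chain     */
    case NEXT_IT: return start ? WAIT : NEXT_IT;        /* wait for start     */
    case WAIT:    return sync ? GLOBAL : WAIT;          /* synchronise SIMD   */
    }
    return GLOBAL;
}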

6.6 Development of SIMD Control

This chapter has covered the complete operation of a single processor. The operation of the SIMD control, and the bus interfaces that connect to the array are now considered.

The functions required of the SIMD control block are:

Supply SIMD instructions to the processor array
Synchronise the array with external input and output

These functions are implemented by a finite state machine (GLOBAL). This finite state machine uses external signals and status signals from the processor array to synchronise the array with the external devices. The finite state machine also generates the SIMD program by a sequence of states, each of which sets the finite state machine output signals to the opcode of a particular instruction in the SIMD program.

Diagram 6.11 shows the configuration of this SIMD control block, and how the signals from the processor array and external devices are connected.

[Diagram 6.11 SIMD Control Block: the GLOBAL FSM, driven by PAD_start and PAD_finish, supplies instructions to the processor array and the sync signal, through a delay, to each local FSM]

The finite state machine is shown in Diagram 6.12. This again uses the Mealy approach, so the same representational format of the finite state machine is used. The only slight variation is the simplification of the output signals; these are shown as SIMD instructions, as opposed to the full opcode.

The states in this finite state machine are:

•SET_UP This state is used to reset the processor array, which occurs when PAD_reset goes high. The SIMD control block then outputs a RESET instruction which loads zero into all registers in the processor array, and sets each processor to MIMD control.

•WAIT After resetting each processor this state is used to wait until the processor array is ready to start with the first network update.


•FINISH This state represents that the processor array has completed a network update. During this state, input and output from the device can be performed, this is initiated by PAD_finish going high. When the input and output is complete, indicated by setting PAD_start high, the SIMD control commences.

•INST_1 - INST_14 These states are used to broadcast the SIMD program to the processor array. The finite state machine automatically moves between these states, so no external sequencing is required. At state INST_1 the finish_out signal is tested; this is used to signal that a network update has completed. If this signal is high then the finite state machine will move to the FINISH state.

[Diagram 6.12 Global Finite State Machine: states SET_UP, WAIT, FINISH and INST_1 to INST_14, with the PAD_start and PAD_finish conditions controlling entry to and exit from the FINISH state; reset enters SET_UP asynchronously]


6.7 Implementation of Bus Logic Functions

The two logic functions that are required on the input and sigma buses are:

Input Bus    Modulo Address Generation
Sigma Bus    Sigma Threshold

These functions are performed by specialised logic positioned on the buses, and are shared between the processors using the same arbitration scheme as previously discussed.

6.7.1 Modulo Address Generation

The use of modulo address generation was discussed in Chapter 4. The ability to setup a modulo buffer within the input table was seen as crucial in the implementation of input windows for certain signal processing networks.

To perform modulo address generation on the input bus requires the top bits of the input address to be selectively set to zero. This has the effect of creating a modulo circular buffer within the input table, with its base address at 0h0000. The size of the modulo circular buffer is determined by the number of significant bits in the address (i.e. the number of bits not set to zero).

To allow a variety of modulo circular buffer sizes there are six control pins that are used to configure the buffer size. These pins determine which address bits are to be set to zero during modulo address generation. Table 6.6 shows the pins used, and the relevant modulo buffer sizes for each setting.

Table 6.6 Modulo Buffer Sizes

mod_11  mod_10  mod_9  mod_8  mod_7  mod_6    Modulo Size
0       0       0      0      0      0        4K Bytes
1       0       0      0      0      0        2K Bytes
1       1       0      0      0      0        1K Bytes
1       1       1      0      0      0        512 Bytes
1       1       1      1      0      0        256 Bytes
1       1       1      1      1      0        128 Bytes
1       1       1      1      1      1        64 Bytes

The modulo address generation control is positioned on the input bus, and is shared between the processors on a chip. To control the type of address mode that is required a status signal is sent with the address from each processor. This signal originates from a status register in each processor, and is controlled by the following two instructions:

MODULO_ON #Turn Modulo Address Generation On

MODULO_OFF #Turn Modulo Address Generation Off
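Functionally, the address gating amounts to the masking sketched below. This C fragment is inferred from Table 6.6 (in particular, the 4K maximum buffer implies that bits 12-15 are always cleared in modulo mode); the names are illustrative.

#include <stdint.h>

/* pin_mask holds the address bits (6-11) selected by the mod_6..mod_11
   pins; in modulo mode these, and bits 12-15, are forced to zero. */
uint16_t generate_address(uint16_t addr, uint16_t pin_mask, int modulo_on)
{
    if (!modulo_on)
        return addr;                                /* linear addressing */
    uint16_t mask = (uint16_t)(0xF000 | pin_mask);  /* top bits to clear */
    return (uint16_t)(addr & ~mask);                /* wrap into buffer  */
}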

The logic used to control the address generation is shown in Diagram 6.13.

[Diagram 6.13 Address Generation Control: gating driven by the mod_6 to mod_11 pins and the modulo status signal selectively forces the corresponding input address bits to zero]


6.7.2 Sigma Table Bus

After all the inputs have been computed a sigma function is performed on the partial sum. This is performed by a lookup table, using the partial sum as the lookup address.

However the lookup function does not require all 32 bits of the partial sum for the lookup address. It is only the high gain region in the middle of the curve that needs to be evaluated. To minimise the size of the lookup table, only this high gain region need be included, with the neuron output saturating if the partial sum lies outside this range. Diagram 6.14 shows how the sigma function is represented, and where the saturation occurs.

[Diagram 6.14 Sigma Function Threshold Levels: the neuron output saturates for partial sums outside the high gain region between approximately -1025 and 1024]

The logic that carries out this threshold function is positioned on the sigma bus, and is shared between the processors. This logic evaluates the partial sum, and according to whether it exceeds the threshold values, it will either saturate the sigma function, or use the value obtained from the lookup table.
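A functional sketch of this threshold logic is given below in C. The threshold values are read from Diagram 6.14, and the 11 bit indexing is an assumption consistent with a 2K entry table; the names are illustrative.

#include <stdint.h>

/* Saturate outside the high gain region, otherwise address the lookup
   table with the low bits of the partial sum. */
int8_t sigma_output(int32_t partial_sum, const int8_t lookup[2048])
{
    if (partial_sum > 1023)
        return 0x7f;                                /* saturate positive */
    if (partial_sum < -1024)
        return (int8_t)0x80;                        /* saturate negative */
    return lookup[(uint32_t)partial_sum & 0x7ff];   /* in-range lookup   */
}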


The time overhead for this control is negligible since it overlaps the lookup table in its operation. The sigma threshold is evaluated during $1, during which time the lookup table is addressed. The output from the sigma threshold is selected during $2, and is either the lookup table data value, or 0h7f or 0h80 according to whether it saturates positive or negative. This overlap between the threshold function and the lookup table is required due to the size of the boolean functions that need to be evaluated.

The derivations of the boolean functions that are used to detect these thresholds are given below:

Positive saturation:

    (sig_11 + sig_12 + ... + sig_30) . NOT(sig_31)

which reduces to:

    NOT( NOT(sig_11 + sig_12 + ... + sig_30) + sig_31 )

Negative saturation:

    NOT(sig_11 . sig_12 . ... . sig_30) . sig_31

which reduces to:

    NOT( (sig_11 . sig_12 . ... . sig_30) + NOT(sig_31) )

The implementation of these functions is shown in Diagram 6.15. This diagram shows the logic used to detect the thresholds, and how the signals from this logic are used to select between 0h7f, 0h80 and the value obtained from the lookup table.


[logic: detection of the saturation conditions from sig_11-30 and sig_31, generating select signals for 0h7f, 0h80 or the lookup table value on the sigma outputs]

Diagram 6.15 Sigma Threshold Function Logic

A further point to note from this diagram is how the lookup address is generated. This address requires 12 bits to access fully the 2 Kbyte lookup table. The bits used to make up this address are the lower 11 bits of the partial sum, and the MSBit of the partial sum. The MSBit is required since the sign of the partial sum must be incorporated in the lookup address.
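The combined behaviour of the threshold function and the address generation can be summarised in a short C sketch. This is a behavioural model with hypothetical names; it follows the bit ranges given above, with sig_11 to sig_30 and the sign bit sig_31 deciding saturation, and the lower 11 bits plus the MSBit forming the lookup address.

#include <stdint.h>

/* Behavioural model of the sigma threshold function and lookup addressing. */
uint8_t sigma_output(int32_t partial_sum, const uint8_t *table)
{
    uint32_t s    = (uint32_t)partial_sum;
    uint32_t mid  = (s >> 11) & 0xFFFFF;   /* sig_11 .. sig_30        */
    uint32_t sign = (s >> 31) & 1;         /* sig_31                  */

    if (mid != 0 && !sign)                 /* positive saturation     */
        return 0x7f;
    if (mid != 0xFFFFF && sign)            /* negative saturation     */
        return 0x80;

    /* lower 11 bits of the partial sum plus the MSBit form the address */
    return table[(sign << 11) | (s & 0x7FF)];
}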

A future development that could be used to reduce further the size of the sigma function lookup table would be to use a sign reflection of the sigma function. This would require a 1K lookup table, which would hold the positive side of the sigma function only. For a negative partial sum the sigma function would have to be inverted; this would be carried out by the sigma threshold function, and be controlled by the sign of the partial sum.
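A sketch of this reflection scheme is given below. It assumes an odd-symmetric sigma stored as signed bytes, with the saturation thresholds of Diagram 6.14; the index folding and the use of a simple bitwise inversion to reflect the output are illustrative assumptions, not part of the design.

#include <stdint.h>

/* Assumed sign-reflection scheme: only the positive half of the sigma
 * function is held in a 1K table; negative partial sums reuse it with
 * the output inverted by the threshold logic. */
int8_t sigma_reflected(int32_t partial_sum, const int8_t half_table[1024])
{
    if (partial_sum > 1023)  return 0x7f;          /* positive saturation */
    if (partial_sum < -1024) return (int8_t)0x80;  /* negative saturation */

    if (partial_sum >= 0)
        return half_table[partial_sum];            /* stored positive half */

    /* fold -1..-1024 onto 0..1023 and invert the table output */
    return (int8_t)~half_table[-(partial_sum + 1)];
}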


6.8 Full Simulation of Processor Array

The last stage in the development of this device was the full simulation of two processors working in parallel. The layout for this simulation incorporated memory blocks for the input table, sigma lookup table, and local memory for each processor's weights table. These memory blocks were constructed out of PLAs, which simulated the operation of ROM chips. To generate the data for the weights table, MIMD programs were written and assembled. These programs were then executed during the simulation.

Listing 6.1 shows the MIMD program that was executed on each processor. This program includes as many features as possible, and the data was carefully chosen to test the full range of the arithmetic operations.

The simulations were performed by RNL, which again enabled full simulation of the layout to be performed. The resulting output was then used to check the operation of the global control and the synchronisation between the processors.

These simulations showed the correct working of all sections. However due to the very large size of the output files, it is not possible to show the results of these simulations in this thesis.

6.9 Design Conclusion

A photomicrograph of the entire device is given in Plate III. This device incorporates a single processor, SIMD control block, and the modulo and sigma threshold functions. This first implementation only included a single processor, in order to minimise the silicon area and hence the cost of fabrication.

A special signal has been included in this first implementation to allow cascades of processor chips to be built on a PCB. This signal (lead) is used to interface with the sync signal, which is generated by the SIMD control block in the normal manner. The arrangement for such a cascade is shown in Diagram 6.16.


# MIMD Program for Simulation
# Network with Two Virtual Neurons

# First Hidden Layer Neuron
# Input Window Implemented:
# 2 Time Slices each with 4 Samples
# Output 0h88ff

BIAS_LLSB  0h00    #Load Neuron Bias 0hffff0000
BIAS_HLSB  0h00
BIAS_LMSB  0hff
BIAS_HMSB  0hff
INPUT_PTRL 0hfc    #Load Modulo Step 0hfffc (-4 Decimal)
INPUT_PTRH 0hff
ADD_MOD_STEP       #Move Modulo Input Pointer to Start of Input
STOP               #Synchronise Next Network Update
0h0f 0hfd 0h01 0hff
0h81 0h7f 0h81 0h7f
ESCAPE             #End of First Virtual Neuron
LD_SIGMA           #Calculate Sigma Function
INPUT_PTRL 0h77    #Load Pointer for Neuron Output 0h8877
INPUT_PTRH 0h88
OUTPUT             #Write Neuron Output to the Input Table
MODULO_OFF         #Turn Modulo Address Generation Off

# Output Layer Neuron
# Input 0h00ff
# Output 0h8878

INPUT_PTRL 0hff    #Load Input Pointer for Virtual Neuron 0h00ff
INPUT_PTRH 0h00
BIAS_LLSB  0h00    #Load Neuron Bias 0hffffff00
BIAS_HLSB  0hff
BIAS_LMSB  0hff
BIAS_HMSB  0hff
LOCAL_OFF          #Start Virtual Neuron
0h7f 0h7f 0h7f 0h7f
ESCAPE             #End of Virtual Neuron
LD_SIGMA           #Calculate Sigma Function
INPUT_PTRL 0h78    #Load Pointer for Neuron Output 0h8878
INPUT_PTRH 0h88
OUTPUT             #Write Neuron Output to Input Table
MODULO_ON          #Turn Modulo Address Generation On
RETURN             #Repeat Virtual Neurons

Listing 6.1 MIMD Program Simulated on Processor Array

This lead signal is used as an input to the SIMD control, and will only allow the SIMD program to commence when this signal goes high. This signal is set high for the master processor, and is daisy-chained with the sync signal for the succeeding processors. Such an arrangement ensures that all the slave processors commence SIMD control in synchronisation with the master processor.
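The start-up behaviour of such a cascade can be modelled in a few lines of C. The wiring follows Diagram 6.16 (the master's lead input tied high, and each slave's lead input driven by the preceding chip's sync output), while the settling loop and names are illustrative only.

#include <stdio.h>

#define CHIPS 3

int main(void)
{
    int lead[CHIPS] = {1, 0, 0};   /* master tied high; slaves fed by sync */
    int sync[CHIPS] = {0, 0, 0};

    /* a chip may commence its SIMD program only when its lead input is high */
    for (int pass = 0; pass < CHIPS; pass++)
        for (int c = 0; c < CHIPS; c++) {
            if (lead[c]) sync[c] = 1;                 /* SIMD control starts */
            if (c + 1 < CHIPS) lead[c + 1] = sync[c]; /* daisy-chain onward  */
        }

    for (int c = 0; c < CHIPS; c++)
        printf("chip %d started: %d\n", c, sync[c]);
    return 0;
}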


[schematic: a master chip and a cascade of slave chips, each with its own weight table, daisy-chained through the lead, sync, start and finish signals and sharing the input table and sigma table]

Diagram 6.16 Cascade of Processor Chips

The design statistics from this first 2 micron implementation are:

Total size of die:                        4184 µm x 4354 µm ≈ 18.22 mm²
Size of a single processor:               2316 µm x 2461 µm ≈ 5.7 mm²
Size of a single processor with routing:  2700 µm x 2461 µm ≈ 6.6 mm²
Size of array support blocks:             ≈ 0.4 mm²
Number of transistors:                    5300 per processor, 6300 for whole device


6.10 Future Large-Scale Integration

After completing this initial implementation it is now possible to assess whether this architecture has achieved the objectives of developing a fully integrated neural device for small-scale applications. To assess this a feasibility study into building a fully integrated neural signal processing device is considered. This study estimates the size of device that would result in a number of different configurations using 0.8 micron CMOS as the target technology. This study is reasonable since 0.8 micron is now widely used in many commercial processes.

To perform a first order estimate of the silicon area when using 0.8 micron CMOS requires the use of a scaling factor α[5]. This scaling factor is a dimensionless constant that can be applied to all features in a design. It should be noted that not all features scale in such a linear manner; however, for the purposes of this estimate this fact is ignored. The use of this scaling factor, for the features that we are interested in, is considered below:

    Length:  2 micron feature / α   =>  0.8 micron feature
    Area:    2 micron area / α²     =>  0.8 micron area

    Where:   scaling factor α = 2 / 0.8 = 2.5

From these scaling equations it is possible to scale the sizes of the various sections in the implementation given above:

    Size of a single processor:    6.6 mm² / α²  ≈  1.06 mm²
    Size of array support blocks:  0.4 mm² / α²  ≈  0.064 mm²

In order to fully integrate this device, RAM blocks and A/D and D/A converters must be implemented. By using the Cadence RAM Generator[6] it is possible to calculate the size of various memory blocks, and by looking at the ES2 Analogue Cell Library[7] the size of the A/D and D/A converters can also be estimated.

From the estimated size of the RAM blocks the scaling equations can be used to calculate the area that they would occupy in 0.8 micron CMOS.


For the analogue cells these scaling equations are not appropriate. The reason for this is that all analogue cells are positioned in the pad frame, where they can be electrically isolated from the digital components in the core. Since these analogue cells are positioned directly as pad cells, the scaling equations given above cannot be used: pad dimensions do not scale in accordance with the other dimensions in any technology, due to the physical difficulties encountered in pad bonding and the required electrical protection.

Before the area of these blocks can be calculated, it is necessary to stipulate the sizes of the various memory blocks that are to be used:

Sigma Table: 1K Bytes
This 1K sigma table makes use of the sigma reflection technique discussed in Section 6.7.2.

Input Table: 1K Bytes
This will allow a network approximately 4 times the size of the Tx Extraction Network to be implemented. Additional memory can be added, either internally or externally, if required.

Weight Table: 1K Bytes
This memory size is fairly arbitrary, and should be adjusted to take into account the connectivity of the types of network being implemented. (The greater the connectivity, the larger the weights table should be.)

From these assumptions the sizes of the RAM blocks, and the A/D and D/A converters, can be estimated; these values are specified for a 0.8 micron technology:

    RAM Block (1K Bytes):   2.15 mm²
    Analogue Pad Frame:     580 µm wide

By using the above dimensions, the size of a fully integrated neural device can be estimated. Diagram 6.17 shows the total device size for


a number of different configurations. These configurations utilise different numbers of processors (1-14 processors), each using the same size memory blocks as specified above.

[graph: Total Die Area (sq. mm), 20 to 80, against Number of Processors, 1 to 14]

Diagram 6.17 Graph of Die Sizes for Different Array Configurations
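The shape of this graph can be reproduced with a short calculation. The sketch below is a reconstruction rather than the thesis's own model: the block sizes are those quoted above, while the 15% routing overhead and the square-die-plus-pad-ring floorplan are assumptions chosen so that the fourteen processor configuration lands near the figure quoted below.

#include <stdio.h>
#include <math.h>

int main(void)
{
    const double processor = 1.06;   /* processor with routing, mm^2    */
    const double ram_1k    = 2.15;   /* 1K byte RAM block, mm^2         */
    const double support   = 0.064;  /* array support blocks, mm^2      */
    const double pad_width = 0.58;   /* analogue pad frame width, mm    */
    const double overhead  = 1.15;   /* assumed global routing overhead */

    for (int n = 1; n <= 14; n++) {
        /* each processor carries its own weight table; the input and
         * sigma tables are shared by the whole array                   */
        double core = overhead * (n * (processor + ram_1k)
                                  + 2.0 * ram_1k + support);
        double side = sqrt(core) + 2.0 * pad_width;
        printf("%2d processors: ~%.0f mm^2\n", n, side * side);
    }
    return 0;
}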

From this analysis it is possible to see that a fully integrated device incorporating 14 processors can be built, requiring only 74 mm².

To obtain an estimate of the performance of such a device, the scaling factor α can be used on the simulation results presented in Section 6.3.7.

To use this scaling factor α on the clock period, the transient response τ of a MOS transistor must be considered:

    τ = transistor length / electron velocity

By applying the scaling factor α to the transistor length, we can calculate the scaling effect on the transient response τ:

    τ (2 micron) / α  =>  τ (0.8 micron)


From the clock period derived under simulation it is possible to estimate the maximum performance this device can achieve in a 0.8 micron technology:

    2 micron clock period:    50 ns
    0.8 micron clock period:  20 ns

This clock period will allow the device to operate at 50 MHz, thus providing a maximum performance for a single processor of 3.5x10^6 connections per second, and for an array of fourteen processors of 5x10^7 connections per second. The performance of a single processor, and that of a fourteen processor array, is compared with some other neural implementations in Diagram 6.18. This diagram considers only peak performances, and uses the same information that was presented in Chapter 4.

From this diagram it can be seen that the peak performance of this architecture compares favourably with the MIMD implementations, while saving considerably on the hardware costs. The performance of this architecture when compared to the SIMD implementations is not so impressive. However, it should be remembered that the actual performance obtained from a generic SIMD array is considerably lower than the peak performance shown in this diagram. (The study in Chapter 4 showed that for the Tx Extraction Network the actual performance obtained for a SIMD array was only 47.5% of the peak performance.)

To study the performance of this architecture in more detail requires the breakdown of a typical network mapping - and an analysis of the proportion of SIMD instructions to MIMD instructions and NOPs. (See Appendix 7.)

This analysis takes the original network mappings for the Tx Extraction Network that were presented in Chapter 5, and carries out a more detailed analysis on their performances; these results are presented along with the initial figures in Table 6.7. The figures obtained from this analysis match fairly closely those originally estimated in Chapter 5. This analysis shows that the average performance obtainable is within 93% - 96% of the maximum performance.


[graph: log-log plot of speed (connections/sec) against storage (connections), comparing the processor and array peak performance of the presented architecture with other implementations including ETANN, DNNA, DELTA, a Transputer and a PC/AT]

Diagram 6.18 Peak Performance of Device

Table 6.7 Summary of Performance Efficiency

                       SIMD       Silicon     Reduced    Reduced
                       Array      Efficient   Sample     Response

Projected Efficiency   47.5 %     98 %        95 %       99.6 %
Actual Efficiency      <=47.5 %   95.8 %      93.4 %     94.4 %

Diagram 6.19 shows this average performance, and plots a series of points that relate to the required performance for a number of neural network models. From this diagram it is possible to see that this device


can be used to implement a number of networks that have been developed for real-time signal processing tasks.

The performance required for the Tx Extraction Network can be matched by that of a single processor. Such a device would occupy less than 25 mm², providing a near ideal solution to the task presented in Chapter 3.

[graph: log-log plot of speed (connections/sec) against storage (connections), marking the array and processor performance together with the requirements of the numbered networks listed below]

1 Tx Extraction Network [I S Howard, M A Huckvale. Speech Fundamental Period Estimation using a Trainable Pattern Classifier. Speech 88, Edinburgh, August 1988]

2 Short-Time Spectral Amplitude Envelope Estimation [B P Yuhas, M H Goldstein, T J Sejnowski. Integration of Acoustic and Visual Speech Signals using Neural Networks. IEEE Communications, Nov 1989]

3 Isolated Word Recognition [I S Howard, M A Huckvale. Two-Level Recognition of Isolated Words using Neural Networks. 1st IEE Int. Conf. on Artificial Neural Networks, Oct 1989]

4 Electromyographic Signal Classification [C N Schizas, C S Pattichis, I S Schofield, P R Fawcett, L T Middleton. Artificial Neural Network Algorithms in Classifying Electromyographic Signals. IEE 1st Int. Conf. on Artificial Neural Networks, Oct 1989]

5 Time Delay Network for Isolated Word Recognition [K J Lang, A H Waibel, G E Hinton. A Time-Delay Neural Network Architecture for Isolated Word Recognition. Neural Networks, Vol 3, pp 23-43, 1990]

Diagram 6.19 Projected Performance of Device


6.11 Design Environment

The design environment used for this implementation was based on the University of California at Berkeley (UCB) VLSI Design Tools[8], and the simulation tools from the Northwest Laboratory for Integrated Systems (LIS) at the University of Washington[9]. These tools were run on a SUN IPC, under SUN UNIX.

The target technology was ES2 (European Silicon Structures) 2 micron CMOS[10]. The format required for this is CIF (Caltech Intermediate Form), requiring the internal MAGIC format to be converted to this before the shipping of the design.

Additional utilities were created to support these design tools. These utilities worked within the UCB environment, and allowed files to be created and modified, to simplify repetitive functions. The two utilities created were ESP and NET. These were written using the UNIX CSHELL and AWK[11].

Diagram 6.20 shows the entire design environment, and how the various tools fit together. A discussion of these tools is given below:

•MAGIC MAGIC is the main tool that fits in the centre of this design environment. This tool is an interactive editor for VLSI layout, and allows full custom designs to be created. Various features are included to simplify this task for large designs; including cell hierarchies, and automatic routing algorithms.

MAGIC uses its own internal format, with the facility of reading and writing to CIF.

•MEG MEG (Mealy Equation Generator) is a finite state machine compiler. It translates a high level language description of a finite state machine into several implementations and simulation formats. MEG uses the Mealy model for finite state machines, in which outputs are dependent upon inputs as well as the current state.


[design flow: finite state machine descriptions pass through MEG, EQNTOTT (reduce terms) and MPLA (generate PLA) into MAGIC for cell layout, place and route; ESP and NET support PLA and netlist generation; a blank pad frame is generated from a pad frame request, with the real pads inserted after output to CIF; CDRC performs the CIF DRC before delivery to the foundry]

Diagram 6.20 Design Environment


•EQNTOTT EQNTOTT generates a truth table, suitable for PLA programming, from a set of Boolean equations that define the PLA outputs in terms of its inputs. EQNTOTT attempts to reduce the size of the truth table by merging minterms; it also maintains 'don't care' states to simplify further the size of the truth table output.

•MPLA MPLA is a PLA generator for non-folded PLAs. It takes its input as a truth table, in the same format as EQNTOTT output. This truth table defines where each transistor is placed in the AND and OR planes. A PLA template file defines the style of the PLA generated; this allows dynamic or static PLAs to be generated, with clocking on the inputs and outputs if required.

•ESP ESP takes a textual description of a PLA and converts it into a truth table suitable for MPLA. No minimising is performed during this transformation, and 'don't care' states are not permitted.

This utility was written to simplify the generation of the instruction decoder. It allows the definition of each term to be written as a list of signals that should be generated for each input set. Each set of input conditions, and the output list, is then converted into a product term in the truth table format used as the input to MPLA.

Appendix A.5 gives the AWK listing for this utility.

•NET NET takes the top level cell, and a Netlist to be merged, and creates a single Netlist file. NET works by combining all sub cell terminals that have the same labels into nets. Merging of the input netlist is then performed to create a single Netlist that is used to specify the connectivity for the routing algorithm. The Netlist file is read directly by MAGIC, which will route according to the defined nets in this file.


A three stage operation is required to generate a netlist from the top level cell:

Find all sub cell instances in the top level cell, and extract the sub cell name, and the instance name. These two names can then be passed on so that the terminals for each sub cell instance can be extracted. The output from this section is as below:

data_path.mag data_path_0

For each sub cell instance, extract all terminals in that sub cell, and output these in a list. This output list should be in the hierarchical format used by MAGIC to label sub cell terminals. This format is as below:

data_path_0/msb_sum_b data_path_0/msb_c_out

From this complete list of sub cell terminals, a netlist can be generated by combining all terminals that have the same label. The merging of the input netlist is also performed during this operation. The output of this file is in the format accepted by MAGIC:

data_path_0/msb_sum_b shift_mux_0/msb_sum_b

data_path_0/msb_c_out decode_0/msb_c_out

This utility is given in Appendix A.4. This includes the three separate stages; these are held in individual programs called NET, LABELS and NETLIST. This collection of programs combines both UNIX shell scripts and AWK programs to manipulate the MAGIC files, and to perform the actual netlist generation.

•CDRC CDRC is a design rule checker that works on CIF files. It checks the final CIF output against the ES2 design rules, and provides feedback on latch-up rule violations.

•RNL RNL is a timing logic simulator for digital MOS circuits. It is an event driven simulator that uses a simple RC (resistance-capacitance) model of the circuit to estimate node transition times and the effects of charge sharing.

•PAD Frame Request The ULVC (London University VLSI Consortium) PAD generation facility was used to create the PAD frame. This option was required since the ES2 PAD library is not compatible with MAGIC. The library cells make use of a smaller grid size than MAGIC can use, making it impossible to correctly read the PAD library into MAGIC.

The ULVC PAD generation facility requires a PAD Frame Request File to be generated, which defines the position of each PAD and the library cell that is required. A blank PAD frame is then generated, which shows the size and position of each PAD, and its terminals. Routing to this blank PAD frame can then be carried out with the real PAD cells being inserted after the design has been output to CIF.

The PAD frame request file that was generated for this design is given in Appendix A.6.

Chapter 7

Assessment of Architecture

This thesis has presented an architecture aimed at implementing neural networks and simple digital signal processing models. The work presented has outlined the requirements for such an architecture, and then developed it through a series of steps.

The resulting architecture can be seen to be very different from those that were discussed in Chapter 4. The requirements for this architecture have meant that a departure from the more conventional architectures has had to be made. Limitless expandability has not been the goal, but instead a compact and highly efficient architecture.

This chapter leads on from the development of this architecture, and discusses where it fits into the general area of parallel processing.

7.1 Classification of Parallel Machines

When considering any parallel processor an attempt is normally made to classify it so that comparisons can be made to other similar machines. The most common classification scheme used for parallel processors is due to Flynn[1], who in 1966 introduced a classification based on the numbers of instruction streams and data streams that a machine possesses. Table 7.1 shows the Flynn Classification scheme.

Table 7.1 Flynn Classification Scheme

                             Data
Instructions        Single Stream    Multiple Stream

Single Stream       SISD             SIMD
Multiple Stream     MISD             MIMD

This classification has met with some criticism over the years due to its very broad nature, and due to the lack of examples of MISD machines.


The lack of examples of MISD machines can be attributed to the problems of potentially large variation in execution time across processors. This could prove to be inefficient since the speed of the data flow must be determined by the slowest processor. Despite this potential problem there has been an example of a MISD machine. This was the IBM Harvest attachment to the 7030 Stretch computer. The use of this attachment was in cryptography, where different algorithms could be applied to the same data stream in parallel.

Since this classification scheme was introduced, several others have tried to use more criteria to introduce finer divisions in the classes. Most have concentrated on sub-dividing the parallel machines further. A high-level taxonomy of parallel machines is shown in Diagram 7.1. This taxonomy sub-divides the SIMD and MIMD classes according to the connectivity and control style of each machine. (For further information on these classes of parallel machines the reader is referred to R. Duncan's survey of parallel computer architectures[2].)

Synchronous:    Vector Processor
                SIMD (Processor Array, Associative Memory)
                Systolic

MIMD:           Distributed Memory
                Shared Memory

MIMD Paradigm:  MIMD/SIMD
                Dataflow
                Reduction
                Wavefront

Diagram 7.1 High-level Taxonomy of Parallel Computer Architectures

In this classification scheme the architecture presented in this thesis would fit into the MIMD/SIMD class. There have been several machines proposed, or built, that also fit into this class. These examples all take slightly different approaches to their control mechanisms, and as a result have been further classified by other authors using extensions to Flynn's classification:

    MSIMD      Multiple Single Instruction Multiple Data
    SPMD       Single Program Multiple Data
    MIMD/SIMD  Partitionable SIMD/MIMD
    MIMSIMD    Multiple Instruction for Multiple SIMD

Snyder[3] has extended Flynn's classification scheme in a different manner. The key point he introduced is that both the instruction stream and the data stream are composed of addresses and values. His classification scheme uses the numbers of each of these to define the machine:

    IavDa'v'

Where:
    a   is the number of instruction threads
    v   is the number of instructions
    a'  is the number of operand addresses
    v'  is the number of operands

Using this classification scheme a simplification is made to the actual values of a, v, a' and v'. These are called class designators, and are specified as follows:

    s   a single value
    c   some small constant
    m   from 1 to an arbitrarily large finite number

Thus a number of machines can be classified:

    IssDss   Von Neumann Machine (SISD)
    IssDsm   SIMD machine with no addressability (systolic arrays)
    IssDmm   SIMD machine with individual addressability
    IscDcc   VLIW (Very Long Instruction Word) machines
    IccDcc   MIMD machine using shared memory, or bus structure
    ImmDmm   MIMD array using an expandable communication scheme


To classify the architecture presented in this thesis using Snyder's classification scheme requires a number of different terms to include all the possible modes:

    IssDcc     All processors operating under SIMD control
    IccDcc     All processors operating under MIMD control
    Ic'c'Dcc   Some processors operating under MIMD control

Where:

    c    is the number of processors in the array
    c'   is the number of processors operating under MIMD control + 1

The above classification schemes have been useful in defining the architecture that this thesis has presented. They not only enabled a more formal description of its control type, but also enabled similar machines to be identified. The following section describes some of these examples, and discusses the possible directions in which this type of architecture may lead.

7.2 Other MIMD/SIMD Machines

A series of examples of MIMD/SIMD machines is given below; each briefly outlines the methodology of the machine and the control mechanism that is used.

7.2.1 Non-Von 4 - Columbia University

Non-Von 4[4] was developed at Columbia University during 1984. Its architecture is an extension from the preceding Non-Von machines, incorporating both MIMD and MSIMD (Multiple Single Instruction Multiple Data) control types.

The key motivation for this is to reduce communications by partitioning different sections to work independently in a divide-and-conquer methodology. This allows different sub-tasks to operate remotely from the central processor, thus freeing the main control section, and reducing the top level communication workload. This partitioning is achieved by using a binary tree structure, with a specified number of nodes supplying SIMD instructions to their sub-trees. This arrangement requires a master/slave relation, with each processor that can supply SIMD instructions being termed an LPE (Large Processing Element), and the processors in each sub-tree being termed SPEs (Small Processing Elements).

7.2.2 DADO - Columbia University

DADO[5,6] was also developed at Columbia University, at much the same time. DADO is a parallel machine developed to implement expert systems¹. In this architecture there are several interesting features that merit some discussion. These features allow the array of processors to operate under SIMD or MIMD control, with each processor having the ability to switch other processors into or out of SIMD or MIMD control. A key point to note in this architecture is that there is no master/slave relation between groups of processors. Instead each processor is identical, thus increasing the flexibility in partitioning an algorithm on the machine, since no account now has to be taken of a fixed processor master/slave relation.

The basic operation in performing expert systems relies upon the processors matching assertions with sets of productions. When a match is found, that processor is then able to set a number of relevant processors into SIMD control. This then allows each processor to update its assertions according to the actions defined by the matched production.

The interesting feature in this architecture is how the SIMD control is performed. This control is not performed in the conventional manner of broadcasting sets of instructions, but instead by a form of remote procedure call.

¹ An expert system consists of a set of rules or productions, and a database of assertions. Each production consists of a pattern element, called the left hand side (LHS), and a set of actions, called the right hand side (RHS). The system operates by searching for a match of a LHS pattern with an assertion in the database. If a match is found the database of assertions is updated according to the actions specified by the RHS.


Each processor has within its memory a copy of the program required to update the assertion database. The relevant procedures within this program memory are initiated by this remote procedure call. This feature considerably reduces the communications overheads for the SIMD control, and allows increased flexibility by providing a means to customise the code in individual processors, and the implementation of conditional operations within the SIMD program.

7.2.3 XIMD - Carnegie Mellon University

XIMD[7] (Variable Instruction Stream Processor) is a more recent architecture that was presented in 1991. This architecture takes many ideas from VLIW (Very Long Instruction Word) processors², and extends these to avoid some of the disadvantages found in implementing conditional operations.

XIMD takes the original VLIW model, and extends it by adding control mechanisms to each functional unit. This effectively allows each functional unit to operate under MIMD control. Special broadcasting of condition codes between functional units is provided so that each functional unit can operate on condition codes generated by other functional units. This not only allows more efficient execution of conditional operations, but it allows synchronisation between functional units to occur.

The broadcast of a busy/done flag is used in this synchronisation, enabling each functional unit to monitor the status of the remaining functional units. This allows each functional unit to operate independently on different sets of data, or to implement entirely different sections of code, and then to synchronise with each other for common portions of code. This can be used to simplify data transfers,

² VLIW processors enable fine grain parallelism to be extracted from a wide range of applications. The operation involves the use of multiple functional units (e.g. an address generation unit, floating point co-processor, memory control and integer processing unit). The control program is horizontally coded in a single instruction stream, with each functional unit using a different field; hence the requirement for the Very Long Instruction Word. The parallelism obtained is determined by the compiler, which resolves any conflicts, and maximises the overlap between the functional units.

and to ensure that data dependencies are kept consistent. For example, one functional unit can be delayed until another functional unit's busy/done flag indicates that some section of data is now valid.

The use of this model enables all parallelism to be determined at compile-time, while still allowing the flexible operation of each functional unit so that conditional operations, and uncertain events, can be handled efficiently. The compiler either resolves all these uncertainties and dependencies at compile time, or it generates code to resolve them at run-time, incorporating the synchronisation operations discussed above. In this model the compiler can be said to represent the main control unit for the XIMD processor, thus reducing the hardware requirements for the processor. Such an architecture is hoped to exhibit increased flexibility and performance, with minimal extra hardware complexity.

7.3 MIMD/SIMD Machines in the Future

The examples above have indicated the varied uses of, and reasons for, MIMD/SIMD control. In this thesis the architecture was developed so that software mapping of a neural network onto a processor array could be combined with a simple communications scheme. This architecture fits into the development of the general MIMD/SIMD machines discussed above, with the XIMD architecture using some similar principles. It allows the independent operation of certain sections of code, followed by the synchronisation of functional units so that data dependencies can be resolved, and common input/output routines can be implemented. This architecture also stresses the use of compile-time parallelism, which is again used in the XIMD architecture. This feature can be used to remove much of the control hardware, since each processor's operation is determined at compile-time, and each processor operates deterministically.

It should also be noted that the goal for all of the above architectures was to improve efficiency, and to minimise hardware complexity. Despite the additional complexity in the control sections caused by having to select between various control modes, these architectures have seen an overall reduction in hardware due to the much simplified synchronisation and communications control. These architectural features have recently been highlighted in two survey papers, both indicating the increased use of these more complex control mechanisms in the future.

T G Lewis[8], in a recent paper, highlighted the benefits of SPMD control (Single Program Multiple Data). In this model the data is partitioned across the processors, with each processor having its own copy of the program to be implemented. Each processor then runs under MIMD control using its own copy of the program. Upon completion the results from each processor are output and new data is distributed. Although this architecture requires separate program stores and control mechanisms for each processor, it allows reduced complexity compared with MIMD architectures due to the SIMD-like lock-step sequencing during the common parts of the program. This allows simplified synchronisation during input and output, while still offering more versatile control during the computations. The paper predicted that this type of parallel architecture would feature strongly in the 1990s due to the simplified hardware, and the increased flexibility of the control strategy over SIMD arrays.
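The flavour of SPMD control can be sketched with POSIX threads, standing in for the separate processors. The partitioning, names and barrier below are illustrative only: each worker runs the same program on its own slice of the data under MIMD-style control, then falls into lock-step for the common input/output phase.

#include <pthread.h>
#include <stdio.h>

#define WORKERS 4
#define N       1024

static double data[N];
static pthread_barrier_t io_barrier;

/* the single program, run by every worker on its own data partition */
static void *program(void *arg)
{
    long id = (long)arg;
    long chunk = N / WORKERS;

    for (long i = id * chunk; i < (id + 1) * chunk; i++)
        data[i] = data[i] * 2.0 + 1.0;        /* independent MIMD phase */

    pthread_barrier_wait(&io_barrier);        /* lock-step rendezvous   */
    if (id == 0)
        printf("all partitions updated; common output phase begins\n");
    return NULL;
}

int main(void)
{
    pthread_t t[WORKERS];
    pthread_barrier_init(&io_barrier, NULL, WORKERS);
    for (long i = 0; i < WORKERS; i++)
        pthread_create(&t[i], NULL, program, (void *)i);
    for (int i = 0; i < WORKERS; i++)
        pthread_join(t[i], NULL);
    pthread_barrier_destroy(&io_barrier);
    return 0;
}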

A study into massively parallel computers for artificial neural networks[9] has also pointed to the possibilities in developing machines with more elaborate control mechanisms. This report finishes with a goal for future research:

The real challenge for computer architects in connection with the neural network area in the future lies in the implementation of Artificial Neural Systems (ANS), i.e. systems composed of a large number of cooperating modules of neural networks. Each of the modules should be allowed to implement a different network structure, and the modules must be able to interact in different ways at high speed. This implies that heterogeneous systems composed of homogeneous processing arrays must be developed, and that special attention must be paid to the problem of interaction between modules and between peripheral modules and the environment. The role of MIMD architecture in neural processing probably lies in this area, actually meaning MIMSIMD (Multiple Instruction Streams for Multiple SIMD arrays) architectures will be seen.


7.4 Postscript

Since writing this chapter the author has discovered another excellent example of a MIMD/SIMD machine: the Connection Machine CM-5[10]. This version of the Connection Machine departs from the original architectural model that resulted in the first Connection Machines. The original models of the Connection Machine were developed by Daniel Hillis and his company Thinking Machines, and are parallel arrays composed of thousands of single bit processors connected in a hypercube topology. In these first versions the control was purely SIMD, and a typical machine was composed of up to 64,000 processors.

The CM-5 is also a parallel machine, but with smaller numbers of processors (32 - 8,000), each composed of four accelerator chips and a RISC processor. This increased complexity allows each processor to operate under either MIMD control or SIMD control. According to Daniel Hillis, the SIMD/MIMD duality of these machines will open up new classes of problems that earlier SIMD versions of the Connection Machine did not handle very well. This architecture is also hoped to be easier to program than conventional MIMD machines, due to the ability to synchronise between processors for common sections of code. Daniel Hillis is confident that this general trend for SIMD/MIMD parallel machines will be fairly settled for much of the next decade.

Chapter 8

Conclusion

This chapter outlines the work that has been covered in this thesis, and highlights the main research contributions that have been made.

8.1 Summary of Thesis

This thesis reviewed the foundations of neural networks, and identified some recent applications to which they have been applied. This preliminary study identified the requirements for a neural implementation to execute real-time neural signal processing models. A critical study of the current state in neural implementations showed that these requirements are not met in any single architecture, and an hypothesis was proposed for such an architecture.

This hypothesis drove the development of the architecture presented in this thesis. This architecture is unique in the area of neural implementations since it combines both SIMD and MIMD control modes. This makes the architecture very compact, whilst not compromising on the flexibility or programmability.

A silicon chip was developed using this architecture, and has been described fully in this thesis. This chip incorporates a single processor, allowing estimates to be made as to the performance and silicon cost of a multi-processor device.

Chapter 7 presented a broader analysis of this architecture by looking at some similar parallel machines, and making some conclusions as to the likely course for the development of such machines.

8.2 Research Contribution

This section outlines the main contributions this thesis has made to the areas of neural architectures and parallel machines.


•Critical study of the current state in neural implementations A critical study into the current state of neural architectures was able to identify some of the limitations imposed by these architectures. An analysis of the requirements for a portable real-time device showed that none of these architectures possessed the requisite criteria for the application area: high flexibility, soft-programmability and high efficiency.

•Development of an hypothesis out of which a new architecture could be developed The isolation of beneficial features from current neural architectures allowed an hypothesis to be formulated. This stated that if the beneficial properties from MIMD and SIMD arrays could be combined, then a suitable architecture might be developed.

•Development of a software methodology combining SIMD and MIMD control A software methodology was developed allowing SIMD and MIMD control to be combined in a single architecture. This methodology utilises a single instruction that allows switching to occur between MIMD and SIMD instruction sources. The switching is controlled by the individual processors without the need for a master/slave arrangement or central controller.

•Development of an architecture to support this software methodology An architecture was developed that could support this software methodology. This architecture enables each processor to switch between MIMD and SIMD instruction sources under software control. This allows the architecture to exploit the flexibility provided by MIMD control with the low complexity of SIMD control and communication techniques.

•Design of a silicon chip incorporating this architecture A silicon chip was designed using this architecture. This chip incorporates some novel techniques in the control mechanisms and pipelining strategies. This chip also allowed performance and silicon costs to be estimated for future implementations incorporating multiple processors on a single chip.

•The identification of other similar work The author has taken some time at the end of this thesis to present some recent developments in similar architectures. This review outlines the goals for a variety of architectures, and briefly looks at the control strategies used in implementing them. From this study the author was able to present some conclusions showing the increased interest in this class of MIMD/SIMD hybrid architecture.

8.3 Literature Published

The literature published relating to this thesis includes two technical articles from the TIMES newspaper, and a conference paper. This paper was presented to The 2nd International Conference on Microelectronics for Neural Networks, held in Munich, Germany, on October 16-18, 1991. All these articles are included at the end of this thesis.

The author also hopes to get a further Journal paper published upon conclusion of this thesis. It is hoped that this paper will be complete upon submission, and will be included in the Appendix.

Plates

P.1 Confocal Light Microscope

The photomicrographs in the following pages were all taken using a confocal light microscope. The use of a confocal microscope allows very fine cross sectional images to be created. This is useful in many areas as it allows cross sectional images to be made inside a sample without the necessity for the sample being cleaved. In medical applications it allows the internal operations of a group of cells to be observed without having to damage the sample. It is interesting to note that the first recorded use of a confocal microscope was due to Marvin Minsky, who, in 1957, used this technique for studying neurons in the brain. This technique is now widely used in the electronics area since it provides very good imaging of integrated circuits. By incorporating a digitiser, and appropriate software, 3-d models can be created, allowing a process engineer an unprecedented view of a section of circuit.

P.2 Operation of a Confocal Microscope

Diagram P.1 shows a simplified diagram of the confocal microscope principle. The microscope is represented as a single lens. Light from the light source is focused within the sample by the objective lens, and

[schematic: light from the source is focused into the sample through a lens; only light from the in-focus plane returns through the aperture, while out-of-focus light is blocked]

Diagram P.1 Confocal Microscope Principle


the light reflected back from the sample returns via the same path. Only light returning from the focused-on plane can return via the lens and the aperture, with the light from planes above or below the plane of focus being blanked off by the opaque part of the aperture. To actually create the image, this point of light must be scanned over the sample; this is achieved by either moving the sample, scanning with the objective lens, or moving the aperture.

The high magnification plates were produced by the use of an oil immersion between the sample and the objective lens. This oil helps to direct the light, creating a brighter image, and reducing unwanted reflections from the sample.

Plates:

Plate Ia   Datapath Layout (MAGIC Layout)             183
Plate Ib   Adder Layout
           i)   MAGIC Layout                          184
           ii)  Photomicrograph                       184
Plate Ic   B Input Selection (XOR Function)
           i)   MAGIC Layout                          185
           ii)  Photomicrograph (Metal 2)             185
           iii) Photomicrograph (Poly & Diff)         186
           iv)  Photomicrograph (Metal 1)             186
Plate IIa  Template for Decoder PLA (MAGIC Layout)    187
Plate IIb  PLA Layout (MAGIC Layout)                  188
Plate III  Photomicrograph of Single Processor        189


Plate Ia Datapath Layout (MAGIC Layout)

Plate Ib (i) Adder Layout (MAGIC Layout)

Plate Ib (ii) Adder Layout (Photomicrograph)

Plate Ic (i) B Input Selection (MAGIC Layout)

Plate Ic (ii) B Input Selection (Photomicrograph - Metal 2)

Plate Ic (iii) B Input Selection (Photomicrograph - Poly & Diff)

Plate Ic (iv) B Input Selection (Photomicrograph - Metal 1)


Plate IIa Template of Decoder PLA (MAGIC Layout)

Plate IIb PLA Layout (MAGIC Layout)


Plate III Photomicrograph of Single Processor

Appendix 1

Local Finite State Machine

This appendix shows the source code for the Local Finite State Machine. This code is written in MEG format; MEG is the Mealy Finite State Machine Generator that comes with the University of California at Berkeley VLSI Design Tools.

This finite state machine controls the source of the next instruction for each processor, and is responsible for MIMD/SIMD control switching and processor synchronisation.

The full state diagram for the Local Finite State Machine can be seen in Chapter 6.

LOCAL FINITE STATE MACHINE

Finite State Machine to Control Next Instruction Source

-- INPUT SIGNAL LIST

Signal Name Function Source

start          Start MLP Iterations       Global FSM
stop           Stop MLP Iterations        Decode PLA
sync_in (inv)  Synchronise Signal         Global FSM
local_off      Switch to Global Control   Decode PLA
local_pla_on   Switch to Local Control    Decode PLA
local_wgh_on   Switch to Local Control    instr_mux
finish_in      Following PEs Finished     Following PE

INPUTS :start stop sync_in local_off local_pla_on local_wgh_on finish_in;

-- OUTPUT SIGNAL LIST

Signal Name Function Destination

sel_0       Select Instruction Fetch     instr_mux
sel_1       Select Local Instruction     instr_mux
sel_2       Select Global Instruction    instr_mux
instr       Output Data Path Control     decode
nop         Output NOP to Data Path      decode
finish_out  PE Finished Iteration        Preceding PE

OUTPUTS :sel_0 sel_1 sel_2 instr nop finish_out;

FINITE STATE MACHINE DEFINITION

FSM CLOCK SIGNALS

Input latched on Clock_2
Output Available on Clock_1
RESET on Clock_1 (Inhibits FSM Output)

--WAIT

--Operation: Synchronise PE on RESET or Start of Global Control
--           Remain in State until Synchronisation Signal

WAIT : IF NOT sync_in THEN GLOBAL (sel_2 nop)

Select Next Global Instruction Execute a NOP

ELSE WAIT (nop);

Continue to wait for ~sync_in, execute a NOP

--GLOBAL

--Operation: PE under Global Control Remain in GLOBAL until Local Control is Selected

GLOBAL : CASE (local_pla_on local_wgh_on)
    1 0 => FETCH (nop sel_0);
    0 1 => FETCH (nop sel_0);
    1 1 => FETCH (nop sel_0);

Select Local Instruction Fetch Execute a NOP

ENDCASE => GLOBAL (sel_2 instr);

Select Next Global Instruction Execute present Global Instruction

--LOCAL

--Operation: PE Under Local Control Alternate between FETCH and LOCAL Two method of turning Local Control Off stop End of Network Iteration local_off End of Local Code

LOCAL : CASE (stop local_off)
    1 0 => FINISH (nop);

Finish Network Iteration Execute a NOP

0 1 => WAIT (nop);

Local Control Off Execute a NOP

ENDCASE => FETCH (sel_0 instr);

Select FETCH for next Local Instruction Execute Present Local Instruction

--FETCH

--Operation: PE Under Local Control
--           Alternate between LOCAL and FETCH
--           Two methods of turning Local Control Off:
--             stop       End of Network Iteration
--             local_off  End of Local Code

FETCH : CASE (stop local_off)
    1 0 => FINISH (nop);

Finish Network Iteration Execute a NOP

0 1 => WAIT (nop);

Local Control Off Execute a NOP

ENDCASE => LOCAL (sel_l instr);

Select Next Local Instruction Execute FETCH Instruction

--FINISH

--Operation: Pause at end of Network Iteration Remain until ready for next Network Iteration

FINISH : CASE (finish_in start)
    1 0 => NEXT_IT (nop finish_out);

Ready for next Network Iteration (Start signal reset and Following PEs finished) Execute a NOP and Indicate that PE is ready

ENDCASE => FINISH (nop);

Wait for conditions - execute a NOP

--NEXT_IT

--Operation: PE ready for next Network Iteration Remain until Start of Network Iteration If condition change return to FINISH

NEXT_IT : CASE (finish_in start)
    1 0 => NEXT_IT (nop finish_out);

Still ready for next Network Iteration Execute a NOP and Indicate PE is ready

0 1 => FINISH (nop);

Following PE no longer ready Execute a NOP

1 1 => WAIT (nop);

Start of next Network Iteration Execute a NOP

ENDCASE => NEXT_IT (nop);

Still ready for next Network Iteration Execute a NOP

Appendix 2

Global Finite State Machine

This appendix shows the source code for the Global Finite State Machine. This code is written in MEG format; MEG is the Mealy Finite State Machine Generator that comes with the University of California at Berkeley VLSI Design Tools.

This finite state machine controls the processor array, external input/output devices and generates the SIMD program.

The full state diagram for the Global Finite State Machine can be seen in Diagram 6.12.

GLOBAL FINITE STATE MACHINE

Finite State Machine to Provide GLOBAL Instruction Stream

-- INPUT SIGNAL LIST

Signal Name Function Source

finish_out  PE array Finished     First PE
PAD_start   Start Global Program  Input PAD
PAD_lead    Lead Chip in Array    Input PAD

INPUTS :finish_out PAD_start PAD_lead;

OUTPUT SIGNAL LIST

Signal Name Function Destination

PAD_finish   PE array finished       Output PAD
global_in_4  Global Instr bit 4      First PE
global_in_3  Global Instr bit 3      First PE
global_in_2  Global Instr bit 2      First PE
global_in_1  Global Instr bit 1      First PE
global_in_0  Global Instr bit 0      First PE
sync_in      Synchronisation Signal  First PE

OUTPUTS :PAD_finish global_in_4 global_in_3 global_in_2 global_in_1 \
         global_in_0 sync_in;

FINITE STATE MACHINE DEFINITION

FSM CLOCK SIGNALS

Input latched on Clock_2
Output Available on Clock_1
RESET on Clock_1 (Inhibits FSM Output)

--SET_UP

--Operation: RESET State
--           Reset PE array, then WAIT

SET_UP :GOTO WAIT (global_in_2 global_in_0 PAD_finish);

Send RESET instruction to array (00101)

--WAIT

--Operation: Wait for array to be Ready for next Network Iteration

WAIT: IF finish_out THEN FINISH (sync_in)

PE array ready for next Network Iteration

ELSE WAIT (sync_in PAD_finish);

Wait until PE array is ready for next Network Iteration

--INST_1 (11111)

--Operation Start Next Synaptic Update (LD_WEIGHT) Check if PE array has finished Network Iteration

INST_1 :CASE (finish_out PAD_lead)
    1 0 => FINISH (sync_in PAD_finish);

PE array has finished Network Iteration

1 1 => FINISH (sync_in PAD_finish);

PE array has finished Network Iteration

0 1 => INST_2 (global_in_4 global_in_3 global_in_2 global_in_1

global_in_0 PAD_finish);

LD_WEIGHT Instruction (11111)

ENDCASE => INST_1 (sync_in PAD_finish);

Wait for PAD_lead before starting synaptic update

--LD_INPUT (00001)

INST_2 :GOTO INST_3 (global_in_0 sync_in PAD_finish);

--RESET_ACC (01000)

INST_3 :GOTO INST_4 (global_in_3 sync_in PAD_finish);

--MULTI (00011)

INST_4 :GOTO INST_5 (global_in_1 global_in_0 sync_in PAD_finish);

--MULTI (00011)

INST_5 :GOTO INST_6 (global_in_1 global_in_0 sync_in PAD_finish);

--MULTI (00011)

INST_6 :GOTO INST_7 (global_in_1 global_in_0 sync_in PAD_finish);

--MULTI (00011)

INST_7 :GOTO INST_8 (global_in_1 global_in_0 sync_in PAD_finish);

--MULTI (00011)

INST_8 :GOTO INST_9 (global_in_1 global_in_0 sync_in PAD_finish);

--MULTI (00011)

INST_9 :GOTO INST_10 (global_in_1 global_in_0 sync_in PAD_finish);

--MULTI (00011)

INST_10 :GOTO INST_11 (global_in_1 global_in_0 sync_in PAD_finish);

--MULTI (00011)

INST_11 :GOTO INST_12 (global_in_1 global_in_0 sync_in PAD_finish);

--MULTT (00100)

INST_12 :GOTO INST_13 (global_in_2 sync_in PAD_finish);

--ADD_PAR_LSW (00110)

INST_13 :GOTO INST_14 (global_in_2 global_in_1 sync_in PAD_finish);

--ADD_PAR_MSW (00111)

INST_14 :GOTO INST_1 (global_in_2 global_in_1 global_in_0 sync_in PAD_finish);

--FINISH

--Operation: Wait for Start Signal before next Synaptic Update

FINISH :IF PAD_start THEN INST_1 (sync_in PAD_finish)

Start Global Instruction Stream

ELSE FINISH (sync_in) ;

Wait for External Start Signal

199 Appendix 3

Instruction Decoder

This appendix shows the source code for the Instruction Decoder. This code is written in a format that ESP can translate to .esp format. ESP was written so that .esp files can be expressed textually - and converted automatically - to reduce the number of errors that kept on recurring.

The Instruction Decoder is implemented in the Mealy style; this allows conditional operations to be executed without the need for complex branching mechanisms. Each instruction is included with all the relevant conditions, with different output signals being expressed for each set of conditions. An example is the MULTI instruction; this instruction requires eight different output expressions so that all the possible input conditions can be implemented.

* --

Control Logic for Processing Element

November 1990 Richard Palmer UCL

PLA FORMAT INSTRUCTIONS

Number of Inputs .i 11

Number of Outputs .o 57

Number of Product Terms .p 42

INPUT SIGNAL LIST

Signal Name   Function                    Source

instr_0-4     Op-code                     instr_mux
sh_lsb_b      LSBit Shift Reg (reg_sh)    data_path
msb_c_out_b   Carry Out                   data_path
wgh_sgn       Sign of Weight              status
inp_sgn       Sign of Input Value         status
modulo        Modulo Flag                 status
op_sgn        Sign of B Operand           data_path

.ilb inst_4 inst_3 inst_2 inst_l inst_0 sh_lsb_b msb_c_out_b wgh_sgn \ inp_sgn modulo op_sgn

OUTPUT SIGNAL LIST

Signals Name   Function                          Destination

local_off      Turn Local Control Off            local
local_pla_on   Turn Local Control On             local
stop           End of Network Iteration          local
wgh_l_in       wgh_bus <0-7> => BUS_B <0-7>      data_path
wgh_h_in       wgh_bus <0-7> => BUS_B <8-15>     data_path
wgh_r          BUS_A <0-15> => wgh_bus <0-15>    data_path

Signals Name   Function                            Destination

PAD_wgh_oe, PAD_wgh_rw                             wgh_bus_arb
inp_oe         PAD_inp_oe                          inp_bus_arb
inp_r          BUS_A <0-15> => inp_bus <0-15>
inp_bus_rw     PAD_inp_rw                          inp_bus_arb
par
sigma          BUS_A <0-15> => sig_bus <0-15>      data_path
               BUS_B <0-15> => sig_bus <16-31>     data_path
PAD_sig_oe, PAD_sig_rw                             sig_bus_arb
g_w_a          BUS_A <0-15> => reg_g <0-15>        data_path
g_w_b_h        BUS_B <8-15> => reg_g <8-15>        data_path
g_w_b_l        BUS_B <0-7> => reg_g <0-7>          data_path
g_r_a          reg_g <0-15> => BUS_A <0-15>        data_path
g_r_b          reg_g <0-15> => BUS_B <0-15>        data_path

** Registers reg_g, reg_f, reg_e, reg_d and reg_c as above ** Register Designated Functions as Below :-

reg_g   Modulo Input Pointer
reg_f   Linear Input Pointer & Mod Step
reg_e   Weight Pointer
reg_d   MSWord Partial Sum
reg_c   LSWord Partial Sum & Output
reg_b   Multiplicand (9 bits)
reg_a   Accumulator
reg_sh  Multiplier (9 bits)

b_w_l        BUS_B <0-7> => reg_b <0-8>          data_path (reg_b <8> = reg_b <7>)
b_w          BUS_B <0-8> => reg_b <0-8>          data_path
b_r          reg_b <0-8> => BUS_B <0-8>          data_path
a_w_h        BUS_A <8-15> => reg_a <8-15>        data_path
a_w_l        BUS_A <0-7> => reg_a <0-7>          data_path
a_r          reg_a <0-15> => BUS_A <0-15>        data_path
a_r_ms       reg_a <0-6> => BUS_B <9-15>         data_path
sh_w_l       BUS_B <0-7> => reg_sh <0-8>         data_path (reg_sh <8> = reg_sh <7>)
sh_w         BUS_B <0-8> => reg_sh <0-8>         data_path
sh_r         reg_sh <0-8> => BUS_B <0-8>         data_path
mult         Double Length Shift                 data_path
tga          BUS_A <0-15> => adder_bus_a <0-15>  data_path
             BUS_B <0-15> => adder_bus_b <0-15>  data_path
             adder_sum <0-15> => BUS_C <0-15>    data_path
02a          zero => adder_bus_a <0-15>          data_path
inva         INV BUS_A => adder_bus_a <0-15>     data_path
02b          zero => adder_bus_b <0-15>          data_path
invb         INV BUS_B => adder_bus_b <0-15>     data_path
sat          Enable Saturated Arithmetic
add_op       Enable Adder Output
carry        Carry In Value                      data_path
shift_a_sel  Double Shift Input Select :-        shift_a_mux
               0  sum_9_b
               1  c_out_9_b


Signals Name   Function               Destination

mod_t+         Switch Modulo Status   Status Register
mod_latch      Modulo Switch Latch    Status Register

.olb local_off local_pla_on stop wgh_l_in wgh_h_in wgh_r inp_oe \ inp_r par sigma g_w_a g_w_b_h g_w_b_l g_r_a g_r_b f_w_a f_w_b_h \ f_w_b_l f_r_a f_r_b e_w_a e_w_b_h e_w_b_l e_r_a e_r__b d_w_a \ d_w_b_h d_w_b_l d_r_a d_r_b c_w_a c_w_b_h c_w_b_l c_r_a c_r_b b_w_l \ b_w b_r a_w_h a_w_l a_r a_r_ms sh_w_l sh_w sh_r mult tga 02a inva \ 02b invb sat add_op carry shift_a_sel mod_t+ mod_latch

INSTRUCTION DECODE LOGIC

LD_WEIGHT   Load Weight into reg_sh and reg_b
            Increment Weight Pointer (reg_e)

Opcode 11111   Conditions ------

Clk1: Weight Pointer (reg_e) <0-15> => BUS_A <0-15>
      BUS_A <0-15> => wgh_bus <0-15>
      BUS_A <0-15> => adder_bus_a <0-15>
      ZERO => adder_bus_b <0-15>
      HIGH => carry

Clk2: adder_sum <0-15> => BUS_A <0-15>
      BUS_A <0-15> => Weight Pointer (reg_e) <0-15>
      wgh_bus <0-7> => BUS_B <0-7>
      BUS_B <0-7> => Multiplier (reg_sh) <0-8>
      BUS_B <0-7> => Multiplicand (reg_b) <0-8>

! 11111------ e_r_a tga 02b carry e_w_a add_op wgh_r wgh_l_in sh_w_l b_w_l

LD_INPUT   Load Input into reg_sh OR reg_b
           Increment Input Pointer

Input Pointer is dependent upon MODULO State :-
    reg_f   Modulo Off
    reg_g   Modulo On

Destination Register is dependent upon Weight Sign :-
    reg_sh  Weight Sign +Ve
    reg_b   Weight Sign -Ve

Opcode 00001   Conditions ----0-   Modulo Off
                          ----1-   Modulo On
                          --0---   Weight +Ve
                          --1---   Weight -Ve

Clk1: If Modulo On
          Modulo Pointer (reg_g) <0-15> => BUS_A <0-15>
      If Modulo Off
          Linear Pointer (reg_f) <0-15> => BUS_A <0-15>
      BUS_A <0-15> => inp_bus <0-15>
      BUS_A <0-15> => adder_bus_a <0-15>
      ZERO => adder_bus_b <0-15>
      HIGH => carry

Clk2: adder_sum <0-15> => BUS_A <0-15>
      If Modulo On
          BUS_A <0-15> => Modulo Pointer (reg_g) <0-15>
      If Modulo Off
          BUS_A <0-15> => Linear Pointer (reg_f) <0-15>
      inp_bus <0-7> => BUS_B <0-7>
      If Weight +Ve
          BUS_B <0-7> => Multiplier (reg_sh) <0-8>
      If Weight -Ve
          BUS_B <0-7> => Multiplicand (reg_b) <0-8>

! 00001--0-0- f_r_a tga 02b carry f_w_a add_op inp_oe inp_r sh_w_l
! 00001--0-1- g_r_a tga 02b carry g_w_a add_op inp_oe inp_r sh_w_l
! 00001--1-0- f_r_a tga 02b carry f_w_a add_op inp_oe inp_r b_w_l
! 00001--1-1- g_r_a tga 02b carry g_w_a add_op inp_oe inp_r b_w_l

LD_SIGMA   Perform Sigma Function Lookup on Partial Sum

Opcode 00010   Conditions ------

Clk1: Partial Sum LSWord (reg_c) <0-15> => BUS_A <0-15>
      Partial Sum MSWord (reg_d) <0-15> => BUS_B <0-15>
      BUS_A <0-15> => sig_bus <0-15>
      BUS_B <0-15> => sig_bus <16-31>

Clk2: sig_bus <0-7> => BUS_B <0-7>
      BUS_B <0-7> => Output (reg_c) <0-7>

! 00010------ c_r_a d_r_b c_w_b_l sigma

MULTI   Multiply Iterate Instruction

Opcode 00011   Conditions --00--   Unsigned Multiplication
                          --10--   Mixed Multiplication
                          --01--   Mixed Multiplication
                          --11--   Signed Multiplication
                          0-----   Perform Conditional Addition
                          1-----   No Conditional Addition

Clk1: Multiplicand (reg_b) <0-8> => BUS_B <0-8>
      Accumulator (reg_a) <0-8> => BUS_A <0-8>
      Multiplier (reg_sh) <0-8> => Shift_Bus <0-8>

      If No Addition (sh_lsb_b = 1)
          Zero => adder_bus_b <0-15>

      If Addition (sh_lsb_b = 0)
          BUS_B <0-15> => adder_bus_b <0-15>

      LOW => Carry

Clk2: adder_sum <0-15> => Shift_BUS_A <0-15>
      Shift Right Shift_BUS_A => BUS_A <0-7>

      If Unsigned Mult (wgh_sgn = 0 && inp_sgn = 0)
          c_out_8 => BUS_A <8>

      If Mixed Mult ((wgh_sgn = 0 && inp_sgn = 1) || (wgh_sgn = 1 && inp_sgn = 0))
          c_out_8 => BUS_A <8>

      If Signed Mult (wgh_sgn = 1 && inp_sgn = 1)
          Adder_sum <8> => BUS_A <8>

      BUS_A <0-8> => Accumulator (reg_a) <0-8>
      Shift Right Shift_Bus => Multiplier (reg_sh) <0-7>
      Adder_sum <0> => Multiplier (reg_sh) <8>

! 000110-00-- a_r b_r tga mult a_w_h a_w_l add_op shift_a_sel
! 000111-00-- a_r b_r tga mult 02b a_w_h a_w_l add_op shift_a_sel
! 000110-10-- a_r b_r tga mult a_w_h a_w_l add_op shift_a_sel
! 000111-10-- a_r b_r tga mult 02b a_w_h a_w_l add_op shift_a_sel
! 000110-01-- a_r b_r tga mult a_w_h a_w_l add_op shift_a_sel
! 000111-01-- a_r b_r tga mult 02b a_w_h a_w_l add_op shift_a_sel
! 000110-11-- a_r b_r tga mult a_w_h a_w_l add_op
! 000111-11-- a_r b_r tga mult 02b a_w_h a_w_l add_op


MULTT   Multiply Terminate Instruction

Opcode 00100   Conditions --00--   Unsigned Multiplication
                          --10--   Mixed Multiplication
                          --01--   Mixed Multiplication
                          --11--   Signed Multiplication
                          0-----   Perform Conditional Addition
                          1-----   No Conditional Addition

Clk1: Multiplicand (reg_b) <0-8> => BUS_B <0-8>
      Accumulator (reg_a) <0-8> => BUS_A <0-8>
      Multiplier (reg_sh) <0-8> => Shift_Bus <0-8>

      If No Addition (sh_lsb_b = 1)
          Zero => adder_bus_b <0-15>
          LOW => carry (Unsigned Only)

      If Addition (sh_lsb_b = 0)
          INV BUS_B <0-15> => adder_bus_b <0-15>
          HIGH => carry (Mixed & Signed)

Clk2: adder_sum <0-15> => Shift_BUS_A <0-15>
      Shift Right Shift_BUS_A => BUS_A <0-7>

      If Unsigned Mult (wgh_sgn = 0 && inp_sgn = 0)
          c_out_8 => BUS_A <8>

      If Mixed Mult ((wgh_sgn = 0 && inp_sgn = 1) || (wgh_sgn = 1 && inp_sgn = 0))
          Adder_sum <8> => BUS_A <8>

      If Signed Mult (wgh_sgn = 1 && inp_sgn = 1)
          Adder_sum <8> => BUS_A <8>

      BUS_A <0-8> => Accumulator (reg_a) <0-8>
      Shift Right Shift_Bus => Multiplier (reg_sh) <0-7>
      Adder_sum <0> => Multiplier (reg_sh) <8>

! 001001-00-- a_r b_r tga mult 02b a_w_h a_w_l add_op shift_a_sel
! 001000-10-- a_r b_r tga mult invb carry a_w_h a_w_l add_op
! 001000-01-- a_r b_r tga mult invb carry a_w_h a_w_l add_op
! 001000-11-- a_r b_r tga mult invb carry a_w_h a_w_l add_op

RESET   Reset Processing Element
        Load Zero Into Registers, Turn Modulo On
        Turn Local Control On

Opcode 00101   Conditions ------

Clk1: Zero => adder_bus_a
      Zero => adder_bus_b
      LOW => Carry
      Modulo On
      Local On

Clk2: adder_sum => BUS_A
      BUS_A => All Registers

! 00101------ 02a 02b tga a_w_h a_w_l c_w_a d_w_a e_w_a f_w_a g_w_a \
  add_op mod_t+ mod_latch local_pla_on

ADD_PAR_LSW   Add Partial Sum LSWord

Opcode 00110   Conditions ------

Clk1: Product LSWord (reg_sh) <0-8> => BUS_B <0-8>
      Product MSWord (reg_a) <0-6> => BUS_B <9-15>
      Partial Sum LSWord (reg_c) <0-15> => BUS_A <0-15>
      BUS_A <0-15> => adder_bus_a <0-15>
      BUS_B <0-15> => adder_bus_b <0-15>
      LOW => carry

Clk2: adder_sum <0-15> => BUS_A <0-15>
      BUS_A <0-15> => Partial Sum LSWord (reg_c) <0-15>

! 00110------ sh_r a_r_ms c_r_a tga c_w_a add_op

ADD_PAR_MSW   Add Partial Sum MSWord

Opcode 00111   Conditions -1---0   No Carry, Positive
                          -1---1   No Carry, Negative
                          -0---0   Carry, Positive
                          -0---1   Carry, Negative

Clk1: Partial Sum MSWord (reg_d) <0-15> => BUS_B <0-15>
      Zero => adder_bus_a <0-15>
      BUS_B <0-15> => adder_bus_b <0-15>

      If Carry Out is Low (msb_c_out_b = 1)
          LOW => Carry

      If Carry Out is High (msb_c_out_b = 0)
          HIGH => Carry

      If Product is +Ve (op_sgn = 0)
          ZERO => BUS_A <0-15>

      If Product is -Ve (op_sgn = 1)
          ZERO => BUS_A <0-15>
          INV BUS_A => adder_bus_a <0-15>

      Enable Saturated Arithmetic Output

Clk2: adder_sum <0-15> => BUS_A <0-15>
      BUS_A <0-15> => Partial Sum MSWord (reg_d) <0-15>

! 00111-1---0 02a d_r_b tga d_w_a add_op sat
! 00111-0---0 02a d_r_b tga carry d_w_a add_op sat
! 00111-1---1 02a inva d_r_b tga d_w_a add_op sat
! 00111-0---1 02a inva d_r_b tga carry d_w_a add_op sat

ACC_RESET   Reset Accumulator

Opcode 01000   Conditions ------

Clk1: Zero => adder_bus_a <0-15>
      Zero => adder_bus_b <0-15>
      LOW => Carry

Clk2: adder_sum <0-15> => BUS_A <0-15>
      BUS_A <0-15> => Accumulator (reg_a) <0-15>

! 01000------ 02a 02b tga a_w_h a_w_l add_op

LOCAL_OFF   Turn Local Control Off

Opcode 01001   Conditions ------

Clk1: Turn Local Control Off

Clk2: No Operations

! 01001------ local_off

STOP   Stop at end of Network

Opcode 01010   Conditions ------

Clk1: Turn Local Control Off
      Stop Executing Global Control

Clk2: No Operations

! 01010------ stop

BIAS_LLSB   Load Bias LSWord LSByte into reg_c (l)
            Increment Weight Pointer

Opcode 01011   Conditions ------

Clk1: Weight Pointer (reg_e) <0-15> => BUS_A <0-15>
      BUS_A <0-15> => wgh_bus <0-15>
      BUS_A <0-15> => adder_bus_a <0-15>
      ZERO => adder_bus_b <0-15>
      HIGH => carry

Clk2: adder_sum <0-15> => BUS_A <0-15>
      BUS_A <0-15> => Weight Pointer (reg_e) <0-15>
      wgh_bus <0-7> => BUS_B <0-7>
      BUS_B <0-7> => Partial Sum LSWord (reg_c) <0-7>

! 01011------ e_r_a tga 02b carry e_w_a add_op wgh_r wgh_l_in c_w_b_l

BIAS_HLSB   Load Bias LSWord MSByte into reg_c (h)
            Increment Weight Pointer

Opcode 01100   Conditions ------

Clk1: Weight Pointer (reg_e) <0-15> => BUS_A <0-15>
      BUS_A <0-15> => wgh_bus <0-15>
      BUS_A <0-15> => adder_bus_a <0-15>
      ZERO => adder_bus_b <0-15>
      HIGH => carry

Clk2: adder_sum <0-15> => BUS_A <0-15>
      BUS_A <0-15> => Weight Pointer (reg_e) <0-15>
      wgh_bus <0-7> => BUS_B <8-15>
      BUS_B <8-15> => Partial Sum LSWord (reg_c) <8-15>

! 01100------ e_r_a tga 02b carry e_w_a add_op wgh_r wgh_h_in c_w_b_h

BIAS_LMSB   Load Bias MSWord LSByte into reg_d (l)
            Increment Weight Pointer

Opcode 01101   Conditions ------

Clk1: Weight Pointer (reg_e) <0-15> => BUS_A <0-15>
      BUS_A <0-15> => wgh_bus <0-15>
      BUS_A <0-15> => adder_bus_a <0-15>
      ZERO => adder_bus_b <0-15>
      HIGH => carry

Clk2: adder_sum <0-15> => BUS_A <0-15>
      BUS_A <0-15> => Weight Pointer (reg_e) <0-15>
      wgh_bus <0-7> => BUS_B <0-7>
      BUS_B <0-7> => Partial Sum MSWord (reg_d) <0-7>

! 01101------ e_r_a tga 02b carry e_w_a add_op wgh_r wgh_l_in d_w_b_l

BIAS_HMSB   Load Bias MSWord MSByte into reg_d (h)
            Increment Weight Pointer

Opcode 01110   Conditions ------

Clk1: Weight Pointer (reg_e) <0-15> => BUS_A <0-15>
      BUS_A <0-15> => wgh_bus <0-15>
      BUS_A <0-15> => adder_bus_a <0-15>
      ZERO => adder_bus_b <0-15>
      HIGH => carry

Clk2: adder_sum <0-15> => BUS_A <0-15>
      BUS_A <0-15> => Weight Pointer (reg_e) <0-15>
      wgh_bus <0-7> => BUS_B <8-15>
      BUS_B <8-15> => Partial Sum MSWord (reg_d) <8-15>

! 01110------ e_r_a tga 02b carry e_w_a add_op wgh_r wgh_h_in d_w_b_h

MODULO_ON   Modulo Input Address Generation On

Opcode 10000   Conditions ------

Clk1: Turn Modulo On

Clk2: No Operations

! 10000------ mod_t+ mod_latch

MODULO_OFF   Modulo Input Address Generation Off

Opcode 10001   Conditions ------

Clk1: Turn Modulo Off

Clk2: No Operations

! 10001------ mod_latch

LDPAR_LSW   Load Partial Sum LSWord

Opcode 10010   Conditions ------

Clk1: Partial Sum LSWord (reg_c) <0-15> => BUS_A <0-15>
      sig_bus <0-15> => BUS_B <0-15>
      BUS_A <0-15> => adder_bus_a <0-15>
      BUS_B <0-15> => adder_bus_b <0-15>
      LOW => Carry

Clk2: adder_sum <0-15> => BUS_A <0-15>
      BUS_A <0-15> => Partial Sum LSWord (reg_c) <0-15>

! 10010------ par c_r_a tga add_op c_w_a

LDPAR_MSW   Load Partial Sum MSWord

Opcode 10011   Conditions -0----   No Carry
                          -1----   Carry

Clk1: Partial Sum MSWord (reg_d) <0-15> => BUS_A <0-15>
      sig_bus <0-15> => BUS_B <0-15>
      BUS_A <0-15> => adder_bus_a <0-15>
      BUS_B <0-15> => adder_bus_b <0-15>

      If Carry Out is Low (msb_c_out_b = 1)
          LOW => Carry

      If Carry Out is High (msb_c_out_b = 0)
          HIGH => Carry

      Enable Saturated Arithmetic

Clk2: adder_sum <0-15> => BUS_A <0-15>
      BUS_A <0-15> => Partial Sum MSWord (reg_d) <0-15>

! 10011-0---- par d_r_a tga add_op sat d_w_a
! 10011-1---- par d_r_a tga add_op carry sat d_w_a

INPUT_PTR_L   Load Linear Pointer (LSByte) into reg_f
              Increment Weight Pointer

Opcode 10100   Conditions ------

Clk1: Weight Pointer (reg_e) <0-15> => BUS_A <0-15>
      BUS_A <0-15> => wgh_bus <0-15>
      BUS_A <0-15> => adder_bus_a <0-15>
      ZERO => adder_bus_b <0-15>
      HIGH => carry

Clk2: adder_sum <0-15> => BUS_A <0-15>
      BUS_A <0-15> => Weight Pointer (reg_e) <0-15>
      wgh_bus <0-7> => BUS_B <0-7>
      BUS_B <0-7> => Linear Input Pointer (reg_f) <0-7>

! 10100------ e_r_a tga 02b carry e_w_a add_op wgh_r wgh_l_in f_w_b_l

INPUT_PTR_H   Load Linear Pointer (MSByte) into reg_f
              Increment Weight Pointer

Opcode 10101   Conditions ------

Clk1: Weight Pointer (reg_e) <0-15> => BUS_A <0-15>
      BUS_A <0-15> => wgh_bus <0-15>
      BUS_A <0-15> => adder_bus_a <0-15>
      ZERO => adder_bus_b <0-15>
      HIGH => carry

Clk2: adder_sum <0-15> => BUS_A <0-15>
      BUS_A <0-15> => Weight Pointer (reg_e) <0-15>
      wgh_bus <0-7> => BUS_B <8-15>
      BUS_B <8-15> => Linear Input Pointer (reg_f) <8-15>

! 10101------ e_r_a tga 02b carry e_w_a add_op wgh_r wgh_h_in f_w_b_h

ADD_MOD_STEP   Add Modulo Step (reg_f) to Modulo Input Pointer

Opcode 10110   Conditions ------

Clk1: Linear Input Pointer (reg_f) <0-15> => BUS_B <0-15>
      Modulo Input Pointer (reg_g) <0-15> => BUS_A <0-15>
      BUS_A <0-15> => adder_bus_a <0-15>
      BUS_B <0-15> => adder_bus_b <0-15>
      LOW => carry

Clk2: adder_sum <0-15> => BUS_A <0-15>
      BUS_A <0-15> => Modulo Input Pointer (reg_g) <0-15>

! 10110------ f_r_a g_r_b tga add_op g_w_a

OUTPUT   Output Neuron Result to Input Value Table

Opcode 10111   Conditions ------

Clk1: Linear Input Pointer (reg_f) <0-15> => BUS_A <0-15>
      BUS_A <0-15> => Input Bus <0-15>
      Output (reg_c) <0-8> => BUS_B <0-8>
      Zero => adder_bus_a <0-15>
      BUS_B <0-15> => adder_bus_b <0-15>
      LOW => carry

Clk2: adder_sum <0-15> => BUS_A <0-15>
      BUS_A <0-15> => Input Bus <0-15>

! 10111------ f_r_a c_r_b inp_oe 02a tga add_op

RETURN   Jump to Weight Zero

Opcode 11000   Conditions ------

Clk1: Zero => adder_bus_a <0-15>
      Zero => adder_bus_b <0-15>
      LOW => Carry

Clk2: adder_sum <0-15> => BUS_A <0-15>
      BUS_A <0-15> => Weight Pointer (reg_e) <0-15>

! 11000------ 02a 02b tga e_w_a add_op

NOP   No Operation

Opcode 00000   Conditions ------

Clk1: No Operations

Clk2: No Operations

! 00000------

Appendix 4

Netlist Generation

This appendix shows the three program scripts for the Netlist Generation. These three programs are:

•NET (CSHELL & AWK Script) NET extracts all sub-cell instances from the top level cell, and outputs these in a list in the following format:

gawk -f labels data_path.mag v=data_path_0

This list of sub-cell instances is then converted to an executable file and executed.

•LABELS (AWK Script) LABELS takes a sub_cell name and instance number, and generates a list of terminals in MAGIC hierarchical format:

data_path_0/msb_sum_b

This list of terminals is then used by NETLIST to generate a single Netlist.

•NETLIST (AWK Script) NETLIST takes a list of terminals, and a Netlist File that is to be merged, and creates a single Netlist File. The algorithm works by matching all terminals with the same label:

ie. the following terminals will be made into a Net

clock_0/clk1 latch_0/clk1 instr_mux_0/clk1

This netlist is then merged with the secondary Netlist to generate a single Netlist that can be used by the MAGIC Router.
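As a usage sketch (the cell and netlist names here are illustrative, not from the thesis): to route a top-level cell chip.mag while merging a hand-written netlist power.net, the pipeline would be invoked as

net chip power

Internally this generates and executes one line of the form

gawk -f labels data_path.mag v=data_path_0

for each subcell instance, collects all terminal names into chip.labels, and finally runs

gawk -f netlist chip.labels power.net > chip.net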

# cshell and AWK Script to Generate Netlist from *.mag files
# ------

# Use by  net ARGV[1] ARGV[2]
# ARGV[1] Top level cell to be Routed
# ARGV[2] Netlist to merge with created Netlist
# ARGV[2] to be in standard netlist format, without the first line

# Program operates by extracting all sub_cell names from ARGV[1].mag
# All terminal names are then extracted from subcells.
# Finally all subcell terminals are matched to generate a Netlist File.
# ------

# PROGRAM START

echo " Netlist Generation"

# Use gawk to extract all sub_cells from ARGV[1].mag
# The magic command 'use' indicates a subcell call and the instance
# number:
#       use data_path data_path_0

gawk '/use/ {print "gawk -f labels " $2 ".mag v=" $3 }' $1.mag > sub_cells

# This creates a file containing all subcell calls.
# The format of this file is as:
#       gawk -f labels data_path.mag v=data_path_0
# This then allows each subcell .mag file named to be processed.
# The v argument is used to pass the instance name for the call.

# First convert file into an executable file

chmod +x sub_cells

# Execute file, concatenating all outputs to ARGV[1].labels

sub_cells > $1.labels

# Finally create Netlist File from all terminals extracted

gawk -f netlist $1.labels $2.net > $1.net
# ------

# ------
# AWK Script to Extract Terminal Names from .mag File
# ------

# Use by  gawk -f labels ARGV[1] v=ARGV[2]

# Program called by cshell script 'net'

# ARGV[1] Magic file to use as source
# ARGV[2] Instance name of subcell call:
#       ie data_path_0

# Program operates by scanning the magic file for terminals.
# GND and Vdd terminals are ignored.
# Output is in hierarchical form for each terminal:
#       ie data_path_0/msb_sum_b
# ------

# PROGRAM START

# First split ARGV[2] to obtain Instance name

BEGIN { split (ARGV[2], cell, "=") }

# Scan magic file for all labels:
#       ie rlabel polycontact -1234 118 -1230 118 5 msb_sum_b
# Ignore GND, Vdd and blank labels
# Print any terminals found - in hierarchical format

$1 ~ /label/ && $8 !~ /GND/ && $8 !~ /Vdd/ && $8 != "" { print cell[2] "/" $8 }


# ------
# AWK Script to Generate Netlists
# ------

# Use by  gawk -f netlist ARGV[1] ARGV[2] > file_name.net

# Program called by cshell script 'net'

# ARGV[1] List of labels in Top Level sub_cells
# ARGV[2] Netlist to merge with Generated Netlist

# ARGV[1] should be in the following format:
#       clock_0/clock1
#       clock_0/clock2
#       clock_0/clk1
# AWK Script 'labels' can be used to extract terminals from sub_cells

# ARGV[2] must be in the following format:
#       clock_0/clock1
#       decode_0/reset_c
#       decode_0/latch_c
#       (this net will be merged with net clock_0/clock1)

# Program operates by matching all terminals with the same last name
# ie. the following terminals will be made into a Net:
#       clock_0/clk1
#       latch_0/clk1
#       instr_mux_0/clk1
# After generating a net, any nets in ARGV[2] will be merged if necessary
# ------

# PROGRAM START

# Print Format Line at Start of Netlist File
# Magic uses / as hierarchy separator in terminal names

BEGIN {
    print " Netlist File"
    print ""
    FS = "\/"
    OFS = "\/"
}

# Start by reading both ARGV[1] & ARGV[2] into different arrays

# 2 dimensional arrays required to split terminal names:
#       array [ ,1] = Cell Instance
#       array [ ,2] = Label Name

END {

    # Read ARGV[1] into array1

    while ( (getline x < ARGV[1]) == 1 ) {
        split (x, node)
        array1 [++i,2] = node [2]
        array1 [i,1] = node [1]
    }

    # Read ARGV[2] into array2

    while ( (getline x < ARGV[2]) == 1 ) {
        split (x, node)
        array2 [++k,2] = node [2]
        array2 [k,1] = node [1]
    }

    # Set imax and kmax to size of arrays

    imax = ++i
    kmax = ++k

    # Loop to create Nets from each terminal in array1

    for (i = 0; i < imax; i++) {

        # If next terminal has not been set to " " then use it

        if ( array1 [i,1] != " " ) {
            flag = 0

            # Search rest of array1 for matching labels
            # If match found add terminal to net and set to " "

            for (j = (i+1); j != imax; j++) {
                if (( array1 [i,2] == array1 [j,2] ) && ( array1 [i,2] != "" )) {
                    print array1 [j,1], array1 [j,2]
                    array1 [j,1] = " "
                    flag = 1
                }
            }

            # If a net was formed add original terminal to net
            # and merge net with any matching net in array2

            if ( flag == 1 ) {
                print array1 [i,1], array1 [i,2]
                for (k = 0; k < kmax; k++) {

                    # If a net is found to be merged
                    # concatenate the two nets

                    if ( array1 [i,2] == array2 [k,2] ) {
                        array2 [k,1] = ""

                        # Print each terminal in net

                        while ( length (array2 [++k,2]) > 1 ) {
                            print array2 [k,1], array2 [k,2]
                            array2 [k,1] = ""
                        }
                    }
                }

                # End each Net with a blank line

                print ""
            }
        }
    }

    # At end of Netlist Generation print all un-merged Nets from array2

    for ( k = 0; k < kmax; k++ ) {

        # Search for Nets in array2

        if ( array2 [k,1] != "" ) {
            print array2 [k,1], array2 [k,2]
        }

        # End each Net with a blank line

        if ( array2 [k,2] == "" ) {
            print ""
        }
    }
}

Appendix 5

ESP Program Listing

This program listing in AWK is for the ESP utility. The program converts a textual description of a PLA into the format that MPLA uses. The algorithm uses simple textual replacement to generate the personality matrix.
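For illustration, a hedged sketch of the conversion (the column ordering is fixed by the .olb list in the Instruction Decoder appendix, which names 57 output signals with local_off first): a product term written textually as

! 01001------ local_off

is emitted as the unchanged input pattern followed by one 0/1 flag per output signal, with a 1 only in the local_off column:

01001------ 100000000000000000000000000000000000000000000000000000000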

# ------
# AWK Script to Generate Personality Matrix from Decode
# ------

# Use by  gawk -f esp ARGV[1] > file_name.esp

# Program called from cshell script make_pla in directory ~/magic/decode

# ARGV[1] to contain text description of PLA

# PLA Format Instructions to be in Standard Format, these are passed
# to output file unchanged.

# Each Product term to be in format as below:
#       ! <input pattern> <signals to be set high on input pattern>
# where <input pattern> is in standard format ie. 00010------

# Program works by checking each term in the .olb list (PLA Outputs)
# and if this is matched in output_signal_list print a 1, else 0.

# Finally a check is made that all output_signals were recognised.

# ------
# PROGRAM START
# ------

# No spaces in Personality Matrix between records

BEGIN { ORS = "" }

# Pass all PLA Format Instructions unchanged

/\./ { print $0 "\n" }

# Read all Output Signals into node[]

/\.olb/ { split ($0, node) }

# Read Product Term into instr[] and Output as Personality Matrix

/\!/ { split ($0, instr)

    # Print Input Pattern Unchanged

    print $2 " "

    # For each PLA Output Signal check if in output_signal_list

    count = 0
    for (c = 2; node [c] != ""; c++) {

        flag = 0
        for (cc = 3; instr [cc] != ""; cc++) {

            # Check if Output Signal is in output_signal_list

            if (node [c] == instr [cc]) {

                # If in output_signal_list set to 1 and inc count

                flag = 1
                ++count
            }
        }

        # Output status of Signal (1 if in output_signal_list
        # 0 if NOT in output_signal_list)

        print flag
    }

    # Finally check that the number of PLA Output Signals set to high
    # equals length of output_signal_list

    if (count != (NF-2)) {
        print "\n" NF " " count " Node not recognised"
    }
    print "\n"
}

# Print .end at end of Personality Matrix

END { print ".end" }

Appendix 6

Pad Frame Request File

This file is used by the pad generation facility to determine the pad frame layout. Each pin is defined as a pad library cell - or by NOBOND to indicate a pin that is not connected.

This file was supplied to the University of London VLSI Consortium, who then created a blank pad frame so that routing could be performed. Upon completion this blank frame was replaced by a real pad frame and the chip core was inserted. This last function was done outside MAGIC using the CIF format for both the pad frame and the chip core.

FRAME REQUEST FILE - 84 PIN PACK -

FRAME REQUEST FOR RICHARD PALMER, 22nd JANUARY 1991

Neural Processing Chip

Package Size: 84 Pins

Pins Used: 78 :-

11 Power (6 vccs, 5 vsss)

68 Signal

Pad cells used :

ioslp    Bi-Directional Tristate
ips8g    Non-inverting Input
opslu    Inverting Output
opslw    Tristate Output
vsss     GROUND
vccs     5 Volts
NOBOND   Not Connected

Pad Frame: 19 Pins (Top & Bottom) x 20 Pins (Sides)

Internal Size: 4300um x 3230um

Chip Size: 4184um x 4354um

Chip Area 18.22 mm2

BEGIN

Fabricator ES2   Technology 2um

Route CMP

Package 84   Pads 78

CoreX 2970 in microns
CoreY 3200 in microns

RHS from bottom to top

NOBOND pin 75 NC
ioslp pin 76 wgh_2
ioslp pin 77 wgh_3
ioslp pin 78 wgh_4
ioslp pin 79 wgh_5
vccs pin 80 5 Volts
ioslp pin 81 wgh_6
ioslp pin 82 wgh_7
ioslp pin 83 wgh_8
ioslp pin 84 wgh_9
vsss pin 1 GROUND
ioslp pin 2 wgh_10
vccs pin 3 5 Volts
ioslp pin 4 wgh_11
ioslp pin 5 wgh_12
vccs pin 6 5 Volts
ioslp pin 7 wgh_13
vsss pin 8 GROUND
ioslp pin 9 wgh_14
ioslp pin 10 wgh_15
vccs pin 11 5 Volts

TOP from right to left

NOBOND pin 12 NC
NOBOND pin 13 NC
ips8g pin 14 lead
opslu pin 15 finish
ips8g pin 16 start
ioslp pin 17 inp_15
ioslp pin 18 inp_14
ioslp pin 19 inp_13
ioslp pin 20 inp_12
ioslp pin 21 inp_11
ips8g pin 22 mod_11
vsss pin 23 GROUND
vccs pin 24 5 Volts
ioslp pin 25 inp_10
ips8g pin 26 mod_10
ioslp pin 27 inp_9
ips8g pin 28 mod_9
ioslp pin 29 inp_8
ips8g pin 30 mod_8
vsss pin 31 GROUND
ioslp pin 32 inp_7

LHS from top to bottom

NOBOND pin 33 NC
ips8g pin 34 mod_7
ioslp pin 35 inp_6
ips8g pin 36 mod_6
ioslp pin 37 inp_5
vccs pin 38 5 Volts
ioslp pin 39 inp_4
ioslp pin 40 inp_3
ioslp pin 41 inp_2
ioslp pin 42 inp_1
ioslp pin 43 inp_0
opslu pin 44 synchronise
ips8g pin 45 finish_in
ioslp pin 46 sig_11
ioslp pin 47 sig_10
ioslp pin 48 sig_9
ioslp pin 49 sig_8
vsss pin 50 GROUND
ioslp pin 51 sig_7
ioslp pin 52 sig_6
ioslp pin 53 sig_5

BOTTOM from left to right

NOBOND pin 54 NC
NOBOND pin 55 NC
ioslp pin 56 sig_4
ioslp pin 57 sig_3
ioslp pin 58 sig_2
ioslp pin 59 sig_1
ioslp pin 60 sig_0
ips8g pin 61 reset
ips8g pin 62 clock1
ips8g pin 63 clock2
opslw pin 64 sig_as
opslw pin 65 sig_ds
opslw pin 66 wgh_as
opslw pin 67 wgh_ds
opslw pin 68 inp_as
opslw pin 69 inp_ds
opslw pin 70 sig_r
opslw pin 71 inp_r
opslw pin 72 wgh_r
ioslp pin 73 wgh_0
ioslp pin 74 wgh_1

END

Appendix 7

Performance Analysis

This appendix carries out a detailed analysis of the performance of this architecture. This analysis considers three typical network mappings and calculates the proportions of time spent under SIMD control, MIMD control and in an idle state. From this analysis it is possible to estimate the average performance obtainable from this architecture.

This analysis is performed by breaking down the execution of each virtual neuron in a network mapping. The calculation uses units of 14 cycles (i.e. the number of equivalent SIMD program cycles) to simplify the arithmetic. Thus each virtual neuron is composed of:

1 Unit MIMD Control: BIAS_LLSW BIAS_HLSW BIAS_LMSW BIAS_HMSW INPUT_PTR_L INPUT_PTR_H LOCAL_OFF

SIMD Control: Number of units dependent upon network mapping

1 Unit MIMD Control: ESCAPE LD_SIGMA INPUT_PTR_L INPUT_PTR_H OUTPUT

The number of cycles spent in each control type, and the number of NOP instructions, are then calculated for the entire network. This enables the proportion of time spent in either type of control to be obtained, thus allowing the percentage of peak performance to be calculated for each network mapping.
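The percentage of peak performance then follows directly; as a worked sketch (the symbols here are introduced for clarity and are not from the original):

$$\text{Peak \%} = \frac{C_{\mathrm{SIMD}}}{C_{\mathrm{SIMD}} + C_{\mathrm{MIMD}} + C_{\mathrm{NOP}}} \times 100$$

For the Silicon Efficient mapping below this gives $1518 / (1518 + 26 + 40) \times 100 \approx 95.8\%$.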

The three network mappings used for this analysis are the same as those presented in Section 5.6.3.


Silicon Efficient

In this network mapping six processors are used in a non-pipelined manner. PE 1 implements a 1st Hidden Layer neuron, a 2nd Hidden Layer neuron and the Output Layer neuron. PEs 2-6 implement only a 1st Hidden Layer neuron and a 2nd Hidden Layer neuron.

Network Mapping PE 1

          1st Hidden Layer   2nd Hidden Layer   Output Layer
Mapping   246                6                  6

Network Mapping PEs 2-6

          1st Hidden Layer   2nd Hidden Layer   Output Layer
Mapping   246                6                  Idle

Breakdown of Network Mapping

PE 1          PEs 2-6
  1 MIMD        1 MIMD
246 SIMD      246 SIMD
  1 MIMD        1 MIMD
  1 MIMD        1 MIMD
  6 SIMD        6 SIMD
  1 MIMD        1 MIMD
  1 MIMD        8 NOP
  6 SIMD
  1 MIMD

Total:

258 + 5(252) SIMD = 1518  => 95.8%
6 + 5(4) MIMD     =   26  =>  1.6%
0 + 5(8) NOP      =   40  =>  2.5%

Percentage of Peak Performance: 95.8%

Reduced Sample Time

In this arrangement 13 processors are used, arranged to optimise the sample period. PEs 1-12 each implement half a 1st Hidden Layer neuron. PE 13 implements all six 2nd Hidden Layer neurons and the Output Layer neuron.

In this network mapping additional MIMD cycles are required for the partial sum accumulation for PEs 1-12. This action requires 12 additional cycles (each transfer requires two cycles, for the LSWord and MSWord).

Network Mapping PEs 1-12

          1st Hidden Layer   2nd Hidden Layer   Output Layer
Mapping   123                -                  -

Network Mapping PE 13

          1st Hidden Layer   2nd Hidden Layer   Output Layer
Mapping   -                  36                 6

Breakdown of Network Mapping

PEs 1-12      PE 13
  1 MIMD        1 MIMD
123 SIMD       36 SIMD
  2 MIMD        1 MIMD
                1 MIMD
                6 SIMD
                1 MIMD
               80 NOP

Total:

12(123) + 42 SIMD = 1518  => 92.6%
12(3) + 4 MIMD    =   40  =>  2.4%
0 + 80 NOP        =   80  =>  4.8%

Percentage of Peak Performance: 93.4%

Reduced Response Time

In this arrangement 12 processors are used in a non-pipelined manner. Even numbered PEs implement half a 1st Hidden Layer neuron, half a 2nd Hidden Layer neuron and 1/6 of the Output Layer neuron. Odd numbered PEs implement half a 1st Hidden Layer neuron and half a 2nd Hidden Layer neuron.

In this network mapping additional MIMD cycles are required for the partial sum accumulation for even numbered PEs. This action requires 20 additional cycles (each transfer requires two cycles, for the LSWord and MSWord).

Network Mapping Even Numbered PEs

          1st Hidden Layer   2nd Hidden Layer   Output Layer
Mapping   123                3                  1

Network Mapping Odd Numbered PEs

          1st Hidden Layer   2nd Hidden Layer   Output Layer
Mapping   123                3                  Idle

Breakdown of Network Mapping

Even Numbered PEs   Odd Numbered PEs
  1 MIMD              1 MIMD
123 SIMD            123 SIMD
  1 MIMD              1 MIMD
  1 MIMD              1 MIMD
  3 SIMD              3 SIMD
  1 MIMD              1 MIMD
  1 MIMD              1 NOP
  1 SIMD              1 NOP
  2 MIMD              2 NOP

Total:

6(127) + 6(126) SIMD = 1518  => 94.4%
6(7) + 6(4) MIMD     =   66  =>  4.1%
0 + 6(4) NOP         =   24  =>  1.5%

Percentage of Peak Performance: 94.4%

Appendix 8

Published Literature

This appendix includes all the published literature that relates to this thesis. This comprises two science articles from the TIMES Newspaper, and a conference paper.

R. Matthews, "More research cash for computers that imitate brain", TIMES Newspaper

P. Wright, "NHS ear implant offer to totally deaf", TIMES Newspaper

R. Palmer, "High Performance Digital Neural Network Implementation for Small Scale Portable Applications", Proc. of the 2nd Int. Conference on Microelectronics for Neural Networks, Munich, Oct 1991


More research cash for computers that imitate brain

By Robert Matthews, Technology Correspondent

The Government is about to announce substantial funding for research into computer science, including work on designing so-called "neural" computers which mimic the workings of the human mind.

The move, which emerged in London yesterday at an international conference on neurocomputing, is being seen as Britain's attempt to catch the United States and Japan in the technology likely to dominate the coming decades.

Neurocomputers are designed to carry out tasks, such as recognizing faces and exercising judgement in financial matters, which humans still do better than the fastest conventional computer. That is because neurocomputers can solve many small problems at the same time, while conventional computers have to deal with one problem at a time.

The funding is to come from two separate sources. First, the Medical Research Council, the Science and Engineering Research Council and the Economic and Social Research Council are joining in a five-year, £12 million programme of research into how computers can help to unravel the mysteries of learning and of the brain. Neurocomputers will be built to try to understand how children acquire language. Other research will look at how neurocomputers can directly help humans.

Scientists at University College London are working on a revolutionary hearing aid which incorporates a simple, tiny neurocomputing microchip capable of recognizing speech and making it more intelligible.

Dr Megan Davies, of the Medical Research Council, which is co-ordinating the programme, said yesterday: "All the research councils see this as an area which is of considerable academic interest and an obvious area for exploitation". Universities and polytechnics have a month to apply for the first round of funds.

Money is also to be provided by the Department of Trade and Industry and the Science and Engineering Research Council for a research programme on the construction of neurocomputers. One project which has already benefited from department funding is a machine, built at Imperial College London, which can memorize different shapes and faces and pick them out from others in about 1/25th of a second. An early version is in commercial production. The Department of Trade and Industry believes that British scientists can develop many more such machines able to compete with those from the US and Japan.

The conference was told that after years in the academic doldrums, neurocomputing is becoming a commercially viable technology in many fields, from diagnosing heart problems from cardiac measurements to predicting currency fluctuations. One American company is using a neurocomputer to make trading decisions on the New York Stock Exchange. Neurocomputing Ltd, a small Winchester-based company, last night announced that it had become the first British company to import such systems for distribution in the UK. A number of insurance companies are using the system.


NHS ear implant offer to totally deaf

[TIMES Newspaper article; the text is not recoverable from the reproduction.]

HIGH PERFORMANCE DIGITAL NEURAL NETWORK IMPLEMENTATION FOR SMALL-SCALE PORTABLE APPLICATIONS

Richard Palmer

Department of Computer Science, University College London, UK

ABSTRACT

This paper presents a digital architecture that achieves a high performance in neural network implementations. Minimal size and power consumption have been achieved by combining the versatility obtained by a MIMD array, with the simple control and communications made possible by a SIMD array. The benefits of this arrangement are presented, as well as how such an array can be controlled by both MIMD and SIMD control strategies.

INTRODUCTION

This paper briefly outlines the available neural processing devices, and looks at their suitability for small-scale applications, where small size and portability are important features. After considering the available solutions an architecture is proposed that is better suited for these applications.

Typical application areas where such a device may be used are process control systems, medical diagnosis equipment and many forms of signal processing applications. An example would be in mobile communications: echo reduction and data compression are two such tasks that a neural digital signal processor could perform. For these applications a neural device is required that has the following features:

Work independently from a host processor
Be very compact
Provide real-time performance
Low power consumption
Programmable for different Network Topologies

This paper takes a broad look at some of the techniques used in high performance neural devices. Particular emphasis is put on flexibility, performance and the achievement of real-time constraints. A strategy is then presented that solves the limitations seen in many current neural devices. Finally the details of the design used to implement this strategy are given; both the programming and the hardware of this device are considered.

DIGITAL NEURAL ARCHITECTURES

There are two main classes of digital neural architecture available; these can be classified according to their control types:

MIMD ARRAYS

These implementations make use of parallel multipliers and pipelined architectures to obtain a very high performance. Due to their high performance it is possible to time multiplex the execution of a network across an array of processors. The ability to do this can considerably reduce the number of processors required for an implementation. By careful calculation the number of processors can be estimated to achieve the required run time performance for any network. In MIMD arrays the speed up obtained from the use of multiple processors can be close to linear with the number of processors used, provided that care is taken to ensure that the communications strategy does not become overloaded.



The technique used in mapping neurons onto an arbitrary size array I have termed 'Virtual Neurons', since each neuron is not represented by a physical processor, but instead by a logical mapping onto the array. It is this factor that gives MIMD arrays their versatility, and their ability to implement almost any network topology. The communications carried out in MIMD arrays involve message passing [1]. This requires the packaging of each value to be transmitted into a packet; these packets contain the source address, destination address and the actual value. This packet is then transmitted from the source onto the communications bus. The technique used in actually transmitting this packet of data varies, and will depend upon the connectivity between the processors. Shared buses and multi-dimensional arrays are the two most common forms of connectivity.

MIMD arrays offer good flexibility, and the use of software mapping allows many different network topologies to be mapped onto an array. However the requirement for each processor to have its own control, memory and communications hardware makes MIMD arrays expensive in silicon area.

SIMD ARRAYS

SIMD arrays use very large numbers of processors to produce a fine granularity array. This approach is similar to biological systems, with very small individual processing elements working in parallel. This class of neural implementations can provide very compact and efficient devices. Network mapping is defined by dedicated bus structures, which removes the requirement for external arbitration or global control. Each processor can work autonomously with a minimal control strategy, thus exploiting the distributed nature of neural networks, and their local interconnections. This considerably reduces the device complexity, and can bring down the size of an implementation to a single chip. The following array topologies have been proposed for SIMD neural arrays:

Shift Arrays [2] The processors are arranged in a linear array, with the output from one layer being shifted along, and used as input to the following layer.

Broadcast Bus [3] The processors are arranged in layers with the outputs from one layer being connected to all inputs in the following layer via a broadcast bus The output from a processor is then broadcast to all the inputs of the next layer simultaneously

Single and Multi-Dimension arrays [4] Communications are performed by message passing via processor-to-processor links. Such a strategy requires the incorporation of routing hardware within each processor.

Diagram 1 shows an outline of how each of these communication methods would be used to implement a network of 4, 3 and 4 units in the 1st hidden, 2nd hidden and output layers respectively.

a) Shift Array   b) Broadcast Bus   c) Processor Array

Diagram 1: Communication strategies



Two good studies have been carried out on the implementation of neural networks on SIMD arrays [5,6]. These studies show a small variation in performance between different communication strategies, with some strategies being better suited to certain network topologies than others. I will not re-cover the work done in these papers, but instead I will generalise all SIMD implementations into a single group. This generalisation is valid since it uses the best theoretical performances of these SIMD arrays.

When used to implement multilayer networks a SIMD Array will operate in a manner similar to a pipeline. This is because each layer of processors uses the previous layer's output for its input. This arrangement will exhibit all the properties of a pipelined processor with regard to throughput (pipeline iteration time) and latency (time from input affecting output).
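As a brief worked sketch of these pipeline properties (the symbols are introduced here for clarity and are not from the paper; the figures are those of Tables 1 and 5 below):

$$T_{\mathrm{sample}} = \max_{j} T_{\mathrm{layer}\,j}, \qquad T_{\mathrm{response}} = N_{\mathrm{stages}} \times T_{\mathrm{sample}}$$

For the three-layer SIMD mapping of Table 1, $T_{\mathrm{sample}} = 246$ cycles and $T_{\mathrm{response}} = 3 \times 246 = 738$ cycles, the figures quoted in Table 5.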

Using this assumption it is now possible to isolate various problems that SIMD arrays present, and to predict their performance in implementing specific networks.

FLEXIBILITY

The first problem found in SIMD arrays is due to the inflexibility of these implementations. The requirement for the bus structure in each implementation to match the network topology results in a customised device having to be built for each network. This factor seriously limits the flexibility of these implementations, and means that the soft programming of a network onto a device is not possible.

PERFORMANCE

As with any pipeline, the time required between each stage in the pipeline is determined by the slowest stage. This can cause major problems in some network topologies.

If the number of units in the different layers varies, then the time taken to compute the layer with the largest number of inputs will determine the pipeline iteration time. Idle cycles will have to be inserted into the shorter layers to keep them in synchronisation. This problem is particularly acute in signal processing applications, where the input layer tends to be extremely wide. This width is required to capture time and frequency components of the signal being processed.

The method that I have used in the analysis of different architectures involves looking at the number of cycles required to implement each neuron, and the number of idle cycles that a particular architecture will impose.

To do this I shall use a typical speech processing network as an example. This network is used to extract the Fundamental Period (Tx) from a speech signal, and is used to help profoundly deaf people to lip-read [8]. The network is shown in Diagram 2; the main features of this network can be summarised as below:

246 Input Units

6 Neurons in 1st Hidden Layer

6 Neurons in 2nd Hidden Layer

1 Neuron in Output Layer

This network structure is required to present the full spectrum of speech, and to allow some time context to be present in the input signal. Forty-one consecutive time samples, composed of six frequency bands, are used to compose the input layer.

Table 1 shows how such a SIMD array would be used to implement the Tx Network. This table indicates how many processors would be used for each layer in the network, and how many cycles each processor would require to implement each neuron. From this table it is possible to compute the average efficiency of the processor array:

1st Hidden Layer: 246 Synapses               = 100%
2nd Hidden Layer:   6 Synapses (6/246)*100   = 2.4%
Output Layer:       6 Synapses (6/246)*100   = 2.4%



             1st Hidden Layer   2nd Hidden Layer   Output Layer
Mapping      6 PEs              6 PEs              1 PE
Iteration 1  246
Iteration 2                     6 + 240 idle
Iteration 3                                        6 + 240 idle

Table 1 SIMD performance

The total efficiency can be calculated by taking the sum of the efficiencies of each neuron, and finding the mean of this value:

1st Hidden Layer: 6 Neurons (6*100)  = 600
2nd Hidden Layer: 6 Neurons (6*2.4)  = 14.4
Output Layer:     1 Neuron  (1*2.4)  = 2.4
Total                                = 616.8
Average for Whole Network: 13 Neurons (616.8/13) = 47.5%

This calculation shows the low efficiency obtained from an array of SIMD neural processors operating in a pipelined manner. The mean utilisation is 47.5%; more importantly, 7 out of the 13 processors are operating at only 2.4% utilisation.


Diagram 2: Fundamental Period Extraction Network

REAL-TIME CONSTRAINTS

Achieving the required time performance can also present problems for such an implementation [7]. Since the speed of operation is determined by the pipeline iteration time and the number of layers, there are limited methods that can be used to adjust the performance of a network implementation. In any real-time system there are two primary constraints that have to be met:

Sample Period  This determines the speed at which data is input to a system. The sample period can be critical in many digital signal processing applications. If it is not achieved then frequency folding will occur in the input frequency spectrum, causing great inaccuracies in the models being implemented.



Response Time  This represents the time that occurs between a change on the input affecting the output. This can often be critical; for example in speech processing, the delay must be short to avoid any mismatch between audio and visual signals.

It is important to achieve both these constraints as closely as possible. If the device is too slow then problems such as those indicated above will occur. If it is too fast then the device must be more complex than necessary, representing wasted silicon area and power consumption. Therefore it is important to be able to adjust the real-time performance of an implementation to match that which is required.

SUMMARY OF DIGITAL NEURAL ARCHITECTURES

From the assessments carried out in this section, it has been possible to highlight the relevant advantages and disadvantages of MIMD and SIMD arrays. A summary of the properties of each of these is given below:

-- MIMD arrays rely upon a software mapping of a network across an array, thus giving MIMD arrays considerable flexibility and the ability to change a network mapping by simply altering the software. Other benefits include the ability to distribute evenly a network's workload across the processors, thus ensuring maximum utilisation of all processors. However these benefits come at the cost of considerably increased complexity. The requirement for each processor to have its own control, local memory and communications support all add to the size, cost and power consumption of an implementation.

-- SIMD arrays represent the opposite of MIMD arrays. They offer much simpler methods of implementing neural networks. The devices surveyed in this paper make use of hard wiring to define a network topology; this eliminates the need for complex control and communications. However this simplification comes at the cost of reduced flexibility, and the lack of soft programmability.

By considering the beneficial features of each of these types of architecture a hypothesis can be formed. This hypothesis states:

If an architecture could be developed that combines the properties exhibited by both MIMD and SIMD arrays, then a device could be built that exhibits all the following properties: High Flexibility, Low Complexity, Soft Programmability, Low Power Consumption, High Performance, and Low Cost of Implementation.

PROPOSED SOLUTION

The architecture that I am presenting is intended to enable a cost effective, small-scale portable neural implementation to be built. This requires a combination of the approaches used in both MIMD and SIMD arrays. By combining both MIMD and SIMD control it is possible to implement 'Virtual Neurons' on a device with minimal control and communications hardware.

ARCHITECTURE OF DEVICE

The architecture of this device is outlined in Diagram 3. The key points in this diagram are discussed below:

Processor Array  The main processing is performed across an array of custom built processors. These operate on a special instruction set that has been optimised for neural models. Each processor in this array has the ability to operate under both MIMD and SIMD control.

SIMD Control The SIMD instructions for the array are generated by the SIMD Control Block. This operates by broadcasting instructions to the array. This is done so that each processor imposes a single cycle delay in the SIMD instruction stream. This has the effect of separating the SIMD instructions across



the array, thus ensuring each processor is operating on a different instruction. This scheme is used to arbitrate the shared buses, since it ensures that no two processors can be executing the same instruction at the same time.

Diagram 3: Outline of architecture

MIMD Control  Because each processor will have a different MIMD program, the source for the MIMD control must be held locally. This local memory block holds both the MIMD program and the Weights. The idea of combining the program and the weight data in the same memory block was to simplify the hardware requirements. Kohonen used a similar technique in the Phonetic Typewriter. He developed this using a TMS320 digital signal processor, with all the weights being held as immediate operands:

MPYK 0346 #Multiply Immediate

The drawback to using this technique was the requirement for the opcode to be repeated for each weight. Such a scheme would represent an unnecessary waste of memory space in a portable device. The technique developed in this design was to use two special instructions to indicate the start and the end of a block of weights in the memory. Such a scheme would allow program instructions and data to co-exist in the same memory with a very low overhead.

Input Table  The communications for the array are provided by a shared memory block, called the Input Table. This memory block also allows input to and output from the device. A synchronisation scheme is used to allow external devices to read and write from this memory block when the processors are not using it.

Sigma Function

The last feature to note is the use of a lookup table to implement the sigma function. This lookup table is again shared between the processors to minimise hardware.

PROGRAMMING STRATEGY

The programming strategy used in this architecture has two main requirements:

SIMD Program

MIMD Program



The instruction set used is the same for each program type. These instructions are specially designed for neural computations, and for pointer manipulations. The most commonly used instructions from this instruction set are given below:

LD_INP_PTR 0hxx   #Load Input Pointer
LD_BIAS 0hxx      #Load Neuron Bias
LD_WEIGHT         #Load Weight Value
LD_INPUT          #Load Input Value
ACC_RESET         #Reset Accumulator
MULTI             #Multiply Iterate
MULTT             #Multiply Terminate
ADD_PAR_LSB       #Add Partial Sum LSWord
ADD_PAR_MSB       #Add Partial Sum MSWord
LD_SIGMA          #Perform Sigma Function on Partial Sum
OUTPUT            #Output Neuron Result
LOCAL_OFF         #Start of Weights
ESCAPE            #End of Weights
STOP              #Stop at End of Network Iteration
RETURN            #Jump to Weight Table 0h0000
RESET             #Reset Processor
NOP               #No Operation

The programming method used in each type of control is very different. The SIMD program is hardwired, and is not user modifiable, while the MIMD program is fully flexible and is totally user defined.

SIMD PROGRAM

The SIMD control is fixed, and is generated by dedicated hardware on-chip. The SIMD control consists of a sequence of instructions that performs a single synaptic update with each iteration. This program is continuously broadcast to the array, which will then update each virtual neuron under this SIMD control. The SIMD program consists of 14 instructions that will fetch an input value and its associated weight, multiply them and add the product to the partial sum.

The hardwiring of the SIMD program into the SIMD control block is done since the algorithm used for each synaptic update remains fixed. All that will alter between different networks are the virtual neuron mappings and the weights themselves.
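As an illustrative sketch only -- the paper does not enumerate the 14 instructions, so this breakdown is an assumption based on the instruction set above and the 9-bit multiplier of the processing element -- the broadcast loop could take the form:

LD_WEIGHT         #fetch the next weight from the weights table
LD_INPUT          #fetch the corresponding input value
ACC_RESET         #clear the accumulator
MULTI (x8)        #eight multiply-iterate steps through the multiplier bits
MULTT             #multiply-terminate step (handles the sign bit)
ADD_PAR_LSB       #add product LSWord to the partial sum
ADD_PAR_MSB       #add product MSWord to the partial sum

giving 14 instructions per synaptic update.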

MIMD PROGRAM

The continuous broadcasting of the SIMD program allows each processor to automatically update a number of synapses. The MIMD program is required to define which input values to use, and to provide the weights. The sequence of operations required for each virtual neuron can be summarised as below (see the sketch after this list):

Load Input Pointer for Neuron's Input
Load Neuron's Bias
Update each Synapse (Performed under SIMD Control)
Compute Sigma Function
Load Input Pointer for Neuron's Output
Write Neuron's Output into Input Table
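A hedged sketch of this sequence as it might appear in a processor's local memory, using the instruction set above (the operand addresses 0h00 and 0h40 are purely illustrative):

LD_INP_PTR 0h00   #point at the neuron's first input value
LD_BIAS 0h10      #load the neuron's bias
LOCAL_OFF         #start of weights: switch this processor to SIMD control
<weight values>   #one weight per synapse, consumed by the SIMD loop
ESCAPE            #end of weights: return to MIMD control
LD_SIGMA          #sigma function lookup on the partial sum
LD_INP_PTR 0h40   #point at the neuron's output entry in the Input Table
OUTPUT            #write the neuron's result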

In addition to this sequence of instructions are instructions that define the start and end of the weights of each virtual neuron, and that indicate the completion of each network update. These instructions require special consideration since they have a great effect on the operations of each processor.

LOCAL_OFF  This instruction is used to indicate the start of a block of weight values. It has the effect of switching control from MIMD to SIMD for that individual processor. The following bytes in the



weights table will then be treated as weights data for the next virtual neuron. This virtual neuron will then be updated under SIMD control, until an ESCAPE instruction is reached.

ESCAPE  This instruction is used to determine the end of each block of weight values for a virtual neuron. This instruction acts in a similar manner to a 'rogue value' at the end of a block of data. This rogue value is detected by special logic built into the weights table bus controller. When this instruction is detected the processor stops executing the SIMD instruction stream, and uses the following bytes in the weights table as MIMD instructions.

STOP This instruction is used to indicate to the processor that a network update is complete. The processor will then send a signal to the SIMD control - which will synchronise the input and output from the network. Each processor will then wait for a signal from the SIMD control before initiating the next network update.

IMPLEMENTATION EXAMPLES

Now that the proposed design has been introduced it is possible to carry out some analysis of its possible performance. The method used here will be the same as that used for the SIMD array above.

Three examples are given; these indicate the possibilities that exist in optimising the network mapping for various performance criteria. The three criteria used in these examples are silicon efficiency, reduced sample period and reduced response time.

Tables 2 - 4 show these examples, and explain how the network mappings were performed.

             1st Hidden Layer   2nd Hidden Layer   Output Layer
Mapping      ------------ 6 PEs ------------
Iteration    246                6                  6    (5 PEs Idle)

Table 2 Virtual neuron - Silicon efficiency

             1st Hidden Layer   2nd Hidden Layer   Output Layer
Mapping      12 PEs             ---------- 1 PE ----------
Iteration 1  123
Iteration 2                     36 + 6 + 81 Idle

Table 3 Virtual neuron - Reduced sample period

Table 5 summarises the key features from each example. This allows comparisons to be made between the different implementations.

The figures in Table 5 are calculated as below:

No. Processors is the number of processors required by that example.

Sample Period is the pipeline iteration time. This period is what determines the speed at which input can be accepted by the network.

Response Time is calculated by the product of the pipeline iteration time and the number of pipeline stages. This represents the time taken for a change at the input to affect the outputs.



             1st Hidden Layer   2nd Hidden Layer   Output Layer
Mapping      ------------ 12 PEs ------------
Iteration    123                3                  1    (6 PEs Idle)

Table 4 Virtual neuron - Reduced response time

                 Table 1   Table 2   Table 3   Table 4
No. Processors   13        6         13        12
Sample Period    246       258       123       127
Response Time    738       258       246       127
Idle Time        1680      30        81        6

Table 5 Summary of solutions

Idle Time is a measure of the efficiency with which an implementation can map the network. This value is a sum of the idle cycles for an individual network update.

This table indicates the advantages that can be achieved by Virtual Neurons. It has shown the improved performance that can be obtained by eliminating as many idle cycles as possible, and the flexibility in being able to map a network onto an array in a variety of ways.

CONCLUSION

At present there are many possible areas where neural networks can be put to use. However what is stopping their widespread use in many consumer products is the lack of small and cheap hardware on which to implement these networks.

This paper tackles this problem, and presents a widely versatile neural implementation that can carry out the recognition phase on a very small device. Parallelism is used with a combination of SIMD and MIMD control strategies; this allows Virtual Neurons to be mapped onto the available processors in a variety of ways. This flexibility allows a wide range of different network topologies to be mapped onto such an array, and ensures that the real-time constraints that a system may present are met.

The implementation of this design was carried out with each processor requiring only 5300 transistors (not including PADS or RAM). A single processor occupies only 5.5 mm2 in 2 micron CMOS. This device shows the potential of future implementations that can provide multiple processors and RAM all on a single chip.

ACKNOWLEDGMENTS

The author would like to thank Dr Peter Rounce for his help in this work, and the SERC, which funded this research.

REFERENCES

1. J. Ghosh, K. Hwang, "Mapping Neural Networks onto Message-Passing Multicomputers", Journal of Parallel and Distributed Computing, 1989


2. F. Piazza, M. Marchesi, G. Orlandi, "A Digital 'Snake' Implementation of the Back-Propagation Neural Network", Proc. IEEE ISCAS, 1989

3. P. Treleaven, M. Vellasco, "Neural Networks on Silicon", 3rd IFIP Workshop on WSI, 1989

4. W. D. Hillis, "The Connection Machine", MIT Press, 1985

5. Y. Kashai, Y. Be'ery, "A Comparison Between Digital Neural Network Architectures", Dept. of Electronic Communications, Control and Computer Systems, Tel-Aviv Univ., Israel

6. D. J. Myers, G. E. Brebner, "The Implementation of Hardware Neural Systems", British Telecom Research Laboratories

7. B. Soucek, "Neural and Concurrent Real-Time Systems", Wiley Interscience, 1989

8. I. Howard, J. R. Walliker, "The Implementation of a Portable Real-Time Multilayer Perceptron Speech Fundamental Period Estimator", Eurospeech, Paris, 1989

