Memristive Device Based Brain-Inspired Navigation and Localization for Robots

A dissertation submitted to the

Graduate School

of the University of Cincinnati

in partial fulfillment of the

requirements for the degree of

DOCTOR OF PHILOSOPHY

in the Department of Mechanical and Materials Engineering

of the College of Engineering and Applied Sciences

by

Mohammad Sarim

Master of Technology

Aligarh Muslim University, UP, India

December 2012

Committee Chair: Manish Kumar, Ph.D.

Abstract

Biomimetic robots have attracted tremendous research interest for applications ranging from resource hunting in unknown terrains to search and rescue operations in disasters such as earthquakes, fires, or terror attacks. Biological species are known to learn from the environment, collect and process data, and make appropriate decisions rather seamlessly.

Such sophisticated computing capabilities are difficult to achieve in their artificial robotic counterparts, especially in real time and with ultra-low energy consumption. Traditionally, researchers have explored computational methods such as artificial neural networks.

However, a biological neuronal system is inherently complex. Simulating such a large network of neurons on conventional computing platforms is computationally very expensive and consumes tremendous power. Further, large-scale networks suffer from the famous “memory wall” problem.

Emerging synaptic memory devices are ideal for addressing these limitations, since they are capable of simultaneous computation and learning through resistance switching of their oxide layer on application of a voltage signal, thus avoiding the memory wall.

In this work, we have developed a memristive device based learning scheme for imparting deep learning capabilities to robots. These devices are arranged in a crossbar array to develop a neuromorphic computing platform that can provide a highly scalable and energy efficient computing architecture for learning in robots. A device-physics based model of a two-terminal synaptic memory device is used to simulate the learning behavior in the robot. The resistance switching is modeled using the Frenkel-Poole emission in the memory device.

We have demonstrated the validity of this approach by navigating a robot, integrated with just a few devices, through an unknown environment rigged with obstacles. A major advantage of our approach over traditional navigation methods, such as potential field algorithms, is the ability to escape regions of local minima. Further, our learning scheme is highly scalable and can readily be applied to any miniature robot. We demonstrated this by integrating our learning scheme on a commercially available Khepera III robot by K-Team and guiding it through an unknown environment filled with obstacles. The Khepera III is a very capable robot with on-board processing and integrated ultrasonic and infrared proximity sensors. The sensors feed current into the modeled memory devices which, in turn, drive the robot according to the resistance values. These resistance values, or synaptic weights, are then 'learned' through the mechanism of Spike Timing Dependent Plasticity (STDP).

To explore the scalability of this approach, we also developed a robot localization mechanism using a large-scale network of such devices. This approach is motivated by the ‘place cells’ located in the hippocampus of the animal brain that fire when the animal enters a corresponding ‘place field’ in the environment, thus making them responsible for providing a cognitive map of the environment.

As the robot explores the environment, the localization network associates the environment features around it with the location information coming from the place cells. We demonstrated this approach on the Khepera III robot moving in an unknown environment with randomly located distinct landmarks. We also integrated this localization mechanism with the navigational ability developed earlier to complete the robot navigation scheme. The experimental results show that, after learning, the robot is capable of localizing itself with good precision, thus corroborating the validity of this approach and establishing the potential and robustness of memristive device based networks.


Acknowledgments

“In the name of Allah, the Most Beneficent, the Most Merciful”

First and foremost, I would like to express my sincere gratitude to my research advisor and committee chair, Dr. Manish Kumar, for his constant support and motivation throughout the development of this work. This work would not have been possible without the encouragement and guidance of Dr. Kumar, and I am highly grateful for his excellent advice and support. I would also like to thank him for all the financial support that enabled me to carry out this work with cutting-edge research and computational facilities. His constant encouragement towards publications, conference presentations, and talks has added a lot of value to my research understanding during my PhD journey.

Many thanks to my committee members, Dr. Rashmi Jha, and Dr. Ali Minai, for their excellent guidance and inputs towards making this work an outstanding research breakthrough. Their inputs on hardware modeling and simulation aspects, respectively, have greatly added to the value of this work and the publications. I would also like to thank Dr. David Thompson and Dr. Tamara

Lorenz, for their support towards the refinement of this work through their reviews, inputs and suggestions. I am deeply grateful to all my committee members, collectively, for serving on my dissertation committee and reviewing the work in time to make it perfect.

I would also like to thank Dr. Balaji Sharma, my senior at CDS Lab, for assisting me with my robots and getting them ready for experiments. Further thanks to my lab members, Alireza,

Ruoyu, Mohammadreza, Gaurav, Rumit, Aditya, Hans, Matthew, and everyone else at CDS Lab for their support and help during the past few years.

Many thanks to Nicole Jenkins and Lorri Blanton from UC International for shaping me up as a leader through the IPALs program. Cheers to fellow IPALs and my best wishes for this initiative.

To my Mom and Dad, who supported my decision to travel 7,600 miles in pursuit of knowledge: I can never thank them enough for their motivation and constant encouragement while being away from their son. Thanks to my siblings Sana and Yasir for all their love. I would like to especially thank my dearest wife, Sara, for her patience and motivation throughout this PhD journey, which

I could not have imagined without her.

I am grateful to The University of Toledo as well as University of Cincinnati, the College of

Engineering and Applied Sciences, the Department of Mechanical and Materials Engineering, and the Graduate School at UC for providing me an excellent research environment. I also acknowledge the National Science Foundation for funding this research work.

Contents

1 Introduction 1

2 Literature Review 5

2.1 Types of Machine Learning ...... 5

2.1.1 Supervised Learning Techniques ...... 6

2.1.2 Unsupervised Learning Techniques ...... 9

2.2 Learning in Biological Systems ...... 10

2.3 Biological Navigation and Localization ...... 11

2.4 Learning in Robotics ...... 12

2.4.1 Navigation ...... 12

2.4.2 Simultaneous Localization and Mapping (SLAM) ...... 13

2.5 Problem Identification ...... 14

3 Spike Timing Dependent Plasticity 16

4 Device Models 19

4.1 Macro-Model of the Memristor Device ...... 19

4.2 Leaky Integrate and Fire Neuron Model ...... 21

4.3 Device-Physics Derived Model ...... 22

5 Learning Schemes 26

5.1 System Description ...... 26

5.1.1 Sensor Configuration ...... 26

5.1.2 Robot Kinematics ...... 28

5.1.3 Place Cell Configuration ...... 29

5.1.4 Environment Features ...... 30

5.2 Learning Scheme for Navigation ...... 31

5.3 Learning Scheme for Localization ...... 32

5.3.1 Network Design ...... 32

5.3.2 Learning Rule ...... 34

5.3.3 Learning Scheme ...... 35

6 Simulation Results 37

6.1 Simulation Results for Navigation ...... 37

6.1.1 Macro-Model (M1) ...... 37

6.1.2 Device-Physics Derived Model (M2) ...... 38

6.1.3 Comparison with Paths Based on Reinforcement Learning Algorithms . . . . 39

6.2 Device Variability Study ...... 42

6.2.1 Variability in Device Doping Concentration ...... 43

6.2.2 Variability in Update of Resistive States ...... 44

6.2.3 Device Malfunction ...... 44

6.3 Simulation Results for Localization ...... 46

6.3.1 Memristive Device Model ...... 46

6.3.2 Comparison with Computational SLAM ...... 46

7 Experimental Results 60

7.1 Khepera III Robot ...... 60

7.2 Robot Navigation ...... 60

7.2.1 Experimental Setup ...... 60

7.2.2 Results ...... 61

7.3 Localization Results ...... 63

7.3.1 Environment Setup ...... 63

7.3.2 Memristive Device based Model ...... 63

7.3.3 Comparison with Computational SLAM ...... 64

7.4 Discussions ...... 65

8 Discussions 71

8.1 Future Work ...... 72

List of Figures

1.1 A Pascaline ...... 2

1.2 Charles Babbage’s differential engine ...... 3

1.3 IBM’s synaptic computer ...... 4

2.1 An Artificial Neural Network (ANN) ...... 7

2.2 Bayesian network ...... 8

3.1 Synaptic plasticity ...... 17

4.1 Membrane action potential ...... 20

4.2 STDP based learning rule ...... 21

4.3 Device properties ...... 24

4.4 Memristive device based learning rule ...... 25

5.1 Sensor simulation ...... 28

5.2 Robot modeling ...... 30

5.3 Learning scheme for robot navigation ...... 33

5.4 Neuromorphic array for robot localization ...... 33

5.5 Memristive device based learning rule for robot localization ...... 34

5.6 The integrated learning scheme ...... 36

6.1 Robot navigation for model M1 ...... 38

6.2 Robot navigation for model M2 ...... 50

6.3 Robot navigation with Reinforcement learning - global knowledge ...... 51

6.4 Robot navigation with Reinforcement learning - local knowledge ...... 52

6.5 Device performance - uniformly distributed variability in doping concentration . . . 53

6.6 Probability distribution functions for device initial doping concentration ...... 53

6.7 Device performance - normally distributed variability in doping concentration . . . . 53

6.8 Device performance - variability in state update ...... 54

6.9 Device array layout ...... 54

6.10 Robot navigation layouts for shorted device ...... 55

6.11 Performance results for shorted device ...... 55

6.12 Robot navigation - shorted device ...... 56

6.13 Memristive device based robot navigation and localization ...... 56

6.14 Localization errors - device based localization ...... 57

6.15 3D view of localization error - device based localization ...... 57

6.16 Robot localization using computational SLAM ...... 58

6.17 Localization error - computational SLAM ...... 58

6.18 3D view of localization error - computational SLAM ...... 59

6.19 Device-based localization in actual experimental layout ...... 59

7.1 Experimental setup for navigation ...... 61

7.2 Time-lapse of robot navigation - layout 1 ...... 62

7.3 Time-lapse of robot navigation - layout 2 ...... 63

7.4 Robot navigation plots ...... 64

7.5 Environment setup for robot localization ...... 65

7.6 Snapshots of robot localization - run 1 ...... 66

7.7 Snapshots of robot localization - run 2 ...... 67

7.8 Robot localization errors - device-based localization ...... 67

7.9 3D view of localization error - device based localization ...... 68

7.10 Robot localization errors - computational SLAM ...... 68

7.11 3D view of localization error - computational SLAM ...... 69

List of Tables

4.1 F-P equation parameters...... 22

6.1 Path lengths for different environment layouts ...... 42

6.2 Device variability results ...... 45

6.3 Device-based localization - Simulation results ...... 48

6.4 Device-based localization in actual experimental layout - Simulation results . . . . . 49

6.5 Computational SLAM - Simulation results ...... 49

7.1 Device-based localization - Experimental results ...... 69

7.2 Computational SLAM - Experimental results ...... 70

List of Symbols

(x, y, ψ) Robot pose

δn Change in carrier concentration in the memristive device oxide layer

∆t Spike time difference between pre- and post-synaptic spikes

ψ˙ Robot turning rate

W˙ Change in structural parameter

µ Mobility of the dielectric

µi Location of the i-th place cell

νMR Voltage drop across the memristor

ω Pulse width

φ Target heading

φB Depth of trap from the conduction band of HfO2

Σi Covariance matrix for place cell i

τ Time constant of the membrane

θ Sensor orientation

ξ STDP rule

A Area of the memristive device

cij Conductance of the memristive device located at index (i, j) in the array

E Electric field

EL Resting potential of the LIF neuron

I Injected current to the LIF neuron

n0 Concentration of the doping in the memristive device oxide layer

nl Number of landmarks

P^k Place cell activity at location k = [x y]^T

Q Sensor quality

q Electron charge

Rm Membrane resistance for the LIF neuron

s_j^k Weighted sum through the device array when the robot is at location k

v Robot velocity

Vreset Resting potential of the LIF neuron after the spike

Vth Spike threshold for the LIF neuron

x_i^k Inputs to the device array when the robot is at location k

Chapter 1

Introduction

The human brain is a marvelous learning machine. We are exposed to a myriad of sensory data every second and can still process, organize, and make decisions based on the information. Scientists and researchers have always been interested in Artificial Intelligence (AI). The history of computing machines is quite old. In 1642, Blaise Pascal invented the first digital calculating machine, the mechanical calculator known as the 'Pascaline' (Figure 1.1). Three decades later, Leibniz invented the binary numeral system and envisioned a universal calculus of reasoning. During the mid 19th century,

Charles Babbage and Ada Lovelace worked on programmable mechanical calculating machines

(Figure 1.2).

A hundred years later, in 1941, Konrad Zuse built the first working program-controlled computers. By 1950, the field had evolved to the point that Alan Turing proposed the famous Turing

Test as a measure of machine intelligence. In the mid 1980s, neural networks became widely used with the backpropagation algorithm. The 1990s saw major advances in all areas of

AI, with significant demonstrations in machine learning, intelligent tutoring, case-based reasoning, multi-agent planning, etc. In 2005, a project to simulate the brain at the molecular level, known as "Blue Brain", was launched. More recently, we have witnessed self-driving cars, computer programs defeating professionals at their own games, smartphone apps acting as personal assistants, and so on. In

2014, the Nobel Prize in Physiology or Medicine was awarded to Dr. John O'Keefe for the discovery of place cells in the brain.

Looking at all these developments, it is clear that the human curiosity towards understanding and building an all-intelligent robotic machine is very strong. Researchers now have a better understanding of how biological species carry out the task of learning from their environment and make decisions. However, the challenge of actually implementing this on a robot is quite big. This is because biological species have such a huge and complex system of neurons and connections between them that it is not feasible to simulate that capability without running into computational complexities and real-time application inefficiencies. For instance, researchers have estimated that the human brain has anywhere between 0.6 × 10^14 and 2.4 × 10^14 synaptic connections. A hypothetical computer to simulate this in real time would require 12 GW of power, whereas a human brain consumes merely 20 W. Besides, there is the famous "memory wall" problem associated with conventional computers based on the von Neumann architecture. The von Neumann model, described by John von Neumann in 1945 in the First Draft of a Report on the EDVAC [1], describes a design architecture for an electronic digital computer made up of a processing unit containing an arithmetic logic unit and processor registers; a control unit containing an instruction register and program counter; a memory to store both data and instructions; external mass storage; and input and output mechanisms. The memory wall refers to the mismatch between computation speed and memory access speed caused by the limited communication bandwidth between the CPU and memory. Further, conventional Complementary Metal Oxide Semiconductor (CMOS) memory and logic devices are approaching their fundamental scalability limits.

Figure 1.1: A Pascaline, signed by Pascal in 1652.1

These factors have led researchers to start looking into synaptic memory devices.

Due to their resistance switching behavior on application of an appropriate voltage signal, they are capable of simultaneous computation and learning. Such neuromorphic computers can demonstrate

1 © 2005 David Monniaux, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=186079

high computation ability and can even achieve parity with the biological brain, albeit with some shortcomings. Recently, the IBM Blue Gene/Q supercomputer (Figure 1.3) was used to program 2.084 billion neurosynaptic cores containing 53 × 10^10 neurons and 1.37 × 10^14 synapses. However, it ran

1542× slower than real time while consuming 70 mW of power [2].

Figure 1.2: The Science Museum’s Difference Engine No. 2, built from Babbage’s design.2

The objective of this work is to explore the potential of such neuromorphic devices in imparting deep learning capabilities to a robot. The artificial intelligence thus created can be used to solve numerous complicated problems in the field of robotics. Here, we propose a neuromorphic architecture of synaptic memory devices applied on a robot to solve the common problems of Simultaneous Localization and Mapping (SLAM) and robot navigation.

The current state of the art in indoor SLAM uses different sensors for localizing the robot while mapping the environment. Some of the commonly used methods for localization include inertial sensors, flow sensors, vision sensors, laser sensors, infrared proximity sensors, and radio beacons. The mapping can be achieved by vision sensors and laser sensors. However, navigation algorithms based on these sensors are not very efficient. For instance, inertial sensors coupled with odometry suffer from dead-reckoning errors that grow with time. Flow sensors are largely dependent on ambient light and background features and hence do not work well when there are not enough features in the background to be detected. Laser sensors give out over ten thousand data points every second. Processing such a data stream needs a lot of computation and hence gives rise

2 Photo by User:geni, GFDL, https://commons.wikimedia.org/w/index.php?curid=4807331

3 http://www.research.ibm.com/articles/brain-chip.shtml

Figure 1.3: Synapse 16-chip board by IBM.3

to high energy requirements. This again boils down to the memory wall and scalability constraints discussed above for conventional computing platforms. Radio beacons suffer from interference and have errors that accumulate over time. To achieve reasonable accuracy with these sensors, some sensor fusion is needed, which increases the computational cost.

The applications of SLAM are vast, including agriculture, archaeology, autonomous vehicles, biology and conservation, geology, remote sensing, military, robotics, video-gaming, and so on. In this work, we attempted to bridge this technology gap through a non-von Neumann computing architecture that can eliminate these shortcomings. We also developed a localization mechanism for autonomous robots that validates the application of our approach to large scale synaptic networks in a scalable and energy efficient manner. The motivation comes from the discovery of 'place cells' in the hippocampus of the animal brain, which spike at their maximum rate when the animal occupies a particular location in its environment. Thus, place cells provide a cognitive map of the animal's surroundings. We identified the benefits of this inherent mapping system and exploited the resulting information to enable the robot to learn the associations between the surroundings and its current location.

Chapter 2

Literature Review

Learning is an elaborate phenomenon. It comprises acquisition of new descriptive knowledge, development of cognitive skills through practice, organization and representation of new knowledge, and discovery of new theories through experimentation. Researchers have always been interested in imparting learning capabilities to computers. However, solving this problem still remains a challenging task in artificial intelligence.

Learning in artificial systems has been widely studied over the years. The authors in [3] distinguish the major periods of machine learning research, each centered around a different paradigm, as follows:

• neural modeling and decision-theoretic techniques

• symbolic concept-oriented learning

• knowledge-intensive learning systems exploring various learning tasks

2.1 Types of Machine Learning

The concept of machine learning is divided into two main types - supervised learning and unsupervised learning.

2.1.1 Supervised Learning Techniques

In the predictive, or supervised, learning approach, the goal is to learn a mapping between inputs and outputs, given a 'training set' of input-output pairs. Supervised classification is one of the tasks most frequently carried out by intelligent systems. In the following paragraphs, we discuss the most important supervised machine learning techniques.

Logic Based Techniques

In the following, we discuss two types of logic based methods: decision trees and rule-based classifiers.

Murthy [4] has surveyed the work on decision trees and their benefits in the field of machine learning. Essentially, decision trees classify instances by sorting them based on feature values. The most well-known algorithm for building decision trees is C4.5 [5]. A comparison of decision trees and other learning algorithms is given in [6]. In [7], Gehrke proposes Rainforest, a framework for developing fast and scalable algorithms to construct decision trees while taking care of memory requirements. Though decision trees are inherently univariate, using splits based on a single feature at each internal node, [8] uses multivariate trees by constructing new binary features.

Gama [9] has also worked on multivariate decision trees. We have utilized decision trees for flight formation of unmanned aerial vehicles in a dynamic environment [10], [11].

An overview of rule-based methods is given in [12]. Such methods derive the rules directly from the training data using some sort of algorithm. More work on this can be found in [13] and [14].

Perceptron Based Techniques

Another class of supervised learning methods is based on the perceptron. A perceptron can be described as a sum of weighted inputs passed through an adjustable threshold: if the sum is above the threshold, the output is 1; otherwise it is 0. The learning algorithm is run on the training set repeatedly until the prediction mapping is correct on all of the training set. This mapping is then used to label the test set.
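To make this update rule concrete, the following minimal Python sketch implements the perceptron training loop described above; the toy data and learning rate are chosen purely for illustration and are not taken from any of the cited works.

import numpy as np

def train_perceptron(X, y, epochs=100, lr=1.0):
    """Classic perceptron rule: adjust the weights whenever a sample is misclassified."""
    w = np.zeros(X.shape[1])   # weights
    b = 0.0                    # adjustable threshold (bias)
    for _ in range(epochs):
        errors = 0
        for xi, target in zip(X, y):
            out = 1 if (np.dot(w, xi) + b) > 0 else 0   # thresholded weighted sum
            if out != target:                           # update only on mistakes
                w += lr * (target - out) * xi
                b += lr * (target - out)
                errors += 1
        if errors == 0:        # converged: the training set is perfectly classified
            break
    return w, b

# Toy linearly separable example: logical AND
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w, b = train_perceptron(X, y)
print(w, b)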

Perceptrons are capable of classifying only linearly separable sets of data. If the instances are not linearly separable, then the perceptron will fail to converge. To mitigate this problem,

Figure 2.1: A feed forward ANN.

Artificial Neural Networks (ANNs) have been created. Zhang [15] has reviewed neural networks for classification. A multilayer neural network consists of an input and an output layer, with hidden layers in between, as shown in Figure 2.1. The network is trained on a paired data set and the weights of the connections are computed to determine the mapping for new data. Determining the size of the hidden layer is a problem, since an underestimate leads to under-approximation while an overestimate leads to excessive computation and hence difficulty in finding the global optimum. Researchers in [16] and [17] have studied this aspect. Once the structure of the network is determined, the weights between the layers need to be trained. The training can be done through different algorithms; the most common method is the backpropagation algorithm. However, this is slow, and there have been efforts to speed up the training process: for example, in [18] the authors estimate the initial weights instead of setting them randomly. Genetic Algorithm

(GA) can be used to train the network [19] and to find the architecture of the neural networks [20].

Based on [21] and [6], it is concluded that neural networks are usually better able to provide incremental learning than decision trees. However, their training time is much longer, and they perform as well as decision trees but not generally better [22].

Statistical Learning Algorithms

Contrary to ANNs, statistical approaches have an explicit underlying probability model. Such a model provides the probability that an instance belongs to a certain class, rather than just a classification.

Under this classification, Bayesian networks and instance-based learning algorithms are considered.

A Bayesian network is a graphical model of probability relationships among a set of variables, as shown in Figure 2.2. The task of learning a Bayesian network comprises learning the DAG (Directed

Acyclic Graph) structure of the network and the determination of the parameters. Once the network structure is fixed, learning the parameters in the Conditional Probability Tables (CPT) is usually done by estimating a locally exponential number of parameters from the data provided [23]. Works in this area have been documented in [24]–[29].

Figure 2.2: An example of a Bayesian network.1

Instance-based learning algorithms wait for the generalization process until classification is performed. Hence, they take less computation time during training phase as compared to decision trees, ANNs, Bayesian networks, etc., but more computation time during the classification process.

The most straightforward algorithm is nearest neighbor algorithm. k-Nearest Neighbor (kNN) is based on the principle that the instances with similar properties tend to exist together.

The major disadvantage of instance-based learning is the high computation time for classification.

1 Charniak, Eugene. "Bayesian networks without tears." AI Magazine 12, no. 4 (1991): 50.

Support Vector Machines

Support Vector Machines (SVMs) work on the idea of maximizing the 'margin' on either side of a hyperplane separating the data groups, thus creating the largest possible distance between the separated classes. Therefore, the model complexity of an SVM is independent of the number of features encountered in the training data. In fact, SVMs are well suited for learning tasks with a large number of features in the data set. Training an SVM can be done quickly through the Sequential Minimal Optimization

(SMO) algorithm [30]. Researchers have also formulated faster versions of SMO [31]. An advantage of SVM training is that it necessarily reaches the global minimum and avoids local minima, unlike many other search algorithms. However, basic SVM methods are binary, and hence the actual problem needs to be reduced to a binary classification problem.

2.1.2 Unsupervised Learning Techniques

In the descriptive, or unsupervised, learning approach, the goal is to find patterns in the given data without any prior knowledge of the kind of patterns to look for. There is no obvious 'training set' here, unlike in the supervised learning techniques. Here, we discuss unsupervised learning in neural networks. In neural computation, there are two major categories of unsupervised learning methods: the first class is motivated by standard statistical methods like principal component analysis

(PCA) and factor analysis (FA). These methods form a continuous transformation of the original input to feature vectors and are especially useful in feature extraction. The second class is more clustering related, i.e., learning vector coding based on competitive learning. Such methods are able to map highly non-linear input data onto low-dimensional neural lattices and are suitable for data clustering and visualization.

Many of the learning rules for PCA are based on the neuron model developed in [32]. In PCA, the purpose is to find a subset of variables that would give as good a representation as possible, for a given set of multivariate measurements, with less redundancy. However, it is not always feasible to solve the PCA eigenvectors by standard numerical methods. To solve this problem, PCA neural network learning rules are devised as discussed in [32]. Another modification called generalized

Hebbian algorithm (GHA) was also presented by Sanger [33]. Other related algorithms have been presented in [34]–[37]. A PCA computation in neural networks using back-propagation algorithm

for learning has been presented in [38]. More recent works on PCA include [39]–[41].
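As a concrete instance of this family of rules, the following minimal Python sketch implements Oja's single-neuron Hebbian PCA rule, whose weight vector converges to the first principal component; the exact formulations in [32] and [33] differ in detail, so this should be read only as an illustrative approximation.

import numpy as np

def oja_first_component(X, lr=0.002, epochs=50):
    """Oja's rule: a Hebbian update with a normalizing decay term."""
    rng = np.random.default_rng(0)
    w = rng.normal(size=X.shape[1])
    w /= np.linalg.norm(w)
    Xc = X - X.mean(axis=0)            # zero-mean the data
    for _ in range(epochs):
        for x in Xc:
            y = w @ x                  # neuron output
            w += lr * y * (x - y * w)  # Hebbian term minus decay keeps ||w|| bounded
    return w / np.linalg.norm(w)

# Example: correlated 2-D data; the result approximates the leading eigenvector
rng = np.random.default_rng(1)
data = rng.normal(size=(500, 2)) @ np.array([[2.0, 0.0], [1.2, 0.5]])
print(oja_first_component(data))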

The FA has been used to find relevant and meaningful factors that explain observed results [42]–

[44]. The Independent Component Analysis (ICA) is a more recent model that maximally reduces the redundancy between the hidden variables. It has been widely studied, and the more prominent works are [45]–[48]. Recent works on FA and ICA can be found in [49]–[52].

Traditional vector coding algorithms include the Linde-Buzo-Gray (LBG) algorithm, which is similar to the k-means clustering algorithm [53], the adaptive resonance theory (ART) network [54], etc. However, the best known neural network vector coding algorithm is the self-organizing map (SOM) introduced by Kohonen [55]. In a SOM, in addition to the traditional clustering approaches, the neurons are arranged in a multi-dimensional lattice, such that each neuron has a set of neighbors.

The goal of SOM is to form a topological mapping from the input to the set of neurons while simultaneously performing vector coding. The mathematical and convergence properties have been studied by several researchers, such as [55]–[58]. More recent works on SOM and their applications to various fields can be found in [59]–[61].

2.2 Learning in Biological Systems

Biological species gather sensory data from the environment through basic sensors, intuitively process that data in real time, and make appropriate decisions. This process has inspired and helped researchers develop various models for understanding and implementing such a capability on robots. Artificial Neural Networks (ANNs) have been evolving towards more realistic functional models of biological systems. Initially, for first generation neurons, a step-function threshold was used [62]. With further advancements, second generation neurons use continuous activation functions as the threshold for computing the output [63]. The third generation neurons utilize spiking models of neurons, thereby improving accuracy as well as computational power

[64]. Different models of spiking neurons include the integrate-and-fire model [65], the resonate-and-

fire model [66], the Hodgkin-Huxley model [67], and the Izhikevich model [68]. Izhikevich showed that his model is computationally effective [69] and fairly simple, yet this is the most widely used model for spiking neurons as evident from [70]–[74]. Further applications of spiking neuron models for navigation in artificial systems include optical flow navigation inspired from honeybees [75]. Low

et al. [76] have developed a camera based navigation approach that allows the robot to converge to a target using its location in the camera frame only. A review of bio-inspired navigation control schemes is given by Trullier et al. [77] and by Franz and Mallot [78]. In [79] and [80], Floreano et al. have applied spiking neural circuits to control navigation in a small robot. They used a spike response model to build a neural network to control the robot. In [81] and [82], the authors have used a distributed adaptive control model based on artificial neural networks. In [83], the authors have developed an obstacle avoidance and navigation control algorithm based on Classical Conditioning [84] by taking into account the Spike Timing Dependent Plasticity (STDP) rule.

2.3 Biological Navigation and Localization

Rodents are the most studied animals in the area of biological navigation and mapping. Early works established that rats can continue to navigate even in the absence of a continuous feature-rich stream of information, i.e., in pitch darkness [85], [86]. In the 70's, O'Keefe et al. identified place cells in the rodent hippocampus that responded to the spatial location of the animal [87], [88].

Later, another set of neurons was identified, called head direction cells, that responded to the animal's orientation [89]. Recently, evidence of grid cells has emerged from [90], [91] - cells that

fire at regular grid-like intervals in the environment.

Head direction cells have the peak firing rate when the rodent’s head is facing a specific direction, thus mapping the rodent orientation. This direction is referred to as the cell’s preferred direction or orientation. As the rodent turns away from the preferred direction, the cell’s firing rate decreases.

Sargolini [92] also found that for a majority of head direction cells, the cell firing rate is also affected by the animal's translational and turning velocity. Mizumori [85] also demonstrated that rats continue to navigate in the dark and that the directional information is retained in the absence of visual cues for short periods of time, and is updated by ideothetic input. Experiments have also shown that allothetic cues influence the firing of head direction cells, and rotating a visual cue in the environment can cause about the same shift in the preferred orientation of the head direction cells

[89], [93].

Place cells are analogous to head direction cells in two dimensions. These cells map the animal’s location within an environment. Their firing rate is maximum when the animal occupies a certain

location within an environment and reduces as the animal moves away from that location. This preferred location for a place cell, where it fires maximally, is known as the place field of that cell. It has been established that the firing of place cells is also affected by a number of other factors, such as visual cues, barriers, and food reward locations [94]–[96]. Save et al. [97] experimented with rats blinded from an early age and found that place cell activity is also present in the absence of visual cues; however, this results in generally lower firing rates. Like head direction cells, place fields also rotate by a corresponding amount when the distal cues are rotated [88].

Most of the rodent hippocampal models consist of head direction and place cells. Such models often employ some form of attractor network [98]–[100]; however, there are other models available in the literature too, such as an associative mapping model for updating head direction based on angular velocity [101], and a circular shift register [102].

The path integration is achieved by injecting activity into the attractor network, thus shifting the peak towards the injected activity. An implementation has been successfully tested by Arleo [103].

Since purely ideothetic updates to head direction are vulnerable to cumulative error, some sort of allothetic correction is needed for robust path integration. Skaggs associated visual cues with head direction cells corresponding to the cue's direction in an egocentric rather than allocentric reference frame [102].

The place cell models form maps by using the distances to visual cues to shape the firing activity of simulated place cells [104]. However, a main limitation of such a model is that it depends on visual cues and cannot perform path integration in the absence of visual input. Researchers have also developed 'view cells', which are essentially directional place cells, to prevent the place cell activity from becoming direction dependent in narrow environments [105].

2.4 Learning in Robotics

2.4.1 Navigation

The navigation problem consists of three parts, viz., localization, obstacle avoidance, and path planning. Localization can be achieved using a Global Positioning System (GPS) outdoors, but for indoor environments or cluttered areas without very reliable GPS coverage, like streets with tall buildings, dense forests, etc., some sort of alternate localization method is needed. Hoy et al. [106] have reviewed a range of techniques for navigation of unmanned vehicles through unknown environments with obstacles. As mentioned in their work, there are two major approaches for optimal path planning, viz., global path planning methods and sensor-based methods. The difference comes from the amount of information available about the environment. As is obvious, global path planning is used when complete knowledge of the obstacles is available, while the sensor-based methods are suitable when only local information about the obstacles is available. Sensor-based methods are more useful since in the majority of real world scenarios full knowledge of the environment is not available. Various sensor-based methods include obstacle avoidance via boundary following, artificial potential field methods, reinforcement learning methods, and so on. The wall following method has been applied to border patrolling and structure inspection [107], [108], autonomous underwater vehicles [109], lane following by autonomous road vehicles [110], and surveying an indoor area [111]. Potential field functions and reinforcement learning methods are used in [112], while [113] employs a reinforcement learning method with Cerebellar Model Articulation Controllers

(CMACs). We have reviewed a variety of path planning algorithms for unmanned vehicles in [114].

2.4.2 Simultaneous Localization and Mapping (SLAM)

An effective way of localization is Simultaneous Localization and Mapping (SLAM). The SLAM problem deals with a robot trying to incrementally build a map of an unknown environment while simultaneously determining its location within that map. During the early days of probabilistic SLAM, researchers were looking to apply estimation-theoretic methods to mapping and localization. However, it was recognized that consistent probabilistic mapping was a fundamental problem in robotics with major conceptual and computational issues.

In late 80’s, Smith and Cheesman [115] and Durrant-Whyte [116] showed that there must be a high degree of correlation between estimates of locations of different landmarks in a map, and that it increases with successive observations. However, these early works did not focus on the conver- gence properties of the map or its steady-state behavior. It was assumed that the estimated map errors would not converge, and hence, researchers decoupled the full filter to a series of landmark to vehicle filters. The structure of the SLAM problem along with the result of map convergence, and the acronym ‘SLAM’ came with the 1995 survey paper in International Symposium on Robotics

Research [117]. Soon after, many researchers started working on SLAM in indoor, outdoor, as well as underwater environments [118]–[121]. Traditionally, researchers have used an Extended Kalman

Filter (EKF) for incrementally estimating the posterior distribution over the robot's pose along with the landmarks' positions [122]. This was extended to environments with a large number of landmarks in [123], [124]. We also developed an indoor localization and navigation algorithm for Unmanned

Aerial Vehicles in [125].

2.5 Problem Identification

Although a lot of work has been done in the area of artificial learning in robotics, the problem of robot localization and navigation in unknown environments still remains challenging. The problem is not trivial due to the lack of ability to learn the really complex structure of the environment.

Biological brains, on the other hand, are able to do this because of their ability to carry out deep learning and store the information in large neuronal networks.

In this work, we have developed a novel learning scheme that is inspired by the network of neurons connected by synapses in the human brain. In traditional neural network schemes, the synaptic dynamics are represented by complex mathematical functions. However, in our work, we present a scheme that can be conveniently implemented on resistive synaptic memory (or memristive) devices. We believe that implementation of such learning schemes on emerging neuromorphic hardware will provide a scalable and energy efficient route for robotic controls, especially on miniaturized robots where high performance learning in an energy efficient fashion is desired [126].

The memristive devices are resistive elements, arranged here in a crossbar array, that change their resistance when a voltage is applied. We have modeled these devices in MATLAB and integrated them with the learning scheme to develop an artificial brain mechanism for robots.

We have demonstrated the validity of our learning scheme by navigating a two wheeled differential drive robot while localizing it in an environment with randomly placed obstacles. The robot is able to associate the information coming from the on-board sensors, such as the direction of the target and the proximity of obstacles, to navigate efficiently while avoiding the obstacles. The landmarks around the robot serve as the 'memory recall' for the current location in the environment.

We also compared our learning scheme to a very well known and efficient path planning technique, viz., the reinforcement learning based approach. We found that for simple layouts of obstacles,

our navigation approach approximates an optimal path even with just local knowledge of the environment. Further, this navigation approach does not suffer from the problem of the robot getting stuck in a region of local minima, unlike, e.g., potential field based algorithms. For localization, we compared our approach with a computational SLAM technique that uses a particle filter for landmark matching and data association. It was found that our approach can provide impressive localization performance, comparable to the highly computational algorithms. Hence, this work establishes the potential of synaptic learning through weight modification for solving numerous problems in robotics, particularly in the area of navigation and localization.

Chapter 3

Spike Timing Dependent Plasticity

Spike Timing Dependent Plasticity (STDP) is a family of learning mechanisms originally postulated in the context of artificial machine learning algorithms (or computational neuroscience), exploiting spike-based computations (as in brains) with great emphasis on the relative timing of spikes [127].

According to STDP, synaptic weights between two neurons are increased if presynaptic spike occurs before the postsynaptic spike (Long Term Potentiation (LTP)), thus establishing a causal correlation between spiking times of the two neurons. On the other hand, if the postsynaptic spike occurs before the presynaptic spike, the synaptic weight between those neurons is decreased (Long

Term Depression (LTD)). Recently, STDP has been introduced to explain biological plasticity

[128] and [129]. Synaptic weight modification using STDP rule has been demonstrated in temporal pattern recognition [130], temporal sequence learning [131], [132], [133], and navigation [134], [135],

[136]. Timothée et al. have done extensive work on STDP towards recognition of repeating patterns

[137] and multiple repeating patterns with multiple STDP neurons [73] in continuous spike trains.

They demonstrated that a single leaky integrate-and-fire (LIF) neuron equipped with STDP is able to detect a repeating pattern in a continuous spike train. In other words, the neuron becomes selective to successive coincidences of the pattern. In [73], it was observed that the neurons tend to compete for the patterns and try to cover the different patterns or part of pattern. These results illustrate how STDP can detect repeating patterns and generate selective responses to them.

Some researchers have looked into the aspect of reward modulated STDP for a simple differential drive robot [138], where they applied a functional approach to model motor-sensor interactions and focused on the learning in the neural controller of the robot. They have taught the robot different behaviors like obstacle avoidance, foraging, or a combination of the two. Other approaches are to use genetic algorithms [139], [140], or different STDP models [141] to learn synaptic connections to control motor movements. Florian [142] has implemented reinforcement learning through modulation of STDP. In [143], Evans also worked on a reinforcement learning task of foraging and poison avoidance using a robot controlled with a spiking neural network.

Figure 3.1: A visualization of the LTP and LTD processes.1

The neurons interchange information through their inter-connections, known as synapses. The neuron spike is the membrane voltage difference between the inside and outside of the cellular membrane. The memristor voltage is then the difference between the presynaptic and postsynaptic neuron spikes. The large membrane voltages during a spike (with magnitude around 70 mV) cause molecular membrane channels to open and close, allowing ionic and molecular substances to flow through or restricting them, thus creating a difference in their concentrations. At the same time, neurotransmitters from the presynaptic cell fuse into the membrane and are released into the synaptic cleft.

1 Boundless. "Synaptic Plasticity". Boundless Biology. Boundless, 08 Aug. 2016. Retrieved 27 Oct. 2016 from https://www.boundless.com/biology/textbooks/boundless-biology-textbook/the-nervous-system-35/how-neurons-communicate-200/synaptic-plasticity-765-11998/

Some of these neurotransmitters are collected by the postsynaptic neuron, thereby changing its membrane conductivity. This process is visualized in Figure 3.1. The cumulative effect of presynaptic spikes will eventually trigger the generation of a postsynaptic spike. These spikes, when falling within a time interval of each other, modify the synaptic weight between those two neurons.

The synaptic weight could be considered as a measure of number/size of neurotransmitter packets released during a spike. This phenomenon was reported in 1949 by Hebb [144] as follows:

“When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that

A’s efficiency, as one of the cells firing B, is increased.”

STDP is a refinement of the above postulate by taking into account the precise timing of pre- and postsynaptic spikes. The change in synaptic weight is expressed as a function of the time difference, ∆T , between pre- and postsynaptic spikes.

Mathematically, this STDP learning function ξ(∆T) is described as [127]:

ξ(∆T) = a+ e^(−∆T/τ+)   if ∆T > 0
ξ(∆T) = −a− e^(∆T/τ−)   if ∆T < 0     (3.1)

where a+ and τ+ are respectively the LTP learning rate and time constant, and a− and τ− are the same parameters for LTD.
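A direct transcription of Eqn. (3.1) into a small Python function is shown below; the learning rates and time constants are placeholder values for illustration, not parameters taken from [127].

import numpy as np

def stdp(delta_t, a_plus=0.1, a_minus=0.12, tau_plus=20e-3, tau_minus=20e-3):
    """STDP weight change xi(dT) from Eqn. (3.1).
    delta_t = t_post - t_pre in seconds; positive -> LTP, negative -> LTD."""
    delta_t = np.asarray(delta_t, dtype=float)
    ltp = a_plus * np.exp(-delta_t / tau_plus)     # causal pairings potentiate
    ltd = -a_minus * np.exp(delta_t / tau_minus)   # anti-causal pairings depress
    return np.where(delta_t > 0, ltp, np.where(delta_t < 0, ltd, 0.0))

print(stdp([10e-3, -10e-3, 40e-3, -40e-3]))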

Chapter 4

Device Models

In this chapter, we discuss the model of the Mn:HfO2 based memristive device derived from the principles of device physics [126]. Such devices, as discussed, are capable of changing their structural parameter, usually the resistance, upon application of a voltage signal. The spikes coming from the neurons are modeled mathematically, and the membrane action potential can be computed from the superposition of those spikes with the time difference. When the membrane voltage becomes higher than the threshold potential, a spike is generated and the membrane potential falls to a reset potential. We have modeled the spikes and membrane potentials for different cases of time difference, and the change in the structural parameter using STDP is also modeled.

Here, we present two models used in this study for the device: (1) Macro-model of the device, which is a behavioral model made of circuit elements, based on the macro-model developed in

[127], and (2) Device-physics derived model, which captures the low-level features of the memristive devices, viz. the conduction mechanism in the device. This is based on the device model developed in [126].

4.1 Macro-Model of the Memristor Device

For memristor devices, the change in structural parameter is driven by voltage drop across it, νMR, as described by Snider [145], [146]:

W˙ = f(νMR) (4.1)

Figure 4.1: Membrane voltage action potential for positive ∆T

The parameters I0 and v0 are device parameters that may or may not depend on W. The function f(·) can be modeled to grow exponentially and/or include a threshold barrier vth [147] as:

For |νMR| > vth:

f(νMR) = I0 sign(νMR) [e^(|νMR|/v0) − e^(vth/v0)]     (4.2)

and f(νMR) = 0 otherwise.

This memristor voltage νMR is plotted in Figure 4.1 for ∆t = 0.75ms. The values used are:

I0 = 10 µA and v0 = 1/7 V. As can be observed, the value of νMR is more than the threshold voltage vth, which is unity. Hence, we can find the change in structural parameter W from Eqn. (4.1) as:

∆W(∆T) = ∫ f(νMR(t, ∆t)) dt = ξ(∆T)     (4.3)

The function ∆W (∆t) imitates the behavior of the STDP rule ξ as described by Bi and Poo from physiological experiments [148]. This is plotted in Figure 4.2.
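The following minimal Python sketch illustrates Eqns. (4.1)-(4.3); the memristor voltage trace used here is an assumed decaying pulse rather than the exact spike waveforms of this work, so the numbers are purely illustrative.

import numpy as np

I0, V0, VTH = 10e-6, 1.0 / 7.0, 1.0   # device parameters from the text

def f(v_mr):
    """Structural-parameter drift rate dW/dt = f(v_MR), Eqn. (4.2)."""
    if abs(v_mr) <= VTH:
        return 0.0
    return I0 * np.sign(v_mr) * (np.exp(abs(v_mr) / V0) - np.exp(VTH / V0))

def delta_w(v_trace, dt):
    """Eqn. (4.3): integrate f over the memristor voltage trace v_MR(t)."""
    return sum(f(v) * dt for v in v_trace)

# Illustration: a 1 ms window in which v_MR briefly exceeds the threshold
dt = 1e-5
t = np.arange(0.0, 1e-3, dt)
v_trace = 1.2 * np.exp(-t / 2e-4)      # assumed decaying pulse, peak 1.2 V
print(delta_w(v_trace, dt))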

Figure 4.2: STDP update function

4.2 Leaky Integrate and Fire Neuron Model

A Leaky Integrate and Fire (LIF) neuron model sums the synaptic currents arriving at it. The rise in potential is given as:

V = (EL + I Rm) + (V − (EL + I Rm)) e^(−1/τ)     (4.4)

As soon as V > Vth, V changes to Vreset and a spike is generated. The parameters described above are:

EL : Resting potential

Vth : Spike threshold

Rm : Membrane resistance

τ : Time constant of the membrane

I : Injected current

Vreset : Reset voltage after spike
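A minimal discrete-time Python sketch of this LIF update is given below; the membrane parameters are placeholder values, and the time step is made explicit, whereas Eqn. (4.4) corresponds to a unit step.

import math

def lif_step(V, I, dt=1e-3, EL=-65e-3, Vth=-50e-3, Vreset=-70e-3, Rm=10e6, tau=20e-3):
    """One update of Eqn. (4.4): V relaxes toward EL + I*Rm; crossing Vth emits a spike."""
    V_inf = EL + I * Rm                              # steady-state potential for this current
    V = V_inf + (V - V_inf) * math.exp(-dt / tau)    # exponential relaxation
    spiked = V > Vth
    if spiked:
        V = Vreset                                   # reset after the spike
    return V, spiked

# Drive the neuron with a constant 2 nA input and count spikes over 100 steps
V, n_spikes = -65e-3, 0
for _ in range(100):
    V, s = lif_step(V, I=2e-9)
    n_spikes += int(s)
print(n_spikes)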

Table 4.1: F-P equation parameters.

Parameter   Value                 Units
q           1.609 × 10^−19        C
µ           0.15                  cm²/V·s
E           1 × 10^6              V/cm
A           4 × 10^−12            cm²
n0          9 × 10^20             cm^−3
φB          0.207                 eV
ε           25 × 8.85 × 10^−12    F/m
k           8.617 × 10^−5         eV/K
T           300                   K

4.3 Device-Physics Derived Model

In this section, we present the device-physics derived model of a two-terminal resistive memory device as presented in [126]. This model focuses on the conduction mechanism in two-terminal memristive devices, which the authors found to be based on Frenkel-Poole (F-P) emission [149], given the excellent R² values obtained for the F-P fitting. This is a closer approximation and can more precisely predict the behavior of a memristive device changing its resistance on application of a voltage signal. A two-terminal memory device is considered for simulating the synaptic array junctions. These devices are made up of a TiN top electrode (TE) and a Ru bottom electrode (BE) with Mn-doped HfO2 as the switching layer. The change in carrier concentration according to the input current can be predicted and hence can be used to design a device with the required specifications. The equation for F-P emission can be given as:

I = q µ E A n0 exp(−(φB − √(qE/(πε))) / (kT))     (4.5)

Here, µ is the mobility of the dielectric, E is the electric field, A is the area of the device, n0 is the defect concentration, and φB is the depth of the trap from the conduction band of HfO2, which is corrected for the electric field in the exponential. The F-P equation parameters are tabulated in

Table 4.1.
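As a consistency check, the following Python sketch evaluates the F-P current of Eqn. (4.5) using the parameters of Table 4.1; the unit handling mirrors the table, and this is an illustrative calculation rather than the simulation code used in this work.

import math

# Table 4.1 parameters (units as listed there)
q    = 1.609e-19        # C
mu   = 0.15             # cm^2 / (V s)
E    = 1e6              # V / cm
A    = 4e-12            # cm^2
n0   = 9e20             # cm^-3
phiB = 0.207            # eV
eps  = 25 * 8.85e-12    # F / m
k    = 8.617e-5         # eV / K
T    = 300.0            # K

def fp_current(n):
    """Frenkel-Poole current, Eqn. (4.5), for a carrier concentration n (cm^-3)."""
    E_si = E * 100.0                                           # V/cm -> V/m for the barrier lowering
    barrier_lowering = math.sqrt(q * E_si / (math.pi * eps))   # field-induced lowering, in eV per electron
    exponent = -(phiB - barrier_lowering) / (k * T)
    return q * mu * E * A * n * math.exp(exponent)             # prefactor in cm-based units gives amperes

print(fp_current(n0))   # on the order of 1e-5 A with the tabulated values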

To demonstrate the possibility of implementing STDP using the proposed device model, a 1 V pulse is applied while the pulse width (ω) is modulated based on the time difference, ∆t, between pre-synaptic and post-synaptic firing spikes. The ∆t is mapped to the pulse width as follows. Since the highest change in conductance occurs when the time difference is small, ∆t = ±10 ms corresponds to ω = 200 ms, ∆t = ±20 ms corresponds to ω = 100 ms, ∆t = ±30 ms corresponds to ω = 50 ms, and ∆t = ±40 ms corresponds to ω = 20 ms. This relation is represented by the following equation:

ω = L1 exp(K1|∆t|) (4.6)

where L1 and K1 are fitting parameters, computed to be L1 = 368.3 and K1 = −0.05199, respectively. This relation is also plotted in Figure 4.3a. Upon the application of a potentiating voltage, the concentration of defects in the doped oxide increases due to constant voltage stress and stress-induced leakage current. The change in carrier concentration, ∆n, on application of a voltage pulse can be modeled as follows:

∆n = L2 exp(K2ω) (4.7)

where L2 and K2 are fitting parameters, computed to be L2 = 8.111 × 10^19 and K2 = 0.006501, respectively. This relation is based on the one found in [126]. The variation in carrier concentration with pulse width is shown in Figure 4.3b. This variation in n0 can be included in the original F-P equation as follows:

I = q µ E A [n0 ± ∆n] exp(−(φB − √(qE/(πε))) / (kT))     (4.8)

where +∆n refers to an increase in carrier concentration, i.e., a potentiating pulse, and −∆n refers to a decrease in carrier concentration, i.e., a depressing pulse. Interestingly, defects in the doped oxide can be annihilated by reversing the polarity of the applied bias (i.e., by applying a depressing pulse), which provides a decrease in ∆n.

Finally, the variation of current with the increase in carrier concentration can be seen in Figure 4.3c. The variation in device resistance as the carrier concentration changes can be observed from Figure 4.3d.
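Putting Eqns. (4.6)-(4.8) together, the following Python sketch computes the percentage change in conductance produced by a single pre/post spike pairing. Since the F-P current of Eqn. (4.8) is proportional to the carrier concentration, the relative conductance change equals the relative change in n0; the script is illustrative only.

import math

L1, K1 = 368.3, -0.05199      # Eqn. (4.6) fitting parameters (dt and w in ms)
L2, K2 = 8.111e19, 0.006501   # Eqn. (4.7) fitting parameters
n0 = 9e20                     # initial doping concentration, cm^-3 (Table 4.1)

def conductance_change_percent(delta_t_ms):
    """Percentage change in device conductance for one pre/post spike pairing."""
    w = L1 * math.exp(K1 * abs(delta_t_ms))    # Eqn. (4.6): spike-time difference -> pulse width
    dn = L2 * math.exp(K2 * w)                 # Eqn. (4.7): pulse width -> concentration change
    sign = 1.0 if delta_t_ms > 0 else -1.0     # potentiating vs. depressing pulse, Eqn. (4.8)
    return sign * 100.0 * dn / n0

for dt in (10, 20, 30, 40, -10, -20, -30, -40):
    print(dt, round(conductance_change_percent(dt), 1))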

The resistance of the device can be calculated by applying a 1 V pulse to read the current through the device. Hence, the new value of device resistance is given by dividing the voltage pulse by the current value. Figure 4.4 shows the variation in device conductance with the spike time difference. We can observe that this resembles the STDP update function from the macro-model (Figure 4.2) and also the experimental results from Bi and Poo [148]. It should be noted that the depressing pulse, i.e., (n0 − ∆n), is represented here as a negative spike time difference and that the percentage change in device conductance is with respect to n0.

Figure 4.3: Plots showing the properties of the device-physics derived model of the memory device. (a) Mapping pulse width to spike time difference. (b) Variation in carrier concentration with applied pulse width. (c) Variation in device current with the increase in carrier concentration. (d) Variation in device resistance with carrier concentration.

Figure 4.4: Change in device conductance with spike time difference

Chapter 5

Learning Schemes

5.1 System Description

In this section, we discuss the learning scheme developed to complement the memory device model that imparts the artificial intelligence capability to a robot.

5.1.1 Sensor Configuration

A two wheeled differential drive robot is navigated towards a known target location while avoiding obstacles. To achieve this in simulation, two types of sensors are modeled: target sensors and obstacle sensors. These are represented as 1 through 5 in Figure 5.2. The target sensors return the direction of the target in the form of a sensor reading and a sensor quality. The sensor quality, Q, is defined as the ability of the sensor to provide correct information; in other words, it is quantitatively the sensor reliability. Hence, from Figure 5.2, if φ is the angular distance of the target (shown by the red star) from the sensors, and θi (i = 1, 2, ..., 5) are the locations of sensors 1-5, where

θ = [π, 3π/4, π/2, π/4, 0]     (5.1)

then the sensor quality can be written as

Qi = 1 − δi/max(δ) (5.2)

where δi = |φ − θi|. The above parameter, Q, when multiplied by the sensor reading, which is the distance of the target from the robot, gives an estimate of the target direction. As an example, consider the position of the target, shown by the red star, in Figure 5.2. Since the target is (radially) closest to sensor 4, then sensor 5, and so on, Q4 > Q5 > Q3 > Q2 > Q1. The sensor current that goes into the synaptic array is the product of the sensor reading, as defined earlier, and Q. These currents are added in the motor neuron circuit, which generates a spike as described by the LIF model earlier.

These spikes then actuate the motors, hence moving the robot. If the resulting motion moves the robot heading towards the target, this action is associated with the active sensor by strengthening the corresponding connection. If the robot heading moved away from the target, the connections are weakened.
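A small Python sketch of the target-sensor quality computation in Eqns. (5.1)-(5.2) is shown below; the mapping from readings to injected currents is schematic, and the example bearing and distance are arbitrary.

import numpy as np

theta = np.array([np.pi, 3 * np.pi / 4, np.pi / 2, np.pi / 4, 0.0])   # sensor orientations, Eqn. (5.1)

def sensor_quality(phi):
    """Eqn. (5.2): quality of each target sensor for a target at bearing phi."""
    delta = np.abs(phi - theta)          # angular distance of the target from each sensor
    return 1.0 - delta / delta.max()     # 1 for the best-aligned sensor, 0 for the worst

def sensor_currents(phi, target_distance):
    """Current injected into the synaptic array: sensor reading scaled by its quality."""
    return sensor_quality(phi) * target_distance

# Target slightly to the right of straight ahead (between sensors 4 and 5)
print(sensor_currents(phi=np.pi / 8, target_distance=2.0))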

The obstacle information comes from the front three obstacle sensors (2, 3, and 4). For simulation purposes, the area in front of the robot is divided into three grids and these sensors check whether each grid is occupied. If a grid is occupied, the corresponding sensor starts sending electric current into the synaptic array, which changes the motor speeds based on the synaptic weight associated with that particular sensor. Once the robot moves, these grids are examined for obstacles again. If the robot moved away from the obstacle, this motion is associated with the obstacle sensors by strengthening the connection between the sensor and the motor neurons. If the grids are still occupied, this could be an unfavorable motion and is penalized by weakening the synaptic connection.

For experimental work, the sensors give out a current inversely proportional to the distance from the obstacle in range. If the robot is too close to an obstacle, the current value is very high as compared to when the robot is far away from the obstacle.

The sensor currents are related to the wheel linear velocities as follows:

vL = (TL − OR)vt + vc (5.3)

vR = (TR − OL)vt + vc (5.4)

Let i = L, R represent left (L) and right (R) wheels respectively, then vi are wheel linear velocities, Ti are spikes generated from the LIF neuron by the target sensor currents, and Oi are spikes generated from the LIF neuron by the obstacle sensor currents. Further, vt and vc are velocity components that affect the turning rate and constant linear velocity of the robot, respectively. The

Khepera III robot that we used for the experimental results (Section 7.1) takes these v_L and v_R as inputs.
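The sketch below maps the LIF spike outputs to the wheel velocities of Eqs. (5.3)-(5.4); the spike values in the example and the velocity components v_t and v_c are placeholder assumptions, not the constants used on the actual robot.

```python
def wheel_velocities(T_L, T_R, O_L, O_R, v_t=0.05, v_c=0.02):
    """Map target (T) and obstacle (O) spike outputs to wheel linear velocities.

    Implements Eqs. (5.3)-(5.4); v_t and v_c are assumed turning and cruise
    velocity components (m/s), not values taken from this work.
    """
    v_L = (T_L - O_R) * v_t + v_c
    v_R = (T_R - O_L) * v_t + v_c
    return v_L, v_R

# Example: only the right-wheel target neuron spiked and no obstacle currents arrived
print(wheel_velocities(T_L=0, T_R=1, O_L=0, O_R=0))
```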

An omnidirectional LASER rangefinder is also modeled to see the landmarks in the range of the robot. The LASER rangefinder gives the linear distance to the closest point of reflection of the LASER with corresponding bearing value, thus identifying any obstacles in the range. To complement the LASER rangefinder, we also model a feature sensor that is essentially a vision sensor, like camera. However, to maintain simplicity, we assume that the feature sensor is able to read the landmark features directly as discussed in Section 5.1.4. Figure 5.1 shows the simulated sensor data for the omnidirectional LASER sensor as the robot navigates in the environment. The

‘×’ represents the heading of the robot. As can be observed, the LASER sensor essentially maps the landmarks present in the range of the sensor around the robot. These distances to the landmarks at the respective bearing angle are mapped to proportional current values and sent to the feature sensor as explained in more detail in Section 5.3.1.

Figure 5.1: LASER sensor simulation as the robot moves

5.1.2 Robot Kinematics

Consider a differential drive model of a robot as shown in Figure 5.2. The velocities of the left and right wheels are characterized as scalar angular velocities ωL and ωR, respectively. We follow the

formulation as derived in [150]. Hence, the body-fixed components v and ω_K of the robot velocity are related to the wheel velocities as:

\begin{bmatrix} v \\ \omega_K \end{bmatrix} = C \begin{bmatrix} \omega_R \\ \omega_L \end{bmatrix}    (5.5)

where, v is the linear velocity, ωK is turning rate, and C is defined as

C = \begin{bmatrix} \frac{r_R}{2} & \frac{r_L}{2} \\ \frac{r_R}{b} & -\frac{r_L}{b} \end{bmatrix}    (5.6)

in which r_L and r_R are the radii of the left and right wheels respectively, and b is the axle distance between the wheels. The factory parameters of Khepera III are r_L = r_R = r = 0.0205 m and b = 0.08841 m. Substituting Eqn. (5.6) into (5.5) yields:

v = \frac{r}{2}(\omega_R + \omega_L)    (5.7)

\omega_K = \frac{r}{b}(\omega_R - \omega_L)    (5.8)

The robot kinematic equations can then be written as:

\dot{x}_K = v \cos\psi
\dot{y}_K = v \sin\psi    (5.9)
\dot{\psi} = \omega_K

where \dot{x}_K and \dot{y}_K are the components of the robot's velocity in the x and y directions respectively, and ψ is the robot heading measured from the x-axis.
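A minimal sketch of this kinematic model is shown below using the Khepera III factory parameters quoted above; the wheel speeds and the integration time step in the example are arbitrary.

```python
import numpy as np

r = 0.0205    # wheel radius (m), Khepera III factory parameter
b = 0.08841   # axle distance between the wheels (m)

def body_velocities(omega_R, omega_L):
    """Eqs. (5.7)-(5.8): wheel angular velocities -> linear speed and turning rate."""
    v = 0.5 * r * (omega_R + omega_L)
    omega_K = (r / b) * (omega_R - omega_L)
    return v, omega_K

def step(x, y, psi, omega_R, omega_L, dt=0.01):
    """One Euler integration step of the kinematic model of Eq. (5.9)."""
    v, omega_K = body_velocities(omega_R, omega_L)
    return x + v * np.cos(psi) * dt, y + v * np.sin(psi) * dt, psi + omega_K * dt

# Example: right wheel slightly faster than the left wheel
print(step(0.0, 0.0, 0.0, omega_R=10.0, omega_L=9.0))
```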

5.1.3 Place Cell Configuration

Figure 5.2: Modeling of a two-wheeled differential drive robot

The robot is modeled to contain a number of place cells that spike as it moves around the environment. The place fields are modeled as 2D Gaussian distributions with mean µ_i and covariance Σ_i, where µ_i = [x_i y_i]^T is the location of the i-th place cell. Also,

\Sigma_i = \begin{bmatrix} \sigma_{x_i}^2 & \rho\,\sigma_{x_i}\sigma_{y_i} \\ \rho\,\sigma_{y_i}\sigma_{x_i} & \sigma_{y_i}^2 \end{bmatrix}

where Σ_i is the covariance matrix for place cell i. We assume that ∀i, σ_{x_i} = σ_{y_i} = σ and the correlation ρ = 0, so that

\Sigma = \begin{bmatrix} \sigma^2 & 0 \\ 0 & \sigma^2 \end{bmatrix}

Hence, the place cell activity P^k = [P_i^k] at a location k = [x y]^T can be written as follows:

P_i^k = \frac{1}{2\pi\sqrt{|\Sigma|}} \exp\left(-\frac{1}{2}(k - \mu_i)\,\Sigma^{-1}(k - \mu_i)^T\right)    (5.10)
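The place-cell activity of Eq. (5.10) can be sketched as follows; the place-cell center, field width σ, and query location in the example are arbitrary.

```python
import numpy as np

def place_cell_activity(k, mu, sigma):
    """Eq. (5.10): isotropic 2D Gaussian place-field activity at location k = [x, y]."""
    Sigma = sigma**2 * np.eye(2)          # covariance with rho = 0 and sigma_x = sigma_y
    d = np.asarray(k, float) - np.asarray(mu, float)
    norm = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(Sigma)))
    return norm * np.exp(-0.5 * d @ np.linalg.inv(Sigma) @ d)

# Example: place cell centred at (50, 50) cm with a 10 cm field width
print(place_cell_activity([55.0, 48.0], mu=[50.0, 50.0], sigma=10.0))
```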

5.1.4 Environment Features

The environment is modeled as a closed rectangular area with a number of landmarks randomly placed across the area. The landmarks are modeled as distinct polygons as follows:

We first find the centers of the landmarks, ci = [cxi cyi ], for i = 1, 2, ..., nl, nl being number of landmarks. These are scattered evenly in every 0.25[XY ] section of the environment, where

X and Y are the dimensions of the environment. The landmark vertices, li = [lxi lyi ], are then generated as follows:

For θ = [0 : 2π/V : 2π], V being the number of vertices desired:

lxi = cxi + (B + R) cos θ (5.11)

lyi = cyi + (B + R) sin θ (5.12) where, B = αY , and R = B.rand(). These vertices are then joined with straight lines to get a polygon with V vertices. The value of α may be varied to get desired size of the landmarks. The rand() function gives out uniformly distributed random numbers in the range [0, 1].
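A small sketch of this landmark-generation procedure is given below; the value of α and the random-number handling are illustrative choices, since the text only states that α may be varied to obtain the desired landmark size.

```python
import numpy as np

def make_landmark(cx, cy, V, Y, alpha=0.05, rng=np.random.default_rng()):
    """Eqs. (5.11)-(5.12): generate V polygon vertices around the centre (cx, cy)."""
    theta = np.arange(V) * 2.0 * np.pi / V      # one vertex per angular step of 2*pi/V
    B = alpha * Y                               # base radius, proportional to the field size
    R = B * rng.random(theta.shape)             # uniform random radial jitter in [0, B]
    lx = cx + (B + R) * np.cos(theta)
    ly = cy + (B + R) * np.sin(theta)
    return np.column_stack([lx, ly])            # vertices, joined by straight lines

# Example: a 6-vertex landmark in a 20 m x 20 m environment
print(make_landmark(5.0, 12.0, V=6, Y=20.0))
```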

It is assumed that each landmark has a unique feature that will be picked up by the robot.

These features may be physical features of the landmarks, such as, size, color, etc. We give distinct features to the landmarks. The feature sensor described earlier can read the vector of these features when the landmark is in range. This also constitutes the input to the localization neural network as discussed in Section 5.3.

5.2 Learning Scheme for Navigation

The learning scheme that we developed can be visualized from the schematic as shown in Figure 5.3.

To understand, consider the input current coming from the sensors. It may be noted that there are two types of sensors: i) target sensor; and ii) obstacle sensor. Both of these sensors output current based on distances from target and obstacles respectively. These currents split into the left and right wheel neuron model according to the synaptic resistances. They are then added in the motor neuron circuits based on an LIF neuron model. This neuron model then generates a spike when the membrane voltage crosses the threshold. As a result, the robot moves to a new position and/or orientation. The new sensor inputs are then used to compare the change in orientation. If this change is favorable (robot heading moved towards the target and/or away from the obstacle), the synaptic resistance decreases for that pair of sensory-motor neuron. On the other hand, an unfavorable movement (robot heading moved away from the target and/or towards the obstacle) leads to an increase in the synaptic resistance for that sensory-motor neuron pair. This learning scheme is also given in Algorithm 1.

while not reached target do
    Find the target and obstacle information from the sensors.
    Send out currents based on the perceived signal.
    Add the currents from all sensors in the motor neuron circuit to generate a spike, thus moving the robot.
    Find the new position and orientation of the robot.
    Find the target and obstacle information from the sensors again.
    if robot heading changed towards the target then
        Strengthen the connection between the sensor and the spiked motor neuron.
    else
        Weaken the connection between the sensor and the spiked neuron.
    end
    if the robot moved away from the obstacle then
        Strengthen the connection between the obstacle sensor and the spiked motor neuron.
    else
        Weaken the connection between the obstacle sensor and the spiked neuron.
    end
end
Algorithm 1: Learning scheme for active learning using STDP with a two-terminal memory device model
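The sketch below condenses the strengthen/weaken step of Algorithm 1 into a few lines; the dictionary representation of the synaptic weights and the relative learning rate eta are simplifying assumptions and do not correspond to the actual device update, which follows the STDP mechanism described in Chapter 4.

```python
def update_synapses(weights, active_pairs, favorable, eta=0.05):
    """Simplified sketch of the reward-modulated update in Algorithm 1.

    weights[(s, m)] is the synaptic conductance between sensor s and motor
    neuron m. A favorable move (heading turned towards the target or away
    from the obstacle) strengthens the active connections, i.e. conductance
    rises and the device resistance falls; an unfavorable move weakens them.
    eta is an assumed relative learning rate, not a value from this work.
    """
    for pair in active_pairs:                    # sensor-motor pairs that spiked together
        weights[pair] *= (1 + eta) if favorable else (1 - eta)
    return weights

# Example: sensor 3 and the left motor neuron were co-active and the move helped
w = {(3, "left"): 1.0e-6, (3, "right"): 1.0e-6}
print(update_synapses(w, [(3, "left")], favorable=True))
```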

5.3 Learning Scheme for Localization

5.3.1 Network Design

The robot is trained in a given environment with landmarks and features as described in Section 5.1.4. The input to the localization neural network, x_i^k, is the vector of landmark features. This vector is calculated as follows:

x_i^k(\Theta) = f^k(\Theta) \cdot z^k(\Theta)    (5.13)

where f^k(\Theta) = [f_1(\Theta)\ f_2(\Theta)\ \ldots\ f_{n_l}(\Theta)]^T is the feature vector for the landmark at bearing angle Θ, and z^k is the output of the LASER sensor (the distance value) at that bearing. Hence, from Figure 5.4, the inputs to the network, x_i^k, are the vectors of landmark features, x_i (i = 1, 2, ..., n_l), when the robot is at location k in the environment. The weighted sum through the device array can be calculated as:

s_j^k = \sum_i x_i \, c_{ij}    (5.14)

where c_{ij} are the conductances of the memristive devices. The range of c_{ij} is assumed to be 0.01 × 10^{-6} Ω^{-1} to 100 × 10^{-6} Ω^{-1}. These are initialized as randomly distributed about the mean of c_0 = 1.137 × 10^{-6} Ω^{-1} with a standard deviation of 10% of c_0. The output of the network can be calculated by passing this weighted sum, s_j^k, through the sigmoid function y_j^k = f(s) = \frac{1}{1 + e^{-s}}. It should be noted that s^k are normalized to the range [−1, 1] before being passed to the sigmoid function.

Figure 5.3: Schematic of the learning scheme

This output, y, represents the place field activity of the robot that may be decoded into (x, y) coordinates for representation purposes.
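A minimal sketch of this forward pass (Eq. (5.13), Eq. (5.14), and the sigmoid output) is shown below; the array sizes, the random conductance initialization around c_0, and the exact min-max normalization of s^k are illustrative assumptions.

```python
import numpy as np

def localization_forward(f, z, C):
    """Forward pass of the localization network.

    f : (n_landmarks,) feature values read at the scanned bearings
    z : (n_landmarks,) LASER distances at the same bearings
    C : (n_landmarks, n_place_cells) memristive conductance matrix
    """
    x = f * z                                          # Eq. (5.13): feature-weighted range input
    s = x @ C                                          # Eq. (5.14): weighted sum through the crossbar
    s = 2.0 * (s - s.min()) / (s.max() - s.min()) - 1  # normalise s to [-1, 1]
    return 1.0 / (1.0 + np.exp(-s))                    # sigmoid place-field activity y

# Example with random conductances about c0 = 1.137e-6 (10% spread), 8 landmarks, 25 place cells
rng = np.random.default_rng(0)
C = rng.normal(1.137e-6, 0.1 * 1.137e-6, size=(8, 25))
print(localization_forward(rng.random(8), rng.random(8), C))
```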

Figure 5.4: Device network for localization

5.3.2 Learning Rule

The learning rule is assumed to be linear for learning the landmark features. To realize the device model, we calculate the pulse width needed to change the conductance, by changing the carrier concentration, as discussed in the Device-Physics based model in Section 4.3. The pulse width, ω, is related to the error in network output as follows:

\omega^k = K_1 \, x^{k\,T} (P^k - y^k)    (5.15)

where K_1 is a normalization constant to keep ω^k in the range of [−200, 200] ms. The change in conductance is then calculated as:

\Delta c^k = \omega^k \left[\frac{1 + \mathrm{sign}(\omega)}{2} K_p + \frac{1 - \mathrm{sign}(\omega)}{2} K_n\right]    (5.16)

where K_p and K_n are the rates for learning and unlearning, respectively. Thus the conductances may be updated as c_{ij} ← c_{ij} + ∆c_{ij} by applying a pulse of width ω_{ij}. This relation is plotted in Figure 5.5 for values of K_p = 3 × 10^{-7} and K_n = 5 × 10^{-9}. The units of K_p and K_n may be noted as ms^{-1} Ω^{-1}.

Figure 5.5: Change in conductance with the applied pulse width
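The learning rule of Eqs. (5.15)-(5.16) can be sketched as follows, using the K_p and K_n values quoted above; the value of K_1 and the input sizes are placeholder assumptions, and the clipping of ω to [−200, 200] ms is made explicit as a safeguard.

```python
import numpy as np

def conductance_update(x, P, y, K1=1.0, Kp=3e-7, Kn=5e-9):
    """Eqs. (5.15)-(5.16): pulse width from the output error and the resulting
    conductance change. K1 is an assumed normalisation constant intended to keep
    omega within [-200, 200] ms; the clip below enforces that range explicitly."""
    omega = K1 * np.outer(x, P - y)            # one pulse width per synapse (ms)
    omega = np.clip(omega, -200.0, 200.0)
    dC = omega * (Kp * (1 + np.sign(omega)) / 2 + Kn * (1 - np.sign(omega)) / 2)
    return dC                                   # add element-wise to the conductance matrix

# Example: 3 inputs, 4 place cells
rng = np.random.default_rng(1)
print(conductance_update(rng.random(3), rng.random(4), rng.random(4)))
```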

5.3.3 Learning Scheme

The complete learning scheme integrating localization with navigation and obstacle avoidance is shown in Figure 5.6. As shown in the figure, the LASER sensor gives out a scan of the environment around the robot, thus detecting any landmarks/obstacles in range. This scan is input to a feature sensor that identifies the landmarks in range and extracts the unique features of those landmarks. In a practical scenario, we can think of this feature sensor as a vision sensor that, through image processing, can identify a landmark and the associated features. The output of the feature sensor is a vector of all the features collected around the robot with respect to the bearing angle. After going through the device array, we get an output as described in Section 5.3. This output is basically the array of place cell activities that may be decoded into an (x, y) coordinate for visualization purposes. This output is then compared against the place cell activity and a pulse width is evaluated for changing the device conductance as described in Section 5.3.2.

The output of the LASER sensor is also sent to the obstacle sensor to carry out obstacle avoidance and navigation. This has been described earlier in Section 5.2.

Figure 5.6: Schematic of the learning scheme for localization and navigation

Chapter 6

Simulation Results

6.1 Simulation Results for Navigation

In this section, we present the simulation results for navigation of a two-wheeled differential drive robot through an unknown environment without any knowledge of the obstacles a-priori. The simulation environment is a 20m × 20m area with randomly placed obstacles. The boundary of the environment is also modeled as an obstacle. Seven different obstacle layouts, (a) through (g), are considered for comparing the performances of the four methods: i) Proposed memristive device based method that uses macro-model (M1); ii) Proposed memristive device based method that uses device-physics derived model (M2); iii) Reinforcement learning method using global knowledge

(RLg); and iv) Reinforcement learning method using local knowledge (RLl). The robot is equipped with five infrared proximity sensors placed around the robot. It should be noted that the robot is navigating with only local knowledge of the environment (for methods M1, M2, RLl) and is fairly agnostic to the placement of the obstacles.

6.1.1 Macro-Model (M1)

Figure 6.1 shows the navigation results of the robot while utilizing the macro-model of the memory device. The robot starts from ‘◦’ and has to reach the target shown by ‘×’. The last three obstacle layouts are more interesting to note here since they contain a region of local minima. It can be seen that the robot is able to navigate out of the U-shaped obstacle as shown in Figure 6.1e, and out of the reversed layout in Figure 6.1f. Another complex layout is shown in Figure 6.1g, where the robot is able to successfully navigate through the maze-like structure.


Figure 6.1: Navigation of robot through different obstacle layouts employing the macro-model of memristive device (M1)

6.1.2 Device-Physics Derived Model (M2)

Figure 6.2 shows the navigation results of the robot while utilizing the device-physics derived model of the memory device. As with the previous model, the robot can navigate to the target while avoiding the obstacles and again, the local minima.

38 6.1.3 Comparison with Paths Based on Reinforcement Learning Algorithms

Reinforcement Learning Algorithm

Reinforcement Learning problems involve learning how to take actions, based on situations, so as to maximize a reward signal. The earned reward can be used as feedback for the next action, which is known as exploitation, or the agent may choose to explore the environment for better action selection in the future.

Reinforcement learning algorithms generally involve estimating a value function. Value functions are functions of states (or of state-action pairs, which are referred to as Q-values) that estimate how important it is for the agent to be in a given state [151]. For refining the values of the grids, the value iteration algorithm [151] is used as described below:

Initialize V(s) randomly.
Initialize ∆ as a large number, where ∆ is the difference between the values of a particular state in successive iterations. Set the count k = 1.
Repeat
    For each s ∈ S:

        V_k(s) \leftarrow \min_a \sum_{s'} P^a_{ss'}\left[R^a_{ss'} + \gamma V_{k-1}(s')\right]    (6.1)

    Compute ∆ = V_k(s) − V_{k−1}(s).
    Increment the count: k ← k + 1.
until ∆ < 0.01
Output a policy π, such that

\pi(s) = \arg\min_a \sum_{s'} P^a_{ss'}\left[R^a_{ss'} + \gamma V^*(s')\right]    (6.2)

where P^a_{ss'} refers to the state transition probability from state s to s' by taking action a, R^a_{ss'} refers to the expected reward for that particular transition, and V^* is the converged (optimal) value function. The update equation is given by:

V_k(s) \leftarrow \min_a \sum_{s'} P^a_{ss'}\left[1 + \gamma V_{k-1}(s')\right]    (6.3)

where 1 signifies the one-step cost to move from one state to the other, and P^a_{ss'} is the probability of choosing an action a in state s, whose value depends on the location of the robot. This one-step cost usually represents the distance the robot is required to move to complete one transition.

In this work, the reinforcement learning problem is to reach a known goal position via the shortest path, while avoiding obstacles. For this path planning problem, the complete environment is divided into a number of grids and each grid carries a value that reflects the cost to reach the goal from that grid. The exploration step involves starting from a random position and trying to reach the goal by using the current knowledge of obstacles that comes from the sensors. For a better estimate of the expected reward (the values of the grids in this case), this step must be performed a number of times. Once enough information is collected, the robot exploits the knowledge of the grid values to choose the lowest value among its neighbors. This brings it one step closer to the goal. The choice of the number of exploration steps depends on the size of the field and the speed of computation.

The complete algorithm is constructed as in Algorithm 2.

To validate the performance of the proposed algorithms, their results are compared with paths obtained via the following two algorithms based on reinforcement learning (RL):

RL with global knowledge of obstacles (RLg), where the locations of all the obstacles within the environment are known a-priori and are used to build a grid-based map of the environment through exploration. The final map provides the minimal distance from all locations to the target with probability close to 1, so that the path obtained using this approach is the shortest possible path between the initial position (◦) and the goal position (×). These paths are shown in Figure 6.3 for the same layouts of obstacles as before.

RL with local knowledge of obstacles (RLl), where the initial map does not include information about obstacles, and the robot only discovers them when they are encountered during navigation. Thus, paths must be learned gradually and are not guaranteed to be optimal, especially in the case of non-convex obstacles. Figure 6.4 shows the paths found using the RL algorithm with local knowledge of the environment (RLl).

Generate grids and set all grid-values to zero.
while current position ≠ goal do
    for a given number of steps, k do
        while current position ≠ goal do
            Identify neighbor grids and visible grids*.
            Move to next grid**.
            Update previous grid's value as
                V_k(s) \leftarrow \min_a \sum_{s'} P^a_{ss'}\left[1 + \gamma V_{k-1}(s')\right]
            Set current position = next grid.
        end
    end
    Identify neighbor grids and visible grids.
    Identify occupied grids (obstacles) from SLAM.
    Mark occupied grids with a grid value of 10000.
    Move to next grid.
    Set current position = next grid.
end
* grids within the range of the LASER.
** next grid is the grid with the minimum value among the neighboring grids. If there is more than one minimum, we choose randomly.
Algorithm 2: Path Planning using Reinforcement Learning
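For reference, a compact sketch of the grid value-iteration update of Eq. (6.3) is given below, assuming deterministic moves to 4-connected neighbor cells and a unit one-step cost; this is a simplification of Algorithm 2, which additionally interleaves exploration and obstacle marking.

```python
import numpy as np

def value_iteration(grid_cost, goal, gamma=1.0, tol=0.01):
    """Sketch of the grid value-iteration update of Eq. (6.3).

    grid_cost marks occupied cells with a large value (e.g. 10000); free cells
    start at zero. Moves are assumed deterministic to 4-connected neighbours,
    so the transition probability collapses to 1 for the chosen neighbour.
    """
    V = np.array(grid_cost, dtype=float)
    rows, cols = V.shape
    while True:
        delta = 0.0
        for i in range(rows):
            for j in range(cols):
                if (i, j) == goal or grid_cost[i][j] >= 10000:
                    continue                      # keep goal at 0 and obstacles expensive
                neighbours = [(i + di, j + dj) for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                              if 0 <= i + di < rows and 0 <= j + dj < cols]
                new_v = min(1.0 + gamma * V[a, b] for a, b in neighbours)
                delta = max(delta, abs(new_v - V[i, j]))
                V[i, j] = new_v
        if delta < tol:
            return V        # the robot then always moves to the neighbour with the smallest value

# Example: 5x5 obstacle-free grid with the goal in a corner
print(value_iteration(np.zeros((5, 5)), goal=(4, 4)))
```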

A comparison can now be made between the two models of the memristive device and the paths obtained using RL algorithms. Table 6.1 shows the path lengths for the M1, M2, RLg, and RLl algorithms. It can be observed that the device-physics derived model (M2) has better performance as compared to the macro-model (M1). This is expected since the device-physics derived model captures the actual resistance switching mechanism through the F-P equation more accurately. It is very interesting to note that our model is quite efficient and approaches the optimal path for all but the most difficult obstacle layouts. Further, it is able to navigate through the region of local minima quite efficiently as evident from the last three obstacle layouts (Figures 6.2e- 6.2g vs. Figures 6.4e- 6.4g). This ability to handle local minima is one of the most striking features of the proposed approach, and one where many other path planning and navigation algorithms encounter problems. It can be inferred that an active real-time learning considerably improves the performance of the robot navigation in our case.

41 Table 6.1: Comparison of path lengths for different layouts of obstacles (meters)

Layout    M1       M2       RLg      RLl
a         30.05    27.92    25.92    25.92
b         33.66    30.25    25.92    25.92
c         29.08    26.93    25.33    25.33
d         29.99    29.34    27.09    28.26
e         44.86    41.87    26.74    61.06
f         26.98    26.55    24.85    24.85
g         40.57    38.84    30.18    58.18

6.2 Device Variability Study

One of the major limitations of analog state memristive devices has been the variability in the initial and various reconfigurable resistive states. The variabilities in states can arise due to variation in the process parameters during device fabrication or due to the mechanism of switching itself.

Most of the oxide-based memristive devices rely on intentional or unintentional doping to achieve reconfigurable states. Variability in doping can introduce variability in states. Hence, the variability due to various factors such as stochastic formation/rupture of the conductive filaments, variation of tunneling gap distances, newly generated traps, etc. [152] needs to be studied to develop a set of device specifications for application-specific tasks, such as robotic navigation. Furthermore, it is important to develop memristive device-based learning algorithms that are not only robust to variabilities in these devices but also exploit these variabilities for a more enhanced performance.

Researchers have studied the immunity to these variabilities in fabricated devices for applications like handwritten number recognition [153]. We studied three prominent sources of variation in device parameters as follows:

A. Variability in device initial doping concentration

1) Uniform distribution

2) Normal distribution

B. Variability in update of resistive states

C. Device malfunction

To evaluate the performance of the developed learning scheme with the described variabilities, we navigated a robot in an unknown environment with randomly placed obstacles. The robot tries to reach the target location while avoiding the obstacles. It is important to note here that the robot navigates with only local knowledge of the environment and does not know the locations of all the obstacles a-priori. We navigate the robot for a reasonably large number of steps for a total of 10 times, and a success rate is defined as the percentage of times the robot is able to reach the goal in a reasonable number of steps. It should be noted that this definition of success rate is somewhat subjective and depends on the required performance of the system. A success rate of more than 80% is considered to be acceptable. This study has been published in [154].

6.2.1 Variability in Device Doping Concentration

The memristive devices rely on intentional and unintentional doping during the fabrication process to achieve reconfigurable states. The variations in process parameters during fabrication give rise to variations in these resistive states. We have modeled this variation as following 1) a uniform distribution and 2) a normal distribution.

Uniform distribution

A variability parameter (v) is varied from 1 through 100. This means a range of n_0/v to n_0·v. Figure 6.5 shows the performance results of the robot navigation for the case of uniformly distributed variation in initial doping concentration. We can observe that a value of v up to 13 is acceptable for the robot navigation scenario. As the range gets wider, the performance of the system degrades rapidly.

Normal distribution

The values for the initial doping concentration are modeled as normally distributed with mean n_0 = 9 × 10^{20} cm^{-3} and a standard deviation (σ) that varies as v% of n_0, where v is again varied from 1 through 100. However, these values are assumed to lie between the bounds of n_{lo} = n_0/v and n_{hi} = n_0·v as in the previous case. This also comes from the fact that the doping concentration cannot be negative or unreasonably high. This is achieved by considering a truncated normal distribution [155] in the specified range.

Figure 6.6 represents the probability distribution functions for both models for v = 60. The desired value (n_0), the actual means of the distributions, and the lower and upper bounds are also marked. It should be noted that for small values of v, the probability distribution curves are narrower and most of the values lie close to the desired value n_0. As the variability increases, the uniform distribution stretches much faster than the normal distribution. This means that while there is still a high probability of getting a value close to the desired value n_0 for the normal distribution, the uniform distribution has already moved quite far away from n_0. This also explains the rapid decline in performance for uniformly distributed variation (Figure 6.5), unlike the case of the normal distribution, where large values of σ are still tolerable, as shown in Figure 6.7.
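The two variability models can be sampled as in the sketch below; the rejection-sampling implementation of the truncated normal distribution is a simple stand-in for the construction described in [155].

```python
import numpy as np

n0 = 9e20          # desired initial doping concentration (cm^-3)

def sample_doping(v, size, mode="normal", rng=np.random.default_rng()):
    """Draw initial doping concentrations under variability parameter v.

    Uniform case: values spread over [n0/v, n0*v].
    Normal case:  normal with mean n0 and std = (v/100)*n0, truncated to the
                  same interval via simple rejection sampling.
    """
    lo, hi = n0 / v, n0 * v
    if mode == "uniform":
        return rng.uniform(lo, hi, size)
    sigma = (v / 100.0) * n0
    samples = []
    while len(samples) < size:                  # reject draws outside [lo, hi]
        x = rng.normal(n0, sigma)
        if lo <= x <= hi:
            samples.append(x)
    return np.array(samples)

print(sample_doping(60, 5, mode="uniform"))
print(sample_doping(60, 5, mode="normal"))
```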

6.2.2 Variability in Update of Resistive States

At each time step, the device resistance is changed by applying a pulse whose width is modulated as described in Section 4.3. However, the inherent noise in the signal and the device dynamics will cause the state to attain a value different from the one calculated from Eqn. 4.8. This is modeled by considering a normally distributed noise in the change in carrier concentration (Eqn. 4.7) with a standard deviation (SD) varied up to 100%. Figure 6.8 shows the performance results of the robot navigation with normally distributed noise in updating the device resistance states. It can be concluded that a noise of up to 55% is acceptable in the resistance update mechanism. We can also see from the beginning of the curve that a small amount of noise actually helps to get a better performance out of the learning scheme.

6.2.3 Device Malfunction

This study highlights the fault tolerance of our learning scheme. Here, we assume that a device spontaneously stops working after some time and gets stuck in a low resistance state (shorted). For simulating this scenario, we have marked the devices with their IDs from 1 through 16 as shown in

Figure 6.9.

To capture the effects of the devices irrespective of the layout of the environment, two different obstacle courses are modeled and the robot is navigated through both of them. These obstacle layouts are mirror images of each other as can be seen from Figure 6.10.

Figure 6.11 shows the performance results of the robot navigation with shorted devices for layouts 1 and 2, respectively. We can observe that device malfunction greatly affects the robot navigation scenario and that its impact depends on the environment layout. For layout 1, devices #13 and #15 are very crucial for the navigation. This is expected since they connect the front and right obstacle sensors with the left wheel array. Hence, when the robot approaches the obstacle to its right, the right wheel does not get any current due to these devices being shorted, and hence the robot cannot navigate away from the obstacle. This can be seen in Figure 6.12. The device is shorted at the beginning of the red line, and the robot gets stuck at its end, unable to reach the goal.

The results can be summarized in Table 6.2 as follows:

Table 6.2: Results Summary

Study                          Case                    Acceptable Variability
Device doping concentration    Uniform distribution    Up to v = 13
Device doping concentration    Normal distribution     Up to 100% SD with mean n0
Update of resistive state      -                       Up to 55% SD from calculated value
Shorted device                 Layout 1                All except #13 and #15
Shorted device                 Layout 2                #5, #7, and #11

We can observe that the developed learning scheme is very robust and works with satisfactory performance over a wide range of variations in the device parameters. Hence, minor variations in the process parameters are, in fact, tolerable and, in some cases, even improve the performance of the system. The device malfunction study also shows that the performance is severely limited if one of the devices fails, and hence redundant devices are needed to achieve a truly fault-tolerant system.

We would also like to emphasize that the defined success rate is rather subjective, and the acceptable tolerance limits will depend on the trade-off between fabrication process refinement and robot performance that is acceptable for the particular application.

6.3 Simulation Results for Localization

6.3.1 Memristive Device Model

We present the results of robot localization while it navigates in an unknown environment. The environment size is kept as 115cm×138cm to match the experimental environment size. The sensor range is assumed to be 40cm. Figures 6.13a and 6.13b show two runs for the robot navigation using the proposed learning scheme. As earlier, the ‘O’ denotes starting position while ‘×’ denotes the waypoints. The ‘◦’ denotes the estimated position from the localization network as the robot moves along the shown path. The faint cyan dots show the position of all the place cells.

The error in localization is plotted in Figures 6.14a and 6.14b that may also be visualized as in

Figures 6.15a and 6.15b.

6.3.2 Comparison with Computational SLAM

We compare the localization performance of our memristive device-based learning scheme with a conventional computational SLAM model employing a particle filter. For this purpose, we run the simulation using the Simple SLAM algorithm developed by Randolph Voorhies [156]. This algorithm is based on Montemerlo et al.'s FastSLAM 1.0 algorithm [157], with optimizations removed for clarity. This is explained in the following sections.

Simultaneous Localization and Mapping

Simultaneous Localization and Mapping (SLAM) deals with the problem of estimating the state of the robot while simultaneously building a map of the environment. Given a map with N landmarks, Γ = Γ_1, Γ_2, ..., Γ_N, the robot state s^t at time t can be estimated by the following posterior distribution:

p(\Gamma, s^t \mid z^t, u^t, n^t)    (6.4)

where z^t = z_1, z_2, ..., z_t is a sequence of measurements and u^t = u_1, u_2, ..., u_t is a sequence of robot control inputs. The variables n_t specify the identity of the landmark observed at time t.

To calculate the posterior (6.4), the robot is given a probabilistic motion model p(s_t | u_t, s_{t−1}). Additionally, there is a probabilistic measurement model, p(z_t | s_t, Γ, n_t), describing how measurements evolve from the state. Both models are non-linear functions with independent Gaussian noise:

p(z_t \mid s_t, \Gamma, n_t) = g(s_t, \Gamma_{n_t}) + \varepsilon_t    (6.5)

p(s_t \mid u_t, s_{t-1}) = h(u_t, s_{t-1}) + \delta_t    (6.6)

where g and h are nonlinear functions, and ε_t and δ_t denote Gaussian noise with covariance R_t and P_t, respectively.

FastSLAM

FastSLAM [157] is based on the following factorization of the posterior:

p(\Gamma, s^t \mid z^t, u^t, n^t) = p(s^t \mid z^t, u^t, n^t) \prod_n p(\Gamma_n \mid s^t, z^t, u^t, n^t)    (6.7)

For SLAM problems, this factorization states that if the path of the robot is known, the landmark positions could be estimated independently of each other. However, in practice, it is not possible to know the vehicle’s path. Still, the independence makes it possible to factor the posterior into an estimate of path probability and the position of the landmarks, conditioned on each path.

FastSLAM samples the robot path through a particle filter. Each particle consists of its map with N Kalman filters. The updates in FastSLAM begin with sampling new poses based on the most recent robot movement command ut:

s_t^{[m]} \sim p(s_t \mid s_{t-1}^{[m]}, u_t).    (6.8)

The FastSLAM then updates the estimate of the observed landmarks considering the measurement zt as follows:

p(\Gamma_{n_t} \mid s^{t,[m]}, n^t, z^t) = \eta \, p(z_t \mid \Gamma_{n_t}, s_t^{[m]}, n_t) \cdot p(\Gamma_{n_t} \mid s^{t-1,[m]}, n^{t-1}, z^{t-1})    (6.9)

where η is a constant. Finally, FastSLAM resamples the particles to include the most recent measurement [158].
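For intuition, the sketch below shows the sample-weight-resample structure that FastSLAM builds on, reduced to a localization-only particle filter with a single landmark at a known position; the motion and measurement noise values are arbitrary, and the full FastSLAM additionally maintains one Kalman filter per landmark in every particle, which is omitted here.

```python
import numpy as np

rng = np.random.default_rng(2)

def motion_model(pose, u, dt=0.1, noise=(0.01, 0.01)):
    """Sample a new pose from p(s_t | s_{t-1}, u_t) (Eq. 6.8), unicycle motion."""
    v, w = u
    x, y, psi = pose
    v_n = v + rng.normal(0, noise[0])
    w_n = w + rng.normal(0, noise[1])
    return np.array([x + v_n * np.cos(psi) * dt, y + v_n * np.sin(psi) * dt, psi + w_n * dt])

def measurement_likelihood(pose, landmark, z, R=np.diag([0.1, 0.05])):
    """Gaussian likelihood of a (range, bearing) measurement z of a known landmark."""
    dx, dy = landmark[0] - pose[0], landmark[1] - pose[1]
    z_hat = np.array([np.hypot(dx, dy), np.arctan2(dy, dx) - pose[2]])
    e = z - z_hat
    e[1] = (e[1] + np.pi) % (2 * np.pi) - np.pi       # wrap the bearing error
    return np.exp(-0.5 * e @ np.linalg.inv(R) @ e)

def particle_filter_step(particles, u, z, landmark):
    """One simplified step: sample poses, weight by the measurement, resample."""
    poses = np.array([motion_model(p, u) for p in particles])
    w = np.array([measurement_likelihood(p, landmark, z) for p in poses])
    w = w / w.sum()
    idx = rng.choice(len(poses), size=len(poses), p=w)  # importance resampling
    return poses[idx]

# Example: 100 particles, one observed landmark at (2, 1)
particles = np.zeros((100, 3))
new_particles = particle_filter_step(particles, u=(1.0, 0.1), z=np.array([2.2, 0.5]), landmark=(2.0, 1.0))
print(new_particles.mean(axis=0))
```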

The FastSLAM algorithm is simplified as SimpleSLAM in [156]. Here, FastSLAM is implemented in MATLAB. The code simulates a robot cruising around in a 2D world which contains 5 uniquely identifiable landmarks at unknown locations. The robot is equipped with a sensor that can measure the range and angle to these landmarks, but each measurement is corrupted by noise.

For this work, the SimpleSLAM is modified to suit the needs of our scenarios (e.g. increasing the number of landmarks).

The simulation setup is the same as for the device-based SLAM, and the results can be seen in Figures 6.16 - 6.18.

The simulation is run for 10 different start and destination positions for both device-based and computational SLAM models. The results are tabulated in Tables 6.3 and 6.5. We also simulated the robot localization in the actual experimental layout for 10 different runs. One such run can be seen in Figure 6.19. The results are tabulated in Table 6.4.

Table 6.3: Device-based localization - Simulation errors (cm)

Run       Mean error   Median error   Minimum error   Maximum error
1         5.3309       4.2519         0.2308          18.2189
2         13.2807      9.5699         0.4249          35.5349
3         9.4838       8.5606         0.3568          23.3278
4         5.5195       5.0032         0.2971          25.1072
5         9.9028       2.4907         0.3202          33.0683
6         9.3419       5.2257         0.2774          31.7365
7         5.2119       4.3822         0.7235          16.3203
8         4.2691       3.7433         0.5552          13.7719
9         5.1479       4.3106         0.7112          20.8699
10        7.1416       4.8811         0.3139          34.2034
Overall   7.4630       4.6317         0.2308          35.5349

Table 6.4: Device-based localization in actual experimental layout - Simulation errors (cm)

Run       Mean error   Median error   Minimum error   Maximum error
1         3.9814       3.8532         0.5065          12.2394
2         3.8542       3.3456         0.0838          10.7950
3         5.3833       3.8793         0.1708          29.0707
4         4.9751       4.2464         0.0780          14.2436
5         4.6076       3.7063         0.2650          12.7815
6         4.5615       4.8591         0.7409          12.9829
7         5.4448       5.1975         1.7732          8.6780
8         3.3448       2.5962         0.3743          20.8676
9         3.2339       3.0212         0.6859          9.6025
10        3.6584       3.1317         0.5313          9.8386
Overall   4.3045       3.7798         0.0780          29.0707

Table 6.5: Computational SLAM - Simulation errors (cm)

Run       Mean error   Median error   Minimum error   Maximum error
1         0.3376       0.2754         0.0647          1.9000
2         0.2872       0.2665         0.0154          0.8526
3         0.4037       0.3375         0.0746          1.2587
4         0.3605       0.3149         0.0388          2.0692
5         0.4176       0.3818         0.0198          1.7108
6         0.3381       0.2960         0.0252          1.0658
7         0.2929       0.2675         0.0395          0.6421
8         0.3485       0.2870         0.0297          1.5969
9         0.3540       0.2590         0.0261          1.8012
10        0.3756       0.3161         0.0313          1.1742
Overall   0.3516       0.2915         0.0154          2.0692


Figure 6.2: Navigation of robot through different obstacle layouts employing the device-physics derived model of memristive device (M2)


Figure 6.3: Navigation of robot through different obstacle layouts employing the Reinforcement Learning algorithm with global knowledge (RLg)


Figure 6.4: Navigation of robot through different obstacle layouts employing the Reinforcement Learning algorithm with local knowledge (RLl)

Figure 6.5: Performance results for uniform distribution case

Figure 6.6: Probability distribution functions for device initial doping concentration

Figure 6.7: Performance results for normal distribution case

Figure 6.8: Performance results for variability in update of resistive states

Figure 6.9: Device IDs in the neuromorphic array

(a) Layout 1 (b) Layout 2

Figure 6.10: Two obstacle layouts considered for the device malfunctions study

(a) Layout 1 (b) Layout 2

Figure 6.11: Performance results for shorted device for two layouts

Figure 6.12: Navigation scenario for shorted device #13 in Layout 1

(a) Run 1 (b) Run 2

Figure 6.13: Robot navigation and localization for two different runs through device-based model

(a) Run 1 (b) Run 2

Figure 6.14: Robot localization errors for two different runs through device-based localization

(a) Run 1 (b) Run 2

Figure 6.15: Robot localization errors for two different runs through device-based localization - 3D view

(a) Run 1 (b) Run 2

Figure 6.16: Robot navigation and localization for two different runs through computational SLAM

(a) Run 1 (b) Run 2

Figure 6.17: Robot localization errors for two different runs through computational SLAM

(a) Run 1 (b) Run 2

Figure 6.18: Robot localization errors for two different runs through computational SLAM - 3D view

Figure 6.19: Robot navigation and localization in actual experimental layout

Chapter 7

Experimental Results

7.1 Khepera III Robot

The Khepera III is a very capable robot developed by K-Team, with built-in ultrasonic sensors and infrared proximity sensors. It can communicate with a PC using serial commands through a serial port, WiFi, or Bluetooth. The infrared proximity sensors give out values inversely proportional to the distance from the closest obstacle. The Khepera III takes input in the form of wheel speeds for moving.

7.2 Robot Navigation

7.2.1 Experimental Setup

An experimental setup is established as shown in Figure 7.1. The Khepera III robot communicates with the workstation using Bluetooth. The workstation hosts the proposed memristive device model and the learning scheme. The sensor data are received in real-time and, after processing through the learning scheme, the movement commands are sent in terms of wheel speeds. An overhead camera is set up to estimate the position of the robot. It should be emphasized that the robot position data is used to generate the plots for representation purposes only (see Figures 7.4a and 7.4b); this data is not communicated to the robot. We also set up a USB camera to record the navigation videos.

Figure 7.1: Experimental setup for robot navigation

The experimental setup also shows the experimental area, where the target is marked by a ‘×’ and the obstacles are the randomly placed rectangular boxes denoted in the figure. The perimeter of the area also has walls that serve as a bounding obstacle. The robot starts from a random location on the map and navigates the environment to reach the target while avoiding the obstacles.

The robot gathers the obstacle data through the built-in infrared proximity sensors that give out values inversely proportional to the distance from the closest obstacle. This raw data is converted to a proportional current value that is sent to the obstacle sensor array. The output spikes from the motor neuron circuit are mapped to the wheel velocities for driving the robot.

7.2.2 Results

The robot navigation is run for several different layouts of obstacles. We include two such layouts in this section. A time-lapse snapshot of robot navigation through obstacle layout 1 is shown in Figure 7.2. As we can observe, the robot initially moves towards the right and then encounters the obstacle. It then corners around the obstacle to reach the target. The position of the robot is estimated through an overhead camera and plotted as shown in Figure 7.4a. The robot is provided with the direction and distance to the target, and the presence of obstacles is sensed using the on-board infrared proximity sensors as mentioned in Section 5.1.1.

Figure 7.2: Time-lapse snapshot of robot navigation through obstacle layout 1

Another navigation scenario is shown in Figure 7.3 which is more interesting to note due to the presence of a region of local minima. The robot position is plotted in Figure 7.4b. In this navigation scenario, the robot gets into the region of local minima and, after some exploration, is able to navigate out of it and reach the target. This is a striking feature of our learning scheme where conventional navigation algorithms based on local information of the environment (such as potential field functions) fail.

We experimented with different layouts of obstacles and obtained very promising results that corroborate the validity and the potential of the learning scheme. This work has been published online and some videos of the robot navigation can be found as the supplementary material to

[159].

Figure 7.3: Time-lapse snapshot of robot navigation through obstacle layout 2

7.3 Localization Results

7.3.1 Environment Setup

The environment is set up as shown in Figure 7.5 for demonstrating the localization aspect of the learning scheme. The complete experimental setup follows the same protocols as discussed earlier in Section 7.2.1.

7.3.2 Memristive Device based Model

We present the robot navigation and localization results for two different runs in Figures 7.6 and 7.7.

The landmarks are shown in black filled circles that light up in green as they come in range of the robot. The ‘×’ shows the target and ‘◦’ denotes the estimated position from the localization network as the robot moves in the environment. Figure 7.8 shows the localization errors for the two runs that may also be visualized in Figure 7.9. The broken circle around the target represents the range within which the robot is considered to have reached the target.


Figure 7.4: Navigation of robot through different obstacle layouts - (a) Layout 1, and (b) Layout 2.

7.3.3 Comparison with Computational SLAM

We also performed localization using the computational SLAM method described in Section 6.3.2, while navigating the actual Khepera robot in an experimental setting. Figure 7.10 shows the localization error for the two runs that may also be visualized in Figure 7.11. The broken circle around the target represents the range within which the robot is considered to have reached the target.

We tested 10 different runs with different start and destination positions for both the device-based and computational SLAM models. The results are tabulated in Tables 7.1 and 7.2. Videos of some of the device-based localization runs may be found in [160].

Figure 7.5: Environment setup for robot localization

7.4 Discussions

From Tables 6.5 and 7.2, we can notice the degradation in performance of the computational SLAM during the experimental runs as compared to the simulation runs, by as much as an order of magnitude (10^1) in mean error and maximum error. This is expected since, for the simulation, we model the errors in the robot sensors ourselves while sensing the environment and landmarks. Hence, this is a known error in the model and the algorithm performs well. For the experimental scenario, the actual errors or their behavior/distribution are not known. Hence, we see that the performance of the algorithm is not as good as that in simulation.

For the case of device-based localization, referring to Tables 6.3 and 6.4, we can observe that the localization performance depends on the environment layout. We can see that the actual experimental layout, which contains fewer landmarks spaced farther apart, actually leads to a better memory recall. It is intuitive that a highly cluttered environment will lead to confusion in the location recall of the robot due to a large number of perceived features at all locations.

This may be resolved by an adaptive sensor range that picks up the most relevant features based on the incoming feature stream.

Figure 7.6: Snapshot of robot navigation and localization - run 1

Further comparing the simulation and experimental results of device-based localization (Tables 6.3 and 7.1), we find that the simulation was able to capture the actual experimental runs very precisely, as seen from the localization performance. This is partly because the noise in sensor measurement and robot modeling does not affect our developed approach very much, and partly because the memristive devices have a high tolerance to variability in weight update, as seen in our variability study (Section 6.2.2).

Figure 7.7: Snapshot of robot navigation and localization - run 2

(a) Run 1 (b) Run 2

Figure 7.8: Robot localization errors for two different runs through device-based SLAM

(a) Run 1 (b) Run 2

Figure 7.9: Robot localization errors for two different runs through device-based localization - 3D view

(a) Run 1 (b) Run 2

Figure 7.10: Robot localization errors for two different runs through computational SLAM

(a) Run 1 (b) Run 2

Figure 7.11: Robot localization errors for two different runs through computational SLAM - 3D view

Table 7.1: Device-based localization - Experimental errors (cm)

Run       Mean error   Median error   Minimum error   Maximum error
1         3.8399       3.2690         0.4758          20.9869
2         3.7264       2.8343         0.3986          18.8481
3         4.6478       3.7290         0.3052          24.7500
4         3.7045       3.4195         0.6928          13.2461
5         5.3961       3.9414         0.4694          14.8980
6         4.0890       3.0465         0.1777          18.3121
7         4.9927       3.2535         0.3189          21.5087
8         3.1843       2.5552         0.0876          21.6100
9         6.1209       4.2080         1.1519          26.4341
10        4.2500       3.6425         0.3761          13.2067
Overall   4.3952       3.3443         0.0876          26.4341

Table 7.2: Computational SLAM - Experimental errors (cm)

Run       Mean error   Median error   Minimum error   Maximum error
1         2.8204       2.7273         0.2456          6.8733
2         3.6618       3.1364         0.3841          13.0197
3         2.4726       1.9802         0.0344          11.1645
4         3.0489       3.0279         0.2468          8.3364
5         3.2195       3.2187         0.1363          9.4001
6         3.5087       3.2767         0.2959          10.3119
7         2.8649       2.5514         0.1117          16.8448
8         3.1839       3.3514         0.1665          4.1661
9         3.6606       3.4948         0.2463          19.3131
10        2.3578       1.6813         0.0329          17.6788
Overall   3.0799       3.0822         0.0329          19.3131

Chapter 8

Discussions

We have presented a novel learning scheme, implemented on synaptic memory devices and inspired by biological species, for efficient processing and computing, and hence learning, in robots in an unsupervised manner. Such devices are capable of carrying out computational processing as well as memory storage simultaneously in an energy efficient manner. This property makes them superior to classical von Neumann architectures and ideal for the next generation of computing.

First, we explored the potential of our learning scheme for navigation of robots in an unknown environment with randomly placed obstacles. We found that a robot coupled with just a few such devices is able to perform very well, reaching a known target in a very efficient manner. Further, our approach has a striking advantage over traditional navigation approaches, such as potential field algorithms, in that it does not get stuck in local minima. This approach is also very scalable and can be readily applied to any miniature-sized robot running in any environment.

Next, we looked into the localization aspect of the SLAM problem and found that the robot with a large number of such devices arranged in a crossbar fashion is able to learn the associations between the features of the landmarks around it and the location information coming from the built-in place cells. This is a groundbreaking outcome from our approach and, as validated from the comparison with the computational SLAM, it corroborates the fact that such memristive devices are the next generation of an energy efficient and parallel computing architecture.

We have published parts of our work demonstrating the validity and potential of our learning scheme as [154], [159], [161], [162]. The work on robot localization will be published shortly [163].

8.1 Future Work

While we have integrated and complemented the navigation scheme with the localization aspect, there are still a few areas of extension. The first step forward would be to implement this learning scheme on fabricated device arrays, thus gauging the scalability and potential of such a scheme with actual devices. The device variability study conducted in Section 6.2 will prove vital for designing and developing memristive device arrays for this purpose.

The other aspect of localization, i.e., mapping, still remains to be explored. The traditional ways are too computationally expensive to be realized on miniature robots in an energy efficient manner; hence, a parallel processing array of these devices will be ideal for storing maps of the environment. The power of grid cells may also be utilized, coupled with 'place cells', to build a complete cognitive map and realize a truly brain-inspired mechanism of learning.

Finally, the developed learning scheme may be applied to explore its potential in various other learning scenarios, for example, image recognition, voice recognition, and other deep learning areas, which need a lot of training and effort to be developed for a particular application. The human brain does this in a very efficient and intuitive way, and hence these applications may benefit greatly from our learning scheme.

Bibliography

[1] J. Von Neumann and M. D. Godfrey, “First draft of a report on the edvac,” IEEE Annals

of the History of Computing, vol. 15, no. 4, pp. 27–75, 1993.

[2] T. M. Wong, R. Preissl, P. Datta, M. Flickner, R. Singh, S. K. Esser, E. McQuinn, R.

Appuswamy, W. P. Risk, H. D. Simon, et al., "10^14," IBM Research Division, Research

Report RJ10502, 2012.

[3] R. S. Michalski, J. G. Carbonell, and T. M. Mitchell, Machine learning: An artificial intel-

ligence approach. Springer Science & Business Media, 2013.

[4] S. K. Murthy, “Automatic construction of decision trees from data: A multi-disciplinary

survey,” Data mining and knowledge discovery, vol. 2, no. 4, pp. 345–389, 1998.

[5] J. R. Quinlan, C4. 5: programs for machine learning. Elsevier, 2014.

[6] T.-S. Lim, W.-Y. Loh, and Y.-S. Shih, “A comparison of prediction accuracy, complexity,

and training time of thirty-three old and new classification algorithms,” Machine learning,

vol. 40, no. 3, pp. 203–228, 2000.

[7] J. Gehrke, R. Ramakrishnan, and V. Ganti, “Rainforest—a framework for fast decision

tree construction of large datasets,” Data Mining and Knowledge Discovery, vol. 4, no. 2-3,

pp. 127–162, 2000.

[8] Z. Zheng, “Constructing conjunctions using systematic search on decision trees,” Knowledge-

Based Systems, vol. 10, no. 7, pp. 421–430, 1998.

[9] J. Gama and P. Brazdil, “Linear tree,” Intelligent Data Analysis, vol. 3, no. 1, pp. 1–22,

1999.

[10] M. Radmanesh, A. Nemati, M. Sarim, and M. Kumar, “Flight formation of quad-copters

in presence of dynamic obstacles using mixed integer linear programming,” in ASME 2015

Dynamic systems and control conference, American Society of Mechanical Engineers, 2015,

V001T06A009–V001T06A009.

[11] M. Radmanesh and M. Kumar, “Flight formation of UAVs in presence of moving obstacles

using fast-dynamic mixed integer linear programming,” Aerospace Science and Technology,

vol. 50, pp. 149–160, 2016.

[12] J. Fürnkranz, “Separate-and-conquer rule learning,” Artificial Intelligence Review, vol. 13,

no. 1, pp. 3–54, 1999.

[13] A. An and N. Cercone, “Rule quality measures improve the accuracy of rule induction:

An experimental approach,” in International Symposium on Methodologies for Intelligent

Systems, Springer, 2000, pp. 119–129.

[14] T. Lindgren, “Methods for rule conflict resolution,” in European Conference on Machine

Learning, Springer, 2004, pp. 262–273.

[15] G. P. Zhang, “Neural networks for classification: A survey,” IEEE Transactions on Systems,

Man, and Cybernetics, Part C (Applications and Reviews), vol. 30, no. 4, pp. 451–462, 2000.

[16] L. Camargo and T. Yoneyama, “Specification of training sets and the number of hidden

neurons for multilayer perceptrons,” Neural Computation, vol. 13, no. 12, pp. 2673–2680,

2001.

[17] M. A. Kon and L. Plaskota, “Information complexity of neural networks,” Neural Networks,

vol. 13, no. 3, pp. 365–375, 2000.

[18] J. Y. Yam and T. W. Chow, “Feedforward networks training speed enhancement by optimal

initialization of the synaptic coefficients,” IEEE Transactions on Neural Networks, vol. 12,

no. 2, pp. 430–434, 2001.

[19] M. Siddique and M. Tokhi, “Training neural networks: Backpropagation vs. genetic algo-

rithms,” in Neural Networks, 2001. Proceedings. IJCNN’01. International Joint Conference

on, IEEE, vol. 4, 2001, pp. 2673–2678.

[20] G. G. Yen and H. Lu, “Hierarchical genetic algorithm based neural network design,” in

Combinations of Evolutionary Computation and Neural Networks, 2000 IEEE Symposium

on, IEEE, 2000, pp. 168–175.

[21] P. W. Eklund and A. Hoang, “A performance survey of public domain supervised machine

learning algorithms,” Australian Journal of Intelligent Information Systems. v9 i1, pp. 1–47,

2002.

[22] I. G. Maglogiannis, Emerging artificial intelligence applications in computer engineering:

real word AI systems with applications in eHealth, HCI, information retrieval and pervasive

technologies. Ios Press, 2007, vol. 160.

[23] F. V. Jensen, An introduction to Bayesian networks. UCL press London, 1996, vol. 210.

[24] M Madden, “The performance of bayesian network classifiers constructed using different

techniques,” in Proceedings of European conference on machine learning, workshop on prob-

abilistic graphical models for classification, 2003, pp. 59–70.

[25] D. M. Chickering, “Optimal structure identification with greedy search,” Journal of machine

learning research, vol. 3, no. Nov, pp. 507–554, 2002.

[26] S. Acid and L. M. de Campos, “Searching for bayesian network structures in the space

of restricted acyclic partially directed graphs,” Journal of Artificial Intelligence Research,

vol. 18, pp. 445–490, 2003.

[27] R. G. Cowell, “Conditions under which conditional independence and scoring methods lead

to identical selection of bayesian network models,” in Proceedings of the Seventeenth con-

ference on Uncertainty in artificial intelligence, Morgan Kaufmann Publishers Inc., 2001,

pp. 91–97.

[28] Y. Yang and G. I. Webb, “On why discretization works for naive-bayes classifiers,” in Aus-

tralasian Joint Conference on Artificial Intelligence, Springer, 2003, pp. 440–452.

[29] R. R. Bouckaert, “Naive bayes classifiers that perform well with continuous variables,” in

Australasian Joint Conference on Artificial Intelligence, Springer, 2004, pp. 1089–1094.

[30] J. C. Platt et al., “Using analytic QP and sparseness to speed training of support vector

machines,” Advances in neural information processing systems, pp. 557–563, 1999.

[31] S. S. Keerthi and E. G. Gilbert, “Convergence of a generalized SMO algorithm for SVM

classifier design,” Machine Learning, vol. 46, no. 1-3, pp. 351–360, 2002.

[32] E. Oja, “Simplified neuron model as a principal component analyzer,” Journal of mathe-

matical biology, vol. 15, no. 3, pp. 267–273, 1982.

[33] T. D. Sanger, “Optimal unsupervised learning in a single-layer linear feedforward neural

network,” Neural networks, vol. 2, no. 6, pp. 459–473, 1989.

[34] P. Foldiak, “Adaptive network for optimal linear feature extraction,” in Neural Networks,

1989. IJCNN., International Joint Conference on, IEEE, 1989, pp. 401–405.

[35] K. I. Diamantaras and S. Y. Kung, Principal component neural networks: theory and appli-

cations. John Wiley & Sons, Inc., 1996.

[36] J. Rubner and P. Tavan, “A self-organizing network for principal-component analysis,” EPL

(Europhysics Letters), vol. 10, no. 7, p. 693, 1989.

[37] L. Wang and J. Karhunen, “A unified neural bigradient algorithm for robust PCA and MCA,” to appear in Int. J. of Neural Systems.

[38] S. Haykin and N. Network, “A comprehensive foundation,” Neural Networks, vol. 2, no. 2004,

2004.

[39] K.-L. Du and M. Swamy, “Principal component analysis,” in Neural Networks and Statistical

Learning, Springer, 2014, pp. 355–405.

[40] R. Bro and A. K. Smilde, “Principal component analysis,” Analytical Methods, vol. 6, no. 9,

pp. 2812–2831, 2014.

[41] J. Wang, “Principal component analysis,” in Geometric Structure of High-Dimensional Data

and Dimensionality Reduction, Springer, 2012, pp. 95–114.

[42] H. H. Harmon, Modern factor analysis, 1967.

[43] M. Kendall and A. Stuart, The Advanced Theory of Statistics, Vol. I-III, Griffin: London, 1976-1979.

[44] A. Webb, Statistical Pattern Recognition. Arnold, London, 1999.

[45] J.-F. Cardoso, “Blind signal separation: Statistical principles,” Proceedings of the IEEE,

vol. 86, no. 10, pp. 2009–2025, 1998.

[46] A. Hyvärinen and E. Oja, “A fast fixed-point algorithm for independent component analy-

sis,” Neural computation, vol. 9, no. 7, pp. 1483–1492, 1997.

[47] J. Karhunen, E. Oja, L. Wang, R. Vigario, and J. Joutsensalo, “A class of neural networks

for independent component analysis,” IEEE Transactions on Neural Networks, vol. 8, no. 3,

pp. 486–504, 1997.

[48] E. Oja, “The nonlinear pca learning rule in independent component analysis,” Neurocom-

puting, vol. 17, no. 1, pp. 25–45, 1997.

[49] G. D. Garson, Factor analysis. 2013.

[50] E. E. Cureton and R. B. D’Agostino, Factor analysis: An applied approach. Psychology

Press, 2013.

[51] R. P. McDonald, Factor analysis and related methods. Psychology Press, 2014.

[52] K.-L. Du and M. Swamy, “Independent component analysis,” in Neural Networks and Sta-

tistical Learning, Springer, 2014, pp. 419–450.

[53] R. J. Schalkoff, Pattern recognition. Wiley Online Library, 1992.

[54] S. Grossberg, “Direct perception or adaptive resonance?” Behavioral and Brain Sciences,

vol. 3, no. 03, pp. 385–386, 1980.

[55] T. Kohonen, “The self-organizing map,” Proceedings of the IEEE, vol. 78, no. 9, pp. 1464–

1480, 1990.

[56] H. Ritter, T. Martinetz, K. Schulten, D. Barsky, M. Tesch, and R. Kates, Neural computation

and self-organizing maps: an introduction. Addison-Wesley Reading, MA, 1992.

[57] E. Oja and S. Kaski, Kohonen maps. Elsevier, 1999.

[58] M. Van Hulle, Faithful Representations and Topographic Maps. Wiley, New York, 2000.

[59] M. M. Van Hulle, “Self-organizing maps,” in Handbook of Natural Computing, Springer,

2012, pp. 585–622.

[60] G. Deboeck and T. Kohonen, Visual explorations in finance: with self-organizing maps.

Springer Science & Business Media, 2013.

[61] E. Askanazi, “Self organizing maps,” Bulletin of the American Physical Society, vol. 59,

2014.

[62] H. Adeli and S.-L. Hung, Machine learning: neural networks, genetic algorithms, and fuzzy

systems. John Wiley & Sons, Inc., 1994.

[63] W. Maass, “Lower bounds for the computational power of networks of spiking neurons,”

Neural computation, vol. 8, no. 1, pp. 1–40, 1996.

[64] S. Ghosh-Dastidar and H. Adeli, “Improved spiking neural networks for EEG classification

and epilepsy and seizure detection,” Integrated Computer-Aided Engineering, vol. 14, no. 3,

pp. 187–212, 2007.

[65] W. Gerstner and W. M. Kistler, Spiking neuron models: Single neurons, populations, plas-

ticity. Cambridge university press, 2002.

[66] E. M. Izhikevich, “Resonate-and-fire neurons,” Neural networks, vol. 14, no. 6, pp. 883–894,

2001.

[67] A. L. Hodgkin and A. F. Huxley, “A quantitative description of membrane current and

its application to conduction and excitation in nerve,” The Journal of physiology, vol. 117,

no. 4, p. 500, 1952.

[68] E. M. Izhikevich et al., “Simple model of spiking neurons,” IEEE Transactions on neural

networks, vol. 14, no. 6, pp. 1569–1572, 2003.

[69] E. M. Izhikevich, “Which model to use for cortical spiking neurons?” IEEE transactions on

neural networks, vol. 15, no. 5, pp. 1063–1070, 2004.

[70] S. M. Bohte, J. N. Kok, and H. La Poutre, “Error-backpropagation in temporally encoded

networks of spiking neurons,” Neurocomputing, vol. 48, no. 1, pp. 17–37, 2002.

[71] J. J. Wade, L. J. McDaid, J. A. Santos, and H. M. Sayers, “SWAT: A spiking neural net-

work training algorithm for classification problems,” IEEE Transactions on neural networks,

vol. 21, no. 11, pp. 1817–1830, 2010.

[72] F. Ponulak and A. Kasinski, “Supervised learning in spiking neural networks with ReSuMe:

Sequence learning, classification, and spike shifting,” Neural Computation, vol. 22, no. 2,

pp. 467–510, 2010.

[73] T. Masquelier, R. Guyonneau, and S. J. Thorpe, “Competitive STDP-based spike pattern

learning,” Neural computation, vol. 21, no. 5, pp. 1259–1276, 2009.

[74] J. M. Brader, W. Senn, and S. Fusi, “Learning real-world stimuli in a neural network with

spike-driven synaptic dynamics,” Neural computation, vol. 19, no. 11, pp. 2881–2912, 2007.

[75] M. V. Srinivasan, S.-W. Zhang, J. S. Chahl, E. Barth, and S. Venkatesh, “How honeybees

make grazing landings on flat surfaces,” Biological cybernetics, vol. 83, no. 3, pp. 171–183,

2000.

[76] E. M. Low, I. R. Manchester, and A. V. Savkin, “A biologically inspired method for vision-

based docking of wheeled mobile robots,” Robotics and Autonomous Systems, vol. 55, no. 10,

pp. 769–784, 2007.

[77] O. Trullier, S. I. Wiener, A. Berthoz, and J.-A. Meyer, “Biologically based artificial naviga-

tion systems: Review and prospects,” Progress in neurobiology, vol. 51, no. 5, pp. 483–544,

1997.

[78] M. O. Franz and H. A. Mallot, “Biomimetic robot navigation,” Robotics and Autonomous

Systems, vol. 30, no. 1, pp. 133–153, 2000.

[79] D. Floreano and C. Mattiussi, “Evolution of spiking neural controllers for autonomous

vision-based robots,” in Evolutionary Robotics. From Intelligent Robotics to Artificial Life,

Springer, 2001, pp. 38–61.

[80] D. Floreano, Y. Epars, J.-C. Zufferey, and C. Mattiussi, “Evolution of spiking neural circuits

in autonomous mobile robots,” International Journal of Intelligent Systems, vol. 21, no. 9,

pp. 1005–1024, 2006.

[81] P. F. Verschure, B. J. Kröse, and R. Pfeifer, “Distributed adaptive control: The self-organization

of structured behavior,” Robotics and Autonomous Systems, vol. 9, no. 3, pp. 181–196, 1992.

[82] P. F. Verschure, T. Voegtlin, and R. J. Douglas, “Environmentally mediated synergy between

perception and behaviour in mobile robots,” Nature, vol. 425, no. 6958, pp. 620–624, 2003.

[83] P. Arena, L. Fortuna, M. Frasca, and L. Patanè, “Learning anticipation via spiking networks:

Application to navigation control,” Neural Networks, IEEE Transactions on, vol. 20, no. 2,

pp. 202–216, 2009.

[84] I. P. Pavlov and G. V. Anrep, Conditioned reflexes. Courier Corporation, 2003, vol. 614.

[85] S. Mizumori and J. Williams, “Directionally selective mnemonic properties of neurons in

the lateral dorsal nucleus of the thalamus of rats,” Journal of Neuroscience, vol. 13, no. 9,

pp. 4015–4028, 1993.

[86] G. J. Quirk, R. U. Muller, and J. L. Kubie, “The firing of hippocampal place cells in the dark

depends on the rat’s recent experience,” Journal of Neuroscience, vol. 10, no. 6, pp. 2008–

2017, 1990.

[87] J. O’Keefe and J. Dostrovsky, “The hippocampus as a spatial map. preliminary evidence

from unit activity in the freely-moving rat,” Brain research, vol. 34, no. 1, pp. 171–175, 1971.

[88] J. O’Keefe and D. Conway, “Hippocampal place units in the freely moving rat: Why they fire

where they fire,” Experimental brain research, vol. 31, no. 4, pp. 573–590, 1978.

[89] J. S. Taube, R. U. Muller, and J. B. Ranck, “Head-direction cells recorded from the post-

subiculum in freely moving rats. i. description and quantitative analysis,” Journal of Neu-

roscience, vol. 10, no. 2, pp. 420–435, 1990.

[90] M. Fyhn, S. Molden, M. P. Witter, E. I. Moser, and M.-B. Moser, “Spatial representation

in the entorhinal cortex,” Science, vol. 305, no. 5688, pp. 1258–1264, 2004.

[91] T. Hafting, M. Fyhn, S. Molden, M.-B. Moser, and E. I. Moser, “Microstructure of a spatial

map in the entorhinal cortex,” Nature, vol. 436, no. 7052, pp. 801–806, 2005.

[92] F. Sargolini, M. Fyhn, T. Hafting, B. L. McNaughton, M. P. Witter, M.-B. Moser, and E. I.

Moser, “Conjunctive representation of position, direction, and velocity in entorhinal cortex,”

Science, vol. 312, no. 5774, pp. 758–762, 2006.

[93] J. S. Taube, R. U. Muller, and J. B. Ranck, “Head-direction cells recorded from the post-

subiculum in freely moving rats. ii. effects of environmental manipulations,” Journal of

Neuroscience, vol. 10, no. 2, pp. 436–447, 1990.

[94] J. J. Knierim, H. S. Kudrimoti, and B. L. McNaughton, “Place cells, head direction cells,

and the learning of landmark stability,” Journal of Neuroscience, vol. 15, no. 3, pp. 1648–

1659, 1995.

[95] J. J. Knierim, “Dynamic interactions between local surface cues, distal landmarks, and

intrinsic circuitry in hippocampal place cells,” Journal of Neuroscience, vol. 22, no. 14,

pp. 6254–6264, 2002.

[96] K. M. Gothard, W. E. Skaggs, and B. L. McNaughton, “Dynamics of mismatch correc-

tion in the hippocampal ensemble code for space: Interaction between path integration and

environmental cues,” Journal of Neuroscience, vol. 16, no. 24, pp. 8027–8040, 1996.

[97] E. Save, A. Cressant, C. Thinus-Blanc, and B. Poucet, “Spatial firing of hippocampal place

cells in blind rats,” Journal of Neuroscience, vol. 18, no. 5, pp. 1818–1826, 1998.

[98] A. D. Redish, A. N. Elga, and D. S. Touretzky, “A coupled attractor model of the rodent

head direction system,” Network: Computation in Neural Systems, vol. 7, no. 4, pp. 671–685,

1996.

[99] K. Zhang, “Representation of spatial orientation by the intrinsic dynamics of the head-

direction cell ensemble: A theory,” Journal of Neuroscience, vol. 16, no. 6, pp. 2112–2126,

1996.

[100] A. Samsonovich and B. L. McNaughton, “Path integration and cognitive mapping in a con-

tinuous attractor neural network model,” Journal of Neuroscience, vol. 17, no. 15, pp. 5900–

5920, 1997.

[101] B. McNaughton, L. Chen, and E. Markus, ““dead reckoning,” landmark learning, and the

sense of direction: A neurophysiological and computational hypothesis,” Journal of Cognitive

Neuroscience, vol. 3, no. 2, pp. 190–202, 1991.

[102] W. E. Skaggs, J. J. Knierim, H. S. Kudrimoti, and B. L. McNaughton, “A model of the

neural basis of the rat’s sense of direction,” in Advances in neural information processing

systems, 1995, pp. 173–180.

[103] A. Arleo, Spatial Learning and Navigation in Neuro-Mimetic Systems: Modeling the Rat Hippocampus. dissertation.de, 2000.

[104] N. Burgess, M. Recce, and J. O’Keefe, “A model of hippocampal function,” Neural networks,

vol. 7, no. 6, pp. 1065–1081, 1994.

[105] P. Gaussier, A. Revel, J.-P. Banquet, and V. Babeau, “From view cells and place cells to

cognitive map learning: Processing stages of the hippocampal system,” Biological cybernetics,

vol. 86, no. 1, pp. 15–28, 2002.

[106] M. Hoy, A. S. Matveev, and A. V. Savkin, “Algorithms for collision-free navigation of mobile

robots in complex cluttered environments: A survey,” Robotica, vol. 33, no. 03, pp. 463–497,

2015.

[107] A. R. Girard, A. S. Howell, and J. K. Hedrick, “Border patrol and surveillance missions

using multiple unmanned air vehicles,” in Decision and Control, 2004. CDC. 43rd IEEE

Conference on, IEEE, vol. 1, 2004, pp. 620–625.

[108] G. Saggiani and B. Teodorani, “Rotary wing UAV potential applications: An analytical study

through a matrix method,” Aircraft Engineering and Aerospace Technology, vol. 76, no. 1,

pp. 6–14, 2004.

[109] M. Caccia, R. Bono, G. Bruzzone, and G. Veruggio, “Variable-configuration UUVs for ma-

rine science applications,” Robotics & Automation Magazine, IEEE, vol. 6, no. 2, pp. 22–32,

1999.

[110] K. Lee and M. Han, “Lane-following method for high speed autonomous vehicles,” Interna-

tional Journal of Automotive Technology, vol. 9, no. 5, pp. 607–613, 2008.

[111] M. Sarim, A. Nemati, and M. Kumar, “Autonomous wall-following based navigation of

unmanned aerial vehicles in indoor environments,” in Proceedings of the 2015 AIAA SciTech

Conference, AIAA Infotech @ Aerospace, AIAA, 2015.

[112] D. B. Aranibar and P. J. Alsina, “Reinforcement learning-based path planning for au-

tonomous robots,” in EnRI-XXIV Congresso da Sociedade Brasileira de Computação, 2004,

p. 10.

[113] S. Fujisawa, R. Kurozumi, T. Yamamoto, and Y. Suita, “Path planning for mobile robots

using an improved reinforcement learning scheme,” in Intelligent Control, 2002. Proceedings

of the 2002 IEEE International Symposium on, IEEE, 2002, pp. 67–74.

[114] M. Radmanesh, M. Kumar, P. H. Guentert, and M. Sarim, “Overview of path-planning and

obstacle avoidance algorithms for UAVs: A comparative study,” Unmanned systems, vol. 6,

no. 2, pp. 1–24, 2018.

[115] R. C. Smith and P. Cheeseman, “On the representation and estimation of spatial uncer-

tainty,” The international journal of Robotics Research, vol. 5, no. 4, pp. 56–68, 1986.

[116] H. F. Durrant-Whyte, “Uncertain geometry in robotics,” IEEE Journal on Robotics and

Automation, vol. 4, no. 1, pp. 23–31, 1988.

[117] H. Durrant-Whyte, D. Rye, and E. Nebot, “Localization of autonomous guided vehicles,” in

Robotics Research, Springer, 1996, pp. 613–625.

[118] J. A. Castellanos, J. M. Martínez, J. Neira, and J. D. Tardós, “Experiments in multisensor

mobile robot localization and map building,” IFAC Proceedings Volumes, vol. 31, no. 3,

pp. 369–374, 1998.

[119] J. J. Leonard and H. J. S. Feder, “A computationally efficient method for large-scale con-

current mapping and localization,” in Robotics Research, Springer, 2000, pp. 169–176.

[120] J. Guivant, E. Nebot, and S. Baiker, “Autonomous navigation and map building using laser

range sensors in outdoor applications,” Journal of robotic systems, vol. 17, no. 10, pp. 565–

583, 2000.

[121] S. B. Williams, P. Newman, G. Dissanayake, and H. Durrant-Whyte, “Autonomous un-

derwater simultaneous localisation and map building,” in Robotics and Automation, 2000.

Proceedings. ICRA’00. IEEE International Conference on, IEEE, vol. 2, 2000, pp. 1793–

1798.

[122] R. Smith, M. Self, and P. Cheeseman, “Estimating uncertain spatial relationships in robotics,”

in Autonomous robot vehicles, Springer, 1990, pp. 167–193.

[123] F. Lu and E. Milios, “Globally consistent range scan alignment for environment mapping,”

Autonomous robots, vol. 4, no. 4, pp. 333–349, 1997.

[124] J. E. Guivant and E. M. Nebot, “Optimization of the simultaneous localization and map-

building algorithm for real-time implementation,” Robotics and Automation, IEEE Trans-

actions on, vol. 17, no. 3, pp. 242–257, 2001.

[125] E. Schnipke, S. Reidling, J. Meiring, W. Jeffers, M. Hashemi, R. Tan, A. Nemati, and

M. Kumar, “Autonomous navigation of UAV through GPS-denied indoor environment with

obstacles,” in AIAA Infotech@ Aerospace, 2015, p. 0715.

[126] S. Mandal, A. El-Amin, K. Alexander, B. Rajendran, and R. Jha, “Novel synaptic memory

device for neuromorphic computing,” Scientific reports, vol. 4, 2014.

[127] C. Zamarreño-Ramos, L. A. Camuñas-Mesa, J. A. Pérez-Carrasco, T. Masquelier, T. Serrano-

Gotarredona, and B. Linares-Barranco, “On spike-timing-dependent-plasticity, memristive

devices, and building a self-learning visual cortex,” Frontiers in neuroscience, vol. 5, 2011.

[128] S. Song, K. D. Miller, and L. F. Abbott, “Competitive Hebbian learning through spike-

timing-dependent synaptic plasticity,” Nature neuroscience, vol. 3, no. 9, pp. 919–926, 2000.

[129] E. M. Izhikevich, J. A. Gally, and G. M. Edelman, “Spike-timing dynamics of neuronal

groups,” Cerebral cortex, vol. 14, no. 8, pp. 933–944, 2004.

[130] W. Gerstner, R. Ritz, and J. L. Van Hemmen, “Why spikes? Hebbian learning and retrieval

of time-resolved excitation patterns,” Biological cybernetics, vol. 69, no. 5-6, pp. 503–515,

1993.

[131] A. A. Minai and W. B. Levy, “Sequence learning in a single trial,” in INNS world congress

on neural networks, Erlbaum Hillsdale, NJ, vol. 2, 1993, pp. 505–508.

[132] L. Abbott and K. I. Blum, “Functional significance of long-term potentiation for sequence

learning and prediction,” Cerebral Cortex, vol. 6, no. 3, pp. 406–416, 1996.

[133] P. D. Roberts, “Computational consequences of temporally asymmetric learning rules: I. Dif-

ferential Hebbian learning,” Journal of Computational Neuroscience, vol. 7, no. 3, pp. 235–

246, 1999.

[134] K. Blum, L. Abbott, et al., “A model of spatial map formation in the hippocampus of the

rat,” Neural Computation, vol. 8, no. 1, pp. 85–93, 1996.

[135] W. Gerstner and L. Abbott, “Learning navigational maps through potentiation and mod-

ulation of hippocampal place cells,” Journal of computational neuroscience, vol. 4, no. 1,

pp. 79–94, 1997.

[136] M. R. Mehta, M. C. Quirk, and M. A. Wilson, “Experience-dependent asymmetric shape of

hippocampal receptive fields,” Neuron, vol. 25, no. 3, pp. 707–715, 2000.

[137] T. Masquelier, R. Guyonneau, and S. J. Thorpe, “Spike timing dependent plasticity finds

the start of repeating patterns in continuous spike trains,” PloS one, vol. 3, no. 1, e1377,

2008.

[138] B. W. Büel, “A neurally controlled robot that learns,” Master’s thesis, 2011.

[139] H. Hagras, A. Pounds-Cornish, M. Colley, V. Callaghan, and G. Clarke, “Evolving spiking

neural network controllers for autonomous robots,” in Robotics and Automation, 2004. Pro-

ceedings. ICRA’04. 2004 IEEE International Conference on, IEEE, vol. 5, 2004, pp. 4620–

4626.

[140] D. Floreano, J.-C. Zufferey, and J.-D. Nicoud, “From wheels to wings with evolutionary

spiking circuits,” Artificial Life, vol. 11, no. 1-2, pp. 121–138, 2005.

[141] A. Bouganis and M. Shanahan, “Training a spiking neural network to control a 4-dof robotic

arm based on spike timing-dependent plasticity,” in Neural Networks (IJCNN), The 2010

International Joint Conference on, IEEE, 2010, pp. 1–8.

[142] R. V. Florian, “Reinforcement learning through modulation of spike-timing-dependent synap-

tic plasticity,” Neural Computation, vol. 19, no. 6, pp. 1468–1502, 2007.

[143] R. Evans, “Reinforcement learning in a neurally controlled robot using dopamine modulated

STDP,” Master’s thesis, 2012.

[144] D. O. Hebb, The organization of behavior, 1949.

[145] G. S. Snider, “Self-organized computation with unreliable, memristive nanodevices,” Nan-

otechnology, vol. 18, no. 36, p. 365202, 2007.

[146] G. S. Snider, “Spike-timing-dependent learning in memristive nanodevices,” in Nanoscale

Architectures, 2008. NANOARCH 2008. IEEE International Symposium on, IEEE, 2008,

pp. 85–92.

[147] S. H. Jo, T. Chang, I. Ebong, B. B. Bhadviya, P. Mazumder, and W. Lu, “Nanoscale

memristor device as synapse in neuromorphic systems,” Nano letters, vol. 10, no. 4, pp. 1297–

1301, 2010.

[148] G.-Q. Bi and M.-M. Poo, “Synaptic modification by correlated activity: Hebb’s postulate

revisited,” Annual review of neuroscience, vol. 24, no. 1, pp. 139–166, 2001.

[149] W. Zhu, T.-P. Ma, T. Tamagawa, J. Kim, and Y. Di, “Current transport in metal/hafnium

oxide/silicon structure,” IEEE Electron Device Letters, vol. 23, no. 2, pp. 97–99, 2002.

[150] G. Antonelli, S. Chiaverini, and G. Fusco, “A calibration method for odometry of mobile

robots based on the least-squares technique: Theory and experimental validation,” IEEE

Transactions on Robotics, vol. 21, no. 5, pp. 994–1004, 2005, issn: 1552-3098. doi: 10.1109/TRO.2005.851382.

[151] R. S. Sutton and A. G. Barto, Reinforcement learning: An introduction. MIT Press, Cambridge, 1998, vol. 1.

[152] S. Yu, X. Guan, and H.-S. P. Wong, “On the stochastic nature of resistive switching in metal

oxide RRAM: Physical modeling, Monte Carlo simulation, and experimental characterization,”

in Electron Devices Meeting (IEDM), 2011 IEEE International, IEEE, 2011, pp. 17–3.

[153] D. Querlioz, O. Bichler, P. Dollfus, and C. Gamrat, “Immunity to device variations in a spik-

ing neural network with memristive nanodevices,” IEEE Transactions on Nanotechnology,

vol. 12, no. 3, pp. 288–295, 2013.

[154] M. Sarim, R. Jha, and M. Kumar, “Neuromorphic device specifications for unsupervised

learning in robots,” in Aerospace and Electronics Conference (NAECON), 2017 IEEE Na-

tional, IEEE, 2017, pp. 44–51.

[155] N. L. Johnson, S. Kotz, and N. Balakrishnan, Continuous univariate distributions, 1994.

[156] R. Voorhies, Randvoorhies/simpleslam, 2012. [Online]. Available: https://github.com/randvoorhies/SimpleSLAM.

[157] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit, “FastSLAM: A factored solution to

the simultaneous localization and mapping problem,” in Proceedings of the AAAI National

Conference on Artificial Intelligence, Edmonton, Canada: AAAI, 2002.

[158] D. Rubin, J. Bernardo, M. De Groot, D. Lindley, and A. Smith, Bayesian statistics 3, 1988.

[159] M. Sarim, M. Kumar, R. Jha, and A. A. Minai, “Memristive device based learning for

navigation in robots,” Bioinspiration & biomimetics, vol. 12, no. 6, p. 066011, 2017.

[160] M. Sarim, Khepera localization videos using memristive device-based neuromorphic network,

2018. [Online]. Available: https://www.youtube.com/playlist?list=PLFKpluGvxSV4v-h63XNP2nDTZoB88CdPX.

[161] M. Sarim, T. Schultz, M. Kumar, and R. Jha, “An artificial brain mechanism to develop

a learning paradigm for robot navigation,” in ASME 2015 Dynamic Systems and Control

Conference, American Society of Mechanical Engineers, 2015.

[162] M. Sarim, T. Schultz, R. Jha, and M. Kumar, “Ultra-low energy neuromorphic device based

navigation approach for biomimetic robots,” in 2016 National Aerospace and Electronics

Conference & Ohio Innovation Summit (NAECON-OIS), IEEE, 2016.

[163] M. Sarim, M. Kumar, R. Jha, and A. A. Minai, Memristive device based brain-inspired

localization of robots, 2018, in preparation.
