Neurorobotics Lecture
Neurorobotics: An introduction
Marc-Oliver Gewaltig

In this lecture you’ll learn
1. What is neurorobotics
2. Examples of simple neurorobots
   1. attraction and avoidance
   2. reflexes vs. learned behavior
3. The sensory-motor loop
4. Learning in neurorobotics
   1. unsupervised learning for sensory representations
   2. reinforcement learning for action learning

What is Neurorobotics
Neurorobotics is the combined study of neuroscience, robotics, and artificial intelligence. It is the science and technology of embodied autonomous neural systems.
https://en.wikipedia.org/wiki/Neurorobotics

Neurorobotics: Embodied in silico neuroscience
[Figure labels: Spinal cord; Reflexes; CDPs; Reconstructed spinal cord/brain models; Embodiment and virtual environments; Musculo-skeletal system: compliant actuators and mechanics]

Starting simple: Valentino Braitenberg’s Vehicles
Valentino Braitenberg (1926-2011)
Photo: Alfred Wegener, commons.wikimedia.org
Braitenberg, V. (1984). Vehicles: Experiments in synthetic psychology. Cambridge, MA: MIT Press.
[Figures: Vehicle 1, Vehicle 2a, Vehicle 2b, Vehicle 3]

Exercise
How will vehicle 3 move?

Generalizing the Braitenberg vehicle
Exercise
Using weights in {-1,+1}, which weight configurations implement the vehicles 2a, 2b, and 2c?
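One way to experiment with the generalized vehicle is to treat the sensor-to-motor wiring as a small weight matrix. The following is a minimal sketch under stated assumptions, not the lecture's own model: the names `motor_speeds`, `turn_direction`, `UNCROSSED`, and `CROSSED` are hypothetical, the vehicle is assumed to have two light sensors and two wheels with differential steering, and a weight of 0 (no connection) is allowed for readability. Because of that last assumption it does not answer the exercise, which restricts weights to {-1,+1}.

```python
import numpy as np

def motor_speeds(weights, sensors, base_speed=0.0):
    """Map sensor activations to wheel speeds.

    weights[m, s] is the connection from sensor s to motor m
    (index 0 = left, index 1 = right). All names are illustrative.
    """
    weights = np.asarray(weights, dtype=float)
    sensors = np.asarray(sensors, dtype=float)
    return base_speed + weights @ sensors

def turn_direction(speeds):
    """Differential drive: the vehicle turns away from its faster wheel."""
    left, right = speeds
    if np.isclose(left, right):
        return "straight"
    return "right" if left > right else "left"

# Assumed wiring: each sensor excites the motor on its own side ...
UNCROSSED = [[1, 0],
             [0, 1]]
# ... or each sensor excites the motor on the opposite side.
CROSSED = [[0, 1],
           [1, 0]]

# Light source on the left: the left sensor is more strongly activated.
sensors = [0.9, 0.2]

uncrossed_turn = turn_direction(motor_speeds(UNCROSSED, sensors))
crossed_turn = turn_direction(motor_speeds(CROSSED, sensors))

print("uncrossed wiring turns", uncrossed_turn)  # away from the light
print("crossed wiring turns", crossed_turn)      # toward the light
```

With excitatory connections, uncrossed wiring speeds up the wheel nearer the light and steers the vehicle away (avoidance), while crossed wiring steers it toward the light (attraction). Changing the signs of the entries is a quick way to explore the other configurations.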
[Plot axes: speed, light]

Biological and Non-biological bodies
Sensors (cameras, microphones, etc.) → encode → Artificial brain with neurons → decode → Servo motors with wheels

[Diagram: the sensory-motor loop]
Perception: Vision, Hearing, Smell, Touch, Temperature, Vestibular, Proprioception
Action: Behaviors, Central pattern generators, Reflexes, Muscle contraction
Between perception and action: Sensor fusion, Short-term memory, Long-term memory, Working memory, Drives & Motivation, Cognitive control, Reward & punishment, Action selection
Annotation on the perception side: Learning of sensory representations
[The same diagram with a second annotation, on the action side: Learning of skills and behaviours]

Example: Somato-sensory maps in the cortex
1. Touch-sensitive regions on the mouse body
2. Somato-sensory map on the mouse brain
3. Schema of how these regions are mapped to the somato-sensory cortex
[Figure labels: Hind limbs, Trunk, Forelimbs, Whiskers, Mouth, Nose]
Zembrzycki et al. 2013: Sensory cortex limits cortical maps and drives top-down plasticity in thalamocortical circuits
• Trunk: largest area of the body, smallest part of the cortical map
• Nose: small area of the body, largest area of the map
The size of a somato-sensory representation in the brain corresponds to the frequency of its stimulation.

Example of behavior learning: the Morris Water Maze
Rats learn to find a hidden platform; they don’t like cold water.
[Plot: time to find platform, 10 trials]
Foster, Morris, Dayan 2000

Different types of learning:
1. Supervised learning
   • learning from labelled examples
2. Unsupervised learning
   • learning from unlabelled examples
3. Reinforcement learning
   • learning actions from rewards

Supervised learning
[Figure: examples described by features 1,...,n, each carrying the label "tree"]

Unsupervised learning
Find structure in the data
• Data: D = {x1, x2, x3, ...}
• discover different classes of stimuli (e.g. trees and non-trees)
• find ‘useful’ feature basis
[Scatter plot: feature 1 vs. feature 2, with clusters "Trees" and "Something else"]

Reinforcement-Learning: Some examples
• An agent is in a state S
• It can take one of several actions
• Each action leads to a new state S’.
• In some states, the agent is rewarded or punished.
• Goal: Maximize reward

S = State, S’ = new state, A = Action, R = Reward

Agent environment interaction
[Diagram: Agent and Environment]

Learning through interaction with the environment
[Diagram: the Agent, given state St and reward Rt, sends action At to the Environment; the Environment returns the new state St+1 and reward Rt+1]
[Diagram labels: Sensory stimulation; Stimulus, response + reward; Learning; Synaptic plasticity; Perceptual/Behavioural changes]

Learning and synaptic plasticity
Synaptic plasticity: change in connection strengths

Behavioural learning and synaptic plasticity
Synaptic plasticity: change in connection strengths
[Figure labels: Axon terminal, Neurotransmitter, Vesicle, Synaptic cleft, Receptor, Dendrite]

Learning through synaptic plasticity
[Figure: an action potential (spike) in the presynaptic neuron j and the amplitude of the response in the postsynaptic neuron i over time, shown before learning and after learning; synapse labels as in the previous figure]

Hebb’s postulate
“When an axon of cell j repeatedly or persistently takes part in firing cell i, then j’s efficiency as one of the cells firing i is increased”
Donald O. Hebb, The Organization of Behavior (1949)

Hebbian postulate explained: cell assemblies
• Items are encoded by groups of cells, so-called assemblies
• If an item (Item A) is sensed or recalled, the neurons in the assembly are activated
• As a result, the strength of the connections in the assembly increases
• [Figure: Item B activates a second assembly]
• Partial activation of A... triggers activation of the remaining neurons → pattern completion

Hebbian learning in experiments (schematic)
1. pre j, post i, weight wij: EPSP u, no spike of i
2. Both neurons simultaneously active
3. pre j, post i: EPSP, no spike of i. Increased amplitude ⇒ Δwij > 0

Hebbian plasticity
[Figure: pre j, post i, weight wij, neuron k, time axis]
• learns correlations (simultaneous activity)
• acts locally on the neurons activated

Summary: Hebbian plasticity in experiments
• Synaptic changes are induced by co-activation of pre- and post-synaptic neurons
• Changes persist for a long time
• Changes can lead to an increase or decrease of the post-synaptic potential

Functionality
• useful for learning a new behaviour
• useful for development (e.g., wiring for receptive field development)
• useful for activity control in network (homeostasis)
• useful for coding

Hebbian learning is unsupervised learning
[Figure: pre j, post i; the rule is local]

Reinforcement
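The Hebbian rule and the cell-assembly picture from the preceding slides can be tried out in a few lines. This is a minimal sketch under stated assumptions rather than the lecture's model: neurons are binary, the update is the plain correlation rule Δwij = η · xi · xj, and the function names, learning rate, threshold, and network size are all hypothetical choices. It only models weight increases, although the slides note that synaptic changes can also decrease the post-synaptic potential.

```python
import numpy as np

def hebbian_update(weights, activity, learning_rate=0.1):
    """One Hebbian step: strengthen w[i, j] when post i and pre j are co-active."""
    activity = np.asarray(activity, dtype=float)
    delta = learning_rate * np.outer(activity, activity)
    np.fill_diagonal(delta, 0.0)  # assumption: no self-connections
    return weights + delta

def recall(weights, cue, threshold=0.5):
    """One recurrent step: a neuron is active if it was cued or if its
    summed input from the cued neurons exceeds the (assumed) threshold."""
    cue = np.asarray(cue, dtype=float)
    drive = weights @ cue
    return np.maximum(cue, (drive > threshold).astype(float))

n_neurons = 8
item_a = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=float)  # assembly for item A
item_b = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)  # assembly for item B

# Repeated presentation of each item strengthens connections inside its assembly.
weights = np.zeros((n_neurons, n_neurons))
for _ in range(10):
    weights = hebbian_update(weights, item_a)
    weights = hebbian_update(weights, item_b)

# Partial activation of A: only two of its four neurons are cued.
partial_a = np.array([1, 1, 0, 0, 0, 0, 0, 0], dtype=float)
completed = recall(weights, partial_a)

print("cue:      ", partial_a)
print("completed:", completed)
```

The update uses only the activity of the two neurons a synapse connects, which is the sense in which the rule is local and needs no labels or reward signal. Because the two assemblies never fire together, no connections form between them, so the partial cue for A recruits the remaining A neurons and leaves B silent.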