Progress in Robot Learning – Is Modeling Crucial for Success? Justus Piater, University of Innsbruck
Table of Contents
1. Progress in Robot Learning – Is Modeling Crucial for Success?
2. Robot Learning
3. Skill Learning Using Stacked Classifiers
4. Skill Learning Using Projective Simulation
5. Scaling Up
6. Progress in Robot Learning – Is Modeling Crucial for Success?
1. Progress in Robot Learning – Is Modeling Crucial for Success?
1.1. Progress in Robot Learning – Is Modeling Crucial for Success?
No.
I’m a learning guy.
2. Robot Learning
2.1. Robot Learning
[From the 1986 movie Short Circuit] Here’s a robot learning:
• collecting sensorimotor contingencies
• reading
• exploring properties of household items
2.2. General Knowledge
Structural Bootstrapping
• By analogy with syntactic bootstrapping [Gleitman 1990]
• Learn structure that helps with future learning problems:
  • learn scaffolding to bias or constrain the space of solutions
  • learn concepts upon which to build other concepts
3. Skill Learning Using Stacked Classifiers
Emre Ugur
3.1. Learning About Objects
[Ugur et al. 2014]
3.2. Sensorimotor Exploration: Poking
Simple Concept: How does one object behave under a manipulation?
3.3. Sensorimotor Exploration: Stacking
More Complex Concept: How do two objects interact under a manipulation?
3.4. From Sensorimotor Interaction to Symbolic Planning
3.5. Playing With Building Blocks!
• Symbol formation by sensorimotor interaction (Emre’s prior work is among the pioneers)
• Hierarchical concept learning
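The hierarchy described in this section (a single-object concept learned from poking, reused as a feature for a paired-object concept learned from stacking) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from [Ugur et al. 2014]: the `NearestCentroid` stand-in classifier, the two object features (roundness, flatness), the effect labels, and the toy data.

```python
from collections import defaultdict


class NearestCentroid:
    """Tiny stand-in classifier: predict the label with the closest mean vector."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for features, label in zip(X, y):
            groups[label].append(features)
        # One centroid (per-dimension mean) per label.
        self.centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict(self, features):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(features, centroid))
        return min(self.centroids, key=lambda label: sq_dist(self.centroids[label]))


# Level 1 (hypothetical data): object features (roundness, flatness) -> poke effect.
poke_X = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8)]
poke_y = ["rolls", "rolls", "stays", "stays"]
poke_clf = NearestCentroid().fit(poke_X, poke_y)


def pair_features(bottom, top):
    """Encode a pair by the level-1 prediction for each object (1.0 = 'stays')."""
    return tuple(1.0 if poke_clf.predict(obj) == "stays" else 0.0
                 for obj in (bottom, top))


# Level 2 (hypothetical data): stacking outcome, learned on top of level-1 concepts.
stack_pairs = [((0.1, 0.9), (0.2, 0.8)), ((0.9, 0.1), (0.1, 0.9)),
               ((0.2, 0.8), (0.9, 0.1)), ((0.8, 0.2), (0.9, 0.1))]
stack_y = ["stable", "falls", "stable", "falls"]
stack_clf = NearestCentroid().fit(
    [pair_features(bottom, top) for bottom, top in stack_pairs], stack_y)

# A new pair: flat object below, round object on top.
print(stack_clf.predict(pair_features((0.15, 0.85), (0.85, 0.15))))
```

The second classifier never sees raw object features; it only sees what the first classifier has already learned, which is the sense in which the classifiers are stacked.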
4. Skill Learning Using Projective Simulation
Simon Hangl
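4.1. Projective Simulation
[Figure: example ECM network. Percept clips (Clip 1 ≡ Percept 1, Clip 2 ≡ Percept 2) are linked via intermediate clips (Clips 3–6) to action clips (Clip 7 ≡ Action 1, Clip 8 ≡ Action 2); each edge from clip i to clip j is labeled with a transition probability p_ij.]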
• Episodic Compositional Memory (ECM): Markov network plus machinery for learning transition probabilities from experience
• Clip: elementary piece of experience
• Learning: random walk with ECM parameters updated according to rewards
• Execution: random walk
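The learning and execution loop listed above can be sketched as follows. This is only an illustration under stated assumptions: the class name `ECM`, the edge weights `h` (initialized to 1 and normalized into transition probabilities), the optional `damping` term, and the rule "add the reward to every traversed edge" are choices made for this sketch, not details given on the slides. The clip names loosely follow the figure.

```python
import random


class ECM:
    """Minimal episodic compositional memory: weighted clip-to-clip edges."""

    def __init__(self, edges, damping=0.0):
        # Weights start at 1 so every listed edge is initially equally likely.
        self.h = {src: {dst: 1.0 for dst in dsts} for src, dsts in edges.items()}
        self.damping = damping

    def probabilities(self, clip):
        """Transition probabilities out of a clip: normalized edge weights."""
        total = sum(self.h[clip].values())
        return {dst: h / total for dst, h in self.h[clip].items()}

    def walk(self, percept, rng):
        """Random walk from a percept clip to a clip without successors (an action)."""
        path, clip = [], percept
        while clip in self.h:
            probs = self.probabilities(clip)
            nxt = rng.choices(list(probs), weights=list(probs.values()))[0]
            path.append((clip, nxt))
            clip = nxt
        return clip, path

    def learn(self, path, reward):
        # Let all weights decay towards 1, then reinforce the edges just walked.
        for src in self.h:
            for dst in self.h[src]:
                self.h[src][dst] -= self.damping * (self.h[src][dst] - 1.0)
        for src, dst in path:
            self.h[src][dst] += reward


rng = random.Random(0)
ecm = ECM({"percept1": ["clip3", "clip4"],
           "clip3": ["action1"],
           "clip4": ["action2"]})
for _ in range(200):
    action, path = ecm.walk("percept1", rng)
    # Hypothetical environment: only action1 is rewarded.
    ecm.learn(path, reward=1.0 if action == "action1" else 0.0)
print(ecm.probabilities("percept1"))
```

Learning and execution use the same `walk` method; the only difference during learning is the call to `learn` afterwards, which mirrors the slide's point that execution is just the random walk.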
4.2. Picking and Placing Books: Resulting Behavior
[Hangl et al. 2016]
4.3. Picking and Placing Books: Learned ECM
[Figure: learned ECM. Node labels by row: Slide, Poke, Press; bottom, binding, open, top, 1–4, 1–4; push90, push180, push270, flip, nothing.]
4.4. Enhancement: Environment Model
[Figure: environment-model clips E, indexed by preparatory skill P_i and states S_j (1 to N), together with actions A_1 … A_M.]
4.5. Boredom and Skill Creation
Boredom
• Entropy is low, i.e., the agent knows what to do in the current state.
• Find a higher-entropy state, and determine, using the Environment Model, an action to transition to it.
Creative Skill Creation
• Using the Environment Model, synthesize a new compound preparatory skill clip, and add it to the ECM.
• Akin to compiling cognitively-controlled (cortical) complex skills into automated (cerebellar) routines.
[Flowchart, box labels: Execute complex skill with original approach until reaching a state clip; Compute boredom for that clip; Is bored? (yes/no); Optimize excitement function starting from that clip; Execute optimal state transition; Restart execution and training without boredom; Continue random walk with original approach; Creatively create clip proposal and insert with certain probability.]
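The entropy-based boredom criterion can be sketched in a few lines. This is a minimal illustration, assuming that boredom is decided by thresholding the Shannon entropy of the agent's action distribution in a state; the function names, the threshold of 0.5 bits, the state names, and the probabilities are hypothetical and not taken from [Hangl et al. 2016].

```python
import math


def entropy(probs):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def is_bored(action_probs, threshold=0.5):
    """Bored when the action distribution in a state is nearly deterministic."""
    return entropy(action_probs) < threshold


def most_exciting_state(state_action_probs, reachable):
    """Among reachable states, pick the one with the highest action entropy."""
    return max(reachable, key=lambda s: entropy(state_action_probs[s]))


# Hypothetical action distributions for three states.
policy = {
    "closed": [0.97, 0.01, 0.01, 0.01],   # agent knows what to do here
    "open": [0.25, 0.25, 0.25, 0.25],     # agent has no preference yet
    "upright": [0.7, 0.1, 0.1, 0.1],
}

if is_bored(policy["closed"]):
    # An environment model would supply the reachable states and the action
    # leading to the chosen one; here the reachable set is simply given.
    print(most_exciting_state(policy, ["open", "upright"]))
```

In this sketch the environment model is reduced to a given list of reachable states; choosing the action that actually causes the transition is left out.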
4.6. Learning Complex Manipulation Sequences!
Complex skill learning by sequencing actions
• with unified learning and execution,
• guided by reinforcement signals,
• adaptively creating compound skills.
5. Scaling Up
5.1. Scaling Up
[Images: Robot Companions (Isaac Asimov, footnote 2); Intelligence Explosion (Ray Kurzweil, footnote 3); Humans Obsoleted (The Matrix, footnote 4)]
5.2. Scaling up machine learning does not lead to human-level AI.
All machine learning can do today is solve specific, externally-provided problems, defined in terms of objective functions fixed a priori:
• recognize objects in images
• play Go
• match ads to users
• recognize or translate speech
• optimize motor trajectories
• construct symbolic abstractions from sensorimotor experience
• find robust strategies for complex manipulation tasks
There is definitely no agency.
5.3. How to Scale Up Robot Learning
Problem: Unbounded learning potential entails unbounded perception/action spaces, diminishing the odds of finding useful structure.
Approaches to Solutions:
• Structural Bootstrapping
• Teaching (human capabilities exploded with knowledge preservation and sharing)
• Knowledge Sharing
• Maturation
2 http://www.biography.com/people/isaac-asimov-9190737
3 http://www.mirror.co.uk/news/technology-science/technology/ray-kurzweil-robots-smarter-humans-3178027
4 http://aihndigitalimagingiisummer2010.wikispaces.com/Cyborg+Research
• Controlled interaction and feedback (parenting)
• Knowledge Mining
• Modeling
All of the above can be viewed as different ways of constructing models.
6. Progress in Robot Learning – Is Modeling Crucial for Success?
6.1. Progress in Robot Learning – Is Modeling Crucial for Success?
Yes.
6.2. The IMAGINE Project
Explicitly about structural modeling.
6.3. Conclusions
• All practical Machine Learning systems are designed and/or trained for given tasks; higher levels of autonomy are not currently within reach.
• Learning is inherently hard; it won’t work without guiding structure that (transiently) limits the learnable scope.
• Promising research towards increased autonomy is underway (artificial curiosity; structural bootstrapping).
How to Provide Structure:
• During learning: teaching, structural bootstrapping, knowledge sharing, …
• At the design phase: modeling physics, robot maturation, …
By building models.
[I, Robot; extracted from YouTube, footnote 5]
6.4. References
Bibliography
S. Hangl, E. Ugur, S. Szedmak, J. Piater, “Robotic Playing for Hierarchical Complex Skill Learning”. IEEE/RSJ International Conference on Intelligent Robots and Systems, 2016. http://www.iros2016.org/
E. Ugur, S. Szedmak, J. Piater, “Bootstrapping paired-object affordance learning with learned single-affordance features”. The Fourth Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, pp. 476–481, 2014. https://iis.uibk.ac.at/public/papers/Ugur-2014-ICDLEPIROB-119.pdf
5 http://www.youtube.com/watch?v=1H3Cy09LwQM