Progress in Robot Learning – Is Modeling Crucial for Success? Justus Piater, University of Innsbruck

Table of Contents

1. Progress in Robot Modeling – Why Modeling is Crucial for Success
2. Robot Learning
3. Skill Learning Using Stacked Classifiers
4. Skill Learning Using Projective Simulation
5. Scaling Up
6. Progress in Robot Learning – Is Modeling Crucial for Success?

1. Progress in Robot Modeling – Why Modeling is Crucial for Success

1.1. Progress in Robot Learning – Is Modeling Crucial for Success?

No.

I’m a learning guy.

2. Robot Learning

2.1. Robot Learning

[From the 1986 movie Short Circuit] Here’s a robot learning:

• collecting sensorimotor contingencies

• reading

• exploring properties of household items

2.2. General Knowledge

Structural Bootstrapping

• By analogy with syntactic bootstrapping [Gleitman 1990]

• Learn structure that helps future learning problems:

• learn scaffolding to bias or constrain the space of solutions

• learn concepts upon which other concepts can be built

3. Skill Learning Using Stacked Classifiers

Emre Ugur

3.1. Learning About Objects

[Ugur et al. 2014]

3.2. Sensorimotor Exploration: Poking

Simple Concept: How does one object behave under a manipulation?

3.3. Sensorimotor Exploration: Stacking

More Complex Concept: How do two objects interact under a manipulation?

3.4. From Sensorimotor Interaction to Symbolic Planning

3.5. Playing With Building Blocks!

• Symbol formation by sensorimotor interaction (Emre’s prior work is among the pioneers)

• Hierarchical concept learning
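The stacking idea in this section can be illustrated with a minimal sketch: a first classifier learns how single objects react to poking, and its predictions become the input features of a second classifier that predicts how two objects interact when stacked. The object names, effect labels, training data, and the `MajorityTable` helper are all hypothetical, chosen only for illustration; they are not taken from [Ugur et al. 2014].

```python
from collections import Counter, defaultdict


class MajorityTable:
    """Toy classifier: returns the most frequent label seen for a feature tuple."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def fit(self, X, y):
        for features, label in zip(X, y):
            self.counts[tuple(features)][label] += 1
        return self

    def predict(self, features):
        seen = self.counts.get(tuple(features))
        return seen.most_common(1)[0][0] if seen else None


# Stage 1 (hypothetical data): object shape -> effect of poking one object.
poke_X = [("ball",), ("ball",), ("box",), ("box",), ("cup",)]
poke_y = ["rolls", "rolls", "stays", "stays", "stays"]
poke_model = MajorityTable().fit(poke_X, poke_y)


def pair_features(bottom, top):
    """Describe an object pair by the *predicted* poke effects of each object."""
    return (poke_model.predict((bottom,)), poke_model.predict((top,)))


# Stage 2 (hypothetical data): stage-1 predictions -> outcome of stacking.
stack_pairs = [("box", "box"), ("box", "ball"), ("ball", "box")]
stack_y = ["stacks", "rolls off", "tumbles"]
stack_model = MajorityTable().fit(
    [pair_features(b, t) for b, t in stack_pairs], stack_y
)


def predict_stack(bottom, top):
    return stack_model.predict(pair_features(bottom, top))


# The cup never appeared in the stacking data, but its learned poke
# behavior ("stays") lets the second stage generalize to it.
print(predict_stack("cup", "ball"))
```

In this toy setting the second stage never sees raw object identities, only the concepts learned in the first stage, which is the sense in which simpler concepts scaffold more complex ones.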

4. Skill Learning Using Projective Simulation

Simon Hangl

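4.1. Projective Simulation

[Figure: example clip network. Clip 1 ≡ Percept 1 and Clip 2 ≡ Percept 2 are connected through Clips 3–6 to Clip 7 ≡ Action 1 and Clip 8 ≡ Action 2. Edges are labeled with transition probabilities p13, p14, p15, p24, p25, p37, p56, p58, p63, p64, p67, p68.]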

• Episodic Compositional Memory (ECM): Markov network plus machinery for learning transition probabilities from experience

• Clip: elementary piece of experience

• Learning: random walk with ECM parameters updated according to rewards

• Execution: random walk
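The learning and execution loop described above can be sketched as follows. This is a minimal illustration, assuming one common formulation of projective simulation in which each edge carries an unnormalized weight h, transition probabilities are normalized h-values, and after each episode all weights are damped toward 1 while the traversed edges are reinforced by the reward. The clip names, the two-percept task, the damping value, and the class name `ECM` are assumptions made for this example, not details from the slides.

```python
import random


class ECM:
    """Toy episodic compositional memory: weighted directed clip network."""

    def __init__(self, edges, damping=0.0):
        # h-values start at 1, so the initial random walk is uniform.
        self.h = {clip: {nxt: 1.0 for nxt in nexts} for clip, nexts in edges.items()}
        self.damping = damping  # assumed forgetting parameter

    def probabilities(self, clip):
        total = sum(self.h[clip].values())
        return {nxt: w / total for nxt, w in self.h[clip].items()}

    def walk(self, percept, rng):
        """Random walk from a percept clip until a clip without outgoing edges."""
        path, clip = [], percept
        while clip in self.h:  # action clips have no outgoing edges
            probs = self.probabilities(clip)
            nxt = rng.choices(list(probs), weights=list(probs.values()))[0]
            path.append((clip, nxt))
            clip = nxt
        return clip, path

    def learn(self, path, reward):
        # Damp every weight toward 1, then reinforce the traversed edges.
        for nexts in self.h.values():
            for nxt in nexts:
                nexts[nxt] -= self.damping * (nexts[nxt] - 1.0)
        for clip, nxt in path:
            self.h[clip][nxt] += reward


# Hypothetical task: percept1 should trigger action1, percept2 action2.
edges = {"percept1": ["action1", "action2"], "percept2": ["action1", "action2"]}
correct = {"percept1": "action1", "percept2": "action2"}

rng = random.Random(0)
ecm = ECM(edges, damping=0.01)
for episode in range(200):
    percept = ("percept1", "percept2")[episode % 2]
    action, path = ecm.walk(percept, rng)
    ecm.learn(path, reward=1.0 if action == correct[percept] else 0.0)

print(ecm.probabilities("percept1"))
print(ecm.probabilities("percept2"))
```

Learning and execution use the same `walk` method, so the agent acts while it learns; only the reward-driven weight update distinguishes a training episode from plain execution.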

4.2. Picking and Placing Books: Resulting Behavior

[Hangl et al. 2016]

4.3. Picking and Placing Books: Learned ECM

[Figure: learned ECM. Node labels by row: Slide, Poke, Press; bottom, binding, open, top, 1–4, 1–4; push90, push180, push270, flip, nothing.]

4.4. Enhancement: Environment Model

[Figure: ECM extended with an environment model. State clips E^{P_i} (indexed 1 S_j, 2 S_j, …, N S_j) are paired with actions A_1, A_2, …, A_M.]

4.5. Boredom and Skill Creation

Boredom

• Entropy is low, i.e., the agent knows what to do in the state it has reached.

• Find a higher-entropy state, and determine, using the Environment Model, an action to transition to it.

Creative Skill Creation

• Using the Environment Model, synthesize a new compound preparatory skill clip, and add it to the ECM.

• Akin to compiling cognitively-controlled (cortical) complex skills into automated (cerebellar) routines.

[Flowchart nodes: Execute complex skill P_i with the original approach until reaching a state E^{P_i}; compute boredom for E^{P_i}; Is bored? (yes/no); creatively create a clip proposal and insert it with a certain probability; optimize the excitement function starting from E^{P_i}; execute the optimal state transition; continue the random walk with the original approach; restart execution and training without boredom.]
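A minimal sketch of the boredom criterion, assuming that "entropy" refers to the Shannon entropy of the agent's transition probabilities in a given state. The normalization to the range [0, 1], the state names, and the probability values are assumptions made for this illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def boredom(probs):
    """1 minus normalized entropy: 1.0 = fully decided, 0.0 = uniform."""
    if len(probs) < 2:
        return 1.0
    return 1.0 - entropy(probs) / math.log2(len(probs))


def most_interesting_state(transition_probs_by_state):
    """Pick the state whose transition distribution has the highest entropy."""
    return max(
        transition_probs_by_state,
        key=lambda state: entropy(transition_probs_by_state[state]),
    )


# Hypothetical transition probabilities over four actions in two states.
states = {
    "book_flat": [0.97, 0.01, 0.01, 0.01],     # agent knows what to do
    "book_upright": [0.25, 0.25, 0.25, 0.25],  # agent has learned nothing yet
}

for name, probs in states.items():
    print(name, round(boredom(probs), 3))
print(most_interesting_state(states))
```

A bored agent would then use its environment model to find an action that leads to the state returned by `most_interesting_state`; that planning step is omitted here.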

4.6. Learning Complex Manipulation Sequences!

Complex skill learning by sequencing actions

• with unified learning and execution,

• guided by reinforcement signals,

• adaptively creating compound skills.

5. Scaling Up

5.1. Scaling Up

• Robot Companions [Isaac Asimov, note 2]

• Intelligence Explosion [Ray Kurzweil, note 3]

• Humans Obsoleted [The Matrix, note 4]

5.2. Scaling up machine learning does not lead to human-level AI.

All machine learning can do today is solve specific, externally-provided problems, defined in terms of objective functions fixed a priori:

• recognize objects in images

• play Go

• match ads to users

• recognize or translate speech

• optimize motor trajectories

• construct symbolic abstractions from sensorimotor experience

• find robust strategies for complex manipulation tasks

There is definitely no agency.

5.3. How to Scale Up Robot Learning

Problem: Unbounded learning potential entails unbounded perception/action spaces, diminishing the odds of finding useful structure.

Approaches to Solutions:

• Structural Bootstrapping

• Teaching (human capabilities exploded with knowledge preservation and sharing)

• Knowledge Sharing

• Maturation

2 http://www.biography.com/people/isaac-asimov-9190737

3 http://www.mirror.co.uk/news/technology-science/technology/ray-kurzweil-robots-smarter-humans-3178027

4 http://aihndigitalimagingiisummer2010.wikispaces.com/Cyborg+Research

• Controlled interaction and feedback (parenting)

• Knowledge Mining

• Modeling

All of the above can be viewed as different ways of constructing models.

6. Progress in Robot Learning – Is Modeling Crucial for Success?

6.1. Progress in Robot Learning – Is Modeling Crucial for Success?

Yes.

6.2. The IMAGINE Project

Explicitly about structural modeling.

6.3. Conclusions

• All practical Machine Learning systems are designed and/or trained for given tasks; higher levels of autonomy are not currently within reach.

• Learning is inherently hard; it won’t work without guiding structure that (transiently) limits the learnable scope.

• Promising research towards increased autonomy is underway (artificial curiosity; structural bootstrapping).

How to Provide Structure:

• During learning: teaching, structural bootstrapping, knowledge sharing, …

• At the design phase: modeling physics, robot maturation, …

By building models.

[I, Robot; extracted from YouTube5]

6.4. References

Bibliography

S. Hangl, E. Ugur, S. Szedmak, J. Piater, “Robotic Playing for Hierarchical Complex Skill Learning”. IEEE/RSJ International Conference on Intelligent Robots and Systems, 2016. http://www.iros2016.org/

E. Ugur, S. Szedmak, J. Piater, “Bootstrapping paired-object affordance learning with learned single-affordance features”. The Fourth Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, pp. 476–481, 2014. https://iis.uibk.ac.at/public/papers/Ugur-2014-ICDLEPIROB-119.pdf

5 http://www.youtube.com/watch?v=1H3Cy09LwQM