AN APPROACH TO SELF-EXTENSION:

REASONING ABOUT DIFFERENT EPISTEMIC ACTIONS AND UNCERTAINTY FOR AUTONOMOUS KNOWLEDGE GATHERING

Marc Hanheide, University of Lincoln (was Birmingham)

DISCLAIMER: NO ICUB!

Alper Aydemir, Andrzej Pronobis, Richard Dearden, Kristoffer Sjöö, Moritz Göbelbecker, Markus Vincze, Charles Gretton, Jeremy Wyatt, Nick Hawes, Michael Zillich, Patric Jensfelt, and many more...

CogX: Cognitive Systems that Self-Understand and Self-Extend

ABILITIES

provide services... in human-shared environments

operate in the real world... even though it’s full of uncertainties

be adaptive... but don’t learn from scratch

ARE YOU HUNGRY?

> 45000 views

Kunze, L. et al., 2012. Searching Objects in Large-scale Indoor Environments: A Decision-theoretic Approach. In IEEE International Conference on Robotics and Automation. pp. 4385-4390.

WHAT IF...

• ... the object is not at (X,Y,Z)?

• ... the robot wouldn’t know where the shop/fridge is?

• ... the map would be unknown (no global coordinates)?

• ... the common-sense knowledge would be wrong?

• ... sensing would be uncertain?

A ROBOT IN MY (OLD) HOME

“Essentially a T-shirt on a stick with a wheeled base and camera for a face, Dora’s draw is her brain.” [New Scientist]

IF... THEN!

• ... the object is not at (X,Y,Z)? → Autonomous Visual Search

• ... the robot wouldn’t know where the shop/kitchen is? → Room Categorisation

• ... the map would be unknown (no global coordinates)? → Autonomous Exploration

• ... sensing would be uncertain? → Probabilistic Planning

• ... the common-sense knowledge would be wrong? → Interactive Hypothesis Verification

Hanheide, M. et al., 2011. Exploiting Probabilistic Knowledge under Uncertain Sensing for Efficient Robot Behaviour. In Proc. Int. Joint Conf. on Artificial Intelligence (IJCAI).

DORA SEARCHING

Autonomous Visual Search

HOW DOES DORA SEE OBJECTS?

• 2D image SIFTs + 3D surface SIFTs correspondence

• RANSAC based pose estimation [Prankl, 2010]

• based on known CAD model and texture

• requires pre-trained models

PLANNING TO SEARCH

• Take into account the characteristics of CV operators

• Tsotsos et al. (2010)

• Limitations:

• range

• pre-mapped

• focus

MORE KNOWLEDGE

Room Categorisation: where is the kitchen?

Common-sense knowledge: where can I find cereals?

Probabilistic Planning: how do I get there?

Hanheide, M. et al. Exploiting Probabilistic Knowledge under Uncertain Sensing for Efficient Robot Behaviour. IJCAI 2011.

WHERE TO SEARCH?

appearance models say: this is not a kitchen!

Pronobis et al.: Multi-modal Semantic Place Classification, International Journal of Robotics Research (IJRR), 2010

DORA

DORA SEARCHES THE KITCHEN

RESULTS

Dora failed to find the object at first, went back to the living room, and then returned to the kitchen.

20 runs BHAM

... sensing would be uncertain?

10 runs KTH

THE MESSY STUDENT APARTMENT

... the common-sense knowledge would be wrong?

DEALING WITH UNCERTAINTY

• sensing algorithms can fail

• not seeing the object

• mis-classifying a room

• common-sense knowledge can be wrong, even though useful in most cases

You know that, do you?

it’s all so uncertain... how to deal with it?

Hanheide et al.: IJCAI, 2011

THE COGX APPROACH

• domain-independent deliberative layer

• plan autonomous behaviour to accomplish various goals in complex, probabilistic state spaces

• domain-dependent conceptual layer

• accommodates probabilistic domain and instance knowledge

• competence layer

• implements sensing and acting

• implemented using CAST (not YARP)

http://www.cs.bham.ac.uk/research/projects/cosy/cast/

[Figure: three-layer architecture. Deliberative Layer: Goal Management, Continual Planner, Decision-theoretic Planner (with switches between the planners), Executor. Conceptual Layer: Chain Graph Inference, Domain Model, Belief Model. Competence Layer: Mapping & Navigation, Object Search, Categorisation, Dialogue Management, Person Detection/Search, Co-occurrence Analysis. Arrow labels: tasks, plan, state, prob. state, actions, translated instances.]

QUANTIFYING RELATIONS

Common-sense knowledge

“HARVESTING” COMMON SENSE

• leverage common-sense knowledge implicitly available in the world wide web

• object and location concepts are “bootstrapped” from the ‘locations’ database provided by the Open Mind Indoor Common Sense (OMICS) database (Honda Research USA)

• co-occurrence frequency yielding probabilities estimated by counting the number of hits in an image search engine

• use Composed Receptive Field Histograms [Linde and Lindeberg] + SVM [Pronobis, ICRA 12]

CHAIN GRAPH REASONING

probabilistic belief state

Lauritzen et al.: Chain graph models and their causal interpretations, J. Roy. Statistical Society, 2002

INFER THE ROOM CATEGORY
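To make the link between harvested co-occurrence statistics and room-category inference concrete, here is a minimal Python sketch. It is a plain naive-Bayes update, not the chain-graph inference Dora uses, and the hit counts, the ratio used to turn hits into probabilities, and the helper names `object_given_room` and `room_posterior` are all invented for illustration.

```python
# Hypothetical hit counts, e.g. from an image search for "<object> <room>".
# These numbers are made up; they are not taken from OMICS or any search engine.
PAIR_HITS = {
    ("cornflakes", "kitchen"): 900, ("cornflakes", "office"): 100,
    ("stapler", "kitchen"): 50, ("stapler", "office"): 450,
}
ROOM_HITS = {"kitchen": 1000, "office": 500}


def object_given_room(obj, room):
    """Estimate P(object present | room category) as a hit-count ratio (assumption)."""
    return PAIR_HITS[(obj, room)] / ROOM_HITS[room]


def room_posterior(prior, seen_objects):
    """Naive-Bayes posterior over room categories given the detected objects."""
    scores = {}
    for room, p in prior.items():
        for obj in seen_objects:
            p *= object_given_room(obj, room)
        scores[room] = p
    total = sum(scores.values())
    return {room: s / total for room, s in scores.items()}


# Seeing cornflakes shifts the belief towards "kitchen".
posterior = room_posterior({"kitchen": 0.5, "office": 0.5}, ["cornflakes"])
print(posterior)
```

The same table can be read the other way round: given a believed room category, `object_given_room` gives a rough prior for where searching for an object is worthwhile.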

Room Categorisation

PLANNING UNDER UNCERTAINTY

Probabilistic Planning

• world state cannot be observed directly (uncertain sensing)

• “naturally” represented as POMDP

• probabilistic state space can be intractably huge (here ∼10^27)

• domain-independent solution: switching planner

• make assumptions in a sequential session to generate most rewarding trace

• contingent session for observation planning

• (dis)confirmation of assumptions by contingent session
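Because the world state is only seen through unreliable sensing, the planner has to work on a belief (a probability distribution over states) rather than on a known state. The following is a minimal sketch of one Bayesian belief update after a visual search, assuming a single object, two candidate places, and made-up detector rates `p_detect` and `p_false_alarm`; it is not Dora's planner or sensor model.

```python
def update_belief(belief, searched_place, seen, p_detect=0.8, p_false_alarm=0.05):
    """Bayes update of P(object is in place) after one look-for-object action.

    p_detect: chance of seeing the object if it is in the searched place (assumed).
    p_false_alarm: chance of a detection if the object is elsewhere (assumed).
    """
    unnormalised = {}
    for place, prior in belief.items():
        if place == searched_place:
            likelihood = p_detect if seen else 1.0 - p_detect
        else:
            likelihood = p_false_alarm if seen else 1.0 - p_false_alarm
        unnormalised[place] = prior * likelihood
    total = sum(unnormalised.values())
    return {place: p / total for place, p in unnormalised.items()}


# A negative observation in the kitchen lowers, but does not eliminate,
# the belief that the cereal is there.
belief = {"kitchen": 0.6, "office": 0.4}
after_miss = update_belief(belief, "kitchen", seen=False)
print(after_miss)
```

A full POMDP adds actions, costs and rewards on top of this update, which is where the state space becomes too large to plan in directly.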

Hanheide et al., IJCAI 2011; Göbelbecker et al., AAAI 2011

SWITCHING PLANNER

Probabilistic Planning

• Continual (classical) planner: good performance, given useful assumptions

• Decision-theoretic POMDP planner: very good at reasoning about how observations provide evidence about the underlying world state

• Switch between sessions to exploit the strengths of both

[Figure labels: Continual Planner, straight-line plan, observation, replanning, Decision-theoretic Planner, policy for achieving observation, Plan Execution]
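The hand-over between the two planners can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the CogX implementation: the `Step` record, the example trace and its numbers are hypothetical, the trace value follows the product-of-probabilities times sum-of-rewards form of Equation 2 in the paper excerpt shown later in the deck, and the 0.95 default is the threshold quoted there.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    success_prob: float  # probability that the assumed outcome occurs
    reward: float        # instantaneous reward (negative values are costs)


def trace_value(trace):
    """Value of a trace: product of outcome probabilities times summed rewards."""
    prob, total = 1.0, 0.0
    for step in trace:
        prob *= step.success_prob
        total += step.reward
    return prob * total


def split_at_switch(trace, threshold=0.95):
    """Execute steps in order until one is too uncertain; the rest go to a DT session."""
    for i, step in enumerate(trace):
        if step.success_prob < threshold:
            return trace[:i], trace[i:]
    return trace, []


# Hypothetical trace with made-up probabilities and rewards.
trace = [
    Step("move robot place3", 1.0, -1.0),
    Step("assume is-in cornflakes place3", 0.6, 0.0),
    Step("look-at-object cornflakes place3", 1.0, -2.0),
    Step("report-position robot human cornflakes", 1.0, 100.0),
]
value = trace_value(trace)
executed, remaining = split_at_switch(trace)
print(value, [s.name for s in executed], remaining[0].name)
```

Here the sequential session runs the certain `move` action, and the uncertain assumption about the cornflakes is what triggers the switch to observation planning.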


Plan Execution

... feature partial observability. In this paper we restrict our attention to DTPDDL models that correspond to deterministic-action goal-oriented POMDPs⁴ where all actions have non-zero cost – i.e., an optimal policy can be formatted as a finite horizon contingent plan. Also, without a loss of generality we assume goals (i.e., conditions for reward) and action preconditions are conjuncts over positive propositions (/facts).

Our continual planning system switches, in the sense that the underlying planning procedure changes depending on our robot’s subjective degrees of belief, and progress in plan execution. The system is continual in the usual sense that, whatever the session, plans are adapted and rebuilt online in reaction to changes to the planning model – e.g. when objectives are modified, or when our robot’s path is obstructed by a door being closed. When the underlying planner is a deterministic sequential planner, i.e., a classical planner, we say planning is in a sequential session, and otherwise it is in a DT session. By autonomously mixing these two types of sessions our robot is able to be robust and responsive to changes in its environment, and make appropriate decisions in the face of uncertainty.

During a sequential session a rewarding trace of a possible execution is computed, in our experiments using a modified version of Temporal Fast Downward (TFD) [Eyerich et al., 2009]. Taking the form of a classical plan, the trace specifies a sequence of actions that achieves the objectives following a deterministic approximation of the problem at hand, i.e., a determinisation [Yoon et al., 2007]. A trace is a sequence of elements that are either: (i) actions from the DTPDDL description of the world, or (ii) atomic assumptions, modelled as deterministic actions, made about the truth value of facts that can only be determined at runtime – e.g., that the cereal is located in the kitchen. During sequential sessions the planner trades action costs, goal rewards, and determinacy, finding a highly valuable plan $\pi = s_0, a_0, s_1, a_1, .., s_N$ according to Equation 2.

$$V(\pi) = \prod_{i=1..N-1} \rho_i \sum_{i=1..N-1} R(s_i, a_i) \qquad (2)$$

Here, $\rho_i$ is the probability that the outcome, state $s_{i+1}$, of the $i$-th sequenced action $a_i$ occurs, and $R(s_j, a_j)$ is the instantaneous reward received for executing action $a_j$ in state $s_j$. The system always begins with a sequential session, and once TFD produces a trace, plan execution proceeds by applying actions in sequence from that trace until $\rho_i < .95$ for the next scheduled action $a_i$. A DT session then begins which tailors sensory processing to determine whether the assumptions made in the trace hold, or which otherwise acts to achieve the overall objectives.

Because DT planning in large problems is too slow for our purposes, DT sessions plan in an abstract decision process determined by the current trace and underlying belief-state. The abstract process posed to the DT planner is constructed by first constraining as statically false all propositions except those which are true with probability 1, or which are assumed true in the current trace. For example, if coffee cups are not necessarily in the kitchen, and the serial plan does not schedule actions whose outcomes are preconditioned on a cup being somewhere in particular (e.g., searching for or retrieving the cup from a view in the kitchen), then on first construction the states of the abstract process do not mention cups. Next, those static constraints are removed, one proposition at a time, until the number of states that can be true with non-zero probability in the initial belief of the abstract process reaches a given threshold (in our real-world experiments, 150 states). In detail, for each statically-false proposition we compute the entropy of the state assumptions of the current trace conditional on that proposition. Let $X$ be a set of propositions and $2^X$ the powerset of $X$, then taking

$$\chi = \Big\{ \bigwedge_{x \in X' \cap X} x \;\wedge \bigwedge_{x \in X \setminus X'} \neg x \;\Big|\; X' \in 2^X \Big\}$$

we have that $\chi$ is a set of conjunctions each of which corresponds to one truth assignment to elements in $X$. Where $p(\phi)$ gives the probability that a conjunction $\phi$ holds in the belief-state of the DTPDDL process, the entropy of $X$ conditional on a proposition $y$, written $H(X \mid y)$, is given by Equation 3.

$$H(X \mid y) = \sum_{x \in \chi,\; y' \in \{y, \neg y\}} p(x \wedge y') \log_2 \frac{p(y')}{p(x \wedge y')} \qquad (3)$$

A low $H(X \mid y)$ value suggests that knowing the truth value of $y$ is useful for determining whether or not some assumptions $X$ hold. When removing a static constraint on propositions during the abstract process construction, $y_i$ is considered before $y_j$ if $H(X \mid y_i) < H(X \mid y_j)$. For example, if the serial plan assumes the robot is in a kitchen, then propositions about the contents of kitchens, e.g. that there is a cup in the kitchen, are added to characterise the abstract process’ states. If sensing scheduled during the DT session fails to find a cup in the room, then the kitchen assumption can be judged during DT deliberations. To the abstract model we add disconfirm and confirm actions that judge each assumption in the trace. These actions yield a small reward if the corresponding judgement is true and small penalty otherwise. Once a judgement action is scheduled for execution the DT session is terminated, and a new sequential session begins.

⁴ In the case of finite-horizon planning, POMDPs with stochastic actions can be compiled into equivalent deterministic-action POMDPs, where all the original action uncertainty is expressed in the starting-state distribution [Ng and Jordan, 2000].

Figure 4: Simplified examples of abstract belief-states from DT sessions.

(A) Partially constrained abstract belief-state

    (:init (= (is-in Robot) kitchen)
      (.6 (and (= (is-in cereal) kitchen)
               (.9 (= (is-in milk) kitchen))
               .1 (= (is-in milk) office))
       .4 (and (= (is-in cereal) office)
               (.1 (= (is-in milk) kitchen))
               .9 (= (is-in milk) office))))

(B) Underlying DTPDDL belief-state

    (:init (= (is-in Robot) office)
      (.6 (and (= (is-in cereal) kitchen)
               (.9 (= (is-in milk) kitchen))
               .1 (= (is-in milk) office))
       .4 (and (= (is-in cereal) office)
               (.1 (= (is-in milk) kitchen))
               .9 (= (is-in milk) office)))
      (.6 (= (is-in cup) office)
       .4 (= (is-in cup) kitchen)))

(C) Fully constrained abstract belief-state

    (:init (= (is-in Robot) kitchen)
      (.6 (= (is-in cereal) kitchen)))

• execute sequential plan until there is a sensing action

• invoke DT planner on abstract problem, by retracting assumptions

• assumptions are retracted in ascending order of the conditional entropy H(X|y), with y being the assumption, and X the set of propositions in the abstract DT process

• retraction of deterministic assumptions stops when N (=150) states are true with non-zero probability

Probabilistic Planning

[Figure: example plan for finding cornflakes, shown as two columns of nodes.
Left column: assume category room0 kitchen; assume exists-in-room cornflakes room0 kitchen; assume is-in cornflakes place2; move robot place2; look-at-object cornflakes place2; move robot place17; report-position robot human cornflakes.
Right column: move robot place3; look-for-object cornflakes place3; move robot place4; look-for-object cornflakes place4; move robot place5; look-for-object cornflakes place5; commit is-in cornflakes place4.
Edge labels: negative observation, positive observation, negative observation.]
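The entropy-based ordering of Equation 3 can be checked on the numbers of belief-state (B) in Figure 4. The sketch below is only an illustration: it assumes the cup's location is independent of cereal and milk (as the figure's nesting suggests), takes X to be the single assumption "cereal is in the kitchen", and uses hypothetical helper names `joint_belief` and `conditional_entropy`.

```python
from math import log2


def joint_belief():
    """Joint distribution over (cereal, milk, cup) locations from belief-state (B)."""
    belief = {}
    for cereal, p_cereal in (("kitchen", 0.6), ("office", 0.4)):
        p_milk_kitchen = 0.9 if cereal == "kitchen" else 0.1
        for milk, p_milk in (("kitchen", p_milk_kitchen), ("office", 1 - p_milk_kitchen)):
            for cup, p_cup in (("office", 0.6), ("kitchen", 0.4)):
                belief[(cereal, milk, cup)] = p_cereal * p_milk * p_cup
    return belief


def conditional_entropy(belief, x_index, y_test):
    """H(X|y) = sum over x, y' of p(x and y') * log2(p(y') / p(x and y'))."""
    joint, marginal = {}, {}
    for state, p in belief.items():
        key = (state[x_index], y_test(state))
        joint[key] = joint.get(key, 0.0) + p
        marginal[key[1]] = marginal.get(key[1], 0.0) + p
    return sum(p * log2(marginal[y] / p) for (_, y), p in joint.items() if p > 0)


belief = joint_belief()
# Candidate propositions y whose static constraint could be removed next.
candidates = {
    "milk in kitchen": lambda s: s[1] == "kitchen",
    "cup in kitchen": lambda s: s[2] == "kitchen",
}
entropies = {name: conditional_entropy(belief, 0, test) for name, test in candidates.items()}
order = sorted(entropies, key=entropies.get)  # ascending H(X|y)
print(entropies, order)
```

Under these assumptions the milk proposition comes first: knowing where the milk is says a lot about where the cereal is, whereas the independent cup leaves the entropy of the cereal assumption unchanged.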

IF... THEN!

• ✓ ... the object is not at (X,Y,Z)? → Autonomous Visual Search

• ✓ ... the robot wouldn’t know where the shop/kitchen is? → Room Categorisation

• ... the map would be unknown (no global coordinates)? → Autonomous Exploration

• ✓ ... sensing would be uncertain? → Probabilistic Planning

• (✓) ... the common-sense knowledge would be wrong? → Interactive Hypothesis Verification

acquire a large map “by hand” prior to any useful services by the robot?

better: reason in an “open” world and allow the robot to self-extend

EXPLORATION

• active visual search in unknown larger scale space

• exploit existing knowledge or explore new hypotheses

• Placeholder (explicit hypothesis): generated from an analysis of the local metric map (frontiers)

• rooms yield probabilities to find objects

• informed, task-driven exploration or curiosity-driven exploration

This could also be a placeholder!

    (exists (?o - visualobject)
      (and (= (label ?o) magazine)
           (kval [ROBOT] (related-to ?o))))

[Figure: map showing explored places with their room category, placeholders, place hypotheses with a probability for the category each placeholder could lead to, and obstacles detected using Kinect (video, sped up)]

TASK-DRIVEN OPEN-WORLD SEARCH

(video, sped up)

IF... THEN!

• ✓ ... the object is not at (X,Y,Z)? → Autonomous Visual Search

• ✓ ... the robot wouldn’t know where the shop/kitchen is? → Room Categorisation

• ✓ ... the map would be unknown (no global coordinates)? → Autonomous Exploration

• ✓ ... sensing would be uncertain? → Probabilistic Planning

• (✓) ... the common-sense knowledge would be wrong? → Interactive Hypothesis Verification

“Why did I not find the cornflakes?...

• Could it be that the cornflakes are in a cupboard?

• Could it be that this room isn’t a kitchen?

• Could it be that cornflakes are usually in offices?

• Could it be that the cornflakes are not in this room?”

EXPLAINING SURPRISES

Oh, plan execution failed unexpectedly.

CREATING HYPOTHESES

New assumptions: perhaps the magazine is in a container which is in Room 2, which is most likely a meeting room.

VERIFYING HYPOTHESES

Are containers typically in meeting rooms? Yes

Is there a container in this room? Yes

[Figure labels: plan, verification, verification, attribution]

“IS IT AN OFFICE?”

[Figure labels: kitchen; plan steps: init, find person, engage, ask-polar-cat, goal]

IF... THEN!

• ✓ ... the object is not at (X,Y,Z)? → Autonomous Visual Search

• ✓ ... the robot wouldn’t know where the shop/kitchen is? → Room Categorisation

• ... the map would be unknown (no global coordinates)? → Autonomous Exploration

• ✓ ... sensing would be uncertain? → Probabilistic Planning

• ✓ ... the common-sense knowledge would be wrong? → Interactive Hypothesis Verification

BUT... I can’t get you a sandwich...

THE END

• Dora: A system (not only) for object search

• exploiting common-sense knowledge for efficiency

• deals with uncertainty for robustness

• explores its environment and learns through interaction to extend its knowledge

provide services... in human-shared environments

operate in the real world... even though it’s full of uncertainties

be adaptive... but don’t learn from scratch

exploit common-sense!

deal with uncertainties!

Grazie!

http://cogx.eu