Quick viewing(Text Mode)

Grounding the Unobservable in the Observable: the Role and Representation of Hidden State in Concept Formation and Refinement Claytont

Grounding the Unobservable in the Observable: the Role and Representation of Hidden State in Concept Formation and Refinement Claytont

From: AAAI Technical Report SS-01-05. Compilation copyright © 2001, AAAI (www.aaai.org). All rights reserved.

Grounding the Unobservable in the Observable: The Role and Representation of Hidden State in Formation and Refinement ClaytonT. Morrison Tim Oates Gary King Experimental KnowledgeSystems Lab Artificial Lab Experimental KnowledgeSystems Lab Department of Computer Science Massachusetts Institute of Technology Department of Computer Science University of Massachusetts, Amherst 545 Technology Square University of Massachusetts, Amherst Amherst, MA01003, USA Cambridge, MA02139 Amherst, MA01003, USA clayton @cs.umass.edu oates @ai.mit.edu [email protected] Introduction that explain as manyobservable phenomenaas compactly One of the great mysteries of humancognition is how we and accurately as possible. Predictiveness is central to this learn to discover meaningfuland useful categories and con- enterprise. If theory A makeseither more accurate predic- cepts about the world based on the data flowing from our tions or a larger of verifiable predictions than theory B, then theory A is preferred. To explain observed phenomena, sensors. Whydo very young children acquire like support and animate (Leslie 1988) rather than between three scientists often posit the of unobservableentities and six feet wide or blue with red and green dots? Onean- (Harr6 1970; 1986). No one has ever seen gravity or black swer to this question is that categories are created, refined holes, but they explain such a wide range of observable phe- and maintainedto support accurate prediction. Knowingthat nomenaso accurately that their existence goes virtually un- an entity is animate is generally muchmore useful for the challenged. Scientific wouldcome to a standstill if purpose of predicting howit will behave than knowingthat not for the ability to posit and collect evidencefor the ex- it is blue with red and green dots. istence of causally efficacious entities that do not manifest themselves directly in our percepts in the same way that, The of using predictability, or a lack thereof, as the say, the color blue does. driving force behind the creation and refinement of knowl- To bring the discussion back to concept acquisition in edge structures has been applied in a variety of contexts. children, consider the following example. Humansclearly Drescher (1991) and Shen (1993) used uncertainty in cannot perceive the mass of an in the sameway that tion outcomesto trigger refinement of action models, and they can perceive its color. Thoughit mayseem as if we per- McCallum(1995) and Whitehead and Ballard (1991) ceive mass directly fromvisual observationthis is an illusion uncertainty in predicted reward in a reinforcement learning groundedin a vast corpus of knowledgeabout objects gath- setting to refine action policies. ered over a lifetime of physical interaction with the world. Virtually all of the workin this vein is based on two key Without this grounding we would be, like an infant, unable assumptions. First, an assumptionis madethat the world is to makejudgements about the masses of objects based solely in priciple deterministic; that given enoughknowledge, out- on their visual appearance. Indeed, it is equivocal whether comescan be predicted with certainty. Giventhis, an agent’s we wouldeven have a concept of mass at all. failure to predict implies that it is either missinginformation or incorrectly representing the informationthat it has. Sec- Representingthe Unobservable ond, it is assumedthat knowledgestructures sufficient for the task can be created by combiningraw perceptual infor- Howmight a child that cannot perceive mass directly ever mation in various ways. That is, everything the agent needs hope to gain a concept of mass?Our answeris that the child to makeaccurate predictions is available in its percepts, and posits the existence of a that explains howobjects the problemfacing the agent is to find the right combina- behaveand that they later learn that this property is named tion of elements of its perceptual data for this task. (See "mass". Objects with different massesyield different propri- (Drescher 1991) for an early and notable exception.) oceptive sensations whenyou lift them and different haptic Our position is that the first of these assumptionsrepre- sensations whenyou drop them on your foot, they require sents an exceedingly useful mechanismfor driving unsuper- different amounts of force to get them movingat a given vised concept acquisition, whereas blind adherence to the speed and they decelerate at different rates whenthat force second makesit difficult or impossible to discover someof is removed.Positing a hiddenfeature of objects that explains the most fundamental concepts. To better understand this and is correlated with any of these suffices to position, consider the child-as-scientist metaphor(Gopnik predict (to somedegree of accuracy) all of the others. That 1997). Generally speaking, scientists aim towards an under- is, this hidden quantity makesa wider array of moreaccu- standing of the way the world works by developing theories rate predictions than any directly observablefeature such as color or size. CopyrightO 2001, AmericanAssociation for Artificial Intelli- So far we have mentionedone of the fundamentaltriggers gence(www.aaai.org). All rights reserved. for positing new unobservable representational elements:

45 failure to predict. But an account of howthe neonate human carrying out an experiment). Irrespective of where along or machine might support the acquisition of knowledgeof the continuum from atomic/concrete to compound/abstract these hiddenstates requires a representational infrastructure these actions are specified, they all share in commonthe key that can property of affecting, in someway, what future input the ¯ accommodaterecognition of failure to predict, and agent will receive from the world. In this view, there is no such thing as a system that is purely disembodiedor pas- ¯ accommodaterepresentations whose content is funda- sive with respect to the environmentit is learning about: the mentally reviseable and extendable. world affects the agent and the agent affects the world (see The first property highlights that these representations must (Bickhard 1993)). Furthermore, note that even "passive" provide someway for the agent to recognize that something tions that don’t directly causally impinge on states of the has gonewrong: its current representational repertoire fails world, such as passively observing a scene, still involve an to be useful in predicting someaspect of the environment. "action" that affects the subsequentinput - e.g., passive ob- Thesecond property is a little morecomplicated. First, it servation specifically involves sitting still and not altering highlights that, in its intial, naked form, the proposednew states of the environmentwhile there is a passageof . representational element serves simply as a "place-marker" associated with the context of the failure to anticipate some Constructing a NewPredicate aspect of the world. At this point, the agent does not even Withinthis representational framework,it is nowpossible to knowwhether the new element represents a hidden state, create (discover) newpredicates that maycorrespond to un- someunobservable property, or someunseen process. Next, observable states or properties of the world. Thesenew pred- given the newelement, the agent must nowsearch for condi- icates are then useful for deterministically predicting previ- tions under whichit can reliably predict its state . Once ously unpredictable behavior of the world. Following the these conditions are discovered, the state value of this repre- base representational form of the , and using a "just sentational element can then be used to makepredictions of so" story involving the dicovery of a mass-like concept, such the the previously unpredictable aspects of the world. These creation and discovery works as follows: conditions are themselvesstates of the worldthat are directly Initially, the agent does not knowhow to predict the out- observable or can be accurately anticipated by the agent. comesof actions or interactions involving objects and such They are associated with the newly constructed represen- diverse outcomesas proprioreceptive feedback whenlifting tational element, serving as conditions for determiningits an object, haptic feedback whenthe object is dropped on state value. In this way, the content (the )of the one’s foot, the force required to achieve somevelocity in proposed new representation of an unobservable aspect of movingthe object, and the rate of deceleration of the object the world is fundamentallyrevisable and extendable. whenone stops pushing. Instead, the different outcomesof Interestingly, a commontheme is shared amongseveral these events are simply experienced as they happen. Fig- computationalmodels of representation learning along these ure 2 depicts a pair of schemasrepresenting two different lines (e.g., (Drescher 1991; Kwok1995)). Namely,the experiencesof lifting objects that are otherwiseperceptually data structure for a representational unit in these discov- identical (also in conditionsthat are essentially identical). ery systemsis a triple involving preconditions, someaction, and postconditions; following Drescher’s (1991) terminol- ogy, we will refer to this data structure as a schema(see Figure 1). Lift Object

I Action Lift ,.@Object Figure 1: Base structure of representation - preconditions, action, postconditions. Figure 2: Initial set of schemaswith outcomesthat are not predictable based on the observable preconditions. The preconditions and postconditions initially consist of sets of predicates whosevalues represent directly observable As is clear from the schemasin Figure 2, the agent does states of the world (e.g., colors, shapes, relative positions not have enoughinformation to properly anticipate what will of visual objects in the retinal field), and later mayinclude happen when lifting the small object in the two different newlylearned states (e.g., states that are not directly observ- cases. Sometimessmall objects require a high amount of able, including newrepresentational elements whosevalue- musclestrain in order to be lifted, other they do not. determining conditions have already been learned). The This inability to anticipate what will happenserves as the actions can be simple physical-motor actions (e.g., open- trigger for the agent to propose a new predicate, P1, whose ing one’s mouth), or more abstracted activities that them- values will help distinguish betweenthe two situations (sim- selves consist of a structured set of composedactions (e.g., ilar to Drescher’s(1991) synthetic item). /:’1 currently

46 no content (meaning)other than it will be used to determine the outcomeof lifting small objects in the described context (Figure 3).

PI=? J~Muscle -’~Muscle Strain Strain = highlow

Figure 3: Newpredicate P1 - has the function of determin- ing outcomeof lifting small objects (context), but currently has no associated conditions for determiningits value.

Giventhis "empty"predicate, the agent nowtries to find the conditions which wouldserve to consistently determine PushlObjec~~ what the non-observable present value of the predicate is (which, in turn, will determine what the outcomeof lift- ing small objects will be). (Note that the "empty" predi- Figure 5: Schemasconditioning the state-value of the pred- cate currently has the function of driving the agent to search icate/91. for the conditions that determinethe predicate’s state-value; such search could consist of a random walk through the action/state space, or be a result of other behavior of the teraction does hold), we would like the agent to eventually agent dedicated to other tasks, or could result from specific only need to determine the prior predicate-value setting con- strategies (learned or innate) for finding predicate value- ditions once, thereafter knowingthat the predicate has a cer- determining conditions. Wedo not address such strategies tain value and all subsequentactions (e.g., lifting) will have here.) Supposeafter someexploration, the agent discovers particular outcomes. This allows the agent to keep track of that if it pushessmall objects before lifting them, outcomes a property of an object not directly observable, yet which of the pushing interactions serve to determine what the out- holds over time. This is a powerfuladdition to the agent’s comeof lifting those particular objects will be. Andthis in- representational repertoir. For example,if a particular small creased predictability occurs consistently wheninteracting object is pushed and found to moveslowly, the agent does with the small objects (Figure 4). not need to perform this test again (or every time prior to lifting) whenthe agent wants to anticipate what the outcome of lifting will be (Figure 6).

r Lift Object Figure 4: Discoveryof consistent outcomesof pushing con- ditioning consistent outcomesof lifting.

Nowthe agent has found a set of schemasthat can be used to determine under what conditions the proposed unobserv- able predicate wouldhave particular state-values (Figure 5). That is, even thoughthe initial situations are indistinguish- Lift Object able (small objects cannot be distinguished by simple obser- vation), after carrying out a particular interaction (pushing) with a particular small object, then goingon to lift that object Figure 6: The "hidden state" predicate P1 can nowbe used will have a particular outcome. These precondition schemas to distinguish outcomesof situations that are currently ob- nowfill-out moreof the content of P1- servationally identical. Onemight argue that at this point the agent can nowjet- tison the original predicate, simply relying on the test of It is also not a problem that there are somepredicates first pushing small objects before lifting them to see what whosevalues are not stable over time. Somepredicates, for outcome is expected of subsequent lifting. Removingthe example,may only hold a particular value for a short amount predicate, however, would makethe immediateprior testing of time; others maychange their vlaues based on other con- by pushing absolutely necessary. While consistent immedi- ditions. The conditions under whichthese values change are ate prior testing maybe desireable initially (e.g., to ensure conditions that the agent would need to further explore and that the suspected relationship between preconditions and discover in order to makeaccurate predictions. But they are the desired ability to anticipate the outcomeof a future in- not in unattainable within this framework.

47 In the above experiment, the agent discovered term mass as scientifically understoodtoday. Still, we can that the outcomeof pushing could reliably condition the ex- see a clear trajectory headedin that direction, as newmass- pected outcomeof lifting. But the discovery did not need determined relationships are discovered and unified under to happenin this order. For examplethe agent could have, the developing predicate’s semantics. and still can, discover that the outcomeof lifting is a reli- This is a natural segue to considering the powerful role able predictor of outcomesof pushing. If this was an ad- language plays in representational development- particu- ditional discovery, a newpredicate (/92) could be posited. larly predicate discovery of the kind described here. Lan- However,because of the strong symmetrybetween pushing guage can provide labels for these theoretical entities and determininglifting and vice versa, the two predicates could can serve a structuring role in helping the agent learn about be reducedto a single predicate whosevalue could be deter- other theoretical entities, as well as facilitate unification. For minedby either initial action tests (in turn determiningwhat example,use of linguistic labels by other languageusers can the other action outcomewould be). Now,a single unified help to focus attention on specific aspects of the environment predicate, Pa, wouldhave the value 1 wheneither lifting in- requiring explanation; merely hearing the use of a term nat- volved high musclestrain or pushing resulted in slow move- urally leads one to seek an understanding of what that term ment of the object, and a value of 1 would predict either refers to or howit is properly used. Or, having learned a outcomeas well. term that the agent suspects refers to a propertythat its pred- In fact, the use of this symmetryholds generally for icate represents, the agent maythen test whether their use any hidden properties, states or processes that maygov- of the term following this groundingin their predicate rep- ern manydifferent potential observable situations. As men- resentation is correct and/or complete. In general, language tioned above, mass plays an important role in determining: provides a proliferation of tools for positing, analyzing and (a) the outcomeof muscle proprioreceptive feedback while evaluating theoretical entities - useful both for the agent by lifting, (b) the amountof pushing force to bring an object itself, and in its social interactions. Languagegreatly im- on a surface to somevelocity, (c) the kind of haptic feed- proves the speed and correctness of predicate discovery and back whenobjects are droppedon our feet, and (d) howlong refinement in children, and should for our would-be epis- an object might movebefore comingto rest after a certain temic machines as well. amountof force has been imparted to it. Becauseof the sym- (A final note: Weare not makinga positivist claim that all metries in the determining relationships betweenthese dif- theoretical entities are to be understoodas defined in terms ferent situations (as each is used to alternatively predict the of observables. Rather, our claim is that, in the limit case others), they mayall be unified into one general predicate, of no prior knowledge(such as neonates and our machines), whosevalue could be determined by particular outcomesof this is the base mechanismby whichtheoretical entities are any one of the individual schemaaction tests (Figure 7). discovered, understood and grounded. We do need ground- ing in observables(derived frompotentially ellaborate histo- ries of interaction) in order to have someidea of whatvalues those hiddenstates might currently have - this is no differ- ent than the idea of measurementin science. But the ultimate ontological status of the entities, processesor properties that these predicates might represent, while informedby, are not feedback when achieve whendropped | deceleration determined by the predicate values and howthey’re deter- lifted on foot J velocity mined; the former are futher inferences madeas part of the logic and structure of a scientific theory.) Figure 7: In the terminology of graphical models, the mass of an object renders the other observations conditionally in- dependent. Concluding Remarks Science, of course, has developed an ellaborate logic for It is here that we see powerof keeping the pro- investigating theoretical entities and the conditions under posed representational predicate: after learning of the con- which they are posited and evaluated. Also, given an al- nections betweenthese different outcomesin different situ- ready robust base of knowledgeabout howthe world works, ations, a single outcomein one situation will immediately positing of theoretical entities mayinvolve ellaborate use of determine the outcomeof all the others. This also demon- analogy and metaphor, borrowing from already well-known strates another feature of this model,again related to the base environmentalstructure (this wouldinvolve wholestructures mutability of the content (the meaningor semantics) of the transfered tothe newsituation, not just single predi- predicate: the agent’s understanding of what the predicate cate invention; e.g., (Gentner 1989)). Nonetheless, we claim represents slowly evolves as new relationships betweenini- that the above account forms the foundation for ground- tially independent outcomesare discovered. Certainly, the ing discovery and semantics of representation of theoretical content of the predicate representation is not determinedby entities in the limit of no prior knowledge;and we claim the correspondence that it mayin fact have with the true that this form of representation and discovery mechanismis property of mass in the world. Also, we have been careful likely present in all systems that makethe leap from purely not to call any one of these predicates "mass",as thoughthat sensorimotorrepresentation to ellaborated representation of predicate exhaustively captured all of the semantics of the the environmentthat includes unobservablecausal structure.

48 At the core, this mechanism,by which an agent can posit McCallum,A. K. 1995. Reinforcement Learning with Se- and refine theoretical entities, is fundedby a repesentational lective and HiddenState. Ph.D. Dissertation, substrait that (a) accommodatesrecognition of failure to an- University of Rochester, Rochester, NY. ticipate (predict) howthe world will behave, and (b) accom- Shen, W.-M. 1993. Discovery as autonomous learning modates new proposed predicate representations that have from the environment. Machine Learning 12(1-3):143- extensable and revisable content. As we have shown, the 165. base schemadata structure, with its integral relationship be- tween preconditions, actions, and postconditions, is suit- Whitehead, S. D., and Ballard, D. H. 1991. Learning to able for both of these requirements: (a’) in the schema,the perceive and act by trial and error. MachineLearning 7:45- agent’s ownaction, indexed by preconditions, entails out- 83. comeconditions (effects), allowing the agent to test for it- self whetherits ownrepresentations are true or false - that is, these repesentations bear a value that is detectable by the system itself, while also contingent on the environ- mentalcategory (in this case, the potential unobservedstate, process or property) that the predicate is intended to antic- ipate (see (Bickhard 1993)); and ~) the conditions under which the value for a predicate is determined is based on an evolving set of "triggering" conditions, in turn based on other schemas- these triggering conditions constitute the evolving content (meaning)of that predicate.

Acknowledgements This research is supported by DARPAcontract #DASG60- 99-C-0074. The U.S. Government is authorized to re- produce and distribute reprints for governmental purposes notwithstanding any copyright notation hereon. The views an conclusionsherein are those of the authors and should not be interpreted as necessarily representing the official poli- cies or endorsementseither expressed or implied, of DARPA or the U.S. Government.

References Bickhard, M. H. 1993. Representational Content in Hu- mans and Machines. Journal of Experimental and Theo- retical Artificial Intelligence 5:285-333. Drescher, G. L. 1991. Made-UpMinds. MITPress. Gentner, D. 1989. The Mechanismsof Analogical Learn- ing. In Vosniadou, S., and Ortony, A., eds., Similarity and Analogical Reasoning (pp.199-241). Cambridge Uni- versity Press. Gopnik, A., and Meltzoff, A. N. 1997. Words, , and Theories. MITPress. Harr6, R. 1970. The of Scientific Thinking. Macmillian. Harr6, R. 1986. Varieties of Realism: a Rationale for the Natural Sciences. Blackwell. Kwok, R. B.H. 1996. Creating Theoretical Terms for Non-deterministic Actions. In Foo, N. Y., and Goebel, R., eds., PRICAI’96:Topics in Artificial Intelligence, 4th Pacific RimInternational Conferenceon Artificial Intelli- gence, Cairns, Australia, August26-30, 1996, Proceedings. Lecture Notes in ComputerScience, Vol. 1114, Springer. Leslie, A.M.1988. The necessity of illusion: perception and thought in infancy. In Weiskrantz, L., ed., Thought Without Language. Oxford University Press (Clarendon).

49