
IEEE TRANSACTIONS ON AUTONOMOUS MENTAL DEVELOPMENT, VOL. 1, NO. 2, AUGUST 2009

Some Basic Principles of Developmental Robotics

Alexander Stoytchev, Member, IEEE

Abstract—This paper formulates five basic principles of developmental robotics. These principles are formulated based on some of the recurring themes in the developmental learning literature and in the author's own research. The five principles follow logically from the verification principle (postulated by Richard Sutton) which is assumed to be self-evident. This paper also gives an example of how these principles can be applied to the problem of autonomous tool use in robots.

Index Terms—Artificial intelligence, developmental robotics, intelligent robots, learning systems, principles, psychology, robots, programming.

Manuscript received February 14, 2009; revised July 27, 2009. First published August 11, 2009; current version published October 23, 2009. A preliminary version of this paper appeared in the NIPS 2006 Workshop on Grounding Perception, Knowledge, and Cognition in Sensori-Motor Experience. The author is with the Developmental Robotics Laboratory, Department of Electrical and Computer Engineering, Iowa State University, Ames, IA 50011 USA (e-mail: [email protected]). Digital Object Identifier 10.1109/TAMD.2009.2029989

I. INTRODUCTION

Developmental robotics is one of the newest branches of robotics [1]–[3]. The basic research assumption of this field is that "true intelligence in natural and (possibly) artificial systems presupposes three crucial properties: embodiment of the system, situatedness in a physical or social environment, and a prolonged epigenetic developmental process through which increasingly more complex cognitive structures emerge in the system as a result of interactions with the physical or social environment" [2].

Many fields of science are organized around a small set of fundamental laws (e.g., Newton's laws in Physics or the fundamental laws of Thermodynamics). Progress in a field without any fundamental laws tends to be slow and incoherent. Once the fundamental laws are formulated, however, the field can thrive by building upon them. This progress continues until the laws are found to be insufficient to explain the latest experimental evidence. At that point the old laws must be rejected and new laws must be formulated so that scientific progress can continue.

In some fields of science, however, it is not possible to formulate fundamental laws because it would be impossible to prove them, empirically or otherwise. Nevertheless, it is still possible to get around this obstacle by formulating a set of basic principles that are stated in the form of postulates or axioms, i.e., statements that are presented without proof because they are considered to be self-evident. The most famous example of this approach, of course, is Euclid's formulation of the fundamental axioms of Geometry.

Developmental robotics is still in its infancy and it would be premature to try to come up with the fundamental laws or axioms of the field. There are some recurring themes in the developmental learning literature and in the author's own research, however, that can be used to formulate some basic principles. These principles are neither laws (as they cannot be proved at this point) nor axioms (as it would be hard to argue at this point that they are self-evident and/or form a consistent set). Nevertheless, they can be used to guide future research until they are found to be inadequate and it is time to modify or reject them. Five basic principles are described below.

II. THE VERIFICATION PRINCIPLE

Developmental robotics emerged as a field partly as a reaction to the inability of traditional robot architectures to scale up to tasks that require close to human levels of intelligence. One of the primary reasons for scalability problems is that the amount of programming and knowledge engineering that the robot designers have to perform grows very rapidly with the complexity of the robot's tasks. There is mounting evidence that pre-programming cannot be the solution to the scalability problem. The environments in which the robots are expected to operate are simply too complex and unpredictable. It is naive to think that this complexity can be captured in code before the robot is allowed to experience the world through its own sensors and effectors.

For example, consider the task of programming a household robot with the ability to handle all possible objects that it can encounter inside a home. It is simply not possible for any robot designer to predict the number of objects that the robot may encounter and the contexts in which they can be used over the robot's projected service time.

There is yet another fundamental problem that pre-programming not only cannot address, but actually makes worse. The problem is that programmers introduce too many hidden assumptions in the robot's code. If the assumptions fail, and they almost always do, the robot begins to act strangely and the programmers are sent back to the drawing board to try and fix what is wrong. The robot has no way of testing and verifying these hidden assumptions because they are not made explicit. Therefore, the robot is not capable of autonomously adapting to situations that violate these assumptions. The only way to overcome this problem is to put the robot in charge of testing and verifying everything that it learns.

To make this point more clear consider the following example. Ever since the first autonomous mobile robots were developed [4], [5] robots have been avoiding obstacles. After almost 30 years of mobile robotics research, however, there is still not a single robot today that really "understands" what an obstacle is and why it should be avoided. The predominant approach in robotics is still to ignore this question entirely and to make the hidden assumption that an obstacle is equivalent to a

1943-0604/$26.00 © 2009 IEEE

short reading on one of the proximity sensors (e.g., laser range finder, sonar, etc.). The problem with this assumption is that it is true only in some specific situations and not true in general. As a result, outdoor robots cannot distinguish grass from steel rods using a laser range finder. A better approach would be for the robot to try to drive over the detected object at low speed and see if it is displaceable and, thus, traversable. Some recent studies have used this approach successfully [6].

After this introduction, the first basic principle can be stated. It is the so-called verification principle that was first postulated by Richard Sutton in a series of on-line essays in 2001 [7], [8]. The principle is stated as follows:

The Verification Principle: An AI system can create and maintain knowledge only to the extent that it can verify that knowledge itself [8].

According to Sutton, "the key to a successful AI is that it can tell for itself whether or not it is working correctly" [8]. The only reasonable way to achieve this goal is to put the AI system in charge of its own learning using the verification principle. In other words, all AI systems and AI learning algorithms should follow the motto: No verification, no learning. If verification is not possible for some concept then the AI system should not be handcoded with that concept.

Sutton also points out that the verification principle eventually will be adopted by many AI practitioners because it offers fundamental practical advantages over alternative methods when it comes to scalability. Another way of saying the same thing is: "Never program anything bigger than your head" [8]. Thus, the verification principle stands for autonomous testing and verification performed by the robot and for the robot. As explained above, it would be unrealistic to expect the robot programmers to fix their robots every time the robots encounter a problem due to a hidden assumption.

Sutton was the first researcher in AI to state the verification principle explicitly. However, the origins of the verification principle go back to the ideas of the logical positivist philosophers of the 1930s. The two most prominent among them were Rudolf Carnap and Alfred Ayer. They both argued that statements that cannot be either proven or disproven by experience (i.e., metaphysical statements) are meaningless. Ayer defined two types of verifiability, "strong" and "weak," which he formulated as follows:

"A proposition is said to be verifiable, in the strong sense of the term, if and only if, its truth could be conclusively established in experience. But it is verifiable, in the weak sense, if it is possible for experience to render it probable" [9, p. 37].

Thus, in order to verify something in the "strong" sense, one would have to physically perform the verification sequence. On the other hand, to verify something in the "weak" sense, one does not have to perform the verification sequence directly, but one must have the prerequisite sensors, effectors, and abilities to perform the verification sequence if necessary.

For example, a blind person may be able to verify in the "strong" sense the statement "this object is soft" by physically touching the object and testing its softness. He can also verify this statement in the "weak" sense as he is physically capable of performing the verification procedure if necessary. However, the same blind person will not be able to verify, in either the "strong" or the "weak" sense, the statement "this object is red" as he does not have the ability to see and thus to perceive colors. In Ayer's own words:

"But there remain a number of significant propositions, concerning matters of fact, which we could not verify even if we chose; simply because we lack the practical means of placing ourselves in the situation where the relevant observations could be made" [9, p. 36].

The verification principle is easy to state. However, once a commitment is made to follow this principle the implications are far-reaching. In fact, the principle is so different from the practices of traditional autonomous robotics that it changes almost everything. In particular, it forces the programmer to rethink the ways in which learnable quantities are encoded in the robot architecture as anything that is potentially learnable must also be autonomously verifiable.

The verification principle is so profound that the remaining four principles can be considered as its corollaries. As the connection may not be intuitively obvious, however, they will be stated as principles.

III. THE PRINCIPLE OF EMBODIMENT

An important implication of the verification principle is that the robot must have the ability to verify everything that it learns. Because verification cannot be performed in the absence of actions, the robot must have some means of affecting the world, i.e., it must have a body.

The principle of embodiment has been defended many times in the literature, e.g., [10]–[15]. It seems that at least in robotics there is a consensus that this principle must be followed. After all, there are not any robots without bodies.

Most of the arguments in favor of the embodiment principle that have been put forward by roboticists, however, are about justifying this principle to its opponents (e.g., [11] and [12]). The reasons for this are historical. The early AI systems, or as Brooks calls them, Good Old Fashioned AI (GOFAI), were disembodied and their learning algorithms manipulated data in the computer's memory without the need to interact with the external world. The creators of these early AI systems believed that a body is not strictly required as the AI system could consist of just pure code, which can still learn and perform intelligently. This reasoning is flawed, however, for the following reason: code cannot be executed in a vacuum. The CPU, the memory bus, and the hard disk play the role of the body. While the code does not make this assumption, it is implicitly made for it by the compiler, which must know how to translate the code into the target machine language.

As a result of this historic debate most of the arguments in favor of embodiment miss the main point. The debate should not be about whether or not to embrace the principle of embodiment. Instead, the debate should be about the different ways that can be used to program truly embodied robots. Gibbs makes a similar observation about the current state of the art in AI and robotics:

“Despite embracing both embodiment and situatedness in de- just a phantom created by their brains. There is a very simple signing enactive robots, most systems fail to capture the way experiment which can be performed without any special equip- bodily mechanisms are truly embedded in their environments” ment that exposes the phantom body [18]. The experiment goes [15, p. 73]. like this: a subject places his arm under a table. The person con- Some of the arguments used to justify the embodiment prin- ducting the experiment sits right next to the subject and uses ciple can easily be explained from the point of view of the veri- both of his hands to deliver simultaneous taps and strokes to fication principle. Nevertheless, the connection between the two both the subject’s arm (which is under the table) and the surface has not been made explicit so far. Instead of rehashing the de- of the table. If the taps and strokes are delivered synchronously bate in favor of embodiment, which has been argued very elo- then, after about two minutes, the subject will have the bizarre quently by others, e.g., [10], [15], I am only going to focus on sensation that the table is part of his body and that part of his a slightly different interpretation of embodiment in light of the skin is stretched out to lie on the surface of the table. Similar verification principle. extensions and re-mappings of the body have been reported by In my opinion, most arguments in favor of the embodiment others [19]–[21]. principle make a distinction between the body and the world and The conclusions from these studies may seem strange be- treat the body as something special. In other words, they make cause typically one would assume that embodiment implies that the body/world boundary explicit. This distinction, however, is there is a solid representation of the body somewhere in the artificial. The only reason why the body may seem special is brain. 
One possible reason for the phantom body is that the body because the body is the most consistent, the most predictable, itself is not constant, but changes over time. Our bodies change and the most verifiable part of the environment. Other than that, with age. They change as we gain or lose weight. They change there should be no difference between the body and the external when we suffer the results of injuries or accidents. In short, our world. To the brain, the body may seem special, but that is just bodies are constantly changing. Thus, it seems impossible that because “the brain is the body’s captive audience” [16, p. 160]. the brain should keep a fixed representation for the body. If this In other words, the body is always there and we can’t run away representation is not flexible then sooner or later it will become from it. obsolete and useless. According to the new interpretation of the embodiment prin- Another possible reason for the phantom body is that it may ciple described here, the body is still required for the sake of be impossible for the brain to predict all complicated events verification. However, the verification principle must also be ap- that occur within the body. Therefore, the composition of the plicable to the properties of the body, i.e., the properties of the body must be constructed continuously from the latest available body must be autonomously verifiable as well. Therefore, the information. This is eloquently stated by Damasio: learning and exploration principles that the robot uses to explore “Moreover, the brain is not likely to predict how all the the external world must be the same as the ones that it uses to commands—neural and chemical, but especially the latter- explore the properties of its own body. will play out in the body, because the play-out and the re- This interpretation reduces the special status of the body. 
In- sulting states depend on local biochemical contexts and on stead of treating the body as something special, the new inter- numerous variables within the body itself which are not pretation treats the body as simply the most consistent, the most fully represented neurally. What is played out in the body is predictable, and the most verifiable part of the environment. Be- constructed anew, moment by moment, and is not an exact cause of that the body can be easily distinguished from the envi- replica of anything that happened before. I suspect that ronment. Furthermore, in any developmental trajectory the body the body states are not algorithmically predictable by the must be explored first. brain, but rather that the brain waits for the body to report Distinguishing the body from the external world should be what actually has transpired” [16, p. 158]. relatively easy because there are certain events that only the owner of the body can experience and no one else. Rochat [17] The author’s previous work [22] describes a computational calls these events self-specifying and lists three such events: 1) representation for a Robot body schema (RBS). This represen- efferent-afferent loops (e.g., moving one’s hand and seeing it tation is learned by the robot from self-observation data. The move); 2) double touch (e.g., touching one’s two index fingers RBS representation meets the requirements of both the verifica- together); 3) vocalization behaviors followed by hearing their tion principle and the embodiment principle as the robot builds results (e.g., crying and hearing oneself cry). These events are a model for its own body from self-observation data that is re- characterized by the fact that they are multimodal, i.e., they in- peatably observable. volve more than one sensory or motor modality. Also, these The benefits of self-observation learning have also been events are autonomously verifiable because you can always re- pointed out by others. 
For example, Chaminade et al. [23] peat the action and observe the same result. tested the hypothesis that sensorimotor associations for hands Because the body is constructed from actual verifiable experi- and fingers learned from self-observation during motor bab- ence, in theory, it should be possible to change one’s body repre- bling could be used to bootstrap imitative abilities. sentation. In fact, it turns out that this is surprisingly easy to do. Some experiments have shown that the body/world boundary IV. THE PRINCIPLE OF SUBJECTIVITY is very pliable and can be altered in a matter of seconds [18], The principle of subjectivity also follows quite naturally from [19]. For example, it comes as a total surprise for many people the verification principle. If a robot is allowed to learn and main- to realize that what they normally think of as their own body is tain only knowledge that it can autonomously verify for itself, 4 IEEE TRANSACTIONS ON AUTONOMOUS MENTAL DEVELOPMENT, VOL. 1, NO. 2, AUGUST 2009 then it follows that what the robot learns must be a function of build its own perceptual world, which would be autonomously what the robot has experienced through its own sensors and ef- verifiable by the robot. fectors. In other words, learning must be a function of experi- From what has been said so far one can infer that the essence ence. As a consequence, two robots with the same control archi- of the principle of subjectivity is that it imposes limitations on tectures, but with different interaction histories with a given ob- what is potentially learnable by a specific agent. In particular, ject could have two unique representations for the same object, there are two types of limitations: sensorimotor and experien- i.e., the two representations will be subjective. The subjectivity, tial. 
Each of them is discussed below along with the adaptation or uniqueness, is due to the capabilities of the robots (sensori- mechanisms that have been adopted by animals and humans to motor limitations) and their specific interaction histories (expe- reduce the impact of these limitations. riential limitations). Ayer was probably the first one to recognize that the veri- A. Sensorimotor Limitations fication principle implies subjectivity. He observed that if all The first limitation imposed on the robot by the subjectivity knowledge must be verifiable through experience, then it fol- principle is that what is potentially learnable is determined lows that all knowledge is subjective as it has to be formed by the sensorimotor capabilities of the robot’s body. In other through individual experiences [9, p. 125–126]. Thus, what is words, the subjectivity principle implies that all learning is learned depends entirely on the capabilities of the learner and pre-conditioned on what the body is capable of doing. For the history of interactions between the learner and the environ- example, a blind robot cannot learn what the meaning of the ment, or between the learner and its own body. Furthermore, if color red is because it does not have the ability to perceive the learner does not have the capacity to perform a specific veri- colors. fication procedure, then the learner would never be able to learn While it may be impossible to learn something that is beyond something that depends on that procedure (as in the blind person the sensorimotor limitations of the body, it is certainly possible example given above). Thus, subjectivity may be for develop- to push these limits farther by building tools and instruments. mental learning what relativity is for physics—a fundamental It seems that a common theme in the history of human techno- limitation that cannot be avoided or circumvented. 
logical progress is the constant augmentation and extension of The subjectivity principle captures very well the subjective the existing capabilities of our bodies. For example, Campbell nature of object affordances. A similar notion was suggested outlines several technological milestones which have essentially by Gibson who stated that a child learns “his scale of sizes as pushed one body limit after another [28]. The technological pro- commensurate with his body, not with a measuring stick” [24, gression described by Campbell starts with tools that augment p. 235]. Thus, an object affords different things to people with our physical abilities (e.g., sticks, stone axes, and spears), then different body sizes; an object might be graspable for an adult, moves to tools and instruments that augment our perceptual abil- but may not be graspable for a child. Noë has recently given a ities (e.g., telescopes and microscopes), and it is currently at the modern interpretation of Gibson’s ideas and has stressed that stage of tools that augment our cognitive abilities (e.g., com- affordances are also skill relative: puters and PDAs). “Affordances are animal-relative, depending, for ex- Regardless of how complicated these tools and instruments ample, on the size and shape of the animal. It is worth are, however, their capabilities will always be learned, con- noting that they are also skill-relative. To give an example, ceptualized, and understood relative to our own sensorimotor a good hitter in baseball is someone for whom a thrown capabilities. In other words, the tools and instruments are pitch affords certain possibilities for movement. The ex- nothing more than prosthetic devices that can only be used if cellence of a hitter does not consist primarily in having they are somehow tied to the pre-existing capabilities of our excellent vision. But it may very well consist in the mastery bodies. 
Furthermore, this tool-body connection can only be of sensorimotor skills, the possession of which enables a established through the verification principle. The only way in situation to afford an opportunity for action not otherwise which we can understand how a new tool works is by expressing available” [25, p. 106]. its functionality in terms of our own sensorimotor repertoire. This is true even for tools and instruments that substitute one In robotics, Brooks [26], [27] was the first one to argue that sensing modality for another. For example, humans have no the robot’s Merkwelt—the perceptual world in which the robot natural means of reading magnetic fields, but we have invented lives—is different from the programmer’s Merkwelt. Brooks the compass which allows us to do that. The compass, however, strongly objects to the common practice in robotics of using per- does not convert the direction of the magnetic field into a ceptual abstraction which “reduces the input data so that the pro- modality that we cannot interpret, e.g., infrared light. Instead, gram experiences the same perceptual world (Merkwelt) as hu- it converts it to human readable form with the help of a needle. mans.” [26]. Thus, it is hopeless for a human to try to impose his The exploration process involved in learning the functional own Merkwelt on the robot because “each animal species, and properties or affordances of a new tool is not always straight clearly each robot species with their own distinctly non-human forward. Typically, this process involves active trial and error. sensor suites, will have their own different Merkwelt. [… T]he Probably the most interesting aspect of this exploration is that Merkwelt we humans provide our programs is based on our own the functional properties of the new tool are learned in relation introspection. It is by no means clear that such a Merkwelt is to the existing behavioral repertoire of the learner. anything like what we actually use internally” [26]. 
Instead, the The related work on animal object exploration indicates that robot should be allowed to use its own sensorimotor apparatus to animals use stereotyped exploratory behaviors when faced with STOYTCHEV: BASIC PRINCIPLES OF DEVELOPMENTAL ROBOTICS 5 a new object [29], [30]. This set of behaviors is species specific not be able to function normally. Nevertheless, many philoso- and may be genetically pre-determined. For some species of an- phers have grappled with this fundamental question. imals, these tests include almost their entire behavioral reper- To answer this question without violating the basic principles toire: “A young corvide bird, confronted with an object it has that have been stated so far, we must allow for the fact that the never seen, runs through practically all of its behavioral patterns, representations that two agents have may be functionally dif- except social and sexual ones”[30, p. 44]. ferent, but nevertheless they can be qualitatively the same. Fur- Unlike crows, adult humans rarely explore a new object by thermore, the verification principle can be used to establish the subjecting it to all possible behaviors in their behavioral reper- qualitative equivalence between the representations of two dif- toire. Human object exploration tends to be more focused, al- ferent agents. This was well understood by Ayer who stated the though that is not always the case with human infants [29]. Nev- following: ertheless, an extensive exploration process similar to the one dis- “For we define the qualitative identity and difference played by crows can sometimes be observed in adult humans as of two people’s sense-experiences in terms of the simi- well. This process is easily observed in the members of tech- larity and dissimilarity of their reactions to empirical tests. 
nologically “primitive” societies when they are exposed to an To determine, for instance, whether two people have the object for the first time from a technologically advanced society same colour sense we observe whether they classify all the [31, p. 246]. colour expanses with which they are confronted in the same In a previous paper the author described a method for au- way; and, when we say that a man is colour-blind, what we tonomous learning of object affordances by a robot [32]. The are asserting is that he classifies certain colour expanses robot learns the affordances of different tools in terms of the in a different way from that in which they would be classi- expected outcomes of specific exploratory behaviors. The affor- fied by the majority of people” [9, p. 132]. dance representation is inherently subjective as it is expressed in terms of the behavioral repertoire of the robot (i.e., it is skill rel- Another reason why two humans can understand each other ative). The affordance representation is also subjective because even though they have totally different life experiences is be- the affordances are expressed relative to the capabilities of the cause they have very similar physical bodies. While no two robot’s body. For example, if an object is too thick to be grasped human bodies are exactly the same, they still have very similar by the robot, the robot learns that the object is not graspable even structure. Furthermore, our bodies have limits which determine though it might be graspable for a different robot with a larger how we can explore the world through them (e.g., we can only gripper [33]. move our hands so fast). On the other hand, the world is also structured and imposes restrictions on how we can explore it B. Experiential Limitations through our actions (e.g., an object that is too wide may not be In addition to sensorimotor limitations, the subjectivity prin- graspable). 
Because we have similar bodies and because we live ciple also imposes experiential limitations on the robot. Expe- in the same physical world, there is a significant overlap which riential limitations restrict what is potentially learnable simply allows us to have a shared understanding. Similar ideas have because learning depends on the history of interactions between been proposed in psychology and have been gaining popularity the robot and the environment, i.e., it depends on experience. in recent years [15], [25], [34], [35]. Because experience is a function of time, this limitation is es- Consequently, experience must constantly shape or change all sentially due to the finite amount of time that is available for internal representations of the agent over time. Whatever repre- any type of learning. One interesting corollary of this is that: sentations are used they must be flexible enough to be able to the more intelligent the life form, the longer it has to spend in change and adapt when new experience becomes available. A the developmental stage. good amount of experimental evidence suggests that such adap- Time is a key factor in developmental learning. By default tation takes place in biological systems. For example, the repre- developmental learning requires interaction with the external sentation of the fingers in the somatosensory cortex of a monkey world. There is a limit on how fast this interaction can occur, depends on the pattern of their use [36]. If two of the fingers are which ultimately restricts the speed of learning. While the used more often than other fingers then the number of neurons limitation of time cannot be avoided it is possible to speed up in the somatosensory cortex that are used to encode these two learning by relying on the experience of others. The reason fingers will increase [36]. 
why this does not violate the subjectivity principle is because The affordance representation described in [32] is influenced verification can be performed in the “weak sense,” and not only by the actual history of interactions between the robot and the in the “strong sense.” Humans, for example, often exploit this tools. The affordance representation is pliable and can accom- shortcut. Ever since writing was invented, we have been able to modate the latest empirical evidence about the properties of the experience places and events through the words and pictures of tool. For example, the representation can accommodate tools others. These vicarious experiences are essential for us. that can break—a drastic change that significantly alters their Vicarious experiences, however, require some sort of basic affordances. overlap between our understanding of the world and that of others. Thus, the following question arises: if everything that V. T HE PRINCIPLE OF GROUNDING is learned is subjective, then how can two different people have While the verification principle states that all things that the a common understanding about anything? Obviously this is not robot learns must be verifiable, the grounding principle de- a big issue for humans because, otherwise, our civilization will scribes what constitutes a valid verification. Grounding is very 6 IEEE TRANSACTIONS ON AUTONOMOUS MENTAL DEVELOPMENT, VOL. 1, NO. 2, AUGUST 2009 important because if the verification principle is left unchecked Stating that grounding is performed in terms of act-outcome it can easily go into an infinite recursion. At some point there pairs coupled with a probabilistic estimate is a good start, but needs to be an indivisible entity which is not brought under leaves the formulation of grounding somewhat vague. Each ac- further scrutiny, i.e., an entity which does not require additional tion or behavior is itself a very complicated process that involves verification. 
Grounding is a familiar problem in AI. In fact, one of the oldest open problems in AI is the so-called symbol grounding problem [37]. Grounding, however, is also a very loaded term. Unfortunately, it is difficult to come up with another term to replace it with. Therefore, for the purposes of this document, the term grounding is used only to refer to the process, or the outcome of the process, which determines what constitutes a successful verification.

Despite the challenges in defining what constitutes grounding, if we follow the principles outlined so far, we can arrive at the basic components of grounding. The motivation for stating the embodiment principle was that verification is impossible without the ability to affect the world. This implies that the first component that is necessary for successful verification (i.e., grounding) is an action or a behavior.

The action by itself, however, is not very useful for the purposes of successful verification (i.e., grounding) because it does not provide any sort of feedback. In order to verify anything, the robot needs to be able to observe the outcomes of its own actions. Thus, the second component of any verification procedure must be the outcome or outcomes that are associated with the action that was performed.

This leads us to one of the main insights of this section, namely, that grounding consists of Act-Outcome (or Behavior-Observation) pairs. In other words, grounding is achieved through the coupling of actions and their observable outcomes. Piaget expressed this idea when he said that "children are real explorers" and that "they perform experiments in order to see." Similar ideas have been proposed and defended by others, e.g., [10], [15], [24], [25], [29], [35], [38], and [39].
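The act-outcome insight can be made concrete with a small sketch. The following Python fragment is purely illustrative and not from the paper; the class name, thresholds, and counting scheme are all assumptions. It tabulates behavior-observation pairs and treats a pair as grounded only after it has been replicated consistently across several trials:

```python
from collections import defaultdict

class ActOutcomeTable:
    """Tracks how often each (action, outcome) pair is observed.

    The relative frequency serves as a crude probabilistic estimate of
    repeatability: pairs that replicate across trials become grounded.
    """

    def __init__(self):
        self.counts = defaultdict(int)   # (action, outcome) -> occurrences
        self.trials = defaultdict(int)   # action -> times executed

    def record(self, action, outcome):
        self.trials[action] += 1
        self.counts[(action, outcome)] += 1

    def confidence(self, action, outcome):
        """Fraction of executions of `action` that produced `outcome`."""
        if self.trials[action] == 0:
            return 0.0
        return self.counts[(action, outcome)] / self.trials[action]

    def grounded(self, action, outcome, min_trials=5, min_confidence=0.8):
        """A pair counts as grounded only after several consistent replications."""
        return (self.trials[action] >= min_trials
                and self.confidence(action, outcome) >= min_confidence)

table = ActOutcomeTable()
for _ in range(6):
    table.record("push", "object_moved")
table.record("push", "no_change")
print(table.grounded("push", "object_moved"))  # True: replicated in 6 of 7 trials
```

The thresholds here stand in for whatever statistical test a real system would use; the point is only that a single lucky co-occurrence never qualifies as grounded.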
Grounding of information based on a single act-outcome pair is not sufficient, however, as the outcome may be due to a lucky coincidence. Thus, before grounding can occur, the outcome must be replicated at least several times in the same context. If the act-outcome pair can be replicated, then the robot can build up probabilistic confidence that what was observed was not just due to pure coincidence, but that there is a real relationship that can be reliably reproduced in the future.

Thus, grounding requires that action-outcome pairs be coupled with some sort of probabilistic estimates of repeatability. Confidence can be built up over time if multiple executions of the same action lead to the same outcome under similar conditions. In many situations, the robot should be able to repeat the action (or sequence of actions) that was executed just prior to the detection of the outcome. If the outcome can be replicated, then the act-outcome pair is worth remembering, as it is autonomously verifiable. Another way to achieve this is to remember only long sequences of (possibly different) act-outcome pairs which are unlikely to occur in any other context due to the length of the sequence. This latter method is closer to Gibson's ideas for representing affordances.

Stating that grounding is performed in terms of act-outcome pairs coupled with a probabilistic estimate is a good start, but it leaves the formulation of grounding somewhat vague. Each action or behavior is itself a very complicated process that involves multiple levels of detail. The same is true for the outcomes or observations. Thus, what remains to be addressed is how to identify the persistent features of a verification sequence that are constant across different contexts. In other words, one needs to identify the sensorimotor invariants. Because the invariants remain unchanged, they are worth remembering and thus can be used for grounding.

While there could be a potentially infinite number of ways to ground some information, this section will focus on only one of them. It is arguably the easiest one to pick out from the sensorimotor flux and probably the first one to be discovered developmentally. This mechanism for grounding is based on the detection of temporal contingency.

Temporal contingency is a very appealing method for grounding because it abstracts away the nature and complexity of the stimuli involved and reduces them to the relative time of their co-occurrence. The signals could come from different parts of the body and can have their origins in different sensors and actuators.

Temporal contingency is easy to calculate. The only requirement is to have a mechanism for reliable detection of the interval between two events. The events can be represented as binary, and the detection can be performed only at the times in which these signals change from 0 to 1 or from 1 to 0. Furthermore, once the delay between two signals is estimated, it can be used to predict future events. Based on this, the robot can easily detect that something does not feel right, even if the cause for that is not immediately identifiable. A more sophisticated model of contingency detection is used by the Neo system [40], which maintains contingency tables for all pairs of its streams.

Timing contingency detection is used in [41] and [42] to detect which perceptual features belong to the body of the robot. In order to do that, the robot learns the characteristic delay between its motor actions (efferent stimuli) and the movements of perceptual features in the environment (afferent stimuli). This delay can then be used to classify the perceptual stimuli that the robot can detect into "self" and "other."

Detection of temporal contingency is very important for the normal development of social skills as well. In fact, it has often been suggested that contingency alone is a powerful social signal that plays an important role in language acquisition [43], learning to imitate [44], social perception [45], and more generally in social cognition [46]. Temporal contingency has also been used to explain why you cannot tickle yourself [47]. Watson [48] proposed that the "contingency relation between a behavior and a subsequent stimulus may serve as a social signal beyond (possibly even independent of) the signal value of the stimulus itself." This might be a fruitful area of future robotics research.
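The contingency-detection mechanism described in this section can be sketched in a few lines. The fragment below is a minimal illustration, not the paper's (or Neo's) implementation: the binary event encoding, the `max_delay` window, and the function names are assumptions. It detects signal transitions, builds a histogram of efferent-to-afferent lags, and reports the dominant delay:

```python
from collections import Counter

def transitions(signal):
    """Indices at which a binary signal flips 0->1 or 1->0."""
    return [t for t in range(1, len(signal)) if signal[t] != signal[t - 1]]

def estimate_delay(efferent, afferent, max_delay=10):
    """Most frequent lag between an efferent (motor) event and the next
    afferent (sensory) event within a window; a sharply peaked lag
    histogram indicates a temporally contingent signal."""
    lags = Counter()
    afferent_events = transitions(afferent)
    for e in transitions(efferent):
        candidates = [a - e for a in afferent_events if 0 < a - e <= max_delay]
        if candidates:
            lags[min(candidates)] += 1
    return lags.most_common(1)[0][0] if lags else None

motor  = [0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0]
vision = [0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0]  # mirrors motor two steps later
print(estimate_delay(motor, vision))  # 2
```

A characteristic efferent-afferent delay recovered in this way is, in the spirit of [41] and [42], what lets a robot label stimuli that reliably follow its motor commands as "self" and the rest as "other."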
VI. THE PRINCIPLE OF INCREMENTAL DEVELOPMENT

The principle of incremental development recognizes the fact that it is impossible to learn everything at the same time. Before we learn to walk, we must learn how to crawl. Before we learn to read, we must learn to recognize individual letters. There is no way around that. Development is incremental and cumulative in nature. The "continuous development and integration of new skills" is a process that has often been called Ongoing Emergence [49]. The gradual accumulation of knowledge and skills during development is similar to what Michael Tomasello has called "the ratchet effect" [50]. Once a skill has been learned, it can readily be adopted by the individual or by other members of the society to discover new skills; thus, the ratchet has gone up one notch.

Every major developmental theory either assumes or explicitly states that development proceeds in stages [51]–[53]. These theories, however, often disagree about what causes the stages and what triggers the transitions between them. Variations in the timing of these stages have also been observed between the members of the same species. Therefore, the age limits set by some of these theories for when each developmental milestone should occur must be treated as rough guidelines and not as fixed rules.

E. J. Gibson (who was J. J. Gibson's wife) has even expressed some doubts about the usefulness of formulating stages in developmental learning: "I want to look for trends in development, but I am very dubious about stages. To repeat, trends do not imply stages in each of which a radically new process emerges, nor do they imply maturation in which a new direction exclusive of learning is created." [38, p. 450] Others have embraced the idea of stages, but only as far as they are useful in forming certain developmental invariants: "Although the stages correspond roughly to age levels (at least in the children studied [by Piaget]), their significance is not that they offer behavior norms for specific ages but that the sequence of the stages and the transition from one stage to the next is invariant." [54, p. 37] Still others have suggested that an information-processing perspective might be the only common organizing principle in the literature on infant perception and cognition [55].

Regardless of what causes the stages (or the appearance of stages), one of the most important lessons that roboticists can draw from developmental studies is that the final outcome depends not just on the stages, but on their relative order and duration. Some really good lessons can be learned from an inspirational area of research that compares and contrasts the developmental sequences of different organisms. Comparative studies between primates and humans are useful precisely because they expose the major developmental differences between different species that follow Piaget's sequence in their development [29], [56].

For example, the time at which autonomous locomotion emerges after birth varies significantly between different primate species [29], [56]. In chimpanzees this is achieved fairly rapidly, and then they begin to move about the environment on their own. In humans, on the other hand, independent locomotion does not emerge until about a year after birth. An important consequence of this is that human infants have a much longer developmental period during which they can manually explore and manipulate objects. They tend to play with objects, rotate them, chew them, throw them, relate them to one another, and bring them to their eyes to take a closer look. In contrast, chimpanzees are not as interested in sitting down and manually exploring objects because they learn to walk at a much younger age. To the extent that object exploration occurs in chimpanzees, it usually is performed when the objects are on the ground [29], [56]. Chimpanzees rarely pick up an object in order to bring it to the eyes and explore it [29]. The full implications of these developmental differences for the overall intelligence of the two species are still being debated. Perhaps developmental robotics can help clarify this in the near future.

Another interesting result from comparative studies is that object exploration (and exploration in general) seems to be self-guided and does not require external reinforcement. What is not yet clear, however, is what process initiates exploration and what process terminates it.

The principle of incremental development states that exploration is self-regulated and always proceeds from the most verifiable to the least verifiable parts of the environment. In other words, the exploration is guided by an attention mechanism that is continually attracted to parts of the environment that exhibit medium levels of verifiability. When some parts of the environment are explored fully, they begin to exhibit perfect levels of verifiability and thus are no longer interesting. Therefore, the exploration process can chart a developmental trajectory without external reinforcement because what is worth exploring next depends on what is being explored now. Note that the "environment" may refer to the robot's body as well. For example, once a robot has mastered its body movements, object manipulation movements begin to have medium levels of contingency and should draw the robot's attention automatically.

The previous section described how temporal contingency can be used for successful verifiability (i.e., grounding). This section builds upon that example, but also takes into account the level of contingency that is detected. At any point in time, the parts of the environment that are the most interesting, and thus worth exploring, exhibit medium levels of contingency. To see why this might be the case, consider the following example.

In his experiments with infants, Watson (1985) observed that the level of contingency that is detected by the infants is very important. For example, he observed that 16-week-old infants only paid attention to imperfect contingencies. In his experiment, the infants watched a TV monitor which showed a woman's face. The TV image was manipulated such that the woman's face would become animated for 2-s intervals after the infant kicked with his legs. The level of this contingency was varied by adjusting the timing delay between the infants' kicking movements and the animation. Somewhat surprisingly, the infants in this study paid more attention to faces that did not show the perfect contingency (i.e., faces that did not move immediately after the infants' kicking movements). This result led Watson to conclude that the infant's attentional mechanisms may be modulated by an inverted U-shaped function based on the contingency of the stimulus [48].
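Watson's inverted-U observation suggests one possible form for such an attention function. The sketch below is purely illustrative; the Gaussian shape, the peak location, and the width parameter are assumptions for the example, not quantities from Watson's study or from this paper:

```python
import math

def interest(contingency, peak=0.5, width=0.2):
    """Inverted U-shaped interest in a stimulus as a function of its
    detected contingency in [0, 1]. Perfectly contingent (1.0) and
    essentially random (0.0) stimuli both score low; intermediate
    contingency scores highest."""
    return math.exp(-((contingency - peak) ** 2) / (2 * width ** 2))

for c in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(round(interest(c), 3))  # 0.044, 0.458, 1.0, 0.458, 0.044
```

A robot driven by such a function would keep shifting its attention: as a stimulus becomes fully predictable, its contingency score approaches 1.0 and its interest value decays toward the tail of the curve.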
An attention function that has these properties seems ideal for an autonomous robot. If a stimulus exhibits perfect contingency, then it is not very interesting, as the robot can already predict everything about that stimulus. On the other hand, if the stimulus exhibits very low levels of contingency, then the robot cannot learn a predictive model of that stimulus, which makes that stimulus uninteresting as well. Therefore, the really interesting stimuli are those that exhibit medium levels of contingency.

E. J. Gibson reached conclusions similar to those of Watson. She argued that perceptual systems are self-organized in such a way that they always try to reduce uncertainty. Furthermore, this search is self-regulated and does not require external reinforcement, i.e., it is intrinsically motivating [57], [58]:

"The search is directed by the task and by intrinsic cognitive motives. The need to get information from the environment is as strong as to get food from it, and obviously useful for survival. The search is terminated not by externally provided rewards and punishments, but by internal reduction of uncertainty. The products of the search have the property of reducing the information to be processed. Perception is thus active, adaptive, and self-regulated" [38, p. 144].

Thus, the main message of this section is that roboticists should try to identify attention functions for autonomous robots that have properties similar to the ones described above. This seems to be a promising area of future research.

VII. AN EXAMPLE: DEVELOPMENTAL SEQUENCE FOR AUTONOMOUS TOOL USE

This section provides an example that uses the five principles described above to formulate a developmental sequence. This sequence can be used by autonomous robots to acquire tool-using abilities. By following this sequence, a robot can explore progressively larger chunks of the initially unknown environment that surrounds it. Incremental development is achieved by detecting regularities that can be explained and replicated with the sensorimotor repertoire of the robot. This exploration proceeds from the most predictable to the least predictable parts of the environment.

The developmental sequence begins with learning a model of the robot's body, since the body is the most consistent and predictable part of the environment. Internal models that reliably identify the sensorimotor contingencies associated with the robot's body are learned from self-observation data. For example, the robot can learn the characteristic delay between its motor actions (efferent stimuli) and the movements of perceptual features in the environment (afferent stimuli). By selecting the most consistently observed delay, the robot can learn its own efferent-afferent delay. Furthermore, this delay can be used to classify the perceptual stimuli that the robot can detect into "self" and "other" [41].

Once the perceptual features associated with the robot's body are identified, the robot can begin to learn certain patterns exhibited by the body itself. For example, the features that belong to the body can be clustered into groups based on their movement contingencies. These groups can then be used to form frames of reference (or body frames), which in turn can be used both to control the movements of the robot and to predict the locations of certain stimuli [59].

During the next stage, the robot uses its body as a well-defined reference frame from which the movements and positions of environmental objects can be observed. In particular, the robot can learn that certain behaviors (e.g., grasping) can reliably cause an environmental object to move in the same way as some part of the robot's body (e.g., its wrist) during subsequent robot behaviors. Thus, the robot can learn that the grasping behavior is necessary in order to control the position of the object reliably. This knowledge is used for subsequent tool-using behaviors. One method for learning these first-order (or binding) affordances is described in [33].

Next, the robot can use the previously explored properties of objects and relate them to other objects. In this way, the robot can learn that certain actions with objects can affect other objects, i.e., they can be used as tools. Using the principles of verification and grounding, the robot can learn the affordances of tools. The robot can autonomously verify and correct these affordances if the tool changes or breaks [32].

VIII. SUMMARY

This paper proposed five basic principles of developmental robotics. These principles were formulated based on some of the recurring themes in the developmental learning literature and in the author's own research. The five principles follow logically from the verification principle (postulated by Richard Sutton), which is assumed to be self-evident.

The paper also described an example of how these principles can be applied to autonomous tool use in robots. The author's previous work describes the individual components of this sequence in more detail [59], [60].

ACKNOWLEDGMENT

The author would like to thank A. Silvescu, who was one of the graduate students in the author's Developmental Robotics class in fall 2005. He was the first person to tell the author about Ayer's book and its relevance to the current debates in developmental robotics.

REFERENCES

[1] J. Weng, J. McClelland, A. Pentland, O. Sporns, I. Stockman, M. Sur, and E. Thelen, "Autonomous mental development by robots and animals," Science, vol. 291, no. 5504, pp. 599–600, Jan. 2001.
[2] J. Zlatev and C. Balkenius, "Introduction: Why epigenetic robotics?," in Proc. 1st Int. Workshop Epigenetic Robot., Sep. 2001.
[3] M. Asada et al., "Cognitive developmental robotics: A survey," IEEE Trans. Auton. Mental Develop., vol. 1, no. 1, pp. 12–34, May 2009.
[4] H. P. Moravec, "Obstacle avoidance and navigation in the real world by a seeing robot rover," Ph.D. dissertation, Stanford University, Stanford, CA, 1980.
[5] N. J. Nilsson, "Shakey the robot," AI Center, Menlo Park, CA, Tech. Rep. 323, 1984.
[6] E. Ugur, M. R. Dogar, M. Cakmak, and E. Sahin, "The learning and use of traversability affordance using range images on a mobile robot," in Proc. IEEE Int. Conf. Robot. Autom., Rome, Italy, Apr. 10–14, 2007.
[7] R. Sutton, Verification, 2001 [Online]. Available: http://www.cs.ualberta.ca/~sutton/IncIdeas/Verification.html
[8] R. Sutton, Verification, the Key to AI, 2001 [Online]. Available: http://www.cs.ualberta.ca/~sutton/IncIdeas/KeytoAI.html
[9] A. J. Ayer, Language, Truth and Logic. New York: Dover, 1952.

[10] F. Varela, E. Thompson, and E. Rosch, The Embodied Mind: Cognitive Science and Human Experience. Cambridge, MA: MIT Press, 1991.
[11] R. Brooks and L. Stein, "Building brains for bodies," Auton. Robots, vol. 1, no. 1, pp. 7–25, Nov. 1994.
[12] R. Brooks, C. Breazeal, M. Marjanovic, B. Scassellati, and M. Williamson, "The Cog project: Building a humanoid robot," in Comput. Metaphors, Analogy, Agents, C. Nehaniv, Ed. New York: Springer, 1999, pp. 52–87.
[13] A. Clark, Being There: Putting Brain, Body, and World Together Again. Cambridge, MA: MIT Press, 1997.
[14] R. Pfeifer and C. Scheier, Understanding Intelligence. Cambridge, MA: MIT Press, 1999.
[15] R. W. Gibbs, Jr., Embodiment and Cognitive Science. Cambridge, MA: Cambridge University Press, 2006.
[16] A. R. Damasio, Descartes' Error: Emotion, Reason and the Human Brain. New York: Gosset/Putnam Press, 1994.
[17] P. Rochat, "Five levels of self-awareness as they unfold early in life," Consciousness Cogn., vol. 12, pp. 717–731, 2003.
[18] V. Ramachandran and S. Blakeslee, Phantoms in the Brain: Probing the Mysteries of the Human Mind. New York: William Morrow, 1998.
[19] A. Iriki et al., "Coding of modified body schema during tool use by macaque postcentral neurons," Neuroreport, vol. 7, pp. 2325–2330, 1996.
[20] J. Botvinick and J. Cohen, "Rubber hands 'feel' touch that eyes can see," Nature, vol. 391, pp. 756–756, 1998.
[21] H. Ishibashi, S. Hihara, and A. Iriki, "Acquisition and development of monkey tool-use: Behavioral and kinematic analyses," Canadian J. Physiol. Pharmacol., vol. 78, pp. 958–966, 2000.
[22] A. Stoytchev, "Computational model for an extendable robot body schema," Georgia Inst. Tech., Atlanta, GA, College of Computing Tech. Rep. GIT-CC-03-44, Oct. 21, 2003.
[23] T. Chaminade, E. Oztop, G. Cheng, and M. Kawato, "From self-observation to imitation: Visuomotor association on a robotic hand," Brain Res. Bulletin, vol. 75, no. 6, pp. 775–784, Apr. 2008.
[24] J. J. Gibson, The Ecological Approach to Visual Perception. Boston, MA: Houghton Mifflin, 1979.
[25] A. Noë, Action in Perception. Cambridge, MA: MIT Press, 2004.
[26] R. A. Brooks, "Achieving artificial intelligence through building robots," Massachusetts Inst. Tech., Cambridge, MA, Tech. Rep. AI Memo 899, 1986.
[27] R. A. Brooks, "Intelligence without representation," Artificial Intelligence, vol. 47, no. 1–3, pp. 139–159, Jan. 1991.
[28] B. Campbell, Human Evolution: An Introduction to Man's Adaptations, 3rd ed. New York: Aldine, 1985.
[29] T. G. Power, Play and Exploration in Children and Animals. Mahwah, NJ: Lawrence Erlbaum, 2000.
[30] K. Lorenz, "Innate bases of learning," in Learning as Self-Organization, K. H. Pribram and J. King, Eds. Mahwah, NJ: Lawrence Erlbaum, 1996.
[31] J. M. Diamond, Guns, Germs, and Steel: The Fates of Human Societies. New York: Norton, 1999.
[32] A. Stoytchev, "Behavior-grounded representation of tool affordances," in Proc. IEEE ICRA, Barcelona, Spain, 2005, pp. 3071–3076.
[33] A. Stoytchev, "Toward learning the binding affordances of objects: A behavior-grounded approach," in Proc. AAAI Symp. Develop. Robot., Mar. 21–23, 2005, pp. 17–22.
[34] A. Glenberg, "What memory is for," Behav. Brain Sci., vol. 20, no. 1, pp. 1–55, 1997.
[35] J. K. O'Regan and A. Noë, "A sensorimotor account of vision and visual consciousness," Behav. Brain Sci., vol. 24, no. 5, 2001.
[36] X. Wang, M. Merzenich, K. Sameshima, and W. Jenkins, "Remodeling of hand representation in adult cortex determined by timing of tactile stimulation," Nature, vol. 378, pp. 71–75, Nov. 1995.
[37] S. Harnad, "The symbol grounding problem," Physica D, vol. 42, pp. 335–346, 1990.
[38] E. J. Gibson, Principles of Perceptual Learning and Development. New York: Appleton-Century-Crofts, 1969.
[39] D. Wolpert and M. Kawato, "Multiple paired forward and inverse models for motor control," Neural Netw., vol. 11, no. 7–8, pp. 1317–1329, Oct./Nov. 1998.
[40] P. R. Cohen, M. S. Atkin, T. Oates, and C. R. Beal, "Neo: Learning conceptual knowledge by sensorimotor interaction with an environment," in Proc. 1st Int. Conf. Auton. Agents, Marina del Rey, CA, 1997, pp. 170–177.
[41] A. Stoytchev, "Toward video-guided robot behaviors," in Proc. 7th Int. Conf. Epigenetic Robot., NJ, Nov. 5–7, 2007, pp. 165–172.
[42] P. Michel, K. Gold, and B. Scassellati, "Motion-based robotic self-recognition," in Proc. IEEE/RSJ IROS, Sendai, Japan, 2004.
[43] M. Goldstein, A. King, and M. West, "Social interaction shapes babbling: Testing parallels between birdsong and speech," Proc. Nat. Acad. Sci., vol. 100, no. 13, Jun. 2003.
[44] S. S. Jones, "Infants learn to imitate by being imitated," in Proc. 5th Int. Conf. Develop. Learning, Bloomington, IN, Jun. 3, 2006.
[45] D. Wolpert, K. Doya, and M. Kawato, "A unifying computational framework for motor control and social interaction," Philos. Trans. Roy. Soc. B: Biol. Sci., vol. 358, no. 1431, pp. 593–602, Mar. 2003.
[46] C. D. Frith, Making up the Mind: How the Brain Creates Our Mental World. Malden, MA: Wiley-Blackwell, 2007.
[47] S.-J. Blakemore, D. M. Wolpert, and C. D. Frith, "Why can't you tickle yourself?," NeuroReport, vol. 11, no. 11, pp. R11–R16, Aug. 2000.
[48] J. S. Watson, "Contingency perception in early social development," in Social Perception in Infants, T. M. Field and N. A. Fox, Eds. Norwood, NJ: Ablex, 1985, pp. 157–176.
[49] C. G. Prince, N. A. Helder, and G. J. Hollich, "Ongoing emergence: A core concept in epigenetic robotics," in Proc. 5th Int. Workshop Epigenetic Robot., Nara, Japan, Jul. 22–24, 2005, pp. 63–70.
[50] M. Tomasello, The Cultural Origins of Human Cognition. Cambridge, MA: Harvard University Press, 2001.
[51] J. Piaget, The Origins of Intelligence in Children. New York: International Universities Press, 1952.
[52] S. Freud, New Introductory Lectures on Psychoanalysis. New York: Norton, 1965.
[53] J. Bowlby, Attachment. New York: Basic Books, 1969, vol. 1.
[54] P. H. Wolff, "The developmental psychologies of Jean Piaget and psychoanalysis," Psychol. Issues, vol. 2, no. 1, pp. 3–181, 1960.
[55] L. B. Cohen and C. H. Cashon, "Infant perception and cognition," in Comprehensive Handbook of Psychology, Ser. Develop. Psychol. II: Infancy, R. Lerner, A. Easterbrooks, and J. Mistry, Eds. New York: Wiley, 2003, vol. 6, pp. 65–89.
[56] M. Tomasello and J. Call, Primate Cognition. New York: Oxford University Press, 1997.
[57] A. Barto and O. Şimşek, "Intrinsic motivation for reinforcement learning systems," in Proc. 13th Yale Workshop Adaptive Learning Syst., New Haven, CT, 2005, pp. 113–118.
[58] P.-Y. Oudeyer, F. Kaplan, and V. Hafner, "Intrinsic motivation systems for autonomous mental development," IEEE Trans. Evol. Comput., vol. 11, no. 2, pp. 265–286, Apr. 2007.
[59] A. Stoytchev, "Robot tool behavior: A developmental approach to autonomous tool use," Ph.D. dissertation, College of Computing, Georgia Institute of Technology, Atlanta, GA, 2007.
[60] A. Stoytchev, "Five basic principles of developmental robotics," presented at the NIPS 2006 Workshop on Grounding Perception, Knowledge, and Cognition in Sensori-Motor Experience, Whistler, BC, Canada, Dec. 8, 2006.

Alexander Stoytchev (S'00–M'07) received the M.S. and Ph.D. degrees in computer science from the Georgia Institute of Technology, Atlanta, in 2001 and 2007, respectively.

He is currently an Assistant Professor of Electrical and Computer Engineering and Director of the Developmental Robotics Laboratory at Iowa State University, Ames. His research interests are in the areas of developmental robotics, autonomous robotics, computational perception, and machine learning.