Kirsh, David. "Problem solving and situated cognition." In Robbins, P., & Aydede, M. (Eds.). (2009). The Cambridge handbook of situated cognition (pp. 264-306). Cambridge: Cambridge University Press.

CHAPTER 15

Problem Solving and Situated Cognition

David Kirsh

Introduction

In the course of daily life we solve problems often enough that there is a special term to characterize the activity and the right to expect a scientific theory to explain its dynamics. The classical view in psychology is that to solve a problem a subject must frame it by creating an internal representation of the problem's structure, usually called a problem space. This space is an internally generable representation that is mathematically identical to a graph structure with nodes and links. The nodes can be annotated with useful information, and the whole representation can be distributed over internal and external structures such as symbolic notations on paper or diagrams. If the representation is distributed across internal and external structures, the subject must be able to keep track of activity in the distributed structure. Problem solving proceeds as the subject works from an initial state in this mentally supported space, actively constructing possible solution paths, evaluating them, and heuristically choosing the best. Control of this exploratory process is not well understood, as it is not always systematic, but various heuristic search methods have been proposed and some experimental support has been provided for them.

Situated cognition, by contrast, does not have a theory of problem solving to compete with the classical view. It offers no computational, neuropsychological, or mathematical account of the internal processes underlying problem cognition. Nor does it explain the nature of the control of external processes related to problem solving. Partly this is a matter of definition. Problems are not regarded as a distinct category for empirical and computational analysis because what counts as a problem varies from activity to activity. Problems do arise all the time, no matter what we are doing. But from a situated cognition perspective these problems should not be understood as abstractions with a formal structure that may be the same across different activities. Each problem is tied to a concrete setting and is resolved by reasoning in situation-specific ways, making use of the material and cultural resources locally available. What is
called a problem, therefore, depends on the discourse of that activity, and so, in a sense, is socially constructed. There is no natural kind called "problem" and no natural kind process called "problem solving" for psychologists to study. Problem solving is merely a form of reasoning that, like all reasoning, is deeply bound up with the activities and context in which it takes place. Accordingly, the situational approach highlights those aspects of problem solving that reveal how much the machinery of inference, computation, and representation is embedded in the social, cultural, and material aspects of situations.

This critical approach to problem solving is what I shall present first. In Part 1 I discuss the assumptions behind the classical psychological theory. In Part 2 I present the major objections raised by those believing that cognition must be understood in an embodied, interactive, and situated way, and not primarily as a cognitive process of searching through mental or abstract representations. There is a tendency in the situated cognition literature to be dismissive of the classical view without first acknowledging its flexibility and sophistication. Accordingly, I present the classical account in its best form in an effort to appreciate what parts may be useful in a more situated theory. In Part 3 I collect pieces from both accounts, situated and classical, to move on to sketch a more positive theory - or at least provide desiderata for such a theory - though only fragments of such a view can be presented here.

PART 1: THE CLASSICAL THEORY

1. Newell and Simon's Theory

In an extensive collection of papers and books, Herbert Simon, often with Allen Newell, presented a clear statement of the now-classical approach to problem solving (see, among others, Newell & Simon, 1972). Mindful that science regularly proceeds from idealization, Simon and Newell worked from the assumption that a theory based on how people solve well-defined problems can be stretched or augmented to explain how people solve problems that are ill defined, which they recognized a large class of problems to be.

To develop their theory they presented subjects with a collection of games and puzzles with unique solutions or solution sets. Having a correct answer - a solution - is the hallmark of a problem being well defined. Problems were posed in contexts in which the experimenter could be sure subjects had a clear understanding of what they had to solve. Games and puzzles were chosen because they are self-contained; it is assumed that no special knowledge outside of what is provided is needed to solve them. These sorts of problems have a strict definition of allowable actions (you move your pawn like this), the states these actions cause (the board enters this configuration), and a strict definition of when the game or puzzle has been solved, won, or successfully completed (opponent's king is captured). It was assumed that subjects who read the problem would be able to understand these elements and create their internal representation. Such problems are both well defined and knowledge lean, as "everything that the subject needs to know to perform the task is presented in the instructions" (VanLehn, 1989, p. 528). No special training or background knowledge is required.

2. Task Environment

In the classical theory, the terms problem and task are interchangeable. Newell and Simon introduced the expression task environment to designate an abstract structure that corresponds to a problem. It is called an environment because subjects who improve task performance are assumed to be adapting their behavior to some sort of environmental constraints, the fundamental structure of the problem. It is abstract because the same task environment can be instantiated in very different ways. In chess, for example, the task environment is the same whether the pieces are made of wood or silver or are displayed on a computer screen. Any differences arising because agents need
to interact differently in different physical contexts are irrelevant. It does not matter whether an agent moves pieces by hand, by mouse movements, by requesting someone else to make the move for them, or by writing down symbols and sending a description of their move by mail. Issues associated with solving these movement or communication subtasks belong to a different problem.

A task environment, accordingly, delineates the core task. It specifies an underlying structure that determines the relevant effects of every relevant action that a given agent can perform. This has the effect that if two agents have different capacities for action they face different task environments. When four-legged creatures confront an obstacle, they face a different locomotion problem than two-legged creatures, and both problems are different from the locomotion problem the obstruction poses to a snake. Thus, two agents operating in the same physical environment, each facing the same objective - get from a to b - may face different task environments because of their different capacities. Their optimal path may be different. Moreover, of all the actions a creature or subject can perform, the only ones that count as task relevant are the ones that can, in principle, bring it closer to or farther from an environmental state meeting the goal condition. It is assumed that differences in expertise and intellectual ability affect search and reasoning rather than the definition of the task itself.

Task environments are theoretical projections that let researchers interpret problem-solving activity in concrete situations. They identify what counts as a move in a problem (for a given agent). As such, they impose a powerful filter over the way a researcher interprets subjects' actions. Scratching one's head during chess, for instance, is an action that would be interpreted by a researcher as irrelevant to the game. It not only would lie outside the task environment of chess construed as the set of possible chess moves but also would be treated as having no relevance to the game in any other way - an epiphenomenon. The same would apply to other things nonexperts do when they play, such as putting a finger on a piece, trying out possible actions on the board, using pencil and paper, talking to oneself, or consulting a book (if allowed at all). All are assumed irrelevant to task performance. They may occur while a subject is working on a problem, or while playing chess, but, according to the classical account, they are not literally part of problem-solving activity. This is obviously a point of dispute for situationalists, as many of these actions are regularly observed during play, and they may critically affect the success of an agent.

3. Problem Space

Task environments are differentiated from problem spaces, the representation subjects are assumed to mentally construct when they understand a task correctly. This problem-space representation might be distributed over external resources. It encodes the following:

• The current state of the problem. At the beginning this is the initial state.
• A representation of the goal state or condition - though this might be a procedure or test for recognizing when the goal has been reached, rather than a declarative statement of the goal.
• Constraints determining allowable moves and states, hence the nodes and allowable links of the space - these too may be specified implicitly in procedures for generating all and only legal moves rather than explicitly in declarative statements.
• Optionally, other representations that may prove useful in understanding problem states or calculating the effects of action.

Some of these other representations encode knowledge of problem-solving methods, heuristics, or metrics specific to the current task environment. Others encode

Figure 15.1. The different versions of the Tower of Hanoi shown in 15.1a share the same abstract task environment, shown in the graph structure displayed in 15.1b. All the versions have the same legal moves in an abstract sense, the goal of the game is the same, and the strategies for completion are the same. At a more microscopic level, moving heavy pieces in one game may require additional planning, but these extra moves and extra plans are not considered to be part of the game. Because the game is defined abstractly, any differences in the action repertoire of an agent are irrelevant. In other tasks we base our analysis of the task on the actions the agent can perform, so that more powerful agents may face different tasks than less powerful ones. Choice of level of abstraction is a theoretical decision.
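
The graph structure the caption refers to can be made concrete. What follows is a minimal sketch, not the chapter's own formalism: it assumes a three-disk, three-peg Tower of Hanoi, and the state encoding (a tuple giving the peg of each disk) and the names `legal_moves`, `build_graph`, and `shortest_solution` are illustrative choices. It enumerates the nodes and links of the abstract task environment and then searches that space breadth-first, which is the kind of search the classical theory takes problem solving to be.

```python
from collections import deque
from itertools import product

PEGS = (0, 1, 2)  # assumption: three pegs, labelled 0..2


def legal_moves(state):
    """Yield successor states; state[d] is the peg holding disk d (0 = smallest)."""
    for src in PEGS:
        on_src = [d for d, peg in enumerate(state) if peg == src]
        if not on_src:
            continue
        disk = min(on_src)  # only the top (smallest) disk on a peg can move
        for dst in PEGS:
            if dst == src:
                continue
            on_dst = [d for d, peg in enumerate(state) if peg == dst]
            # constraint: never place a larger disk on a smaller one
            if not on_dst or min(on_dst) > disk:
                yield state[:disk] + (dst,) + state[disk + 1:]


def build_graph(n_disks):
    """Enumerate every node (state) and link (legal move) of the space."""
    return {s: list(legal_moves(s)) for s in product(PEGS, repeat=n_disks)}


def shortest_solution(graph, start, goal):
    """Breadth-first search; returns the states along one shortest path."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in graph[state]:
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


graph = build_graph(3)
path = shortest_solution(graph, (0, 0, 0), (2, 2, 2))
print(len(graph), "states;", len(path) - 1, "moves in a shortest solution")
```

With three disks this prints `27 states; 7 moves in a shortest solution`. Nothing in the code mentions what the disks are made of or how they are moved, which is the sense in which the physically different versions share one task environment.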

methods, heuristics, and metrics that are domain independent, such as general methods of search, measures of when one is getting closer to a goal, and typical ways of overcoming impasses that arise in the solution-finding process.

4. Ill-Defined Problems

Puzzle and game cognition seems to fit this formal, knowledge-lean approach - at least in part. But Simon recognized that most of the problems we encounter in life are not well defined in this formal sense. Some have no unambiguously right answer, the result of applying an operational goal condition to possible solutions. This may be because there are many grades and forms of adequate answer, as is typical of problems arising in architecture, engineering, cooking, writing, and other creative or design-related work. Or it may be because the notion of what constitutes an adequate answer is not known in advance, and part of what a problem solver must learn in the course of working on a problem is what counts as a better answer. Still other problems have no fixed set of operators relevant to a problem space - no fixed set of choice points, fixed consequence function, fixed evaluation function, or well-defined constraints on feasible actions. Think of the problem a painter faces when confronting a blank canvas in a studio with all the paints, media, brushes, and tools he or she might ever want. Goals, operators, choice points, consequence, and evaluation functions are either undetermined by the very nature of the problem, or they have to be learned microgenetically, in the course of activity. The problem is largely being made up as it is being worked on (cf. Reitman, 1964).

Simon regarded the prevalence of ill-defined problems as a challenge to the classical theory but not an insurmountable one. Cognitive theories should start first with the clear, central cases of problems - which for Simon are well defined and knowledge lean - and then move outward to harder cases.

PART 2: CRITICISM OF THE CLASSICAL THEORY

1. Initial Summary of Objections

The notions of task environment and problem space have a formal elegance that is seductive. They encourage treating problem solving as an area of psychology that can be studied using existing methods of mathematics and experimentation. But they can also justifiably be attacked from many sides, and not just because efforts to extend the theory to ill-defined problems have been mostly unsuccessful. Four objections that are congenial to a situated approach to cognition deserve close examination.

1.1. Framing and Registration

Framing and registration processes are integral to the problem-solving process and arguably the hardest part of it. The formal theory takes the heart of problem solving to be search. Indeed, this is the only part explained by the classical theory. But search in a problem space only makes sense after the hard work of framing has been done - after a problem has been well posed and put into a searchable graph structure. It is one thing to do this for games where the operations and objectives are typically told to us explicitly. It is another to do this for everyday problems, where we have to decide what is relevant and what is irrelevant. In artificial intelligence, a closely related problem of bounding the scope of what needs to be considered in planning, reasoning, and solving problems is called the "frame problem," and it remains an open question how people do this.

Moreover, what is the justification for treating the abstraction or framing part of problem solving as separate and unconnected from the problem-solving part, which is assumed to be search? It may seem intuitive to see problem solving as having parts: recognize a problem in a concrete situation; abstract, frame, or bound the problem; find a solution; and reinterpret the solution in the concrete setting. It may seem intuitive that we can modularize these parts and study each component. But whether justified or not - and there are good reasons to challenge the modularization of steps - why accept that the locus of difficulty, the real challenge of problem solving, concerns the search part? Framing is notoriously hard, and so is registration.

To understand the registration problem, imagine yourself in a shopping mall, standing in front of a wall map, trying to find a path from your current location to a specific store. Which is harder: figuring out where you are relative to the map, assuming the map does not have an icon with a "You are here" label, or finding a path from a to b on the map? For most of us finding the path is the easy part. That is the part that is analogous to search in a problem space. It is far harder to figure out where you are and then translate the path you found back into action in the world. Those are the registration parts: connecting the abstract search space (whether internal or external) to the world, and then reinterpreting the results of search, or some other action performed on an abstract representation, back into domain-specific terms.

Given the interactive nature of problem solving, the back-and-forth process of acting, observing the result, and then thinking of the next move, agents almost never do all their work in a problem space and then act in the world. They constantly translate moves in their abstract problem space into actions in their concrete context and back again. How subjects frame and interpret a problem therefore is essential to how they will proceed and how easy it is to translate between problem space and world. The more abstract a problem space, the more distant it is from the specifics of the current situation, and the harder this translation process is. Think of the distance between a recipe in a cookbook and its concrete execution by a cook in the kitchen. The recipe represents a solution to the problem of creating a certain dish given certain ingredients. But when cooks execute a recipe they go back and forth between the paper representation and their kitchen. Why can't they just remember the steps and proceed without consulting and reconsulting the recipe? Plan and execution are connected in nonsimple ways. The interim effects of following a recipe alert a cook to details of the steps that need close attention. This interactive process of going back and forth, between world and representation (recipe), shows that there are two sides to the registration problem: encoding and decoding.

Registration and framing are related because in registering a problem one also has to find a way of tying concrete elements of a situation with a problem representation. Framing adds a further element: a bias on the knowledge that is relevant. When people think about something they see as problematic, they typically frame their difficulty in terms of their immediate understanding of their situation, an understanding that comes with preconceptions of what is relevant and potentially useful. This is often constraining. Problems of cooking, for instance, are framed in terms of ingredients, flame size, and pots and pans, rather than in terms of concepts in chemistry (e.g., reaction potential, catalyst) we may have learned in school and that are, in principle, relevant to understanding the cooking process. Expert chemists may bring such domain-external views to the cooking process. And expert mathematicians or expert modelers may bring the capacity to neatly formalize the concrete. But for the rest of us it is hard to get beyond the concrete to the abstract and general. If we could appreciate the abstract in the concrete, we would recognize and be able to transfer from one domain to another more readily than we do. The reason we do not is that our understanding of problems is usually tied to the resources and tools at hand. We are hampered by the framing appropriate to the setting in which our activity takes place.

Given the way problems arise in natural contexts, the burden of explanation ought to lie in psychology to show both that (a) people do have an abstract problem space representation of problems they solve, and (b) the hard part of problem solving is not
to be found in the process of going back and forth between situational understanding and problem space understanding, but in search. This is the challenge that greater attention to the processes of framing and registration poses to the classical view. To my knowledge it has never been answered.

1.2. Interactivity and Epistemic Activity

Examination of actual problem solving in ecologically natural contexts as opposed to white-room environments reveals a host of interactions with resources and cultural elements that figure in the many phases of problem solving, such as understanding the problem, exploring its scope and constraints, getting a sense of options, and developing a metric for evaluating progress toward a solution. People generate a range of intermediate structures. In reducing problem solving to search in a problem space, the classical approach minimizes and misunderstands the complexity and centrality of local interaction.

There is much more going on during problem solving than searching in an abstracted problem space. Most of these actions-interactions lie outside the narrow definition of the problem. Although this echoes the first objection in stressing that problem solving is not reducible to search, it pushes that argument further by focusing on the nature of agent-environment interaction during problem solving. People do many more task-relevant things when problem solving than those allowed for in the strict definition of their task or problem. The notion of a task environment is far too narrow. These task-exogenous actions affect both the process and success of problem solving. Addressing this issue requires ethnographic attention to the real-world details of problem solving.

1.3. Resources and Scaffolds

Once focus shifts from puzzles to real problems arising in everyday environments, it is apparent that subjects have access to cultural products - tools, measuring devices, graph paper, calculators, algorithms, tricks of the trade, free advice - that make their reasoning job easier. Even when no problem aids are lying around, the type of problem encountered is not a worst-case problem but a simpler version of a problem that only in its general form is hard to solve. It is well known that problems that are computationally complex when conceived in their general form invariably have many special forms that are quite easy to solve. Usually these are the ones people actually confront, and posing a problem in its more general form, as so often is done in the classical approach, makes the problem harder, encouraging cognitive scientists to propose solution methods that people do not have to follow.

It is not an accident that we encounter special cases. We live most of our life in constructed environments. Layers of artifacts saturate almost every place we go, and there are preexisting practices for doing things. These artifacts and practices have been designed, or have coevolved, to make us smarter, to make it easier for us to solve our problems and perform our habitual tasks. Everywhere there are scaffolds and other resources to simplify problem solving, including people to ask. Part of what we learn is how to use these resources and participate in the relevant practices. Approaching a problem as if it must be posed in its general form ignores the efficiencies and kludges that typify natural beings living in worlds scaffolded and designed for them. It supposes that our main problem-solving skills are tied to search, when in fact they may be more closely related to our ability to manage our artifacts, make effective use of scaffolds, and conform to practice. To focus on the 3 percent of problems only some of us solve risks misunderstanding the remaining 97 percent of problems we all solve.

1.4. Knowledge Rich

Most problems people face in daily life are not like knowledge-lean problems in which all relevant aspects of each problem can be given in a compact problem statement.
Naturally occurring problems rarely occur in a vacuum, where all an agent needs to know can be encapsulated in a few simple sentences. We typically bring more knowledge and expertise than formal accounts of problem solving discuss. Consider cooking, cleaning, shopping, gardening, the tasks confronted in offices that involve computer applications, or editing documents. In each case an intelligent novice performs less well than experienced participants. It might be that some of this can be reduced to familiarity with search heuristics, domain metrics, and the like. But much surely has to do with knowing how to pose, view, dissolve, and work around problems, and knowing what is most effective in specific situations and how to coordinate the use of local resources - a deep knowledge of cases. Theories of knowledge-rich problem solving have become important in the literature since the 1980s. But even these studies place too little emphasis on the centrality of resources, scaffolds, interactivity, and cultural support. Almost none explain the process by which people understand problems.

In the next sections I will develop each argument further, calling attention to supporting articles in both the situated and classical literature where many of these concerns have been recognized but left unanswered.

2. Framing and Registration

2.1. Framing

The heart of the framing and encoding argument is that natural problems arise in concrete settings where agents are already operating in activity-specific provinces of meaning. It matters whether an agent is playing chess by mail or playing chess with a young child using Disney characters. The context affects the way the game is conceptualized and framed (e.g., chess for competition, chess for teaching beginners). This framing colors choice, evaluation, and local objectives, all factors involved in creating a problem space. In tasks that are less abstract than chess the setting and local resources matter even more. They activate an interpretive framework that primes agents to look for and conceptualize features of their environment in activity-specific ways, biasing what they see as problematic and what they see as the natural or at-hand resources available to solve such problems. Problems, goals, operators, and representations are not abstract. They arise in concrete settings where agents have certain activities they have to perform. Features of these activity spaces affect the way the problem is represented and framed.

Lave (1988) and others (Rogoff & Lave, 1984) have explored the effects of context on problem conceptualization. In commenting on her well-known ethnography of mathematical activity in supermarkets, Lave wrote: "I have tried... to understand how mathematical activity in grocery stores involved being 'in' the 'store,' walking up and down 'aisles,' looking at 'shelves' full of cans, bottles, packages and jars of food, and other commodities" (Lave, 1996, p. 4). Each of these domain-specific terms has an impact on the way problems are conceptualized and posed.

To show that mathematical activity is not the same across settings, Lave looked at the techniques and methods shoppers in supermarkets use to solve some of their typical problems of choice, such as whether can A is a better buy than can B. She found that even though unit prices are printed on supermarket labels, shoppers rarely check them to decide what to buy. Instead they use less general strategies such as, "Product A would cost $10 for 10 oz., and product B is $9 for 10 oz., hence product B is the cheaper buy." Or faced with a choice between a 5 oz. packet costing $3.29 and a 6 oz. packet priced at $3.59, the shopper would argue, "If I take the larger packet, it will cost me 30 cents for an extra ounce. Is it worth it?"

Why do shoppers ignore unit price? It is clear that they are not indifferent to unit price because they usually use strategies that involve price comparison between specific items. But as retailers well know, the actual problem a shopper solves has many more
variables. When people make a decision about what to take home they also consider where they will store the items, how long each item will last, how quickly it will be used, and the family's loyalty to brand. Cans are hefted, labels are examined, and the factors that influence shoppers have been made sufficiently prominent by producers and retailers that shoppers can be certain to notice them. The effect is that reducing the problem of choice to comparing unit price strips the actual shopping problem of its complexity. Moreover, by placing competing brands side by side and placing related products nearby (spaghetti sauce near pasta), supermarkets provide a structure or organization for cognitive activity that shapes the way shoppers think. Layout affects the way option sets are conceptualized (e.g., "I came in to buy spaghetti and decided to get linguini because I liked the look of the new Alfredo sauce"). This dynamic between product display and consumer framing of choice has coevolved.

In an earlier study, Carraher, Carraher, and Schliemann (1985) presented a related view. They found that Brazilian children selling goods in street markets invented special purpose procedures to add up prices and calculate change rather than use the more general pencil-and-paper methods they learned in school. They framed their problem in a domain-specific manner because the specialized cognitive artifacts they used to help them calculate were the ones built up in local practice and readily available in the situation.

For example, a girl who made money for the family as a street vendor, when asked the price of 10 coconuts selling at 35 cruzeiros per piece, did not use the add-0 method for multiplying by 10, as she had been taught in school. She used the cost of three coconuts (105), which was a convenient group she regularly sold coconuts in, added to this the cost of two more of these threesomes (210), then added the cost of a single coconut (35) to the running total. She correctly reported the price of 10 coconuts as 350 cruzeiros. Street market children did less well at these same problems in school, where they used the school-taught procedures. The authors noted that in the street, both children and older vendors used convenient groups for their additions, such as "three for 105," and that simple multiples of these groups, two or three of these three-for's, were also very highly practiced so that, in effect, the vendors were substituting recall for summation or multiplication whenever they could. Predictably, school-taught procedures interfered with this type of situation-specific problem solving, and predictably, the street vendor kids performed better on the street than nonvendors with comparable education. Context and experience framed how the kids approached their problems and the resources and tools they deemed appropriate. Their activity in street environments was different than in classroom environments. Arguably, the cognitive resources in the street coevolved with the demands of street calculation.

In another study, this time by Scribner (1984), there is an account of how milkmen filled orders for different kinds of milk - white milk, chocolate milk, half-pints, quarts - by packing their delivery cases to make delivery more efficient and physically less effortful. This again is a numerical problem that was solved using contextualized knowledge.

Scribner noted that old-time milkmen used their delivery case itself as a thing to think with, and they filled orders faster and more accurately than students who filled orders using arithmetic calculations. The milkmen learned the numerical relations of various configurations of milk containers (one layer of half-pints is 16, two rows of quarts are 8, hence half as many as pints, and so forth) and used the compositional structure of layers and half layers, and so on, to fill cases with multiples of items without counting out each item. If they needed 35 half-pints to fill an order, they would know to fill two layers and then add three more on top. Deliverymen solved billing problems using a similar process of taking overlearned quantities or patterns, pulling them apart, and putting them back together. For instance, to figure out the cost of 98 half-pints a natural strategy was to take two times the case price (a case holds 48 half-pints and its price was memorized) and add the price of two half-pints. The patterns of milk cartons in the case are the elements of calculation. They became things to think with, patterns to decompose and recompose, to mentally manipulate. Again context and experience framed the way they saw their problems. The methods they developed were not universal, based on general algorithms for solving arithmetic problems; they were specialized and situation specific. And their deep familiarity with different situations showed in performance.

Cognitive scientists interested in learning theory have called the mathematical knowledge displayed here "intuitive" or "naive," to distinguish it from the formal knowledge taught in school (Hamberger, 1979). Intuitive knowledge is thought to be bound to the context in which the knower solves personally relevant problems.

The issue of who is right - the situationalist who looks at local resources as things to think with, or the formalist who frames the task more abstractly as a general type of problem, in these cases math problems that must be interpreted or applied to local conditions - lies at the heart of the situated challenge. Which problem are people trying to solve? If a subject thinks about a problem in concrete terms such as cans and shelves, and so has an internal conception and an external discourse that makes it seem as if the problem were about attributes of cans (e.g., their appearance, shape, volume, price, brand), why suppose he is deluded and is really talking about a basic number problem that happens to be couched in terms of cans? From his point of view, his problem is a naturally occurring one, quite unlike the contrived sentence problems presented in math class ("a bachelor comes to a supermarket with $15 looking for the best way to spend his money on pasta and beer. Spaghetti costs $1.75, fusilli costs $2.50, beer costs..."). From the formalist point of view, however, it is irrelevant that the numbers of interest refer to attributes of cans or bottles. Idiosyncrasies of the problem instance, such as what is near to what, or how information about price, volume, and so on, is displayed, do not matter. All that matters is the topological structure of the problem space, or the mathematical structure of the problem. And that may be the same whatever the labels are for nodes and links: can size, number of bottles, distances, or simply numbers.

Two findings discussed at length in the problem-solving literature - problem isomorphism and mental set - bear on this question of framing and abstraction. Both support the view that subjects are sensitive to surface attributes of a problem, so much so that two problems that are formally the same, or formally very similar, may be solved in such different ways and with such different speed-accuracy profiles that a process theory should treat them as different. Little is gained by seeing them as only different problem-space representations of the same task environment.

Take problem isomorphism first. Tic-tac-toe and the game of fifteen are superficially different versions of the same problem (see Figure 15.2). Legal moves and solutions in tic-tac-toe and legal moves and solutions in the game of fifteen can be put in one-to-one correspondence. From a formal point of view the problems, therefore, are isomorphic. They have the same mathematical structure. Yet subjects conceptualize them differently and their performance is different, as measured by their speed-accuracy profiles and the pattern of errors they generate. Predictably, subjects rarely transfer their expert methods for tic-tac-toe to the game of fifteen; they relearn them. Evidently, then, algorithms are sensitive to surface structure even if their success conditions are not.

What can be inferred? The obvious conclusion is that details of the problem context - the way it is presented and conceptualized, the richness of cues in the local environment - determine what subjects count as a solution and the resources they see as available to solve it. In the game of fifteen, if paper and pen are handy, for

Figure 15.2. The task environments of the game of fifteen and tic-tac-toe are isomorphic because there is a one-to-one correspondence between the legal moves in tic-tac-toe and those in the game of fifteen. In the game of fifteen, players take turns choosing from a set of numbered tiles. The first player to collect three tiles that sum to fifteen wins. Because it is easy to see opportunities for three in a row visually, but harder to see opportunities for summing to fifteen, subjects can play tic-tac-toe faster and with fewer errors than fifteen. Their skills in tic-tac-toe do not transfer and they have to relearn the tricks.
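The one-to-one correspondence described in this caption can be checked mechanically. The sketch below assumes the usual form of the game of fifteen, with tiles numbered 1 through 9, and lays those numbers out on a 3 x 3 magic square; the particular square and the helper names are illustrative choices, not taken from the chapter. Under that layout, the triples of tiles that sum to fifteen should be exactly the rows, columns, and diagonals of the tic-tac-toe board.

```python
from itertools import combinations

# Assumption: tiles are numbered 1-9 and placed on a 3x3 magic square,
# in which every row, column, and diagonal sums to 15.
MAGIC_SQUARE = [
    [2, 7, 6],
    [9, 5, 1],
    [4, 3, 8],
]


def tic_tac_toe_lines(board):
    """Return the 8 winning lines of a 3x3 board as sets of cell values."""
    rows = [set(row) for row in board]
    cols = [set(col) for col in zip(*board)]
    diagonals = [
        {board[i][i] for i in range(3)},
        {board[i][2 - i] for i in range(3)},
    ]
    return rows + cols + diagonals


def fifteen_wins():
    """Return every set of three distinct tiles (1-9) that sums to 15."""
    return [set(triple) for triple in combinations(range(1, 10), 3)
            if sum(triple) == 15]


def same_wins(board):
    """True if the board's lines are exactly the winning triples of fifteen."""
    lines = {frozenset(line) for line in tic_tac_toe_lines(board)}
    triples = {frozenset(triple) for triple in fifteen_wins()}
    return lines == triples


if __name__ == "__main__":
    print(len(fifteen_wins()))      # number of winning triples in fifteen
    print(same_wins(MAGIC_SQUARE))  # do they coincide with tic-tac-toe lines?
```

On this layout, picking a tile in fifteen is the same move as marking the corresponding cell. That is the formal sense in which the two task environments are identical, even though, as the text argues, players do not experience them that way.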

instance, subjects will often mark down sums and consequences of moves. How shall we view these paper actions? On the one hand, because actions on paper cannot improve the pragmatic position of a subject, paper and the actions it affords do not seem to be part of the problem context. On the other hand, for those who rely on paper to work out their next move, it is an important part of their problem-solving activity and makes a difference to their outcomes. In tic-tac-toe, scratch paper only gets in the way. Our visual system makes spotting consequences of moves easy. So in the game of fifteen versus tic-tac-toe, the resources, actions, and calculations relevant to a solution are, for many subjects, quite different. The formal state space of the two versions of the game is the same, but that space seems to abstract away from too many psychologically and activity-relevant details to explain the cognitive processes involved in problem solving. In fact, given such differences in problem-solving activity, why suppose the two even share an isomorphic task environment? The level of abstraction needed to view them in the same way seems too high. Because the purpose of a task-environment and problem-space approach is to provide us with constructs sufficient to explain psychological and behavioral activity, we need to find the right level of abstraction to capture generalizations. In these two cases, there seems too much difference in behavioral performance. Moreover, if our goal is a process model of problem-solving cognition, we ought to attend to the way subjects distribute relevant states over environmental artifacts (scrap paper as well as the spatial layout of cards or the way the tic-tac-toe board is filled in), and how they work out game moves by performing epistemic actions (Kirsh & Maglio, 1995).

Our concern here is with the possibility of finding the right level of abstraction to characterize the psychological processes involved in solving problems, even well-defined ones, such as the game of fifteen. In the gestalt theory of problem solving, it is assumed that people see a problem as a meaningful question only against a background of assumptions. To a given subject something is foregrounded as problematic only against this backdrop of the unproblematic (Luchins, 1942). The possible lines of solution that subjects will consider, accordingly, are constrained by their mental set, which limits the information they attend to and the conjectures and resources they think are relevant. Sometimes the mental set a person brings to a task or situation is appropriate and helps in finding a solution. Sometimes it does not.

An example of the ways framing and mental set can prevent problem solving is found in insight problems, where to solve the problem subjects must break out of conventional thinking and try something nonstandard, a trick. Usually this trick involves breaking preconceptions about what is allowable

or what is the function of an available resource.

For instance, in the nine-dot problem, Maier (1930) told subjects to find a way to connect all the dots in a three-by-three matrix by using four straight lines, without lifting their pens or retracing any lines (see Figure 15.3). The problem is hard precisely because participants do not consider making non-dot turns (Kershaw, 2004). They rarely consider constructing lines that extend beyond the dots, and when they do they seldom consider making a turn in empty space, either between the dots or somewhere outside the matrix. The problem statement does not exclude these possible actions. But subjects frame the problem as if they consider these impermissible. Framing has prevented the subject from creating the right problem space, perhaps even from grasping the right task environment.

Figure 15.3. The nine-dot problem and its solution. The task is to connect all dots using four straight lines without lifting one's pen. Subjects frame the problem narrowly by assuming that lines must begin or end on dots and cannot extend beyond them. They incorrectly assume that all turning occurs on dots.

In separate work on mental set, in what is commonly regarded as the classical demonstration of set, Luchins (1942) presented subjects with water-jug problems in which they had to figure out how to get a certain amount of water (e.g., five cups), using any combination of three jugs: jug A holds eighteen cups, jug B holds forty-three cups, and jug C holds ten cups. They were free to dip their jugs into a well as many times as they liked. Luchins found that after subjects get the hang of the solution method and have more or less automatized it, they try to use that method on new problems and persevere in using it, even when there is a much better way to solve the new problems. This was seen as strong evidence for mental set because when other subjects were shown these alternative problems before the initial problem, they would soon learn the easy solution methods, suggesting that learning on one problem can interfere with learning and performance on another. Further support for the presence of mental set was found by Sternberg and Davidson (1983) and Gick and Holyoak (1980, 1983), who confirmed in new experiments that prior solution methods - prior set - worked against finding solutions to different problems, problems that subjects without that bias would be expected to solve.

The relevance of mental set for the situated approach is that how agents frame problems, what they see as possible actions, and what they see as good methods for success depend on how they interpret their situation and their mindset in approaching a problem. To an experienced shopper, supermarket problems are their own sort of problem, quite unlike the general arithmetic problems learned in school. To the strongly mathematically inclined, however, supermarket problems are more likely seen as a special case of general arithmetic problems. Mathematicians see through the particulars of the shopping situation, grasping the more abstract mathematical problem. Their mental set is very different. And they worked hard to achieve that competence. To less mathematical reasoners, however, the supporting resources and scaffolds are so different and the tricks and visible cues are so different that, initially, at any rate, their whole mindset is different. The problem is different.

Even if one recognizes the abstract problem posed, the resources available still can strongly affect the method used to solve it. For instance, in math class at school students have pencil and paper. They write numbers down and rely on algorithms defined over the inscriptions they create. To multiply two numbers they line them up and use one of the multiplication algorithms. The same

holds for division and determining ratios. Without pencil and paper, however, techniques and methods usually change. Even mathematicians might prefer to think with local artifacts if faced with a problem that is cumbersome to solve in their head.

The upshot is that though a task analysis may be important to determine the success conditions of different approaches, and indeed necessary to explain why they work, such analyses seem remote from a process theory. A psychological theory ought to explain the many phases and dynamics of the problem-solving process: how one sees a problem; why one sees it that way; and how one exploits resources, interacts with resources, and solves the problem in acceptable time. The bottom line, for the moment, is that how agents frame a problem, how they project meaning into a situation, determines the resources they see as relevant to its solution. If street vendors frame the "How much for ten of these?" problems in terms of today's price for three and today's price for one, then they prime a set of tools of thought distinct from those they learned in school. They do not look at the problem deeply. As work on transfer has shown, they stay on the surface, interpreting the problem in superficial ways.

2.2. Registration

As important as it is to extend the theory of problem solving to explain how problems are framed, this way of structuring problem solving still seems to locate the real part - the solving part - after a task has been represented as a problem space. It still assumes that heuristic search, in one form or another, is the driving force in problem solving, and that expertise is substantially about acquiring the right heuristics, metrics, and generalizations of cases, as if framing is just a way of preparing for problem solving, not of solving it.

There are two reasons to question this clean account. First, creating a problem space may be a highly interactive process of framing, representing, exploring, reframing, and rerepresenting, so that reformulation is a key part of problem solving. Framing, therefore, does not occur once, with problem solving beginning afterward; the two are often intertwined. Second, even when a subject is searching a problem space the search process is complicated by the need to continually anchor the search space in locally meaningful ways. Search itself is an interactive process that should not be reduced to internal symbol manipulation.

To appreciate these points, it is illuminating to contrast the concepts of registration and translation. In mapping a game of fifteen back into tic-tac-toe, we perform a translation. We similarly perform a translation when we map a word problem (Mary is two inches taller than Peter who is...) into a simple algebraic statement, or puzzles and games (nine dots, chess) into searchable graphs, or a problem in Euclidean geometry into a problem in analytic geometry using Cartesian coordinates. The value of the mapping is that the new representation offers another perspective with different methods and techniques, often simplifying problem solving. But the mapping process links two representations, or representational systems, and that is the key thing. Well-defined entities or relations in one representational system are mapped onto well-defined entities or relations in the other.

By contrast, when we orient and reorient a city or mall map to determine how the representations of buildings, pathways, and openings correspond with the buildings, pathways, and openings in the actual space, we are registering the map, not translating it, because we are trying to match up discrete representational elements in the two-dimensional map with nonrepresentational and often nondiscrete elements in the three-dimensional world, the arena where we perform physical actions. This means that much of problem-solving acumen, when registration is involved, may lie in knowing how to link representations (whether internal, external, or distributed over the two) with entities, attributes, and relations in the physical domain.

Examples of registration-heavy problems often arise when something goes wrong during practiced activity, when the normal method we use to get something done fails, and we are thrown into problem-solving mode to figure out how to recover. Cooking, cleaning, driving, shopping, assembly, and construction are all everyday domains where problems typically arise when there is a breakdown in normal activity. Much of what makes such problem solving hard is that the agent is not yet sure what to attend to: what events, structures, or processes to see as relevant. Every person has many frames for thinking about things, but which are the ones that fit the current situation?

Here is a trivial example, computer cases. Computer companies regularly devise new ways to open and close their cases and it can be surprisingly difficult to determine how to get inside a computer without first checking the manual. Problem solving consists of pushing or pulling on pieces, scrutinizing the case for telltale cues, for clear affordances or explicit indicators. It might be said that this activity is a form of external search. But more likely it is a form of registration: of trying to discover a pattern of cues that can be fit to a method we already know or to a mechanical frame that will make sense of the release mechanism. This is a form of registering because much of the reasoning involved in determining how to open the case is tied to exploring the affordances of the case and looking for ways to conceptualize or reconceptualize its different parts. In fact, in many everyday problems, the registration phase is more complex than the search phase.

In some instances, the phases of registration and search are virtually impossible to separate. We can rationally reconstruct problem solving so that there are distinct phases, but in fact the actual process is more interactive. This is especially true of wayfinding with a map. We can, if we like, describe map use sequentially: first, orient the map with immediate landmarks to establish a correspondence; second, determine current position; third, plot a route; then, fourth, follow it. But discovering and following a route is typically interactive: look at the map, look at the surroundings, locate oneself, interpret map actions in physical terms, and repeatedly do this until the goal is in sight.

Navigational capacities depend on continually linking symbolic elements in the map to physical referents in the space. These referents serve as anchors tying the map down to the world so that a trajectory in the map can be interpreted in terms of visible structures in space. Thus, in a shopping mall we look for signs and arrows pointing to the food court or stores of interest to help us figure out where we are. We interactively make our way, often by working off the physical setting rather than the map. But when plotting a course we use all these cues to help us orient and make sense of the path we devised using the map. Subjects go back and forth between map and world. The reason this constant anchoring has not been a major issue is that in games such as chess, Tower of Hanoi, and tic-tac-toe, in math problems, and other verbally stated problems, interaction is focused on a spatially constrained representation: the chess board, the tower, the formulation of the problem. This sustains the illusion that problem solving is primarily a matter of controlling operators in an abstract representation.

The upshot is that in many naturally arising problems the locus of difficulty may lie as much in the registration process, the activity of selecting environmental anchors to tie mental or physical representations to the world, as it does in searching for paths in the representation itself.

3. Interactivity and Epistemic Actions

3.1. People Solve Problems Interactively

In most problem-solving situations people do not sit quietly until they have an answer and then announce it all at once. They do things along the way. If it is a word problem (John is half as tall as Mary...), they mutter, they write things down, and they

check the question several times. If they are solving an assembly task (here are the parts of a bicycle, assemble it), they will typically feel the pieces, try out trial assemblies, and incrementally work toward a solution. Rarely does anyone work out a complete solution in their head and then single-mindedly execute it. People, like most other creatures, solve things interactively in the world.

The classical approach to problem solving failed to adequately accommodate this in-the-world and not just in-the-head interactivity in two ways. First, the classical theory, in its strictest form, assumed that users completely search an internal representation of their problem before acting. Heuristics were proposed as a mechanism for reducing the complexity of this internal search so that solutions could actually be found. They were not meant to help a user figure out the next single action to perform; they were meant to help a user figure out a whole plan, an entire sequence of actions. This "Make a plan before you act" hypothesis was derived from Miller, Galanter, and Pribram (1960), and not surprisingly was repeatedly challenged in the planning literature, both in AI and in psychology.

The second way interactivity was misunderstood was that it was never seen as a force for reshaping either the search process or the problem space. [...] theorists were quick to appreciate the value of incorporating sensing and [...] into planning. But most AI planners incorrectly assumed that any actions that users perform in the world during problem solving are either

• External analogues of internal search - instances of searching in the world instead of in the head, or
• Stepwise execution of an incomplete plan - starting to implement a partial solution (or plan) before having the complete one in mind - then replanning in light of the resultant world state.

External interactivity was never (or rarely) seen as a mechanism for reducing the complexity of a problem, or as a mechanism for exploring the structure of a problem or as a way of engaging other sorts of behaviors that might help subjects solve their problems. External activity was still related to search, one way or another. A theory of situated problem solving should give the principles of more interactive approaches.

3.2. The Role of External Representations

Although Simon and other exponents of the classical theory never accepted the centrality of interactivity, they took an important step forward when they began paying more attention to the role external representations play. Larkin and Simon (1987) enlarged the orthodox account to allow problem states to be partially encoded internally and partially encoded externally. Accordingly, to solve a geometric or algebraic problem, subjects might rely on applying operators to external symbols, equations, illustrations of geometric figures, and so forth. Instead of representing the transformations of the equation 2x + 4y = 40 in one's head, as in mental representations for 2(x + 2y) = 40 followed by x + 2y = 20, Larkin and Simon showed that it might be easier to generate such representational states in the world and track where one is both mentally and physically to decide what to do next.

The special value of external representations is obvious in visual problems, such as tic-tac-toe. Vision is a computational problem that terrestrial animals devote huge neural resources to solve. In tic-tac-toe, the computational cost of evaluating the consequences of a move can be borne by the visual system, which exploits parallel and highly efficient methods to project the outcomes of placements. This makes it easy to see that one or another move is pointless. For instance, in Figure 15.4a, given a visual display of the board, it is obvious where O must move to prevent immediate defeat. Choice can be made without serious consideration of other moves. Solving the same problem algebraically, as in the game of fifteen, or solving the same problem in one's head, especially more complex versions (see

Figure 15.4b) is significantly harder. It is both cognitively easier and computationally simpler to use the external representation than an internal representation.

Figure 15.4. In harder versions of tic-tac-toe it is nearly impossible to determine the best moves without a visual representation of the board. Although the board state of 15.4a can be generated internally, and players can easily play without looking at the board, the cost of sustaining such states in mind increases with the complexity of the mental image or representation. In 15.4b, the five-by-five board is much less easy to keep in mind. To win it is necessary to get four in a row anywhere. As the structure of the problem increases and the complexity of the current state rises, vision pays off. The interactive nature of vision scales better with board complexity.

In treating problem solving as a process that may be partly in the mind and partly in the world, the classical view took a big step toward a more situated perspective. But the assumed value of external representations lay in the increased efficiency of applying visual operators and the stimulating role external representations can play in search. Using external representations was not seen as forcing a revision of the way problem solving unfolds. In particular, a concern with external representations did not lead to a discussion of other ways external resources figure in problem solving. Let us consider the role of external representations more closely.

Expanding the search space. In an elegant demonstration, Chambers and Reisberg (1985) showed that how people visually explore an external representation is often different from how they explore a mental image of the same thing (see Figure 15.5). If you look at a Necker cube for half a minute or more your interpretation will almost certainly toggle, and the surfaces you see as front and back will swap places. That is, if you first see A as the front face and B as the back, then after a short while you will see B as the front and A as the back. This swapping of faces, this reinterpretation of the figure, does not occur in mental images. A mental image is an intentional object and as such must be sustained under an interpretation. If a mental image were conceptualized as an organized set of lines not yet interpreted as a cube, then new mental images might arise from thinking about the image. But if it is maintained as a unified object - a gestalt - it will not toggle unless deconstructed into constituent lines. People rarely deconstruct their mental images.

Figure 15.5. The Necker cube and other visually ambiguous structures have more than one interpretation, which subjects discover after a short while. When the cube is visually before them, they scan the edges, propagating constraints. But when the same image is imagined, the ambiguity of the figure goes undetected. It is grasped as a whole, without the need of "mental saccades" to test its [...] and sustain it.

What if the same cognitive limitations apply to internal problem spaces? What if visual operators can explore parts of a search space more broadly than internal operators acting on mental representations of the same structure? This would suggest that certain problems might be solved only when they

are represented externally in figures or symbols. This would be a curious form of cognitive set.

In some of his work, Simon came close to making this point. In several coauthored articles he considered the role that auxiliary representations play in helping subjects solve algebraic word problems. He found that students used diagrams to detect additional assumptions about the problem situation that were not obvious from the initial algebraic encoding of word problems (Simon, 1979). Although such additional assumptions could in principle be discovered from the algebraic encoding alone, subjects found the cost of elaboration too great. Discovering such assumptions without diagrams requires far too much inference.

Given the interest in problem isomorphs at the time, proponents of the (revised) classical view did not regard the value of external representations as grounds for shifting the focus of problem-solving research away from heuristic search in problem spaces, to replace it with a study of how subjects interactively engage external resources. It was instead seen as further evidence that how subjects interpret problems affects their problem-solving behavior. It nicely fit the problem isomorph literature showing that surface representation strongly determines problem-solving trajectory.

In some respects it is surprising that the classical theory did not embrace interactivity at this point. The idea that it is useful, and at times necessary, to transform problems into different representational notations or representational systems is a truism in problem solving in math, and in science more generally. Every representational system makes it easy to represent certain facts or ideas and harder to represent others. What is explicit in one representation may be implicit in another (Kirsh, 1991, 2003).¹ For example, in decimal notation it is trivial to determine that 100 is divisible by 10. But when 100 is represented in binary notation as 1100100 it is no longer trivial. This holds whether we translate 10 into its binary equivalent, 1010, or keep it in decimal notation. The decimal notation is better than binary for certain operations, such as dividing by 10; the binary is simpler for other operations, such as dividing by 64. The proponents of the classical view certainly knew this and often discussed the importance of problem representation, but they never took the next step.

Their appreciation of the importance of diagrams, illustrations, and word formulations for problem-solving performance never led to a major departure from the problem-space, task-environment approach. Externalization was not seen as establishing a need to shift focus from problem spaces to affordances or to the cues and constraints of external structures. It never led to a revised concern for observing what people actually do, in an ethnographic sense, when they solve problems.

3.3. Adding Structure to the Environment

When we do look closely at the range of activities that people perform during the course of solving or attempting to solve problems, we find many things that do not neatly fit the model of search in an internal or external problem space.

For instance, in their account of subjects playing the computer game Pengo, Agre and Chapman (1987) discussed how a computer program, and by [...], could exhibit planned behavior without search. Their program worked interactively. It used a set of simple rules to categorize the environment in a highly context-sensitive manner. The environment that Pengi - the name of the computerized penguin that Pengo players were attempting to control - inhabits consists of blue ice-blocks distributed in a random arrangement. Pengi starts in the center and the villains of the game, the sno-bees, set off from the corners. If Pengi is stung it dies, but it can defend itself by kicking an ice-block directly in front of a sno-bee, thereby crushing it. If Pengi was chasing a bee, Agre and Chapman's system would classify the structural arrangement of the ice-blocks one way. If Pengi was running away from a bee, the system would classify the very same arrangement of ice-blocks another way. To achieve this difference in classification the system attached visual markers - projections or annotations - to certain parts of the situation. Thus, on two occasions, the same objective state might be classified in two ways depending on which of Pengi's goals were active, because the world would be visually annotated differently (see Figure 15.6).

Agre and Chapman then showed that with these visual markers the computer program was able to behave in a strategic manner using a few reactive rules. It was not necessary to search a problem space as long as the system could project additional representational structure onto the visible environment. The implication was that humans work this way, too. By performing certain types of visual actions, including actions that affected visual memory, humans are able to solve, without search, problems that are complex and that on a priori grounds ought to require extensive search.

This tactic of adding structure, either material or mental, to the environment to simplify problem solving is surprisingly pervasive. People mentally enrich their situations in all manner of ways. To help improve memory there are strategies such as the method of loci, which involves associating memory items with spatial positions or well-known objects in one's environment. To improve performance in geometric problem solving, people project constructions, mental annotations (see Figure 15.7).

People have even more diverse ways of materially enriching their situations. They add reminders, perhaps with Post-it Notes, perhaps by rearranging the layout of books, papers, desktop icons, and so forth. They annotate in pen or colored pencil; they encode plan fragments in layouts; they keep recipes open; and of course they talk with one another, often asking for help or to force themselves to articulate their ideas, using their voice as an externalized thought. Indeed, it is widely accepted that the act of collecting one's thoughts to present a problem to another person is an effective method to identify and clarify givens, to articulate problem requirements, and to expose constraints on problem solutions or solution paths (Brown, Collins, & Duguid, 1989). Some of this facilitation, no doubt, occurs because different representations elevate different aspects of a problem. But some of it, as well, is due to the known value of talking out loud during problem solving (Behrend, Rosengren, & Perlmutter, 1989).

The thread common to all these different actions is that they reduce the complexity of the momentary computational problems that agents face. They help creatures with limited cognitive resources perform at a higher level.

Here is another, more prosaic example. In a card game, such as gin rummy, players tend to reorganize their hand as they play. Reorganizing a hand cannot change the value of current cards or the value of subsequent cards. Whether or not an ace of spades will be a good card to accept and the three of clubs a good card to throw away is unaffected by the way players lay out the cards in their hand. The objective problem state of the hand is invariant across rearrangement. Yet from a psychological perspective, rearrangements help players to notice possible continuations and to keep track of plans. Thus, the strategy "sort by suit, then sort in ascending order across suit" is an effective way to overcome cognitive set, or continuation blindness. This simple interactive procedure effectively highlights possible groupings. It is an epistemic activity. The knowledge dividend it pays exceeds its cost to an agent in terms of time and effort (Kirsh, 1995b).

3.4. Epistemic Actions

In a series of papers Kirsh (1995a, 1995b; Kirsh & Maglio, 1995) has argued that this sort of epistemic activity is far more prevalent than one might expect. Even in contexts where agents must respond very quickly it still may be worth their while to perform epistemic actions. For instance, in the arcade game Tetris, the problem facing players is to

Figure 15.6. Here we see the game board of Pengo as a player sees it and a blowup of one section of a game where Pengi projects visual markers, indicated by dots, to allow it to act as if it has a rule "Kick the 'block-in-front-of-me' to the 'block-in-its-path.'" The dot markers serve as indexical elements that Pengi can project so that what it currently sees has enough structure to drive the appropriate reactive or interactive rules it relies on to determine how to act. Some of these rules tell Pengi to add visual structure and others to physically act.
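The idea that one board state can be classified differently depending on the active goal can be made concrete with a toy sketch. This is not Agre and Chapman's program: the one-row grid encoding, the marker functions `block_in_front_of_me` and `bee_in_its_path`, and the action names are all assumptions made for illustration.

```python
# Toy illustration only; the grid encoding and rule names are hypothetical.
# '.' empty cell, '#' ice-block, 'P' Pengi, 'B' sno-bee.

def find(grid, ch):
    """Return (row, col) of the first occurrence of ch, or None."""
    for r, row in enumerate(grid):
        c = row.find(ch)
        if c != -1:
            return (r, c)
    return None

def in_bounds(grid, r, c):
    return 0 <= r < len(grid) and 0 <= c < len(grid[0])

def block_in_front_of_me(grid, heading):
    """Marker: the cell next to Pengi along its heading, if it holds an ice-block."""
    r, c = find(grid, "P")
    r, c = r + heading[0], c + heading[1]
    if in_bounds(grid, r, c) and grid[r][c] == "#":
        return (r, c)
    return None

def bee_in_its_path(grid, block, heading):
    """Marker: the first occupied cell beyond the block, if that cell is a bee."""
    r, c = block[0] + heading[0], block[1] + heading[1]
    while in_bounds(grid, r, c):
        if grid[r][c] == "B":
            return (r, c)
        if grid[r][c] != ".":
            return None  # something else blocks the slide
        r, c = r + heading[0], c + heading[1]
    return None

def react(grid, heading, goal):
    """Pick an action from markers alone; no search over future states."""
    block = block_in_front_of_me(grid, heading)
    if goal == "attack":
        # The block is classified as a projectile.
        if block and bee_in_its_path(grid, block, heading):
            return "kick"
        return "reposition"
    # goal == "flee": the very same block is classified as an obstacle.
    return "turn" if block else "run"

board = ["P#..B"]
east = (0, 1)
print(react(board, east, "attack"))  # kick
print(react(board, east, "flee"))    # turn
```

The same `board` yields different actions under the two goals because the markers, not the raw state, drive the rules.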

decide how to place small tetrazoidal shapes on a contour at the bottom of the board (see Figure 15.8). As the game speeds up, it becomes harder for players both to decide where to put the shapes and to manipulate them via a keyboard to put them into place. What Kirsh and Maglio (1995) found was that players, even expert players, regularly performed actions that helped them to recognize pieces, verify the goodness of potential placements, and test plans (e.g., dropping a piece from high up on the board) despite there being a cost to their actions in terms of superfluous moves. Evidently the cost of moving a piece off its optimal trajectory was more than compensated for by the benefits of simplifying some aspect of the cognitive problem involved in identifying the piece and determining its best resting place. The novel feature of these epistemic actions is that their value depends crucially on when they are done. Because the game is fast paced, information becomes stale quickly. Twirling a "zoid" the moment it enters the board is a valuable action, but it is near useless to an expert 200 ms later. Twirling must be timed to deliver information exactly when it will be useful for an internal computation. From a purely problem-space perspective, where states and operators have a timeless validity, there is no room to explain these time-bound actions.

Even though epistemic actions help agents to solve problems, and they can be understood as facilitating search by increasing the speed at which a correct problem representation can be created, they are not, on the classical view, part of problem solving, and they do not lie on a solution path. They help to discover solutions, but for some reason they are not part of the solution path.

Figure 15.7. ... bisecting angles, and generally adding mental annotations. In this example, an agent is shown a rectangle and asked to prove that a line that cuts a rectangle in half will also cut all diagonals of the rectangle in half. In those cases where subjects do not draw in the diagonal and bisector, they still can often find a solution by projecting first a diagonal and then a section line that bisects the rectangle. It is possible to add letters to projected points such as the intersection of the dotted bisection and the solid diagonal. This same capacity can be used in creative ways to add structure to other sorts of situations by envisioning the results of performing actions before performing them.

Figure 15.8. In Tetris the goal of play is to relentlessly fill gaps on the bottom layers so as to complete rows. The game ends when the board clogs up and no more pieces can enter. Because there are mirror pieces, and the choice of where to place a Tetris piece is strategic, players need many hours of practice to become expert. We found that players often perform unnecessary rotations to speed up their identification process, especially among pieces with mirror counterparts (Kirsh & Maglio, 1995).
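Why mirror counterparts are hard to tell apart can be shown with a small sketch. It checks that no rotation turns an L-shaped piece into its mirror image, so the two have to be told apart by handedness. The cell coordinates and the names `L_PIECE`, `J_PIECE`, and `T_PIECE` are assumptions made for illustration. The sketch models only the geometry, not the psychological benefit of twirling reported by Kirsh and Maglio.

```python
# Pieces are sets of (row, col) cells; the coordinates below are illustrative.

def normalize(cells):
    """Translate cells so the smallest row and column are both 0."""
    min_r = min(r for r, _ in cells)
    min_c = min(c for _, c in cells)
    return frozenset((r - min_r, c - min_c) for r, c in cells)

def rotate(cells):
    """Rotate by 90 degrees: (r, c) -> (c, -r), then renormalize."""
    return normalize({(c, -r) for r, c in cells})

def mirror(cells):
    """Reflect left-right: (r, c) -> (r, -c), then renormalize."""
    return normalize({(r, -c) for r, c in cells})

def orientations(cells):
    """All distinct shapes reachable by repeated 90-degree rotation."""
    seen = set()
    current = normalize(cells)
    for _ in range(4):
        seen.add(current)
        current = rotate(current)
    return seen

L_PIECE = {(0, 0), (1, 0), (2, 0), (2, 1)}
J_PIECE = {(0, 1), (1, 1), (2, 1), (2, 0)}
T_PIECE = {(0, 0), (0, 1), (0, 2), (1, 1)}

# J is the mirror image of L ...
print(mirror(L_PIECE) == normalize(J_PIECE))                      # True
# ... yet no rotation of L ever produces J.
print(orientations(L_PIECE).isdisjoint(orientations(J_PIECE)))    # True
# A symmetric piece such as T has no distinct mirror counterpart.
print(mirror(T_PIECE) in orientations(T_PIECE))                   # True
```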

Figure 15.9 (panels a and b). The consequence of thoughtful redesign is that tasks that were error prone and hard to manage become progressively easier. The two interfaces are small environments - tools, in a sense - that agents must thoughtfully use to complete a task. In this case the task is to set the font style of a region of text. By redoing the graphic layout, as in 15.9b, designers are able to lower the costs of planning what to do, monitoring what one is doing, and verifying that one has completed the task. If this task were treated as a problem to solve, the effect of redesign is that it is now easier, even though the task environment is the same.

The challenge epistemic actions pose to the classical approach, and to the psychological study of behavior more generally, is that without an analysis of the possible epistemic functions of an action it may be nearly impossible to identify the primary function of an action and so label it correctly. Actions that at first seem pragmatic, and so to an observer may seem to be an ordinary move, may not be intended by the player to be an ordinary move. Their objective may have been to change the momentary epistemic state of the player, not the physical state of the game. This important category of problem-solving activity lies outside the classical theory because the classical theory operates with a fixed set of actions and a fixed evaluation metric. Both action repertoire and metric are taken as objective features of the task environment and assumed insensitive to the momentary epistemic state of the agent.

4. Resources and Scaffolds

The classical theory also fails to accommodate the universality of cultural products that facilitate activity-specific reasoning. Environments in which people regularly act are laden with mental aids, so problem solving is more a matter of using those aids effectively. Supermarkets tend to display competitive items beside one another to help shoppers choose; large buildings tend to have signs and arrows showing what lies down a corridor because people need to find their way; microwaves and ovens have buttons and lights that suggest what their function is because people need to know what their options are; and most of our assembly tasks take place when we have diagrams and instructions. Wherever we go, we are sure to find artifacts to help us. Rarely do we face knowledge-lean problems like the Tower of Hanoi or desert-island problems. Our problems arise in socially organized activities in which our decisions and activity are supported.

There are several reasons why cultural products and artifacts saturate everyday environments. First, most of the environments we act in have been adapted to help us. Good designers intentionally modify environments to provide problem solvers with more and better scaffolds and resources - designs that make it easier to complete tasks (see Figure 15.9 and caption). And if designers are not the cause of redesign, it often happens anyway, as a side effect of previous agents leaving behind useful resources after having dealt with similar problems. Rarely are we the first to visit a problem.

A second reason resources and scaffolds almost always exist in our environments is that, as problem solvers, we ourselves construct intermediate structures that promote our own interactive cognition. As already mentioned, we create problem-aiding resources such as illustrations, piles, annotations, and notes that function like cognitive tools to help us to understand and explore questions and coordinate our inquiry. When we face problems, we typically have a host of basic resources at hand: tools such as pen, paper, calculators, and rulers; manipulables such as cans, cups, and chopping boards; cultural norms of gesture and style; and, of course, cultural resources and practices such as tricks for solving problems, techniques, algorithms, methods of illustration, note taking, and so on. When we attempt to solve a problem, we reach for these aids or call on tricks and techniques we have learned. Many of our solutions or intermediate steps toward solution take the form of actions on or with these materials. They seed the environment with useful elements (e.g., lemmas), they make it possible to see patterns or see continuations (e.g., the lines in tic-tac-toe), or they make it easier to follow rules (e.g., the lines in multiplication; see Figure 15.10).

Figure 15.10. Multiplication problems are rarely solved in the head. Problem solvers reach for pen and paper, line up the numbers in a strict manner, and then produce intermediate products that are then relied on. Paper, pen, graph paper, and lines are all scaffolds that help agents to reduce error. In this figure we see three ways to multiply. The most noteworthy feature of these structures is the lines that guide the user. They are not elements of the algorithm. They are scaffolds indicating where the multiplicands end and the intermediate products of the problem solver are to be placed, and so on. They help agents stay in control, keep on track, monitor where they are, prevent error.

A final source of resources and scaffolds is found in our neighbors or colleagues who are often willing to give us a helping hand, offering hints, suggestions, tools, and so on. In educational theory, the term scaffolding refers to the personalized problem-solving support that an expert provides a novice. Help is too simple a term to describe this support because a good teacher gives a student tools, methods or method fragments, and tricks about technique that the student is ready to absorb but that, taken on their own, do not provide an answer to the student's impasse. Scaffolds extend the reach of a student, but only if the student is in a position to make use of them. This was a key aspect of the way Vygotsky, the author of the term, used it. In his view scaffolds are support structures in learners' immediate environment that might permit them to solve problems that are at the periphery of their problem-solving ability, problems that reside in what he called the "zone of proximal development." They are akin to hints. And intrinsic to this notion is the assumption that as soon as the student internalizes the requisite methods, norms, heuristics, and construction skills, scaffolding will be unnecessary and no longer found in the problem-solving situation. Workers put up scaffolds to help them reach parts of a building they cannot otherwise reach. But as soon as they have finished their job, they remove the scaffold. Training wheels are a classic example of a scaffold in the learning literature. Other examples include the use of vowel markers in beginners' Hebrew that are omitted in normal Hebrew writing because context and knowledge make them redundant.

Outside of learning theory, the term scaffold is used to refer to the cultural

resources - artifacts, representations, norms, policies, and practices - that saturate our everyday work environments and that remain even after we have internalized their function. They reflect the supports present in most work environments. The majority of our problem-solving abilities evolved in these resource-heavy environments; they rely on those resources being there.

Take the case of geometric problems. Students learn a variety of ways to solve such problems but most involve constructing illustrations. Part of a student's problem-solving competence consists in the ability to create apt illustrations and then to use them, not just to understand the problem but to work inside, annotating them, to solve the problem. In Figure 15.11, we see an example. Rather than tackle the problem algebraically, an easier approach is to modify the problem-solving environment so that it supports a range of different actions that the user finds easier to control and work with. As discussed earlier, these representations have a structure that makes the relevant attributes of the problem easier to manage.

Figure 15.11. The written or formal statement of a geometric problem is almost impossible to understand without translating it first to a geometric diagram. In proving KL = BL, virtually all problem solvers use the figure as an arbitrary version of the problem. Students find it simpler to reason using the figure because it is easier to notice visual or geometric relations between properties in this problem than it is to notice algebraic relations. The validity of the proof depends on universal generalization: a proof method where one assumes an entity, an isosceles triangle of certain dimensions; one then proves a key statement about this triangle; and then one shows that nothing in the proof depended on having chosen this particular triangle as the example. The proof generalizes to all right-angled isosceles triangles.
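The claim in the Figure 15.11 problem can be spot-checked numerically before one attempts a proof. In that problem ABC is right-angled and isosceles at C, D lies on AC and E on BC equidistant from C, and perpendiculars to AE from D and from C meet AB at K and L. The sketch below is a check on sample triangles, not a proof. The coordinate placement (C at the origin, A on the x-axis, B on the y-axis) and the function names are assumptions made for illustration.

```python
import math
import random

def meet_hypotenuse(p, a, d):
    """Intersect the line through p perpendicular to AE with AB.

    With A = (a, 0), B = (0, a), E = (0, d): AE has direction (-a, d),
    so (d, a) is perpendicular to it, and AB is the line x + y = a.
    """
    s = (a - p[0] - p[1]) / (a + d)
    return (p[0] + s * d, p[1] + s * a)

def kl_and_bl(a, d):
    """Return the lengths KL and BL for leg length a and CD = CE = d."""
    B, C, D = (0.0, a), (0.0, 0.0), (d, 0.0)
    K = meet_hypotenuse(D, a, d)
    L = meet_hypotenuse(C, a, d)
    return math.dist(K, L), math.dist(B, L)

random.seed(0)
for _ in range(5):
    a = random.uniform(1.0, 10.0)
    d = random.uniform(0.1, a)      # D and E lie between C and A, C and B
    kl, bl = kl_and_bl(a, d)
    print(f"a={a:.3f} d={d:.3f} KL={kl:.6f} BL={bl:.6f}")
```

In these coordinates both lengths come out as ad√2/(a + d), which is the algebraic route the text contrasts with reasoning on the diagram.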

Formal Problem. ABC is an isosceles right-angled triangle with the right angle at C. Points D and E, equidistant from C, are chosen arbitrarily on AC and BC. Line segments from D and C are perpendicular to AE and meet the hypotenuse AB at K and L. Prove that KL = BL.

The presence of resources and scaffolds in an environment does not, in itself, challenge the view that people solve problems by searching through a problem space that is distributed over structures in the world and structures in the head. But it does call into question how to formulate the problem that people are solving. If the resources at hand change, or are different from situation to situation, why assume the problem is the same? Does a student who uses algebraic techniques solve the same geometric problem as his friend who uses an illustration? Do they operate in the same task environment? A task environment, after all, is defined with respect to operators too. How about a student who uses a pocket calculator to determine the square root of 1600 versus her friend, who calculates the value by hand: is it the same task environment? Tools, scaffolds, and resources seem to interact with tasks, usually changing them, often in ways that reduce their complexity.

5. Special-Case Solutions

A favorite pastime of situationalists is to enumerate instances in which resources in the environment have been put to creative use to transform the complexity of problem solving. No discussion of situated problem solving would be complete without

mentioning the infamous example, "The intelligence is in the cottage cheese" (see Figure 15.12).

Figure 15.12. To solve the problem of how much cottage cheese two-thirds of three-quarters of a cup is, a subject was observed to turn over a one-cup tub of cottage cheese, crisscross it to mark four quarters, remove one quarter, then take two of the three remaining quarters, which is one-half. It would have been easy enough to multiply 2/3 by 3/4, but typical subjects require pencil and paper to do that and regard the task to be effortful and confusing. The cup-sized cottage cheese itself became a thing to think with.

Although the method shown in Figure 15.12 is certainly simple, it is worth noting just how narrow it is. It is only because the daily allotment of cottage cheese is 3/4 of a cup, and tubs are exactly 1 cup (or because consumers have a measuring cup on hand), that it is possible to execute the algorithm and use the visual cues provided by the crisscrossing. This is a very special case. Weight Watchers dieticians might have allowed 7/16 of the daily portion of cottage cheese as the lunch portion. And they might have decided that the daily allotment was 3/5 of a cup. But why would they? Problems and the environments in which they are solved are rarely independent. If for some reason 21/80 of a cup became important among a subset of consumers, how long would it be before manufacturers would change the tub size to make such calculations easy?

This highlights a key fact about situated problem solving: it is usually narrow, special-case-oriented, and shallow. The reason such approaches work is that most instances of naturally occurring problems, even problems that are theoretically hard, are confined to those versions of the problem that can be solved easily. The ones that resist simple methods are part of a small set of worst-case versions (technically known as the complexity core) that people rarely if ever encounter. On those improbable occasions where they do encounter an instance of this core, they usually do poorly.

Here is an example. The traveling salesman problem is computationally intractable in the general case. There exist some nasty problem instances - the worst cases, the complexity core - where the only way to determine an optimal tour among all the cities the salesman must visit is by checking every conceivable tour. Because a tour consists of visiting every destination exactly once, there are as many tours as sequences of destinations, and that is n factorial tours. For ten cities that means checking 10! or 3,628,800 tours.

Yet in practice, most of the problem instances a traveling salesman actually faces are not nasty. Indeed, depending on the road layout of a given sales region, it may be quite easy to compute an optimal tour (see Figure 15.13). Would a salesman well adapted to his specific region learn a slow general algorithm for solving the problem in all cases or a fast special algorithm that satisfactorily solves just his customary problems? Who would be better adapted to his task? And, if there happen to be one or two worst-case problems lurking in his sales region where his method gives him a suboptimal tour, the overall cost of traveling that imperfect tour one day a month is more than compensated by the benefits of having cheaply found optimal or near optimal tours the rest of the times he has gone out.

Findings in complexity theory justify this perspective. Hard problems are almost always easier if a second-best answer will be adequate much of the time. More precisely, a problem that is NP-complete (or worse) can usually be solved in polynomial time if it is acceptable to give a solution that is 1 - ε from the perfect answer (Papadimitriou & Yannakakis, 1988).¹ The more tolerant one is about the precision of the answer - that is, the further from the perfect answer one can tolerate - the faster the algorithm. So if our traveling salesman does not require having the absolutely shortest path but will

settle for one that he knows is close to the shortest, then he can have a quick, reliable method of finding that path. And the same benefits apply if he can tolerate a little uncertainty in his answer. Thus if he can accept being right at least 99 percent of the time, or more precisely, have confidence 1 - ε that his answer is optimal, then a fast algorithm exists (Karp, 1986). The reason such algorithms exist is that in all problems there is a large class of instances where there is structure in the problem that can be exploited. In Figure 15.13, the structures are apparent. In Figure 15.12, the cottage cheese case, the structure is the obvious relation between the numerator and denominator of the fractions 2/3 × 3/4, 1/2 × 2/3, 3/4 × 4/5, and so on. An agent does not have to know about this structure or understand it to rely on algorithms or heuristic methods that are valid in virtue of this structure. But an agent who is ignorant of the relevance of this structure will never know whether she is currently confronting an instance of the hard core or why her method sometimes fails.

Figure 15.13 (panels: Easy Tour, Hard Tour). Compare these two instances of the traveling salesperson problem. Why is the easy tour easy to compute and the hard tour hard? Because the easy tour lies on a convex hull. If we know this fact, then we can use a fast algorithm based on the truth of that assumption. If there is no convex hull, or if we have no reason to think that there is, then there is no fast algorithm to guarantee an optimal tour. In many cities, roads are laid out on a grid structure, which makes the determination of optimal tours more like easy than hard tours.

The adequacy of special-case solutions suggests that, in general, agents operate with an understanding of their problems that is good enough for the cases they normally encounter. They conceptualize their context rather differently than formalists, who see the problem being solved as an instance of a more general mathematical, logical, or planning problem. This raises a hard problem for formalists. How should the conceptualization people actually work with - the conceptualization that somehow figures in the problem space an agent creates - be matched against the real task environment associated with the problem in its more general form?

The answer to this question has deep implications for problem-solving methodology. As empiricists we ought to accept that it is an empirical question which of the many possible task environments that might fit a problem is the actual task environment that problem solvers must adapt to. It is not to be answered in ignorance of the frequency of the problem instances real problem solvers face. Presumably, it is misleading to choose too general a task environment if, in fact, problem solvers never face more than a small subset of the problem instances that fit the general task specification. That would be a bad framing of the problem.

For example, the task environment for the Tower of Hanoi, in most renderings of the task, is structurally the same whether it applies to towers made with thirty disks or to those made with three disks. From a mathematical point of view the task environment is the same because a recursive algorithm is used to generate the formal structure of the problem. Just as we treat multiplication of two thirty-digit numbers to be the same basic task as multiplication of two two-digit numbers, so it might seem natural to treat solving thirty-disk problems to be the same basic task as solving three-disk problems. Of course the actual problem space is much larger in the thirty-disk case than in the three-disk case because there are so many more combinations to consider. But why should that matter? If we think that subjects solve Tower of Hanoi problems with a recursive algorithm, it ought to be the same whether they face thirty disks or three.

Yet people who solve the three-disk problem often fail on larger problems, even ones as small as seven- or eight-disk problems. They usually have trouble keeping track of where they are in the problem; they have trouble maintaining current state. At some point they need a separate strategy to stay on course, and typically this is to use paper or some other way of encoding state. Can we pretend that this auxiliary strategy is not part of their problem-solving method for larger problems?

This raises the question: Do they face the same problem in both cases but use different methods because of memory and computational limits? Or do they face different problems because small and large versions of the Tower of Hanoi are actually different problems given the resources and methods subjects have? The answer depends on why we think it useful to invoke a task environment. The notion of a task environment was introduced to explain what rational agents adapt to as they get better at solving a problem. Namely, they adapt to the problem's structure, to the cues and constraints on paths to a solution. The paradox of large problems is that rational agents never do adapt. They cannot use the algorithms they use for smaller instances of the "same" problem, at least not in their head. And when they use paper, they perform many other actions related to managing their inscriptions that have nothing to do with the core algorithm. Why, then, maintain that the language of task environments is helpful? What does it predict? It does not predict the performance and pattern of errors that problem solvers display as the problem gets larger, because performance depends on the algorithms actually in use. All it can predict is how an ideally rational agent, unaffected by resource considerations, would behave.

If it is true that people do not use the same method in large and small versions of the Tower of Hanoi, why assume that they should use the same methods in other environments, such as shopping, assembly, selling, or navigation?

The upshot is that situated problem solving emphasizes that people solve problems in specific contexts. The methods they have learned are well adapted or efficient in those contexts but may be limited to special cases, not generalizable, and often idiosyncratic. Indeed, given the coevolution of settings and methods of coping, we would expect that most problem solving in well-designed environments is computationally easy, with external supports that ensure it is so. No one has argued that situated problem solving is better than other methods. It is just more like the way we think. And that was the question at issue.

6. Knowledge-Rich Cognition

Experts know a lot about their domains. Even if they cannot articulate their knowledge, they have built up methods for achieving their goals, dealing with hassles and breakdowns, finding work-arounds, and more to make them effective at their tasks. That is why they are experts.

In regarding agents in their everyday contexts to be well adapted to their contexts of work and activity, we are treating them as rich in knowledge, as more or less experts in their commonsense world. They know how to get by using the material and symbolic resources supporting action. Perhaps the difference between situated cognition and problem-space cognition is the difference between knowledge-rich and knowledge-lean cognition.

An old distinction, still useful but potentially misleading, distinguishes declarative from procedural knowledge: knowledge of facts and explicit rules from the type of knowledge displayed in skills. Typical studies in knowledge-rich problem solving focus on the dense matrix of facts, procedures, heuristics, representational methods, and cases that experienced agents bring to their tasks. This is a mixture of declarative and procedural knowledge, though presumably some of the knowledge that experts have is related to knowing how to publicly use representations, to exploit their social networks, and to use tools - skills that seem to be more embodied than can be encapsulated in a set of procedures. Because most people become experts or near experts in dealing with their everyday environments - shopping, driving, socially conversing, preparing their meals, coping with familiar

technology - they probably know enough about these domains to have effective problem-solving methods for handling the majority of problems they confront. For the few problems they cannot handle, they usually have work-arounds, such as calling friends for advice or knowing how to halt or abort a process, that let them prevent catastrophic failure.

On the classical account, a major source of the improved performance that experts display is to be explained in terms of improved search or improved representation of problems. They have better methods for generating candidate solutions and better metrics for evaluating how good those candidates are. It has also been observed that experts spend more time than novices in the early phases of problem solving, such as determining an appropriate representation of a problem. To cite one study (Lesgold, 1988), when a fund manager considers stocks to include and exclude from her portfolio, she is systematically reviewing a set of candidates. So we might start our interpretation of her problem solving with a problem space and operators. But before she makes her final decision she will have done many things, perhaps over several days, to uncover more information about stocks, and to get hints about what the Street thinks about each stock. Some of the things she might do include retrieving charts, extending and interpreting those charts, contacting analysts, reviewing their reports, examining the portfolios of her competitors, reading economic forecasts, and possibly even visiting companies.

This appreciation of the value of preparation reveals that when experts solve knowledge-rich problems they engage their environment in far more complex ways than just implementing operators. They interactively probe the world to help define and frame their problems. This suggests that deeper ethnographic studies of everyday problem solving may show a different style of activity than found in formal accounts of problem solving. Theories of knowledge-rich problem solving have become important in the literature since the 1980s. But even these accounts place too little emphasis on the centrality of resources, scaffolds, interactivity, and cultural support. There is far more going on in solving a real-world problem than in searching a problem space.

PART 3: POSITIVE ACCOUNT - A FEW IDEAS

The view articulated so far is that problem solving is an interactive process in which subjects perceive, change, and create the cues, constraints, affordances, and larger-scale structures in the environment, such as diagrams, forms, scaffolds, and artifact ecologies that they work with as they make their way toward a solution. This looks like the basis for an alternative and positive theory of how people overcome problems in concrete settings. The positive element in situated cognition is this emphasis on agent-environment interaction. All that is missing from such a theory are the details!

How and when do people externalize inner state, modify the environment to generate conjectures, interactively frame their problems, cognize affordances and cues, and allocate control across internal and external resources? These are fundamental questions that must be answered by any positive account that attempts to locate problem solving in the interaction between internal representations or processes and external representations, structures, and processes. If problem solving emerges as a consequence of a tight coupling between inside and outside that is promoted and sustained by culture and the material elements of specific environments, then we need an account of the mechanisms involved in these couplings, and ultimately, of how culture, learning, and the structure of our artifacts figure in shaping that coupling.

It is beyond the goals of this article to present a positive theory, or even a sketch of one. I will discuss, however, four areas in which adherents of situated cognition, in my opinion, ought to be offering theories - areas

in which a situationalist might constructively add value to the problem solving literature. My remarks should be seen more as suggestions or desiderata for a situated theory of problem solving than as an actual sketch of one. The four areas are as follows:

1. Hints
2. Affordances
3. Thinking with things
4. Self-cueing

It is important to appreciate that this positive approach is not meant to label other theories - more classical theories - as useless. Insight into problem solving can be found in the gestalt literature, in articles on clinical problem solving, in the study of learning and , and also in the literature on task environments just criticized. What is offered here should be seen as an addition to those literatures, part of what one day might be a more integrated approach, though the proof that such a theory is viable will require the sort of dedication to ethnography, experimental research, and model building that has so far been lacking in situated cognition.

1. Sketch of a Theory of Hints

Hints come in all forms. They are a part of the natural history of problem solving. Here are a few examples drawn from the classroom. A teacher may tell or show a student how to use a special method or algorithm ("Here's a faster way to determine square root"), give advice on framing the problem ("First state the givens and thing to be shown, then eliminate irrelevant details and distracters"), or suggest useful strategies ("Generate as many lemmas as you can in order to fill your page with potentially useful things"). A fellow student may share illustrations or models, mention an analogous problem, give part of the answer, suggest a way of thinking or representing the problem, and so on. Is there a theory that might explain why a hint is a hint? Hints represent an important element of the culture of problem solving (Kokinov, Hadjiilieva, & Yoveva, 1997).

A simple theory of hints might begin by defining a hint to be a verbal or nonverbal cue that acts like a heuristic bias on search. This would nicely fit the classical theory. Consider the hints people give in the game of I Spy - a child's game in which one player, the spy, gives clues about an object he or she observes and the other player(s) attempt to guess what the object is.

• "You're getting hotter" - metric information - your current guess is better than your last.
• "It's bigger than a loaf of bread" - constraint on candidate generation and an acceptable solution.
• "More over there" - gesture that biases the part of the search space to explore.
• "Try eliminating big categories of things first like 'Is it a living thing?'" - heuristics for efficiently pruning the space.
• "Don't ignore the color" - critical features to attend to that bias generation and evaluation.
• "Back up - abstract away from these sorts of details" - recharacterize the search space.

In thinking of hints in this way, we tie them to their role in both candidate generation and evaluation.

Situated cognition downplays search in a problem space as the determinant of problem-solving behavior. Psychological, social, cultural, and material factors are treated as more important. Nonetheless no approach to problem solving can overlook the importance of both candidate generation and evaluation. Ideas, possibilities, and possible ways of proceeding are always involved in problem solving, as is their evaluation as being fruitful, off the mark, or suggestive. Even in insight problems (Mayer, 1995), where discovering a solution requires breaking functional set - as when a subject suddenly realizes that an object can be used in a nonstandard way - it is still necessary to try out ideas of different ways of using objects. So there is always a component of generating candidate actions and testing their adequacy. What drives the generation process?

It depends on how easy it is to know that one is at a choice point and the set of actions available there. In knowledge-lean problems such as the Tower of Hanoi, there are a small number of possible moves at each step, and an agent knows what these are. The problem is discrete, candidate generation is easy, and state-space search is a good formalization of the problem, even though the agent's psychological task changes when the problem increases in size. In knowledge-lean problems the difficulty lies more in evaluation than in generation - in deciding whether a given option is a good one. Hints therefore ought to offer advice or heuristics for moving in a good direction. For instance, work toward clearing a peg completely; there are no short cuts.

In games like chess where there are many more possible actions at each choice point, a state space still captures the formal structure of the problem, because the agent knows perfectly well where the choice points are. Given how much knowledge is required for expertise it seems odd to call chess a knowledge-lean problem. In chess it is hard to see the downstream effects of action, because one's opponent is unpredictable. So it is hard to generate plausible chains of "If I do this, then you are likely to do that." The branching factor of the game is large and the space of possible continuations huge. This puts greater pressure on prudent candidate generation at each level of the search space. Accordingly, in chess, hints and suggestions often have to do with biasing generation. For instance, typical rules for opening are these: "Open with a center pawn." "Knights before bishops." "Don't move the same piece twice." "Always play to gain control of the center." Typical rules for middle game include these: "In cramped positions free yourself by exchanging." "Don't bring your king out with your opponent's queen on the board." And in the end game rules include these: "If you are only one pawn ahead, exchange pieces, not pawns." "Don't place your pawns on the same color squares as your bishop."²

In problems that do not count as knowledge-lean, or problems where it is not easy to determine when one is at a choice point and what the options are, the hardest step is often to figure out how to think about the problem, to pose it or frame it in a constructive way. Consider the problem in Figure 15.14. What should one do? Making an illustration is always a good idea, but which illustration, and how should it be annotated? The space of possible drawings and textual actions is huge. Moreover, once a sketch has been made there remains the question of how to proceed.

Flagpole problem (Figure 15.14): Two palm trees are standing, each 100 feet tall. A 150-foot rope is strung from the top of one of the palm trees to the top of the other and hangs freely between them. The lowest point of the rope is 25 feet above the ground. How far apart are the two flagpoles? (Ornstein & Levine, 1993)

Figure 15.14. Try solving the flagpole problem. Why is it hard? It is not because we cannot think of a way to represent it. We can extract the givens, the solution condition, and constraints. Making an illustration is easy, as shown here. But is this a good illustration? If you have not solved the problem, here is a hint: work in from the extremes to see how the vertical drop changes as the separation between the trees increases or decreases. What kind of hint is this?

One reason some problems are hard is that people find it difficult to escape their familiar way of proceeding, another instance of mental set. Aspects of a problem may

remind them of possible methods or method fragments, but if these seem unpromising, what is to be done? The difficulty hardly seems to lie in finding something to do. It lies in finding an interesting thing to do. Where do interesting ideas come from? Does staring at a problem help? Should the problem solver do things not directly connected with taking a step forward in the problem, such as doodling, reading more about related problems, talking to colleagues, brainstorming? How many candidate steps come from inside an agent's head and how many come from prompts from artifacts, , and so forth?

Early experiments by Gestalt psychologists have shown the value of environmental cues, such as waving a stick or setting a string in motion (Maier, 1970), that call attention to an item that might figure in a solution. But a more theoretically motivated explanation of how these cues trigger candidate generation is needed. And, once subjects do have a new line of thought to pursue, where do they get their metrics on goodness, their ability to discriminate what is an interesting avenue to follow and what looks useless? How does the environment help?

A theory of situated problem solving should explain why hints are successful and the many ways our environments offer us hints on how to solve our problems. At a minimum there is a large body of relevant data to be found in hint-giving and hint-receiving behavior.

2. Affordances and Activity

A second element in a positive theory would explain how people discover candidate steps in a problem solution by interactively engaging their environment. In classical accounts, if a task environment is well defined there is a set of feasible actions specified at each choice point. An agent is assumed to recognize choice points and automatically generate feasible actions. In situations where problems have not been well framed, however, discovering moves to consider can be challenging. In Newell's theory, SOAR (Laird, Newell, & Rosenbloom, 1987), failure to generate a feasible action creates an impasse and a new subproblem to be solved. An alternative suggestion by Greeno and colleagues (Greeno, Smith, & Moore, 1993; Greeno & Middle School Mathematics through Applications Project Group, 1998) is that problem solvers recognize possible moves by being attuned to affordances and constraints. If at first an agent does not see a possible action, she can interact with the environment and increase her chances of discovering it. This is a promising approach that deserves elaboration and study.

The inspiration for seeing problem solving emerging out of interaction with resources and environmental conditions is drawn from Gibson's (1966, 1977, 1979) theory of perception. Gibson regarded an affordance as a dispositional property like being graspable, pullable, or having a structure that can be walked on, sat on, picked up, thrown, climbed, and so on. These properties of objects and environments are what make it possible for an agent with particular abilities to perform actions: pulling X, walking on X, sitting on X, picking up X, throwing X, climbing over X. Agents with different abilities would encounter different affordances. For example, relative to a legless person, no environment, regardless of how flat it is, affords walking. The same applies to other actions and skills. Only relative to an action or activity repertoire does an environment have well-defined affordances.

Because affordances are objective features of an environment, there may be affordances present in a situation that are never perceived at the time. The affordances that are actually perceived depend on the cues available during activity. The more attuned a creature is to its environment, the more it picks up affordance-revealing cues and the more readily it accomplishes its tasks. So runs the theory according to Gibson.

Problem solving, Greeno and colleagues suggest, should likewise be seen as an active, dynamic encounter of possibilities and registering of affordances, constraints, and

invariants. When engaged in a task, or when trying to solve a problem, skillful agents recognize affordances that are relevant to their immediate goals. A cook recognizes the affordances of the stove, the knobs, blenders, and ingredients. Because many of these affordances are representational (e.g., a dial) or rule based (e.g., it is a convention that pulling a lever forward increases rather than decreases the magnitude of whatever it controls), the world of affordances and constraints that a cook is sensitive to must include properties that are socially, culturally, and conventionally constructed. Thus linguistic structures as well as nonlinguistic ones can be affordances for Greeno. In his view, it is not relevant that linguistic and nonlinguistic affordances are learned differently or that they are grasped differently: that the way someone knows what depressing a button with the linguistic label "abort" will do is conceptual, whereas the way someone knows that a knife affords hefting or cutting is probably nonconceptual. Both types of possibilities for action are "seen" and qualify as affordances for Greeno.

This inclusiveness represents a major extension of the concept of affordance and constraints, as understood in ecological psychology, but if it can be made to work, it permits seeing problem solving to be the outcome of a more embodied encounter with cultural resources. Greeno and colleagues assume that people can perceive or register affordances for activities that are quite complex. For instance, they assume that the more familiar we are with tools, such as hammers, screwdrivers, chisels, knives, machine saws, and lawn mowers, the more we can see opportunities for using them. Carpenters can make hundreds of perceptual inferences about wood. And someone who mows his lawn every week can see when grass is ready for its next cutting or too dense for a given lawn mower. This is an important and useful extension but not without difficulties.

Because using tools invariably involves mastering practices, the affordances that tool users perceive must be complex. Imagine the affordances, constraints, and invariants that a chef must pick up when preparing an egg sunny-side down. First a hard edge must be found for cracking the egg cleanly, then the egg itself must be held correctly when opened, the frying process must be monitored, and the egg shifted at the right time so that it does not stick. Flipping has its own complexities. Throughout the process a cook must be sensitive to the preconditions of actions, the indicators that things are going well or beginning to go awry, and the moment-by-moment dynamics of cooking. A sunny-side-down cooking trajectory, under normal conditions, is supposed to be an invariant for a competent cook. In the theory of attunement to affordances and constraints, all these cues, affordances, constraints, and invariants of normal practice are precisely the things that skilled agents are supposed to be attuned to.

In extending affordances to include the affordances that situations provide for tools (in the hands of competent users), Greeno and colleagues pushed the concept well beyond what can be perceived through normal perception - even ecological perception. Their theory goes even further, though, to include affordances for reasoning. Such things as marking up diagrams, working with illustrations, and manipulating symbolic representations are all actions that experienced agents can perform and that serve as steps in reasoning.

For instance, when a student of geometry sees an illustration as in Figure 15.15, it is assumed that she can recognize a host of affordances for construction. To someone who has learned to make triangular constructions it is natural to look at Figure 15.15b and imagine dropping perpendiculars, as in Figure 15.15c. With practice and a little prodding, most students realize that the area of Figure 15.15b is the same as the rectangle in Figure 15.15a, and both are base times height (b × h). The proof involves noting that the triangles in Figure 15.15c are congruent. If someone cut out the left-hand triangle, he could paste it on the right and produce a rectangle just like Figure 15.15a. It is possible to prod students to recognize and generalize

this idea by giving them paper and scissors and the chance to cut and paste. When they perform this physical action a few times, or perform the construction on paper, they come to see an invariance preserved under a shearing transformation. Clever students who are able to prove the equality of area theorem more analytically may even be able to recognize opportunities for making less obvious constructions, as in Figure 15.15c1, thereby generalizing further the equality of area theorem and recognizing the invariance under further transforms.

Figure 15.15. For a student of geometry, 15.15b affords many different types of constructions, including dropping perpendiculars from vertices as in 15.15c, adding diagonals, bisecting, and so forth. These affordances for constructions can be projected mentally onto the figure or added by pencil or pen on the figure itself. Although there are an infinite number of constructions the figure affords, a geometer only considers those constructions that are part of the standard practices. For instance, in 15.15c1 a clever student might realize that the area of a trapezoid can be proved similar to the area of a rectangle by noticing that triangles constructed through the midsection of each vertical side are congruent and create a rectangle whose length is (top + bottom)/2.

The idea that an experienced subject can become attuned to both affordances and relevant invariants is powerful but constitutes a theory of problem solving only if it offers predictions. This is a challenge for an attunement theory. It is not a lost cause but is certainly one that has yet to be met. Exactly when will a subject pick up affordances for construction and, given the vast number of such affordances, exactly which ones will she attend to? This last concern applies to all theories of affordances but is especially problematic for those that presume affordances for reasoning.

To see how serious this challenge is, consider Figure 15.16. The two shapes in Figure 15.16a are topologically invariant. There is a stretching operation that transforms Figure 15.16a into Figure 15.16b. How many people see it? Because there are an infinite number of ways to stretch the figure, it requires an insightful mind to envisage the transform. Figure 15.16b gives one set of deformations that shows the equivalence. Yet how many people can see even the first transform from 1 to 2? A theory of affordance pickup must tell us who, when, and why some people can see this invariance and who, when, and why certain others cannot. It also must explain why the most useful affordances are the ones that come to mind.

Figure 15.16. All the figures shown in 15.16a and 15.16b are topologically the same: 15.16b is a sequence of transforms that is meant to prove the equivalence of figures in 15.16a. As with most proofs, not everyone is able to "see" the validity of each inferential step.

Education researchers would love to provide a theory of attunement learning, though so far advances have been empirical rather than theoretical. For instance, Sayeki, Ueno, and Nagasaka (1991) found that students who were given a chance to play with a deck of cards as shown in Figure 15.17a, and then to chat about their activity, were soon able to recognize the invariance of the area in the figures shown in Figure 15.17b. Physical experience with the deck helped to imbue students with a strong sense of the transformations that preserve area. Presumably subjects who played with clay structures shaped into the structure shown in Figure 15.16a would similarly have an easier time following the constructions in Figure 15.16b.

Greeno and colleagues viewed this consequence of practice as support for their theory that problem solving is the result of attunement to affordances, constraints, and

invariance. Practice and engagement lead to improved attunement, which in turn leads to better solution exploration.

Figure 15.17. Students were shown a deck of cards as in 15.17a. The front face initially formed a square. As the deck was pushed around it became apparent that many four-sided and non-four-sided shapes could be constructed. The attribute common across all transformations is that the area of the front side always remains constant. The equivalence of the two images in 15.17b became obvious, intuitive, known in an embodied way.

This is a rather different view than the classical account for two reasons. First, in the classical account, problem solving always involves search in a problem space. In the attunement account, problem solving involves evaluating perceived or registered affordances. This bypasses the need for a problem space where feasible actions are internally represented. Choice points are encountered rather than internally generated. And the choice points encountered depend on the actions actually taken by the agent - including mental projection actions. This means that the environment an agent encounters is partly co-constructed by its actions. As an agent begins to see things differently, especially ways it can project or construct, it partly creates new task affordances. Constructions support new projections and so on, though it must be said that there remains a need to triage or evaluate the many affordances registered and projected.

Second, in the attunement theory, transfer is the result of detecting similar affordances and constraints or invariances. Transfer is a direct consequence of being able to detect affordances and constraints, whether or not the agent has much grasp of the structure of a problem. Because similar affordances can be found in many situations, including those that pose very different problems, agents are more likely to try things out before grasping deep analogies. An affordance detection approach would predict that problem solvers would be actively engaging their environment, both by projecting and by acting. In the classical theory, by contrast, the reason a subject is able to transfer problem-solving expertise from one problem to another is that the subject is able to detect deep structural similarities, and so methods found successful in the source domain are mapped over to the target domain. There is no deep need to interact with the problem environment.

If the theory of attunement were better specified, these advantages would be substantial. But the theory, as developed to date, falls short of details. How, for instance, does a subject choose which affordance to act on? There are infinitely many affordances available in any situation. This is particularly true once the notion of an affordance has been broadened to include the possibility of projecting structure or creating structure. In Figure 15.16a, for instance, there is no end of ways to deform the clay model. The challenge is to know which of the many ways to pursue.

In Greeno et al. (1998) it is suggested that dynamical system models may help here. It is easy to see why. If problem solving is a dynamic encounter with an affordance-rich environment, the language of attractors and repellors ought to be helpful in explaining why some affordances are perceived rather than others, or why some actions are pursued rather than others. Ecological psychologists have often advanced dynamical system accounts. But again, prior to a more complete account of the details, we can only wonder what controls this dynamic encounter. Discourse about basins of attraction and repulsion - the favorite terms in discussions of dynamical systems - sounds remarkably like the discourse of gradient ascent: an action is selected if it yields a higher value. This is the method of hill climbing and is one of the cornerstone methods of problem-space search. Yet it is not enough. To explain much of the observed behavior of subjects on knowledge-lean problems, it has been found

necessary to introduce other weak methods such as difference reduction, depth-first search, breadth-first search, minimax search, abstraction, and others. Attunement theory owes us the details of how these sorts of mechanisms of control can be implemented. If on the other hand dynamical systems are going to supplant the need for these additional control mechanisms, then we need to know more about how they will cope with the details of control that have seemed so useful in the classical theory.

There are further questions. Why are affordances and constraints sometimes visible and sometimes not? In Figure 15.16b, for example, it is hard for most people to see the legality of the transform between 1 and 2. But in Figure 15.18, with the simple addition of annotations, or hints, or material anchors, it is much easier. The details of an attunement model should explain the biases on affordance and constraint detection. It should tell us the types of detection errors that subjects make and why. It should tell us the biases we observe in human problem solving. And it should tell us the time course of affordance-constraint detection: why some are quickly seen and others are detected slowly. There is substantial opportunity for theory and experiment here. It may eventually pay off but it is clear there is much to be done.

Figure 15.18. By adding annotations or other markings it is easier to envision the way this structure can be deformed. Such cues help to anchor projection and facilitate envisionment. It is a rich area for study.

3. Thinking with Things

How people use artifacts, resources, and tools as things to think with delineates a third class of phenomena that we would expect a positive theory of situated problem solving to study and explain. In discussing how milkmen used their bottle cases to calculate efficient plans, we considered how overlearned patterns supported or embedded in artifacts - such as the regular pattern that forty-two bottles make in a forty-eight-bottle milk case - could be used to help them solve specific planning and accounting problems. Experienced algebraicists use patterns in matrix representations to recognize properties of linear equations, and experienced milkmen use patterns of bottles, both present and imagined. These patterns figure in special-case solutions they have learned. The infamous weight watcher's example shows a similar trick: the physical form and size of cottage cheese when dumped from its carton can be used to support highly specific arithmetic operations that would be harder for most subjects to perform in their head. The idea of cutting a regular cylinder in half is so intuitive that for many people it is understood more directly than fractions. They can think with the parts the way they can think with symbols. C. S. Peirce (1931-1958) first mentioned this idea - that people use external objects to think with - in the late nineteenth century, when he said that chemists think as much with their test tubes as with pen and paper.

The notion that we use things to think with, that we distribute our thinking across internal and external representations or manipulables, is relatively uncontentious when the artifacts used are symbolic, such as written sentences, illustrations, numbers, or even gestures. Much ethnographic research has shown in detail how people use artifacts to encode information: how people represent information in external structures and then manipulate those external structures and read off the results. The same applies to gesture. People use body gestures to help them think in context, both when they think individually and when they think as a group or in a team. Gestures help them remember ideas, formulate thoughts, and not just supplement vocalization. A slightly different idea attaches to the use of artistic media. No one would deny that a sculptor is thinking or solving certain problems when using clay any more than someone would deny that an author is thinking or solving problems with the help of pen, paper, and written sentence. The medium has important properties that the artist or author is trying to exploit. Interaction is essential to the process. And though one might say that all the relevant cognition occurs inside the head of the human, the physical materials, scaffolds, tools, and structures are an important factor in the outcome. They help to structure the affordance landscape.

The question of how people think with things, how the determinants and dynamics of cognition depend on properties of artifacts and the context of action, is unquestionably at the heart of the theories of situated, distributed, and interactive cognition. But as with other tenets, it is in need of greater elaboration and empirical study. To date, the majority of studies have been confined to ethnographic examinations of particular cases. This is a necessary step given the importance of details. But little attention has been paid to the distinction between using objects to solve special-case problems - a method that reflects memorized techniques - and using objects or systems in more general ways as an intrinsic part of reasoning and problem solving. The distinction is important because if most instances of thinking with things are highly specific, if most are cases where dedicated tools are used to solve domain-specific problems, then too much of the rest of problem solving is left out and any hope of a more general theory of problem solving looks bleak. How much did we really learn about problem solving by observing weight watchers partition cottage cheese?

The theory situated cognition owes us is one that will explain how people harness physical objects to help them reason and solve problems. What characteristics must a thing to think with have if it is to be effective, easy to use, and handily learned?

Although a comprehensive theory would provide a principled taxonomy of the ways we can think with things, a tiny step toward this theory can be taken by looking at how people co-opt things to perform computations. Computation is about harnessing states, structures, or processes to generate rational outcomes. It is about using things to find answers. The most familiar forms of computation are digital, the manipulation of symbol strings, as in math, engineering, or . But all sorts of nonsymbolic systems can be harnessed to perform computations. For instance a slide rule is an analog mechanism for performing multiplication, addition, and a host of other mathematical operations. It does not have the precision of a calculator, or the range of functions of many other digital devices. But because it preserves key relationships - linear distance along each scale of the rule is proportional to the logarithm of the numbers marked on it - moving the slides in the correct way allows a user to perform multiplications. It can be used to simulate the outcome of digital multiplication because it preserves key relationships. It replaces symbol manipulation with manipulation of physical parts.

The same can be said, though less easily, for illustrations, sketches, and three-dimensional models. For instance, one of the most common ways we think with things is by using them as model fragments. A structure or process can be said to model or partly model another if it is easy to manipulate and examine the model and then read off implications about the target structure or process. An extreme example is seen in the architectural practice of building miniature three-dimensional models of buildings. The operations that architects perform on their small models are sufficiently similar to actions that inhabitants will perform on or in full-sized buildings that architects can try out on the model ways the full-sized building may be used. They can act on the model instead of the real object. This saves time, effort, and cost because mistakes have few consequences in models and simulations. This can even be done using two-dimensional diagrams, as shown by Murphy (2004), who recorded how architects reason about the

uses of a building by bringing their bodies into interaction with their architectural drawings.

The formal part of a thinking-with-things theory should provide the analytic tools for evaluating why certain things can function successfully as things that can be thought with. Much of this work has already been done in other fields. For example, in math it is well understood that a powerful technique of reasoning is to map structures into different representational systems. Problems that may be hard to solve in one system, say, Euclidean geometry, may be easy to solve in analytic geometry. This mapping between representational systems may not seem to be a computation, but it is. It is a trick that is widely used. For instance, when sailors plot a course, they typically use more than one map. Every map represents the world under a projection that preserves some relations while distorting others. Mercator projections are good for plotting course but bad for estimating distance or area. Sailors overcome this problem by plotting their course in maps of different projection. They convert information from one map to another. The result is that they are able to track their location, distance, bearing, and speed more accurately and more quickly than they otherwise could (Hutchins, 1995). They think across maps. The result of using multiple maps is like using a fulcrum: it allows a weak user to do some heavy lifting because the maps or relation between the maps does much of the work.

An even more complicated way of using things to think with is found when people use the very thing they are interested in as its own model. For example, people solving tangrams seem to use the tangram pieces as things to think with. So do people when they assemble things without first reading the assembly manual. When Rod Brooks (1991a, 1991b) initially offered the expression "use the world as its own model," he meant that a robot would be better off sensing and reacting to the world itself than by simulating the effects of actions in an internal model of the world prior to selecting action. If a robot were sufficiently tuned to the regularities of its environment and the tasks it needed to perform, it could be driven by a control system that would guarantee with high probability that the robot would eventually reach its goals, though not necessarily by the shortest path. If this sounds similar to the theory of attunement to affordances and constraints it is because it is the same theory though restricted here to problems such as trajectory planning, grasping, and physical manipulation, and without the requirement of mental projection.

Although such an approach may seem the antithesis of problem solving, it is a legitimate way of dealing with real-world problems - but only if certain conditions are met. Specifically:

• The world must be relatively benign. Moving toward goals can only rarely lead to disastrous downstream consequences (Kirsh, 1991, 1996).
• The world must be reversible. If you do not like your action, you can undo it and either return the world to the condition it was in before, or you can find a new path to any of the states you could reach before.

When can these conditions be counted on? Assembly tasks, math tasks, and puzzles, such as tangrams, support undoing without penalty. The Tower of Hanoi is another classic task supporting reversibility. And checkers, chess, and other competitive games usually have a form - correspondence chess, checkers - where subjects can search over possible moves directly on the board before deciding how to move. In other tasks, even if one cannot reverse the action, subjects can get a second chance to solve the problem. So as long as it takes no longer to try out an action in the world than trying it out in a model, there are advantages to acting directly in the world. First, subjects gain precision in the representation of the outcome because nothing is more precise than the real thing. Second, subjects gain practice in bringing things about, which can be of value
where skill is required. Third, subjects save having to formulate a plan in their head, try it out in a model, and then execute the plan in the world. The execution phase is folded into the discovery phase.

But there are many other tasks where actions are not reversible and search in the world is a bad idea. In cooking, for example, as with many tasks, after an initial preparation phase when a recipe is selected and ingredients gathered, there is an execution phase where there is no turning back. There are many places where a plan can be modified or updated to deal with errors, setbacks, and unexpected outcomes. But in most tasks there are commitment points where it is inappropriate to do anything but the next move.

Most tractable analyses of thinking approach thought as a computational activity whether that computation is said to occur inside the head on internal representations, outside the head through the use of physical objects, or in the interaction between inside and outside. An even more profound approach to thought sees it as a mechanism for extending our perception, action, and regulation. This is a radical vision of how cognition is shaped through our interaction with artifacts.

The core idea in this more enactive theory of thought is that thinking is somehow tied up with the way we encounter and engage the world. A violinist encounters the world through his violin in a way that depends essentially on the violin. Violin problems are encountered only in violin playing and constituted in the interaction of player and instrument. The same applies to people who bicycle, whittle, manipulate cranes, or solve higher-math problems. The material instruments, representational [...], and sensorimotor extensions that artifacts provide offer new modes of experience and involvement with the world. The problems that arise are somehow essentially tied to those interactions and therefore cannot be properly analyzed until we understand what those who have those skills experience as problematic.

I personally think this is an exciting area of research. As with other areas, this one is much in need of clarification and empirical exploration. Moreover, there is a danger that this activity-centric model makes thought so relativistic and hermeneutic that only violinists can understand other violinists, only sculptors can understand other sculptors, and so on. Although I believe such concerns can be answered, they highlight that there are philosophical and methodological problems as well as empirical ones that such an approach may face.

4. Self-Cueing and Metacognitive Control of Discovery

What controls search? In the classical theory the firing of productions, a form of associative memory, drives search. If a rule exists whose preconditions match patterns currently in working memory, then it is triggered, and unless other rules match current conditions and are therefore also triggered, that solitary rule fires and causes a change in working memory. Owing to this focus on what is in current working memory there is an inescapable data-driven bias to search, both on candidate generation and evaluation. Only a change in working memory can trigger new productions. Environmental state enters working memory through perception. A positive interactionist or situationist theory ought to provide an account of how we interact with our environments to overcome the lock that data-driven thinking has on our [...]. It must provide a theory of the dynamics driving the interactive exploration of a problem during the search for solution. One special line of inquiry looks at the way we self-cue, how we alter the cue structure of the environment to stimulate new ideas or candidate generation.

A nice example of how as data-driven creatures we self-cue to improve performance can be found in the way people play Scrabble (see Figure 15.19). The basic problem in Scrabble is to find the highest-value words that can be made from the letters
on one's tray after placing them in a permissible position on the board. The challenge for a psychological theory is to explain how the external cues, comprised of the ordered letters on tray and board, mutually interact with a subject's internal lexical system. The challenge for an interactionist account is to show how subjects intentionally manage the arrangement of letters on their tray to improve performance.

Figure 15.19. When people play Scrabble, they scan pieces on their tray, look for openings on the board, and search through a developing space of possible words and word fragments. Most people also periodically move the letters on their tray. Sometimes this is to hold their current candidate. But often it is to highlight high-frequency two- and three-letter combinations that figure in words.

In Maglio, Matlock, Raphaely, Chernicky, and Kirsh (1999), an experiment was performed to test the hypothesis that people who rearrange letters externally self-cue and consequently perform better than those who do not. The task was similar to Scrabble in that subjects were supposed to generate possible words from seven Scrabble letters. Unlike Scrabble, though, they just called out words as they recognized them; there were no constraints coming from the board or from values associated with letters. In the hands condition, participants could use their hands to manipulate the letters; in the no hands condition they could not. Results showed that more words were generated in the hands condition than in the no hands condition, and that moving tiles was more helpful in finding words that were more distant from the opening letter sequence.

Evidently, self-cueing helps beat the data-driven nature of cognition. We conjectured that the reason moving tiles is helpful is that when subjects rearrange tiles they are in effect jumping to a new place in their internal lexical space. By lexical space we meant a system of discrete nodes consisting of letter sequences. How close one node is to another in this space is determined by such things as phonemic closeness (bear is close to bare and air), graphemic closeness (bear is close to ear), and perhaps other things such as semantic closeness (bear is close to lair). In our simulation we defined closeness as graphemic closeness alone. Thus two nodes are neighbors in our simplified model of internal lexical space if they can be reached by shifting or removing a few letters or adding a letter if there is an additional letter on the tray. Hence, bear is close to bar, bare, bra and if there is another e on the tray, then the word beer is also a graphemic neighbor. Letter sequences that do not form words, such as bre, ebra, are also lexemic neighbors but less strongly attractive if they are infrequent sequences in English.

In discussing our findings we reasoned that although subjects can in principle apply as many mental transforms in lexemic space as they want, and so in both the no hands and hands conditions can in principle move arbitrarily far from the letter layout on their physical tray, in practice they would tend to get stuck in the lexical neighborhood near the groupings on their tray. This idea reiterates the data-driven nature of Scrabble.

The power of using one's hands comes from the difference in the constraints on physical movement and mental movement. A just-so story runs like this. The letters that subjects see exert a pull on their [...] much like an elastic band. The farther that subjects go in their lexemic space from the sequence on the tray the stronger the tension pulling them back. This tension is purely internal. In physical space, there is nothing preventing players from moving their tiles ever further from their first layout. It is easy to break the elastic-band effect. This means that a subject can jump to a new place in lexical space, and so activate a new basin of neighbors just by rearranging his or her tiles. The new arrangement will prime or
cue a new set of lexical elements and make it more likely that new words will be discovered.

This approach to problem solving, if intentional, looks a lot like [...]. It is reminiscent of the behavior of Ulysses (Elster, 1979), who recognized that he could not overcome the lure of the Sirens except by binding himself. To achieve his goal, given that he knew he would act inappropriately in the situation, he altered the situation. We do the same when we move our eyes. Given that our visual system is automatic and data driven, we cannot help but see what we look at. But we still can avert our eyes or direct them to other things. The decision to look elsewhere is often intentional, and when it is, it counts as metacognitive if it is based on reflectively exploiting the way our visual system works. The same applies to moving tiles in Scrabble and a host of problem-solving strategies we rely on in other domains. In math, for instance, we often copy onto a single page the lemmas we generated over many pages. This increases the chance of seeing patterns. In algebra, students soon learn that it helps to be neat because it is easier to see relations and patterns. Indeed, the strategy of rearranging the environment to stimulate new ideas when candidate generation slows down is pervasive. The principle relied on is self-cueing and metacognitive control. (For a more complete discussion of the role of metacognition in reasoning and learning, see Kirsh, 2004.)

The moral for research in problem solving should be clear: by studying more carefully how cues and affordance landscapes bias cognition, new interactive strategies will be discovered that show unanticipated ways subjects use the environment to shape their problem-solving cognition.

Final Discussion and Conclusion

By exposing how cognition is closely coupled to its social, material, and cultural context the situationalist approach has called attention to the deficiencies of the classical theory of problem solving. It has forced us to reconsider the form an adequate theory should take. Is a general theory of problem solving possible? Is there enough resemblance between the actions that problem solvers take when solving or overcoming problems to hope to discover a general theory of the dynamics of problem solving, regardless of domain? I argued that such a theory must at least tell us for a given problem and situation how much of the control is to be found in internal processes, directed by such internal resources as problem-space representations and heuristic search, how much of the control is to be found in the setup or design of the environment influencing cognition through the affordances, cues, and constraints that a culture of activity has built up over time, and how much is to be found in the dynamic interaction between internal and external resources.

This is a tall order. It requires developing a set of supporting theories that explain how cues, constraints, and affordances affect how a subject thinks and acts. This point came up repeatedly: first when discussing the processes of registration, then when discussing interactivity and epistemic actions, scaffolds, and resources, and later when I discussed the direction a positive theory might take: explaining attunement to symbolic affordances and cues, explaining how subjects respond to hints, how they self-cue, and how they think with material things. A general theory of problem solving would have to pull all these supporting theories together to explain how subjects move back and forth among cues, scaffolds, visible attributes and the mental projections, structures, and problem spaces those subjects maintain on the inside. It would be an interactionist theory.

The situationalist approach, secondly, has exposed what is wrong with studying problem solving in constrained laboratory contexts, disconnected from the settings in which those mental and interactive processes originally developed. Subjects adapt to the world they live in. The internal costs of problem solving depend substantially on how experienced a subject is, whether the problem is presented to him or her in a familiar way, and how effectively he or she can exploit the surrounding resources, cues, affordances, and constraints. This concern with cognitive costs and benefits is consistent with the general theme of model building in [...]. But putting it into practice means paying closer attention to the details of natural contexts. This cannot be done without close ethnographic and microanalytic study.

As a negative theory, situated cognition has been a success. But if approaches are judged by their positive theories, situated cognition has been a failure. All efforts at creating a substantive theory of problem solving have been underspecified or fragmentary. And it is too early to know whether the next dominant theory will bear a situationalist stamp.

Building on the critique presented in Part One and Part Two, I posed a set of questions and initial approaches that might indicate the direction situated research should pursue next. These include an analysis of hints and scaffolds, symbolic affordances and mental projection, thinking with things, self-cueing and metacognition, and an enactive theory of thought. Some of these studies are being undertaken outside of cognitive and computational psychology. They explore how people interactively populate situations with extra resources and how they exploit those resources to simplify reasoning. Some of these studies, however, are not yet being undertaken. The bottom line, it seems to me, is that it is not enough that we recognize the central insight of situated cognition - that the environment provides organization for cognitive activity, that the world enables and supports such activities - we must go further. We must explain how internal control processes work with these supports and organizational structures to regulate intelligent activity.

Acknowledgments

I am happy to thank the editors for their valuable suggestions and moral support during the writing of this chapter and for comments by Rick Alterman and Jim Greeno. Joachim Lyon carefully read the entire manuscript and gave thoughtful remarks throughout. Support for this work came in part from ONR Award # N00014-02-3-0457.

Notes

1 In computational complexity theory, problems are ranked according to the resources needed to solve their hardest instance, as measured by the number of steps and the amount of memory or scratch paper the best algorithm will use. For instance, solving a traveling salesman problem with two cities is trivial because there is only one combination of cities, one tour, to consider (A, B). But as the number of cities increases, the number of candidate tours that have to be written down and measured increases exponentially (as there are as many possible tours as sequences of all cities). It is the shape of this resource growth curve that determines the complexity class of a problem. A problem is said to be in the class of NP-complete if, in the worst case, the best algorithm would have to do the equivalent of checking every tour, and the test to determine whether one tour is better than another is itself not an NP-complete algorithm. There are infinitely many problems that are harder than NP-complete ones, and infinitely many problems that are easier. The easier ones have polynomial complexity.

2 Taken from "The Thirty Rules of Chess." Retrieved May 31, 2008, from http://www.chessdryad.com/education/sageadvice/thirty/index.htm.

References

Agre, P., & Chapman, D. (1987). Pengi: An implementation of a theory of activity. In Proceedings of the Sixth National Conference on Artificial Intelligence (pp. 268-272). Menlo Park, CA: AAAI Press.
Baker, B. S. (1994). Approximation algorithms for NP-complete problems on planar graphs. Journal of the ACM, 41(1), 153-180.
Behrend, D. A., Rosengren, K., & Perlmutter, M. (1989). Parent presence. International
Journal of Behavioral Development, 12(3), 305-
Brooks, R. A. (1991a). Intelligence without reason. In Proceedings of the 1991 Joint Conference on Artificial Intelligence (pp. 569-595). San Francisco, CA: Morgan Kaufmann.
Brooks, R. A. (1991b). Intelligence without representation. Artificial Intelligence, 47, 141-159.
Brown, J. S., Collins, A., & Duguid, P. (1989). Situated cognition and the culture of learning. Educational Researcher, 18, 32-42.
Carraher, T. N., Carraher, D. W., & Schliemann, A. D. (1985). Mathematics in the streets and in the schools. British Journal of Developmental Psychology, 3, 21-29.
Chambers, D., & Reisberg, D. (1985). Can mental images be ambiguous? Journal of Experimental Psychology: Human Perception and Performance, 11, 317-328.
Clark, A. (1997). Being there: Putting brain, body, and world together again. Cambridge, MA: MIT Press.
Devlin, A. S., & Bernstein, J. (1996). Interactive wayfinding: Use of cues by men and women. Journal of Environmental Psychology, 16, 23-39.
Elster, J. (1979). Ulysses and the sirens. Cambridge: Cambridge University Press.
Gibson, J. J. (1966). The senses considered as perceptual systems. Boston, MA: Houghton Mifflin.
Gibson, J. J. (1977). The theory of affordances. In R. E. Shaw & J. Bransford (Eds.), Perceiving, acting, and knowing (pp. 67-82). Hillsdale, NJ: Lawrence Erlbaum.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.
Gick, M., & Holyoak, K. (1980). Analogical problem solving. Cognitive Psychology, 12, 306-355.
Gick, M., & Holyoak, K. (1983). Schema induction and analogical transfer. Cognitive Psychology, 15(1), 1-38.
Greeno, J. G., & Middle School Mathematics through Applications Project Group (1998). The situativity of knowing, learning, and research. American Psychologist, 53(1), 5-26.
Greeno, J. G., & Moore, J. L. (1993). Situativity and symbols: Response to Vera and Simon. Cognitive Science, 17, 49-60.
Greeno, J. G., Smith, D. R., & Moore, J. L. (1993). Transfer of situated learning. In D. K. Detterman & R. J. Sternberg (Eds.), Transfer on trial: Intelligence, cognition, and instruction (pp. 99-167). Norwood, NJ: Ablex.
Hamberger, J. (1979). Music and cognitive research: Where do our questions come from? Where do our answers go? (Working Paper Ser. No. 2). Cambridge: Massachusetts Institute of Technology, Division for Study and Research in Education.
Hutchins, E. (1995). Cognition in the wild. Cambridge, MA: MIT Press.
Hutchins, E., & Palen, L. (1997). Constructing meaning from space, gesture, and speech. In L. B. Resnick, R. Saljo, C. Pontecorvo, & B. Bürge (Eds.), Discourse, tools, and reasoning: Essays on situated cognition (pp. 23-40). Heidelberg, Germany: Springer-Verlag.
Karp, R. M. (1986). Combinatorics, complexity, and randomness. Communications of the ACM, 29(2), 98-109.
Kershaw, T. C. (2004). Key actions in insight problems: Further evidence for the importance of non-dot turns in the nine-dot problem. In K. Forbus, D. Gentner, & T. Regier (Eds.), Proceedings of the Twenty-Sixth Annual Conference of the Cognitive Science Society (pp. 678-683). Mahwah, NJ: Lawrence Erlbaum.
Kirsh, D. (1991). When is information explicitly represented? In P. Hanson (Ed.), Vancouver studies in cognitive science (Vol. 1, pp. 340-365). New York: Oxford University Press.
Kirsh, D. (1995a). Complementary strategies: Why we use our hands when we think. In J. D. Moore & J. F. Lehman (Eds.), Proceedings of the Seventeenth Annual Conference of the Cognitive Science Society (pp. 212-217). Hillsdale, NJ: Lawrence Erlbaum.
Kirsh, D. (1995b). The intelligent use of space. Artificial Intelligence, 72(1-2), 31-68.
Kirsh, D. (1996). Today the earwig, tomorrow man? In M. Boden (Ed.), Artificial life (pp. 237-261). Oxford: Oxford University Press. (Reprinted from Artificial Intelligence, 47(3), 161-184.)
Kirsh, D. (2003). Implicit and explicit representation. In L. Nadel (Ed.), Encyclopedia of cognitive science (Vol. 2, pp. 478-481). London: Nature Publishing Group.
Kirsh, D. (2004). Metacognition, distributed cognition and visual design. In P. Gärdenfors & P. Johansson (Eds.), Cognition, education and communication technology (pp. 147-179). Hillsdale, NJ: Lawrence Erlbaum.
Kirsh, D., & Maglio, P. (1995). On distinguishing epistemic from pragmatic actions. Cognitive Science, 18, 513-549.
Kokinov, B., Hadjiilieva, K., & Yoveva, M. (1997). Is a hint always useful? In Proceedings of the Nineteenth Annual Conference of the Cognitive Science Society. Hillsdale, NJ: Lawrence Erlbaum.
Kotovsky, K., & Fallside, D. (1989). Representation and transfer in problem solving. In D. Klahr & K. Kotovsky (Eds.), Complex information processing (pp. 69-108). Hillsdale, NJ: Lawrence Erlbaum.
Laird, J. E., Newell, A., & Rosenbloom, P. S. (1987). SOAR: An architecture for general intelligence. Artificial Intelligence, 33, 1-64.
Larkin, J. H., & Simon, H. A. (1987). Why a diagram is (sometimes) worth ten thousand words. Cognitive Science, 11, 65-99.
Lave, J. (1988). Cognition in practice: Mind, mathematics and culture in everyday life. Cambridge: Cambridge University Press.
Lave, J. (1996). The practice of learning. In S. Chaiklin & J. Lave (Eds.), Understanding practice: Perspectives on activity and context (pp. 3-32). New York: Cambridge University Press.
Lesgold, A. M. (1988). Problem solving. In R. J. Sternberg & E. E. Smith (Eds.), The psychology of human thought (pp. 188-213). New York: Cambridge University Press.
Luchins, A. S. (1942). Mechanization in problem solving: The effect of Einstellung. Psychological Monographs, 54, 1-95.
Luchins, A. S., & Luchins, E. H. (1970). Wertheimer's seminars revisited: Problem solving and thinking (Vols. 1-3). Albany: State University of New York Press.
Maglio, P., Matlock, T., Raphaely, D., Chernicky, B., & Kirsh, D. (1999). Interactive skill in Scrabble. In Proceedings of Twenty-First Annual Conference of the Cognitive Science Society (pp. 326-330). Mahwah, NJ: Lawrence Erlbaum.
Maier, N. R. F. (1930). Reasoning in humans: I. On direction. Journal of Comparative Psychology, 10, 115-143.
Maier, N. R. F. (1970). Problem solving and creativity. Belmont, CA: Brooks/Cole.
Mayer, R. E. (1995). The search for insight. In R. J. Sternberg & J. E. Davidson (Eds.), The nature of insight (pp. 3-32). New York: Cambridge University Press.
Miller, G. A., Galanter, E., & Pribram, K. H. (1960). Plans and the structure of behavior. New York: Holt, Rinehart & Winston.
Murphy, K. M. (2004). Imagination as joint activity: The case of architectural interaction. Mind, Culture and Activity, 11(4), 270-281.
Newell, A. (1990). Unified theories of cognition. Cambridge, MA: Harvard University Press.
Newell, A., & Simon, H. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice Hall.
Norman, D. A. (1988). The psychology of everyday things. New York: Basic Books.
Ornstein, A. C., & Levine, D. U. (1993). Foundations of education (5th ed.). Boston: Houghton Mifflin.
Paige, J. M., & Simon, H. (1979). Cognitive process in solving algebra word problems. In H. A. Simon (Ed.), Models of thought. New Haven, CT: Yale University Press.
Papadimitriou, C., & Yannakakis, M. (1988). Optimization, approximation, and complexity classes. In Proceedings of the Twentieth Annual ACM Symposium on Theory of Computing (pp. 229-234). New York: ACM Press.
Peirce, C. S. (1931-1958). The collected papers of Charles Sanders Peirce (C. Hartshorne, P. Weiss [Vols. 1-6], & A. Burks [Vols. 7-8], Eds.). Cambridge, MA: Harvard University Press.
Pretz, J. E., Naples, A. J., & Sternberg, R. J. (2003). Recognizing, defining, and representing problems. In J. E. Davidson & R. J. Sternberg (Eds.), The psychology of problem solving (pp. 3-30). Cambridge: Cambridge University Press.
Reitman, W. R. (1964). Heuristic decision procedures, open constraints, and the structure of ill-defined problems. In M. W. Shelly & G. L. Bryan (Eds.), Human judgments and optimality (pp. 282-315). New York: John Wiley.
Rogoff, B., & Lave, J. (Eds.) (1984). Everyday cognition: Its development in social context. Cambridge, MA: Harvard University Press.
Sayeki, Y., Ueno, N., & Nagasaka, T. (1991). Mediation as a generative model for obtaining an area. Learning and Instruction, 1, 229-242.
Scribner, S. (1984). Studying working intelligence. In B. Rogoff & J. Lave (Eds.), Everyday cognition: Its development in social context (pp. 9-40). Cambridge, MA: Harvard University Press.
Scribner, S. (1986). Thinking in action: Some characteristics of practical thought. In R. J. Sternberg & R. K. Wagner (Eds.), Practical intelligence: Nature and origins of competence
in the everyday world (pp. 13-30). Cambridge: Cambridge University Press.
Simon, H. A. (1969). The sciences of the artificial. Cambridge, MA: MIT Press.
Simon, H. A. (1979). Models of thought (Vol. 1). New Haven, CT: Yale University Press.
Simon, H. A., & Hayes, J. (1976). The understanding process: Problem isomorphs. Cognitive Psychology, 8, 165-190.
Sternberg, R. J. (1997). Thinking styles. New York: Cambridge University Press.
Sternberg, R. J., & Davidson, J. E. (1983). Insight in the gifted. Educational Psychologist, 18, 51-57.
VanLehn, K. (1989). Problem solving and cognitive skill acquisition. In M. I. Posner (Ed.), Foundations of cognitive science (pp. 527-579). Cambridge, MA: MIT Press.