<<

Psychonomic Bulletin & Review 2004, 11 (1), 1-23

PSYCHONOMIC SOCIETY KEYNOTE ADDRESS How a cognitive came to seek universal laws

ROGERN. SHEPARD , Stanford, California and Arizona Senior Academy, Tucson, Arizona

Myearlyfascination with and physics and, later, withperception and imagination inspired a hope that fundamental phenomena ofpsychology, like those ofphysics, might approximate univer­ sal laws. Ensuing research led me to the following candidates, formulated in terms of distances along shortestpaths in abstractrepresentational spaces: Generalization probability decreases exponentially and discrimination time reciprocally with distance. Time to determine the identity of shapes and, pro­ visionally, relation between musical tones or keys increases linearly with distance. Invariance of the laws is achieved by constructing the representational spaces from psychological rather than physical data (using multidimensional scaling) and from considerations of geometry, , and sym­ metry. Universality of the laws is suggested by their behavioral approximation in cognitively advanced species and by theoretical considerations of optimality. Just possibly, not only physics but also psy­ chology can aspire to laws that ultimately reflect mathematical constraints, such as those ofgroup the­ ory and symmetry, and, so, are both universal and nonarbitrary.

The object ofall science, whether natural science or psy­ three principal respects. The first is my predilection for chology, is to coordinate our experiences and to bring them applying mathematical ideas that are often ofa more geo­ into a logical system. metrical or spatial character than are the probabilistic or (Einstein, 1923a, p. 2) statistical concepts that have generally been deemed [But] the initial hypotheses become steadily more abstract most appropriate for the behavioral and social sciences. and remote from experience. The second is my unwillingness to be satisfied with any (Einstein, 1934/1954, p. 282) proposed psychological principle whose sole justifica­ tion is that it fits all the available empirical evidence­ To the extent that my particular approach to psycho­ whether behavioral or neurophysiological. I crave, in ad­ logical science may be distinctive, I think it may be so in dition, a reason that that behavioral principle (or that associated neural structure)should have the particular form that it does, rather than some other. The third is my I am honored to follow William K. Estes, who gave the first ofthese newly instituted keynote addresses at the 2002 meeting ofthe Psycho­ belief that if, as I fervently hope, psychological princi­ nomic Society (see Estes, 2002). The mathematical theory ofleaming ples are notmerely arbitrary, some may be shown to have that Estes put forward at the middle ofthe last century was a significant arisen as accommodations to universal features of the inspiration for my own ensuing attempts to construct mathematical the­ world -, If this is so, we might aspire to a science of the ories for fundamental psychological phenomena. This article is based on the keynote address I presented at the November, 2002 meeting of mind that, like the physical and mathematical sciences, the Psychonomic Society in Kansas City. It is expanded here to include has universal laws. material on discrimination time, music cognition, and how capabilities The possibility ofa psychological science ofthe kind for mental transformation may make possible the discovery ofphysical I envision may be viewed with skepticism, ifnot, indeed, laws-topics that I did not have time to cover in that address but that (in incredulity. Those outside the , on the term preparing this written version) seemed to help to complete the picture of my approach to . I thank the Governing Board ofthe Psy­ , often react with the raised chonomic Society and its Chair, Stephen Palmer, for inviting me to eyebrows reserved for oxymora. Some physical scientists present the address, David Balota for guidance in my preparation ofthe subscribe to opinions like that ofan astrophysicist mem­ manuscript, and William Estes for his many thoughtful suggestions for ber of the National Academy of Sciences whom I heard improving the manuscript. Finally, I gratefully acknowledge the Na­ tional Science Foundation's support ofthe research that my co-workers declare that whereas the physical sciences deal with and I have pursued on mental representations and psychological laws quantitative and objective facts, the social during the 35 years following my move to academia in 1966. The opin­ sciences are limited merely to "expressions ofsubjective ions and conclusions expressed in this article are my own, however, and feelings." Even quantitatively oriented do not necessarily represent those of the National Science Foundation may assume that because human behavior is inherently (or, indeed, of any of my former co-workers). Correspondence concern­ ing this article should be addressed to R. N. Shepard, 13805 E. Langtry less predictable than the behavior ofdeterministic phys­ Lane, Tucson, AZ 85747-9637 (e-mail: [email protected]). ical systems, the kinds ofmathematics applicable to psy-

Copyright 2004 Psychonomic Society, Inc. 2 SHEPARD

chological phenomena are, at best, probabilistic or sta­ study of the physical brain. But none of the empirical tistical. data I shall cite and none ofthe mental laws I shall pro­ All such views-perpetuated by a misconceived di­ pose arose from or depend on any knowledge about the chotomy between the so-called "exact" and "inexact" brain. The history ofthe relations between the psycho­ sciences-are mistaken. Human beings and all other an­ logical sciences and the neurosciences indicates, rather, imals are themselves physical systems and, as such, op­ that a recognition of the problems that a behaving or­ erate in compliance with physical laws. Physical systems ganism confronts in the world is needed in order to know do, of course, differ widely in the predictability oftheir what to look for in the brain. behaviors. But these differences in predictability are not I have chosen three general lines of my own work­ determined by whether the behaviors are those investi­ namely, those on mental transformations, on generaliza­ gated by physicists or by psychologists. Rather, they are tion and discrimination, and on music cognition-to illus­ determined by the precision with which the would-be trate how abstract spatial conceptualizations ofproblems predictor knows the values and the principles of inter­ posed by the world have led me to candidates for univer­ action ofthe controlling variables (and by the sensitivity sal mental structures and laws. From there, I shall venture ofthose variables to initial conditions). some implications for an understanding ofhow scientific Even for a physicist, the future positions ofa leafdrop­ laws are discovered in general, and for the prospects for ping from a tree on a gusty autumn day are quite unpre­ a science of universal mental laws in particular. But, dictable in the absence ofprecise knowledge ofthe shift­ first, it may help to review the circuitous odyssey through ing and turbulent air currents and ofthe shape many quite different fields that led me to my peculiarly ofthe particular leafwith which those currents interact. geometrical approach. But the future positions ofa heavy ball released from the top of an evacuated cylinder are highly predictable be­ Childhood and Early School Yearsl cause, here, the single controlling force ofgravity is con­ Beginning in early childhood, I would spend many stant and precisely known. solitary hours happily tinkering with old clockworks, Likewise, for a psychologist, the future path ofthe ball telephones, and parts ofother appliances that I found in that has just arrived in a basketball player's hands is quite a vacant-lotjunkpile or in the attic ofmy San Jose grand­ unpredictable. That player is highly trained and motivated parents' bam and, also, drawing and designing things to keep members of the opposing team guessing as to geometrical and mechanical. These interests continued whether the ball will go left to one teammate, right to an­ into my high school years, when I was building robots other teammate, up toward the basket, or down, to be drib­ and other electromechanical devices and constructing bled forward, backward, left, or right. As a consequence, models ofregular polyhedra and projections ofthe four­ the processes within that player's nervous system­ dimensional hypercube. At the same time, I was discov­ although deterministic-may be so delicately poised as ering and devouring books from the library by Sir Arthur to be tipped toward the precipitation ofa particular one of Eddington, Sir James Jeans, Eric Temple Bell, and oth­ these macroscopic actions by micromechanical (perhaps ers, on relativity, cosmology, and . But my chaotic or thermal) events inaccessible to all external ob­ lack ofattention to school subjects was a concern ofmy servers and even to the player about to take the action. parents, teachers, and school counselors, from my hav­ But, when that player, having been fouled, stands at ing to repeat kindergarten all the way up through my ig­ the free-throw line, the future path ofthe ball is highly nominious freshman year at Stanford. predictable. The trajectory is then governed primarily by the goal oflaunching the ball on that precise ballistic tra­ Undergraduate Years at Stanford jectory-out of all possible trajectories within the bas­ At the end of my second quarter at Stanford, my cu­ ketball pavilion-that will carry it through the small and mulative grade point average had fallen to within one distant hoop. Moreover, for modeling the player's hand­ point of precipitating my permanent dismissal. During eye coordination performance, a statistical may my ensuing two-quarter obligatory leave, I worked at the be less appropriate than a more deterministic and spa­ Stanford Research Institute (now SRI International) as a tial, differential calculus such as that governing the guid­ chemistry lab assistant to Dr. Elizabeth Roboz (who was ance ofmissiles or, most simply, the descent ofthe heavy later to become Einstein's daughter-in-law). Several ball in the evacuated cylinder. other scientists at SRI, taking a special interest in me, Ofcourse, although humans and other animals are, at persuaded me that ifl wanted to do creative work in sci­ one level ofdescription, physical systems, the behavioral ence, I should complete my formal schooling. So I did or mental principles ofinterest to the psychologist emerge return to Stanford. But I also continued my independent at a quite different level ofdescription. Oflate, psychol­ geometrical explorations. (These now concerned the ogy and have been dominated by local connection of homogeneous topological a preoccupation with brain imaging studies. Moreover, networks that rendered them the discrete analogues of since long before, scientists from other fields and non­ continuous two- or three-dimensional spaces with global scientists alike have widely assumed that an objective curvatures that are zero, like the plane; positive, like the understanding ofthe mind can be gained only through a surface ofa sphere; or negative, like a saddle shape.) KEYNOTE ADDRESS 3

I was strongly attracted by the four-dimensional non­ the simple sum ofthose component separations (Attneave, Euclidean Minkowskian and curved-space (Riemannian) 1950). New vistas opened before me, and I proceeded to ofspecial and general relativity. But I feared apply for graduate study in psychology. that my failure to have obtained a grounding in the full spectrum of relevant mathematical fields precluded a Early Graduate School Years at Yale successful career in mathematics or theoretical physics. I accepted an assistantship at Yale, largely on the rec­ I was also deterred by the thought that those disciplines ommendation ofTaylor (who would much later move to were unlikely to provide a window on the inner phenom­ Yale himself). But, on my arrival there in the fall of ena ofperception, imagery, and dreams-phenomena that 195I, I found that in my assistantship, I was to condition figured prominently in my own creative thinking, and rats to begin running in a wheel within 5.8 sec of the that were coming to fascinate me as much as the exter­ onset ofa light to avoid an electric shock so intense that nal phenomena ofphysics. 2 (to my own great distress) it caused the rats to jump, At different times during my ensuing undergraduate squeal, and, often, to urinate or defecate. I was to record years, I entertained the possibility ofdevoting myselfto a the latency at which each rat began running following career in each of many different fields-including, in ad­ the onset ofthe light. After 60 trials in which a rat suc­ dition to math and physics, , biology, art, cessfully avoided the shock, I was to disconnect the music, creativewriting and, especially,toward the end, phi­ shock and record the latency ofrunning during the next losophy. But I was feeling an increasing need to settle on a 30 extinction trials. field in which the mental phenomena that now so intrigued The experiment had been designed to test a prediction me might be conceptualized in terms ofmathematics­ of Yale's renowned -response (S-R) or even, I hoped, geometry (which had been character­ theorist Clark L. Hull, then a frail, bent, and reclusive ized as "the most ancient branch ofphysics" by Einstein, presence in his basement lab and, as it turned out, in the 1923b, p. 28, and as its "noblest branch" by Harvard final year ofhis life. According to Hull's theory, the la­ mathematician W.F.Osgood-see Taylor, 1984, p. 607). tency of such a conditioned response should gradually One day,my father, sensing my growing angst, surprised increase when, with the absence ofthe punishing shock, me by asking whether I had considered psychology. This the accumulated habit strength of the response drained was a field that could hardly be further from his own away like fluid from a leaking drum. Contrary to this fields-i-originally ofmining engineering and now ofmet­ prediction, however, I noticed that anyone rat's latencies allurgy.Nevertheless, he observed that psychology would of wheel running did not increase in the absence ofshock. seem to be the field that most explicitly offered a scien­ Instead, each rat continued responding to the light with tific approach to the mental. I was struck by the thought undiminished alacrity until, on some unpredictable trial, that the only son ofthe metallurgist might become-as it simply refrained from running. (Because different rats indeed I have (despite an ensuing period of behaviorist desisted after different numbers of shock-free trials, a indoctrination)-a mentalurgist! I did then register for plot showing a gradual increase in median latency for the Psych I. But, as I feared, I found little in this course that whole group of rats would not be representative of the appeared susceptible to the kinds oftheories that I so ad­ behavior ofanyone rat.) mired in physics and, certainly, nothing that looked geo­ To provide a quantitative representation typical of the metrical. StilI, in desperation, I obtained permission to behavior ofindividual rats, I took it upon myselfto plot register, as the only undergraduate, in the graduate-level the mean latencies of the wheel-running response on course Sensation and . Although offered by shock-free trials as a function of number of trials since Donald Taylor, the same professor who had taught the the last shock. The result is shown in Figure IA. For each disappointing Psych I, this course did turn out to be closer rat, these trials were the equivalent ofextinction trials in to my interests in perceptual phenomena and also, it that no shock was received. Because longer runs of seemed, more susceptible to mathematical formulation. shock-avoiding trials were rarer than shorter runs, sam­ At the end ofone class, Taylor mentioned an as yet un­ ple sizes decreased with length ofrun ofshock-avoiding published Stanford doctoral dissertation. In it, Fred Att­ trials and (as can be seen) variability oflatency increased. neave, using stimuli that varied on perceptually distinct Nevertheless (as can also be seen), there was no system­ dimensions (such as size, shape, and color), found evi­ atic increase in latency ofresponse. dence that both subjective judgments ofthe similarities When a rat that had thus been responding with short among the stimuli and objective frequencies oftheir ac­ latencies finally did delay its response, it did so discon­ tual confusions during paired-associates learning were tinuously, refraining from running at all until the shock inconsistent with a Euclidean representation. Evidently, at 5.8 sec caused it immediately to resume running to ter­ the "psychological distance" between stimuli did not ap­ minate the shock. The resumption ofrunning following proximate the square root of the sum of the squares of a shock and the cessation ofrunning that brought on the their separations on the component perceptual dimensions next shock were equally discontinuous. I was able to dictated by the Pythagorean theorem for Euclidean right show this in the single plot shown in Figure IB by lin­ triangles. Instead, the data implicated a non-Euclidean early stretching or shrinking (for each rat) each inter­ metric in which the psychological distance approximated shock interval so that it fit between the same plotted 4 SHEPARD

SA

7.15 en 7 -0 (shocked c: mean) 8 Q) 6 ~ 5.S0 (/J (latency Q) cutoff) (/J 5 5 c:

(/J8- Q) 4 4 a: Responsesearlyenough '0 to avoid shock i)- 3 9 3 c:

~ 2 ---'-f--+--t---t-:::-1r-....J;~ 1.92 N~ 93 2 c: r; 93 N~19 50 71 66 (unshocked al 19 Q) "1 '\. mean) ::E Samplesizes (N) 5

o...I,--o.._~ __...... _ ...... -o.._~~~ 7 ... u 1 2 3 4 5 6 7 8 9 10 0... 1 1. 1. !. 1. !. o 8'7 7 7 7 7 7 '7 8 .c .c III Shock-Free Trials Following Last Shock ~ Fractionof IntervalBetweenShocks III

Figure 1. Rats' mean latencies of wheel running following the onset ofthe conditioned stimulus plotted (A) as a function of successive trials after the last shock for those responses that occurred within 5.8 sec and, thus, avoided the shock, and (B) within successive sevenths ofthe intervals between shocks. From "A Funny Thing Happened on the Way to the Formu­ lation: How I Came to Frame Mental Laws in Abstract Spaces," by R. N. Shepard, 2003. In R. J. Sternberg (Ed.), Psychol­ ogists DefYing the Crowd, pp. 215-237. Washington, DC: American Psychological Association. Copyright 2003 by the Amer­ ican Psychological Association. (Adapted with permission.) shock trial endpoints. Within each ofseven equally spaced those ofus in Neal Miller's learning seminar to read the time slots between these endpoints, I plotted the mean of manuscript of Hull's about-to-be-published final book, the latencies of all the responses within that time slot. A Behavior System (Hull, 1952). But she was adamant With the resulting larger sample sizes, the mean within­ that absolutely no problem concerning Hull's theory be slot latencies were all very stable-and absolutely flat brought to his attention lest the shock precipitate in the throughout the shock-free interval, as can be seen in the failing Hull a final cardiac arrest. figure. I later learned that nearly 20 years earlier, Isadore Itdid not look to me as though habit strength gradually Krechevsky (who later shortened his name to Krech) had decreased in the absence ofthe shock. Rather, it looked as published evidence of"hypotheses" in rats-albeit dur­ though the rats were doing just what I would have done in ing (Krechevsky, 1932a, 1932b). their unhappy situation. It was as if the rats had formed Yet, the two animalleaming researchers on the Yalefaculty the hypothesis that the light signaled the imminence ofan­ to whom I did show the results ofmy analyses found my other shock that could only be prevented by immediately results incomprehensible and suggested that my outcome­ running in the wheel. But, after continuing to run without contingent ways of averaging the data were suspect. I any recurrence ofthe shock, a rat would eventually risk a filed my plots away and never ventured to publish them test ofits hypothesis about the connection between light until over 50 years later (Shepard, 2003). I also declined, and shock by not running. If the shock occurred, the rat from that time on, to undertake any further experiments would immediately resume running to terminate the on subjects unable to provide informed consent. shock. But if the shock did not occur, the exhausted rat, A decade later, such discontinuities in learning per­ welcoming the apparent break in the connection between formance were being confirmed-by means ofjust such light and shock, would refrain from further running. I also outcome-contingent averaging of data-by Bower and plotted median latencies for shock-free trials just during Trabasso (1963, 1964), Levine (1966), Levine, Miller, extinction-but backward in time, beginning with the last and Steinmeyer (1967), Restle (1962), and others, includ­ trial on which each rat responded. The median response ing, in a somewhat different way, myself(Shepard, 1963, latencies were again flat, manifesting no systematic in­ 1966a). These experiments were conducted on humans crease until each rat was no longer responding. rather than on rats, however, and most of them were Although intrigued by this suggestion of a cognitive based on the new, mathematically elegant hypothesis and process ofhypothesis testing in the lowly rat, I could not one-element stimulus-sampling models. This striking take the results ofmy analyses to the reclusive Hull him­ new evidence both benefited from and contributed to the self. His longtime secretary, Ruth Hays, had allowed then gathering shift from the more "thoughtless" or "hy- KEYNOTE ADDRESS 5

draulic" explanations of behavior to theories of the in­ I was now convinced that the problem ofgeneralization ternal processing ofinformation. was the most fundamental problem confronting learning theory. Because we never encounter exactly the same total Later Graduate School Years at Yale situation twice, no theory of learning can be complete After leaving the rat lab, I asked Carl Hovland to be without a law governing how what is learned in one situ­ my dissertation advisor. Although Hovland had done his ation generalizes to another. Hull, recognizing the need own doctoral research under Hull, he was the one senior for a law ofgeneralization, had earlier persuaded Hovland member ofYale's faculty who to determine the form ofthe function relating generaliza­ worked primarily with human subjects and who seemed tion to stimulus difference in his 1936 doctoral disserta­ open to theoretical ideas that differed from the S-R be­ tion. Hovland attempted this by measuring humans' gen­ haviorist approach then prevailing at Yale. Indeed, an­ eralized galvanic skin responses (GSRs) to tones differing ticipating the soon-to-flourish cognitive revolution, Hov­ in frequency from a training tone associated with (mild) land was already exploring an information-theoretic electric shock (Hovland, 1937). The averaged GSR approach to human concept learning (Hovland, 1952).3 strengths fell offwith difference in frequency of tone in While at Yale, I received inspiration from two young an apparently concave-upward manner suggestive of an visiting colloquium speakers. One was William Estes, exponential decay function. Accordingly, Hull then added whose elegant stimulus-sampling theory of learning an exponential-decay generalization function as Postu­ (Estes, 1950) came as a breath offresh air. In addition to late 5 in his Principles ofBehavior (Hull, 1943). But GSR laying the groundwork for the one-element models of data are notoriously variable, and Spence (1937), who discontinuous learning just referred to, Estes showed had his own theoretical reasons for preferring a convex­ how empirically testable functional relations could be upward function, argued that Hovland's (1937) data were derived, mathematically, from hypothetical but simple too noisy to conclude whether the generalization function and well motivated (ifabstractly formulated) elementary was concave or convex, let alone specifically exponential. internal processes. This was a notable departure from For me, there was a more fundamental problem. Even Hull's practice of postulating such functional relations if stable measures ofgeneralization could be found, gen­ after empirically fitting curves oftheoretically unmoti­ eralization functions could not in general be monotonic, vated functional form to animal data. Yet, when I con­ let alone invariant, when plotted on a physical dimension. fessed to Estes that I, too, aspired to a mathematical ap­ There is greater generalization (1) between tones separated proach, he counseled me that from his experience, in frequency by an octave than between tones separated by psychologists were not yet ready for mathematically for­ somewhat less than an octave, (2) between spectral hues mulated theories. Estes's own dissertation advisor, B. F. ofthe shortest and longest visible wavelengths (violet and Skinner, in commenting on Hull's theory building, had red, respectively) than between either ofthese and a hue of earlier written "A science of behavior," because its prob­ an intermediate wavelength (e.g., green), and (3) between lems are not of the same sort, "cannot be closely pat­ the vertical and horizontal orientations ofa rectangle than terned after geometry or Newtonian mechanics" (Skin­ between these and a 45° oblique orientation (Shepard, ner, 1938, p. 437). Much later, when Skinner and I were 1962b, 1965, 1987). both at Harvard, he liked to refer to mathematical mod­ In my 1955 doctoral dissertation, with the supportive els as "paper dolls." I am grateful that neither Estes nor I encouragement and suggestions oftwo then Yaleassistant heeded Estes's own warning about the field's lack of re­ professors, Robert Abelson and Burton Rosner, I devel­ ceptivity to mathematical formulations.t oped a method for recovering a spatial representation of The other inspirational colloquium speaker was stimuli from generalization data themselves, without ref­ George A. Miller. He presented his information-theoretic erence to any physical dimensions. Provisionally assum­ analyses ofthe extensive data that he and Patricia Nicely ing that the generalization function might be exponential, had collected on the rates at which listeners confused 16 I applied the inverse, logarithmic transformation to gener­ consonant phonemes with each other under 17 condi­ alization errors recorded during paired-associates learn­ tions ofnoise and filtering (Miller & Nicely, 1955). The ing (which proved to be much more stable than GSR challenge of trying to discover the dimensions of per­ data). I then showed that a type ofmultidimensional scal­ ceptual processing implied by this extraordinarily rich ing could be applied to these transformed data to yield a of data strengthened my resolve---originally stimu­ spatial representation of the stimuli (and also of the re­ lated by the dissertation of Attneave (1950)-to find a sponses). The goodness offit ofthe original data to an ex­ way ofextracting spatial representations ofstimuli from ponential function ofthe distances in the recovered spa­ similarity, confusion, and generalization data. Moreover, tial representation yielded the first reliable evidence that when I and then my associates later succeeded generalization does in fact fall offexponentially with psy­ in developing quite general computer methods ofdoing chological distance (Shepard, 1955, 1957, 1958a, 1958b). this by multidimensional scaling, tree-fitting, and cluster­ Although I passed the oral examination on my disser­ ing, the data ofMiller and Nicely proved to be the single tation, I subsequently learned that one ofmy examiners, most useful set oftest data (Shepard, 1972, 1980, 1988a). Fred Sheffield (another Yaleanimal learning researcher), 6 SHEPARD

had expressed the view that "this sort ofthing should not Shepard, Kilpatric, & Cunningham, 1975) and in the re­ be encouraged." I had previously heard Sheffield opine search I shall review here on mental transformations, that psychology was advanced not by constructing math­ generalization and discrimination, and music cognition. ematical theories but by designing experiments in such a Still, a nagging feeling that none ofthe basic research way that the final report could be written in advance of I was doing at the Bell Labs would ever earn or save collecting the data, leaving only spaces to fill in the ap­ money for the Bell System goaded me into undertaking propriate statistics and associated p values. I could only one experiment of more immediate practical relevance. wonder how Einstein would have arrived at the theory of My research assistant and I had people look up and dial. relativity by following such advice. seven-digit phone numbers in either of two ways. One was the usual way; the other was a reversed order in Years at the Bell Telephone Laboratories which the less familiar four-digit line number appeared In 1958, after three postdoctoral associateship years to the left and was to be dialed before (rather than after) (one at the Naval Research Laboratory and two with the generally more familiar and memorable three-digit George Miller at Harvard), I served for eight years on the central office code. We found that this reversal of order technical staff of the Bell Laboratories in Murray Hill, cut both the lookup-plus-dialing time and the frequency New Jersey. Inspired, in part, by a summer workshop of wrong numbers approximately by half (Shepard & conducted by and Herbert Simon on their Sheenan, 1965). Such a change could have eased the life new computer simulation approach to cognitive processes, of the customers and saved the Bell System enormous I planned to use Bell Labs' computer facilities to simu­ amounts of money. But when I circulated a draft of our late the evolutionary formation ofperceptual-cognitive report to the applied and engineering divisions, the feed­ mechanisms. But then the executive director ofthe large back I received was to go back to my "ivory tower." The division that included my department, the eminent engi­ Bell system was too big to make such a change, I was neer John R. Pierce, invited me to have lunch with him. told. So, I returned to my basic research and, from then No sooner had I mentioned the emerging field ofartifi­ on, accepted my paychecks without guilt. cial than Pierce, known for his sharp mind and outspoken opinions, rocked back in his chair and ex­ A Shift to Academia and to a Bolder Vision for claimed, "Ah yes, AI. There's no holding it up; it keeps Psychological Science hitting new lows!" It was during my ensuing faculty years that I began to Because this was my first real job and I now had a devote sustained thought to the possibility that mental family to support, I decided to use Bell Labs' computer principles may have arisen as accommodations to uni­ facilities to work, instead, toward new ways to extract versal physical or abstract mathematical features of the spatial representations from similarity and generaliza­ world. I had admired the elegant axiomatic approach to tion data-ways that were not possible with the mechan­ the psychological taken by R. Duncan Luce (1959a), in­ ical desk calculators that had been available back at Yale. cluding, in particular, his of"the This led to the development ofthe methods now known possible psychophysical laws" (Luce, 1959b). Now,dur­ as nonmetric multidimensional scaling-first by me ing a short (1966-1968) stint on the Harvard faculty, at (Shepard, 1962a, 1962b) and then, with improvements, the other end of the hall from the single-minded psy­ by my Bell Labs mathematical colleague Joseph Kruskal chophysicist S. Smith Stevens, I felt a need to achieve (1964a, 1964b). Using only qualitative (i.e., ordinal) clarity for myselfabout whether a particular form ofthe properties of the data, these methods yield quantitative psychophysical function might be advantageous for in­ (fully metric) spatial representations (Shepard, 1966b). dividuals behaving in the world (see Shepard, 1978b, Thus did I attempt to achieve in psychology what had 1981b). My mission gradually evolved into its present, been characterized as "the very essence ofscience from broader form during my 34-year tenure at Stanford­ the time of Anaximenes to the present day"-namely, principally through new investigations into mental trans­ "quality ... reduced to quantity" (Sambursky, 1956, formations and, once again, into my thesis topic ofgen­ p. 11). Moreover, these new methods provided for solu­ eralization. I came to suspect that a nonarbitrary basis tions in non-Euclidean as well as in Euclidean spaces for universal mental principles had been underestimated (Cunningham, & Shepard, 1974; Kruskal, 1964a, 1964b; for two reasons. Shepard, 1980, 1991; Shepard & Carroll, 1966). These First, the relevant features of the world, being both and related data-analytic methods-including additive universal and abstract, leave no contrasting background clustering (Shepard & Arabie, 1979) and, especially, the from which they might stand out to catch our attention. INDSCAL type of multidimensional scaling developed For example, all of our collective evolutionary history by my Bell Labs colleagues Carroll and Chang (1970)­ and individual learning experiences have occurred in a have proved to be useful in the behavioral, social, cogni­ three-dimensional Euclidean space. Much as fish that have tive, and neurosciences. They have also been essential in never inhabited a nonaquatic medium, neither we nor our my own demonstrations of what I termed the second­ ancestors have lived in a four-dimensional space. Some order isomorphism ofperception and imagery (see, e.g., humans know that four-dimensional Euclidean space af­ Shepard & Chipman, 1970; Shepard & Cooper, 1992; fords the possibilities oftransformations not available in KEYNOTE ADDRESS 7

our familiar three-dimensional space-such as rigid rota­ I was drifting toward wakefulness early in the morning tion (about a plane!) ofa left-hand glove into a right-hand of November 16, 1968, 1 experienced a spontaneous glove-but, presumably, through abstract mathematical hypnopompic image of three-dimensional objects ma­ reasoning rather than through direct experience-based jestically turning in space. Even before rising from bed, intuition. I had mentally worked out the design ofthe first experi­ Second, any accommodating mental structures are ment (Shepard & Metzler, 1971) in what was to become likely to have become so deeply internalized, implicit, a long-continuing series, first on mental rotation and and automatic in their operation as to be largely inacces­ then on related phenomena ofapparent motion (Shepard, sible to our conscious awareness. Two early examples 1984; Shepard & Cooper, 1982). I now describe the two struck me as illustrations ofthe power ofthe spatial in­ cases ofapparent motion and mental rotation and, then, tuitions with which we are unknowingly endowed. In a the third case ofstill more freely imagined spatial trans­ talk at Newell and Simon's computer simulation work­ formations. This is their order of increasing indepen­ shop in 1958, Marvin Minsky remarked that in compar­ dence from external support. I believe this also to be the ison with symbolic logic, the prooftrees for plane geom­ order oftheir evolutionary emergence. Mechanisms that etry are so ramified that it is only by virtue ofour spatial originally evolved in the service ofperception may have intuitions that we are able to prove even elementary the­ gradually been liberated to aid in the anticipation of fu­ orems ofEuclidean geometry. Similarly, Donald Michie, ture states ofthe world and, then, in the planning ofac­ while visiting me at Bell Labs, described a game in tions likely to increase the probability of favorable fu­ which two players alternately draw single counters from ture states. a set ofnine bearing the numbers 1 through 9. The first Apparent motion is the most automatic and perceptual player to obtain three counters whose numbers sum to 15 ofthese three cases. In it, the display ofan object in one wins. People are very slow to master this game. Yet, it is position is suddenly replaced by a display ofan object of isomorphic to the trivial but spatially presented game of the same shape but in a different position in space. The tic-tac-toe. This can be seen from the existence of a resulting visual experience is that a single enduring ob­ 3 X 3 magic square with the numbers 1 through 9 as­ ject has suddenly moved from one position to the other signed to the nine cells in such a way that (for example) and has done so over a definite connecting path. Under the top, middle, and bottom rows contain 8-1-6, 3-5-7, appropriately general conditions, the path over which the and 4-9-2, in that order. In this square, the three numbers motion is experienced is the most direct and simplest in each row, each column, and each diagonal (and only one, as specified by the kinematic geometry of three­ these) sum to 15. dimensional Euclidean space and the symmetry group of But, if the constraints ofthe particular space in which the object (Carlton & Shepard, 1990a, 1990b; Foster, we have lived have always been externally available to 1975; McBeath & Shepard, 1989). If the two displayed us, what advantage would we have gained by internaliz­ positions differ in orientation, this kinematically sim­ ing those constraints? This question was raised most re­ plest path is generally curved. In agreement with the ex­ cently and insistently in the articles and commentaries perienced continuity ofthe motion, when the two posi­ on my work in a special issue of Behavioral and Brain tions are displayed in ongoing alternation, the minimum Sciences (2001) by Barlow (pp. 602-607), Kubovy and duration ofthe display ofeach position yielding experi­ Epstein (pp. 618-625), Schwartz (pp. 626-628), Todorovic enced preservation ofobject structure increases linearly (pp. 641-651), Bertamini (pp. 655-657), and Jacobs, with the length ofthe path over which the motion is ex­ Runeson, and Andersson (pp. 679-680). I responded perienced (Farrell & Shepard, 1981; Shepard & Judd, with both empirical findings and theoretical arguments 1976; Shepard & Zare, 1983). Moreover, the same time­ (see Shepard, 2001, pp. 712-729, 736-742, particularly distance relation evidently holds in other modalities, pp. 714-715, 723-724). Perhaps most relevant here is the such as the tactile and the auditory, and, in the latter case, adaptive value conferred by being able to simulate, men­ on the dimensions ofboth physical location and pitch (e.g., tally,the simplest and, hence, swiftest structure-preserving Jones, 1976; Lakatos & Shepard, 1997a, 1997b; Shepard, transformation that will carry one represented object 1981c, pp. 318-320). into congruence with another in the absence ofany exter­ Mental rotation represents a more voluntary and cog­ nal support for the particular path taken by that transfor­ nitively effortful case. In it, the time to determine that mation. Thus, the just considered game oftic-tac-toe is two displayed objects are ofthe same shape similarly in­ child's play because the number ofalternatives to be con­ creases linearly with the extent ofthe kinematically sim­ sidered is radically reduced by the implicitly recognized plest rigid motion required to bring one object into the symmetry in which any patterns ofXs and Os that can be orientation of the other in three-dimensional space. transformed into each other by rotations or reflections of Here, too, this simplest motion is affected by any sym­ the square are equivalent. metries of the object. Despite the absence of external support, the object is sequentially represented in posi­ Investigations ofMental Transformations tions corresponding to successive points along this par­ My investigations of mental transformations had a ticular simplest connecting path in the abstract space of sudden beginning shortly after I arrived at Stanford. As possible positions. This is demonstrated by how the 8 SHEPARD speed of the (accurate) discriminative response to a vi­ ing orientation, centered within a circular field (as illus­ suaI probe presented during the mental operation sys­ trated in Figure 2B). Immediately upon the replacement tematically varies as a function of the time and relative ofthis polygon by a blank field, the subject was to imag­ orientation ofthe presented probe. ine the just presented shape rotating in a clockwise direc­ Much ofthe work on mental rotation may be familiar to tion within that circular field. Then, at some unpre­ most cognitive psychologists, and the first decade ofthat dictable time, a test probe consisting of either the same work (and the related work on apparent motion) was cov­ standard polygon or its mirror image was presented within ered in the book MentalImagesand TheirTransformations the same circular field at an unpredictable orientation.. (Shepard & Cooper, 1982). Still, as an experiment pro­ The participant's task was to push a right- or a left-hand viding particularly clear evidence that the internal pro­ button as quickly as possible to indicate whether the test cess represents the structure ofa complex object in suc­ probe was the standard or the reflected version ofthe poly­ cessively more rotated orientations-and does so during gon, respectively, regardless ofits presented orientation. the absence of any corresponding external stimulus-I The results are summarized in Figure 2. When the here describe the results ofthe elegant and quantitatively probe was presented at an orientation in which each par­ precise experiment by my former research collaborator ticipant should (according to that participant's previ­ Lynn Cooper (1976).5 ously determined rate ofmental rotation) be representing The participants in this experiment had already served that particular polygon at the time ofpresentation ofthe in other experiments ofCooper and were highly familiar test probe, responses were uniformly accurate and fast. with the eight random polygons that she had originally As is shown in Figure 2A, these averaged just over half constructed for her dissertation experiments. As a result, a second, regardless of the absolute orientation of that each participant's speed of mental rotation had already test figure in the circular field. When the probe was pre­ been determined for these stimuli. At the beginning of sented in an orientation that differed from the orientation each trial, one of these familiar random polygons was in which a participant was expected to be representing displayed to the participant in its standard, original train- the polygon at the time of test, however, discriminative

A B "0 -g 900 • 60· probes (familiar) Cooper (1976) 8 o 30· probes (unfamiliar) CD .~ 800 I

Q) 700 E Reaction time expected if SUbjects i= werenot mentallypassingthrough c:: representations at 30° orientations o ~ 600 ------13m a:Q) c:: 500 m Example of random CD polygon in circular field ~ 30° 90° 150° 210° Z7Q0 3300 0° 60° 120° 180° 0° 60° 1200 180°2400 310° Probe Orientation - When Same Probe Departure From Orientation as That Subject Is Predicted to Be That Subject Is Predicted to Be Representing at Time of Test Representing at Time of Test

Figure 2. Mean time during mental rotation of a random polygon to classify a test probe as the standard ora reflected version ofthat polygon, plotted (A) as a function oftheabsolute angular ori­ entation of the probe, when it was presented in the orientation through which the subject was pre­ dicted to be mentally passing at the time of test, and (8) as a function of the angular departure of that probe from the orientation through which the subject was predicted to be mentally passing at the time of test. From "Demonstration of a Mental Analog of an External Rotation," by L. A. Cooper, 1976, Perception & , 19, 296-302. Copyright 1976 by the Psychonomic Society. (Adapted with permission.) KEYNOTE ADDRESS 9

reaction time increased linearly with the angle of such ceived as a cube, as long as its edges and corners (no mat­ departure. As can be seen in Figure 2B, this time, which ter how wildly curving and nonorthogonal) project, on the was a little over 500 msec for 0° departure, reached about (stationary) observer's retina, an image that is identical to 800 msec for the maximum possible departure of 180°. that projected by a true cube. Followers ofEgon Brunswik It was as if the participant were mentally representing the might argue that the normal cube, rather than any of these polygon in orientations successively more rotated in a distorted variants, is experienced because straight lines clockwise direction. Then, when the probe figure ap­ and right-angled corners are statistically more common in peared in some other orientation, the mental rotation had our environment. I believe that the reason is deeper than to continue on-or back up-to the orientation of that this. I believe that it depends on our implicit appreciation externally presented test stimulus in order for the partic­ of the consequences ofpotential spatial transformations. ipant to determine whether the test figure matched the For, ifthe object were actually twisted and distorted in the internally represented polygon or was its reflected version. way described, then the probability that a freely mobile The leftmost point, for the case ofa 0° departure from observer would be so positioned that the distorted object the expected orientation in Figure 2B, is simply the av­ yields exactly the same retinal projection as the true cube erage of all the points in Figure 2A. The open circles would be not just low, but zero. This account ofthe mat­ among those points are for intermediate test orientations ter is, I suggest, more deterministic and Gibsonian than it that differed by multiples of30° from any orientations in is probabilistic and Brunswikian. which these polygons had previously been presented to Implicit representation ofpossible spatial transforma­ these participants. Nevertheless, relative to the virtually tions may also playa significant role in our appreciation horizontal line fitted to the points in Figure 2A, the re­ of symmetries in objects and visual patterns (Palmer, action times for these novel test orientations were no 1982,1983; Shepard, 1981c). The arts and crafts of all greater than those for the probes in the previously expe­ cultures of the world favor symmetrical and repeating rienced orientations (which had been confined to 60° patterns, and we are exquisitely sensitive to small depar­ steps around the circle). Such results provide support for tures from symmetry. Now, symmetry is defined, math­ our claim that mental rotation is an analogue process in ematically, in terms ofinvariance ofpattern under trans­ which something is literally rotating. But what is rotat­ formation. Thus, we may immediately, ifunconsciously, ing is very abstract. It is, specifically, the orientation in "see" that a horizontally repeating pattern would be un­ the external space ofthe blank circular field in which a changed by certain translations, that a bilaterally sym­ test stimulus-ifit were to be presented-would yield a metric shape would be unchanged by a reflection, and fast and accurate discriminative response. I would hope that a circle would be unchanged by any rotation. Indeed, that the astrophysicist whom I quoted near the beginning the perception of shape, generally, may derive from an of this article would allow that these data are not mere appreciation ofthe object's degrees ofapproximation to nonobjective and nonquantitative "expressions of sub­ self-similarity under all possible rigid transformations jective feelings." (Shepard, 1981c,pp.3l5,326; 1982b,p.60; 1988b,p.95), Freely imagined spatial transformations, as in creative or even nonrigid transformations (Leyton, 1992). Fur­ thinking and , generally go on with still thermore, as I shall note later, our intuitions about sym­ less external constraint or support. While lying in bed metry may have played the crucial role in the thought ex­ with eyes closed, one might imagine how a large table periments leading to some ofthe most important insights could be maneuvered through a narrow doorway. Ac­ ofphysical science and moral philosophy. counts of many mathematicians, scientists, engineers, Our implicit spatial competence may in some way also inventors, architects, and sculptors have remarked on the underlie various other cognitive functions or develop­ importance ofspatial visualization in their creative work ments that might at first not seem at all spatial. Examples (Shepard, 1978a, pp. 134-159). Einstein, when asked that I have at various times discussed include about the mental processes that led to his revolutionary organization and retrieval (Shepard, 1966a, p. 203; 2001, discoveries, spoke of "visualizing ... effects, conse­ p. 726), comparison and judgment processes (Shepard, quences, possibilities" (quoted in Wertheimer, 1945, 1981c, pp. 329-330; also see Dixon & Just, 1978), lan­ p. 184) by means of"images which can be 'voluntarily' guage origin, syntax, semantics, and metaphor (Shepard, reproduced and combined" (quoted in Hadamard, 1945, 1975, pp. 113-116; 1982b, pp. 61-65), and something p. 142). I suggest that these nonverbal processes are of to which I shall return shortly-namely, music cognition essentially the same kind as the processes ofmental ro­ (Shepard, 1982a). tation that my co-workers and 1 quantitatively investi­ gated in an extremely purified form under the highly Investigations ofGeneralization controlled conditions of the psychological laboratory. In the second line of work I have chosen for illustra­ Implicit representation ofpossible spatial transforma­ tion here, I returned to what had struck me decades ear­ tions may playa significant role even in simple object per­ lier at Yale as the most fundamental of psychological ception, in which maximal external support might seem to laws-namely, a law of generalization. I now applied be available. Any twisted and distorted shape will be per- multidimensional scaling to generalization data from 10 SHEPARD

A B 1~------., 1...------.... Triangles differing in Circles differing in size size and shape G: McGuire(1954) -o G: Attneave(1950) -o D: Shepard (1962b) -c: D: Shepard (1958a) -c: o o ~ ~ .~ .~ ~ ~ (J) CD c: c: CD oCD o oL----===-==.... oL-_.:::.£.:a=r...... o Distance (D) o Distance (D) C D 1.....------t 1~------.... Colors differing in hue Colors differing in hue G: Ekman (1954) -Cl G:Guttman& Kalish(1956) e D: Shepard (1965) - D: Shepard (1962b) -c: -c: o o ~ (pigeon data) 1ii (human data) .~ .~ ~ ~ (J) CD c: c: (J) CD o Cl • • • Distance (D) Distance (D)

Figure3. Simple exponential-decay functions fitted to generalization data that are plotted against the interstimulus distances obtained by applying multidimensional scaling (MDS) to those data. (A) Humans' generalization data from the original study ofAttneave (1950) for triangles differing in size and shape, with distances from the MDS solution obtained for these by Shepard (1958a). (B) Humans' generalization errors while learning nine responses to nine circles of different sizes (McGuire, 1954), with distances from the MDS solution obtained for these by Shepard (1962b). (C) Pigeons' generalization data obtained by Guttman and Kalish (1956) for colors of different wavelengths, with distances from the MDS solution obtained for these data by Shepard (1965). (Dj.Humans' similarity ratings obtained by Ekman (1954) for colors of different spectral hues, with distances from the MDS solution obtained for these data by Shepard (1962b). AU four figures adapted from Shepard (1987), "Toward a Universal Law ofGeneralization for Psychological Sci­ ence." Science, 237,1317-1323. Copyright 1987 by the American Association for the Advancement ofScience. (Adapted with permission.)

many experiments on both human and animal learning in analysis (Shepard, 1962b, Figure 10) of the generaliza­ which a great variety ofvisual and auditory stimuli were tion errors made by humans while they learned a differ­ used. I found that the probability with which a response ent response to each ofnine circles ofdifferent sizes in that had been learned to one stimulus was made to an­ an experiment by McGuire (1954-see Shepard, 1958b, other stimulus uniformly fell offin approximation to an Table 1). Figure 3C comes from my analysis (Shepard, exponential decay function of distance between those 1965, Figure 2) ofdata collected by Guttman and Kalish stimuli in the recovered representational space. (1956) on pigeons' generalization ofa pecking response, Here, I show 4 ofthe 12 examples presented in Shepard originally learned (by each pigeon) to a particular wave­ (1987). Figures 3A and 38, respectively, are the results length oflight, when later tested with other wavelengths. of my analysis (Shepard, 1958a, Figure 2) of data from Figure 3D comes from my analysis (Shepard, 1962b, Attneave's (1950) original study and the results of my Figure 14) of data collected by Ekman (1954) on hu- KEYNOTE ADDRESS II

mans' judgments ofsimilarities between colors that also search. The theory predicts that the abstract representa­ differed on the hue continuum. Ijustify inclusion ofdata of tional space is (locally) approximately Euclidean only to the latter type on the basis ofmy assumption that experi­ the degree that the extensions of the consequential re­ enced similarity and behavioral probability ofgeneraliza­ gions along different directions in the space are posi­ tion arise from the same distance-based representation tively correlated. Only then will the exponential arising ofsimilarity. As can be seen, the data are approximated from the Bayesian integration drop off in different di­ (especially for the human data, in panels A, B, and D) by rections in ways that approximately yield the elliptical the simple exponential decay function ofdistance in rep­ equal-generalization contours (or surfaces) consistent resentational space. This function is forced to pass with the Euclidean metric. This case seems appropriate through the generalization value 1 at zero distance, and for integral dimensions such as those of lightness and fitted by adjustment of a single slope parameter, which saturation for colors (Shepard, 1958b, 1962b, 1991). If, is expected to depend on the particular animal and set of however, the extensions of the consequential regions stimuli tested. along different dimensions are assumed to be uncorre­ In order to put the exponential function forward as a lated, the slope of the computed exponential fall-off universal law, however, I needed to identify something varies in a different way with directions in the space. about the world in which any behaving organism might This case, which seems more appropriate for separable evolve that would favor the emergence of a law ofjust dimensions such as those ofcolor and shape (Attneave, this form. There is one fact about the world that seemed 1950) or ofsize and orientation (Shepard, I964a), yields most relevant and universal. Biologically significant ob­ equal-generalization contours that approximate the dia­ jects belong to distinct basic kinds-each with its own mond shape corresponding to the so-called "city-block" consequences for an organism. I argued that generaliza­ metric implicated by Attneave (see Myung & Shepard, tion is not merely a failure ofdiscrimination, but a cog­ 1996; Shepard, 1987, 1991). nitive act of inductive inference. An individual who has The fundamentally Bayesian approach to generaliza­ found a particular object to have a significant conse­ tion briefly described here is being extended to the for­ quence and who now encounters a new object that is dis­ mulation ofadditional fundamental laws ofinductive in­ criminably different from that earlier, consequential ob­ ference by Joshua Tenenbaum and Thomas Griffiths ject needs an estimate of the probability that this new (200 Ia, 200 Ib). John Anderson too has demonstrated object has the potential for the same consequence. On the power of a Bayesian approach to learning and cate­ essentially Bayesian grounds, I argued that the estima­ gorization. He was also particularly effective in advo­ tion ofthis probability requires a summation over all hy­ cating the idea that principles governing human cogni­ potheses about which objects might be ofthe kind hav­ tive processes represent adaptations to general features ing that consequence or, in the case of a continuous ofthe world (Anderson, 1990, 1991). space ofpossible objects, it requires an integration over all such hypotheses. For convenience I refer here to such Investigations ofDiscrimination computations as Bayesian summations and integrations. The theory ofgeneralization is, as I noted, a theory of Following my geometrical inclination, I supposed that cognitive inference, not a theory of failure of sensory a consequential subset would correspond, in the abstract discrimination. It is based on an individual's uncertainty representational space of objects, to a suitable (in the about what region in representational space corresponds continuous case, connected) consequential region. Intu­ to the set ofobjects harboring some consequence. To the itively,it seemed clear that points representing more sim­ extent that these stimuli are not fully discriminable, there ilar objects and, hence, lying closer together in the rep­ will be additional uncertainty about what point in the resentational space would jointly fall within a greater space corresponds to a given stimulus. Generally, in this number ofthe potentially consequential regions. I found case, the function relating probability ofresponse to dis­ that the Bayesian integration with priors corresponding tance between stimuli is expected to become inflected to minimum knowledge (i.e., to maximum entropy) does and convex upward in the vicinity ofthe training stimu­ indeed yield specifically the exponential decay function lus (cf. Nosofsky, 1986; Shepard, 1958a, 1986). (Indeed, (Shepard, 1987). This result was soon extended, by Stu­ Figure 3C may manifest some convexity of this sort.) art Russell, then at Stanford, to stimuli that differed in As was originally demonstrated by Henmon (1906), the presence or absence of discrete features. Russell however, the time required to discriminate two stimuli, showed that for such stimuli, summation over candidate like probability of generalization, falls off with inter­ subsets yields a function that falls offexponentially with stimulus difference according to a concave upward func­ a distance between stimuli, defined in terms ofthe pro­ tion. But discrimination time does not fall offexponen­ portion ofsuch features that are not shared by the stim­ tially.In my 1981 presidential address to the experimental uli (Russell, 1988). division of the American Psychological Association I found that the theory ofgeneralization also explains (Shepard, 1981a), I presented a dozen sets of discrimi­ the non-Euclidean results ofAttneave (1950), which had nation time data obtained by different researchers. In originally tipped me toward a career in psychological re- each ofthese sets, this fall-offwith distance could be ap- 12 SHEPARD

A B 360..--~------..., 1,000..._------. Circles differing in size Lines differing in length ~ i==" Henman (1906) Curtis, Paulos. & Rule (1973) -(]) --(]) E E i= F c § o +:l at ~ c: c 'E 'E .§ 'C:: C/) ~ 15 B 300 L...,. ....I 290L------....I o Distance (D) o Distance (D) C o 1,000..--":"""""------..., 900~------..., i=' Arrays of dots differing ~ Numbers 31-99 differing in in number of dots magnitude (relative to 65) --(]) Buckley&Gillman (1974) -(]) Dehaene. Dupoux. & Mehler E E (1990) i= F c: c o o ~ at ~ c c 'E 'E 'Co 'C:: C/) ~ 15 B • 500 L...,. ....I 500 L...,. ....I o Distance (D) o Distance (D) from 65 34

Figure 4, Reciprocal functions (with positive additive constants) fitted to humans' discriminative reaction times (in milliseconds) plotted against corresponding interstimulus distances. (A) Dis­ junctive reaction times to pairs of lines differing in length in the original experiment reported by Henmon (1906), with reciprocal function fitted by Shepard (1981a). (B) Discrimination times to in­ dicate which of two circles differing in size was the larger, with fitted reciprocal function, from the experiment reported by Curtis, Paulos, and Rule (1973). (C) Discrimination times to indicate which oftwo arrays ofdots contained the greater number of dots (Buckley & Gillman, 1974), with recip­ rocal function fitted by Shepard (198b). (D) Average discrimination times to indicate, for each pre­ sented two-digit number between 31 and 99 (except 65), whether it was larger or smaller than 65, plotted as a function of the absolute numerical difference between the presented number and the comparison number, 65. I computed the plotted averages, from data kindly supplied to me by Stanis­ las Dehaene, from an experiment reported in Dehaene, Dupoux, and Mehler (1990). The smooth curve is the reciprocal function I then fitted to the plotted points, Figure 4B is adapted from Cur­ tis, Paulos, and Rule (1973). Copyright 1973 by the American Psychological Association, (Adapted with permission.)

proximated, rather, by the reciprocal ofdistance (i.e., by of random neuronal noise or simply to terminate the distance raised to the -1 power) plus an additive constant trial.) yielding a positive asymptote corresponding to the min­ Here, I present the results obtained from four experi­ imum possible discrimination time. A reciprocal func­ ments on discrimination time (all plotted in millisec­ tion ofthis form had earlier been proposed, on somewhat onds). Multidimensional scaling was not essential to de­ different grounds, by Curtis, Paulos, and Rule (1973). termining the distances in these cases. The stimuli varied Unlike the exponential, this function increases without along a single physical dimension in such a way that bound as interstimulus difference approaches zero. (For physically defined distances might reasonably be pre­ extremely small differences, however, a subject will re­ sumed to approximate distances in the appropriate rep­ spond earlier than predicted by this function as a result resentational space. Figure 4A is based on the times orig- KEYNOTE ADDRESS 13

inally measured by Henmon (1906) for discrimination of nonidentity presents a different problem. I, among others, lines ofdifferent lengths. Figure 4B is the already men­ noted that in detecting a more subtle difference between tioned results ofCurtis et al. (1973) for discrimination of two stimuli, any being with limited processing resources circles ofdifferent sizes. Figure 4C is based on the times is expected to require more fine-grained and, hence, measured by Buckley and Gillman (1974) to indicate, for more time-consuming processing. each presented pair ofdot arrays, which array contained Building directly on the already developed cognitive the greater number ofdots. In each of these three exam­ theory of generalization, I supposed that any difference ples, a reciprocal function with a fitted (positive) as­ might make a difference. Even highly similar stimuli, if at ymptote appears to provide an approximation to the all different, could potentially correspond to objects of chronometric data. different kinds and, so, harbor distinct consequences. I ac­ I include the fourth example, presented in Figure 4D, cordingly invoked the same consequential regions and to show that the dimension with respect to which dis­ their associated maximum-entropy priors that I had intro­ tances are mentally represented need not be physically duced for generalization. I made the simplest additional instantiated. Beginning with Moyer and Landauer (1967), assumption that each stimulus elicits internal representa­ a number ofexperiments have shown that the time to de­ tional events corresponding to the potentially consequen­ cide whether one number is larger or smaller than an­ tial regions containing that stimulus with probabilities, per other decreases as distance between those numbers on an unit time, proportional to these priors. I assumed that the underlying number line increases, even though the num­ discriminative reaction is precipitated by the first such bers are presented only in the symbolic form of Arabic event corresponding to a region that includes either of the numerals-whether as single digits (see, e.g., Buckley & stimuli but not both. Integration over all potential regions Gillman, 1974; Parkman, 1971; Shepard et aI., 1975, then yielded an expected discrimination time that does in­ pp. 131-133) or as two-digit numbers (e.g., Dehaene, deed decrease with distance in accordance with a recipro­ Dupoux, & Mehler, 1990; Hinrichs, Yurko, & Hu, 1981). cal function plus an additive constant---eurves ofthe form In Figure 4D, I replotted data from an experiment by fitted to the discriminative reaction times plotted in Fig­ Dehaene in which subjects pressed a left- or right-hand ure 4 (Shepard, 1987, p. 1322). This does not constitute button to indicate, as quickly as possible, whether a pre­ proof that the reciprocal discrimination-time function is sented two-digit number (from 3I to 99, omitting 65) optimal in the world. (Such a proofmight require a precise was, respectively, greater or less than the fixed compar­ specification ofthe form assumed for the processing lim­ ison number 65 (Dehaene, Dupoux, & Mehler, 1990). itations ofa finite being.) But the reciprocal fall-off, with The points are averaged for groups of four consecutive distance, ofdiscrimination time, in addition to fitting the numbers (except close to zero, where, because the curve data, does have the intuitively expected qualitative form rises so steeply, the averaging was within smaller groups and, of course, has (for me) an appealingly simple rela­ of two and, finally, of one), and then for such groups tion to the theory ofgeneralization. equally distant below and above 65. Although there is evidence that symbolic and other factors influence deci­ Investigations ofMusic Cognition sion time, I take such results to provide compelling evi­ The evolutionary emergence ofmental structures spe­ dence that humans compare mentally represented analog cific to the generation and appreciation of music pre­ magnitudes corresponding to the symbolic numbers sents a challenge not posed by the evolutionary emer­ much as if they were comparing physically presented gence of generalization, discrimination, or mental lengths oflines or sizes ofcircles. transformations. The abilities both to discriminate stim­ As before, however, a proposal ofuniversality requires uli and to determine that they may nevertheless corre­ a theoretical justification for why the law should ap­ spond to objects ofthe same kind, with the same poten­ proximate this reciprocal form. I had already argued that tial consequence, are crucial for survival and must be the exponential law of generalization arose as an opti­ possessed in some form, however rudimentary, by all an­ mum cognitive strategy in the world, and not merely as imals capable oflearning. Similarly, a capability for de­ a failure of discrimination. I now turned around and termining as quickly as possible that two successive im­ asked whether the proposed reciprocal law ofdiscrimi­ ages are of the very same object undoubtedly confers nation time might itself have arisen because it too is in some benefit. I believe it to be represented-at least in some way optimal. In our earlier investigations of the the more perceptual form ofapparent motion-in many time needed to determine that two superficially dissimi­ visually and spatially proficient animals. But musical lar stimuli nevertheless correspond to the same external ability does not seem essential for survival. At most, as is object in different spatial orientations, my students and I perhaps suggested by the case ofsongbirds or whales, the found that decision time increases linearly with the ex­ lack ofsuch ability might put one at some disadvantage tent of the kinematically simplest transformation that in competing for a mate and, hence, in perpetuating one's would bring the object from one orientation into the genes. Apart from the debatable cases ofbirds and whales, other. For that case, I argued that actually performing what we recognize as music seems (at least on this planet) such a transformation mentally was the most efficient to be the exclusive province ofhumans. Even among hu­ strategy for establishing structural identity. Establishing mans, moreover, innate musical ability varies apprecia- 14 SHEPARD

A B

••••• Krumhansl & Shepard (1979) ...... Krumhansl & Shepard (1979) e- .~ e- x 7 _ Krumhansl & Kessler (1962) x 7 __ Krumhansl & Kessler (1982) CD 0 CD

C ~ c 0--- 0--- SChubert (op, 94, No.1) SChubert (op. 94, No.1) 8 0 8-e- 0 -e- Q t? Q 0) 6 GI ~6 c: .D '0 ::J '0 CD CD Q 8'5 u CD "'~ CD ... 5 5 a. (ll c. ~ s: ~ ~ ~ ~! ~ := 4 := 4 '0 c: '0 en .g en en en CD 8 ~ CD c: 3 ~"'C c: 3 "'C "'C 0 0 0 ~ 0 g I- .9 ; 0) 2 0 0)2 j o c: c: ! Tones not in the tii if ~=:..~~.~~::.~, a: a: L...... J U m2 M2 m3 M3 P4 T P5 m6 M6 m7 M7 U P5 M2 M6 M3 M7 T m2 m6 m3 m7 P4 Probe Tone (interval up from the tonic for that context) Probe Tone (reordered around circle of fifths)

Figure 5. Musical listeners' ratings of how well each of 12 chromatic probe tones in an octave fits a preceding major-key context, and durations of corresponding tones in an actual piece of music. (A) Broken line: ratings for the context of the diatonic scale (Krumhansl & Shepard, 1979); solid heavy line: ratings for a chord cadence (Krumhansl & Kessler, 1982). Solid thin line: total du­ ration oftones ofthe same 12 pitch classes in Schubert's op. 94, D780 (Krumhansl, 1985). Copyrights 1979 (for Krumhansl & Shep­ ard, 1979) and 1982 (for Krumhansl & Kessler, 1982) by the American Psychological Association. (Adapted with permission.) The profile for Schubert is from Krumhansl (1985). Copyright 1985 by Sigma Xi. (Adapted with permission.) (8) The same data, but with the 12 pitch classes ofthe probe tones permuted from their order around the chroma circle into their order around the circle offifths. bly. The question therefore arises as to whether, and in fa, sol, la, and ti). The heavy solid line shows the results what , there may be universals ofmusic cognition. from a later study by Krumhansl and Kessler (1982), At the outset of our investigations into music cogni­ using richer musical contexts ofchord cadences. The hi­ tion, Carol Krumhansl and I introduced a probe tone erarchy of tonal functions specified by music theory method (Krumhansl & Shepard, 1979) that has proved emerges clearly in the resulting profiles. The highest rat­ capable of yielding stable and reproducible evidence ings are uniformly given to the tonic tone ofthe key de­ about the representation of tonal hierarchies (for many fined by the context (and to the tonic tone an octave subsequent examples, see Krumhansl, 1990). With this above the unison, although that is not shown here). The method, we presented various brief musical contexts next highest ratings are given to the dominant (or perfect consisting, for example, of a musical scale, a chord se­ fifth above the tonic). The next highest ratings are given quence, or a short excerpt from an actual piece ofmusic to the other tones belonging to that key (the subdomi­ (consistent, in each case, with a particular musical key). nant, mediant, and so on). Finally, the lowest ratings are Following each presentation of such a context, we pre­ uniformly given to all the tones not in the key defined by sented a single probe tone of some pitch, and listeners the presented context (the musical accidentals-which, rated (usually on a seven-point scale) how well that in the key of C Major, would correspond to the black probe tone fit in with the preceding context. keys on a piano). In Figure 5A, averaged ratings are plotted for probe These suhjective ratings agree remarkably well with tones presented at each of the successive steps of the objectively measured attributes ofactual pieces ofmusic chromatic scale within an octave determined by the mu­ (Krumhansl, 1990, pp. 68, 72). As one example (from sical key ofthe preceding context. [These probes corre­ Krumhansl, 1985), the solid thin line plots the total du­ spond to the unison, U, taken as the tonic tone ofthe key, rations in numbers of musical beats (as tabulated by and (ascending in pitch) the minor and major seconds, Hughes, 1977) ofthe corresponding notes in the first of m2 and M2, the minor and major thirds, m3 and M3, the Schubert's Moments Musicaux, op. 94, D780. perfect fourth, P4, the tritone, T, and so on up the chro­ In Figure 5B, the data from Figure 5A are replotted, matic scale.] The broken line shows the results for the but with the probe tones permuted from their chromatic more musical listeners ofour original study (Krumhansl order ofascending pitch to their order around the circle of & Shepard, 1979), in which the context was the succes­ fifths (as is listed along the x-axis). According to music sive tones ofthe major diatonic scale (named do, re, mi, theory (as well as to the spatial and group-theoretic struc- KEYNOTE ADDRESS 15

mres soon to be discussed), the circle of fifths is more (Krumhansl, 1990, chap. 10). Fundamental is the usual relevant to musical relations oftonality and key. The for­ selection ofeither five or seven principal tones arranged, merly irregular sawtooth profiles now take on a simpler in the same way within each octave, to achieve a certain U shape such that all the probe tones not in the key de­ uneven spacing in pitch. This uneven spacing is reflected fined by the context are grouped together at the bottom in the unevenness of the sawtooth pattern in Figure 5A ofthe U. (The rating profiles for probe tones following and in the corresponding uneven spacing of the black minor key contexts also show the same bottorn-of-U keys on a piano. Moreover, this selection and uneven grouping oftones not included in the natural minor.) spacing is not arbitrary. It achieves the uniquely best sat­ The rating profiles shown in Figure 5 were obtained isfaction ofeach oftwo independent conditions. from musically inclined listeners. Whenever we tested The first condition arises from physical acoustics. Its unselected populations, we found large individual dif­ recognition can be traced back to what has been deemed ferences. The rating profiles obtained from listeners re­ "the first application of mathematical description to porting less interest or background in music do not show physical quantity"-namely, the "enunciation ofthe laws the sharp, sawtooth pattern ofFigure 5A or the strikingly of musical harmony" by Pythagoras (Gillispie, 1958, orderly dependence on the circle offifths of Figure 5B. p. 63). It has been widely recognized and progressively Instead, the profiles of the less musical listeners show refined during the last century and a half, beginning with (in addition to a degree of octave equivalence) mainly a Helmholtz (1862/1954). It requires that the frequency ra­ monotonic decrease in rating with distance along the tios and the approximations to overtone coincidences chromatic scale from the tonic tone of the context among the selected tones be such that combinations of (Krumhansl & Shepard, 1979). Evidently, whereas mu­ these tones, when played together, achieve blends that sical listeners pick up structural features such as tonal are physically smooth and, hence, pleasingly consonant hierarchies keyed to the circle offifths, less musical lis­ to the ear (see, e.g., Hutchinson & Knopoff, 1978; teners may pick up only pitch height and octave equiva­ Kameoka & Kuriyagawa, 1969). Recently, Schwartz, lence. This is the case even in the Western world, where Howe, and Purves (2003) have obtained striking evi­ diatonic music is ubiquitous. dence that the universals of tone selection and spacing The question of the universality of tonal hierarchies correspond to statistical characteristics common to the led me to undertake, in collaboration with two students speaking voices of all humans, regardless of language. (Christa Hansen and Edward Kessler), my one and only Of course, this does not entail universality of musical cross-cultural experiment. In reading up for that study, I structures for all musical beings wherever they may have was struck by the preoccupation ofmost ethnomusicol­ evolved-unless these statistical characteristics of the ogists with the concrete and particular. Those studying human voice reflect (as I am inclined to suggest) univer­ the music indigenous to Indonesia emphasized that the sals of physical systems capable of generating suitably pelog and slendro musical scales of Java and Bali not complex sound signals in general. only differ completely from the Western diatonic major The second condition arises from abstract group­ and minor scales but also differ, in specific tuning ofthe theoretic considerations. These have been identified only gamelan, from one village to the next. To understand more recently and are still not widely appreciated. They such music, they said, one must immerse oneself in the concern neither frequency ratios (except the fundamen­ music and culture of a particular village. But, as a cog­ tal octave relation) nor anything about overtones. In­ nitive psychologist, I wanted to know what might be stead, they place constraints on the merely ordinal rela­ common to all human cultures. Hansen (who was then tions among pitch intervalsnecessary to achieve important undertaking field work in Indonesia), with the help of cognitive-musical objectives (Balzano, 1980; Shepard, her husband, Mervyn Dennehy (who was fluent in the 1982a). Offundamental importance is the establishment three dialects ofBali), made the rare find ofa remote Ba­ of a fixed and readily remembered pitch framework. linese village whose residents had never been exposed to This is the reason for the unequal spacing of the (seven Western diatonic music. Yet the ratings these villagers or five) selected tones. (In the example ofthe Western C­ gave to (simplified) probe-tone tests using gamelan con­ Major scale, the seven correspond to the white keys on the texts (which Kessler and I had devised back at Stanford) piano and the five to the black keys ofthe complementary provided evidence that a subset ofthese villagers, just as pentatonic scale.) If, instead, the selected tones were a musically inclined subset of Western listeners, was equally spaced, each would have exactly the same pitch sensitive to similar tonal hierarchies (Kessler, Hansen, relations to all other tones. Such a spacing fails to provide & Shepard, 1984). the fixed, anchoring reference frame required for the es­ Along with other cognitively oriented music re­ tablishment ofthe tonic and hierarchy oftonal functions searchers (including Burns, 1999, Carterette & Kendall, with respect to which there can be melodic and harmonic 1999, and Dowling, 1978), we have noted that the musi­ tension, motion, and resolution. The selection ofthe un­ cal scales of all cultures, despite their differences, have equally spaced subset (of five or seven tones) from an in common certain abstract features. This holds for the equally spaced larger set (of 12 chromatic tones in West­ pelog and slendro scales of Bali (Kessler et aI., 1984, ern music) additionally provides for motion, at a higher pp, 139-140) and for the different scales ofNorth India level, through modulations between keys (Balzano, 1980; 16 SHEPARD

Krumhansl, 1990; Krumhansl & Kessler, 1982; Shepard, temporal scale. Moreover, the same group-theoretic ar­ 1982a; Shepard & Jordan, 1984). guments apply to rhythmic structures as to tonal struc­ These two sets of conditions-namely, the physical tures. It may therefore be, as argued by Jeffrey Pressing conditions that determine relations of sensory conso­ (1983), that there is a cognitive isomorphism between, nance among tones and the group-theoretic conditions for example, the rhythmic structures of West African that determine which abstract structures furnish a cogni­ music and the diatonic system ofpitch in Western Euro­ tive framework supporting the full tonal hierarchy and pean music. the sense of location and motion of tones and keys in My provisional conclusions from these investigations space-share only the assignment ofcentral importance into music cognition are summarized in the following to the octave. Remarkably, in view of their otherwise points: (1) Very general principles of perception and complete independence, they both specify as optimal the cognition apply in the domain ofmusic just as they do in same division of the octave into the 12 chromatic tones other domains. Between musical objects (tones, chords, and, also, the same type ofselection from these tones of melodies, keys, rhythms, styles, etc.), I propose that gen­ the subset defining a given key. eralization probability falls off exponentially and dis­ For all divisions of the octave into any number of crimination time falls off reciprocally with distance in equally spaced tones, analysis ofphysical properties such the appropriate representational space. On the other as the extent to which consonant frequency ratios are hand, time to achieve a full mental connection between achieved yields good approximations for 12, 19,31,41, those objects (as parts ofthe same melodic stream, chord 53, ... tones per octave. But the group-theoretic analy­ cadence, etc.) increases linearly with the least-time sis yields results satisfying the specified cognitive crite­ transformational path between them in the appropriate ria only for 12,20,30,42,56, ... tones per octave (see space. (2) The perception and cognition specifically of Balzano, 1980). Notice that the only one of these divi­ music is, however, subject to large individual differences sions that satisfies both the more sensory, acoustic con­ among humans in those components of the representa­ ditions and the more cognitive, group-theoretic condi­ tional space corresponding to higher order cognitivestruc­ tions is the division into 12 tones per octave (which tures, such as the circle of fifths, tonal hierarchies, and corresponds, in the group-theoretic analysis, to the structural relations among keys or modes. (3) Despite cyclic group C l 2 = C3 X C4). The selection of no more differences among individuals within anyone culture, than 7 tones to define a given key may also be the only sensitivity to these higher order structures is manifested choice favored by either analysis, alone, that is compat­ by the more musical individuals within cultures having ible with Miller's (1956) "seven-plus-or-minus-two" lim­ such diverse musics as those of Bali, West Africa, and itation ofhuman memory. Western Europe. (4) Because these higher order struc­ It seems almost miraculous that these two fundamen­ tures are determined by physical and abstract mathemat­ tally different sets of(acoustic and group-theoretic) con­ ical universals, I conjecture that they may be universally ditions converge in specifying the same musical structure. approximated in all sufficiently advanced musical beings. As just one consequence, consider the musical fifth, which is (after the octave) the most important interval in The Necessarily Abstract Character ofMental music theory, the most prevalent in world musics, and Universals the most prominent in the rating profiles shown in Fig­ The investigations I have reviewed here on mental ure 5. In the acoustic analysis, the perfect fifth is unique transformations, on generalization and discrimination, because it achieves (after the octave) the simplest (3:2) and on music cognition indicate how in psychology, just frequency ratio and the least dissonance-causing beating as in physics, universality is attained by framing the for­ of mismatched overtones. But, in the group-theoretic mulations at a sufficiently abstract level. If we conceive analysis, the fifth is unique for an entirely different rea­ of stimuli, objects, or situations as points in an abstract son. Beyond the semitone, which (on iteration) generates representational space, then two fundamental types of the chromatic scale (laid out on the x-axis ofFigure 5A), geometrical properties specifiable for those points are the fifth is the only ascending interval that generates (on distances between the points and connected subsets of iteration) each of these 12 tones before repeating. In the points. These are just the two properties from which doing so, it produces the very different circle of fifths I derived the empirically established forms of two can­ (laid out on the x-axis ofFigure 5B). didate universal laws: All of the preceding has focused on the cognitive structures underlying just one of the two most funda­ Law ofgeneralization: The probability that an en­ mental components of music-namely, pitch. There is, countered stimulus has the potential for the conse­ however, reason to believe that similar cognitive struc­ quence associated with a previous stimulus, despite tures underlie the other most fundamental component: their perceived difference, decreases exponentially time. The periodicities and, hence, frequency ratios of with the distance between them in their representa­ the (horizontal) rhythmic patterns in time are analogous tional space. to the periodicities and, hence, frequency ratios of the Law ofdiscriminative reaction time: The time re­ (vertical) patterns in pitch, but on a hugely expanded quired to determine that two stimuli are not identical, KEYNOTE ADDRESS 17

despite their perceived similarity, decreases (toward Ofcourse, all ofthe four principal laws just listed re­ its asymptote) as a reciprocal function ofthe distance quire other supporting abstract principles and structures. between them in their representational space. To avoid technical details, I mention only the most rele­ vant here. These laws ofgeneralization and ofdiscriminative reac­ In the theory ofgeneralization, I propose that the local tion time extend beyond pairs of stimuli. In the case of form ofthe metric ofthe abstract representational space categorization learning and performance, generalization has been determined by the degree of correlation (over and discrimination time both fall off according to the evolutionary history) ofthe extensions of consequential same sorts of (exponential and reciprocal) functions of regions along the dimensions ofthat space. This, I pro­ distance of a stimulus from the category boundary in the pose, is what distinguishes what have been termed integral representational space (Shepard, 198Ia). (Concerning and separable relations between dimensions (Garner, dependence ofreaction time on distance from category 1974; Shepard, 1991; see also Shepard, 1964a). In the boundary, see, e.g., De Rosa & Morin, 1970; Levy, Gold­ way that I indicated earlier, the shapes of the equal­ berg, & Schmid, 1980. Concerning generalization in cat­ generalization contours produced by the Bayesian inte­ egorization, see Nosofsky, 1987, 1989, 1992; Shepard & gration implicate a spatial metric that locally is approx­ Chang, 1963; Shepard & Kannappan, 1991; Tenenbaum imately Euclidean if these extensions have been highly & Griffiths, 200 Ia, 200 Ib.) correlated. But the shapes of the equal-generalization Another equally fundamental type of property speci­ contours produced by this integration implicate a metric fiable for any two points in a representational space is that deviates from the Euclidean (in the manner of a the most direct path connecting those points in that Minkowski r-metric with r < 2) to the extent that these space. This isjust the property from which I have derived extensions have been uncorrelated (Myung & Shepard, the empirically confirmed form ofmy candidates for the 1996; Shepard, 1987, 1991). (The same theory provides following universal laws ofmental transformations: an explanation for a striking difference in classification Law ofspatial transformation: The time required to learning that depends on whether the dimensions ofthe establish that two stimuli, despite their apparent dis­ stimuli are integral or separable-as shown in the exper­ similarity, correspond to the same object or to ob­ iments of Garner, 1974; Nosofsky, 1986; Shepard & jects of the same shape increases linearly with the Chang, 1963; Shepard, Hovland, & Jenkins, 1961.) shortest path along a geodesic connecting the stim­ In the theory of spatial transformations, the global ulus points in the representational space ofthe ob­ structure ofthe representational space is determined by ject's possible positions. the object's symmetries and degrees ofapproximation to Law ofmusical transformation: The time required such symmetries. Specifically, the space ofdistinguish- . able orientations ofan asymmetric object is a curved six­ to assimilate two musical tones to the same melody or to pitch-shift one's tonal schema following a dimensional manifold [which is isomorphic to the semi­ direct product of the translation group, R3, and the modulation between musical keys increases linearly with the shortest connecting path in the appropriate rotation group, SO(3); Carlton & Shepard, 1990a]. But, to the extent that the object approximates some symme­ representational space. try, this space becomes more convoluted as determined The law of spatial transformation is intended to cover by the symmetry group ofthe object [S(O)] and the de­ the cases of both apparent motion and imagined transfor­ gree ofapproximation to that symmetry (as set forth in mations. Indeed, both this law and that ofmusical trans­ Carlton & Shepard, 1990b, and, in simplified form, in formation can be regarded as instances ofa single, more. Shepard, 1981c, 1994). abstract law of mental transformation. Such a more ab­ . In the case oftransformations in music cognition, the stract law might serve as a quantitative specification for appropriate abstract representational spaces arise as group­ a general principle of "normalization of irrelevant di­ theoretic products ofcomponents generated by iteration mensions in stimulus comparison" proposed by Dixon offundamental musical intervals (Balzano, 1980). Thus, and Just (1978). the space ofmusical keys has a circular component (the In some ofthese cases, such as that ofmusical trans­ already mentioned circle of fifths) such that among ei­ formation, the law oftransformation admittedly has, as ther major or minor keys, the most closely related keys yet, less extensive empirical support. The case of per­ are neighbors around that circle. Correspondingly, the ceiving the relation between tones is, however, quite par­ most common transitions from one such key to another allel to that for visual apparent motion (Jones, 1976; result from the smallest angles of rotation in the space Shepard, 1981c, pp. 318-321; 1984,p. 427; VanNoorden, (Shepard, 1982a). Depending on the circumstances of 1975, p. 48). Thus, coherence ofmelodies requires that the test, other circular components also emerge. Multi­ most melodic intervals be small and, also, that additional dimensional scaling applied to probe-tone ratings yielded time be allowed for the larger intervals (see Figure 8 and toroidal configurations in which one circular component ensuing discussion of"melodic motion" in Huron, 200 I; is the circle offifths (laid out linearly along the x-axis in concerning shift ofthe tonal schema in the space ofmu­ Figure 4B). When the context of a major diatonic scale sical keys, see, especially, Krumhansl & Kessler, 1982). was presented to listeners of diverse musical abilities 18 SHEPARD

(Krumhansl & Shepard, 1979), the second circular com­ (e.g., McCloskey, 1983; McCloskey, Caramazza, & ponent that emerged (Shepard, 1982a) was the chroma Green, 1980). Concerning any result that might be taken circle (first realized in concrete, physical form by Shep­ as evidence against a theory-including a theory about ard, 1964b, and laid out linearly along the x-axis in Fig­ internalization, levels ofabstraction, and universality­ ure 5A). When richer contexts ofboth major and minor one should ask whether the experimental conditions are chord cadences were presented to musical listeners those that are appropriate according to the theory itself. (Krumhansl & Kessler, 1982), yet another circular com­ Three points are particularly relevant here. ponent emerged that brought major thirds together and The first point is that the abstract paper-and-pencil achieved proximity between any key ofone mode (major tasks typically used in the experiments just cited bear lit­ or minor) and its relative and parallel keys ofthe oppo­ tle resemblance to the concrete, perceptually experi­ site mode (minor or major, respectively). In this repre­ enced, real-life challenges that confronted our Pleis­ sentation, probable transitions between keys of both tocene ancestors. There would have been little selective modes (major and minor) in Western diatonic music cor­ pressure favoring the cognitive skills needed for the swift respond to nearest neighbor transitions between points and accurate solution of such abstract paper-and-pencil falling on a toroidal surface in a four-dimensional em­ problems. Indeed, tests of the same general kinds of bedding space (Krumhansl & Kessler, 1982). knowledge by means ofperceptually more concrete, re­ alistic, or meaningful tasks have usually led to better per­ The Sense in Which Cognitive Principles and formance (see, e.g., Cosmides & Tooby, 1997; Kaiser, Structures May Be Universal Proffitt, & Anderson, 1985; Kaiser, Proffitt, Whelan, & Now, when I put forward as "universal" a psychological Hecht, 1992). (Ofcourse, some humans are now able to law or structure, I am not claiming that the behavior ofany solve such abstract problems by paper-and-pencil calcu­ given organism-letalone every livingorganism-behaves lation, but they do so by means ofexplicit mathematical in conformity with that law or an associated abstract spa­ principles of reasoning slowly developed over the last tial or group-theoretic structure. Although I expect the few millennia and transmitted culturally rather than by exponential law of generalization to be approximated means of individually available intuition.) Similarly, quite widely among animals, I regard it as unlikely that people's behavior relative to objects in space-even when an earthworm appreciates music or is capable ofmental they are blindfolded-reveals more precise knowledge rotation, let alone mental rotation yielding a decision ofspatial relations than they are able to express verbally time that increases linearly with angular difference. (see, e.g., Milner & Goodale, 1998; Philbeck, Loomis, & Rather, I am proposing that as organisms evolve more Beall, 1997). and more powerful capabilities for representing the ex­ The second point is closely related to the first. The ternal world, the principles governing their cognitive basic perceptual skills that actually were important to our representations and, hence, their behavior will increas­ ancestors throughout the Pleistocene evidently do operate ingly approximate what would be optimum in that world. in close approximation to optimum principles such as At any given evolutionary stage, including that cur­ those ofBayesian inference (see, e.g., Geisler & Kersten, rently attained by humankind, the approximation is very 2002; Knill & Richards, 1996; Weiss, Simoncelli, & Adel­ likely imperfect and perhaps unevenly distributed in the son, 2002) and color constancy (see, e.g., Maloney & population. This is particularly likely to be true for ca­ Wandell, 1986; Shepard, 1990b, 1994, pp. 16-22). In con­ pabilities that are not crucial for an individual's survival trast with our effortful and blundering struggles with the and reproduction. Music cognition may be a case in abstract paper-and-pencil tasks, we have much less con­ point. In many other domains, cognitive process may scious awareness ofthese perceptual processes of more amount to a collection ofheuristics (Gigerenzer, Todd, & ancient provenance. Through , the latter the ABC Research Group, 1999; Newell, Shaw, & Simon, processes have become so deeply internalized and auto­ 1958) or a utilitarian "bag of tricks" (Ramachandran, matic that conscious, deliberative intervention would only 1990) that is not guaranteed to yield the optimum result interfere with their swift, accurate, and reliable operation. in all situations but, for typically occurring situations, The third point is apt to be overlooked until one ac­ may "satisfice" (Simon, I956, I990). If any selective knowledges a terrestrially unprecedented evolutionary de­ pressure for still better representation continues, how­ velopment that has occurred with the emergence of a ca­ ever, I know ofno impenetrable barrier preventing such pacity for abstract thought and for language in the human collections of heuristics from evolving still closer ap­ line. This is the development ofan ability to set aside im­ proximations to theoretically optimum principles for still mediate self-interest and, dispassionately and hypotheti­ wider ranges of situations. Surely, we already have ad­ cally, (l) to imagine or simulate different alternatives vanced quite a way beyond the earthworm. within a potentially large but well-defined domain or space But what about the widely cited evidence that even we and (2) to evaluate those alternatives with respect to some humans fall far short of optimum performance on vari­ explicitly specified criterion. I have referred to this devel­ ous cognitive tasks? Examples include simple logical de­ opment as the step to rationality (Shepard, 2001, p. 741). duction (see, e.g., Wason & Johnson-Laird, 1972; Wood­ Admittedly, this capacity may not yet be fully or worth & Sells, 1935), probabilistic inference (e.g., Tversky widely developed even in the human line, and in our ef­ & Kahneman, 1974, 1981, 1983), and intuitive physics forts to exercise it, we often depend on the support ofex- KEYNOTE ADDRESS 19

ternal aids-yes, including paper and pencil and, more (1886) judged it to be with "high probability ... the uni­ recently, computers. Still, it is this new, rational capacity versal law governing all processes in nature," and Planck that has enabled a precious few human individuals to (whose quantum ofaction is the cornerstone ofquantum make their transformative contributions to science as mechanics) wrote that "Einstein's theory ofrelativity ... well as to other disciplines-including musical compo­ has shown that it occupies the highest position among sition and moral philosophy. (In the latter connection, I physical laws" (Planck, 1925/1993, pp. 69, 80). With have argued that a recognition ofthe nature ofthis ratio­ Feynman's culminating path-integral extension of it to nal capacity also makes possible a reconciliation offree quantum mechanics (Feynman & Hibbs, 1965), the prin­ will and , which have for so long seemed in­ ciple of least action was seen to be, itself, a symmetry compatible; Shepard, 2001, p. 747.) Finally, and ofmore principle. It has been described as the "democracy of his­ immediate relevance here, only by developing explicit tories" in that all possible histories ofmotion are treated, and correct formulations of symbolic logic, probability symmetrically, as "equal under the law." (In the curved theory, and theoretical physics are we now able to deter­ four-dimensional space-time manifold of general rela­ mine the correct answers to the paper-and-pencil tests on tivity, incidentally, least action entails the geodesic paths which we otherwise fail so miserably. Only thus are we of motion that suggested to me the idea that mental provided with a benchmark against which to evaluate transformations might traverse least-time geodesics in how closely our innate perceptual or cognitive functions the curved six-dimensional space of possible positions approach optimality. ofan object.) I may be foolhardy to challenge the prevailing em­ How Cognitive Principles and Structures May piricist world view with the claim that a law ofphysics Underlie Our Capacity for Scientific Discovery might be discoverable by thought alone, or to justify this In my Paul Fitts Lectures at the University ofMichi­ claim with the suggestion that what is generally regarded gan in 1993 and, more extensively, in my as an empirical fact might be a mathematical necessity. Lectures at Harvard in the fall of 1994, I explored the But, just as I hope that psychological laws are not arbi­ possibility that the cognitive principles ofgeneralization trary, I hope that physical laws are not arbitrary. Other­ and mental transformation considered here may have wise, in arguing ,that psychological laws have emerged made possible some ofthe most far-reaching theoretical partly as accommodations to physical universals (and advancesin science. A book largelybased on these lectures not just to mathematical universals), I would have suc­ is still in preparation, alas. But I have recently described ceeded merely in pushing arbitrariness in psychology how the capacity for mental transformations, which I be­ back to arbitrariness in physics. There are, however, two lieve to underlie our deep intuitions about symmetry, alternatives to the possibility that the ultimate laws of'na­ may help to explain the power ofthought experiments in ture are merely arbitrary or capricious. the discovery ofsuch physical laws as Archimedes's law The first alternative is that those laws constitute the ofthe lever, Galileo's mass-independence law for falling only mathematically self-consistent set. Such an alterna­ bodies, and Newton's action-and-equal-reaction law of tive has been contemplated by several theoretical physi­ motion (Shepard, 2001, pp. 742-743). I showed how cists, including the late particle physicist and cosmolo­ these laws can be established without carrying out any gist David Schramm (see Lightman & Brawer, 1990, experiment physically, by using the single principle that p. 450). Similarly, Einstein wrote: "In a certain sense, the physics ofa situation is unchanged by any permuta­ pure thought can grasp reality, as the ancients dreamed" tion ofequivalent objects. (Einstein, 193411954, p. 274) and is reported to have For modern theoretical physicists, ofcourse, symme­ wondered aloud whether "God had any choice." Much tries are fundamental and pervasive. The mathematician earlier, Leibniz (who closely followed Maupertuis as one Emmy Noether proved that every continuous symmetry ofthe two earliest articulators ofthe least-action princi­ implies a conservation law and vice versa. And, as re­ pie) expressed the related idea that the universe is unique marked by the Nobel Laureate codeveloper ofthe stan­ in realizing the richest set ofactualities by the simplest dard theory ofpartic1e physics, (1992, possible means. (As a visualizable analogy to Leibniz's p. 2 I2), "it is principles of symmetry that dictate the idea, consider that one simple recursive formula generates dramatis personae . . . on the quantum stage." In my re­ the inexhaustible fractal complexity ofthe Mandelbrodt sponse to the commentators in the 2001 special issue of set, beautifully illustrated in Peitgen & Richter, 1986.) Behavioral and Brain Sciences, I also indicated how the The second alternative is that the universe in which we symmetry of invariance under permutation can be seen reside is but one of infinitely many universes and that to underlie the "golden rule" and the essential idea be­ within the whole infinite set, everything mathematically hind the highly influential theory ofjustice of John possible somewhere becomes a physical reality, as con­ Rawls (1971). Going well beyond the realm of science, jecturedby the physicist Max Tegmark (1998). If universes then, symmetry may also be crucial to the formulation of endlessly spawn other universes, perhaps in one of the a rational grounding for universal principles of ethics ways envisioned by Andrei Linde (1994), Lee Smolin (Shepard, 2001, pp. 744-748). (1997), or Tegmark, symmetry may be broken differently In physics, the principle ofleast action is perhaps the in the birth ofeach new universe, yielding different laws of most fundamental symmetry principle ofall. Helmholtz physics (perhaps corresponding to such symmetry groups 20 SHEPARD

as SU(5), SU(4) X U(1), or SU(3) X SU(2) X D(1), as on circumstances) are circular, helical, or toroidal in em­ Linde suggested). Only in a tiny fraction of these uni­ bedding spaces of (respectively) two, three, or four di­ verses would the resulting laws ofphysics be compatible mensions. Here, application ofmultidimensional scaling with the emergence oflife and, hence, ofmind. The laws to similarity measures based on human listeners' data in these rare life-engendering universes would therefore have yielded just the predicted circular, helical, and not be arbitrary. In accordance with the anthropic cosmo­ toroidal configurations (Krumhansl, 1990; Krumhansl logical principle, then, even under a cosmology in which & Kessler, 1982; Shepard, 1982a, pp. 310, 323; Ueda & all possibilities somewhere become actualities, mental Ohgushi, 1987). laws-which can reflect, among physical laws, only those Second, although the experiments yielding data that that are life engendering-will also not be arbitrary. fit these highly abstract mathematical theories are ob­ tained under highly controlled laboratory conditions Prospects for Psychological Science having little resemblance to the complex situations we I hasten to emphasize that my preceding discussion deal with in everyday life, I claim that this is just the way concerned discovery ofcertain laws ofphysics. I am not in which fundamental science advances. The physicist advocating that physicists-let alone psychologists­ confirms general and precise quantitative laws, such as abandon the laboratory for an armchair. Despite our halt­ the law offalling bodies, not by observing the descent of ing steps toward rationality, we still have only partial ac­ leaves on a gusty autumn day but by timing the descent cess to only partially internalized wisdom about universals ofa heavy ball in an evacuated cylinder. Moreover, as the of the world. On looking back, I find that most of my theory strives for greater and greater simplicity and gen­ own successful experiments (such as those on mental ro­ erality, it necessarily becomes, as noted by Einstein tation, apparent motion, and generalization) were ones (1934/1954, p. 282) "steadily more abstract and remote for which I had a pretty clear idea of what the results from experience." Thus, in Einstein's eloquent words would be before I even began collecting the data. But I (quoted in Holton, 1973, p. 377), the scientist, like the was surely guided toward thinking ofthese experiments, artist, erects a "simplified and lucid image ofthe world," and of how they should be set up, by my preceding ex­ lifting into it "the center ofgravity ofhis emotional life." periences in the psychology laboratory. This admission is not, however, inconsistent with my REFERENCES claim that if formulated at the appropriate level of ab­ ANDERSON, 1. A. (1990). The adaptive character ofthought. Hillsdale, straction, psychological laws may partake ofthe univer­ NJ: Erlbaum. sality ofthe laws ofmathematics and physics. Granted, ANDERSON, 1. A. (1991). The adaptive nature of human categorization. the non-Euclidean, curved, and high-dimensional spaces , 98, 409-429. and group-theoretic structures posited here are foreign ATTNEAVE, F. (1950). Dimensions ofsimilarity. American Journal of both to what we directly experience and to any folk psy­ Psychology, 63, 516-556. BALZANO, G. J. (1980). The group-theoretic description oftwelvefold chology. One might well wonder whether such abstract and microtonal pitch systems. Computer Music Journal, 4, 66-84. spaces and structures and their associated principles BEHAVIORAL AND BRAIN SCIENCES (2001). [Special issue on the work have any real existence or relevance for human life, ex­ of ], 24(4). perience, or behavior. In closing, I offer two justifica­ BOWER, G. H., & TRABASSO, T.(1963). Reversals prior to solution in con­ tions for believing that they do. cept identification. Journal ofExperimental Psychology, 66, 409-418. BOWER, G. H., & TRABASSO, T.(1964). Concept identification. In R. C. First, these abstractions, like the equally nonintuitive Atkinson (Ed.), Studies in mathematical psychology (pp, 32-96). abstractions oftheoretical physics, have yielded empiri­ Stanford: Stanford University Press. cally confirmed predictions. As one example, the theory BUCKLEY, P.B., & GILLMAN, C. B. (1974). Comparison of digits and dot ofmental transformations specifies that the representa­ patterns. Journal ofExperimental Psychology, 103, 1131-1136. BURNS, E. M. (1999). Intervals, scales, and tuning. In D. Deutsch (Ed.), tional subspace corresponding to polygons that have a The psychology ofmusic (pp. 215-264). New York:Academic Press. fixed, intermediate degree ofcentral symmetry but that CARLTON, E., & SHEPARD, R. N. (1990a). Psychologically simple mo­ vary through 360 0 in orientation is a curve that both lies tions as geodesic paths: 1. Asymmetric objects. Journal ofMathe­ on a torus and forms the edge ofa Mobius strip in a four­ matical Psychology, 34, 127-188. dimensional embedding space. Yet, when reaction time CARLTON, E., & SHEPARD, R. N. (1990b). Psychologically simple mo­ tions as geodesic paths: 11. Symmetric objects. Journal ofMathe­ data from human subjects presented with these polygons matical Psychology, 34, 189-228. were submitted to multidimensional scaling (of the CARROLL, J. D., & CHANG, 1.-1.(1970). Analysis of individual differ­ INDSCAL type developed by my former Bell Labs co­ ences in multidimensional scaling via an N-way generalization of workers Carroll and Chang, 1970), the points represent­ "Eckart-Young" decomposition. Psychometrika, 35, 283-319. CARTERETTE, E. C., & KENDALL, R. A. (1999). Comparative music per­ ing these polygons fell on exactly the predicted curve in ception and cognition. In D. Deutsch (Ed.), The psychology ofmusic the four-dimensional solution (Shepard & Farrell, 1985). (pp. 725-791). New York:Academic Press. Likewise, in the case ofmusic cognition, the acoustic, COOPER, L. A. (1976). Demonstration ofa mental analog ofan external geometric, and group-theoretic analyses mentioned here, rotation. Perception & Psychophysics, 19, 296-302. as well as analyses by music theorists (e.g., Werts, 1983), COSMIDES, L., & TOOBY, 1.(1997). Dissecting the computational archi­ tecture ofsocial inference mechanisms. In G. R. Bock & G. Cardew dictate that musical tones, at one level, and musical keys, (Eds.), Characterizing human psychological adaptations (pp. 132­ at another, have spatial representations that (depending 156). Ciba FoundationSymposium 208. New York:Wiley. KEYNOTE ADDRESS 21

CUNNINGHAM, J. P., & SHEPARD, R. N. (1974). Monotone mapping of HULL, C. L. (1952). A behavior system. New Haven, CT: Yale Univer­ similarities into a general metric space. Journal ofMathematical Psy­ sity Press. chology, 11, 335-363. HURON, D. (2001). Tone and voice: A derivation ofthe rules of voice­ CURTIS, D. W., PAULOS, M. A., & RULE,S. J. (1973). Relation between leading from perceptual principles. , 19, 1-64. disjunctive reaction time and stimulus difference. Journal ofExper­ HUTCHINSON, W., & KNOPOFF, L. (1978). The acoustic component of imental Psychology, 99,167-173. Western consonance. Interface, 7, 1-29. DEHAENE, S., Dusoux, E., & MEHLER, J. (1990). Is numerical com­ JONES, M. R. (1976). Time, our lost dimension: Toward a new theory of parison digital? Analogical and symbolic effects in two-digit number perception, attention, and memory. Psychological Review, 83, 323­ comparison. Journal ofExperimental Psychology: Human Percep­ 355. tion & Performance, 16,626-641. KAISER, M. K., PROFFITT, D. R., & ANDERSON, K. (1985). Judgments of DE ROSA, D., & MORIN,R. (1970). Recognition reaction time for digits natural and anomalous trajectories in the presence and absence of in consecutive and non-consecutive memorized sets. Journal ofEx­ motion. Journal ofExperimental Psychology: Learning, Memory, & perimental Psychology, 83, 472-479. Cognition, 11, 795-803. DIXON, P., & JUST, M. A. (1978). Normalization of irrelevant dimen­ KAISER, M. K., PROFFITT, D. R., WHELAN, S., & HECHT, H. (1992). In­ sions in stimulus comparison. Journal ofExperimental Psychology: fluence of animation on dynamical judgments. Journal ofExperi­ Human Perception & Performance, 4, 36-46. mental Psychology: Human Perception & Performance, 18,669-690. DoWLING, W. J. (1978). Scale and contour: Two components ofa the­ KAMEOKA, A., & KURIYAGAWA, M. (1969). Consonance theory Part II: ory of memory for melodies. Psychological Review, 85, 341-354. Consonance ofcomplex tones and its calculation method. Journal of EINSTEIN, A. (1923a). The ofrelativity. Princeton, NJ: Prince­ the Acoustical Society ofAmerica, 45, 1460-1469. ton University Press. KESSLER, E. J., HANSEN, c.,& SHEPARD, R. N. (1984). Tonal schemata EINSTEIN, A. (I923b). Sidelights ofrelativity. New York: Dutton. in the perception ofmusic in Bali and in the West. Music Perception, EINSTEIN, A. (1954). Ideas and opinions (Sonja Bargmann, Trans.). 2, 131-165. (Originally published as Mein Weltbild, 1934) KNILL, D. c., & RICHARDS, W. (EDS.) (1996). Perception as Bayesian EKMAN, G. (1954). Dimensions ofcolor vision. Journal ofPsychology, inference. Cambridge: Cambridge University Press. 38,467-474. KRECHEVSKY, 1. (1932a). "Hypotheses" in rats. Psychological Review, EsTES, W. K. (1950). Toward a statistical theory of learning. Psycho­ 39,516-532. logical Review, 57, 94-107. KRECHEVSKY, 1. (1932b). "Hypotheses" versus "chance" in the pre­ EsTES,W. K. (2002). Traps in the route to models ofmemory and deci­ solution period in sensory discrimination-learning. University ofCal­ sion. Psychonomic Bulletin & Review, 9, 3-25. ifornia Publications in Psychology, 6, 27-44. FARRELL, J. E., & SHEPARD, R. N. (1981). Shape, orientation, and ap­ KRUMHANSL, C. L. (1985). Perceiving tonal structure in music. Ameri­ parent rotational motion. Journal ofExperimental Psychology: can Scientist, 73, 371-378. Human Perception & Performance, 7, 477-486. KRUMHANSL, C. L. (1990). Cognitive foundutions ofmusicalpitch. New FEYNMAN, R. P., & HIBBS, A. R. (1965). Quantum mechanics and path York: Oxford University Press. integrals. New York: McGraw-Hill. KRUMHANSL, C. L., & KESSLER, E. J. (1982). Tracing the dynamic FOSTER, D. H. (1975). Visual apparent motion and some preferred paths changes in perceived tonal organization in a spatial representation of in the rotation group SO(3). Biological Cybernetics, 18,81-89. musical keys. Psychological Review, 89, 334-368. GARNER, W. R. (1974). The processing ofinformation and structure. KRUMHANSL, C. L., & SHEPARD, R. N. (1979). Quantification ofthe hi­ Potomac, MD: Erlbaum. erarchy of tonal functions within a diatonic context. Journal ofEx­ GEISLER, W. S., & KERSTEN, D. (2002). Illusion, perception, and Bayes. perimental Psychology: Human Perception & Performance, 5, 579­ Nature Neuroscience, 5, 508-510. 594. GIGERENZER, G., TODD, P. M., & THE ABC RESEARCH GROUP(1999). KRUSKAL, J. B. (1964a). Multidimensional scaling by optimizing good­ Simple heuristics that make us smart. New York: Oxford University ness offit to a nonmetric hypothesis. Psychometrika, 29, 1-27. Press. KRUSKAL, J. B. (l964b). Nonmetric multidimensional scaling: A nu­ GILLISPIE, C. C. (1958). A physicist looks at Greek science. American merical method. Psychometrika, 29, 115-129. Scientist, 46, 62-74. LAKATOS, S., & SHEPARD, R. N. (I997a). Constraints common to apparent GUTTMAN, N., & KALISH, H. 1. (1956). Discriminability and stimulus motion in visual, tactile, and auditory space. Journal ofExperimental generalization. Journal ofExperimental Psychology, 51, 79-88. Psychology: Human Perception & Performance, 23, 1050-1060. HADAMARD, J. (1945). The psychologyofinvention in the mathematical LAKATOS, S., & SHEPARD, R. N. (1997b). Time-distance relations in field. Princeton, NJ: Princeton University Press. shifting attention between locations on one's body. Perception & HELMHOLTZ, H. VON (1886). Ober die physikalische Bedeutung des Psychophysics, 59, 557-566. Princips der kleinsten Wirkung. Journal fiir reine und angewandte LEVINE, M. (1966). Hypothesis behavior by humans during discrimi­ Mathematik, 100, 137-166. nation learning. Journal ofExperimental Psychology, 71, 331-338. HELMHOLTZ, H. VON (1954). On the sensations oftone as a physiolog­ LEVINE, M., MILLER,P., & STEINMEYER, C. H. (1967). The none-to-all ical basis for the theory ofmusic (A. 1. Ellis, Trans.). New York: theorem ofhuman discrimination learning. Journal ofExperimental Dover. (Original work published 1862) Psychology, 73, 568-573. HENMON, V. A. C. (1906). The time ofperception as a measure ofdif­ LEVY, R. M., GOLDBERG, D. M., & SCHMID, J. C. (1980). Different pre­ ferences in sensations. New York: Science Press. dictors of memory scanning with unidimensional and digit stimuli. HINRICHS, J. V., YURKO, D. S., & Hu, J.-M. (1981). Two-digit number Bulletin ofthe Psychonomic Society, 15,331-334. comparison: Use ofplace information. Journal ofExperimental Psy­ LEYTON, M. (1992). Symmetry, causality, mind. Cambridge, MA: MIT chology: Human Perception & Performance, 7, 890-901. Press/Bradford Books. HOLTON, G. (1973). Thematic origins ofscientific thought: Kepler to LIGHTMAN, A., & BRAWER, R. (1990). Origins: The lives and worlds of Einstein. Cambridge, MA: Press. modern cosmologists. Cambridge, MA: Harvard University Press. HOVLAND, C. 1.(1937). The generalization ofconditioned responses: 1. LINDE,A. (1994). The self-reproducing inflationary universe. Scientific The sensory generalization of conditioned responses with varying American, 271, 48-55. frequencies of tone. Journal ofGeneral Psychology, 17,125-148. LUCE,R. D. (1959a).Individual choice behavior: A theoretical analysis. HOVLAND, C. 1. (1952). A "communication analysis" ofconcept learn­ New York: Wiley. ing. Psychological Review, 59, 347-350. LUCE,R. D. (1959b). On the possible psychophysical laws. Psycholog­ HUGHES, M. (1977). A quantitative analysis. In M. Yeston (Ed.), Read­ ical Review, 66, 81-95. ings in Schenker analysis and other approaches. New Haven, CT: MALONEY, L. T., & WANDELL, B. A. (1986). Color constancy: A method Press. for recovering surface spectral reflectance. Journal ofthe Optical So­ HULL, C. L. (1943). Principles ofbehavior. New York: Appleton-Century­ ciety ofAmerica A, 3,29-33. Crofts. McBEATH, M. K., & SHEPARD, R. N. (1989). Apparent motion between 22 SHEPARD

shapes differing in location and orientation: A window technique for SHEPARD, R. N. (1957). Stimulus and response generalization: A sto­ estimating path curvature. Perception & Psychophysics, 46, 333-337. chastic model relating generalization to distance in psychological MCCLOSKEY, M. (1983). Intuitive physics. Scientific American, 248, space. Psychometrika, 22, 325-345. 122-130. SHEPARD, R. N. (I 958a). Stimulus and response generalization: De­ MCCLOSKEY, M., CARAMAZZA, A., & GREEN, B. (1980). Curvilinear duction of the generalization gradient from a trace model. Psycho­ motion in the absence ofexternal forces: Naive beliefs about the mo­ logical Review, 65, 242-256. tion ofobjects. Science, 210, 1139-1141. SHEPARD, R. N. (1958b). Stimulus and response generalization: Tests of MCGUIRE, W. 1. (1954). A multi-process model for paired-associates a model relating generalization to distance in psychological space. learning. Unpublished doctoral dissertation, Yale University. Journal ofExperimental Psychology, 55, 509-523. MILLER, G. A. (1956). The magic number seven, plus or minus two. SHEPARD, R. N. (1962a). The analysis ofproximities: Multidimensional Psychological Review, 63, 81-97. scaling with an unknown distance function. I. Psychometrika, 27, MILLER, G. A., & NICELY, P. E. (1955). An analysis ofperceptual con­ 125-140. fusions among some English consonants. Journal ofthe Acoustical SHEPARD, R. N. (l962b). The analysis ofproximities: Multidimensional Society ofAmerica, 27, 338-352. scaling with an unknown distance function. II. Psychometrika, 27, MILNER, A. D., & GOODALE, M. A. (1998). The visual brain in action. 219-246. Oxford: Oxford University Press. SHEPARD, R. N. (1963). Comments on Professor Underwood's paper MOYER, R. S., & LANDAUER, T. K. (1967). Time required for judgments "Stimulus selection in verbal learning." In C. N. Cofer & B. S. Mus­ ofnumerical inequality. Nature, 215, 1519-1520. grave (Eds.), Verbalbehavior andlearning: Problems and processes MYUNG, I. J., & SHEPARD, R. N. (1996). Maximum entropy inference (pp. 48-70). New York: McGraw-HilI. and stimulus generalization (theoretical note). Journal ofMathemat­ SHEPARD, R. N. (I 964a). Attention and the metric structure ofthe stim­ ical Psychology, 40,342-347. ulus space. Journal ofMathematical Psychology, 1, 54-87. NEWELL, A., SHAW, J. C., & SIMON, H. A. (1958). Elements ofa theory SHEPARD, R. N. (1964b). Circularity in judgments of . ofhuman problem solving. Psychological Review, 65,151-166. Journal ofthe Acoustical Society ofAmerica, 36, 2346-2353. NOSOFSKY, R. M. (1986). Attention, similarity, and the identification­ SHEPARD, R. N. (1965). Approximation to uniform gradients ofgeneral­ categorization relationship. Journal ofExperimental Psychology: ization by monotone transformations of scale. In D. I. Mostofsky General, 115, 39-57. (Ed.), Stimulus generalization (pp. 94-110). Stanford: Stanford Uni­ NOSOFSKY, R. M. (1987). Attention and learning processes in the iden­ versity Press. tification and categorization of integral stimuli. Journal ofExperi­ SHEPARD, R. N. (I 966a). Learning and recall as organization and search. mental Psychology: Learning, Memory, & Cognition, 13, 87-108. Journal ofVerbal Learning & VerbalBehavior.S, 201-204. NOSOFSKY, R. M. (1989). Further tests of an exemplar-similarity ap­ SHEPARD, R. N. (I966b). Metric structures in ordinal data. Journal of proach to relating identification and categorization. Journal ofEx­ Mathematical Psychology, 3, 287-315. perimental Psychology: Perception & Psychophysics, 45, 279-290. SHEPARD, R. N. (1972). Psychological representation ofspeech sounds. NOSOFSKY, R. M. (1992). Similarity scaling and cognitive process models. In E. E. David & P. B. Denes (Eds.), Human communication: A uni­ Annual Review ofPsychology, 43, 25-53. fied view (pp. 67-113). New York: McGraw-HilI. PALMER, S. E. (1982). Symmetry, transformation, and the structure of SHEPARD, R. N. (1975). Form, formation, and transformation of internal perceptual systems. In 1. Beck (Ed.), Organization and representa­ representations. In R. L. Solso (Ed.), Information processing and tion in perceptual systems (pp. 95-144). Hillsdale, NJ: Erlbaum. cognition: The Loyola Symposium (pp. 87-122). Hillsdale, NJ: Erl­ PALMER, S. E. (1983). The psychology of perceptual organization: A baum. transformational approach. In 1. Beck, B. Hope, & A. Rosenfeld SHEPARD, R. N. (I978a). Externalization of mental images and the act (Eds.), Human and machine vision (pp. 269-339). New York: Acad­ ofcreation. InB. S. Randhawa & W.E. Coffman (Eds.), Visuallearn­ emic Press. ing, thinking, and communication (pp. 133-189). New York: Acade­ PARKMAN, 1.M. (1971). Temporal aspects ofdigit and letter inequality mic Press. judgments. Journal ofExperimental Psychology, 91,191-205. SHEPARD, R. N. (1978b). On the status of"direct" psychophysical mea­ PElTGEN, H.-O., & RICHTER, P. H. (1986). The beauty offractals: Im­ surement. In C. W.Savage (Ed.), Minnesota studies in the philosophy ages ofcomplex dynamical systems. Heidelberg: Springer-Verlag. ofscience (Vol. 9, pp. 441-490). Minneapolis: University of Min­ PHILBECK, J. w., LooMIS, J. M., & BEALL, A. C. (1997). Visually per­ nesota Press. ceived location is an invariant in the control ofaction. Perception & SHEPARD, R. N. (1980). Multidimensional scaling, tree-fitting, and Psychophysics, 59, 601-612. clustering. Science, 210, 390-398. PLANCK, M. (1993). A survey ofphysical theory. New York: Dover. SHEPARD, R. N. (I981a, August). Discrimination and classification: A [Original work published 1925 as A survry ofphysics (R. Jones & search forpsychological laws. Presidential address to Division 3 pre­ D. H. Williams, Trans.)] sented at the annual meeting ofthe American Psychological Associ­ PRESSING, J. (1983). Cognitive isomorphisms in world music: West Africa, ation, Los Angeles. the Balkans, Thailand and Western tonality. Studies in Music, 17, 38-61. SHEPARD, R. N. (198Ib). Psychological relations and psychophysical RAMACHANDRAN, V. S. (1990). Interactions between motion, depth, scales: On the status of"direct" psychophysical . Jour­ color and form: The utilitarian theory ofperception. In C. Blakemore nal ofMathematical Psychology, 24, 21-57. (Ed.), Vision: Coding and efficiency (pp. 346-360). Cambridge: SHEPARD, R. N. (1981c). Psychophysical complementarity. In M. Kubovy Cambridge University Press. & J. Pomerantz (Eds.), Perceptual organization (pp. 279-341). Hills­ RAWLS, 1. (1971). A theory ofjustice. Cambridge, MA: Harvard Uni­ dale, NJ: Erlbaum. versity Press. SHEPARD, R. N. (1982a). Geometrical approximations to the structure RESTLE, F. (1962). The selection ofstrategies in cue learning. Psycho­ ofmusical pitch. Psychological Review, 89, 305-333. logical Review, 69, 329-343. SHEPARD, R. N. (1982b). Perceptual and analogical bases ofcognition. RUSSELL, S. J. (1988). Analogy by similarity. In D. Helman (Ed.), Ana­ In 1. Mehler, M. Garrett, & E. Walker (Eds.), Perspectives on mental logical reasoning (pp. 251-269). Boston: Reidel. representation (pp. 46-67). Hillsdale, NJ: Erlbaum. SAMBURSKY, S. (1956). The physical world ofthe Greeks. (M. Dagut, SHEPARD, R. N. (1984). Ecological constraints on internal representation: Trans.). New York: Macmillan. Resonant kinematics ofperceiving, imagining, thinking, and dream­ SCHWARTZ, D. A., HOWE, C. Q., & PURVES, D. (2003). The statistical ing. Psychological Review, 91, 417-447. structure ofhuman speech sounds predicts musical universals. Jour­ SHEPARD, R. N. (1986). Discrimination and generalization in identifica­ nal ofNeuroscience, 23, 7160-7168. tion and classification: Comment on Nosofsky. Journal ofExperimen­ SHEPARD, R. N. (1955). Stimulus and response generalization during tal Psychology: General, 115, 58-61. paired-associates learning. Unpublished doctoral dissertation, Yale SHEPARD, R. N. (1987). Toward a universal law of generalization for University. psychological science. Science, 237,1317-1323. KEYNOTE ADDRESS 23

SHEPARD, R. N. (I 988a). George Miller's data and the development of SHEPARD, R. N., & METZLER, J. (1971). Mental rotation of three­ methods for representing cognitive structures. In W. Hirst (Ed.), The dimensional objects. Science, 171, 701-703. making ofcognitive science: Essays in honor ofGeorge A. Miller SHEPARD, R. N., & SHEENAN, M. M. (1965). Immediate recall ofnum­ (pp. 45-70). Cambridge: Cambridge University Press. bers containing a familiar prefix or postfix. Perceptual & Motor SHEPARD, R. N. (1988b). The role oftransformations in spatial cogni­ Skills, 21, 263-273. tion. In 1. Stiles-Davis, M. Krichevsky, & U. Bellugi (Eds.), Spatial SHEPARD, R. N., & ZARE,S. (1983). Path-guided apparent motion. Sci­ cognition: Brain bases and development (pp. 81-110). Hillsdale, NJ: ence,220,632-634. Erlbaum. SIMON, H. A (1956). Rational choice and the structure ofenvironments. SHEPARD, R. N. (l990a). Mind sights. New York: Freeman. Psychological Review, 63, 129-138. SHEPARD, R. N. ([ 990b). A possible evolutionary basis for trichromacy. SIMON, H. A. (1990). Invariants ofhuman behavior. Annual Review of In Proceedings ofthe SPIEISPSE symposium on electronic imaging: Psychology, 41, 1-19. Science and technology (SPIE: Vol. 1250. Perceiving, measuring, and SKINNER, B. F. (1938). The behavior oforganisms: An experimental using color, pp. 301-309). Bellingham, WA: SPIE Press. analysis. New York: Appleton-Century. SHEPARD, R. N. (1991). Integrality versus separability of stimulus di­ SMOLIN, L. (1997). The life ofthe cosmos. London: Weidenfeld & No­ mensions: From an early convergence ofevidence to a proposed the­ colson, oretical basis. In G. R. Lockhead & 1. R. Pomerantz (Eds.), Percep­ SPENCE, K. W (1937). The differential response in animals to stimuli tion ofstructure: Essays in honor ofWendell R. Garner (pp. 53-71). varying within a single dimension. Psychological Review, 44, 430-444. Washington, DC: American Psychological Association. TAYLOR, A E. (1984). A life in mathematics remembered. American SHEPARD, R. N. (1992). The advent and continuing influence ofmath­ Mathematical Monthly, 91, 605-618. ematicallearning theory: Commentary on Estes and Burke (1955). 'TEGMARK, M. (1998). Is "the theory ofeverything" merely the ultimate Journal ofExperimental Psychology: General, 121,419-421. ensemble theory? Annals ofPhysics, 270, 1-51. SHEPARD, R. N. (1994). Perceptual-cognitive universals as reflections of 'TENENBAUM, 1.B., & GRIFFITHS, T. L. (2001a). Generalization, similar­ the world. Psychonomic Bulletin & Review, 1,2-28. ity, and Bayesian inference. Behavioral & Brain Sciences, 24, 629-640. SHEPARD, R. N. (1998). Carl Iver Hovland, June 12, I912-April 16, 'TENENBAUM, J. B., & GRIFFITHS, T. L. (2001b). Some specifics about 1961. In Biographical memoirs (Vol. 73, pp. 230-260). Washington, generalization. Behavioral & Brain Sciences, 24, 762-778. DC: National Academy Press. TVERSKY, A., & KAHNEMAN, D. (1974). Judgment under uncertainty: SHEPARD, R. N. (2000). Carl Iver Hovland: Statesman of psychology, Heuristics and biases. Science, 185, 1124-1131. sterling human being. In G. A. Kimble & M. Wertheimer (Eds.), Por­ TVERSKY, A, & KAHNEMAN, D. (1981). The framing of decisions and traits ofpioneers in psychology (Vol. IV,pp. 284-301). Mahwah, NJ: the psychology ofchoice. Science, 211, 453-458. Erlbaum. TVERSKY, A, & KAHNEMAN, D. (1983). Extensional versus intuitive SHEPARD, R. N. (200 I). On the possibility ofuniversal mental laws: A reasoning: The conjunction fallacy in probabilityjudgment. Psycho­ reply to my critics. Behavioral & Brain Sciences, 24, 712-748. logical Review, 90, 293-315. SHEPARD, R. N. (2003). A funny thing happened on the way to the for­ VEDA, K., & OHGUSHI, K. (1987). Perceptual components ofpitch: Spa­ mulation: How I came to frame mental laws in abstract spaces. In tial representation using a multidimensional scaling technique. Jour­ R.1. Sternberg (Ed.), Psychologists defying the crowd (pp. 215-237). nal ofthe Acoustical Society ofAmerica, 82, 1193-1200. Washington, DC: American Psychological Association. VAN NOORDEN, L. P. A S. (1975). Temporal coherence in the percep­ SHEPARD, R. N., & ARABIE, P. (1979). Additive clustering: Representa­ tion oftone sequences. Unpublished doctoral dissertation. Eind­ tion of similarities as combinations of discrete overlapping proper­ hoven, The Netherlands: Institute for Perception Research. ties. Psychological Review, 86, 87-123. WASON, P. C., & JOHNSON-LAIRD, P. N. (1972). Psychology ofreason­ SHEPARD, R. N., & CARROLL, J. D. (1966). Parametric representation of ing: Structure andcontent. London: Batsford. nonlinear data structures. In P. R. Krishnaiah (Ed.), Multivariate WEINBERG, S. (1992). Dreams ofa final theory. New York: Pantheon. analysis (pp. 561-592). New York: Academic Press. WEISS, Y., SIMONCELLl, E. P., & ADELSON, E. H. (2002). Motion illu­ SHEPARD, R. N., & CHANG,J.-J. (1963). Stimulus generalization in the sions as optimal percepts. Nature Neuroscience.S, 598-604. learning ofclassifications. Journal ofExperimental Psychology, 65, WERTHEIMER, M. (1945). Productive thinking. New York: Harper. 94-102. WERTS, D. (1983). A theory ofscale references. Unpublished doctoral SHEPARD, R. N., & CHIPMAN, S. (1970). Second-order isomorphism ofin­ dissertation, Princeton University. ternal representations: Shapes ofstates. , 1, 1-17. WOODWORTH, R. S., & SELLS, S. B. (1935). An atmosphere effect in SHEPARD, R. N., & COOPER, L. A. (1982). Mental images and their formal syllogistic reasoning. Journal ofExperimental Psychology, transformations. Cambridge, MA: MIT Press, Bradford Books. 18,451-460.. SHEPARD, R. N. & COOPER, L. A. (1992). Representation ofcolors in the blind, color blind, and normally sighted. Psychological Science, 3, NOTES 97-104. SHEPARD, R. N., & FARRELL, 1. E. (1985). Representation ofthe orien­ 1. This historical section is a shortened version ofthe corresponding tations ofshapes. Acta Psychologica, 59,104-121. section ofmy chapter "A Funny Thing Happened on the Way to the For­ SHEPARD, R. N., HOVLAND, C. I., & JENKINS, H. M. (1961). Learning mulation: How I Came to Frame Mental Laws in Abstract Spaces" and memorization of classifications. Psychological Monographs, (Shepard, 2003-in Psychologists Defying the Crowd, edited by Robert 75(13, Whole No. 517). 1. Sternberg), copyright 2003 by the American Psychological Associa­ SHEPARD, R. N., & JORDAN, D. (1984). Auditory illusions demonstrat­ tion. Adapted with permission. ing that tones are assimilated to an internalized musical scale. Sci­ 2. For more about such inner phenomena, including my own, see ence, 226, 1333-1334. Shepard (1978a, pp. 167-183; I990a, pp. 13-40). SHEPARD, R. N., & JUDD, S. A. (1976). Perceptual illusion ofrotation of 3. For further reminiscences about Hovland, see Shepard (1998, 2000). three-dimensional objects. Science, 191,952-954. 4. For more on my view of Estes's trailblazing work, see Shepard SHEPARD, R. N., & KANNAPPAN, S. (1991). Connectionist implementa­ (1992). tion ofa theory ofgeneralization. In R. P.Lippmann, 1. E. Moody, & 5. Cooper's 1976 article is reproduced, in a slightly edited form, in D. S. Touretzky (Eds.), Advances in neural information processing Shepard and Cooper (1982, chap. 7, pp. 159-170). systems 3 (pp. 665-671). San Mateo, CA: Morgan Kaufinann. SHEPARD, R. N., KILPATRIC, D. W, & CUNNINGHAM, J. P. (1975). The (Manuscript received April 4, 2003; internal representation ofnumbers. Cognitive Psychology, 7, 82-138. revision accepted for publication June 25, 2003.)