The Relationship Between Online Visual Representation of a Scene and Long-Term Scene Memory
Total Page:16
File Type:pdf, Size:1020Kb
Journal of Experimental Psychology: Copyright 2005 by the American Psychological Association Learning, Memory, and Cognition 0278-7393/05/$12.00 DOI: 10.1037/0278-7393.31.3.396 2005, Vol. 31, No. 3, 396–411 The Relationship Between Online Visual Representation of a Scene and Long-Term Scene Memory Andrew Hollingworth The University of Iowa In 3 experiments the author investigated the relationship between the online visual representation of natural scenes and long-term visual memory. In a change detection task, a target object either changed or remained the same from an initial image of a natural scene to a test image. Two types of changes were possible: rotation in depth, or replacement by another object from the same basic-level category. Change detection during online scene viewing was compared with change detection after delay of 1 trial (Experiments 2A and 2B) until the end of the study session (Experiment 1) or 24 hr (Experiment 3). There was little or no decline in change detection performance from online viewing to a delay of 1 trial or delay until the end of the session, and change detection remained well above chance after 24 hr. These results demonstrate that long-term memory for visual detail in a scene is robust. Human beings spend their lives within complex environments— These picture memory studies show that some form of repre- offices, parks, living rooms—that typically contain scores of indi- sentation is retained robustly from many individual pictures after a vidual objects. All of the visual detail within a scene cannot be single trial of study, and this information is sufficient to discrim- perceived in a single glance, as high acuity vision is limited to a inate studied pictures from pictures that were not studied. The relatively small, foveal region of the visual field (Riggs, 1965), so distractors used in these experiments, however, were typically the eyes and attention are oriented serially to local scene regions to chosen to be highly different from studied images, making it obtain high resolution, foveal information from individual objects difficult to identify the type of information supporting such feats of (see Henderson & Hollingworth, 1998, for a review). Visual scene memory. Participants might have remembered studied pictures by perception is therefore extended over time and space as the eyes maintaining visual representations (coding visual properties such and attention are oriented from object to object. To construct a as shape, color, orientation, texture, and so on), by maintaining visual representation of a scene as a whole, visual information conceptual representations of picture identity (i.e., scene gist, such from previously fixated and attended objects must be retained in as bedroom), or by maintaining verbal descriptions of picture memory and integrated with information from subsequently at- content. It is possible that these latter types of nonvisual represen- tended objects. tation, and not visual memory, accounted for observations of Once constructed during viewing, how robustly are visual rep- high-capacity, robust picture memory (Chun, 2003; Simons, 1996; resentations of scenes retained in memory? Early evidence from Simons & Levin, 1997). the picture memory literature suggested that visual memory for Long-term memory (LTM) for the specific visual properties of scenes exhibits exceedingly large capacity and is highly robust objects in scenes was observed in a set of experiments examining (Nickerson, 1965, 1968; Shepard, 1967; Standing, 1973; Standing, the influence of scene contextual structure on object memory Conezio, & Haber, 1970). Shepard (1967) presented participants (Friedman, 1979; Mandler & Johnson, 1976; Mandler & Parker, with 612 photographs of varying subject matter for 6 s each. 1976; Mandler & Ritchey, 1977; Parker, 1978). Participants reli- Immediately after the study session, mean performance on a two- ably remembered token-specific details of individual objects alternative recognition test was essentially perfect (98% correct). within line drawings of natural scenes. However, the scenes were Even more impressive capacity was observed by Standing (1973), very simple, typically consisting of a handful of discrete objects; who presented each participant with 10,000 different photographs scene items were repeated many times; and verbal encoding was for 5 s each. On a two-alterative recognition test, mean percentage not controlled. Thus, memory for visual object detail could easily correct was 83%, suggesting the retention of approximately 6,600 have been based on verbal descriptions generated for a small pictures. In addition to such demonstrations of capacity, robust number of objects during the many repetitions of each scene. retention of pictorial stimuli has been observed over long dura- Subsequent work has questioned whether LTM reliably retains tions, with above chance picture recognition after a delay of almost the visual details of complex stimuli. In a frequently cited study, 1 year (Nickerson, 1968). Nickerson and Adams (1979) tested memory for the visual form of a common object, the penny. Participants were shown a line drawing of the “head” side of a correct penny among 14 alterna- This research was supported by National Institute of Health Grant R03 tives that differed from the correct penny in the surface details. MH65456. Correspondence concerning this article should be addressed to Andrew Forty-two percent of participants chose the correct alternative, Hollingworth, Department of Psychology, The University of Iowa, 11 leading Nickerson and Adams to conclude that memory for pen- Seashore Hall E, Iowa City, IA 52242-1407. E-mail: andrew-hollingworth@ nies is largely schematic, retaining little visual detail. This con- uiowa.edu clusion must be qualified in two respects, however. First, memory 396 VISUAL MEMORY AND SCENE REPRESENTATION 397 encoding was not controlled; the authors assumed that pennies ment. Theories claiming that memory for visual detail is either were familiar from everyday experience. Although pennies might nonexistent (O’Regan, 1992; O’Regan & No¨e, 2001) or limited to be handled quite often, most people rarely attend to the surface the currently attended object (Rensink, 2000, 2002) predict that details of a penny. Second, although fewer than half the partici- such changes should be undetectable. Yet, participants detected pants (42%) chose the correct penny on the recognition test, the these subtle changes to previously attended objects at a rate sig- correct alternative was chosen at a rate well above chance of 6.7%, nificantly above the false-alarm rate, demonstrating that visual and participants were more than twice as likely to choose the representations do not necessarily disintegrate upon the with- correct penny than to choose any particular distractor. Further, the drawal of attention (see also Hollingworth, Williams, & Hender- distractors were highly similar to the correct alternative, often son, 2001). Subsequent work has demonstrated that verbal encod- differing in only a single surface feature, such as the position of a ing could not account for accurate memory for the visual details of text element. Thus, one could just as validly interpret the Nicker- previously attended objects, as change detection performance re- son and Adams results as demonstrating relatively accurate mem- mained accurate in the presence of a verbal working memory load ory for the perceptual details of a common object. and articulatory suppression (Hollingworth, 2003, 2004) More recently, the phenomenon of change blindness has been In addition to testing object memory during online scene view- used to argue that scene representations in memory are primarily ing, Hollingworth and Henderson (2002) examined longer-term schematic and gist-based, retaining little, if any, specific visual retention of visual object representations. For scenes in which the information (Becker & Pashler, 2002; Irwin & Andrews, 1996; target was not changed during initial viewing, a delayed, two- O’Regan, 1992; O’Regan & No¨e, 2001; O’Regan, Rensink, & alternative forced-choice test was administered after all scenes had Clark, 1999; Rensink, 2000; Rensink, O’Regan, & Clark, 1997; been viewed, introducing a delay of approximately 15–20 min, on Simons, 1996; Simons & Levin, 1997; Wolfe, 1999). In change average, between scene viewing and test. Participants were sur- blindness studies, a change is introduced into a scene during some prisingly accurate on this delayed test, performing above 80% form of visual disruption, such as a saccadic eye movement (e.g., correct both when discriminating targets from rotated distractors Grimes, 1996; Henderson & Hollingworth, 1999, 2003b) or brief and when discriminating targets from different token distractors. interstimulus interval (ISI; e.g., Rensink et al., 1997). Poor change These data provide preliminary evidence that memory for the detection performance in these paradigms has led researchers to visual details of individual objects in scenes is robust not only propose that a memory representation of the visual detail in a scene during the online viewing of a scene but also over time scales is either nonexistent (O’Regan, 1992; O’Regan & No¨e, 2001), directly implicating LTM, as tested in the picture memory limited to the currently attended object (Rensink, 2000, 2002), or literature. limited to the small number of objects (three or four) that can be To account for these and concomitant results, Hollingworth and maintained in visual