
List length and word frequency effects in the Sternberg paradigm

A Thesis

Presented in Partial Fulfillment of the Requirements for the Degree Master of Arts in the

Graduate School of The Ohio State University

By

Allison M. Chapman, B.A.

Graduate Program in Psychology

The Ohio State University

2012

Master’s Examination Committee:

Simon J. Dennis, Adviser

Mark A. Pitt

Per B. Sederberg

© Copyright by

Allison M. Chapman

2012

ABSTRACT

There is building evidence in long-term recognition paradigms documenting a null list length effect (LLE) in which performance is not improved for items studied in shorter lists compared with items from longer lists. The global-matching mechanism implemented in many recognition memory models predicts that the LLE is a direct consequence of item noise. In these models, item noise is also assumed to drive the word-frequency effect (WFE), a recognition advantage for words that occur with low frequency in English compared to words that occur with high frequency. The current experiments modified a short-term recognition memory task (the Sternberg paradigm) to include an extended and/or filled delay between study and test lists, and manipulated word frequency.

Results demonstrated a null LLE when the design included both a longer study-test lag and a distracter task. Frequency effects emerged only when there was an unfilled delay and a distracter task separating the study-test cycles. These results suggest that item interference is not implicated in short-term recognition memory for words, and contextual reinstatement must be sufficiently noisy to demonstrate the low frequency word advantage. BCDMEM is able to succinctly capture the range of findings that seem to provide evidence against distinct short-term- and long-term-memory systems.

Dedicated to my friends and family for their continued encouragement and support, but especially to my mother, Robin Gofberg, who first introduced me to the concept of list length effects in one of many impromptu intelligence tests.

ACKNOWLEDGMENTS

Thanks to my adviser, Simon Dennis, for his mentorship. Also thanks to Troy Smith for helpful discussion.

VITA

August 19, 1985 ...... Born - Columbus, USA

2007 ...... B.A. Neuroscience, Kenyon College.

2010 ...... Graduate Teaching Associate, The Ohio State University.

PUBLICATIONS

Chapman, A. & Dennis, S. (2011). Item noise in the Sternberg paradigm. In S. Ohlsson & R. Catrambone (Eds.), Proceedings of the XXXIII Annual Conference of the Cognitive Science Society, (pp. 2359-2364). Mahwah, NJ: Erlbaum.

Dennis, S. & Chapman, A. (2010). The Inverse List Length Effect: A Challenge for Pure Exemplar Models of Recognition Memory. Journal of Memory and Language, 63, 416-424.

FIELDS OF STUDY

Major Field: Cognitive Psychology

Studies in : Prof. Simon J. Dennis

TABLE OF CONTENTS

Abstract

Dedication

Acknowledgments

Vita

List of Figures

Chapters:

1. Introduction

2. Literature Review

2.1 Memory Models and Empirical Findings

2.2 The Sternberg Paradigm

3. Experiment 1

3.1 Method

3.1.1 Participants

3.1.2 Stimuli

3.1.3 Design

3.1.4 Procedure

3.2 Results

3.3 Discussion

4. Experiment 2

4.1 Method

4.1.1 Participants

4.1.2 Stimuli

4.1.3 Design

4.1.4 Procedure

4.2 Results

4.3 Discussion

5. General Discussion

5.1 The BCDMEM Model

5.2 The EBRW Model

6. Conclusions

References

LIST OF FIGURES

Figure

3.1 Schematic of the experimental paradigm for Experiment 1. The top row demonstrates the 2-second unfilled delay condition and the bottom row demonstrates the 2-second filled delay condition.

3.2 The averaged median latencies to correctly respond yes to targets (solid lines) and no to distracters (dashed lines) in Experiment 1 as a function of list length. The square symbol marks data corresponding to the suppression condition where articulation was suppressed during the 2-second delay between study and test. The triangle symbols mark data corresponding to the condition where there was no distracter task during the 2-second study-test delay. Error bars represent the standard error.

3.3 Results of the serial position analysis in the condition without articulatory suppression. The graph displays averaged median latencies for combined positions to correctly respond yes to targets in Experiment 1 as a function of position for each list length in the unfilled 2-second delay condition. Error bars represent the standard error.

3.4 Results of the serial position analysis in the articulatory suppression condition. The graph displays averaged median latencies for combined positions to correctly respond yes to targets in Experiment 1 as a function of position for each list length in the 2-second articulatory suppression delay. Error bars represent the standard error.

3.5 The word-frequency effect for the condition without articulatory suppression. Results depict the averaged median latencies to correctly respond yes to targets in Experiment 1 as a function of word frequency across list length in the unfilled 2-second delay condition. Reaction time to low-frequency target words is depicted with a solid line, while high-frequency word RT is depicted with a dashed line. Error bars represent the standard error.

3.6 The word-frequency effect in the articulatory suppression condition. Results depict the averaged median latencies to correctly respond yes to targets in Experiment 1 as a function of word frequency across list length in the 2-second articulatory suppression delay condition. Reaction time to low-frequency target words is depicted with a solid line, while high-frequency word RT is depicted with a dashed line. Error bars represent the standard error.

4.1 Schematic of the experimental paradigm for Experiment 2. The top row demonstrates the 2-second unfilled delay condition (with the filler task between cycles) and the bottom row demonstrates the 15-second filled delay condition.

4.2 The averaged median latencies to correctly respond yes to targets (solid lines) and no to distracters (dashed lines) in Experiment 2 as a function of list length. The square symbol marks data corresponding to the rehearsal condition where there was no distracter task during the 2-second study-test delay. The triangle symbols mark data corresponding to the condition where there was a 15-second distracter task during the study-test delay. Error bars represent the standard error.

4.3 Results of the serial position analysis in the condition where the 15-second distracter task occurred between study-test cycles. The graph displays averaged median latencies for combined positions to correctly respond yes to targets in Experiment 2 as a function of position for each list length in the unfilled 2-second delay condition. Error bars represent the standard error.

4.4 Results of the serial position analysis in the condition where the 15-second distracter task occurred within study-test cycles. The graph displays averaged median latencies for combined positions to correctly respond yes to targets in Experiment 2 as a function of position for each list length in the 15-second delay condition. Error bars represent the standard error.

4.5 The word-frequency effect for the filler between cycles condition. The graph displays the averaged median latencies to correctly respond yes to targets in Experiment 2 as a function of word frequency across list length in the unfilled 2-second delay condition. Reaction time to low frequency target words is depicted with a solid line, while high frequency word RT is depicted with a dashed line. Error bars represent the standard error.

4.6 The word-frequency effect for the filler within cycles condition. The graph displays the averaged median latencies to correctly respond yes to targets in Experiment 2 as a function of word frequency across list length in the 15-second delay condition. Reaction time to low frequency target words is depicted with a solid line, while high frequency word RT is depicted with a dashed line. Error bars represent the standard error.

CHAPTER 1

INTRODUCTION

Historically, memory researchers have drawn a distinction between short-term and long-term memory systems. This distinction dates to William James (1890), who first argued that there are qualitative differences between immediate (primary) and permanent (secondary) memory. Whether memory should be fractionated is a topic of continued debate, with some researchers favoring separate systems (e.g., Atkinson & Shiffrin, 1968; Baddeley & Hitch, 1974) and others arguing that a unified set of processes oversees phenomena observed across both short- and long-term spans (e.g., Brown, Neath, & Chater, 2007).

Not only are there distinctions between short- and long-term memory phenomena, but memory is also characterized as being either episodic (referring to events) or semantic (referring to general knowledge) (Tulving, 1972). One well-established paradigm for testing episodic memory is the yes/no recognition task, in which a series of items is presented, followed by one or more test items that either appeared previously (positive probes / targets) or were not presented (negative probes / distracters). The overall quality of long-term recognition is often measured with discriminability (d′), which is obtained by subtracting the standardized (z-transformed) false alarm rate (responding yes to distracters) from the standardized hit rate (responding yes to targets). In short-term recognition memory, it is more informative to examine trends in reaction time (RT) because discriminability is typically at ceiling. In this case, response time reflects the accessibility of the memory trace and/or the accuracy of the retrieval mechanism.
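As a minimal illustration of the discriminability measure described above, the following Python sketch computes d′ from a hit rate and a false alarm rate using the inverse of the standard normal distribution function. The function name and the example rates are hypothetical, and rates of exactly 0 or 1 would require a correction that is not shown here.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Return d' = z(hit rate) - z(false alarm rate)."""
    z = NormalDist().inv_cdf  # quantile function of the standard normal
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical rates, chosen only for illustration.
print(round(d_prime(0.80, 0.20), 2))
```

Equal hit and false alarm rates give a d′ of zero, which corresponds to no discrimination between targets and distracters.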

Researchers in episodic memory work with mechanistic models that specify encoding, storage, and retrieval processes in order to account for and predict a number of fundamental empirical findings. One such finding, the list-length effect (LLE) in recognition memory, is that performance decreases as a function of the number of items studied. The LLE was first observed by Strong (1912), and is crucial to the interpretation of interference in retrieval (e.g., Underwood, 1978; Bowles & Glanzer, 1983; Murnane & Shiffrin, 1991; Murdock & Kahana, 1993; Gronlund & Elam, 1994).

Capturing the LLE served as a constraint for a series of models developed in the 1980s called global matching models (GMMs; Clark & Gronlund, 1996; Humphreys, Bain, Pike, & Tehan, 1989). The retrieval mechanism of GMMs relied on a match between the test item and each memory representation of the studied items, rendering a familiarity index to quantify the strength of the match. The decision process generally followed a signal-detection framework (and, more recently, Bayesian methods). The concept of a global match lends itself to the LLE because as the study list becomes longer, each new item introduces additional variance to the composite match (Clark & Gronlund, 1996). Therefore, performance on a longer list is worse than performance on shorter lists. Models of recognition memory that incorporate only item noise are unable to accommodate the absence of a list length effect. Though null length effects have been widely demonstrated in long-term memory (e.g., Murnane & Shiffrin, 1991; Dennis & Humphreys, 2001; Jang & Huber, 2008; Dennis, Lee & Kinnell, 2008), the traditional finding in short-term memory is that RT increases linearly with set size (Sternberg, 1966, 1969).
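The item-noise argument can be sketched with a small simulation. The code below is not an implementation of any particular published model: it assumes perfectly stored items with independent Gaussian features, a dot-product match, and hypothetical parameter values. It only illustrates that when familiarity is summed over every studied trace, each additional trace adds variance and discriminability falls with list length.

```python
import random
from statistics import mean, stdev


def global_match(probe, traces):
    """Sum the dot-product match between the probe and every stored trace."""
    return sum(sum(p * t for p, t in zip(probe, trace)) for trace in traces)


def simulate_d_prime(list_length, n_features=20, n_trials=400, seed=0):
    """Estimate discriminability of the summed match for one list length."""
    rng = random.Random(seed)

    def new_item():
        return [rng.gauss(0.0, 1.0) for _ in range(n_features)]

    target_scores, distracter_scores = [], []
    for _ in range(n_trials):
        study_list = [new_item() for _ in range(list_length)]
        # Target probe: a studied item. Distracter probe: an unstudied item.
        target_scores.append(global_match(study_list[0], study_list))
        distracter_scores.append(global_match(new_item(), study_list))
    pooled_sd = ((stdev(target_scores) ** 2 + stdev(distracter_scores) ** 2) / 2) ** 0.5
    return (mean(target_scores) - mean(distracter_scores)) / pooled_sd


for n in (2, 4, 8, 16):
    print(n, round(simulate_d_prime(n), 2))
```

Under these assumptions the mean difference between targets and distracters stays constant while the variance of the summed match grows with the number of traces, which is the source of the predicted list length effect.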

The current research aims to examine whether short-term memory (STM) and long-term memory (LTM) reflect two functionally disparate systems by assessing three empirical phenomena: the list length effect, the word-frequency effect, and the recency effect, the latter two of which are described later. The bind cue decide model of episodic memory (BCDMEM; Dennis & Humphreys, 2001) has been effective at capturing both list length and word-frequency effects in long-term recognition memory paradigms. The current experiments have been designed to determine whether BCDMEM can capture these findings by the same basic mechanisms in a short-term recognition task, without requiring a short-term store. If one computational model can capture the range of effects, it would suggest that a STM-LTM distinction is not necessary.

CHAPTER 2

LITERATURE REVIEW

2.1 Memory Models and Empirical Findings

The short-term (STM) and long-term memory (LTM) distinction was formally articulated in two-store models, most notably those by Waugh & Norman (1965) and Atkinson & Shiffrin (1968). Waugh & Norman (1965) utilized the terms established by James (1890), dividing memory into primary and secondary systems. As items are presented, they are held in primary memory until the system reaches capacity and replaces those items with new input. Items are transferred to secondary memory only if they are rehearsed. Theoretically, if rehearsal is prevented, no long-term retention of any presented items will be possible (Waugh & Norman, 1965).

Atkinson & Shiffrin (1968) suggested a similar two-store model with a short-term store (STS) that oversaw , and a permanent, unlimited capacity long-term store (LTS). Similar to Waugh & Norman’s model, information is eventually dumped from the STS, and lost, unless it has been copied to the LTS. Atkinson & Shiffrin also note the importance of rehearsal (both duration and quality) in successfully copying information between the two stores. Rehearsal determines the types of cues that will bind to the item trace when copied (visual, acoustic, verbal, etc.). The main distinction between Atkinson & Shiffrin’s model and that of Waugh & Norman is that the former suggests that regardless of rehearsal, some information will get transferred to LTS, as evidenced by the fact that takes place even under incidental testing conditions (Hebb, 1961). The Atkinson & Shiffrin model was a formal, mathematical model because it specified parameters representing the number of items that could be held in STS (buffer size), the transfer rate from STS to LTS, and how quickly information was replaced in the STS (decay rate), among other factors.

The main point of contention against the two-store model (or modal model; Murdock, 1967) emerged in the late 1970s with evidence of the long-term recency effect (LTR). In free recall paradigms, where participants are asked to retrieve the items from a study list, a serial position trend emerges in which the first items (primacy) and the last few items (recency) are recalled in higher proportions than the middle items (Murdock, 1962). The modal model suggested that primacy reflects the fact that the first items benefit from superior rehearsal, with fewer items in the STS to interfere. The recency effect was assumed to reflect items that are immediately available in the STS at the time of retrieval. This assumption was supported by the finding that a 30-second delay after the presentation of the study list and before retrieval attenuated recency (Glanzer & Cunitz, 1966).

A long-term recency effect, where the last items are still recalled in higher proportions despite an interval of retention between study and test, is problematic for the modal model, which would predict primacy but not recency after a delay. LTR has been demonstrated in numerous continuous distracter experiments (e.g., Bjork & Whitten, 1974; Baddeley & Hitch, 1974). In the continuous distracter paradigm, rather than having the delay function solely as a retention interval (RI) between study and test, the time between each study item presentation is extended and incorporates a distracter task. In these instances, the recency effect cannot emerge from a limited-capacity STS.

LTR could have served as evidence to discount the STS entirely; however, some researchers suggested instead that LTR reflects a separate process from recency effects in immediate testing paradigms (e.g., Raaijmakers, 1993). The search of associative memory model (SAM; Raaijmakers & Shiffrin, 1981; Gillund & Shiffrin, 1984) reflected a trend toward adapting the modal model framework to a dual-store account that formally details memory retrieval mechanisms (Clark & Gronlund, 1996). SAM is just one example of the global-matching models (GMMs) that include Minerva 2 (Hintzman, 1984), the theory of distributed associative memory (TODAM; Murdock, 1982), and the matrix model (Humphreys, Bain, & Pike, 1989; Pike, 1984). Global-matching models suggested that a test probe is compared in parallel to separate (e.g., SAM & Minerva 2) or composite (e.g., TODAM & the matrix model) memory representations and a match of familiarity is summed over successive comparisons (Clark & Gronlund, 1996).

There are two ways to retrieve episodic memory: recognition and recall. The latter requires production of information that is not currently at hand, whereas the former entails mere identification. According to SAM (Raaijmakers & Shiffrin, 1980), the first step in recall (i.e., sampling) involves comparing the strength of the context as a cue for the probe item to the sum of this comparison across all item traces in memory (called images). Conceptually, the recall phase is similar to a sequential search for the target trace. The overall probability of recall is contingent on the relative strength of the particular item compared with the sum across all items. However, the second stage of recall (i.e., recovery) reflects the inverse exponential strength of the cues in the probe set. SAM utilizes a small number of parameters: (a) for the context-to-item retrieval strengths, (b) for the strengths of inter-item associations (associations among items), (c) for the strength of item-to-image / self-associations, and (d) to reflect pre-experimental associations between the probe item and images in memory.
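The two recall stages described above can be sketched as follows. This is a simplified illustration that assumes context is the only cue and that cue weights are all equal to one; the function names and strength values are hypothetical rather than taken from the SAM papers.

```python
import math


def sampling_probabilities(context_strengths):
    """Relative-strength rule: each image's strength over the sum across all images."""
    total = sum(context_strengths)
    return [strength / total for strength in context_strengths]


def recovery_probability(cue_strengths):
    """Chance of recovering a sampled image: 1 - exp(-sum of cue strengths)."""
    return 1.0 - math.exp(-sum(cue_strengths))


# Hypothetical context-to-image strengths for a three-item list.
strengths = [2.0, 1.0, 1.0]
print(sampling_probabilities(strengths))
print(round(recovery_probability([strengths[0]]), 3))
```

Because sampling depends on relative strength, adding more images to the denominator lowers the chance of sampling any one item, whereas recovery depends only on the absolute strength of the cues to the sampled image.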

A variant of the search of associative memory model (SAM; Raaijmakers & Shiffrin, 1981) is able to capture a finding called contiguity, or the lag recency effect (Kahana, 1996), where each recalled word will tend to have been studied in a serial position near the word that was previously recalled. SAM is able to capture this finding because, although in the first cycle of recall the cue is the context, in subsequent cycles the probe may include a previously recalled item. In this case, the recovery is based on an exponential function of the sum of strengths of both the context and item as cues, and will tend to retrieve items that were studied together in the STS. However, just as modal models like SAM have difficulty capturing LTR, SAM cannot predict the contiguity effect in delayed recall (also called the long-term lag recency effect) (Howard & Kahana, 1999).

The temporal context model (TCM; Howard & Kahana, 2002) is a context-driven single-store model developed in an effort to capture the long-term lag recency effect. It does so by first establishing a current state of context, t_i. The study of an item triggers the retrieval of its pre-experimental context, which is then added to t_i; in this sense, the representation of context continually updates. As context drifts, there is also a decrement of the previous state of context, such that the currently studied item has the highest activation. During each presentation, the item studied is associated to other items that are simultaneously active in the context layer. At test, the current state of context is used as a retrieval cue. When an item is successfully recovered, an input context layer is created which is composed of the item's pre-experimental and experimentally learned contextual associations. Though TCM is able to capture LTR and long-term contiguity, it does so by predicting that context is driven by the items themselves, and therefore recency and contiguity would not be affected by unfilled study-test delays (Howard & Kahana, 2002).
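The context-drift idea can be sketched in a few lines. The code below assumes that context evolves as t_i = rho * t_(i-1) + beta * t_in, with rho chosen to keep the context vector at unit length, and that each item retrieves its own orthogonal pre-experimental context. The orthogonality, the value of beta, and the function names are simplifying assumptions made for illustration.

```python
import math


def update_context(context, item_input, beta):
    """One drift step: scale down the old context and add the item's retrieved context."""
    dot = sum(c * x for c, x in zip(context, item_input))
    # rho keeps the updated context vector at unit length.
    rho = math.sqrt(1.0 + beta**2 * (dot**2 - 1.0)) - beta * dot
    return [rho * c + beta * x for c, x in zip(context, item_input)]


def end_of_list_activations(n_items, beta=0.6):
    """Activation of each studied item's input context in the end-of-list context."""
    dim = n_items + 1
    context = [0.0] * dim
    context[0] = 1.0  # hypothetical pre-list context state
    for i in range(1, dim):
        item_input = [0.0] * dim
        item_input[i] = 1.0  # assumed orthogonal pre-experimental context
        context = update_context(context, item_input, beta)
    return context[1:]


print([round(a, 3) for a in end_of_list_activations(5)])
```

Under these assumptions the activations rise monotonically toward the end of the list, so the most recently studied item is the most active in the context used as the retrieval cue.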

In addition to LTR and the long-term lag recency effects, another empirical finding that proved problematic for the GMMs was the null list strength effect (LSE). The null LSE was first demonstrated by Ratcliff, Clark and Shiffrin (1990) in a series of experiments that included pure weak lists with only weak items, pure strong lists with only strong items, and mixed lists that contained both weak and strong items. (Items are strengthened at study by increasing the duration of presentation or repeating the items multiple times.) According to convention, performance on the strong items in the mixed list should be better than performance on the strong items in the pure strong list because in the pure strong list there is increased interference from the strong items (and an increase in the false alarm rate). Conversely, performance on weak items in the pure weak list should be better than performance on weak items in the mixed list.

However, Ratcliff, Clark & Shiffrin (1990) demonstrated that in both cases there were no differences in performance: weak items from pure weak lists were not easier to detect than weak items from mixed lists, and strong items were no better discriminated. The null LSE has been replicated extensively (e.g., Murnane & Shiffrin, 1991; Ratcliff, Sheu, & Gronlund, 1992), prompting new models: the retrieving effectively from memory model (REM; Shiffrin & Steyvers, 1997) and the model of McClelland and Chappell (1998).

In the retrieving effectively from memory model (REM; Shiffrin & Steyvers, 1997), items are represented as vectors of feature values (positive integers) that are stored in memory with a probability u per unit of time. However, the value stored is not always a correct copy; a feature is copied correctly with probability c. Features also have environmental base rates (g) that vary depending on frequency: high frequency words have higher base rates than low frequency words. At test, the probe is compared to each item vector in parallel and the pattern of matches and mismatches is used to compute a likelihood ratio.
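The likelihood computation can be sketched as follows. The expression is written from the standard description of REM, assuming geometrically distributed feature values with parameter g, and should be checked against Shiffrin and Steyvers (1997) before being relied upon. The function names and the parameter values are hypothetical.

```python
def likelihood_ratio(probe, trace, c, g):
    """Evidence that `trace` is a stored copy of `probe` rather than of another item.

    A trace value of 0 means no feature was stored; such positions are ignored.
    """
    ratio = 1.0
    for probe_value, trace_value in zip(probe, trace):
        if trace_value == 0:
            continue
        if trace_value == probe_value:
            # Chance probability of this value under the assumed geometric base rate.
            base_rate = g * (1.0 - g) ** (trace_value - 1)
            ratio *= (c + (1.0 - c) * base_rate) / base_rate
        else:
            ratio *= 1.0 - c
    return ratio


def odds_old(probe, traces, c, g):
    """Average the likelihood ratios over all traces compared in parallel."""
    return sum(likelihood_ratio(probe, t, c, g) for t in traces) / len(traces)


# Hypothetical probe and two stored traces: a partial copy and an unrelated item.
probe = [1, 2, 3, 1]
traces = [[1, 2, 0, 1], [4, 1, 0, 2]]
print(round(odds_old(probe, traces, c=0.7, g=0.4), 3))
```

In this sketch a match on a rare feature value contributes more evidence than a match on a common value, which is how the base rate g lets the model treat high and low frequency words differently.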

REM is able to simultaneously predict a list length effect (LLE) and capture the null LSE by implementing a concept known as differentiation (Shiffrin et al., 1990). Differentiation occurs when strengthening memory traces at encoding allows for the copying of additional features, thereby building a stronger single memory trace as opposed to creating multiple traces for the same item. Highly differentiated traces are less likely to overlap with (and so respond less strongly to) distracter and/or non-target items. Capturing both effects with the same basic mechanisms is impossible under the standard global-matching framework.

The list length effect (LLE), that memory performance decreases as the list length increases, is widely evidenced in both recall (e.g., Murdock, 1962) and recognition (Strong, 1912; Bowles & Glanzer, 1983; Gronlund & Elam, 1994; Murnane & Shiffrin, 1991; Underwood, 1978). Basic GMM mechanisms require a list length effect because more items studied entails more overall noise. REM, for example, predicts a LLE because the presentation of additional items creates additional memory traces, thus increasing the variance of the match (Shiffrin & Steyvers, 1997; Criss & Shiffrin, 2004).

The bind cue decide model of episodic memory (BCDMEM; Dennis & Humphreys, 2001) differs from GMMs that use context to probe memory. Instead, BCDMEM suggests that the probe item first retrieves all associated contexts, which are then compared to a reinstated study list context. As such, BCDMEM is characterized as a context noise model. The study-list context is represented as a set of active or inactive nodes that are encoded with sparsity (s) and learning rate (r) predicting the proportion of active nodes related to the study item. At test, the probe word retrieves a composite vector of associated contexts, containing pre-experimental contexts (based on parameter p, an estimate of context noise) and also elements of the study list context (based on parameter r). Additionally, a noisy version (determined by the parameter d) of the study list context is reinstated.

The retrieved composite vector for related contexts (m) is then matched to the reinstated study context vector (c), and the pattern of that match reflects the degree of similarity between the test word and the study context, indicating the likelihood that a probe item was presented at study.
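The matching step can be sketched with a small simulation. This is a simplified overlap count and not the model's Bayesian decision rule. It assumes that each active study-context node is lost from the reinstated context with probability d, that a studied word's retrieved vector contains each active study node with probability r, and that any node is active in the retrieved vector through pre-experimental contexts with probability p. The function name and all parameter values are hypothetical.

```python
import random


def simulate_match(studied, s=0.1, r=0.5, p=0.1, d=0.2, n_nodes=20000, seed=0):
    """Proportion of reinstated-context nodes (c) also active in the retrieved vector (m)."""
    rng = random.Random(seed)
    study_context = [rng.random() < s for _ in range(n_nodes)]
    # Reinstated context c: each active study node is dropped with probability d.
    reinstated = [on and rng.random() >= d for on in study_context]
    # Retrieved vector m: pre-experimental activity (p), plus learned study nodes (r)
    # when the probe word was actually studied.
    retrieved = [
        rng.random() < p or (studied and on and rng.random() < r)
        for on in study_context
    ]
    active = [m for c, m in zip(reinstated, retrieved) if c]
    return sum(active) / len(active)


for p in (0.1, 0.4):  # low versus high context noise
    advantage = simulate_match(True, p=p) - simulate_match(False, p=p)
    print(p, round(advantage, 3))
```

In this sketch the number of other items on the study list never enters the computation, and a larger p (more pre-experimental contexts) shrinks the overlap advantage of studied over unstudied words.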

Long-term recognition memory experiments have eliminated the list length effect by controlling for key experimental confounds (Dennis & Humphreys, 2001; Dennis & Chapman, 2010; Kinnell & Dennis, 2011). Specifically, null list length effects have been demonstrated by introducing a long delay between study and test that is filled with a distracter task. The delay requires reinstatement of the study list as a unit rather than resorting to the end-of-list context to assess probe familiarity. This confound will be referred to as contextual reinstatement. The delay is also designed to equate the overall time elapsed for a given item between presentation at study and again at test (the retention interval). It is important that the delay not only equate the retention interval for both short and long lists, but also be incorporated after all study lists to encourage full contextual reinstatement. Furthermore, because shorter lists may be rehearsed more completely during any unfilled delay, the delay must include a distracter task (Dennis, Lee & Kinnell, 2008).

BCDMEM is able to simultaneously predict null list length and null list strength effects without a differentiation mechanism because interference is based on how well the reinstated study list context matches the retrieved study item context, and therefore the number of traces created for studied items has no effect on noise in the matching process at retrieval. Although REM could capture the null list length and strength effects by using high levels of proactive interference or low levels of word similarity, doing so would undermine the motivation for differentiation, suggesting that simpler models would suffice (Dennis & Humphreys, 2001).

Context noise can also account for another landmark finding in recognition memory known as the word frequency effect (WFE), in which performance is better for target and distracter words that occur with low frequency in English than for words that occur with high frequency (Glanzer & Bowles, 1976). The WFE is called a mirror effect because it produces a pattern where higher hit rates coincide with lower false alarm rates for low frequency words. To provide a context-noise explanation for frequency effects, Dennis & Humphreys (2001) suggested that pre-experimental contexts will interfere with high frequency words to a greater extent than low frequency words because high frequency words have occurred in more prior contexts and will overlap more by chance with list context. REM accommodates the WFE by stipulating that high frequency words have less distinctive features than low frequency words (due to the base rate, g); thus, more features will match by chance, resulting in greater interference (Shiffrin & Steyvers, 1997).

2.2 The Sternberg Paradigm

If item noise is the driving source of interference, the latency to respond to items in a short-term recognition task should increase with list length. Sternberg (1966) established an ideal paradigm for observing list length effects in short-term recognition memory. In the 1966 procedure, lists comprised one to six digits (set size) presented for 1.2 seconds each. There was a 2-second, unfilled delay between study and the recognition test, in which participants had to decide whether a single test item was a member of the studied series or a distracter (Sternberg, 1966, 1969). Sternberg found that latency for both distracters and targets increased linearly with set size. More specifically, with each additional item presented at study, there was an average increase in reaction time of about 40 ms. When considering methodology in a Sternberg paradigm, it is important to note that:

Several conditions must be met for reliable observation of the Sternberg effect. The items in the original list must be presented slowly enough to ensure perfect acquisition (Sternberg presented his lists at a rate of 1.2 sec per item), an interval of several seconds must elapse between the presentation of the list and the probe, and subjects must be allowed to rehearse the list during this interval without being distracted by any other task. (Reed, 1976, p. 16)

Sternberg concluded that retrieval occurs via an exhaustive serial scan (1966, 1967). That is, the probe item is successively compared to all the items from the study set and, instead of terminating when a positive match is made, the scanning process continues through the entire list. Consequently, the mean reaction time will increase by approximately 40 ms for each item studied for both positive and negative probe items. Exhaustive serial scanning may seem counterintuitive for positive probe trials because subjects must continue scanning even after they have identified the target item; however, it seems less implausible when considering that any short-term search process must be highly efficient to capture trends such as fast correct rejections, for example (Clark & Gronlund, 1996). If the scanning process is automatic, it seems most parsimonious to suggest an exhaustive serial search for both positive and negative probes (Nosofsky et al., 2011).
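The contrast between an exhaustive and a self-terminating serial scan can be made concrete with a short sketch. The 40 ms slope is the approximate value quoted above; the intercept is a hypothetical placeholder for encoding and response time, and the function names are illustrative only.

```python
SLOPE_MS = 40.0       # approximate per-item increase reported in the text
INTERCEPT_MS = 400.0  # hypothetical base time for encoding and responding


def exhaustive_rt(set_size):
    """Exhaustive scan: every item is compared, for targets and distracters alike."""
    return INTERCEPT_MS + SLOPE_MS * set_size


def self_terminating_target_rt(set_size):
    """Self-terminating scan: on average (n + 1) / 2 comparisons before a target is found."""
    return INTERCEPT_MS + SLOPE_MS * (set_size + 1) / 2


for n in range(1, 7):
    print(n, exhaustive_rt(n), self_terminating_target_rt(n))
```

An exhaustive scan predicts equal slopes for positive and negative probes, whereas a serial self-terminating scan predicts a target slope half as steep; the equal slopes in Sternberg's data are what motivated the exhaustive account.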

Exhaustive serial scanning, however, is no longer considered tenable; instead, data are now fit with global-familiarity and parallel self-terminating models (Van Zandt & Townsend, 1993; Townsend & Fifić, 2004). The parallel processing approach assumes that the memory traces are processed simultaneously rather than one at a time. Self-terminating parallel process models capture the reaction time trends by assuming that processes are limited in capacity (i.e., processing slows down with increasing memory load) (Van Zandt & Townsend, 1993).

Recency has been hypothesized to drive list length effects in short-term memory (Monsell, 1978; McElree & Dosher, 1989). Recency predicts that as more items are studied, the mean overall familiarity will decrease due to averaging across more items with longer lags. The recency explanation is consistent with Sternberg's findings when the delay between study and test is relatively short, on the order of 0.1 (Monsell, 1978) to 0.3 (McElree & Dosher, 1989) seconds. Monsell (1978) generated the recency hypothesis to account for data from a recognition task with study lists of 1-4 English letters and a retention interval (RI) of 0.1 second. One concern is that because the stimuli were from a closed set (i.e., there were only four possible items), there may have been increased proactive interference as items were repeated from trial to trial (Coltheart, 1993). Recency results were replicated by McElree and Dosher (1989) with an open set of study lists 2-5 words long and a RI of 0.3 seconds.
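The averaging argument behind the recency account can be illustrated with a brief sketch. The exponential decay of familiarity with lag and the decay rate are hypothetical choices made for illustration and are not taken from Monsell (1978) or McElree and Dosher (1989).

```python
def mean_familiarity(list_length, decay=0.8):
    """Average familiarity over study positions when familiarity falls off with lag."""
    # Lag 1 is the most recently studied item; the decay form and rate are hypothetical.
    return sum(decay ** lag for lag in range(1, list_length + 1)) / list_length


for n in range(1, 7):
    print(n, round(mean_familiarity(n), 3))
```

Longer lists include items at longer lags, so the mean familiarity of a randomly probed target falls with list length even though no item interferes with any other.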

The recency explanation predicts that the time to accurately respond to test items decreases as the delay between study and test is reduced. When the delay is greater than 2 seconds, however, the effect of recency is diminished yet the length effect remains (Forrin & Cunningham, 1973). Forrin and Cunningham (1973) specifically manipulated the study-test delay in intervals of 500, 1500, 2500, or 3500 ms. The study list contained either three or six digits presented for 1 second each. There was no effect of probe position on RT for the list of six digits, but there were still primacy and recency trends for the 3-item list (in both cases average RT increased with list length). The second experiment utilized letters instead of digits to ensure that participants were no more apt to preferentially rehearse the shorter lists because the distracters were less easily predicted (for lists of 6 digits, there were only 4 possible distracters). In the second experiment there were only two retention interval durations, 620 ms and 3620 ms. There were no recency effects at the 3620 ms delay for either the 3-item or the 6-item list length, and yet list length effects were still significant.

Though difficult to envisage on a short-term time scale, rehearsal may affect the efficacy of contextual reinstatement such that response times increase when participants reference the last item rehearsed preferentially relative to the study-list context as a whole. Moreover, rehearsal may introduce a number of different processing strategies; for example, participants may rehearse items more elaborately when anticipating a serial recall task than when anticipating a yes/no recognition task. The Sternberg paradigm originally included a serial recall task following each study-test cycle (Sternberg, 1966).

Duncan and Murdock (2000) found an interaction between intentionality and serial position for a mixed recognition-recall paradigm with a 1-second study-test delay. Each study list was randomly assigned to either a recall or recognition test. When participants were pre-cued for task, there was a clear recency effect in the recognition task trials. However, when post-cued, the recognition serial position function was flat. A questionnaire confirmed that participants prepared for recall in the absence of a cue. Indeed, the serial recall task in the Sternberg paradigm has been demonstrated to affect the shape of the response curve (Corbin & Marquer, 2008; Donkin & Nosofsky, in press). It is clear that factors such as the presentation rate and the study-probe lag can markedly affect the data in replications of the Sternberg (1966) paradigm.

The current research aims to disambiguate factors underlying list length findings in the Sternberg paradigm and to determine if, as in long-term paradigms, it is possible to eliminate the length effect. Such a result would undermine item noise accounts such as the global-matching models of memory. In addition to evaluating item versus context approaches to interference in short-term recognition memory, another overarching theoretical question of interest is whether it is necessary to posit separate memory systems. To the extent that one model is able to capture the list length and word frequency effects across both short- and long-term recognition memory paradigms, it would seem to suggest that there is little motivation to posit distinct memory systems governed by fundamentally disparate processes.

CHAPTER 3

EXPERIMENT 1

The main objective of Experiment 1 was to manipulate the opportunity to rehearse in the Sternberg (1966) paradigm by introducing an articulatory suppression task during the 2-second study-test delay. The serial recall task was eliminated from the paradigm to counteract the effects of strategic encoding. Moreover, stimuli were words rather than digits, to examine the word-frequency effect and make direct comparisons to the long-term memory literature. It was hypothesized that if the articulatory suppression task is sufficiently engaging, the prototypical length effect will be greatly diminished or eliminated.

3.1 Method

3.1.1 Participants

Participants were 44 undergraduates enrolled in an introductory psychology course at The Ohio State University. Participants selected this experiment to fulfill a partial-credit course requirement and all provided informed consent. The 2-second unfilled delay condition included 11 women and 11 men averaging 18.59 years in age. Participants in the 2-second delay with suppression were 10 women and 12 men with a mean age of 19.27 years.

3.1.2 Stimuli

Stimuli were lists of lengths two, four, and six composed of 5-, 6-, and 7-letter low frequency (1-4 Google counts per million) and high frequency (100-200 counts) words. An additional 22 words drawn from mid-frequency ranges (42-62 counts) were presented during the practice portion of the experiment. All words and word frequency counts came from the Google word database.¹ Distracter words were drawn from the same stimulus pool as test words, but did not appear in studied lists. The same words were used across the between-subjects conditions. The distracter task presented five random digits as a single number between 10,000 and 90,000 (displayed without commas) on the screen for 2 seconds.

3.1.3 Design

Within-subjects factors were the length of the study list and word-frequency. The between-subjects factor was the study-test delay. The dependent measure was response time (RT). There were three levels of length, two levels of delay (2-second unfilled and

2-second filled), and two levels of word-frequency (high frequency and low frequency).

3.1.4 Procedure

The experiment was completed in one session that contained three iterations. Each serial position of each list length was probed six times for a total of 12 position-by-length combinations. Each of the 12 types was tested in pure high-frequency and pure low-frequency cycles such that every iteration contained 24 target trials and 24 distracter trials.
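The trial counts above follow from crossing list length, probed serial position, and word frequency. The following is a minimal sketch of that arithmetic only; the names `LIST_LENGTHS`, `FREQUENCIES`, and `target_trial_types` are hypothetical and are not taken from the experiment code.

```python
# Illustrative only: enumerates the target-trial types implied by the design.
LIST_LENGTHS = (2, 4, 6)          # study-list lengths used in Experiment 1
FREQUENCIES = ("high", "low")     # pure high- or low-frequency cycles
ITERATIONS = 3                    # iterations per session


def target_trial_types():
    """Return every (length, probed position, frequency) combination."""
    return [
        (length, position, freq)
        for length in LIST_LENGTHS
        for position in range(1, length + 1)
        for freq in FREQUENCIES
    ]


types_per_iteration = target_trial_types()
position_by_length = {(length, pos) for length, pos, _ in types_per_iteration}

print(len(position_by_length))            # position-by-length combinations
print(len(types_per_iteration))           # target trials per iteration
print(ITERATIONS * len(FREQUENCIES))      # probes per serial position overall
```

Under these assumptions the three printed values are 12, 24, and 6, matching the counts stated in the text.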

Participants were seated at individual desks separated by partitions. Up to eight participants were tested at once, except in the articulatory suppression condition, where each person was tested individually. Participants were instructed to remember a list of study words for a test in which they must indicate whether a word was studied in the prior list or was new (Q = Old / P = New). In addition to this instruction, participants in the suppression condition were informed that digits would appear and that they should quickly and accurately say the digits aloud as a number. Response times were logged using the Python Experiment-Programming Library (PyEPL; Geller, Schleifer, Sederberg, Jacobs, & Kahana, 2007).

¹ The Google word database can be found at http://mall.psy.ohio-state.edu/wiki

Words were presented in the center of the screen for 1.2 seconds each. At the end of study, there was a 2-second delay followed by 0.5 seconds of fixation preceding the test item. Participants were allotted 2.7 seconds to respond. A 1-second delay and a 0.5-second fixation interval then preceded the next cycle. In the articulatory suppression condition, the five digits appeared on the screen immediately following study and remained there for a fixed duration of 2 seconds. The remainder of the cycle proceeded as in the standard condition. A schematic of the study-test blocks for both the 2-second filled- and unfilled-delay conditions can be seen in Figure 3.1.

Figure 3.1: Schematic of the experimental paradigm for Experiment 1. The top row demonstrates the 2-second unfilled delay condition and the bottom row demonstrates the 2-second filled delay condition.

3.2 Results

Two participants with d′ < 0.9 were eliminated from the analysis.² Accuracy was at ceiling (d′(length 2) = 3.07, d′(length 4) = 3.119, and d′(length 6) = 2.769), indicating there were no speed-accuracy tradeoffs. A three-way mixed factors ANOVA on the target RTs revealed a statistically significant main effect of word-frequency, F(1, 42) = 6.433, p < 0.05. The main effect of list length was statistically significant, F(2, 84) = 24.266, p < 0.001. There was also a statistically significant interaction of delay condition by list length, F(2, 84) = 7.391, p < 0.01. All other effects were not statistically significant: main effect of delay condition [ F(1, 42) = 0.018, p = 0.895 ]; interaction of delay condition by word-frequency [ F(1, 42) = 1.123, p = 0.295 ]; interaction of list length by word-frequency [ F(2, 84) = 0.387, p = 0.680 ]; and three-way interaction of delay condition by list length by word-frequency [ F(2, 84) = 0.164, p = 0.849 ].
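The sensitivity measure used for the exclusion criterion can be illustrated with the edge correction described in this section's footnote (Snodgrass & Corwin, 1988): 0.5 is added to the hit and false alarm counts and 1 to the target and distracter counts before the rates are converted to z-scores. The sketch below assumes the standard equal-variance definition of d′ as z(hit rate) minus z(false alarm rate); the function name and the trial counts in the usage lines are hypothetical.

```python
from statistics import NormalDist


def corrected_d_prime(hits, false_alarms, n_targets, n_distracters):
    """Equal-variance d' with the Snodgrass & Corwin (1988) edge correction.

    Adding 0.5 to each count and 1 to each total keeps both rates strictly
    between 0 and 1, so the inverse normal transform is always finite.
    """
    z = NormalDist().inv_cdf
    hit_rate = (hits + 0.5) / (n_targets + 1)
    fa_rate = (false_alarms + 0.5) / (n_distracters + 1)
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts: 72 target and 72 distracter trials.
perfect = corrected_d_prime(72, 0, 72, 72)    # finite despite perfect accuracy
chance = corrected_d_prime(36, 36, 72, 72)    # equal rates give d' of zero
print(perfect, chance)
```

Without the correction, a participant with no false alarms would have a false alarm rate of exactly 0 and an undefined z-score.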

Because there was an interaction of delay condition by list length, a linear regression was run to examine the list length effect in each of the two delay conditions. The linear regression analysis assessed the average of the median RTs across combined positions (1-2, 3-4, and 5-6) for each length per participant. In the 2-second unfilled delay condition, the slope was statistically significantly greater than zero: t(15.77) = 2.656, p < 0.001.³ In the 2-second articulatory suppression condition, the slope was not statistically significantly different from zero: t(23.28) = 0.457, p = 0.649. For the negative probes, an ANOVA revealed a non-statistically significant interaction of length by delay, F(2, 84) = 1.073, p = 0.346. The combined positive and negative probe results are displayed in Figure 3.2.
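One reading of the aggregation described above is sketched here: for a single participant, take the median RT within each combined position bin, average those medians within a list length, and then fit a least-squares slope of RT on list length. This is an assumed interpretation for illustration, not the analysis script; the function names and the RT values are hypothetical.

```python
from statistics import mean, median

# Serial positions are pooled into the combined bins 1-2, 3-4, and 5-6.
COMBINED_BIN = {1: "1-2", 2: "1-2", 3: "3-4", 4: "3-4", 5: "5-6", 6: "5-6"}


def length_rt(trials, length):
    """Average of the per-bin median RTs for one list length.

    trials: iterable of (list_length, probed_position, rt_ms) tuples.
    """
    by_bin = {}
    for trial_length, position, rt in trials:
        if trial_length == length:
            by_bin.setdefault(COMBINED_BIN[position], []).append(rt)
    return mean(median(rts) for rts in by_bin.values())


def ls_slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Made-up RTs (ms) for one hypothetical participant.
trials = [
    (2, 1, 800), (2, 2, 820),
    (4, 1, 900), (4, 2, 900), (4, 3, 950), (4, 4, 970),
    (6, 1, 1000), (6, 2, 1020), (6, 3, 1040), (6, 4, 1060),
    (6, 5, 1080), (6, 6, 1100),
]
lengths = (2, 4, 6)
rts = [length_rt(trials, n) for n in lengths]
print(rts, ls_slope(lengths, rts))
```

A positive slope corresponds to a list length effect; a slope near zero corresponds to its absence.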

² Edge corrections were performed to avoid infinite values of d′ by adding 0.5 to the hit and false alarm counts and 1 to the target and distracter counts (Snodgrass & Corwin, 1988).
³ All t-test degrees of freedom were corrected with the Welch-Satterthwaite equation.
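The non-integer degrees of freedom reported for the t-tests are characteristic of the Welch-Satterthwaite correction. As a minimal sketch, the standard two-sample form of that equation is shown below; how it was applied to the regression slopes in these analyses is not specified in the text, so the function name and inputs are assumptions for illustration.

```python
from statistics import variance


def welch_satterthwaite_df(sample_a, sample_b):
    """Approximate degrees of freedom for a two-sample unequal-variance t-test.

    df = (va + vb)^2 / (va^2 / (na - 1) + vb^2 / (nb - 1)),
    where va and vb are each sample's variance divided by its size.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    v_a = variance(sample_a) / n_a
    v_b = variance(sample_b) / n_b
    return (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))


# Made-up samples: equal sizes and equal variances give df = 2 * (n - 1).
print(welch_satterthwaite_df([1, 2, 3, 4], [2, 3, 4, 5]))
```

When the two samples differ in variance, the corrected value falls below the pooled degrees of freedom, which is why the reported values are fractional.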

Recency trends were assessed on the average of median RTs for combined positions (1-2, 3-4, and 5-6) for each given length. The results of the linear regression analysis for the 2-second unfilled delay condition are as follows: list length 4 [ t(66.84) = -1.267, p = 0.212 ]; and list length 6 [ t(17.3) = -0.726, p = 0.471 ]. The results for the suppression condition are: list length 4 [ t(48.77) = -0.774, p = 0.443 ]; and list length 6 [ t(25.19) = -1.789, p = 0.078 ]. Although there appears to be a recency trend in the filled delay condition, it was not significant and it was not sufficient to generate a significant list length effect. The recency results for the 2-second unfilled delay and the 2-second delay with articulatory suppression are presented in Figures 3.3 and 3.4, respectively.

In the 2-second unfilled delay condition, there was no interaction between word frequency and list length, F(2, 42) = 0.213, p = 0.809. However, there was a statistically significant word-frequency effect when comparing high frequency words and low frequency words, F(1, 21) = 6.817, p < 0.05. The word frequency results for the 2-second unfilled delay condition are presented in Figure 3.5. As is evident in Figure 3.5, the difference in average RTs does not appear to be practically significant. To further investigate this discrepancy, pairwise t-tests were conducted at each length between high frequency and low frequency RTs in the unfilled delay condition. The results of the t-tests are: list length 2 [ t(21) = -1.317, p = 0.202 ]; list length 4 [ t(21) = -1.331, p = 0.198 ]; and list length 6 [ t(21) = -0.72, p = 0.479 ]. In the 2-second filled delay condition, there was no statistically significant interaction between word frequency and list length, F(2, 42) = 0.31, p = 0.735; moreover, there was no word frequency effect: F(1, 21) = 1.037, p = 0.32. The word-frequency results for the 2-second filled delay condition are displayed in Figure 3.6.

3.3 Discussion

The typical Sternberg list length effect was replicated when rehearsal was possible during the 2-second study-test delay; however, a filled delay eliminated the list length effect. Recency effects were not found in the 2-second unfilled delay condition despite increases in response time as a function of length. In this case, recency alone does not capture the short-term list length effect, though it may when the recognition test occurs immediately after study (Monsell, 1978). The overwhelming pattern, however, indicates that recency was not evident. Because there was no reliable word frequency effect in either condition (the main effect in the unfilled delay condition was not supported by the pairwise tests at any list length), it is possible that reinstatement of the study list context is facilitated in short-term recognition memory to the extent that frequency effects are not observed.

[Figure 3.2 plot. Title: Length by RT. x-axis: Length (0 to 7); y-axis: RT (700 to 1400). Series: Suppression: Correct Rejections; No Suppression: Correct Rejections; Suppression: Hits; No Suppression: Hits.]

Figure 3.2: The averaged median latencies to correctly respond yes to targets (solid lines) and no to distracters (dashed lines) in Experiment 1 as a function of list length. The square symbols mark data corresponding to the suppression condition where articulation was suppressed during the 2-second delay between study and test. The triangle symbols mark data corresponding to the condition where there was no distracter task during the 2-second study-test delay. Error bars represent the standard error.

[Figure 3.3 plot. Title: Serial Position: No Suppression. x-axis: Position (−7 to 0); y-axis: RT (700 to 1400). Series: Length 6; Length 4; Length 2.]

Figure 3.3: Results of the serial position analysis in the condition without articulatory suppression. The graph displays averaged median latencies for combined positions to correctly respond yes to targets in Experiment 1 as a function of position for each list length in the unfilled 2-second delay condition. Error bars represent the standard error.

[Figure 3.4 plot. Title: Serial Position: Suppression. x-axis: Position (−7 to 0); y-axis: RT (700 to 1400). Series: Length 6; Length 4; Length 2.]

Figure 3.4: Results of the serial position analysis in the articulatory suppression condition. The graph displays averaged median latencies for combined positions to correctly respond yes to targets in Experiment 1 as a function of position for each list length in the 2-second articulatory suppression delay. Error bars represent the standard error.

[Figure 3.5 plot. Title: Length by RT. x-axis: Length (0 to 7); y-axis: RT (700 to 1400). Series: Low Frequency; High Frequency.]

Figure 3.5: The word-frequency effect for the condition without articulatory suppression. Results depict the averaged median latencies to correctly respond yes to targets in Experiment 1 as a function of word frequency across list length in the unfilled 2-second delay condition. Reaction time to low-frequency target words is depicted with a solid line, while high-frequency word RT is depicted with a dashed line. Error bars represent the standard error.

[Figure 3.6 plot. Title: Length by RT. x-axis: Length (0 to 7); y-axis: RT (700 to 1400). Series: Low Frequency; High Frequency.]

Figure 3.6: The word-frequency effect in the articulatory suppression condition. Results depict the averaged median latencies to correctly respond yes to targets in Experiment 1 as a function of word frequency across list length in the 2-second articulatory suppression delay condition. Reaction time to low-frequency target words is depicted with a solid line, while high-frequency word RT is depicted with a dashed line. Error bars represent the standard error.

CHAPTER 4

EXPERIMENT 2

Experiment 2 was designed to investigate whether context noise plays a role in short-term recognition memory. More specifically, we aimed to demonstrate a word-frequency effect by changing the probability of accurate reinstatement of the study-list context. In short-term memory paradigms, contextual reinstatement is typically very accurate because study lists appear in rapid succession, requiring clear boundaries between study-test cycles. As such, participants are motivated to isolate the most recent context to decrease false alarms in response to items from prior lists. We tested the hypothesis that increasing the delay between study and test increases the word-frequency effect by separating lists by 15 seconds, either within or between study-test cycles. To counter the effects of rehearsal, the 15-second delay was filled with a distracter task.

4.1 Method

4.1.1 Participants

Participants were 36 undergraduate students enrolled in an introductory psychology course at The Ohio State University. Participants selected this experiment to fulfill a partial-credit course requirement and all provided informed consent. The 2-second delay condition included 11 women and 7 men averaging 21.44 years in age. There were 6 women and 12 men with a mean age of 20.11 years in the 15-second delay condition.

4.1.2 Stimuli

Word stimuli were identical to those used in Experiment 1. The distracter task consisted of randomly generated addition / subtraction problems. Half the problems were incorrect by one numeric interval, for example: 9 - 7 = 2 (Q = Yes / P = No). The task was self-paced and included feedback (Correct / Wrong).
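A distracter block of this kind can be sketched as follows. The sketch assumes single-digit operands and non-negative true answers, neither of which is stated in the text, and the function names `make_problem` and `make_block` are hypothetical; it only illustrates the stated constraint that half the displayed answers are off by one.

```python
import random


def make_problem(rng, show_correct):
    """Build one addition or subtraction problem as display text.

    Operand range (1-9) is an assumption. When show_correct is False, the
    displayed answer is wrong by exactly one.
    """
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    op = rng.choice("+-")
    if op == "-" and b > a:
        a, b = b, a                      # keep the true answer non-negative
    answer = a + b if op == "+" else a - b
    shown = answer if show_correct else answer + rng.choice((-1, 1))
    return f"{a} {op} {b} = {shown}", show_correct


def make_block(n_problems, seed=None):
    """Return n_problems (text, is_correct) pairs, half of them incorrect."""
    rng = random.Random(seed)
    n_correct = n_problems // 2
    flags = [True] * n_correct + [False] * (n_problems - n_correct)
    rng.shuffle(flags)
    return [make_problem(rng, flag) for flag in flags]


for text, is_correct in make_block(4, seed=0):
    print(text, "(Q = Yes)" if is_correct else "(P = No)")
```

Shuffling a fixed list of flags, rather than flipping a coin per problem, guarantees the half-and-half split within every block.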

4.1.3 Design

The within-subjects factors of Experiment 2 were the length of the study list and word frequency. The between-subjects factor was the study-test delay. The dependent measure was response time (RT). There were three levels of length, two levels of delay (2 seconds unfilled and 15 seconds filled), and two levels of word-frequency.

4.1.4 Procedure

The experiment was completed in 3 single-iteration sessions conducted 7 days apart. The same stimuli were used in each session, but the order of individual stimuli, list lengths, and probe positions was randomized between days. Each iteration consisted of 24 target trials and 24 distracter trials for a total of 6 observations per positive probe position.

The 15 second math task appeared in both conditions (either between or after each study-test cycle) to equate overall testing time. Participants were informed of when the math test would appear, and were instructed to respond both quickly and accurately on all tasks.

Words were presented in the center of the screen for 1.2 seconds each. At the end of study, there was a 2-second delay followed by 0.5 seconds of fixation preceding the test item. Participants were allotted 2.7 seconds to respond. When a response was registered, the math portion began. Each of the 16 problems was presented for up to 1.875 seconds. The total possible time was 30 seconds (16 × 1.875 seconds); but, on average, the delay was approximately 15 seconds. A 1-second delay and 0.5-second fixation interval signaled the start of the next study block. In the delayed condition, the first of 8 math problems appeared on the screen immediately following study. There then followed a 1-second delay and 0.5-second fixation interval before the next cycle. A schematic of the study-test blocks for the 15-second filled- and 2-second unfilled-delay conditions can be seen in Figure 4.1.

Figure 4.1: Schematic of the experimental paradigm for Experiment 2. The top row demonstrates the 2-second unfilled delay condition (with the filler task between cycles) and the bottom row demonstrates the 15-second filled delay condition.

4.2 Results

Four participants with d′ < 0.9 were eliminated.⁴ Accuracy was at ceiling (d′(length 2) = 2.663, d′(length 4) = 2.371, and d′(length 6) = 2.661), indicating there were no speed-accuracy tradeoffs. A three-way mixed factors ANOVA on the target RTs revealed a statistically significant main effect of delay condition, F(1, 32) = 4.584, p < 0.05. The main effect of list length was also statistically significant, F(2, 66) = 8.924, p < 0.001. All other effects were not statistically significant: main effect of word-frequency [ F(1, 32) = 2.08, p = 0.159 ]; interaction of delay condition by list length [ F(2, 66) = 0.122, p = 0.885 ]; interaction of delay condition by word-frequency [ F(1, 32) = 1.352, p = 0.254 ]; interaction of list length by word-frequency [ F(2, 63) = 0.561, p = 0.573 ]; and three-way interaction of delay condition by list length by word-frequency [ F(2, 63) = 0.077, p = 0.926 ].

Although the interaction of delay condition by list length was not statistically significant, there were main effects of both delay condition and list length, and to keep analyses consistent with Experiment 1, a linear regression was run to examine the list length effect in each of the two delay conditions. The linear regression analysis assessed the average of the median RTs (for combined positions: 1-2, 3-4, and 5-6) as a function of position for each list length. In the 2-second unfilled delay condition, the slope was statistically significantly greater than zero: t(16.19) = 3.272, p < 0.01.⁵ The slope in the 15-second filled delay condition was not statistically significantly different from zero: t(15.62) = 0.63, p = 0.531. For the negative probes, an ANOVA revealed a non-statistically significant interaction of length by delay, F(2, 68) = 0.315, p = 0.731. The list length results for both positive and negative probes in Experiment 2 are presented in Figure 4.2.

⁴ Edge corrections were performed to avoid infinite values of d′ by adding 0.5 to the hit and false alarm counts and 1 to the target and distracter counts (Snodgrass & Corwin, 1988).
⁵ All t-test degrees of freedom were corrected with the Welch-Satterthwaite equation.

A linear regression analysis was conducted to examine the recency effects. RTs were combined such that the medians for positions 1-2, 3-4, and 5-6 were averaged for each participant. The results for combined positions across each length in the 2-second unfilled delay condition are as follows: list length 4 [ t(33.67) = 0.163, p = 0.871 ]; and list length 6 [ t(17.253) = 0.549, p = 0.586 ]. The results for the 15-second filled delay condition are: list length 4 [ t(29.287) = 0.305, p = 0.762 ]; and list length 6 [ t(15.869) = 0.248, p = 0.805 ]. Serial position graphs for the 2-second unfilled delay and the 15-second filled delay are presented in Figures 4.3 and 4.4.

In the 2-second unfilled delay condition, there was no interaction between word frequency and list length, F(2, 33) = 0.218, p = 0.805. Similarly, there was no word-frequency effect when comparing high frequency words and low frequency words, F(1, 16) = 2.46, p = 0.136. The word frequency results for the 2-second unfilled delay condition are presented in Figure 4.5. The frequency effect result for the 2-second unfilled delay condition was surprising given the results depicted in Figure 4.5, where the RT for low frequency words is consistently lower than for high frequency words at each list length. To further investigate this trend, pairwise t-tests were conducted at each length between high frequency and low frequency RTs. The results of the t-test at list length 2 were not statistically significant, t(16) = 0.188, p = 0.853. There was no statistically significant difference between high frequency words and low frequency words at list length 4 either, though the results are trending in that direction: t(17) = 1.748, p = 0.098. However, there was a significant difference in the RTs at list length 6, t(17) = 2.476, p < 0.05.

In the 15-second filled delay condition, there was no significant interaction between word frequency and list length, F(2, 30) = 0.511, p = 0.605. Additionally, there was no word frequency effect: F(1, 16) = 0.198, p = 0.662. The word-frequency effect results for the 15-second filled delay condition are displayed in Figure 4.6.

4.3 Discussion

Results demonstrated a null list length effect when the design included both a longer study-test lag and a distracter task. As with Experiment 1, recency effects do not appear to drive the list length effects. Interestingly, although there was no WFE in the 2-second unfilled delay condition in Experiment 1, the effect is evidenced here when the only difference is the 15-second distracter task between study-test cycles. Because the word-frequency effect was attenuated when the filler task occurred within cycles, it is possible that proactive interference from previous study list contexts played a role.

[Figure 4.2 plot. Title: Length by RT. x-axis: Length (0 to 7); y-axis: RT (700 to 1400). Series: 15 s Delay: Correct Rejections; 2 s Delay: Correct Rejections; 15 s Delay: Hits; 2 s Delay: Hits.]

Figure 4.2: The averaged median latencies to correctly respond yes to targets (solid lines) and no to distracters (dashed lines) in Experiment 2 as a function of list length. The square symbol marks data corresponding to the rehearsal condition where there was no distracter task during the 2-second study-test delay. The triangle symbols mark data corresponding to the condition where there was a 15-second distracter task during the study-test delay. Error bars represent the standard error.

[Figure 4.3 plot. Title: Serial Position: 2 s Delay. x-axis: Position (−7 to 0); y-axis: RT (700 to 1400). Series: Length 6; Length 4; Length 2.]

Figure 4.3: Results of the serial position analysis in the condition where the 15 second distracter task occurred between study-test cycles. The graph displays averaged median latencies for combined positions to correctly respond yes to targets in Experiment 2 as a function of position for each list length in the unfilled 2-second delay condition. Error bars represent the standard error.

[Figure 4.4 plot. Title: Serial Position: 15 s Delay. x-axis: Position (−7 to 0); y-axis: RT (700 to 1400). Series: Length 6; Length 4; Length 2.]

Figure 4.4: Results of the serial position analysis in the condition where the 15-second distracter task occurred within study-test cycles. The graph displays averaged median latencies for combined positions to correctly respond yes to targets in Experiment 2 as a function of position for each list length in the 15-second delay condition. Error bars represent the standard error.

[Figure 4.5 plot. Title: Length by RT. x-axis: Length (0 to 7); y-axis: RT (700 to 1400). Series: Low Frequency; High Frequency.]

Figure 4.5: The word-frequency effect for the filler between cycles condition. The graph displays the averaged median latencies to correctly respond yes to targets in Experiment 2 as a function of word frequency across list length in the unfilled 2-second delay condition. Reaction time to low frequency target words is depicted with a solid line, while high frequency word RT is depicted with a dashed line. Error bars represent the standard error.

[Figure 4.6 plot. Title: Length by RT. x-axis: Length (0 to 7); y-axis: RT (700 to 1400). Series: Low Frequency; High Frequency.]

Figure 4.6: The word-frequency effect for the filler within cycles condition. The graph displays the averaged median latencies to correctly respond yes to targets in Experiment 2 as a function of word frequency across list length in the 15-second delay condition. Reaction time to low frequency target words is depicted with a solid line, while high frequency word RT is depicted with a dashed line. Error bars represent the standard error.

CHAPTER 5

GENERAL DISCUSSION

One of the most important questions in memory research is what role context plays in encoding and retrieval (Tulving, 1972). Most memory models focus on specifying the processes and mechanisms implicated in memory with the simplifying assumption that context or the items presented at study are immediately accessible (Izawa, 1999). Early mechanistic models attempted to resolve this problem by postulating a short-term store (STS) that isolated the most recent study context from long-term memory (Atkinson & Shiffrin, 1968). According to the Atkinson and Shiffrin (1968) model, the STS had control processes regulating retrieval and rehearsal that determined how well information was retained and accessed. BCDMEM (Dennis & Humphreys, 2001), a context noise model, captures these processes through assumptions regarding contextual reinstatement and pre-experimental associations, without positing any mechanistically distinct short-term system.

The current research aimed to elucidate factors underlying forgetting in a short-term recognition memory task, in particular addressing the role item noise and context noise play in the Sternberg (1966) paradigm. Recognition impairment is generally attributed to interference that may derive from an accumulation of noise over the length of a studied list (e.g., Shiffrin & Steyvers, 1997), interference in matching items to the original context of interest (e.g., Dennis & Humphreys, 2001), or a combination of both sources (e.g., Criss & Shiffrin, 2011). Global-matching models suggest that interference in predicting accurate yes/no decisions derives from the accumulation of noise over successive item-to-item matches between the test probe and the items presented in the study list. These models therefore require a list length effect, because more studied items entail more matches and consequently more overall noise (Humphreys et al., 1989).

The list length effect has been demonstrated to be a robust finding in recognition memory. However, in long-term paradigms, it has been argued that decrements in performance, at least for word stimuli (Kinnell & Dennis, 2011), are not a direct consequence of item noise but are rather due to a number of confounding factors inherent to the testing paradigm (Humphreys, Bain & Pike, 1989; Dennis & Humphreys, 2001; Dennis et al., 2008). The current data suggest that in no instance does item noise drive the list length effect in short-term memory. When there are list length effects, they are likely due to: (1) recency in immediate tests; (2) serial rehearsal in recall-oriented tasks; and (3) rehearsal when the study-test delay does not include an engaging distracter task.

The first experiment incorporated one of the main experimental controls found to be effective in eliminating list length effects in long-term recognition paradigms: contextual reinstatement (e.g., Schulman, 1974; Murnane & Shiffrin, 1991; Dennis & Humphreys, 2001; Dennis & Chapman, 2010; Kinnell & Dennis, 2011). Contextual reinstatement was manipulated in the Sternberg paradigm by introducing an engaging distracter task during the study-test delay to prevent subvocal articulation. Preventing rehearsal effectively isolates the study-list context such that participants will not defer to the last item as indicative of the entire study list and will instead engage in full contextual reinstatement.

Results from Experiment 1 demonstrated that there was no list length effect at a 2-second delay in which rehearsal is prevented. When rehearsal is possible, list length effects are maintained. However, in both instances, recency does not appear to affect the list length results. The fact that there are no recency effects suggests that the items are not simply accessed from a short-term store.

The second experiment modulated contextual reinstatement with the unfilled versus filled delay manipulation, and additionally by manipulating the temporal distinctiveness of study list contexts. This was achieved by adding a 15-second distracter task either between study-test cycles or within study-test cycles. When the distracter occurred between cycles, the LLE was maintained. This result is concordant with Experiment 1 because in both cases the within-cycle delay consisted of a 2-second unfilled interval and in both cases there were length effects. Interestingly, the between-cycle delay condition in particular exhibited a WFE, which was not evident in any other condition. One possible explanation for this effect is that participants utilized strategic partitioning of context in order to guard against proactive interference. Proactive interference (PI) refers to the negative impact that prior information has on encoding efficacy (Postman & Underwood, 1973).

In short-term memory paradigms, contextual reinstatement is typically very accurate because lists appear in quick succession, requiring clear boundaries between study-test cycles (these could be established by a mechanism similar to directed forgetting; Sahakyan & Kelley, 2002). In the 2-second unfilled delay condition, the current study list is more distanced from the prior study list, so strategic forgetting may not be utilized. Consequently, contextual reinstatement may actually be worse in the 2-second delay condition compared with the 15-second delay condition.

A strategic forgetting scenario is very plausible in short-term recognition paradigms. Although Sternberg (1966, 1969) did not manipulate word frequency, using a closed set of stimuli (in the case of the Sternberg paradigm, the digits 0-9) would theoretically accentuate the problem of proactive interference because there may be a carry-over effect in which participants rehearse the previous sets of digits to predict those that are probabilistically most likely to appear in the current study list.

5.1 The BCDMEM Model

The pattern of results from the two data sets is easily explained by BCDMEM (Dennis & Humphreys, 2001). The word-frequency effect can be easily associated with contextual reinstatement (parameter d), whereas the list length effect depends more on the probability of learning a given link (r) between active study context vector components and the local item nodes. Dennis and Humphreys (2001) suggested that the low frequency advantage is due to the fact that low frequency words will cue fewer pre-experimental contexts. This prediction was captured by adjusting the context noise parameter (p) such that the low frequency words had lower estimates.
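The role of the context noise parameter can be illustrated with a small worked example. The sketch below assumes that a node active in the study context is active in the retrieved context vector with probability r + p - rp for a studied word (learned link or context noise) and with probability p for an unstudied word. The function names and the parameter values are hypothetical and are not estimates from the current experiments.

```python
def p_node_active(r: float, p: float, studied: bool) -> float:
    """Probability that a study-context node is active in the retrieved vector.

    r: probability that the context-to-item link was learned at study.
    p: context noise, the probability that the node is active because of
       pre-experimental contexts.
    """
    if studied:
        # Active if the link was learned OR the node is active through noise.
        return r + p - r * p
    # An unstudied word can only activate the node through context noise.
    return p


def old_new_separation(r: float, p: float) -> float:
    """Difference in node activation probability, studied minus unstudied."""
    return p_node_active(r, p, True) - p_node_active(r, p, False)


# Illustrative values only (assumptions, not fitted parameters).
R = 0.5
P_LOW_FREQ = 0.1   # low frequency words cue fewer pre-experimental contexts
P_HIGH_FREQ = 0.3

sep_low = old_new_separation(R, P_LOW_FREQ)
sep_high = old_new_separation(R, P_HIGH_FREQ)
print(f"low frequency separation:  {sep_low:.2f}")
print(f"high frequency separation: {sep_high:.2f}")
```

Under these assumptions the separation simplifies to r(1 - p), so a lower context noise estimate yields a larger difference between studied and unstudied words, which is the direction of the low frequency advantage.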

According to BCDMEM, context noise can vary on an item-by-item basis, but must be estimated through an automatic low-level mechanism. Consequently, it is unreasonable to adjust parameter p to account for proactive interference from prior study-test cycles. However, it makes sense that proactive interference from previous study lists would impair participants' ability to match features from the relevant study list context. Although there will be more active nodes in the retrieved vector representing pre-experimental contexts for high frequency words, this source of noise will have less of an impact if the retrieved composite vector for related contexts (m) closely matches the reinstated study context vector (c).

Moreover, the word-frequency effect was disproportionately affected by the ability to isolate the current study context from prior contexts in a short-term task compared to long-term tasks, which is why simply adjusting context noise cannot capture the range of the effect. Preventing rehearsal in a 2-second lag has the same result as dramatically increasing the delay to approximately 8 minutes in a long-term task, suggesting that it reduces the tendency to reference the end-of-list context rather than to engage in full contextual reinstatement. There is no need to further adjust context noise on an item-by-item basis in this case; rather, the contextual reinstatement parameter (d) can be increased in the unfilled 2-second delay conditions where rehearsal was possible (perfect reinstatement occurs when d = 0).
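To make the role of d concrete, the following is a simplified simulation sketch. It assumes that d is the probability that a node active in the study context fails to be reinstated at test, and it only counts nodes that are active in both the reinstated and the retrieved vectors; it omits spurious activation outside the study context and the Bayesian decision rule of the full model. The function name and all parameter values are hypothetical.

```python
import random


def mean_overlap_advantage(d: float, r: float, p: float,
                           n_active: int = 200, n_trials: int = 500,
                           seed: int = 0) -> float:
    """Mean old-minus-new count of nodes active in both context vectors.

    d: probability that a study-context node is lost at reinstatement.
    r: probability that the context-to-item link was learned.
    p: context noise for the test word.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trials):
        for _ in range(n_active):
            # The node survives reinstatement with probability 1 - d.
            if rng.random() < d:
                continue
            # Studied word: node active through learning or context noise.
            old_active = rng.random() < (r + p - r * p)
            # Unstudied word: node active through context noise only.
            new_active = rng.random() < p
            total += int(old_active) - int(new_active)
    return total / n_trials


perfect = mean_overlap_advantage(d=0.0, r=0.5, p=0.2)
noisy = mean_overlap_advantage(d=0.5, r=0.5, p=0.2)
print(f"d = 0.0: mean overlap advantage = {perfect:.1f}")
print(f"d = 0.5: mean overlap advantage = {noisy:.1f}")
```

In expectation the advantage is n_active * (1 - d) * r * (1 - p), so increasing d shrinks the evidence that separates studied from unstudied words, and it does so without any change to the item-level context noise.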

In both experiments, the list length effect consistently occurred when there was an unfilled delay between study and test. When there is no distracter task, participants have the opportunity to rehearse items from the study list. During the fixed delay interval, items from shorter lists may be rehearsed more completely, thereby increasing the probability of encoding features associated with those items. To capture this finding with BCDMEM, the learning parameter (r) can be increased, resulting in an encoding advantage for shorter study lists. Context noise will have less of an impact if learning (parameter r) is higher, meaning there will be more links between the study item and the study-list context. Additionally, the contextual reinstatement parameter (d) can be sufficiently small so as to negate the effects of context noise.

While GMMs could theoretically capture null list strength and length effects by reducing item interference to a negligible degree, it would be difficult for these models to simultaneously predict the word frequency effect. For example, REM could predict null list length and list strength effects without the differentiation mechanism by reducing the inter-item similarity in the memory representations. At the same time, since item interference would then be low enough to eliminate length effects, REM could not predict frequency effects either. Moreover, since learning is captured by parameter (t), which is based on the item strength, there is no list-wide adjustment to account for proactive interference. Arguably, REM cannot capture the LLE, WFE, and LSE as well as BCDMEM, but REM is only one of many exemplar-based models (though highly relevant in the recognition memory literature). Therefore, it is also useful to examine models that are specifically designed to capture reaction time effects in short-term domains.

One such model, the generalized context model (GCM; Nosofsky, 1986), was originally developed to explain categorization, but has since been applied to recognition memory and adopts many of the same mechanisms as GMMs. In GCM, similarity reflects the distance (in psychological space) between stimulus i and stimulus j. The sensitivity scalar (c_j) affects the rate of the exponential decay of similarity with psychological distance, such that low sensitivity predicts a shallow similarity gradient, meaning that it will be difficult to discriminate among exemplar traces. In this sense, high-similarity stimuli are conceptually the same as high-frequency words. As with the GMM framework, the probe item is compared to all of the studied item traces, and familiarity is summed over each match. The activation match (a_ij) is determined by the item's strength in memory (m_j), which varies on an item-by-item basis, multiplied by the similarity between item i and item j (s_ij).
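The activation computation described above can be sketched as follows, assuming the exponential-decay form s_ij = exp(-c_j * d_ij) and the product a_ij = m_j * s_ij. The function names, distances, strengths, and sensitivity values are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def similarity(distance: float, sensitivity: float) -> float:
    """Exponential-decay similarity: s_ij = exp(-c_j * d_ij)."""
    return math.exp(-sensitivity * distance)


def summed_activation(distances: Sequence[float],
                      strengths: Sequence[float],
                      sensitivities: Sequence[float]) -> float:
    """Sum a_ij = m_j * s_ij over all stored exemplars j for one probe i."""
    return sum(m * similarity(d, c)
               for d, m, c in zip(distances, strengths, sensitivities))


# A probe that matches the first exemplar exactly (distance 0) and lies one
# unit away from two other exemplars. Values are illustrative assumptions.
distances = [0.0, 1.0, 1.0]
strengths = [1.0, 1.0, 1.0]

high_sensitivity = summed_activation(distances, strengths, [5.0, 5.0, 5.0])
low_sensitivity = summed_activation(distances, strengths, [0.5, 0.5, 0.5])
print(f"summed activation, high sensitivity: {high_sensitivity:.3f}")
print(f"summed activation, low sensitivity:  {low_sensitivity:.3f}")
```

With low sensitivity, the two non-matching exemplars contribute almost as much activation as the matching one, which is the sense in which a shallow similarity gradient makes exemplar traces hard to discriminate.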

5.2 The EBRW Model

The exemplar-based random-walk model (EBRW; Nosofsky & Palmeri, 1997) adopts the same framework as GCM and has been specifically adapted to recognition domains. According to this model, memory is represented as a combination of the stored exemplars presented at study and background noise. Retrieval follows a random-walk decision process in which the participant sets criteria to indicate whether the item was old or new. Retrieved exemplars cause a step toward the old criterion, and retrieval of a noise element results in a step toward the new criterion. There are two ways in which the concept of criterion is expressed in the model. The first is the location of the old and new thresholds, which affects the number of steps required before accumulating sufficient evidence to respond. The second is the drift rate (and direction) of the random walk, which is ultimately contingent on the degree of activation of background elements (B).
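The two expressions of criterion can be illustrated with a minimal random-walk simulation. The sketch assumes unit steps, a fixed probability of stepping toward the old threshold on every step, and symmetric thresholds; these simplifications, the function names, and the parameter values are assumptions made for illustration and are not the fitted model.

```python
import random


def random_walk_trial(p_old_step: float, old_threshold: int,
                      new_threshold: int, rng: random.Random,
                      max_steps: int = 10_000) -> tuple[str, int]:
    """Run one walk and return (response, number of steps taken)."""
    position = 0
    for step in range(1, max_steps + 1):
        # Retrieving an exemplar moves toward "old"; noise moves toward "new".
        position += 1 if rng.random() < p_old_step else -1
        if position >= old_threshold:
            return "old", step
        if position <= -new_threshold:
            return "new", step
    return "none", max_steps  # Safety exit; not expected with these values.


def simulate(p_old_step: float, threshold: int,
             n_trials: int = 2000, seed: int = 1) -> tuple[float, float]:
    """Return (proportion of old responses, mean number of steps)."""
    rng = random.Random(seed)
    trials = [random_walk_trial(p_old_step, threshold, threshold, rng)
              for _ in range(n_trials)]
    prop_old = sum(resp == "old" for resp, _ in trials) / n_trials
    mean_steps = sum(steps for _, steps in trials) / n_trials
    return prop_old, mean_steps


narrow = simulate(p_old_step=0.7, threshold=3)
wide = simulate(p_old_step=0.7, threshold=5)
print(f"thresholds at 3: P(old) = {narrow[0]:.3f}, mean steps = {narrow[1]:.1f}")
print(f"thresholds at 5: P(old) = {wide[0]:.3f}, mean steps = {wide[1]:.1f}")
```

Moving the thresholds outward increases the number of steps needed to respond, whereas changing the step probability changes the drift of the walk and therefore both the response and its latency.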

According to EBRW, “memory-set size influences [the memory-strength and sensitivity] parameter settings indirectly: The greater the memory-set size, the more exemplars there will be that have greater lags” (Nosofsky et al., 2011, p. 284). In other words, length effects would seem to be driven by a recency effect (Monsell, 1978). Since activation (A) reflects the summed strength, increases in set size would lead to an ever-increasing probability (p) of steps toward the old threshold. The activation of background elements, B, must increase to offset the summed strength; consequently, B increases linearly with set size (S).
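The need for B to scale with set size can be seen in a short numerical sketch. It assumes that the step probability takes the ratio form p = A / (A + B), that every studied item contributes the same strength to A, and that B is a linear function of S; the intercept, slope, and strength values are hypothetical.

```python
from typing import Sequence


def step_probability(strengths: Sequence[float], background: float) -> float:
    """Probability of a step toward the old threshold: p = A / (A + B)."""
    summed = sum(strengths)  # A, the summed strength of studied exemplars
    return summed / (summed + background)


def background_linear(set_size: int, intercept: float, slope: float) -> float:
    """Background activation that grows linearly with set size S."""
    return intercept + slope * set_size


# Illustrative comparison: constant B versus B that scales with set size.
for set_size in (1, 2, 4, 6):
    strengths = [1.0] * set_size  # equal strengths, an assumption
    p_fixed = step_probability(strengths, background=2.0)
    p_scaled = step_probability(
        strengths, background_linear(set_size, intercept=0.0, slope=2.0))
    print(f"S = {set_size}: p with fixed B = {p_fixed:.2f}, "
          f"p with linear B = {p_scaled:.2f}")
```

With a constant B the step probability climbs toward 1 as the set grows, whereas a linearly increasing B holds it steady, which is the offsetting role described above.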

While it makes sense under this exemplar-based framework for activation of background elements (B) to increase as activation of studied exemplars (A) increases, thereby predicting a list length effect, it is not clear why the activation of background elements would change as a consequence of a filled versus unfilled delay. Utilizing articulatory suppression during a study-test lag theoretically has no effect on the exemplars and background elements whatsoever (particularly since the distracter task involves non-word stimuli). Moreover, the authors suggest that recency drives the list length effect in both Monsell (1978) and their first experiment (Experiment 1), which involved a study-test delay of approximately 2 seconds, a delay that is known to attenuate recency (Forrin & Cunningham, 1973).

CHAPTER 6

CONCLUSIONS

For modal models of memory, rehearsal is crucial to the transfer of information from short-term to long-term memory (Waugh & Norman, 1965; Atkinson & Shiffrin, 1968). As argued here, rehearsal may increase the extent to which the studied context, as a unit, is learned. However, rehearsal may also affect recency in a way similar to the concept of circular scanning (Sternberg, 1969). Circular scanning was an idea Sternberg introduced to account for the finding that some participants displayed flat serial position curves, while others showed strong primacy effects.

Sternberg (1969) hypothesized that individual differences may reflect starting strategies in scanning. In particular, rehearsal may induce a circular scan that begins with the last item rehearsed. This is an issue that has also been addressed by Nosofsky et al. (2011) for participants who demonstrated flat lag-RT functions. However, these results apply to how set size affects recency, not list length across multiple lists. The current results and similar findings in long-term recognition paradigms suggest that delay and rehearsal function in concert to predict serial position and list length trends. Examining contingencies in strategies of rehearsal, length of delay, presentation time, and stimulus type (i.e., closed set or open set) may be an interesting area for future research.

As demonstrated, there is continuity in the primary source of interference across short- and long-term paradigms: specifically, item interference does not drive list-length and word-frequency effects. Arguably, REM cannot capture both a null list length and null list strength effect as parsimoniously as BCDMEM. Fundamentally, GMMs cannot possibly capture the range of findings presented over short- and long-term time frames because they hinge critically on an item noise explanation. Context noise does appear to play a role in short-term recognition memory, but only when there is difficulty isolating the study-test cycles as well as the study-list context itself. In other instances, contextual reinstatement is not sufficiently noisy to demonstrate the list length effect and the low frequency word advantage. BCDMEM can easily account for the current experimental findings, demonstrating that rehearsal affects recognition memory performance similarly across both short-term and long-term memory domains.

References

Atkinson, R. C., & Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation. New York: Academic Press.

Baddeley, A. D., & Hitch, G. J. (1974). Working memory. In G. H. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory. New York: Academic Press.

Bjork, R. A., & Whitten, W. B. (1974). Recency-sensitive retrieval processes in long-term free recall. Cognitive Psychology, 6, 173–189.

Bowles, N. L., & Glanzer, M. (1983). An analysis of interference in recognition memory. Memory and Cognition, 11, 307–315.

Brown, G. D. A., Neath, I., & Chater, N. (2007). A temporal ratio model of memory. Psychological Review, 107, 127–181.

Clark, S. E., & Gronlund, S. D. (1996). Global matching models of recognition memory: How the models match the data. Psychonomic Bulletin and Review, 3, 36–60.

Coltheart, V. (1993). Effects of phonological similarity and concurrent irrelevant articulation on short-term-memory recall of repeated and novel word lists. Memory and Cognition, 21, 539–545.

Corbin, L., & Marquer, J. (2008). Effect of a simple experimental control: The recall constraint in Sternberg’s memory scanning task. European Journal of Cognitive Psychology, 20, 913–935.

Criss, A. H., & Shiffrin, R. M. (2004). Context-noise and item-noise jointly determine recognition memory: A comment on Dennis and Humphreys (2001). Psychological Review, 111, 800–807.

Dennis, S., & Chapman, A. (2010). The inverse list length effect: A challenge for pure exemplar models of recognition memory. Journal of Memory and Language, 63, 416–424.

Dennis, S., & Humphreys, M. S. (2001). The role of context in episodic recognition: The bind cue decide model of episodic memory. Psychological Review, 108, 452–478.

Dennis, S., Lee, M. D., & Kinnell, A. (2008). Bayesian analysis of recognition memory: The case of the list-length effect. Journal of Memory and Language, 59, 361–376.

Donkin, C., & Nosofsky, R. M. (n.d.). The form of short-term memory scanning: An investigation based on response time distributions. Psychonomic Bulletin and Review.

Duncan, M., & Murdock, B. (2000). Recognition and recall with precuing and postcuing. Journal of Memory and Language, 42, 311–301.

Forrin, B., & Cunningham, K. (1973). Recognition time and serial position of probed item in short-term memory. Journal of Experimental Psychology, 99, 272–279.

Geller, A. S., Schleifer, I. K., Sederberg, P. B., Jacobs, J., & Kahana, M. J. (2007). PyEPL: A cross-platform experiment-programming library. Behavior Research Methods, Instruments and Computers, 65, 50–64.

Gillund, G., & Shiffrin, R. M. (1984). A retrieval model for both recognition and recall. Psychological Review, 91, 1–67.

Glanzer, M., & Bowles, N. (1976). Analysis of the word-frequency effect in recognition memory. Journal of Experimental Psychology: Human Learning and Memory, 2, 21–31.

Glanzer, M., & Cunitz, A. R. (1966). Two storage mechanisms in free recall. Journal of Verbal Learning and Verbal Behaviour, 5, 351–360.

Gronlund, S. D., & Elam, L. E. (1994). List-length effect: Recognition accuracy and variance of underlying distributions. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 1335–1369.

Hebb, D. O. (1961). Distinctive features of learning in the higher animal. In J. E. Delafresnaye (Ed.), Brain mechanisms and learning. New York: Oxford University Press.

Hintzman, D. L. (1984). MINERVA 2: A simulation model of human memory. Behavior Research Methods, Instruments and Computers, 16, 96–101.

Howard, M. W., & Kahana, M. J. (1999). Contextual variability and serial position effects in free recall. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25, 923–941.

Howard, M. W., & Kahana, M. J. (2002). A distributed representation of temporal context. Journal of Mathematical Psychology, 46, 268–299.

Humphreys, M. S., Bain, J. D., & Pike, R. (1989). Different ways to cue a coherent memory system: A theory for episodic, semantic, and procedural tasks. Psychological Review, 96, 208–233.

Humphreys, M. S., Bain, J. D., Pike, R., & Tehan, G. (1989). Global matching: A comparison of the SAM, Minerva II, Matrix and TODAM models. Journal of Mathematical Psychology, 33, 36–67.

Izawa, C. (1999). On human memory: Evolution, progress, and reflections on the 30th anniversary of the Atkinson-Shiffrin model. Mahwah, NJ: Lawrence Erlbaum Associates.

James, W. (1890). The principles of psychology. New York, NY: Henry Holt and Co.

Jang, Y., & Huber, D. E. (2008). Context retrieval and context change in free recall: Recalling from long-term memory drives list isolation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34, 112–127.

Kahana, M. J. (1996). Associative retrieval processes in free recall. Memory and Cognition, 24, 103–109.

Kinnell, A., & Dennis, S. (2011). The list length effect in recognition memory: An analysis of potential confounds. Memory and Cognition, 39, 348–363.

McClelland, J. L., & Chappell, M. (1998). Familiarity breeds differentiation: A subjective-likelihood approach to the effects of experience in recognition memory. Psychological Review, 105, 724–760.

McElree, B., & Dosher, B. A. (1989). Serial position and set size in short-term memory: The time course of recognition. Journal of Experimental Psychology: General, 118, 346–373.

Monsell, S. (1978). Recency, immediate recognition memory, and reaction time. Cognitive Psychology, 10, 465–501.

Murdock, B. B. (1962). The serial position effect of free recall. Journal of Experimental Psychology, 64, 482–488.

Murdock, B. B. (1982). A theory for the storage and retrieval of item and associative information. Psychological Review, 89, 609–626.

Murdock, B. B., & Kahana, M. J. (1993). List-strength and list-length effects: Reply to Shiffrin, Ratcliff, Murnane, and Nobel (1993). Journal of Experimental Psychology: Learning, Memory, and Cognition.

Murnane, K., & Shiffrin, R. M. (1991). Interference and the representation of events in memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 855–874.

Neath, I., & Surprenant, A. M. (2003). Human memory: An introduction to research, data, and theory, second edition. Belmont, CA: Wadsworth.

Nosofsky, R. M. (1986). Attention, similarity, and the identification-categorization relationship. Journal of Experimental Psychology: General, 115, 39–57.

Nosofsky, R. M., Little, D. R., Donkin, C., & Fifić, M. (2011). Short-term memory scanning viewed as exemplar-based categorization. Psychological Review, 118, 280–315.

Nosofsky, R. M., & Palmeri, T. J. (1997). An exemplar-based random walk model of speeded classification. Psychological Review, 104, 266–300.

Pike, R. (1984). Comparison of convolution and matrix distributed memory systems for associative recall and recognition. Psychological Review, 91, 281–294.

Postman, L., & Underwood, B. J. (1973). Critical issues in interference theory. Memory and Cognition, 1, 19–40.

Raaijmakers, J. G. W. (1993). The story of the two-store model: Past criticisms, current status, and future directions. In D. E. Meyer & S. Kornblum (Eds.), Attention and performance XIV: Synergies in experimental psychology, artificial intelligence, and cognitive neuroscience. Cambridge, MA: MIT Press.

Raaijmakers, J. G. W., & Shiffrin, R. M. (1980). SAM: A theory of probabilistic search of associative memory. In G. H. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 14). New York: Academic Press.

Raaijmakers, J. G. W., & Shiffrin, R. M. (1981). Search of associative memory. Psychological Review, 88, 93–134.

Ratcliff, R., Clark, S. E., & Shiffrin, R. M. (1990). List-strength effect: I. Data and discussion. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 163–178.

Ratcliff, R., Sheu, C. F., & Gronlund, S. D. (1992). Testing global memory models using ROC curves. Psychological Review, 99, 518–535.

Reed, A. V. (1976). List length and the time course of recognition in immediate memory. Memory and Cognition, 4, 16–30.

Sahakyan, L., & Kelley, C. M. (2002). A contextual change account of the directed forgetting effect. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 1064–1072.

Schulman, A. I. (1974). Memory for words recently classified. Memory and Cognition, 2, 47–52.

Shiffrin, R. M., Ratcliff, R., & Clark, S. E. (1990). List-strength effect: II. Theoretical mechanisms. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 179–195.

Shiffrin, R. M., & Steyvers, M. (1997). A model for recognition memory: REM - retrieving effectively from memory. Psychonomic Bulletin and Review, 4, 145–166.

Sternberg, S. (1966). High-speed scanning in human memory. Science, 153, 652–654.

Sternberg, S. (1969). Memory scanning: Mental processes revealed by reaction-time experiments. American Scientist, 57, 421–457.

Strong, E. K. J. (1912). The effect of length of series upon recognition memory. Psychological Review, 19, 447–462.

Townsend, J. T., & Fifić, M. (2004). Parallel versus serial processing and individual differences in human memory. Perception and Psychophysics, 66, 953–962.

Tulving, E. (1972). Episodic and semantic memory. In E. Tulving & W. Donaldson (Eds.), Organization of memory. New York: Academic Press.

Underwood, B. J. (1978). Recognition memory as a function of the length of study list. Bulletin of the Psychonomic Society, 12, 89–91.

Waugh, N. C., & Norman, D. A. (1965). Primary memory. Psychological Review, 72, 89–104.

Zandt, T. V., & Townsend, J. T. (1993). Self-terminating vs. exhaustive processes in rapid visual, and memory search: An evaluative review. Perception and Psychophysics, 53, 563–580.
