Neurocognitive processes during repeated study and repeated testing

An fMRI experiment on the testing effect

Daniel Lindh

Student Spring 2013 Cognitive Science Program Bachelor’s thesis in Cognitive Science, 15.0 credits Advisor: Peter Vestergren, Ph.D.

I wish to express my sincere gratitude and appreciation to all the members of Umeå Centre for Functional Brain Imaging (UFBI) for welcoming me and always taking the time to assist. I would like to thank Micael, for his help with all the complicated stuff, Linnea, for all her great input, and Johan, Carola and everybody else for their support. Special thanks go to Peter, who has been an excellent supervisor and inspiration, constantly commenting and reflecting on ideas and giving constructive and knowledgeable critique. In addition, I would like to thank Sara, Fredrik, Karolina, Lars and all of the other PhD students for all the interesting discussions.

NEUROCOGNITIVE PROCESSES DURING REPEATED STUDY AND REPEATED TESTING: AN fMRI EXPERIMENT ON THE TESTING EFFECT

Daniel Lindh

Testing facilitates retention more than studying. The current experiment aimed to investigate neural memory-related processes during repeated testing and studying, thereby contributing to more elaborate theories. Five participants (aged 19-31) practiced word-pair associates during fMRI through repeated studying (40 pairs) and testing (40 pairs). One week later they returned for a test, and the outcome was used to calculate subsequent memory differences during fMRI. Findings included a higher subsequent memory difference in the left parietal lobe, precuneus and superior frontal gyrus for the test condition, implying more elaborate semantic processing. In addition, an interaction effect was found in the anterior cingulate cortex, possibly indicating an early beneficial recruitment of memory-enhancing functions in the test condition.

NEUROCOGNITIVE PROCESSES DURING REPEATED STUDY AND REPEATED TESTING: AN fMRI EXPERIMENT ON THE TESTING EFFECT

Testing improves memory retention more than studying. The current experiment aimed to investigate neural memory-related processes during repeated testing and studying, and thereby contribute to more concrete theories. Five participants (aged 19-31) practiced word pairs (Swahili-Swedish) in an fMRI scanner, either through repeated study (40 pairs) or repeated testing (40 pairs). One week later they returned for a final test on all word pairs. The results were used to calculate differences between successful and unsuccessful trials in the fMRI scanner. Larger differences in the left inferior parietal lobe, precuneus and superior frontal gyrus were found for testing compared to studying, which may reflect deeper semantic processing. An interaction effect was found in the anterior cingulate cortex, possibly indicating an early recruitment of beneficial memory functions in the test condition.

At a time when student classes are getting bigger and teachers have less time to spend on each individual, finding effective and reliable methods to assist learning and achievement is becoming increasingly important. Many of the practice techniques that students report relying on are less than effective or could even be considered useless (Dunlosky, Rawson, Marsh, Nathan, & Willingham, 2013). Investigating and understanding the cognitive processes that govern efficient practice techniques is essential in order to develop further practice methods and to elaborate our understanding of human memory. In the classroom, tests are usually carried out to evaluate learning. However, it has long been known that testing leads to better performance on a delayed subsequent test compared to restudying. This effect has been shown to be very robust over a variety of test types and materials (Johnson & Mayer, 2009; Pyc & Rawson, 2009; Tulving, 1967; Wissman, Rawson, & Pyc, 2011) and is known as the testing effect (Roediger & Butler, 2011; Roediger & Karpicke, 2006). The testing effect can typically be defined as a higher number of correct retrievals on a subsequent test for items that have been initially studied and then practiced through testing, compared to items that were studied an equal number of times. This is known as the direct effect of testing and it should be distinguished from the indirect effects.


One example of an indirect effect occurs when testing affects a later study session. That is, attempting to retrieve something may improve the efficiency of a subsequent study session. This effect is also known as test-potentiated learning (Arnold & McDermott, 2012). Even though practice by studying is often more efficient than practice by testing within a short time period, the testing effect is best seen on delayed tests. Compared to tested items, items that have been practiced by studying are more frequently forgotten (Roediger & Karpicke, 2006). In a review of several common practice techniques, which were rated as having low, medium or high utility, Dunlosky et al. (2013) concluded that only testing and distributed practice (i.e., the testing effect and the spacing effect) should be seen as having high utility. Practice techniques such as highlighting, summarization, the use of keywords and rereading (direct repeated study) were all considered to have low utility for practice purposes.

Much of the research conducted on the testing effect has been carried out with behavioral studies focusing on successful retrieval from memory, causing theories to be kept at an abstract level. Transfer appropriate processing (Blaxton, 1989; Graf & Ryan, 1990; Morris, Bransford, & Franks, 1977) has been proposed to explain the testing effect (Roediger & Karpicke, 2006) and suggests that practice tests benefit memory to the degree that the processes necessary to perform well on the final test match the processes engaged during previous learning. In other words, more similarity between the practice rounds and the final test leads to better performance. This theory has, however, received some criticism (Carpenter & DeLosh, 2006; Carpenter, Pashler, & Vul, 2006). Carpenter and DeLosh (2006) found evidence for elaborate retrieval processing in regard to the testing effect,

Figure 1. Taxonomy of the mammalian long-term memory system and assumed relevant brain structures. Adapted from “Memory systems of the brain: A brief history and current perspective.” by Squire, L. R., 2004, Neurobiology of learning and memory, 82 (3), p. 171-7. Copyright 2004 by Elsevier. Reprinted with permission.

which would mean that retrieval processes on items that are not readily accessible generate more extensive processing (Glover, 1989). Accordingly, a more difficult test would yield better performance on a subsequent test, provided that the memory retrieval was successful (Gardiner, Craik, & Bleasdale, 1973; Halamish & Bjork, 2011). Pyc and Rawson (2010) proposed a mediator effectiveness hypothesis, stating that testing improves learning by generating more effective mediators. These mediators could be seen as a link between the cue and the target, such as a word or a phrase. Some evidence for this view has been presented (Carpenter, 2011). These theories leave a few things to be desired, and none of them clearly explains the different memory processes throughout repeated testing and repeated studying. Furthermore, predicting neural activity based on such abstract theories can be very hard. It is possible that functional mapping of relevant brain areas could lead to more comprehensive theories. However, there is a lack of brain imaging studies directly comparing repeated testing and repeated studying, and consequently the neurocognitive mechanisms underlying the testing effect are still unclear.

Human memory enables us to utilize past experiences in order to plan for the future, model the external world and learn how to ride a bike. There are two main ways of classifying memory. The first concerns the duration of memory and distinguishes working memory from long-term memory. Working memory subsumes the term short-term memory and was proposed and further developed by Baddeley (1992). Working memory temporarily stores and manipulates information necessary for complex cognitive tasks. Long-term memory, on the other hand, is our ability to sustain information over long periods of time (Purves et al., 2008). The second classification, declarative versus nondeclarative, concerns the quality or type of long-term memory. Nondeclarative memories comprise procedural (skills/habits) and priming memories, memories you cannot ‘put into words’, and are expressed mainly through action. Declarative memories are those memories that you can state, i.e. facts or episodes (Squire, 2004). In particular, episodic memories allow us to mentally time travel and are central in perceiving and understanding surrounding events as coherent, whereas facts, or semantic memories, allow us to remember the capital of France or what year we were born. The theoretical notion that long-term memory resides within several faculties in the brain is not a new idea (Squire, 2004). Studies of patients with lesions yielding amnesia have provided some insight into which brain structures are relevant for long-term memory retention (Robbins, 1996; Squire & Wixted, 2011). The hippocampus, situated in the medial temporal lobe, is thought to be indispensable for declarative memories, in interaction with neocortical and diencephalic areas. Conversely, nondeclarative memories are assumed to depend upon other structures, such as the cerebellum and amygdala (Gabrieli, 2004; Squire, 2004). For further details, see Figure 1.

The process of memory is generally divided into encoding, storage and retrieval. Encoding allows information from the outside world to be represented by a network of neural activations (memory traces or “engrams”) and is typically considered to be engaged during studying. Storage is the persistence of memory


traces over time. Finally, retrieval is the recovery of previously stored memories, which links directly to testing (Purves et al., 2008). This division may not be optimal, as retrieval also includes some mental reexposure to the material which could lead to incidental encoding, an effect observed especially regarding the hippocampus (Stark & Okado, 2003), but it is nonetheless the most common way for researchers to linearly divide the progression of memory. There is a wealth of reviews and meta-analyses of brain imaging studies separately for encoding (e.g., Wagner, Koutstaal, & Schacter, 1999), retrieval (e.g., Cabeza, 2008; Hasegawa, 2000), and both (Spaniol et al., 2009). Nevertheless, only a few studies that directly target the testing effect exist. Eriksson, Kalpouzos and Nyberg (2011) scanned participants during a final test and showed increased activation in the anterior cingulate cortex (ACC) as a function of the number of successful memory retrievals. Furthermore, Hashimoto, Usui, Taira and Kojima (2011) found greater activity in the right dorsolateral prefrontal, bilateral inferior parietal, and precuneus regions, whereas the left ventrolateral prefrontal cortex (VLPFC) showed reduced activity during a second compared to a first test phase. Both of these studies scanned participants after the actual test practice and are thus unable to provide a clear picture of possible differences during repeated testing and repeated study. The present study addressed this issue by comparing neural activity during repeated testing and repeated studying of Swahili-Swedish word pair associates. Recently, Van den Broek, Takashima, Segers, Fernández, and Verhoeven (2013) conducted a similar study in which subjects practiced the word pairs outside the scanner until a certain success rate was reached and were then either retested on or restudied the word pairs in the scanner.
Findings included increased activation in VLPFC, ventral striatum and left supplementary motor area for tested compared to studied items. In addition, retrieval success but not encoding success was accompanied by activation in the inferior parietal lobe and the medial temporal gyrus. These regions are thought to mediate semantic storage and retrieval (Lau, Phillips, & Poeppel, 2008), and the authors argued that this implies that semantic processing is more important for successful memory performance when practicing with tests than with restudy. Similarly, Wing, Marsh and Cabeza (2013), using weakly associated English word pairs, found evidence that testing enhanced memory performance through increased functional hippocampal connectivity with left VLPFC, medial prefrontal cortex and posterior cingulate cortex, which could possibly reflect more distinct and controlled semantic processing. Taken together, these findings indicate that the testing effect is mediated by elaborate retrieval processing, even though such inferences are problematic. It is worth noting that many reviews have emphasized the inferior frontal gyrus as being important for semantic processing (Friederici, 2002; Kim, 2011; Lau et al., 2008). In this thesis, findings around this area will be referred to as VLPFC for ease of discussion.

The finding that, with time, more words in restudy conditions are forgotten compared to those in test conditions (Roediger & Karpicke, 2006) stresses the importance of what is known as consolidation and reconsolidation. Consolidation can occur on both a cellular level (synaptic consolidation) and on a system level. Synaptic consolidation takes place when a neural circuit, representing the memory,

is strengthened as a result of gene expression modulation, posttranslational modifications and altered synaptic efficacy (Dudai, 2012). System consolidation refers to processes in which either short-term memory is gradually reorganized into distributed long-term memory (i.e., consolidation) or existing memories are reactivated and strengthened (i.e., reconsolidation; Dudai, 2012). In other words, system consolidation occurs when a particular brain region that earlier was necessary for post-encoding memory performance is no longer required, or when it can be demonstrated that another brain region is now needed to preserve the memory (Purves et al., 2008). The standard consolidation theory posits that the hippocampus is just a temporary repository for memories, which later are moved to the neocortex for consolidation (McClelland, McNaughton, & O’Reilly, 1995). Findings from amnesic patients with medial temporal lobe lesions, where the only preserved autobiographical memories are several decades old, challenge this model in the sense that the consolidation process, in that case, would be unreasonably slow (for further detail, see Dudai, 2012; Nadel & Moscovitch, 1997). Furthermore, it has been shown in rat studies that consolidation can occur fast even in neocortical areas (Tse et al., 2011). Nadel and Moscovitch (1997) instead suggested a nine-step multiple trace theory in which the hippocampus rapidly encodes episodic memories sparsely within an ensemble of hippocampal neurons. This ensemble works as an index to neocortical neurons that represent the information. Continued activation of these memories would result in additional memory traces, making the memory less susceptible to damage, although never completely independent of the MTL.
Arguments against this notion have been made, saying that complete retrograde amnesia of remote memories only happens when lesions reach outside the hippocampus, while patients with well-characterized MTL lesions still have intact remote autobiographical memories (Squire & Bayley, 2007). While both of these theories of consolidation suffer from obvious issues, they form a basic theoretical foundation from which it is possible to draw more far-reaching conclusions based on empirical findings, even though there is currently no definite signature of what consolidation might look like in the brain (Dudai, 2012).

In the current experiment the whole brain will be considered. However, there are some brain areas of particular interest that, based on earlier research on encoding and retrieval, have a high probability of being implicated. The areas primarily associated with encoding success are the medial temporal lobes (MTL), left VLPFC, bilateral premotor cortex, and posterior parietal cortex (Kim, 2011). Retrieval success is associated with higher activations in MTL, left VLPFC, ventral parietal cortex, and precuneus/posterior midline regions (Spaniol et al., 2009). Thus, it appears that both encoding and retrieval success are associated with MTL and left VLPFC. Surprisingly, activation in posterior midline regions, while positively related to retrieval success, is negatively correlated with encoding success. This relationship of opposites has been dubbed the encoding/retrieval flip (Huijbers et al., 2012). Impaired ability to modulate this encoding/retrieval flip is associated with decreased memory performance (Vannini et al., 2012). Moreover, MTL, particularly the hippocampus, is suggested to play a predominant role in coding associations between unrelated information into coherent representations at encoding (Eichenbaum, 2004). The control processes associated with left VLPFC


during retrieval have been proposed to reflect the selection processes initiated when several competing memories are activated (Blumenfeld & Ranganath, 2007). The hippocampus is also associated with discriminating among similar episodic memories in retrieval (Yassa & Stark, 2011). Studies of the testing effect similar to the current experiment have also found higher activation in ventral striatum, left supplementary motor area, ventral parietal cortex (Van den Broek et al., 2013), insula and ventral lateral PFC (Wing et al., 2013) for tested compared to restudied items. These activations may not all reflect consolidation processes per se, but rather different processes leading to consolidation.

Despite a large number of behavioral findings, it is unclear why repeated testing is more effective than repeated studying for memory retention. The purpose of the current study was to investigate which aspects of neural activation (leading to possible consolidation) are more important for successful memory retention during repeated testing compared to repeated study. Furthermore, prior theoretical and functional understanding of relevant brain areas was used to explore possible underlying mechanisms. To this end, successful and unsuccessful trials were contrasted to obtain a subsequent memory effect (SME) for test and restudy conditions individually. Furthermore, subsequent memory differences (SMD) were calculated by comparing neural activation for subsequently recalled and not recalled items. This allowed for analyses of which areas were more important for successful memory retention during testing compared to studying. It is possible that a difference in MTL and left VLPFC exists between testing and study. However, since both encoding and retrieval depend on these areas, finding subtle differences with fMRI may be a methodological challenge. Neural differences are more likely to be found elsewhere. One such area, in regard to the encoding/retrieval flip, is the precuneus, which was hypothesized to be significantly more involved in retrieval success than in encoding success. Another prediction was that repetition would generally lead to suppressed neural activity (Grill-Spector, Henson, & Martin, 2006). More specifically, it has been shown that repeated recall leads to lower activation in frontoparietal regions (Kuhl, Dudukovic, Kahn, & Wagner, 2007), which may reflect that the task becomes easier with practice and thus requires fewer executive resources. Also, results from Xue et al. (2010) provided evidence that successful repeated encoding appears to reactivate existing memory traces more precisely, which could be interpreted as an attenuation of irrelevant activations.
The testing effect is one of the most robust findings in the memory literature today (Roediger & Karpicke, 2006), and understanding how it works might generate new methods, which could have great impact on how education is pursued. In general, brain imaging studies on memory add to the vast existing knowledge of memory at the level of basic research, helping to functionally map brain areas to relevant mechanisms and processes.


Method

The current experiment consisted of two main parts - (1) a behavioral pilot experiment, and (2) an fMRI-experiment. The behavioral pilot experiment was conducted in order to evaluate the levels of successful recall during testing. The procedure and tasks were identical to the ones later performed inside the scanner, except that the participants were seated in front of a computer and instructed to respond by typing into a textbox. This provided information regarding recall performance at three test occasions for words that were successfully recalled on a final test, which could not be done in the scanner where participants performed covert recall. The fMRI experiment had a factorial within-groups design with a categorical event-related protocol. The critical part of the analysis was to compare

Figure 2. Schematic representation of the experimental design. Ten word pairs per block during fMRI. Each block started with initial study (IS), followed by either restudy three times (S1-S2-S3) or test three times (T1-T2-T3). Stimuli were presented for 4 seconds. The participants were shown either both the Swahili and Swedish word and asked to learn the word pair (restudy condition), or only the Swahili word with the first letter of the Swedish word, with boxes indicating the missing letters, and asked to recall the target word (test condition). Between all trials an interstimulus fixation cross was displayed for 2.1 to 3.9 seconds (randomized).


SMD in neural activity between the test and restudy conditions. The experiment consisted of two sessions. First, an fMRI-session directly followed by a confirmatory test at Day 1, and second, one week later participants engaged in a delayed test to examine the performance differences between word pairs that had been tested and restudied (the basis for classification of item-events in the scanner).

Participants

For the pilot, participants (n = 10, age: 20-29 years, 6 males) were recruited via requests. They were healthy with normal or corrected-to-normal vision. Five of the participants returned one week later to do a delayed subsequent test. In the fMRI experiment, participants (n = 5, age: 19-31 years, 3 males) were recruited through advertisements and requests. They were neurologically healthy, right-handed, with normal or corrected-to-normal vision and with no previous experience of Swahili. All participants gave written informed consent.

Material

Stimulus presentation and behavioral data collection were performed with a PC running E-Prime 2.0 (Psychology Software Tools, Inc., USA). The stimuli consisted of 80 Swahili-Swedish word pairs (Vestergren & Nyberg, 2013) randomized to 8 lists with 10 word pairs per list. Half of the word pairs were practiced by study and the other half by testing, resulting in 40 word pairs in each condition. Additionally, 240 Swedish words were used in a multiple-choice recognition task on Day 7. These words all started with the same letter and had the same number of letters as the target word. They were all nouns in basic form.

Neural activity was measured with functional magnetic resonance imaging (fMRI), which detects blood flow changes in the brain by means of blood-oxygen-level-dependent (BOLD) contrasts. A strong magnetic field is applied over the participant, which allows the hemodynamic response (HDR), i.e. blood flow changes, to be measured. The HDR is then used to estimate the energy use of neurons, which is assumed to reflect neural activity. The fMRI data were gathered with a 3T General Electric Discovery MR 750 scanner. Ten dummy scans were run at the beginning of the session to eliminate signals due to progressive saturation. T2*-weighted images were collected with a single-shot gradient-echo EPI sequence for BOLD imaging with the following parameters: TE: 30 ms, TR: 2000 ms (37 slices acquired), flip angle: 80°, FOV: 25 × 25 cm, 96 × 96 matrix, 3.4 mm slice thickness with 0.5 mm gaps. A 32-channel head coil was used. High-resolution T1-weighted 3D structural images were acquired immediately after the fMRI session.

The images were slice-time corrected for acquisition order to the first slice in the sequence, realigned and unwarped, normalized to a common MNI (Montreal Neurological Institute) template and resliced to 2 × 2 × 2 mm voxels. In order to increase the signal-to-noise ratio, the images were spatially smoothed with an 8 mm Gaussian filter. Low-frequency noise was eliminated with a 128 s high-pass filter. The stimuli were back-projected onto a mirror attached to the head coil, and foam pads were used to restrict head movement. Scanner noise was reduced with earplugs and protective headphones.
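The list construction described under Material (80 word pairs randomized to 8 lists of 10, half of the lists practiced by testing and half by study) can be illustrated with a minimal Python sketch. This is not the E-Prime implementation used in the experiment; the function name, the placeholder word pairs and the strict alternation of conditions across lists are assumptions made for illustration only.

```python
import random

def build_lists(pairs, n_lists=8, seed=None):
    """Shuffle word pairs and split them into equally sized lists.

    Returns a list of (condition, items) tuples.
    """
    if len(pairs) % n_lists != 0:
        raise ValueError("pairs must divide evenly into n_lists")
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    size = len(shuffled) // n_lists
    lists = [shuffled[i * size:(i + 1) * size] for i in range(n_lists)]
    # Assumption: conditions strictly alternate so that test and restudy
    # lists are interleaved; the source only states that blocks were interleaved.
    conditions = ["test" if i % 2 == 0 else "restudy" for i in range(n_lists)]
    return list(zip(conditions, lists))

# Placeholder items; the actual Swahili-Swedish pairs are not reproduced here.
pairs = [(f"swahili_{i}", f"swedish_{i}") for i in range(80)]
blocks = build_lists(pairs, seed=1)
```

With 80 pairs and 8 lists, each condition receives 4 lists and therefore 40 word pairs, matching the counts given above.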


Scanning procedure

When participants arrived they first ran a few practice rounds of the experiment on a computer outside the scanner. In the scanner, one block consisted of one list that was initially studied (IS) and then either tested (TTT) or studied (SSS) for three rounds (figure 4). Before each block an instruction screen appeared for 5 seconds. In all conditions, the stimuli were presented for 4 seconds at the center of the screen. An interstimulus fixation cross was displayed after each trial for 2.1 to 3.9 seconds, resulting in a random jittering of stimulus onset times. In the testing condition the participants were cued with the Swahili word and the first letter of the Swedish word followed by empty boxes indicating the missing letters. The participants were instructed to recall the Swedish word covertly. No feedback was given. Word pairs in the word lists were randomized before each experimental session, and testing and restudy blocks were interleaved. Participants were encouraged to memorize the words through either visual imagery or by making semantic associations. Between blocks, a distractor task was presented that required participants to determine whether a combination of letters ('sn') was present in a nonsense word (e.g., 'Danka'). The nonsense word was presented for 4 seconds, with a cross displayed for 2.2 to 3.8 seconds after each trial. The functional scanning session took approximately 45 minutes, followed by 20 minutes of structural imaging.
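The timing of one block can be sketched as follows, using only the values stated in the procedure (5 s instruction screen, 4 s stimulus, 2.1 to 3.9 s jittered interstimulus interval, one initial study round followed by three practice rounds). The function and field names are hypothetical, the uniform distribution of the jitter and the reshuffling of item order in every round are assumptions, and the sketch does not reproduce the E-Prime script.

```python
import random

STIM_S = 4.0              # stimulus duration in seconds
INSTRUCTION_S = 5.0       # instruction screen before each block
ISI_RANGE_S = (2.1, 3.9)  # jittered interstimulus interval

def block_schedule(condition, n_items=10, start=0.0, rng=None):
    """Return (events, end_time) for one block of the given condition."""
    rng = rng or random.Random()
    prefix = "T" if condition == "test" else "S"
    rounds = ["IS"] + [f"{prefix}{k}" for k in (1, 2, 3)]
    t = start + INSTRUCTION_S
    events = []
    for phase in rounds:
        order = list(range(n_items))
        rng.shuffle(order)  # assumption: item order is reshuffled each round
        for item in order:
            events.append({"phase": phase, "item": item, "onset": t})
            # Assumption: jitter is drawn uniformly from the stated range.
            t += STIM_S + rng.uniform(*ISI_RANGE_S)
    return events, t

events, end_time = block_schedule("test", rng=random.Random(0))
```

Under these assumptions, consecutive stimulus onsets within a block are separated by 6.1 to 7.9 seconds, which is the jitter that decorrelates trial onsets from the 2 s TR.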

Post-scan procedure

Directly after the scanning session the participants took a confirmatory final test. For each word pair, the Swahili word was presented as a cue (left side of the screen) along with the first letter of the Swedish word and empty boxes representing the missing letters (right side of the screen). The participants were asked to type the complete word into a text box. After 8 seconds the program registered the response in the text box and moved on to the next trial. One week after the fMRI session, participants returned to do a delayed follow-up test on all of the word pairs (no fMRI). On a computer, the Swahili word was shown together with the first letter of the Swedish word and boxes indicating the missing letters. The participants were asked to type the whole word into a text box. After 10 seconds the program registered the answer typed into the text box. Regardless of whether the answer was right or wrong, the participants then did a multiple-choice recognition test for each word pair with 3 additional alternatives. Participants answered by typing a number between 1 and 4 corresponding to one of the alternatives. After 7 seconds the program registered the response in the number box and moved on to the next trial.

Behavioral data analyses

For both the pilot and the fMRI behavioral data, responses on the final test and the delayed follow-up test were categorized as correct or incorrect. All written answers were manually inspected for spelling errors and the categorization adjusted accordingly. Behavioral data were analyzed with SPSS. In addition to descriptive data, paired t-tests were performed on the proportions of subsequently recalled items for the two conditions. Further analyses were performed on the proportion of items retained from Day 1 to Day 7.
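The two behavioral measures above can be made concrete with a small Python sketch: the retention proportion (items recalled on Day 1 that were also recalled on Day 7) and the paired t statistic on per-participant proportions. The analyses in the thesis were run in SPSS; the function names and the example values below are hypothetical, and the sketch returns only t and the degrees of freedom, not the p-value.

```python
import math
from statistics import mean, stdev

def retention(day1_correct, day7_correct):
    """Proportion of items recalled on Day 1 that were also recalled on Day 7."""
    day1 = set(day1_correct)
    if not day1:
        raise ValueError("no items recalled on Day 1")
    return len(day1 & set(day7_correct)) / len(day1)

def paired_t(a, b):
    """Paired t statistic for a - b, with n - 1 degrees of freedom."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with n >= 2")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical per-participant retention proportions (not the thesis data).
test_ret = [0.80, 0.70, 0.60, 0.75, 0.65]
restudy_ret = [0.70, 0.55, 0.50, 0.70, 0.55]
t_value, df = paired_t(test_ret, restudy_ret)
```

Note that the sign of t depends on the order of subtraction, so a positive or negative value carries the same information as long as the order is reported.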


fMRI analyses

In-house software (DataZ) was used for batching and visualization of statistical maps and BOLD signal changes, along with SPM8 (Wellcome Trust Centre for Neuroimaging, London, UK) implemented in Matlab 7.12 (Mathworks, Inc., MA, USA). For each participant, results from the subsequent memory test on Day 7 were used to categorize item-events into successfully recalled (R) and not successfully recalled (N) for the sessions Distractor, IS, S1, S2, S3, T1, T2, T3 (see Figure 2). Task-related BOLD activity was modeled as epochs with a duration of 4 seconds, convolved with a canonical hemodynamic response function. The resulting 15 regressors (ISR, ISN, S1R, S1N, etc.) were included in first-level general linear models along with six motion parameters. For the second-level analyses, one-sided t-tests were first used to contrast subsequently recalled and not recalled item-events, resulting in subsequent memory effects (SME) separately for the test and restudy conditions. Second, the beta values for item-events categorized as non-successful trials were subtracted from the beta values for successful trials (subsequently recalled minus subsequently not recalled) to obtain subsequent memory differences (SMD). The beta value is the coefficient associated with one specific regressor (e.g. S1R), and it quantifies that regressor’s contribution to explaining a particular voxel’s observed time course. The second-level group analysis of differential SMD was implemented as a 2 × 3 ANOVA with Condition (Test vs. Restudy) and Repetition (1, 2, 3) as within-subjects factors. Main effects and the interaction were modeled in the SPM flexible factorial module. The reported regional effects survived statistical tests with a p-value of 0.005 uncorrected and a cluster-extent threshold of k > 10 unless otherwise noted.
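The SMD computation can be illustrated for a single voxel with a short Python sketch. It is not the SPM8/DataZ pipeline. The helper names are hypothetical, and the way the 15 regressors are enumerated here (seven item sessions split by R and N, plus one distractor regressor) is an assumption that is merely consistent with the count given above.

```python
ITEM_SESSIONS = ["IS", "S1", "S2", "S3", "T1", "T2", "T3"]

def regressor_names():
    """Enumerate first-level regressors: each item session split by outcome."""
    names = [f"{s}{outcome}" for s in ITEM_SESSIONS for outcome in ("R", "N")]
    # Assumption: the distractor task is modeled as a single regressor.
    return names + ["Distractor"]

def smd(betas):
    """Subsequent memory difference (beta_R - beta_N) per condition and repetition.

    `betas` maps regressor name to the beta value of one voxel.
    Returns the six cells of the 2 x 3 (Condition x Repetition) design.
    """
    cells = {}
    for condition, prefix in (("test", "T"), ("restudy", "S")):
        for rep in (1, 2, 3):
            recalled = betas[f"{prefix}{rep}R"]
            not_recalled = betas[f"{prefix}{rep}N"]
            cells[(condition, rep)] = recalled - not_recalled
    return cells

# Hypothetical beta values for one voxel (not data from the experiment).
example = {name: 0.0 for name in regressor_names()}
example["T1R"], example["T1N"] = 1.5, 0.5
cells = smd(example)
```

Each participant contributes one such set of six SMD values per voxel, and these values are what enter the Condition × Repetition ANOVA at the group level.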

Ethical considerations

The project was approved by the local Ethics Committee. Participation in the study was completely voluntary, and subjects could withdraw from the experiment at any time. Participants in the pilot were paid 200 SEK for their contribution regardless of whether they returned for the one-week follow-up test. Participants in the fMRI experiment were all paid 450 SEK.

Results

Pilot behavioral results

The proportions of recalled items on Day 1 for the test (M = .58, SD = .13) and the restudy (M = .51, SD = .17) conditions were not significantly different. Nor was a significant difference found for the same proportions on Day 7 (Mtesting = .39, SDtesting = .12; Mrestudy = .31, SDrestudy = .17). However, a significantly higher proportion of the word pairs recalled on Day 1 were also recalled on Day 7 (retained) in the testing condition (M = .70, SD = .12) compared to the restudy condition (M = .60, SD = .11), t(4) = -3.86, p = .018. If the participants answered correctly on the Day 1 final test, the same word pair had been successfully recalled with high probability (M = 0.90, SD = 0.05) in all three of the test practice trials (T1, T2, T3, see Figure 4).


fMRI behavioral results

As in the pilot, there was no significant difference in the proportions of recalled words at Day 1 (Mtesting = .40, SDtesting = .09; Mrestudy = .41, SDrestudy = .16; p = .95, t = .063) or at Day 7 (Mtesting = .31, SDtesting = .13; Mrestudy = .24, SDrestudy = .15; p = .54, t = -.67). In contrast to the pilot, no significant difference was found between the test (M = .74, SD = .17) and the restudy (M = .57, SD = .24) conditions in the proportion of items recalled on Day 1 that were also recalled on Day 7 (p = .31, t(4) = 1.15). The absence of a testing effect was probably due to the low number of participants, although both the behavioral pilot and the fMRI behavioral results showed clear tendencies in favor of the test condition.

fMRI results

First, t-contrasts comparing neural activity during fMRI for items that were subsequently recalled versus not recalled on Day 7 were performed to obtain separate subsequent memory effects (SME) for the test and restudy conditions. As shown in Table 1, the test condition showed higher activation in the bilateral supramarginal gyrus, the post- and precentral gyri, and the left middle/inferior temporal and occipital gyri. Higher activation was also observed bilaterally in the insula, putamen and globus pallidus. For restudy, higher activation was observed in the left occipital area, cerebellum, precentral gyrus, and bilateral anterior cingulate cortex.

Figure 3. Anatomical regions showing higher SMD for Test (red) and Restudy (blue) conditions. Left inferior parietal lobe (A), Anterior Cingulate Cortex (B), precuneus (C) and superior frontal gyrus (D).

Table 1. Anatomical regions showing differences during fMRI between subsequently recalled (R) and not recalled (NR) items at Day 7. Significant at p = 0.005 uncorrected. All three repetitions are collapsed for both the test and restudy conditions. Maximum of 10 clusters reported, with a maximum of 5 local maxima. H, hemisphere; BA, Brodmann area; Size, cluster size (number of voxels); Ze, Z equivalent value for each local maximum. Coordinates are in MNI space.

| Region | H | BA | x | y | z | Size | Ze |
Test R > NR
| Supramarginal Gyrus | L | 48 | -58 | -26 | 28 | 372 | 4.9 |
| Postcentral | L | 48 | -50 | -16 | 28 |  | 2.9 |
|  | L | 48 | -50 | -14 | 18 |  | 2.9 |
| Precentral | L | 4 | -52 | -4 | 30 |  | 2.8 |
| Middle Temporal Gyrus | L | 20 | -48 | 2 | -30 | 199 | 4.6 |
| Inferior Temporal Gyrus | L |  | -40 | 12 | -36 |  | 4 |
| Middle Temporal Gyrus | L | 21 | -54 | -42 | 0 | 1619 | 4.6 |
|  | L | 39 | -44 | -60 | 20 |  | 4 |
| Middle Occipital Gyrus | L | 39 | -44 | -80 | 18 |  | 3.9 |
| Middle Temporal Gyrus | L | 20 | -54 | -20 | -14 |  | 3.8 |
|  | L | 21 | -46 | -52 | 6 |  | 3.6 |
| Superior Temporal Gyrus | R | 42 | 64 | -38 | 16 | 1301 | 4.5 |
| Supramarginal Gyrus | R | 48 | 58 | -26 | 32 |  | 4.5 |
|  | R | 48 | 56 | -42 | 28 |  | 4.5 |
| Supramarginal Gyrus | R | 41 | 48 | -44 | 22 |  | 4.4 |
| Middle Temporal Gyrus | R | 21 | 56 | -36 | -2 |  | 3.9 |
| Insula | R | 48 | 36 | 2 | 4 | 669 | 4.6 |
| Globus Pallidus | R |  | 14 | -2 | -2 |  | 3.6 |
| Putamen | R | 34 | 26 | 2 | 12 |  | 3.6 |
|  | R | 34 | 30 | -2 | -8 |  | 3.5 |
| Superior/Medial Frontal Gyrus | L | 10 | -10 | 56 | 20 | 362 | 4.5 |
| Inferior Frontal Gyrus | R | 45 | 54 | 26 | 12 | 50 | 4.5 |
| Superior Frontal Gyrus | R | 6 | 28 | -2 | 62 | 229 | 4.4 |
|  | R | 6 | 24 | 8 | 62 |  | 3.5 |
|  | R | 6 | 22 | 2 | 70 |  | 3.4 |
| Insula | L | 48 | -32 | -10 | 16 | 103 | 4.4 |
| Putamen | L | 48 | -32 | 0 | 0 |  | 3.5 |
| Inferior Orbitofrontal Cortex | L | 47 | -30 | 28 | -16 | 13 | 4.3 |
| Middle Frontal Gyrus | R | 6 | 44 | -2 | 60 | 302 | 4.2 |
Restudy R > NR
| Medial Occipital | L |  | -30 | -94 | 22 | 11 | 3.8 |
| Occipital Area | L | 17 | 22 | -102 | -6 | 50 | 3.6 |
| Cerebellum | L |  | -6 | -42 | -38 | 26 | 3.6 |
| Anterior Cingulate Cortex | R |  | 2 | 24 | 16 | 17 | 3.6 |
| Precentral | L | 6 | -46 | -16 | 62 | 36 | 3.5 |
| Cerebellum | R |  | -2 | -76 | -42 | 21 | 3.3 |
| Occipital Area | L | 17 | -20 | -104 | -2 | 111 | 3.3 |
| Inferior Occipital | L |  | -30 | -98 | -6 |  | 3.3 |
| Calcarine | R | 17 | 20 | -56 | 20 | 126 | 3.2 |
| Anterior Cingulate Cortex | L | 32 | -12 | 42 | 8 | 16 | 2.9 |

Table 2. Anatomical regions showing a main effect of Condition for SMD, significant at p = 0.005 uncorrected. BOLD bars were used to determine test and restudy differences. H, hemisphere; BA, Brodmann area; Size, cluster size (number of voxels); Ze, Z equivalent value for each local maximum. Coordinates are in MNI space.

| Region | H | BA | x | y | z | Size | Ze |
Test > Restudy
| Inferior Parietal Lobe | L | 40 | -52 | -56 | 46 | 1735 | 4.8 |
| Angular Gyrus | L | - | -54 | -62 | 38 |  | 4.5 |
| Middle/Superior Frontal Gyrus | L | 8 | -10 | 28 | 56 | 427 | 3.8 |
| Superior Frontal Gyrus | L | 8 | -16 | 20 | 66 |  | 3.4 |
| Middle Frontal Gyrus | L | 8 | -26 | 26 | 54 |  | 3.0 |
| Inferior Parietal Lobe | R | 40 | 54 | -56 | 48 | 293 | 3.7 |
| Precuneus | L | - | -6 | -58 | 50 | 410 | 3.6 |
|  | L | - | -4 | -60 | 40 |  | 3.2 |
|  | L | - | -12 | -54 | 34 |  | 3.0 |
|  | L | - | -2 | -46 | 42 |  | 2.9 |
| Superior Frontal Gyrus | L | 32 | -14 | 54 | 24 | 285 | 3.4 |
|  | L | 9 | -14 | 58 | 34 |  | 3.1 |
|  | L | 46 | -24 | 52 | 26 |  | 2.9 |
| Superior Frontal Gyrus | R | 9 | 16 | 44 | 40 | 148 | 3.4 |
|  | R | 10 | 16 | 52 | 12 | 57 | 3.4 |
| Posterior Cingulate Gyrus | L | - | -2 | -34 | 44 | 45 | 3.2 |
| Superior Frontal Gyrus | L | 9 | -22 | 42 | 44 | 37 | 3.0 |
| Anterior Cingulate Gyrus | L | 32 | -8 | 38 | 12 | 40 | 2.9 |
| Middle Occipital Gyrus | R | 19 | 42 | -78 | 38 | 55 | 2.9 |
| Angular Gyrus | R | 39 | 50 | -74 | 32 |  | 2.9 |
Restudy > Test
| Medial Occipital Gyrus | L | 18 | -40 | -90 | 0 | 391 | 3.1 |
|  | L | 39 | -38 | -70 | 24 |  | 3.2 |
| Cuneus | R | 18 | 22 | -68 | 28 | 285 | 3.5 |
| Precentral Gyrus | L | 6 | -46 | 2 | 30 | 140 | 3.5 |
| Fusiform Gyrus | L | 37 | -38 | -60 | -10 | 168 | 3.5 |
| Lingual Gyrus | L | 19 | -20 | -50 | -4 | 164 | 3.2 |
|  | L | 18 | -14 | -62 | 0 |  | 2.9 |
| Inferior Occipital Gyrus | R | 19 | 36 | -86 | -6 | 166 | 3.1 |
|  | R | 37 | 42 | -64 | -10 | 113 | 3.1 |

Table 3. Main effect of Repetition for SMD, significant at p = 0.005 uncorrected. All reported regions were associated with higher activation as a function of repetitions (independent of Condition). H, hemisphere; BA, Brodmann area; Size, cluster size (number of voxels); Ze, Z equivalent value for each local maximum. Coordinates are in MNI space.

| Region | H | BA | x | y | z | Size | Ze |
| Caudate (going into the ventricles) | L |  | -12 | -10 | 26 | 41 | 3.1 |
| Putamen | L | 48 | -28 | -12 | 16 | 34 | 3.0 |
| Lingual | R | 37 | 24 | -44 | -4 | 16 | 2.9 |
| Dorsal Anterior Cingulate Gyrus | L | 32 | -16 | 40 | 4 | 10 | 2.8 |

The main effect of Condition (Table 2) showed higher SMD for the test condition than for the restudy condition in the left inferior parietal lobe, superior and medial frontal gyrus, posterior and anterior cingulate gyrus, and precuneus. Conversely, SMD was stronger for restudy than for testing in the medial occipital gyrus, precentral gyrus and lingual gyrus. See Figure 3 for more detail. The main effect of Repetition (Table 3) revealed clusters in the caudate, putamen (BA 48), lingual gyrus and dorsal ACC. Inspection of the BOLD bars for each region individually showed that the SMD increased with each repetition in all of these regions. An interaction effect between Condition and Repetition (Figure 4) was found around the left anterior cingulate cortex and superior frontal gyrus (local maximum: -16, 48, 6). Over repetitions, the SMD in this region increased for restudied items and decreased for tested items (p = 0.01, uncorrected). Figure 3 shows the BOLD bars for the conditions over time together with the corresponding brain region image. Note that, because of the low number of participants, a liberal significance threshold was chosen for the interaction, so the data should be interpreted with caution.

Discussion

The current study investigated neural activations for word-pair associates (Swahili–Swedish) that were practiced through either repeated testing or repeated studying. The purpose was to examine brain areas that differ in importance for testing and restudy. Test performance on Day 7 was used in the analyses to back-sort the items practiced in the scanner, in a so-called subsequent memory paradigm. In the first step, this was used to contrast subsequently recalled and not recalled item-events in order to obtain a subsequent memory effect (SME) separately for the test and restudy conditions. In the second step, the differences in beta values between subsequently recalled and not recalled item-events were calculated to obtain subsequent memory differences (SMD). The current experiment adopted a novel approach to examining the testing effect by investigating differences in SMD. The ANOVA provided a method for isolating areas where testing and studying showed differences in SMD. By inspecting the BOLD bars for each significant area, it was possible to determine for which condition (testing or restudying) that particular area was more important. In other words, for areas where SMD was higher for testing than for restudy, one could conclude that these regions are more important for successful memory retention during testing than during restudying, and vice versa.

Possible weaknesses and reliability issues in the study concern the number of participants. Because of this low number, a liberal significance threshold was adopted (p = 0.005 uncorrected for all analyses except the ANOVA interaction, for which p = 0.01 uncorrected was used). This requires some caution in interpreting the results. Another issue concerns the covert answers during the fMRI session. It is possible that covert answers are less effective than overt responses with regard to the testing effect. However, covert answering was arguably necessary in order to keep the study and test conditions as similar as possible. Furthermore, covert and overt answers have been shown to produce similar activations in a word- and sentence-generation task (Partovi et al., 2012).
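The SMD measure described above, and the two descriptive contrasts it feeds into, can be sketched as follows. This is an illustrative per-participant calculation under stated assumptions, not the voxel-wise ANOVA itself: the beta values, the variable names, and the use of a first-to-last-repetition difference as the interaction contrast are all hypothetical choices made for the example.

```python
from statistics import mean


def smd(beta_recalled, beta_not_recalled):
    """Subsequent memory difference: beta(recalled) minus beta(not recalled)."""
    return beta_recalled - beta_not_recalled


# Hypothetical beta values for one participant in one region.
# betas[condition][repetition] = (subsequently recalled, not recalled)
betas = {
    "test": [(1.5, 0.5), (1.25, 0.25), (1.0, 0.25)],
    "restudy": [(0.5, 0.5), (0.75, 0.5), (1.0, 0.5)],
}

# One SMD per condition and repetition.
smd_table = {
    condition: [smd(recalled, not_recalled) for recalled, not_recalled in reps]
    for condition, reps in betas.items()
}

# Main effect of Condition: mean SMD for test minus mean SMD for restudy.
condition_effect = mean(smd_table["test"]) - mean(smd_table["restudy"])


def change_over_repetitions(values):
    """Change in SMD from the first to the last repetition."""
    return values[-1] - values[0]


# Condition x Repetition contrast: does SMD change differently in the
# two conditions? Negative here means SMD falls for test relative to restudy.
interaction = (
    change_over_repetitions(smd_table["test"])
    - change_over_repetitions(smd_table["restudy"])
)

print(smd_table, condition_effect, interaction)
```

In a group analysis these per-participant values would be tested across participants; the sketch only shows how the quantities are formed from the recalled and not recalled betas.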

Figure 4. The SMD interaction of Condition and Repetition in anterior cingulate cortex (ACC). Bars depict SMD-values (difference in Beta-values for subsequently recalled/not recalled items) for Restudy (blue) and Test (red) over repetitions (1, 2, 3). Error bars denote standard error.

During fMRI, participants answered covertly, so it was not possible to know with confidence whether they were actually recalling the items correctly. To address this issue, a behavioral pilot was conducted. Sitting in front of a computer, participants were instructed to type their answers into a text box on all three test occasions. The results confirmed that a correct answer on the final test was associated with correct retrieval on all three test occasions 90% of the time. This justified the assumption that items categorized as Recalled in the fMRI sessions had actually been recalled during all three repetitions. Even though an advantage for study could be expected on Day 1, based on earlier experiments (Roediger & Karpicke, 2006), no difference between the test and restudy conditions was found. The pilot did, however, show a significant difference in favor of the test condition in the proportion of items retained from Day 1 to Day 7. Even though the fMRI behavioral data did not show a significant advantage for testing, the effect is well established (Chan & McDermott, 2007; Pyc & Rawson, 2010; Roediger & Karpicke, 2006). The fact that participants performed equally well in both conditions, even though the word pairs were shown only once in the Test condition, suggests some kind of retention-enhancing process during testing.
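The pilot check described above, the share of correct final answers that coincided with correct retrieval on all three test occasions, can be sketched as follows. The record layout, the function name and the six example items are hypothetical and are not the pilot data.

```python
def proportion_all_practice_correct(items):
    """Among items answered correctly on the final test, return the share
    that were also answered correctly on every practice test."""
    final_correct = [item for item in items if item["final"]]
    if not final_correct:
        raise ValueError("no items were correct on the final test")
    consistent = sum(all(item["practice"]) for item in final_correct)
    return consistent / len(final_correct)


# Hypothetical typed-response records for one participant:
# "practice" holds correctness on test occasions 1-3, "final" the final test.
items = [
    {"practice": (True, True, True), "final": True},
    {"practice": (True, True, True), "final": True},
    {"practice": (False, True, True), "final": True},
    {"practice": (True, True, True), "final": True},
    {"practice": (False, False, False), "final": False},
    {"practice": (True, False, True), "final": False},
]

print(proportion_all_practice_correct(items))
```

Averaging this proportion over participants would give a figure comparable to the 90% reported for the pilot.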

The first fMRI analysis aimed to compare the current findings with earlier findings on encoding and retrieval. For testing, SME was found in the bilateral supramarginal gyrus and in post- and precentral regions. The left supramarginal gyrus is part of the inferior parietal cortex and has been suggested to be a key region for mental rehearsal of phonological information (Paulesu, Frith, & Frackowiak, 1993; Salmon et al., 1996). Because our participants were instructed to answer covertly, it is not surprising that this region was more activated during successful than during unsuccessful recall, even though this was not anticipated. The precentral region, Brodmann area 6, corresponds to the premotor cortex/supplementary motor cortex and is mostly known for its involvement in advanced motor planning (Hoshi & Tanji, 2000). Together with the supramarginal gyrus, the premotor cortex also appears to be involved in mental rehearsal (Salmon et al., 1996), as well as in suppression of information (Anderson et al., 2004), which could reflect that during successful trials target information is mentally rehearsed and competing information is attenuated. Surprisingly, no significant SME was found in the left VLPFC, which is thought to have a more controlling role in discriminating between relevant and non-relevant memories (Badre & Wagner, 2007). In line with a meta-analysis by Spaniol et al. (2009), significant results were found in the right middle frontal gyrus, left superior frontal gyrus, left middle temporal gyrus, and insula. No significant activation was found in the hippocampus, which is somewhat unexpected considering its clear role in episodic memory (Squire, 2004). However, hippocampal activation is easier to find with tasks targeting recognition/familiarity than with tasks targeting recall (Spaniol et al., 2009). One reason may be noncritical recollection; that is, familiarity putatively leads to less precise activations, in which related, contextual but noncritical details sum to higher hippocampal activation (Spaniol et al., 2009). For the restudy condition, SME was found in left occipital areas, the precentral region (premotor cortex) and the bilateral anterior cingulate cortex. Since the participants were instructed to use visual imagery as a device during encoding, the occipital activation (which was also found in the test condition) may reflect more vivid mental imagery (Cui, Jeter, Yang, Montague, & Eagleman, 2007). In accordance with earlier findings (Kim, 2011), significant results were found in the premotor cortex. Kim (2011) hypothesizes that it regulates attention during encoding, together with parietal regions. In sum, these results partly reflect earlier research. The main concern with the current study is the number of participants. Also, whereas most research on encoding and retrieval has only one repetition of each item, the current study collapses all three repetitions into one, which may lead to different results.

The second analysis entered the SMDs into an ANOVA. For the main effect of Condition (Figure 3), testing showed higher SMD in the left inferior parietal lobe, left angular gyrus, left medial and superior frontal regions, and both the left posterior and anterior cingulate cortex. Higher SMD for testing was also found in the precuneus, as anticipated by the encoding/retrieval flip (Huijbers et al., 2012). An additional interpretation relates to resting-state processes in the so-called default mode network (DMN; Raichle et al., 2001). The DMN is associated with the posterior cingulate cortex and precuneus (Fransson & Marrelec, 2008). It is possible that these regions express dual processes; that is, during encoding the activations in these regions may reflect resting-state processes, whereas during testing they may reflect processes promoting consolidation. The high SMD for testing in the left inferior parietal lobe may result from its rich reciprocal connections with the hippocampus and parahippocampus (Clower, West, Lynch, & Strick, 2001; Lavenex, Suzuki, & Amaral, 2002). Wagner, Shannon, Kahn, and Buckner (2005) hypothesized that the left inferior parietal lobe is important for maintaining attention on internal representations during episodic memory retrieval. In a related interpretation, the current findings of high SMD for testing in the inferior parietal lobe/angular gyrus, superior frontal gyrus and precuneus fit very well with what Binder, Desai, Graves, and Conant (2009) propose to be part of a distributed semantic network. This would mean that testing, when it successfully promotes memory retention, leads to more processing of semantic information than restudying does. In line with the interpretations of similar experiments (Van den Broek et al., 2013; Wing, Marsh, & Cabeza, 2013b), this could be construed as evidence for elaborative retrieval processing (Carpenter & DeLosh, 2006). Even if this were the case, it remains a logical leap from these observations to the conclusion that this is why testing is more efficient. The finding is merely an observation of dissimilarity between the two conditions, as well as a clue to the different processes that yield memory retention.

The main effect of Repetition showed increasingly higher SMD in the striatal regions (caudate and putamen). In rats, regional infusions of serotonin (Prado-Alcalá et al., 2003), GABA (Salado-Castillo, Díaz del Guante, Alvarado, Quirarte, & Prado-Alcalá, 1996) and dopamine (Kelley & Delfs, 1991) into the striatum all modify acetylcholinergic function and interfere with memory processes. Considering the role of the striatum in consolidation (Prado-Alcalá et al., 2003) and the fact that striatal activation becomes increasingly distinct between successful and unsuccessful trials, for both the test and study conditions, this activation may be understood as one of the processes leading to stronger memory traces. It may reflect a procedural skill of binding perceptual and internal stimuli, as the caudate nucleus has been theorized to do (Ashby, Alfonso-Reese, Turken, & Waldron, 1998). There was no significant main effect of repetition on frontoparietal SMD, as expected based on earlier studies (Kuhl et al., 2007). However, it is possible that this does not even relate to SMD differences between testing and restudying.

Perhaps the most interesting finding in the analyses was the interaction effect between Condition and Repetition. This result identifies regions where the test and restudy conditions differ in SMD over time. In this interaction, SMD in the ACC was high from the beginning in the test condition but initially low in the study condition. Over time, the SMD for the study condition approached the level of the test condition. Eriksson et al. (2011) found that ACC activation increased as a function of the number of previously successful recalls, and that higher ACC activation during recall predicted recall success 5 months later. This finding was somewhat surprising, since the ACC had previously been portrayed as a region involved in cognitive control and error/conflict monitoring (Carter & Van Veen, 2007). According to the authors, this interpretation of the ACC was not appropriate in the context of their findings, since the activation predicted retrieval success, which would reasonably lead to a decreased demand for error/conflict monitoring. However, Weible (2013) recently argued for a role of the ACC in the retrieval of remote memories, as part of attending to and recognizing the cues that are most relevant for successful retrieval. If recognizing internal cues is the ACC's role in retrieval, it would seem that testing takes advantage of this function from the beginning, whereas the study condition needs more time. Another plausible explanation of the testing effect can be found in retrieval-induced forgetting paradigms (Anderson, Bjork, & Bjork, 2000), where it has been shown that recalling items impairs retention of neighboring semantic information, resulting in a reduced need for cognitive control mechanisms (Kuhl et al., 2007). It is possible that this reduction of mnemonic conflict is due to stronger activation of the previously recalled item, which would interfere with related representations. However, Wimber et al. (2008) showed that recalling an item automatically weakens related information via inhibition. Encoding, on the other hand, is associated with priming (Schott et al., 2006), that is, facilitation of neighboring information, which could possibly explain why studied items are more easily forgotten than tested items (Roediger & Karpicke, 2006).

To conclude, the current study investigated SMD differences between test and restudy conditions in a subsequent memory paradigm. Findings included higher SMD in the left inferior parietal lobe, precuneus and superior frontal gyrus for testing compared to studying. This may imply that testing involves more extensive semantic processing. In addition, an interaction effect between Condition (Test, Restudy) and Repetition was found in the anterior cingulate cortex, which may reflect an early recruitment of memory-enhancing control processes that lead to consolidation during testing. This experiment contributes a functional mapping of brain regions related to the testing effect, which may further the understanding of its beneficial memory-related processes.

References

Anderson, M. C., Bjork, E. L., & Bjork, R. A. (2000). Retrieval-induced forgetting: Evidence for a recall-specific mechanism. Psychonomic bulletin & review, 7(3), 522–30. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/11082860

Anderson, M. C., Ochsner, K. N., Kuhl, B., Cooper, J., Robertson, E., Gabrieli, S. W., Glover, G. H., et al. (2004). Neural systems underlying the suppression of unwanted memories. Science (New York, N.Y.), 303(5655), 232–5. doi:10.1126/science.1089504

Arnold, K. M., & McDermott, K. B. (2012). Test-potentiated learning: Distinguishing between direct and indirect effects of tests. Journal of experimental psychology. Learning, memory, and cognition. doi:10.1037/a0029199

Ashby, F. G., Alfonso-Reese, L. A., Turken, A. U., & Waldron, E. M. (1998). A neuropsychological theory of multiple systems in category learning. Psychological review, 105(3), 442–81. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/9697427

Baddeley, A. (1992). Working memory. Science, 255(5044), 556–559. doi:10.1126/science.1736359

Badre, D., & Wagner, A. D. (2007). Left ventrolateral prefrontal cortex and the cognitive control of memory. Neuropsychologia, 45(13), 2883–901. doi:10.1016/j.neuropsychologia.2007.06.015

Binder, J. R., Desai, R. H., Graves, W. W., & Conant, L. L. (2009). Where is the semantic system? A critical review and meta-analysis of 120 functional neuroimaging studies. Cerebral cortex (New York, N.Y. : 1991), 19(12), 2767–96. doi:10.1093/cercor/bhp055

Blaxton, T. A. (1989). Investigating dissociations among memory measures: Support for a transfer-appropriate processing framework. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15(4), 657–668. doi:10.1037//0278-7393.15.4.657

Blumenfeld, R. S., & Ranganath, C. (2007). Prefrontal cortex and long-term memory encoding: An integrative review of findings from neuropsychology and neuroimaging. The Neuroscientist : a review journal bringing neurobiology, neurology and psychiatry, 13(3), 280–91. doi:10.1177/1073858407299290

Cabeza, R. (2008). Role of parietal regions in episodic memory retrieval: The dual attentional processes hypothesis. Neuropsychologia, 46(7), 1813–27. doi:10.1016/j.neuropsychologia.2008.03.019

Carpenter, S. K. (2011). Semantic information activated during retrieval contributes to later retention: Support for the mediator effectiveness hypothesis of the testing effect. Journal of experimental psychology. Learning, memory, and cognition, 37(6), 1547–52. doi:10.1037/a0024140

Carpenter, S. K., & DeLosh, E. L. (2006). Impoverished cue support enhances subsequent retention: Support for the elaborative retrieval explanation of the testing effect. Memory & cognition, 34(2), 268–76. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/16752591

Carpenter, S. K., Pashler, H., & Vul, E. (2006). What types of learning are enhanced by a cued recall test? Psychonomic bulletin & review, 13(5), 826–30.

Carter, C. S., & Van Veen, V. (2007). Anterior cingulate cortex and conflict detection: An update of theory and data. Cognitive, affective & behavioral neuroscience, 7(4), 367–79. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/18189010

Chan, J. C. K., & McDermott, K. B. (2007). The testing effect in recognition memory: A dual process account. Journal of experimental psychology. Learning, memory, and cognition, 33(2), 431–7. doi:10.1037/0278-7393.33.2.431

Clower, D. M., West, R. A., Lynch, J. C., & Strick, P. L. (2001). The inferior parietal lobule is the target of output from the superior colliculus, hippocampus, and cerebellum. The Journal of neuroscience : the official journal of the Society for Neuroscience, 21(16), 6283–91. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/11487651

Cui, X., Jeter, C. B., Yang, D., Montague, P. R., & Eagleman, D. M. (2007). Vividness of mental imagery: Individual variability can be measured objectively. Vision research, 47(4), 474–8. doi:10.1016/j.visres.2006.11.013

Dudai, Y. (2012). The restless engram: Consolidations never end. Annual review of neuroscience, 35, 227–47. doi:10.1146/annurev-neuro-062111-150500

Dunlosky, J., Rawson, K. A., Marsh, E. J., Nathan, M. J., & Willingham, D. T. (2013). Improving students’ learning with effective learning techniques: Promising directions from cognitive and educational psychology. Psychological Science in the Public Interest, 14(1), 4–58. doi:10.1177/1529100612453266

Eichenbaum, H. (2004). Hippocampus: Cognitive processes and neural representations that underlie declarative memory. Neuron, 44, 109–120. Retrieved from http://www.sciencedirect.com/science/article/pii/S089662730400529X

Eriksson, J., Kalpouzos, G., & Nyberg, L. (2011). Rewiring the brain with repeated retrieval: A parametric fMRI study of the testing effect. Neuroscience letters, 505(1), 36–40. doi:10.1016/j.neulet.2011.08.061

Fransson, P., & Marrelec, G. (2008). The precuneus/posterior cingulate cortex plays a pivotal role in the default mode network: Evidence from a partial correlation network analysis. NeuroImage, 42(3), 1178–84. doi:10.1016/j.neuroimage.2008.05.059

Friederici, A. D. (2002). Towards a neural basis of auditory sentence processing. Trends in cognitive sciences, 6(2), 78–84. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/15866191

Gabrieli, J. (2004). Cognitive neuroscience of human memory. Current Opinion in Neurobiology, 14(2), 135–138. doi:10.1016/j.conb.2004.03.018

Gardiner, J. M., Craik, F., & Bleasdale, F. (1973). Retrieval difficulty and subsequent recall. Memory & Cognition, 1(3), 213–216. Retrieved from http://link.springer.com/article/10.3758/BF03198098

Glover, J. (1989). The “testing” phenomenon: Not gone but nearly forgotten. Journal of Educational Psychology, 392–399. Retrieved from http://www.gwern.net/docs/1989-glover.pdf

Graf, P., & Ryan, L. (1990). Transfer-appropriate processing for implicit and explicit memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16(6), 978–992. doi:10.1037//0278-7393.16.6.978

Grill-Spector, K., Henson, R., & Martin, A. (2006). Repetition and the brain: Neural models of stimulus-specific effects. Trends in cognitive sciences, 10(1), 14–23. doi:10.1016/j.tics.2005.11.006

Halamish, V., & Bjork, R. A. (2011). When does testing enhance retention? A distribution-based interpretation of retrieval as a memory modifier. Journal of experimental psychology. Learning, memory, and cognition, 37(4), 801–12. doi:10.1037/a0023219

Hasegawa, I. (2000). Neural mechanisms of memory retrieval: Role of the prefrontal cortex. Reviews in the neurosciences, 11(2-3), 113–25. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/10718149

Hashimoto, T., Usui, N., Taira, M., & Kojima, S. (2011). Neural enhancement and attenuation induced by repetitive recall. Neurobiology of learning and memory, 96(2), 143–9. doi:10.1016/j.nlm.2011.03.008

Hoshi, E., & Tanji, J. (2000). Integration of target and body-part information in the premotor cortex when planning action. Nature, 408, 466–470. doi:10.1038/35044075

Huijbers, W., Vannini, P., Sperling, R. A., Pennartz, C. M., Cabeza, R., & Daselaar, S. M. (2012). Explaining the encoding/retrieval flip: Memory-related deactivations and activations in the posteromedial cortex. Neuropsychologia, 50(14), 3764–74. doi:10.1016/j.neuropsychologia.2012.08.021

Johnson, C. I., & Mayer, R. E. (2009). A testing effect with multimedia learning. Journal of Educational Psychology, 101(3), 621–629. doi:10.1037/a0015183

Kelley, A. E., & Delfs, J. M. (1991). Dopamine and conditioned reinforcement. I. Differential effects of amphetamine microinjections into striatal subregions. Psychopharmacology, 103(2), 187–96. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/2027920

Kim, H. (2011). Neural activity that predicts subsequent memory and forgetting: A meta-analysis of 74 fMRI studies. NeuroImage, 54(3), 2446–61. doi:10.1016/j.neuroimage.2010.09.045

Kuhl, B. A., Dudukovic, N. M., Kahn, I., & Wagner, A. D. (2007). Decreased demands on cognitive control reveal the neural processing benefits of forgetting. Nature neuroscience, 10(7), 908–14. doi:10.1038/nn1918

Lau, E. F., Phillips, C., & Poeppel, D. (2008). A cortical network for semantics: (De)constructing the N400. Nature reviews. Neuroscience, 9(12), 920–33. doi:10.1038/nrn2532

Lavenex, P., Suzuki, W. A., & Amaral, D. G. (2002). Perirhinal and parahippocampal cortices of the macaque monkey: Projections to the neocortex. The Journal of comparative neurology, 447(4), 394–420. doi:10.1002/cne.10243

McClelland, J. L., McNaughton, B. L., & O’Reilly, R. C. (1995). Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychological review, 102(3), 419–57. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/7624455

Morris, C., Bransford, J., & Franks, J. (1977). Levels of processing versus transfer appropriate processing. Journal of verbal learning and verbal …, 533(1977), 519–533. Retrieved from http://www.sciencedirect.com/science/article/pii/S0022537177800169

Nadel, L., & Moscovitch, M. (1997). Memory consolidation, retrograde amnesia and the hippocampal complex. Current opinion in neurobiology, 7(2), 217–27. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/9142752

Partovi, S., Konrad, F., Karimi, S., Rengier, F., Lyo, J. K., Zipp, L., Nennig, E., et al. (2012). Effects of covert and overt paradigms in clinical language fMRI. Academic radiology, 19(5), 518–25. doi:10.1016/j.acra.2011.12.017

Paulesu, E., Frith, C., & Frackowiak, R. (1993). The neural correlates of the verbal component of working memory. Nature, 362, 342–345. doi:10.1038/362342a0

Prado-Alcalá, R. A., Ruiloba, M. I., Rubio, L., Solana-Figueroa, R., Medina, C., Salado-Castillo, R., & Quirarte, G. L. (2003). Regional infusions of serotonin into the striatum and memory consolidation. Synapse (New York, N.Y.), 47(3), 169–75. doi:10.1002/syn.10158

Purves, D., Brannon, E. M., Cabeza, R., Huettel, S. A., LaBar, K. S., Platt, M. L., & Woldorff, M. G. (2008). Principles of Cognitive Neuroscience (1st ed., pp. 341, 346–347). Sinauer Associates, Inc.

Pyc, M. A., & Rawson, K. A. (2010). Why testing improves memory: mediator effectiveness hypothesis. Science (New York, N.Y.), 330(6002), 335. doi:10.1126/science.1191465

Pyc, M. A., & Rawson, K. A. (2009). Testing the retrieval effort hypothesis: Does greater difficulty correctly recalling information lead to higher levels of memory? Journal of Memory and Language, 60(4), 437–447. doi:10.1016/j.jml.2009.01.004

Raichle, M. E., MacLeod, A. M., Snyder, A. Z., Powers, W. J., Gusnard, D. A., & Shulman, G. L. (2001). A default mode of brain function. Proceedings of the National Academy of Sciences of the United States of America, 98(2), 676–82. doi:10.1073/pnas.98.2.676

Robbins, T. W. (1996). Refining the taxonomy of memory. Science (New York, N.Y.), 273(5280), 1353–4. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/8801630

Roediger, H. L., & Butler, A. C. (2011). The critical role of retrieval practice in long-term retention. Trends in cognitive sciences, 15(1), 20–7. doi:10.1016/j.tics.2010.09.003

Roediger, H. L., & Karpicke, J. D. (2006). The power of testing memory. Perspectives on Psychological Science, 1(3), 181–210. doi:10.1111/j.1745-6916.2006.00012.x

Salado-Castillo, R., Díaz del Guante, M. A., Alvarado, R., Quirarte, G. L., & Prado-Alcalá, R. A. (1996). Effects of regional GABAergic blockade of the striatum on memory consolidation. Neurobiology of learning and memory, 66(2), 102–8. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/8946403

Salmon, E., Van der Linden, M., Collette, F., Delfiore, G., Maquet, P., Degueldre, C., Luxen, A., et al. (1996). Regional brain activity during working memory tasks. Brain: a journal of neurology, 119(Pt 5), 1617–25. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/8931584

Schott, B. H., Richardson-Klavehn, A., Henson, R. N. A., Becker, C., Heinze, H.-J., & Düzel, E. (2006). Neuroanatomical dissociation of encoding processes related to priming and explicit memory. The Journal of neuroscience : the official journal of the Society for Neuroscience, 26(3), 792–800. doi:10.1523/JNEUROSCI.2402-05.2006

Spaniol, J., Davidson, P. S. R., Kim, A. S. N., Han, H., Moscovitch, M., & Grady, C. L. (2009). Event-related fMRI studies of episodic encoding and retrieval: meta-analyses using activation likelihood estimation. Neuropsychologia, 47(8-9), 1765–79. doi:10.1016/j.neuropsychologia.2009.02.028

Squire, L. R. (2004). Memory systems of the brain: a brief history and current perspective. Neurobiology of learning and memory, 82(3), 171–7. doi:10.1016/j.nlm.2004.06.005

Squire, L. R., & Bayley, P. J. (2007). The neuroscience of remote memory. Current opinion in neurobiology, 17(2), 185–96. doi:10.1016/j.conb.2007.02.006

Squire, L. R., & Wixted, J. T. (2011). The cognitive neuroscience of human memory since H.M. Annual review of neuroscience, 34, 259–88. doi:10.1146/annurev-neuro-061010-113720

Stark, C. E. L., & Okado, Y. (2003). Making memories without trying: Medial temporal lobe activity associated with incidental memory formation during recognition. The Journal of neuroscience: the official journal of the Society for Neuroscience, 23(17), 6748–53. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/12890767

Tse, D., Takeuchi, T., Kakeyama, M., Kajii, Y., Okuno, H., Tohyama, C., Bito, H., et al. (2011). Schema-dependent gene activation and memory encoding in neocortex. Science (New York, N.Y.), 333(6044), 891–5. doi:10.1126/science.1205274

Tulving, E. (1967). The effects of presentation and recall of material in free-recall learning. Journal of Verbal Learning and Verbal Behavior, 6(2), 175–184. doi:10.1016/S0022-5371(67)80092-6

Van den Broek, G. S. E., Takashima, A., Segers, E., Fernández, G., & Verhoeven, L. (2013). Neural Correlates of Testing Effects in Vocabulary Learning. NeuroImage. doi:10.1016/j.neuroimage.2013.03.071

Vannini, P., Hedden, T., Huijbers, W., Ward, A., Johnson, K. A., & Sperling, R. A. (2012). The ups and downs of the posteromedial cortex: Age- and amyloid-related functional alterations of the encoding/retrieval flip in cognitively normal older adults. Cerebral cortex (New York, N.Y.: 1991). doi:10.1093/cercor/bhs108

Vestergren, P., & Nyberg, L. (2013). Testing potentiates encoding in paired-associate learning: Evidence from fMRI. Manuscript in preparation.

Wagner, A. D., Koutstaal, W., & Schacter, D. L. (1999). When encoding yields remembering: Insights from event-related neuroimaging. Philosophical transactions of the Royal Society of London. Series B, Biological sciences, 354(1387), 1307–24. doi:10.1098/rstb.1999.0481

Wagner, A. D., Shannon, B. J., Kahn, I., & Buckner, R. L. (2005). Parietal lobe contributions to episodic memory retrieval. Trends in cognitive sciences, 9(9), 445–53. doi:10.1016/j.tics.2005.07.001

Weible, A. P. (2013). Remembering to attend: The anterior cingulate cortex and remote memory. Behavioural brain research, 245, 63–75. doi:10.1016/j.bbr.2013.02.010

Wimber, M., Bäuml, K.-H., Bergström, Z., Markopoulos, G., Heinze, H.-J., & Richardson-Klavehn, A. (2008). Neural markers of inhibition in human memory retrieval. The Journal of neuroscience: the official journal of the Society for Neuroscience, 28(50), 13419–27. doi:10.1523/JNEUROSCI.1916-08.2008

Wing, E. A., Marsh, E. J., & Cabeza, R. (2013a). Neural correlates of retrieval-based memory enhancement: An fMRI study of the testing effect. Neuropsychologia. doi:10.1016/j.neuropsychologia.2013.04.004

Wing, E. A., Marsh, E. J., & Cabeza, R. (2013b). Neural correlates of retrieval-based memory enhancement: An fMRI study of the testing effect. Neuropsychologia, 1–11. doi:10.1016/j.neuropsychologia.2013.04.004

Wissman, K. T., Rawson, K. A., & Pyc, M. A. (2011). The interim test effect: testing prior material can facilitate the learning of new material. Psychonomic bulletin & review, 18(6), 1140–7. doi:10.3758/s13423-011-0140-7

Xue, G., Dong, Q., Chen, C., Lu, Z., Mumford, J. A., & Poldrack, R. A. (2010). Greater neural pattern similarity across repetitions is associated with better memory. Science (New York, N.Y.), 330(6000), 97–101. doi:10.1126/science.1193125

Yassa, M. A., & Stark, C. E. L. (2011). Pattern separation in the hippocampus. Trends in neurosciences, 34(10), 515–25. doi:10.1016/j.tins.2011.06.006
