A Neurocognitive Model for Predicting the Fate of Individual Memories Shannon M

A neurocognitive model for predicting the fate of individual memories Shannon M. Tubridy, David Halpern, Lila Davachi, & Todd M. Gureckis Department of Psychology, 6 Washington Place New York, NY 10003 USA Abstract viduals such as their own self-reported judgements of learning (JOLs)(Nelson & Dulosky, 1991). To foreshadow, we find One goal of cognitive science is to build theories of mental that the model utilizing fMRI signals recorded from individu- function that predict individual behavior. In this project we focus on predicting, for individual participants, which specific als while they learned allowed us to predict which items they items in a list will be remembered at some point in the future. would remember better than alternative approaches which uti- If you want to know if an individual will remember something, lized explicit assessments. The success of this approach alone commonsense approach is to give them a quiz or test such that a correct answer likely indicates later memory for an item. lows us to determine, without directly asking or testing partic- In this project we attempt to predict later memory without ex- ipants, whether an experienced event is likely to be forgotten plicit assessments by jointly modeling both neural and behav- in the future and therefore deserving of additional practice. ioral data in a computational cognitive model which captures the dynamics of memory acquisition and decay. In this paper, we lay out a novel hierarchical Bayesian approach for com- Prior work on predicting human learning and bining neural and behavioral data and present results showing memory how fMRI signals recorded during the study phase of a memory task can improve our ability to predict (in held-out data) Atkinson (1972) sought to optimize the acquisition of a for- which items will be remembered or forgotten 72 hours later. eign language vocabulary using a Markov model of memory Keywords: memory, joint modeling, cognitive neuroscience which algorithmically chose the sequence of words participants should study on each trial (Atkinson, 1972). In the Introduction critical test of the model, word-pairs were selected for study A number of approaches in cognitive science and education either randomly, by participants themselves, or using the fit- attempt to use computer models to algorithmically tailor in- ted model. formation presentation during study to the needs of individual The model-based approach assumed that each individ- learners (Smallwood, 1962; Atkinson, 1972; Fu et al., 2006; ual memory trace – memory for the association between Ritter, Anderson, Koedinger, & Corbett, 2007; Pavlik & An- words in a pair – could be in one of three mutually exclu- derson, 2008; Lindsey, Mozer, Cepeda, & Pashler, 2009; Raf- sive latent states S = fsU ;sT ;sPg representing (U)nknown, ferty, LaMar, & Griffiths, 2015). A core goal of these ap- (T)emporarily stored, and (P)ermanently stored memories, proaches is to leverage insights about the dynamics of learn- respectively. Each time a word was studied there was a prob- ing and memory in order to predict which materials are likely ability the memory trace would transition to the more fully to be forgotten and which will be remembered at future points learned T or P states according to the study transitions matrix in time. shown in Figure 1. Alternatively, on trials when a word was Most of these approaches adapt their recommendations not studied (e.g., another word was studied) there was a pos- to individuals based primarily on assessments (e.g., perfor- sibility of forgetting (i.e., moving from T to U) as reflected mance on quizzes and tests given during or after a learning in the decay transitions. Every time an item was presented to session)(Atkinson, 1972; Corbett & Anderson, 1995; Kha- the learner in the computer-aided condition, the model prob- jah, V. Lindsey, & Mozer, 2016). In this paper we attempt abilistically updated a posterior probability estimate of the to predict an individual’s memory (specifically which items state of each memory trace according to the transition proba- will be remembered and which will be forgotten after a de- bilities. In this way, the model used the history and dynamics lay) without using explicit assessments. Our approach builds of a trace (e.g., how long ago it was last studied, how many on research in cognitive neuroscience which has identified times it has been studied, etc.) to make a prediction about the several neuroimaging correlates of successful memory forma- status of a memory at any point in time. tion (Davachi, 2006). We use functional magnetic resonance The critical insight from Atkinson is that an explicit model imaging (fMRI) to “peer into the minds” of learners while of the dynamics of a memory trace can be exploited by they study and to use the resulting information to improve an adaptive computer algorithm to predict human learning. predictions about their memory tested at a multi-day delay. This influential finding inspired a line of work that has uti- We begin by laying out a novel Hidden Markov Model lized Bayesian Knowledge Tracing (BKT) to adapt instruc- (HMM, Rabiner, 1989) of memory which can learn to utilize tion to individual learners (Corbett & Anderson, 1995) and fMRI signals as helpful indicators about the mnemonic status generated a range of subsequent papers and modeling at- fremembered, forgotteng of individual memory traces. We tempts (Pavlik & Anderson, 2008; Lindsey et al., 2009; Kha- then compare the predictive ability of this model to a number jah, Lindsey, & Mozer, 2014). of simpler alternatives which lack access to the neural infor- Although these models show promise, they have several mation but which incorporate other information about indi- limitations. In particular, there is usually no information Transitions during study opportunities this approach is flexibility in incorporating diverse kinds of Transition probabilities observable emissions (Rabiner, 1989). Here we focus on U T P BOLD signals measured with fMRI (as well as metacognitive fMRI Other judgments in the form of JOLs) because such signals have, in Emissions Behavioral { responses past work, been been related to future memory performance Transitions between study opportunities but the framework could readily incorporate other observa- tions bearing on memory status (e.g., EEG, eye movements). U T P Our goal of combining neuroimaging data with a cognitive model is part of a broader recent initiative to find ways Forgetting to link behavioral and neural data together with hypotheses Figure 1: Structure of Atkinson’s (1972) three state Markov model about cognitive processes (Hawkins, Mittner, Forstmann, & showing latent memory states, the allowable transitions between Heathcote, 2017; Turner et al., 2013; Anderson et al., 2010; them during learning and forgetting, and the parameters governing the state transitions. Anderson, 2012; Anderson, Pyke, & Fincham, 2016; Turner, Rodriguez, Norcia, McClure, & Steyvers, 2016; Turner, Forstmann, Love, Palmeri, & Maanen, 2016). Although the about the knowledge or memory status of some material un- research topics in these reports are diverse, they share the per- til the learner completes an explicit assessment (e.g., test or spective that neural data can provide useful information about metacognitive rating of learning). However, it has long been latent states or cognitive processes beyond that available from recognized that repeated assessments interrupt the learning behavior alone and that the constraining influence of jointly context and may alter the nature of memory in consequen- modeling neural and behavioral data enables more accurate tial ways (Anderson, Bjork, & Bjork, 1994). In addition, description of cognition. the focus on explicit assessments runs the risk of ignoring other useful forms of information bearing on the underlying Structure of the model knowledge state of individuals (Anderson, Betts, Ferris, & We assume that a learner studies a list of N word pairs or Fincham, 2010; Anderson, 2012; Turner et al., 2013). pair-wise associations. We characterize each memory trace (i.e., each word pair) as a HMM with the structure shown in Signals of memory formation in the brain Figure 1. For computational reasons, we assume that indi- Following pioneering clinical and animal research (Scoville vidual memory traces are independent but in future work this & Milner, 1957; Mishkin, 1978) it is now common consensus assumption can be easily relaxed to model inter-item interac- that there are regions in the brain which are critically involved tions. in memory acquisition. One important finding is that neu- The core of each HMM is the set of discrete latent memory roimaging signals recorded during study episodes differ ac- states S = fsU ;sT ;sPg; a prior, Pt=0, over the initial states cording to whether those episodes are later remembered com- (before study) for each word pair; the transition probabili- pared to those that are forgotten (Paller & Wagner, 2002), a ties, T, which determine the probability of a trace moving phenomenon known as the subsequent memory effect (SME). between the different states at each point in time as set by the Of particular interest in the context of this paper is the fact parameters x, y, z, and f (Figure 1A); and the emission dis- that these differences are, on average, measurable at the time tributions, E, defining the probability of observing data (be- of the to-be-remembered experience, suggesting that the fu- havioral or physiological) given a latent mental state and an ture memory status of individual episodes could be predicted external event eliciting an observable signal. from the brain response to a single trial. That is, rather than We propose two sets of transition probabilities (Figure 1), retroactively using later memory performance to identify the the application of which depends on the type of event, et = neural patterns predicting that performance, we hope to use fstudy;decay;testg, occurring at time t for a particular word these signals in a prospective fashion to predict the probabil- pair.

A Neurocognitive Model for Predicting the Fate of Individual Memories Shannon M

A Distributed Computational Cognitive Model for Object Recognition

The Place of Modeling in Cognitive Science

A Bayesian Computational Cognitive Model ∗

Connectionist Models of Language Processing

Cognitive Psychology for Deep Neural Networks: a Shape Bias Case Study

Cognitive Modeling, Symbolic

Connectionist Models of Cognition Michael S. C. Thomas and James L

Artificial Intelligence and Cognitive Science Have the Same Problem

Models of Cognition: Neurological Possiblity Does Not Indicate Neurological Plausibility

Artificial Neural Nets

A Cognitive Model of Drivers Attention

Using Recurrent Neural Networks to Understand Human Reward Learning