Enhancing Human Learning Via Spaced Repetition Optimization
Total Page:16
File Type:pdf, Size:1020Kb
Enhancing human learning via spaced repetition optimization Behzad Tabibiana,b,1, Utkarsh Upadhyaya, Abir Dea, Ali Zarezadea, Bernhard Scholkopf¨ b, and Manuel Gomez-Rodrigueza aNetworks Learning Group, Max Planck Institute for Software Systems, 67663 Kaiserslautern, Germany; and bEmpirical Inference Department, Max Planck Institute for Intelligent Systems, 72076 Tubingen,¨ Germany Edited by Richard M. Shiffrin, Indiana University, Bloomington, IN, and approved December 14, 2018 (received for review September 3, 2018) Spaced repetition is a technique for efficient memorization which tion algorithms. However, most of the above spaced repetition uses repeated review of content following a schedule determined algorithms are simple rule-based heuristics with a few hard- by a spaced repetition algorithm to improve long-term reten- coded parameters (8), which are unable to fulfill this promise— tion. However, current spaced repetition algorithms are simple adaptive, data-driven algorithms with provable guarantees have rule-based heuristics with a few hard-coded parameters. Here, been largely missing until very recently (14, 15). Among these we introduce a flexible representation of spaced repetition using recent notable exceptions, the work most closely related to ours the framework of marked temporal point processes and then is by Reddy et al. (15), who proposed a queueing network model address the design of spaced repetition algorithms with prov- for a particular spaced repetition method—the Leitner system able guarantees as an optimal control problem for stochastic (9) for reviewing flashcards—and then developed a heuristic differential equations with jumps. For two well-known human approximation algorithm for scheduling reviews. However, their memory models, we show that, if the learner aims to maximize heuristic does not have provable guarantees, it does not adapt recall probability of the content to be learned subject to a cost to the learner’s performance over time, and it is specifically on the reviewing frequency, the optimal reviewing schedule is designed for the Leitner systems. given by the recall probability itself. As a result, we can then In this work, we develop a computational framework to derive develop a simple, scalable online spaced repetition algorithm, optimal spaced repetition algorithms, specially designed to adapt MEMORIZE, to determine the optimal reviewing times. We per- to the learner’s performance, as continuously monitored by mod- APPLIED form a large-scale natural experiment using data from Duolingo, a ern spaced repetition software and online learning platforms. MATHEMATICS popular language-learning online platform, and show that learn- More specifically, we first introduce a flexible representation ers who follow a reviewing schedule determined by our algorithm of spaced repetition using the framework of marked temporal memorize more effectively than learners who follow alternative point processes (16). For several well-known human memory schedules determined by several heuristics. models (1, 12, 17–19), we use this presentation to express the dynamics of a learner’s forgetting rates and recall probabilities for the content to be learned by means of a set of stochastic memorization j spaced repetition j human learning j marked differential equations (SDEs) with jumps. Then, we can find the temporal point processes j stochastic optimal control Significance Our ability to remember a piece of information depends crit- ically on the number of times we have reviewed it, the temporal Understanding human memory has been a long-standing distribution of the reviews, and the time elapsed since the last problem in various scientific disciplines. Early works focused review, as first shown by a seminal study by Ebbinghaus (1). The on characterizing human memory using small-scale controlled effect of these two factors has been extensively investigated in the experiments and these empirical studies later motivated the experimental psychology literature (2, 3), particularly in second design of spaced repetition algorithms for efficient memo- language acquisition research (4–7). Moreover, these empirical rization. However, current spaced repetition algorithms are studies have motivated the use of flashcards, small pieces of rule-based heuristics with hard-coded parameters, which do information a learner repeatedly reviews following a schedule not leverage the automated fine-grained monitoring and determined by a spaced repetition algorithm (8), whose goal is to greater degree of control offered by modern online learning ensure that learners spend more (less) time working on forgotten platforms. In this work, we develop a computational frame- (recalled) information. work to derive optimal spaced repetition algorithms, specially The task of designing spaced repetition algorithms has a rich designed to adapt to the learners’ performance. A large-scale history, starting with the Leitner system (9). More recently, natural experiment using data from a popular language- several works (10, 11) have proposed heuristic algorithms that learning online platform provides empirical evidence that the schedule reviews just as the learner is about to forget an item, spaced repetition algorithms derived using our framework are i.e., when the probability of recall, as given by a memory model significantly superior to alternatives. of choice (1, 12), falls below a threshold. An orthogonal line of research (7, 13) has pursued locally optimal scheduling by iden- Author contributions: B.T., U.U., B.S., and M.G.-R. designed research; B.T., U.U., A.D., A.Z., tifying which item would benefit the most from a review given and M.G.-R. performed research; B.T. and U.U. analyzed data; and B.T., U.U., and M.G.-R. a fixed reviewing time. In doing so, the researchers have also wrote the paper.y proposed heuristic algorithms that decide which item to review The authors declare no conflict of interest.y by greedily selecting the item which is closest to its maximum This article is a PNAS Direct Submission.y learning rate. This open access article is distributed under Creative Commons Attribution License 4.0 In recent years, spaced repetition software and online plat- (CC BY).y forms such as Mnemosyne (mnemosyne-proj.org), Synap (www. Data deposition: The MEMORIZE algorithm has been deposited on GitHub (https:// synap.ac), and Duolingo (www.duolingo.com) have become github.com/Networks-Learning/memorize).y increasingly popular, often replacing the use of physical flash- 1 To whom correspondence should be addressed. Email: [email protected] cards. The promise of these pieces of software and online This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. platforms is that automated fine-grained monitoring and greater 1073/pnas.1815156116/-/DCSupplemental.y degree of control will result in more effective spaced repeti- www.pnas.org/cgi/doi/10.1073/pnas.1815156116 PNAS Latest Articles j 1 of 6 Downloaded by guest on September 30, 2021 optimal reviewing schedule for spaced repetition by solving a i.e., E[dN(t)] = u(t)dt, and think of the recall r as their binary stochastic optimal control problem for SDEs with jumps (20–23). marks. Moreover, every time that a learner reviews an item, the In doing so, we need to introduce a proof technique to find a recall r has been experimentally shown to have an effect on solution to the so-called Hamilton–Jacobi–Bellman (HJB) equa- the forgetting rate of the item (3, 15, 25). Here, we estimate tion (SI Appendix, sections 3 and 4), which is of independent such an effect using half-life regression (25), which implicitly interest. assumes that recalls of an item i during a review have a multi- For two well-known memory models, we show that, if the plicative effect on the forgetting rate ni (t)—a successful recall learner aims to maximize recall probability of the content to be at time tr changes the forgetting rate by (1 − αi ), i.e., ni (t) = learned subject to a cost on the reviewing frequency, the solu- (1 − αi )ni (tr ), αi ≤ 1, while an unsuccessful recall changes the tion uncovers a linear relationship with a negative slope between forgetting rate by (1 + βi ), i.e., ni (t) = (1 + βi )ni (tr ), βi ≥ 0. In the optimal rate of reviewing, or reviewing intensity, and the this context, the initial forgetting rate, ni (0), captures the diffi- recall probability of the content to be learned. As a consequence, culty of the item, with more difficult items having higher initial we can develop a simple, scalable online spaced repetition algo- forgetting rates compared with easier items, and the parame- rithm, which we name MEMORIZE, to determine the optimal ters αi , βi , and ni (0) are estimated using real data (refer to SI reviewing times. Finally, we perform a large-scale natural exper- Appendix, section 8 for more details). iment using data from Duolingo, a popular language-learning Before we proceed farther, we acknowledge that several lab- online platform, and show that learners who follow a reviewing oratory studies (6, 27) have provided empirical evidence that schedule determined by our algorithm memorize more effec- the retention rate follows an inverted U shape, i.e., mass prac- tively than learners who follow alternative schedules determined tice does not improve the forgetting rate—if an item is in by several heuristics. To facilitate research in this area, we are a learner’s short-term memory when the review happens, the releasing an open-source implementation of our algorithm (24). long-term retention does not improve. Thus, one could argue for time-varying parameters αi (t) and βi (t) in our framework. Modeling Framework of Spaced Repetition. Our framework is However, there are several reasons that prevent us from that: agnostic to the particular choice of memory model—it provides (i) The derivation of an optimal reviewing schedule under a set of techniques to find reviewing schedules that are opti- time-varying parameters becomes very challenging; (ii) for the mal under a memory model.