Spaced Repetition: Towards More Effective Learning in STEM
Total Page:16
File Type:pdf, Size:1020Kb
1 RESEARCH DIRECTIONS Spaced Repetition: towards more effective learning in STEM Alison Voice1*and Arran Stirton2 1School of Physics & Astronomy, University of Leeds, Leeds, LS2 9JT 2School of Physics & Astronomy, University of Leicester, Leicester, LE1 7RH Corresponding author: [email protected] Keywords: Spaced repetition; Long term memory; Cramming; Web app Abstract Introduction The use of spaced repetition within a physics The benefit of spaced repetition on learning higher education thermodynamics module has has been recognised for over a century, with been analysed for: its pattern of use by Ebbinghaus (1885) making the first systematic students; its effect on memory and investigation of memory, developing a performance in the end of module exam; and ‘forgetting curve’. This shows exponential performance in a delayed test after the summer decay of information from the memory when no vacation. A custom-built web app with the effort is made to revisit that information. facility to generate a personalised repetition However, if the information is revisited the rate timetable was used to deliver practice of decay reduces, thus allowing for ever questions on the material throughout the increasing time intervals between repetitions to module. Just over a quarter of students, retain long term memory of it. This psychology spanning the whole ability range of the class, research was first applied to education by made use of the app in some way, about half Mace (1968) suggesting that regular revision of using it in a spaced manner and half using it for curriculum material would be much more massed practice just before the exam. effective than ‘massed’ study (all at once). He Students who engaged in a spaced manner proposed that revision should be spaced in had an adjusted mean exam score of 70%, gradually increasing intervals. compared to 64% for massed usage and 61% for non-usage. The spaced usage represents a In higher education long term memory is positive effect size of 0.47 over non-usage, needed for students to have a sound platform which is statistically significant ( = 0.000056). of understanding on which to build new For the delayed test the mean adjusted scores advanced knowledge, and to apply when for spacers and non-users were 45% and 34% solving problems. Indeed, the Benchmark respectively. Whilst less material had been Statements for Physics (Quality Assurance retained over the summer, this revealed a Agency, 2017) and Chemistry (Quality statistically significant ( = 0.021) positive Assurance Agency, 2014) specifically mention effect size of 0.54. This work provides evidence this skill. Students, however, are often prone to and mechanisms to involve students in studying material in a massed way, in a ‘cram repetitive practice during the learning phase of for the exam’ mentality despite the potential for a course to advantage their long term retention diminishing the long term benefit. A survey of of material. 189 physics undergraduates (Curtis et al., 2018) revealed that students in all years study predominantly by reading and writing notes during term time, with 69% leaving any form of New Directions in the Teaching of Physical Sciences, Volume 15, Issue 1 (2020) https://doi.org/10.29311/ndtps.v0i15.3376 2 Spaced Repetition: towards more effective learning in STEM self-testing until the month before the exam. Methods This learning strategy is hardly surprising since To examine the effects of spaced repetition in both students and teachers tend to focus on a university STEM module setting, trials were short term performance, and rarely consider conducted on three successive cohorts of first the effect on long term retention (Kornell & year physics students at the University of Bjork, 2007; Roediger & Karpicke, 2018). Leeds over the academic years 2016-17, 2017- 18, and 2018-19. For each trial, the students Further investigation into the nature of this were given access to the intervention (a spaced practice reveals that recall (testing) is bespoke spaced-repetition web app) to assist more beneficial than re-reading (Roediger & them with their semester two thermodynamics Karpicke, 2006a) even in the absence of module. marking or feedback. This is explained (Rawson et al., 2015) as the process of recall Outcomes were assessed according to a improving and increasing the retrieval controlled before-after design, comparing pathways in the brain. semester one and module-specific semester two exam performance between the control In STEM classroom situations recent studies and intervention groups. Additionally for the have demonstrated the benefit of spaced 2018-19 cohort, a delayed test on repetition on long term memory with Medical thermodynamics content was administered at students (Kerfoot et al., 2007), Maths students the beginning of their second year to assess (Rohrer & Taylor, 2007; Gallo & Odu, 2009; how well material was retained over the Rohrer, 2009), Engineering students (Hopkins summer. et al., 2015) and Natural Science students (Kapler et al., 2015). Student participation In each trial all first year students enrolled on In the 1970s Leitner (1972) developed a the thermodynamics module were invited to system of flash cards, sorted into five boxes, to use the web app to assist their learning. Each employ the spacing effect to develop long term cohort was informed about the app through an memory. The first box contained information initial in-lecture presentation and reminded of least known, the second box was known a little its availability at various points throughout the better, and so on. Cards in the first box were module. The exact timing of when the app was then reviewed daily, the second box every two introduced to students varied from year to year. days, etc. When confident that information on a particular flash card could be recalled it was Participation in the intervention group was promoted to the next box, but failing to recall voluntary (and thus not randomized) and could the information on any card in any box, commence at any point from the initial demoted it back to the first box. With the advent introduction of the software, right up to the of personal computers and mobile technology, exam. Faculty were blind to individual apps have been created to replicate this participation, but aware of overall usage and process, the most well-known probably being self-reported student usage. The work was Anki and Duolingo. conducted in line with ethical requirements and approval from the University of Leeds. Based on this wealth of evidence that regular repetition and recall of material promotes better Provision of the spaced repetition web-app long term memory, the research reported here A purpose-built web application was used for investigates how first year undergraduate (UG) the trials, consisting of a bank of questions with Physics students engage with practice the facility to deliver a personalized review questions supplied via a custom-built spaced- schedule. The questions were designed to give repetition app. The effect of app usage on students practice of the core laws, equations performance in the end of module exam, and and concepts of the module, while the an impromptu delayed test after the summer scheduling system informed the students vacation is examined. individually of the best time to review each question, in line with the spacing effect. New Directions in the Teaching of Physical Sciences, Volume 15, Issue 1 (2020) https://doi.org/10.29311/ndtps.v0i15.3376 3 Spaced Repetition: towards more effective learning in STEM As the questions were being developed controlled by their first year, semester one alongside the trials, the number and content of performance. the questions varied slightly from year to year; starting with 50 questions in 2016-17, rising to Statistical methods 67 questions in 2017-18 and dropping to 62 in Students that did not complete either of their 2018-19. Questions were both conceptual and first or second semester exams were dropped mathematical in nature, containing a mix of from the analysis. Scores for both the first self-assessed free-recall questions alongside semester physics exam and the computer-assessed mathematical ones. To thermodynamics exam were normalized to a mitigate the issue of students purely percentage score to facilitate a before-and- memorising the answers to particular after comparison. questions, the parameters in mathematical questions were randomised during each As the control and intervention groups were review, so that the information being recalled identified post-hoc, two analyses were carried was the process to solve and not the numerical out: an intention-to-treat analysis where all answer. For the self-assessed questions students that used the app at least once were students had to decide for each question assigned to the intervention group, regardless whether they were sufficiently familiar with that of how they used it; and a per-protocol analysis, part of the syllabus, or whether they needed where students that massed their usage of the more practice. app (i.e. all their usage was on a single day) were excluded from the analysis. The scheduling system was designed to present for repeat the next day, any wrongly Comparisons of the mean semester one answered mathematical question or a question physics exam performance between the self-assessed as not sufficiently familiar. control and intervention groups were Successful/familiar questions were set for undertaken using a two-tailed Welch’s -test (a review at ever increasing time intervals modification of Student’s -test for unequal according to the following pattern: 1 day after sample variances) and differences in initially completing the question, then 6 days performance were calculated. Analysis of after the first review, and then subsequently at Covariance (ANCOVA) was used to test for intervals of 2.5 times the previous interval (i.e. significant difference between intervention and 15 days, 37 days, etc.).