Supplementary Information for

Enhancing Human Learning via Spaced Repetition Optimization

Behzad Tabibian, Utkarsh Upadhyay, Abir De, Ali Zarezade, Bernhard Schölkopf, Manuel Gomez-Rodriguez

Behzad Tabibian. E-mail: [email protected]

This PDF file includes:
    Supplementary text
    Figs. S1 to S7
    Tables S1 to S3
    References for SI reference citations

www.pnas.org/cgi/doi/10.1073/pnas.1815156116
Supporting Information Text

1. Proof of Proposition 1

According to Eq. 1, the recall probability m(t) depends on the forgetting rate n(t) and on the time elapsed since the last review, D(t) := t − t_r. Moreover, we can readily write the differential of D(t) as dD(t) = dt − D(t) dN(t). We define the vector X(t) = [n(t), D(t)]^T. Then, we use Eq. 2 and Itô's calculus (1) to compute its differential:

\[
dX(t) = \underbrace{\begin{bmatrix} 0 \\ 1 \end{bmatrix}}_{f(X(t),\,t)} dt + \underbrace{\begin{bmatrix} -\alpha n(t) r(t) + \beta n(t)(1 - r(t)) \\ -D(t) \end{bmatrix}}_{h(X(t),\,t)} dN(t).
\]

Finally, using again Itô's calculus and the above differential, we can compute the differential of the recall probability m(t) = e^{−n(t)D(t)} := F(X(t)) as follows:

\begin{align*}
dF(X(t)) &= F(X(t + dt)) - F(X(t)) = F(X(t) + dX(t)) - F(X(t)) \\
&= \big(f^T F_X(X(t))\big)\, dt + F\big(X(t) + h(X(t), t)\, dN(t)\big) - F(X(t)) \\
&= \big(f^T F_X(X(t))\big)\, dt + \big(F(X(t) + h(X(t), t)) - F(X(t))\big)\, dN(t) \\
&= \big(e^{-(D(t) - D(t))\, n(t)\,(1 - \alpha r(t) + \beta(1 - r(t)))} - e^{-D(t) n(t)}\big)\, dN(t) - n(t)\, e^{-D(t) n(t)}\, dt \\
&= -n(t)\, e^{-D(t) n(t)}\, dt + \big(1 - e^{-D(t) n(t)}\big)\, dN(t) \\
&= -n(t)\, F(X(t))\, dt + \big(1 - F(X(t))\big)\, dN(t) \\
&= -n(t)\, m(t)\, dt + \big(1 - m(t)\big)\, dN(t).
\end{align*}

2. Lemma 1

Lemma 1. Let x(t) and y(t) be two jump-diffusion processes defined by the following jump SDEs:

\begin{align*}
dx(t) &= f(x(t), y(t), t)\, dt + g(x(t), y(t), t)\, z(t)\, dN(t) + h(x(t), y(t), t)\,(1 - z(t))\, dN(t), \\
dy(t) &= p(x(t), y(t), t)\, dt + q(x(t), y(t), t)\, dN(t),
\end{align*}

where N(t) is a jump process and z(t) ∈ {0, 1}. If the function F(x(t), y(t), t) is once continuously differentiable in x(t), y(t) and t, then

\[
dF(x, y, t) = (F_t + f F_x + p F_y)(x, y, t)\, dt + \big[F(x + g, y + q, t)\, z(t) + F(x + h, y + q, t)\,(1 - z(t)) - F(x, y, t)\big]\, dN(t),
\]

where for notational simplicity we dropped the arguments of the functions f, g, h, p, q and the arguments of the state variables.

Proof. According to the definition of the differential,

\[
dF := dF(x(t), y(t), t) = F(x(t + dt), y(t + dt), t + dt) - F(x(t), y(t), t) = F(x(t) + dx(t), y(t) + dy(t), t + dt) - F(x(t), y(t), t).
\]

Then, using Itô's calculus, we can write

\begin{align*}
dF ={}& F(x + f dt + g,\, y + p dt + q,\, t + dt)\, dN(t)\, z + F(x + f dt + h,\, y + p dt + q,\, t + dt)\, dN(t)\,(1 - z) \\
& + F(x + f dt,\, y + p dt,\, t + dt)\,(1 - dN(t)) - F(x, y, t), \tag{1}
\end{align*}

where for notational simplicity we drop the arguments of all functions except F and dN. Then, we expand the first three terms:

\begin{align*}
F(x + f dt + g,\, y + p dt + q,\, t + dt) &= F(x + g, y + q, t) + F_x(x + g, y + q, t)\, f dt + F_y(x + g, y + q, t)\, p dt + F_t(x + g, y + q, t)\, dt, \\
F(x + f dt + h,\, y + p dt + q,\, t + dt) &= F(x + h, y + q, t) + F_x(x + h, y + q, t)\, f dt + F_y(x + h, y + q, t)\, p dt + F_t(x + h, y + q, t)\, dt, \\
F(x + f dt,\, y + p dt,\, t + dt) &= F(x, y, t) + F_x(x, y, t)\, f dt + F_y(x, y, t)\, p dt + F_t(x, y, t)\, dt,
\end{align*}

using that the bilinear differential form dt dN(t) = 0.

Finally, by substituting the above three equations into Eq. 1, we conclude that

\[
dF(x(t), y(t), t) = (F_t + f F_x + p F_y)(x(t), y(t), t)\, dt + \big[F(x + g, y + q, t)\, z(t) + F(x + h, y + q, t)\,(1 - z(t)) - F(x, y, t)\big]\, dN(t).
\]

Specifically, if in the above equation we take x(t) = n(t), y(t) = m(t), z(t) = r(t) and F = J, the cost-to-go defined by Eq. 6, then

\[
dJ(m, n, t) = \big[J_t(m, n, t) - n m J_m(m, n, t)\big]\, dt + \big[J(1, (1 - \alpha) n, t)\, r(t) + J(1, (1 + \beta) n, t)\,(1 - r(t)) - J(m, n, t)\big]\, dN(t).
\]
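As a concrete illustration of the dynamics just derived, the following Python sketch (ours, not part of the original SI; all names are illustrative) simulates the forgetting rate n(t) and the recall probability m(t) = e^{−n(t)D(t)} under a uniform reviewing schedule with intensity μ, using the α, β, n(0) and μ values later reported in Fig. S1.

import numpy as np

def simulate_recall(alpha=0.5, beta=1.0, n0=1.0, mu=0.6, tf=20.0, dt=0.01, seed=0):
    # Discretized simulation of the jump dynamics in the proof of Proposition 1:
    # between reviews, m(t) = exp(-n * D(t)) decays; at a review, D(t) jumps to 0
    # and n is multiplied by (1 - alpha) on recall or (1 + beta) on failure.
    rng = np.random.default_rng(seed)
    n, t_last, t = n0, 0.0, 0.0
    ts, ms = [], []
    while t < tf:
        if rng.random() < mu * dt:                        # review event, dN(t) = 1
            r = rng.random() < np.exp(-n * (t - t_last))  # r(t) ~ Bernoulli(m(t))
            n *= (1.0 - alpha) if r else (1.0 + beta)
            t_last = t                                    # D(t) resets to zero
        ts.append(t)
        ms.append(np.exp(-n * (t - t_last)))
        t += dt
    return np.array(ts), np.array(ms)

ts, ms = simulate_recall()
print(f"m(t) at t = {ts[-1]:.2f}: {ms[-1]:.3f}")

A smaller step dt gives a finer approximation of the continuous-time jump dynamics.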
3. Lemma 2

Lemma 2. Consider the following family of losses with parameter d > 0,

\begin{align*}
\ell_d(m(t), n(t), u(t)) &= h_d(m(t), n(t)) + g_d^2(m(t), n(t)) + \frac{1}{2}\, q\, u(t)^2, \\
g_d(m(t), n(t)) &= 2^{-1/2} \left[ c_2\, \frac{\log d}{-m(t)^2 + 2 m(t) - d} - c_2\, \frac{\log d}{1 - d} + c_1\, m(t) \log\frac{1 + \beta}{1 - \alpha} - c_1 \log(1 + \beta) \right]_+, \\
h_d(m(t), n(t)) &= -\sqrt{q}\; m(t)\, n(t)\, c_2\, \frac{(-2 m(t) + 2) \log d}{(-m(t)^2 + 2 m(t) - d)^2}, \tag{2}
\end{align*}

where c_1, c_2 ∈ ℝ are arbitrary constants. Then, the cost-to-go J_d(m(t), n(t), t) that satisfies the HJB equation, defined by Eq. 7, is given by

\[
J_d(m(t), n(t), t) = \sqrt{q} \left[ c_1 \log n(t) + c_2\, \frac{\log d}{-m(t)^2 + 2 m(t) - d} \right], \tag{3}
\]

and the optimal intensity is given by

\[
u_d^*(t) = q^{-1/2} \left[ c_2\, \frac{\log d}{-m(t)^2 + 2 m(t) - d} - c_2\, \frac{\log d}{1 - d} + c_1\, m(t) \log\frac{1 + \beta}{1 - \alpha} - c_1 \log(1 + \beta) \right]_+.
\]

Proof. Consider the family of losses defined by Eq. 2 and the functional form for the cost-to-go defined by Eq. 3. Then, for any parameter value d > 0, the optimal intensity u_d^*(t) is given by

\begin{align*}
u_d^*(t) &= q^{-1} \big[ J_d(m(t), n(t), t) - J_d(1, (1 - \alpha) n(t), t)\, m(t) - J_d(1, (1 + \beta) n(t), t)\,(1 - m(t)) \big]_+ \\
&= q^{-1/2} \left[ c_2\, \frac{\log d}{-m^2 + 2 m - d} - c_2\, \frac{\log d}{1 - d} + c_1\, m \log\frac{1 + \beta}{1 - \alpha} - c_1 \log(1 + \beta) \right]_+,
\end{align*}

and the HJB equation is satisfied:

\begin{align*}
& \frac{\partial J_d(m, n, t)}{\partial t} - m n\, \frac{\partial J_d(m, n, t)}{\partial m} + h_d(m, n) + g_d^2(m, n) - \frac{1}{2}\, q^{-1} \big[ J_d(m, n, t) - J_d(1, (1 - \alpha) n, t)\, m - J_d(1, (1 + \beta) n, t)\,(1 - m) \big]_+^2 \\
={}& \sqrt{q}\, m n\, c_2\, \frac{(-2 m + 2) \log d}{(-m^2 + 2 m - d)^2} + h_d(m, n) + g_d^2(m, n) \\
&\quad - \frac{1}{2} \left[ c_1 \log n + c_2\, \frac{\log d}{-m^2 + 2 m - d} - m \left( c_1 \log(n (1 - \alpha)) + c_2\, \frac{\log d}{1 - d} \right) - (1 - m) \left( c_1 \log(n (1 + \beta)) + c_2\, \frac{\log d}{1 - d} \right) \right]_+^2 \\
={}& \sqrt{q}\, m n\, c_2\, \frac{(-2 m + 2) \log d}{(-m^2 + 2 m - d)^2} \underbrace{- \sqrt{q}\, m n\, c_2\, \frac{(-2 m + 2) \log d}{(-m^2 + 2 m - d)^2}}_{h_d(m, n)} \\
&\quad - \frac{1}{2} \left[ c_2\, \frac{\log d}{-m^2 + 2 m - d} - c_2\, \frac{\log d}{1 - d} + c_1\, m \log\frac{1 + \beta}{1 - \alpha} - c_1 \log(1 + \beta) \right]_+^2 \\
&\quad + \frac{1}{2} \left[ c_2\, \frac{\log d}{-m^2 + 2 m - d} - c_2\, \frac{\log d}{1 - d} + c_1\, m \log\frac{1 + \beta}{1 - \alpha} - c_1 \log(1 + \beta) \right]_+^2 = 0,
\end{align*}

where for notational simplicity m = m(t), n = n(t) and u = u(t).
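As a numerical sanity check of Lemma 2, the sketch below (ours, not part of the original SI) transcribes J_d and u_d^* and evaluates the left-hand side of the HJB equation at a test point, approximating ∂J_d/∂m by finite differences. Here q, α and β follow Fig. S1, while c_1, c_2 and d are arbitrary admissible choices; the residual should vanish up to finite-difference error.

import numpy as np

def J_d(m, n, q, c1, c2, d):
    # Cost-to-go of Eq. 3.
    return np.sqrt(q) * (c1 * np.log(n) + c2 * np.log(d) / (-m**2 + 2*m - d))

def u_d_star(m, q, c1, c2, d, alpha, beta):
    # Optimal intensity of Lemma 2; [.]_+ denotes the positive part.
    inner = (c2 * np.log(d) / (-m**2 + 2*m - d) - c2 * np.log(d) / (1 - d)
             + c1 * m * np.log((1 + beta) / (1 - alpha)) - c1 * np.log(1 + beta))
    return max(inner, 0.0) / np.sqrt(q)

def hjb_residual(m, n, q, c1, c2, d, alpha, beta, eps=1e-6):
    # dJ_d/dt = 0; dJ_d/dm via central finite differences.
    Jm = (J_d(m + eps, n, q, c1, c2, d) - J_d(m - eps, n, q, c1, c2, d)) / (2 * eps)
    h = -np.sqrt(q) * m * n * c2 * (-2*m + 2) * np.log(d) / (-m**2 + 2*m - d)**2
    g = 2**-0.5 * max(c2 * np.log(d) / (-m**2 + 2*m - d) - c2 * np.log(d) / (1 - d)
                      + c1 * m * np.log((1 + beta) / (1 - alpha))
                      - c1 * np.log(1 + beta), 0.0)
    jump = (J_d(m, n, q, c1, c2, d) - J_d(1.0, (1 - alpha) * n, q, c1, c2, d) * m
            - J_d(1.0, (1 + beta) * n, q, c1, c2, d) * (1 - m))
    return -m * n * Jm + h + g**2 - max(jump, 0.0)**2 / (2 * q)

# q, alpha, beta as in Fig. S1; c1, c2, d are arbitrary test values.
args = dict(q=3e-4, c1=1.0, c2=-1.0, d=0.5, alpha=0.5, beta=1.0)
print(u_d_star(0.7, **args))           # optimal reviewing intensity at m = 0.7
print(hjb_residual(0.7, 1.0, **args))  # ~0, confirming Eq. 3 solves the HJB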
[Figure: four panels comparing MEMORIZE with the threshold, uniform and last minute baselines for t ∈ [0, 20]: (a) forgetting rate n(t), (b) short-term recall probability m(t + 5), (c) long-term recall probability m(t + 15), and (d) reviewing intensity u(t).]

Fig. S1. Performance of MEMORIZE in comparison with several baselines. The solid lines are median values and the shadowed regions are 30% confidence intervals. Short-term recall probability corresponds to m(t + 5) and long-term recall probability to m(t + 15). In all cases, we use α = 0.5, β = 1, n(0) = 1 and t_f − t_0 = 20. Moreover, we set q = 3 · 10^{−4} for MEMORIZE, μ = 0.6 for the uniform reviewing schedule, t_lm = 5 and μ = 2.38 for the last minute reviewing schedule, and m_th = 0.7 and c = ζ = 5 for the threshold based reviewing schedule. Under these parameter values, the total numbers of reviewing events for all algorithms are equal (within a tolerance of 5%).

4. Proof of Theorem 3

Consider the family of losses defined by Eq. 2 in Lemma 2, whose optimal intensity is given by

\[
u_d^*(t) = q^{-1/2} \left[ c_2\, \frac{\log d}{-m(t)^2 + 2 m(t) - d} - c_2\, \frac{\log d}{1 - d} + c_1\, m(t) \log\frac{1 + \beta}{1 - \alpha} - c_1 \log(1 + \beta) \right]_+.
\]
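In practice, review times for a bounded reviewing intensity can be sampled by standard thinning. The sketch below (ours, not part of the original SI) does so for MEMORIZE's reviewing intensity u(t) = q^{−1/2}(1 − m(t)) from the main text, which is bounded above by q^{−1/2}; parameter values follow Fig. S1, and all names are illustrative.

import numpy as np

def memorize_schedule(q=3e-4, alpha=0.5, beta=1.0, n0=1.0, tf=20.0, seed=0):
    # Thinning sampler for the intensity u(t) = q**-0.5 * (1 - m(t)), which is
    # bounded above by lam_bar = q**-0.5 for all m(t) in [0, 1].
    rng = np.random.default_rng(seed)
    lam_bar = q ** -0.5
    t, n, t_last, reviews = 0.0, n0, 0.0, []
    while True:
        t += rng.exponential(1.0 / lam_bar)  # candidate from a rate-lam_bar Poisson process
        if t > tf:
            return reviews
        m = np.exp(-n * (t - t_last))        # current recall probability
        if rng.random() < (1.0 - m):         # accept with probability u(t)/lam_bar = 1 - m
            reviews.append(t)
            r = rng.random() < m             # recall outcome r(t) ~ Bernoulli(m(t))
            n *= (1.0 - alpha) if r else (1.0 + beta)
            t_last = t                       # review resets D(t) to zero

reviews = memorize_schedule()
print(f"{len(reviews)} reviews sampled; first few: {[round(s, 2) for s in reviews[:5]]}")

Because the state (n, t_last) changes only at accepted events, evaluating the intensity at each candidate time makes the thinning exact for this state-dependent process.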
