Implementing the repetition spacing neural network

Bartosz Dreger, Piotr Wozniak, May 29, 1998

See Neural Network SuperMemo for a brief introduction to the repetition spacing neural network.

Basic assumption

The state of memory will be described with only two variables: retrievability (R) and stability (S) (Wozniak, Gorzelanczyk, Murakowski, 1995). The following equation relates R and S:

(1) R = e^{-(k/S) t}


  • k is a constant
  • t is time

For simplicity, we will set k=1 so that stability is defined unambiguously. With k=1, stability is the time after which retrievability falls to 1/e.
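As a minimal sketch, Eqn (1) can be written in Python as follows. The function name and the argument check are illustrative assumptions and do not come from NN.DLL.

```python
import math


def retrievability(t: float, stability: float, k: float = 1.0) -> float:
    """Eqn (1): retrievability after time t for a given stability.

    t and stability must be expressed in the same time unit (e.g. days).
    """
    if stability <= 0:
        raise ValueError("stability must be positive")
    return math.exp(-k * t / stability)


# Right after a repetition (t = 0) retrievability is 1.
print(retrievability(0.0, 10.0))
# After a time equal to the stability, retrievability is 1/e (for k = 1).
print(retrievability(10.0, 10.0))
```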

Input and output

The following functions are to be determined by the network:

(2) S_{i+1} = f_s(R, S_i, D, G)

(3) D_{i+1} = f_d(R, S, D_i, G)

Given R, S, D and G on the input, the neural network is to generate the new stability (S) and item difficulty (D) on the output:

(4) (R_i, S_i, D_i, G_i) => (D_{i+1}, S_{i+1})


  • R_i is retrievability before the i-th repetition
  • S_i is stability before the i-th repetition
  • S_{i+1} is stability after the i-th repetition
  • D_i is item difficulty before the i-th repetition
  • D_{i+1} is item difficulty after the i-th repetition
  • G_i is the grade given in the i-th repetition

Error correction for difficulty D

As in Algorithm SM-8, target difficulty is defined as the ratio of the second optimum interval to the first. The neural network plug-in (NN.DLL) will record this value for each item and use it in training the network:

(5) D_o = I_2/I_1


  • D_o is the guiding difficulty used in error correction (the higher D_o, the easier the item)
  • I_1 is the first optimum interval computed for the item in question (same for all items)
  • I_2 is the second optimum interval computed for the item

Important! The optimum intervals I_1 and I_2 are not the intervals proposed by the network before verification. They are the intervals used in error correction after the proposed interval has been executed and verified (see Error correction for stability S).

The initial value of difficulty will be set to 3.5, i.e. D_1=3.5. This is only for similarity with Algorithm SM-8. As the initial difficulty is not known, it cannot be used to determine the first interval. After the first grade is scored, error correction is still impossible because the second optimum interval is not yet known. Once it is known, D_o can be used for error correction of D on the output.

To avoid convergence problems in the network, the following formula will be used to determine the correct output on D:

(6) D_{opt} = 0.9*D_i + 0.1*D_o


  • D_{opt} is the difficulty used in error correction after the i-th repetition
  • D_i is the difficulty before the i-th repetition
  • D_o is the guiding difficulty from Eqn (5)

The convergence factor of 0.9 in Eqn (6) is arbitrary and may change depending on the network performance.
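Eqns (5) and (6) can be sketched in Python as below. The function names, the `convergence` parameter name and the sample intervals of 3 and 9 days are assumptions made for illustration only.

```python
def guiding_difficulty(i1: float, i2: float) -> float:
    """Eqn (5): ratio of the second optimum interval to the first."""
    if i1 <= 0:
        raise ValueError("the first interval must be positive")
    return i2 / i1


def target_difficulty(d_i: float, d_o: float, convergence: float = 0.9) -> float:
    """Eqn (6): damped target used as the correct output on D.

    convergence is the arbitrary factor of 0.9 from Eqn (6).
    """
    return convergence * d_i + (1.0 - convergence) * d_o


# Hypothetical item: optimum intervals of 3 and 9 days, current difficulty 3.5.
d_o = guiding_difficulty(3.0, 9.0)
d_opt = target_difficulty(3.5, d_o)
print(d_o, round(d_opt, 4))
```

Because only one tenth of the gap between the current difficulty and the guiding difficulty is closed per repetition, a single noisy measurement moves the training target only slightly.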

Error correction for stability S

The following formula, derived from Eqn (1) for a forgetting index of 10% (R=0.9) and k=1, makes it easy to convert between stability and the optimum interval: I = -ln(0.9)*S

In the optimum case, the network should produce the requested forgetting index for each repetition. A variable forgetting index can easily be accommodated once the stability S is known (see Eqn (1)). For simplicity, we will use a forgetting index of 10% in the further analysis.
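The conversion between stability and interval follows from Eqn (1) with k=1 by setting R to one minus the forgetting index. The sketch below assumes the forgetting index is passed as a fraction (0.10 for 10%); the function names are illustrative.

```python
import math


def interval_from_stability(stability: float, forgetting_index: float = 0.10) -> float:
    """Interval at which retrievability drops to 1 - forgetting_index (k = 1)."""
    return -math.log(1.0 - forgetting_index) * stability


def stability_from_interval(interval: float, forgetting_index: float = 0.10) -> float:
    """Inverse conversion: stability implied by an optimum interval."""
    return -interval / math.log(1.0 - forgetting_index)


s = stability_from_interval(3.0)       # stability behind a 3-day interval at 10%
print(round(s, 2))
print(interval_from_stability(s))      # back to roughly 3 days
print(interval_from_stability(s, 0.05))  # a stricter 5% index shortens the interval
```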

To accelerate convergence, the network will measure the forgetting index for 25 classes of repetitions. These classes are defined by (1) five difficulty categories: 1-1.5, 1.5-2.5, 2.5-3.5, 3.5-5, and over 5, and (2) five interval categories: 1-5, 5-20, 20-100, 100-500, and over 500 days. We will denote the forgetting index measurements for these categories as FI(D_m, I_n), abbreviated FI(m,n) below. Additionally, the overall forgetting index FI_{tot} will be measured and used in stability error correction.
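One possible way to keep these measurements is sketched below. The text does not say which category a boundary value (e.g. difficulty 2.5) belongs to; this sketch assumes it falls into the lower category. The class and function names are hypothetical, and the forgetting index is taken as the percentage of failing grades (below 3).

```python
from bisect import bisect_left
from collections import defaultdict

# Upper bounds of the first four categories; the fifth category is open-ended.
DIFFICULTY_BOUNDS = [1.5, 2.5, 3.5, 5.0]
INTERVAL_BOUNDS = [5, 20, 100, 500]  # days


def repetition_class(difficulty: float, interval: float) -> tuple:
    """Return zero-based category indices (m, n) for a repetition."""
    m = bisect_left(DIFFICULTY_BOUNDS, difficulty)
    n = bisect_left(INTERVAL_BOUNDS, interval)
    return m, n


class ForgettingIndexTable:
    """Counts repetitions and lapses in each of the 25 classes."""

    def __init__(self):
        self.cases = defaultdict(int)
        self.lapses = defaultdict(int)

    def record(self, difficulty: float, interval: float, grade: int) -> None:
        key = repetition_class(difficulty, interval)
        self.cases[key] += 1
        if grade < 3:  # grades below 3 count as forgetting
            self.lapses[key] += 1

    def fi(self, m: int, n: int):
        """Measured forgetting index in percent, or None without any cases."""
        cases = self.cases.get((m, n), 0)
        if cases == 0:
            return None
        return 100.0 * self.lapses.get((m, n), 0) / cases


table = ForgettingIndexTable()
table.record(3.0, 10, grade=4)
table.record(3.0, 10, grade=2)
print(repetition_class(3.0, 10), table.fi(2, 1))
```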

The ultimate goal is to reach the forgetting index of 10% in all categories. The following formula will be used in error correction for stability:

(7) FI_{opt}(m,n) = (10*FI_{tot} + Cases(m,n)*FI(m,n)) / (10 + Cases(m,n))


  • FI_{opt}(m,n) is the forgetting index used in error correction after a repetition belonging to category (m,n)
  • FI_{tot} is the overall forgetting index measured in repetitions
  • FI(m,n) is the forgetting index measured in category (m,n)
  • Cases(m,n) is the number of repetition cases used to measure the forgetting index in category (m,n)

The formula in Eqn (7) is meant to shift the weight in error correction from the overall forgetting index to the forgetting index recorded in a given category as the number of cases in that category increases. For Cases(m,n)=0, we have FI_{opt}(m,n)=FI_{tot}. For Cases(m,n)=10, the overall and category FI are weighted equally, and for a large number of cases, FI_{opt}(m,n) approaches FI(m,n).
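The three limiting cases of Eqn (7) can be checked with a short Python sketch. The function and parameter names are illustrative, and the sample values of 10% and 20% are made up for the demonstration.

```python
def fi_opt(fi_tot: float, fi_category: float, cases: int, prior_weight: float = 10.0) -> float:
    """Eqn (7): blend of the overall and per-category forgetting index.

    prior_weight is the constant 10 from Eqn (7); it acts like 10 virtual
    cases measured at the overall forgetting index.
    """
    return (prior_weight * fi_tot + cases * fi_category) / (prior_weight + cases)


# Overall FI of 10%, category FI of 20% (hypothetical measurements).
for cases in (0, 10, 990):
    print(cases, fi_opt(10.0, 20.0, cases))
```

With no cases the result equals the overall index, with 10 cases it is the midpoint of the two, and with 990 cases it is within a tenth of a percentage point of the category index.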

The following table illustrates the assumed relationship between FI_{opt}(m,n), grades and the interval correction applied:

[Table: interval correction as a function of FI_{opt}(m,n) and grade; the only surviving entries read "no correction"]




In SuperMemo, grades less than 3 are interpreted as forgetting, while grades of 3 or more are understood as sufficient recall. That is why no correction is used for passing grades when FI is satisfactory, and no correction is used for failing grades when FI is lower than requested.
For example, with an excessive forgetting rate, grade=2 and an applied interval of 10 days, the correction would be 80%. Consequently, the network will be instructed to assume Interval=8 as correct. The correct stability would then be derived from S=-8/ln(0.9) and used in error correction.
The values of interval corrections are arbitrary but shall not undermine the convergence of the network. In case of unlikely stability problems, the corrections might be reduced (note that the environmental noise in the learning process will dramatically exceed the impact of ineffectively choosing the correction factors!). Similar corrections used to be applied in successive SuperMemo algorithms with encouraging results.
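The worked example above (a 10-day interval corrected by 80%) can be reproduced with the sketch below. The function name is hypothetical, and the correction factor is passed in as a fraction rather than looked up, because the individual correction values are arbitrary.

```python
import math


def corrected_target(applied_interval: float, correction: float,
                     forgetting_index: float = 0.10) -> tuple:
    """Return (corrected interval, target stability) for error correction.

    correction is the interval correction as a fraction, e.g. 0.8 for 80%.
    The stability follows from I = -ln(1 - FI) * S with k = 1.
    """
    corrected_interval = applied_interval * correction
    stability = -corrected_interval / math.log(1.0 - forgetting_index)
    return corrected_interval, stability


interval, stability = corrected_target(10.0, 0.8)
print(interval, round(stability, 2))
```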

Border conditions

The following additional constraints will be imposed on the neural network to accelerate the convergence:

  • the interval increase (the ratio of two successive intervals) must be at least 1.1 (consequently, difficulty cannot be less than 1.1)
  • the interval increase cannot surpass 8 after the first repetition, and 4 in later repetitions
  • the first interval must fall between 1 and 40 days
  • difficulty measure cannot exceed 8

These conditions will not prejudice the network, as they have been proven beyond reasonable doubt to hold in the practice of using SuperMemo and its implementations over the last ten years.
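The border conditions amount to clamping the network outputs. The sketch below is one reading of the list: the interval is numbered from 1, the limit of 8 is applied to the second interval (the one set after the first repetition), and the limit of 4 to all later ones. The function names and the numbering scheme are assumptions.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed range [low, high]."""
    return max(low, min(high, value))


def constrain_difficulty(difficulty: float) -> float:
    """Difficulty must stay between 1.1 and 8."""
    return clamp(difficulty, 1.1, 8.0)


def constrain_interval(proposed: float, interval_number: int, previous=None) -> float:
    """Apply the border conditions to a proposed interval (in days)."""
    if interval_number == 1:
        return clamp(proposed, 1.0, 40.0)
    if previous is None:
        raise ValueError("previous interval is required after the first interval")
    max_increase = 8.0 if interval_number == 2 else 4.0
    return clamp(proposed, 1.1 * previous, max_increase * previous)


print(constrain_interval(100.0, 1))               # first interval capped at 40 days
print(constrain_interval(50.0, 2, previous=5.0))  # at most 8 times the first interval
print(constrain_interval(50.0, 3, previous=5.0))  # at most 4 times the previous one
```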


Pretraining

In the pretraining stage, the following form of Eqns (2) and (3) will be used:

(8) D_{i+1} := D_i + (0.1 - (5-G)*(0.08 + (5-G)*0.02))

(9) S_{i+1} := S_i * D_i * (0.5 + 1/i)

With D_1 = 3.5 and S_1 = -3/ln(0.9).

Eqn (8) has been derived from Algorithm SM-2 (see E-Factor equation).
Eqn (9) has been roughly derived from Matrix OF in Algorithm SM-8.
D_1 = 3.5 corresponds with the same setting in Algorithm SM-8.
S_1 = -3/ln(0.9) corresponds with the first interval of 3 days and a forgetting index of 10%. The value of 3 days is close to an average across a wide spectrum of students and difficulty of the learning material.

Pretraining will also use the border conditions described in the previous section.
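A minimal sketch of how pretraining targets could be generated from Eqns (8) and (9) is given below. It applies the two equations only; the border conditions are left out for brevity. The function names and the sample grade sequence are assumptions.

```python
import math

D1 = 3.5                      # initial difficulty
S1 = -3.0 / math.log(0.9)     # stability behind a 3-day first interval at FI = 10%


def pretrain_step(i: int, s_i: float, d_i: float, grade: int) -> tuple:
    """Return (D_{i+1}, S_{i+1}) for the i-th repetition, per Eqns (8) and (9)."""
    q = 5 - grade
    d_next = d_i + (0.1 - q * (0.08 + q * 0.02))
    s_next = s_i * d_i * (0.5 + 1.0 / i)
    return d_next, s_next


def pretrain_sequence(grades) -> list:
    """Generate (difficulty, stability) pairs for a sequence of grades."""
    d, s = D1, S1
    history = [(d, s)]
    for i, grade in enumerate(grades, start=1):
        d, s = pretrain_step(i, s, d, grade)
        history.append((d, s))
    return history


# Hypothetical grade history: 4 in the first repetition, 5 in the second.
for d, s in pretrain_sequence([4, 5]):
    print(round(d, 2), round(s, 1))
```

A grade of 4 leaves difficulty unchanged, a grade of 5 raises it by 0.1, and lower grades reduce it, which mirrors the E-Factor behavior the equation is derived from.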