For over three decades, SuperMemo World has been the unequivocal leader in spaced repetition research. Our algorithms are the product of a continuous, scientifically driven refinement process, grounded in data from millions of learning sessions. Recently, we have observed a concerning trend: the proliferation of false benchmarks and misleading claims in the public discourse, primarily surrounding the open-source FSRS algorithm. It is time to inject some rigorous, scientific clarity into this conversation. The current narrative is not just incorrect; it is methodologically flawed. You cannot win a race by moving the finish line.
The widely circulated claims of FSRS superiority, which have even made their way into a PhD thesis, are built upon a foundation of flawed comparisons. These claims rely on machine learning metrics such as Log Loss or AUC, which are useful for model training but entirely inappropriate for measuring the calibration precision of a spaced repetition algorithm.
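To make the distinction concrete, here is a minimal toy illustration (not the evaluation code of SuperMemo or FSRS; the function names and data are invented for this sketch). Log Loss rewards sharp per-review probability estimates, while a calibration-style error compares predicted recall against the recall rate actually observed, so the two metrics can rank the same predictions differently:

```python
import math

def log_loss(probs, outcomes):
    """Mean negative log-likelihood of binary recall outcomes (1 = recalled)."""
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for p, y in zip(probs, outcomes)) / len(probs)

def calibration_error(probs, outcomes):
    """Gap between mean predicted recall and observed recall rate, in percent."""
    return 100.0 * abs(sum(probs) / len(probs)
                       - sum(outcomes) / len(outcomes))

# Toy data: four reviews with predicted recall probabilities and outcomes.
preds = [0.9, 0.9, 0.7, 0.7]
outs = [1, 1, 1, 0]
print(log_loss(preds, outs))           # fit of individual predictions
print(calibration_error(preds, outs))  # aggregate calibration gap, in %
```

A model can score well on one of these measures and poorly on the other, which is the crux of the argument that training losses are not performance benchmarks.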
We have spent decades developing our algorithms, and we are now refining the only benchmark that matters: the Universal Metric.
This metric measures one thing, and one thing only: the difference between an algorithm’s predicted memory stability and the empirically observed stability. A perfect algorithm would score 0%. There are no excuses, no alternative interpretations. It is the uncompromising measure of truth in our field.
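The idea can be sketched in a few lines. The exact formula of SuperMemo's Universal Metric is not given here, so the function below is a hypothetical rendering of the stated definition: the average relative gap between predicted and empirically observed stability, expressed as a percentage, with 0% for a perfect algorithm:

```python
def universal_metric(predicted, observed):
    """Illustrative percent deviation between predicted and empirically
    observed memory stability (hypothetical form of the metric; the
    exact SuperMemo formula is not specified in this article)."""
    assert len(predicted) == len(observed) and observed
    total = 0.0
    for p, o in zip(predicted, observed):
        total += abs(p - o) / o  # relative error for one item
    return 100.0 * total / len(predicted)  # average, as a percentage

# A perfect algorithm scores 0%:
print(universal_metric([10.0, 30.0], [10.0, 30.0]))  # 0.0
```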
When we apply this Universal Metric to high-quality learning data, the results are unequivocal and speak for themselves:
- Algorithm SM-19: 1-3%
- Algorithm SM-20 (a new AI algorithm based on SuperMemo expert knowledge about the brain and memory, still work in progress): approaching SM-19, even close to 0% on well-structured material
- FSRS (optimized): 15-20%
Let’s be perfectly clear: a 15-20% error rate is not “superior.” It is not “competitive.” It is a definitive indicator of a significant gap in predictive accuracy. The suggestion that these results are comparable is a scientific meme that needs to be retired immediately.
The flaws in the prevailing comparisons are not minor quibbles; they are fundamental:
- They use the wrong baseline. Early comparisons benchmarked FSRS against ‘R(SM17)(exp)’, a theoretical exponential approximation, not the actual, dynamic predictions of the SM-17 algorithm. This is like claiming to beat a world-class athlete by racing their statue.
- They use a minuscule dataset. Drawing grand conclusions from 16 collections is statistically myopic. Our work is validated by the learning journeys of millions.
- They ignore real-time optimization. Modern SuperMemo algorithms instantly adapt their model with every repetition.
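The third point, per-repetition adaptation, can be illustrated with a conceptual sketch. This is not SuperMemo's actual update rule (the class and parameters below are invented for illustration); it only shows the shape of online optimization, where the stability estimate is nudged toward each new observation the moment it arrives, instead of waiting for a batch re-optimization pass:

```python
class OnlineStabilityModel:
    """Conceptual sketch of real-time adaptation: one incremental
    update per repetition, rather than periodic batch refitting."""

    def __init__(self, initial_stability=1.0, learning_rate=0.2):
        self.stability = initial_stability  # current estimate (e.g. days)
        self.lr = learning_rate             # fraction of each correction applied

    def update(self, observed_stability):
        # Move the estimate a fraction of the way toward the observation.
        self.stability += self.lr * (observed_stability - self.stability)
        return self.stability

model = OnlineStabilityModel()
model.update(2.0)  # the estimate shifts immediately after one repetition
```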
We do not dismiss the work behind FSRS. It is a commendable open-source effort and a marked improvement over ancient algorithms like SM-2. However, claiming it surpasses 40 years of dedicated research is not just an overreach; it is a disservice to learners seeking the most effective tools.
Call to Action: we challenge the entire spaced repetition community, including researchers, developers, and serious learners, to adopt the Universal Metric as the sole sound benchmark for algorithmic performance. Stop comparing apples to oranges. Stop using training metrics as performance indicators.
The data above, from preliminary tests, show SM-19 outperforming the current version of FSRS by a significant margin. We are so confident in our position that we have committed to integrating FSRS into a future version of SuperMemo for a direct, side-by-side comparison under identical conditions, measured fairly by the Universal Metric. SM-20 will be part of SuperMemo for Windows, the SuperMemo.com web service and apps, as well as a public API for global dissemination.
The mission of SuperMemo World has always been to minimize the time spent learning and maximize lifelong knowledge. That mission is built on a foundation of scientific integrity, not on winning popularity contests with flawed methodologies.
The path forward is clear. Embrace rigorous benchmarks; demand objective comparisons. Let the Universal Metric guide the future of spaced repetition, so we can all focus on what truly matters: the quality of learning and its new applications in science. Deep understanding of human memory formation processes is a key to overcoming the barriers of continuous learning in artificial neural networks, and a first step toward superintelligence.
Krzysztof Biedalak, CEO