Argument · two layers, one schedule

FSRS schedules the card. The rubric decides whether seeing it teaches anything.

Most Anki + FSRS guides obsess over desired retention, parameter optimization, and the right learning steps to feed the DSR model. None of that fixes a card that you answer in 1.2 seconds because you recognized the first ten words of the stem. FSRS will schedule pattern-match-on-stem as faithfully as it schedules concept recall, mark you as 90 percent retained, and quietly hand you exam-day failure when the wording rotates.

Deep mastery is two layers stacked. The MCQ rubric layer makes sure the card is testing the concept, not the stem wording. The FSRS layer makes sure you see it at the right moment. Tuning one without the other is the most common reason a 1500-card deck reads as mastered and tests as 60 percent.

Matthew Diakonov
11 min read

Direct answer · verified 2026-05-07

How MCQ Anki cards work with FSRS for deep mastery

FSRS handles scheduling: a per-card DSR triple (Difficulty, Stability, Retrievability), updated by a model whose parameters are fit by gradient descent on your review history, with a default desired retention of 0.90. The MCQ rubric handles retrieval quality: source-anchored stems, length-matched and grammar-parallel distractors, no filler templates, and stem rephrasing on revisit so each surfacing tests the concept rather than the wording. Deep mastery requires both. FSRS without rubric controls schedules whatever the card is teaching, including pattern-match. Rubric controls without FSRS lose the long-tail retention.

Authoritative reference for FSRS in Anki: docs.ankiweb.net/deck-options.html. Authoritative reference for the underlying algorithm: github.com/open-spaced-repetition/fsrs4anki.

Why FSRS alone cannot get you to deep mastery

FSRS sees two pieces of information per review: the rating you gave (Again, Hard, Good, Easy) and the time elapsed since the last review. From that, plus your historical ratings on this card and the rest of the collection, it updates the DSR triple and predicts the next interval that maintains your desired retention probability. It is a very good algorithm at the job it has. The job it has is not "measure whether your retrieval was conceptual or pattern-based".

That distinction matters because the rating you click is your own self-report, and the self-report degrades when the card is a verbal lookup. A static MCQ stem you have seen five times becomes a unique key into your mental cache. You see the first ten words, return the cached answer in 1.2 seconds, click Good, and move on. FSRS updates the DSR triple. Stability rises. The interval extends. The deck statistics report 92 percent predicted retention. None of that prevents a different verbal surface (the test) from missing your cache and finding nothing under it.

The fix is not at the FSRS layer. Raising desired retention from 0.90 to 0.97 can double or triple your daily review load without making any individual rating more honest. The fix is upstream, at the layer that decides what each card looks like when it surfaces.
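
To make the cost of a higher retention target concrete, here is a minimal sketch of how the interval shrinks as desired retention rises. It assumes the power forgetting curve published for FSRS-4.5 (decay of -0.5, factor of 19/81); newer FSRS versions make the decay trainable, so treat the constants and the function names as illustrative rather than as what your Anki build runs.

```python
# Assumed constants: the FSRS-4.5 forgetting curve. Newer versions fit the decay.
DECAY = -0.5
FACTOR = 19 / 81


def retrievability(elapsed_days: float, stability: float) -> float:
    """Predicted recall probability after elapsed_days for a card with this stability."""
    return (1 + FACTOR * elapsed_days / stability) ** DECAY


def next_interval(stability: float, desired_retention: float) -> float:
    """Days until predicted recall falls to desired_retention (inverse of the curve)."""
    return stability / FACTOR * (desired_retention ** (1 / DECAY) - 1)


if __name__ == "__main__":
    # Same card (stability of 10 days), three different retention targets.
    for target in (0.90, 0.95, 0.97):
        print(target, round(next_interval(10.0, target), 2))
```

Under this curve, stability is by definition the interval at which recall sits at 0.90, so the 0.90 row returns the stability itself and the higher targets return a fraction of it. Shorter intervals do not map one-to-one onto review counts, but the direction is the same: more surfacings of the same wording.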

The four layers of deep mastery

The first three are the MCQ rubric layer broken into its parts, and they run before FSRS ever sees the card. The fourth is FSRS itself. Most guides cover the fourth and skip the first three; that is the gap this page is built around.

Two layers stacked, one honest schedule

Layer 1 · the card is built from a real source span

A card that is not anchored to your professor's slide is a card that drifts to whatever the model's pretraining said. FSRS will schedule it just the same. The first layer of deep mastery is that every front-back pair traces to a cited span in the upload. If the answer cannot be cited, the card is not emitted.

Layer 2 · the distractors are length-matched and parallel

If three options are short and one is a paragraph, you pick the long one without reading. If the options are 'a kidney', 'kidneys', and 'an renal pelvis', you pick the only one whose article and number agree with the stem. FSRS sees Good in 1 second and pushes the interval. The card is now 'mastered' on the basis of grammar tells. Length-matching and grammar parallelism close that loophole at the rubric layer, before FSRS gets the rating.
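
A rough sketch of what a length-and-grammar gate could look like. The function names, the 25 percent tolerance measured against the longest option, and the crude first-letter article check are all assumptions made for illustration; this is not a description of Studyly's actual gate.

```python
VOWELS = set("aeiou")


def length_spread(options: list[str]) -> float:
    """Relative gap between the longest and shortest option (0.0 means equal length)."""
    lengths = [len(option.strip()) for option in options]
    longest = max(lengths)
    return (longest - min(lengths)) / longest if longest else 0.0


def is_length_matched(options: list[str], tolerance: float = 0.25) -> bool:
    """True when no option stands out by length alone (tolerance is an assumed cutoff)."""
    return length_spread(options) <= tolerance


def article_mismatches(stem: str, options: list[str]) -> list[str]:
    """Options that disagree with a trailing 'a' or 'an' in the stem.

    Crude heuristic: it looks only at the first letter, so it misjudges
    words like 'hour' or 'unit'.
    """
    words = stem.rstrip(" :_.").split()
    article = words[-1].lower() if words else ""
    if article not in ("a", "an"):
        return []
    wants_vowel = article == "an"
    return [o for o in options if (o.strip()[:1].lower() in VOWELS) != wants_vowel]
```

A card that fails either check gets rewritten or dropped before export, so the rating FSRS receives cannot have been earned by a length tell or an article tell.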

Layer 3 · the stem rephrases on revisit

Same concept, different wording, every time the card surfaces. Day-1 wording is the original. Day-8 wording is a paraphrase. Day-23 wording is a clinical vignette of the same underlying mechanism. The rating you give FSRS is now a measurement of concept recall, not stem recall. FSRS does not need to know any of this; it just sees a more honest signal.
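
The rotation can be sketched as a card that holds several stems for one fixed answer and hands out the next one on each surfacing. The class and field names below are hypothetical, and a simple round-robin stands in for whatever rephrasing a real generator performs.

```python
from dataclasses import dataclass


@dataclass
class ConceptCard:
    """One concept, one answer, several verbal surfaces (hypothetical structure)."""
    concept: str
    answer: str
    stems: list[str]      # ordered variants; the original wording comes first
    surfacings: int = 0   # how many times the card has been shown so far

    def next_stem(self) -> str:
        """Return the stem for this surfacing and advance the rotation."""
        stem = self.stems[self.surfacings % len(self.stems)]
        self.surfacings += 1
        return stem


card = ConceptCard(
    concept="inferior MI",
    answer="Right coronary artery",
    stems=[
        "Original stem (day 1)",
        "Paraphrased stem (day 8)",
        "Clinical vignette stem (day 23)",
    ],
)
```

The scheduler never sees `stems`; it only receives the rating, which is why the rotation can live entirely outside FSRS.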

Layer 4 · FSRS decides when the next surfacing happens

Default desired retention 0.90. The DSR triple updates after each rating. The interval to the next review falls out of 'what spacing keeps me at 90 percent recall on this specific card given my actual rating history'. Layers 1 through 3 made sure the rating is honest; FSRS handles the timing. Two layers of work, one sane review schedule.

The same deck, two outcomes

Identical underlying concept, identical FSRS parameters, identical desired retention. The only variable is whether the stem rotates between surfacings. The trace below shows the static-stem outcome for a realistic inferior MI card; the dates illustrate typical FSRS intervals at a 0.90 desired retention on a stable card.

# Day 1, MCQ surfaces in Anki
Q: A 58-year-old man presents with crushing substernal chest pain radiating to the jaw. ECG shows ST elevation in leads II, III, and aVF. Which coronary artery is most likely occluded?
A) Left anterior descending
B) Left circumflex
C) Right coronary artery
D) Left main

You think for 9 seconds, retrieve "inferior MI -> RCA", pick C.
You rate Good. FSRS extends the interval.

# Day 8, FSRS resurfaces the same card, identical wording
Q: A 58-year-old man presents with crushing substernal chest pain radiating to the jaw. ECG shows ST elevation in leads II, III, and aVF. Which coronary artery is most likely occluded?

You read "58-year-old man, crushing substernal..."
Your brain returns C in 1.2 seconds without re-running the inference.
You rate Good (or even Easy). FSRS extends to day 23.

# Day 60, exam day, the test stem is rephrased
Q: A 61-year-old smoker has acute onset chest discomfort. ECG shows new ST elevation in II, III, and aVF with reciprocal changes in I and aVL. Which vessel is the most likely culprit?

You stare. The stem-cache does not match. The concept never landed. You guess.

  • Day 8 retrieval is a 1.2-second cache hit, not concept recall
  • Good rating is honest about the cache hit, dishonest about mastery
  • FSRS extends the interval based on the cache-hit rating
  • Exam-day rephrasing fails because the concept never landed

What to actually set in Anki for FSRS

Almost nothing. The defaults are good. The work is at the card-quality layer, not the algorithm layer. The block below is a typical FSRS preset for a med, dental, nursing, or pharmacy collection that gets a steady stream of imported decks.

anki-deck-options.fsrs
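
A minimal sketch of such a preset, written as a plain Python dictionary rather than in Anki's own options format. The key names are hypothetical labels; the values are the ones this page recommends.

```python
# Hypothetical key names; values follow the recommendations on this page.
FSRS_PRESET = {
    "fsrs_enabled": True,
    "desired_retention": 0.90,             # the FSRS default; tune last, not first
    "optimize_after_reviews": 1000,        # roughly when the first optimizer run is worthwhile
    "reoptimize": "monthly, or after a large import",
    "separate_preset_per_imported_deck": False,
}


def should_optimize(review_count: int, preset: dict = FSRS_PRESET) -> bool:
    """True once the preset has enough review history for a first optimizer run."""
    return bool(preset["fsrs_enabled"]) and review_count >= preset["optimize_after_reviews"]
```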

The desired-retention number gets disproportionate attention online. It is the right place to start tuning only after you have audited the cards themselves; on a deck where the stems are static and the distractors are uneven, raising desired retention to 0.95 just makes you re-cache the wording more often.

Where each layer is doing its work

The table reads top to bottom across the two layers: what FSRS sees, what FSRS does not see, and where the rubric layer fills in. The point is not that FSRS is weak; the point is that FSRS is surgical about the one job it does and silent about the jobs it does not.

| Feature | Static cards into FSRS | Studyly cards into FSRS |
| --- | --- | --- |
| What FSRS optimizes | The interval to the next review, conditional on your Again/Hard/Good/Easy ratings. | Same. FSRS does not change. The MCQ rubric runs upstream of FSRS, not as a replacement. |
| What FSRS does NOT see | Whether your retrieval was concept-based or stem-pattern-based. Whether the card was a length-tell or a grammar-tell. Whether the source was your slide or model pretraining drift. | Same blind spot. The MCQ rubric layer feeds FSRS a more honest rating signal so the scheduling is on real retention, not surface retention. |
| Default desired retention | 0.90 across all cards in the preset. Tunable per preset, not per card. | 0.90 is the right starting point on Studyly-imported decks too. The retention number is downstream of card quality; raising it does not fix bad cards. |
| When the optimizer should run | After ~1000 reviews in the preset. Re-run monthly or after a large import. | Same. Importing a Studyly .apkg is a 'large import'; let the optimizer re-fit before reading too much into the predicted intervals on the new cards. |
| Day-1 wording | Whatever the generator wrote. Static for the lifetime of the card. | Original Studyly stem. Persists in Anki as the canonical Front field. |
| Day-8 wording | Identical to day 1. FSRS surfaces the same characters. | Rephrased variant if reviewing inside Studyly, or a re-exported .apkg if reviewing in Anki. The concept is the constant; the surface rotates. |
| Risk of false-mastery feedback to FSRS | High. The rating reflects stem-cache hits, FSRS extends intervals, the deck reads as 90 percent retained while exam-day retrieval fails. | Lower. Each surfacing forces re-encoding; the rating reflects concept retrieval; FSRS extends intervals on more honest data. |
| Held-out card-quality eval (orthogonal to FSRS) | Field average 67.9 across post-hoc-rubric tools (Unattle 78.0, Gauntlet 68.0, Turbolearn 57.8). | Studyly 81.3 on factual correctness, clarity, distractor quality, question-type coverage. Methodology at studyly.io/quality. |

Auditing your own deck for the false-mastery loop

If you suspect FSRS is reporting higher retention than your test scores back up, the diagnostic lives in the gap between predicted and true retention plus a card-level inspection of twenty Goods. Five checks; you can do them in 30 minutes.

Five-check FSRS-deck audit

  • Sample 20 cards you rated 'Good' over the last 7 days. For each, check whether you can answer the question if the first 10 words of the stem are removed. If you cannot, the rating was a stem-cache hit, not concept recall.
  • On the same 20 cards, check whether all four options are within 25 percent of each other in character length. Length tells inflate Good ratings and corrupt FSRS scheduling.
  • Check whether any card in the sample has 'all of the above', 'none of the above', 'both A and B', or 'it depends' as an option. One filler template per 200 cards is enough to teach you the wrong heuristic.
  • Check whether each card has a SourceFile and SourcePage field populated. If the source is missing, you cannot verify a wrong answer in less than 90 seconds, which means you will not verify it at all.
  • Open Anki Stats > FSRS Stats. Compare 'Predicted retention' against 'True retention' over the last 30 days. If predicted is consistently higher than true, your ratings are over-optimistic and the scheduling is drifting; that is the rubric layer leaking into the FSRS layer.
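
The mechanical parts of the checks above can be sketched in a few lines. The note is assumed to be a plain dictionary with hypothetical `stem` and `options` keys alongside the SourceFile and SourcePage fields named in the list. Whether you can still answer the truncated stem remains a judgment you make yourself.

```python
FILLER_OPTIONS = {"all of the above", "none of the above", "both a and b", "it depends"}


def truncated_stem(stem: str, skip_words: int = 10) -> str:
    """The stem with its first skip_words words removed, for the cache-hit check."""
    return " ".join(stem.split()[skip_words:])


def has_filler_option(options: list[str]) -> bool:
    """True if any option is one of the filler templates listed above."""
    return any(o.strip().lower().rstrip(".") in FILLER_OPTIONS for o in options)


def has_source(note: dict) -> bool:
    """True when both SourceFile and SourcePage are populated."""
    return all(str(note.get(key, "")).strip() for key in ("SourceFile", "SourcePage"))


def audit(note: dict) -> dict:
    """Collect the mechanical findings for one note (a hypothetical dict layout)."""
    return {
        "filler_option": has_filler_option(note["options"]),
        "source_missing": not has_source(note),
        "stem_without_lead": truncated_stem(note["stem"]),
    }
```

Run it over the twenty sampled notes and read the `stem_without_lead` values yourself; the flags only tell you where to look.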

On the held-out three-document eval (factual correctness, clarity, distractor quality, question-type coverage), Studyly cards score 81.3 against a field average of 67.9. The eval is orthogonal to FSRS; it measures whether the cards FSRS will be scheduling are worth scheduling. Both layers matter; this is the upstream one.

Held-out three-document eval, May 2026 · methodology at studyly.io/quality

The honest workflow if you want both layers

Generate the deck against your professor's actual slide deck or PDF, not a generic web question bank. The generator runs the four in-flight rubric gates per card (source-anchoring, length-matching, filler ban, grammar parallelism) before emission. Export the surviving cards as an .apkg. Import into Anki on a preset with FSRS enabled and desired retention at 0.90. Let FSRS schedule.

For the cards where rote wording is the point (drug generic-to-brand mappings, anatomical structure names, Latin terms), let the static .apkg ride. FSRS handles those well; rephrasing would not add value. For the cards where the underlying concept matters more than the surface (mechanism questions, clinical vignettes, differential reasoning), do your reviews inside Studyly where the stem auto-rephrases on each surfacing, or re-export periodically so a fresh stem variant lands in your Anki collection. Either way, the rating you give back is more honest, and the FSRS scheduling lands on true retention rather than cached retention.

Run the optimizer once a month. Re-audit twenty Goods monthly. Adjust desired retention only if the predicted-versus-true gap is wider than 3 percentage points and you have ruled out the rubric layer as the cause.
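
The 3-point rule can be written down as a small check. This is a simplified sketch: it treats true retention as the share of reviews not rated Again, uses Anki's 1 to 4 button numbering for ratings, and takes the predicted figure as a number you read off the stats screen. The function names are illustrative.

```python
AGAIN = 1  # Anki button numbering: 1=Again, 2=Hard, 3=Good, 4=Easy


def true_retention(ratings: list[int]) -> float:
    """Share of reviews that were not rated Again (a simplified definition)."""
    if not ratings:
        raise ValueError("need at least one review")
    passed = sum(1 for rating in ratings if rating != AGAIN)
    return passed / len(ratings)


def retention_gap(predicted: float, ratings: list[int]) -> float:
    """Predicted minus true retention; positive means the schedule is over-optimistic."""
    return predicted - true_retention(ratings)


def should_investigate(predicted: float, ratings: list[int], threshold: float = 0.03) -> bool:
    """True when the gap exceeds the 3-percentage-point threshold from the text."""
    return retention_gap(predicted, ratings) > threshold
```

A True result is a reason to re-audit the cards first; adjusting desired retention comes only after the rubric layer has been ruled out.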

Related reading

The four-gate generation rubric, in detail, with a ChatGPT prompt template: Most Anki rubrics run too late: move them upstream of emission.

The five distractor failure modes a length-and-grammar gate is built to catch: Anki card distractor quality.

The Source field that lets you verify a wrong answer inside Anki in 15 seconds: PDF to Anki cards: source citations on every note.

The published methodology behind the 81.3 number: studyly.io/quality.

Try the upstream layer

One lecture, exported as .apkg, scheduled by FSRS

Free tier on app.jungleai.com, no credit card. Upload one lecture, get an .apkg with rubric-gated MCQ cards and source citations baked in, drop it into your existing FSRS preset. Audit twenty Goods after a week of reviews. The gap between predicted and true retention will tell you which layer was the bottleneck.

Common questions about Anki, MCQ cards, FSRS, and deep mastery

What is FSRS in Anki and how is it different from the old SM-2 algorithm?

FSRS stands for Free Spaced Repetition Scheduler. It shipped natively in Anki 23.10 (October 2023) and is the default for new collections in current Anki releases. The model has three latent variables per card (Difficulty, Stability, Retrievability, the DSR triple) and is fit by gradient descent on your own review history. SM-2 is the older Anki default, a fixed heuristic with an ease-factor that drifts under bad ratings. The practical difference: FSRS asks 'what interval keeps this card at my chosen retention probability', SM-2 asks 'what does the ease factor say to multiply the last interval by'. FSRS is more efficient at the same retention target and recovers from rating mistakes more gracefully.

What is desired retention in FSRS and what should I set it to for deep mastery?

Desired retention is the probability that you will recall a card on review day. The FSRS default is 0.90 (90 percent). Pushing it to 0.95 or 0.97 buys higher single-review accuracy at the cost of roughly 1.5x to 3x more reviews per day. Pushing it down to 0.85 saves reviews but means you will forget more. The honest answer for medical, dental, nursing, pharmacy, vet exams: leave it at 0.90 unless you have empirical evidence (from the Anki FSRS evaluator) that a different value works better for your collection. The bigger lever for deep mastery is upstream of FSRS, in whether the card itself is testing concept recall or just stem-pattern recall.

Can FSRS tell whether I really know a card or whether I'm pattern-matching the stem?

No. FSRS sees one signal per review: your Again/Hard/Good/Easy rating. It cannot see WHY you got the card right. If the stem reads 'The patient presents with crushing substernal chest pain and ST elevation in II, III, aVF...' and you have seen the exact wording five times, you will rate it Good in 1.2 seconds because the first ten words are a unique key into your mental cache, not because you understand inferior MI. FSRS will then push the interval out, mark you as mastering it, and you will fail the same concept on test day when the wording changes. This is the failure mode the MCQ rubric layer exists to catch.

How does auto-rephrasing on revisit help with FSRS scheduling?

Auto-rephrasing means the stem is rewritten on each surfacing while the underlying concept and answer stay fixed. The day-1 stem might be 'inferior wall MI typically presents with ST elevation in which leads', the day-8 stem might be 'a 58-year-old with crushing chest pain shows ST changes in II, III, and aVF, which artery is most likely occluded'. Same fact, different verbal surface. When FSRS surfaces the card on the day-8 interval, your retrieval has to actually reach the concept; the stem-cache is not enough. The Again/Hard/Good/Easy rating you give back is now measuring concept recall, which is what FSRS thinks it is measuring. Without rephrasing, the rating is measuring a verbal lookup, and FSRS schedules accordingly.

Does Anki support MCQ-style cards out of the box?

Anki ships with Basic and Cloze note types and not much else. You can build an MCQ template by hand: add four option fields, write the front template to render the stem and the four options as a list, and write the back template to highlight the correct one. The .apkg files Studyly exports include a custom note type (studyly_mcq) with the four-option fields, an Explanation field, and Source fields, plus the front and back templates pre-wired. Once imported, you get MCQ cards as native Anki notes, schedulable by FSRS like any other card.

Where does the rephrasing happen if my cards live in Anki and Anki only sees the static fields?

Two options. (1) Re-export from Studyly periodically; the export carries a fresh stem variant on each generation, so syncing a new .apkg into the same Studyly-namespaced deck rotates the wording. The cards keep their FSRS scheduling state because the import matches on the studyly_card_id field. (2) Use the Studyly app for review instead of Anki for the cards you want stem-rotation on, and let Anki handle the cards where the rote wording is the point (drug names, vocabulary, anatomy labels). The two are complementary, not exclusive.

What FSRS settings do I actually need to change to make this work?

Almost none. Default desired retention 0.90, FSRS on, the optimizer run once after you have around 1000 reviews in the preset (Deck Options, FSRS, Optimize). If you import a Studyly .apkg into an existing collection, leave the new deck on the inherited preset; do not create a separate FSRS preset per imported deck. FSRS parameters are fit across all the cards that share a preset, so more reviews per preset means a tighter fit. The full Anki documentation on this lives at docs.ankiweb.net/deck-options.html.

Will FSRS work for image-occlusion anatomy cards generated from my slide deck?

Yes. FSRS treats every card the same: a sequence of ratings with timestamps, a DSR fit, a next-interval prediction. Image-occlusion cards on the studyly_image_occlusion note type schedule under FSRS exactly like an MCQ note. The mask varies between cards in a set (different nerves masked from the same brachial-plexus diagram); each mask is its own card with its own DSR triple. Deep mastery here means rating yourself on whether you could name the structure unprompted, not whether you can recognize it once revealed.

What about cramming the day before an exam, does FSRS help with that?

Cramming is fundamentally outside what FSRS optimizes for. FSRS optimizes the long-term review schedule given a target retention; the day before an exam, the schedule is already as compressed as it gets, and you are review-density-bound, not algorithm-bound. The relevant question on cram day is whether the cards you are drilling test what the exam will test. A 200-card deck with strong distractors and rotating stems gives you a tighter signal in two hours than a 400-card deck of recognition-only basic cards. The deep-mastery framing on this page applies to the eight days before exam day; on day-of, just retrieve.

How is the Studyly leaderboard score related to FSRS at all?

It is not, directly. The 81.3 score on the held-out three-document eval measures four things about the cards themselves (factual correctness, clarity, distractor quality, question-type coverage). FSRS measures none of those. The point of mentioning both on the same page is that they sit on different layers: FSRS optimizes the schedule of a deck, the eval measures the quality of a deck, and deep mastery requires both to be strong at once. Methodology for the eval is at studyly.io/quality.