Mondrian Conformal Frame-Time Gating
What goes wrong with a naive approach
Vanilla conformal prediction assumes residuals are exchangeable across the whole stream. For frame time, that is blatantly false. A frame that rendered an inline prompt has a completely different cost profile from a frame that rendered a full-screen table. Pooling them gives a quantile dominated by the hard cases: the easy frames are heavily over-covered, while the hard ones can fall below the nominal coverage level.
Three failure modes pop up immediately:
- Cross-regime pollution. A 20 ms outlier from a scroll burst raises the global quantile forever; steady-state frames are gated far too conservatively.
- Insufficient samples per regime. You have thousands of steady frames and ten scroll-burst frames. A per-bucket quantile over 10 points is useless.
- Silent mispicks. The diff-strategy picker changes which strategy it uses as the workload drifts; the conformal layer has no way to know that the calibration set just became obsolete.
The right answer is Mondrian conformal prediction (Vovk, 2003): partition the calibration set into buckets that are conditionally exchangeable, quote per-bucket quantiles, and fall back up a hierarchy when a bucket is too sparse.
Mental model
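Partition frames by a tuple of context keys:
$$b = (\text{mode},\ \text{diff\_strategy},\ \text{size\_class})$$
Keep a separate residual window per $b$. On each frame:
- Look up the bucket’s window.
- If it holds at least $n_{\min}$ samples, use its quantile.
- Else fall back to $(\text{mode}, \text{diff\_strategy}, \star)$: any size class.
- Else fall back to $(\text{mode}, \star, \star)$: any strategy.
- Else fall back to $(\star, \star, \star)$: global.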
The fallback hierarchy is crucial: the system stays calibrated and always has a bound to quote, even in corner buckets that have never been hit.
Mondrian is conformal prediction plus a coarsening operator. Fine-grained buckets give precise bounds where you have data; the coarsening hierarchy gives you a valid bound where you don’t. The same theorem covers both levels.
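The lookup order described above can be sketched as follows. This is a minimal illustration, not the crate's implementation: the `Key` alias, `fallback_chain`, and `resolve` are hypothetical names, `None` stands in for the ★ wildcard, and the map holds only sample counts rather than real residual windows.

```rust
use std::collections::HashMap;

/// Hypothetical bucket key; `None` plays the role of the ★ wildcard.
type Key = (Option<&'static str>, Option<&'static str>, Option<&'static str>);

/// Coarsening chain for one fully specified key, finest bucket first.
fn fallback_chain(mode: &'static str, diff: &'static str, size: &'static str) -> [Key; 4] {
    [
        (Some(mode), Some(diff), Some(size)), // level 0: exact bucket
        (Some(mode), Some(diff), None),       // level 1: any size class
        (Some(mode), None, None),             // level 2: any strategy
        (None, None, None),                   // level 3: global
    ]
}

/// Returns (fallback_level, key) for the first bucket holding at least
/// `min_samples` residuals, or `None` when even the global window is short
/// (the case in which the fallback budget would be quoted instead).
fn resolve(
    counts: &HashMap<Key, usize>,
    mode: &'static str,
    diff: &'static str,
    size: &'static str,
    min_samples: usize,
) -> Option<(usize, Key)> {
    let chain = fallback_chain(mode, diff, size);
    for (level, key) in chain.iter().enumerate() {
        if counts.get(key).copied().unwrap_or(0) >= min_samples {
            return Some((level, *key));
        }
    }
    None
}
```

The returned level corresponds to the kind of value a `fallback_level` field would report: 0 for the exact bucket, 3 for the global window.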
The math
Per-bucket quantile
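For each bucket $b$ with calibration residuals $R_b = \{r_1, \dots, r_{n_b}\}$, if $n_b \ge n_{\min}$:
$$\hat q_b = \text{the } \lceil (1-\alpha)(n_b+1) \rceil\text{-th smallest element of } R_b$$
Otherwise fall through to the parent bucket.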
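A minimal sketch of the per-bucket quantile, assuming the standard split-conformal rank $\lceil (1-\alpha)(n+1) \rceil$ over the bucket's residuals. The function name `conformal_quantile` and the use of `f64` microseconds are assumptions for illustration; the source does not show how the crate stores residuals.

```rust
/// Split-conformal quantile: the ceil((n + 1) * (1 - alpha))-th smallest
/// residual. Returns `None` when that rank exceeds n, i.e. when the window
/// is too short to yield a finite bound at this alpha.
fn conformal_quantile(residuals_us: &[f64], alpha: f64) -> Option<f64> {
    let n = residuals_us.len();
    if n == 0 {
        return None;
    }
    // 1-based rank of the order statistic to quote.
    let rank = ((n as f64 + 1.0) * (1.0 - alpha)).ceil() as usize;
    if rank > n {
        return None;
    }
    // Sort a copy so the caller's window keeps its arrival order.
    let mut sorted = residuals_us.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    Some(sorted[rank - 1])
}
```

With $\alpha = 0.05$ the rank fits inside the window only once $n \ge 19$, so a 10-point bucket cannot produce a finite quantile at all under this rule.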
Fallback hierarchy
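$$(\text{mode}, \text{diff\_strategy}, \text{size\_class}) \to (\text{mode}, \text{diff\_strategy}, \star) \to (\text{mode}, \star, \star) \to (\star, \star, \star)$$
At the root, if even the global window is short, quote the fallback budget (default 16 000 µs), a hard engineering constant beyond which we always degrade.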
Upper bound on frame time
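For a frame with predicted time $\hat y$ in bucket $b$:
$$U = \hat y + \hat q_b$$
where $\hat q_b$ is the quantile of the bucket (or of the parent it fell back to).
Gating rule: if $U > \text{budget}$, trigger degradation. See control theory for what degradation looks like.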
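The bound and gating rule amount to one addition and one comparison. The sketch below assumes an additive margin in microseconds; `Decision` and `gate` are hypothetical names that only mirror the `upper_us` and `exceeds_budget` fields used elsewhere in this page, and the hysteresis parameter is deliberately left out because the source does not say how it enters the comparison.

```rust
/// Hypothetical decision record; field names follow the log schema.
#[derive(Debug, PartialEq)]
struct Decision {
    upper_us: i64,
    exceeds_budget: bool,
}

/// Upper bound = prediction + conformal margin; gate when it exceeds the budget.
/// Signed integers are used because a residual quantile may be negative
/// when the predictor systematically over-estimates.
fn gate(y_hat_us: i64, quantile_us: i64, budget_us: i64) -> Decision {
    let upper_us = y_hat_us.saturating_add(quantile_us);
    Decision {
        upper_us,
        exceeds_budget: upper_us > budget_us,
    }
}
```

For instance, a 14 200 µs prediction with a 1 600 µs margin gives a 15 800 µs bound, which stays under a 16 000 µs budget.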
Defaults
| Parameter | Default |
|---|---|
| $\alpha$ (miscoverage level) | 0.05 |
| $n_{\min}$ (minimum samples per bucket) | 20 |
| window size per bucket | 256 |
| fallback budget | 16 000 µs (just under the 16 667 µs frame period at 60 fps) |
| hysteresis | 1.1 |
Worked example — a bucket fallback
Suppose you have the following bucket populations after 30 seconds of running the showcase:
| Bucket | Samples |
|---|---|
| (alt, dirty, medium) | 512 |
| (alt, full, large) | 23 |
| (alt, full, small) | 3 |
| (inline, dirty, medium) | 40 |
| (inline, dirty, small) | 1 |
Per-frame resolution:
- (alt, dirty, medium): 512 ≥ 20, use the bucket’s own quantile (plenty of data).
- (alt, full, large): 23 ≥ 20, use the bucket’s own quantile.
- (alt, full, small): 3 < 20, fall back to (alt, full, ★), which has 23 + 3 = 26 samples. Use that quantile.
- (inline, dirty, small): 1 < 20, fall back to (inline, dirty, ★), which has 40 + 1 = 41 samples. Use that quantile.
At each fall-back, the theorem still applies because exchangeability only needs to hold within the coarsened bucket.
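The parent populations in this example can be checked mechanically. The sketch below assumes that a (mode, diff, ★) bucket simply pools the samples of its children, which the source does not spell out; a real implementation might keep separately windowed parents. `pool_by_mode_and_diff` is a hypothetical helper.

```rust
use std::collections::HashMap;

/// Sums leaf-bucket sample counts into their (mode, diff, ★) parents.
fn pool_by_mode_and_diff(
    leaves: &[((&'static str, &'static str, &'static str), usize)],
) -> HashMap<(&'static str, &'static str), usize> {
    let mut parents = HashMap::new();
    for &((mode, diff, _size), n) in leaves {
        // Drop the size class and accumulate under the coarser key.
        *parents.entry((mode, diff)).or_insert(0) += n;
    }
    parents
}
```

Fed the five rows of the table, this yields 26 samples for (alt, full, ★) and 41 for (inline, dirty, ★), both above the default minimum of 20.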
Rust interface
```rust
use ftui_runtime::conformal_frame_guard::{
    ConformalFrameGuard, FrameGuardConfig, BucketKey,
};

let mut guard = ConformalFrameGuard::new(FrameGuardConfig {
    alpha: 0.05,
    min_samples: 20,
    window_size: 256,
    fallback_budget_us: 16_000,
    hysteresis: 1.1,
});

// After a frame, feed the context + residual:
guard.observe(
    BucketKey { mode, diff_strategy, size_class },
    y_hat_us,
    observed_us,
);

// At the next frame, gate on the bound:
let decision = guard.decide(BucketKey { mode, diff_strategy, size_class }, y_hat_us);
if decision.exceeds_budget {
    degradation.step_down();
}
```
How to debug
Every decision emits a conformal_frame_guard line:
```json
{"schema":"conformal_frame_guard","y_hat_us":14200,
 "upper_us":15800,"budget_us":16000,
 "exceeds_budget":false,
 "bucket":{"mode":"alt","diff":"dirty","size":"medium"},
 "fallback_level":0,"calibration_size":512}
```
```bash
FTUI_EVIDENCE_SINK=/tmp/ftui.jsonl cargo run -p ftui-demo-showcase

# How often did we fall back, and to which level?
jq -c 'select(.schema=="conformal_frame_guard")
       | .fallback_level' /tmp/ftui.jsonl | sort | uniq -c
```
Large counts at fallback_level=2 or 3 mean your bucketing keys are too fine; consider coarsening size_class so buckets fill up faster.
Pitfalls
Bucket keys must reflect cost regimes, not identifiers. Keying by pane ID or widget name produces dozens of identifier-level buckets that each need 20 samples. Key by what makes frames expensive: screen mode, diff strategy, size class. Leave widget identity out.
fallback_budget_us is a backstop, not a budget. It fires only when even the global window is below the minimum sample count $n_{\min}$ (min_samples). Tune the real budget via the budget_us config and the degradation cascade; don’t lean on the backstop.
Cross-references
- /operations/frame-budget: the top-level document describing how the guard, degradation, and PI pacing cooperate.
- Vanilla conformal: the theorem that underpins every bucket’s quantile.
- Control theory: PI + degradation cascade that consumes the exceeds_budget signal.