Bayesian Hint Ranking
What goes wrong with a naive approach
A help overlay usually sorts keybinding hints by raw usage count. Two problems show up almost immediately:
- Cold-start misery. Fresh installs have zero counts everywhere; the sort collapses to alphabetical. The first hundred keystrokes feel worse than random because the ordering changes on every single input.
- Oscillation. When two hints have comparable utility, a single click flips the order, users’ eyes chase the moving target, and the overlay becomes a stress toy. “Just add a threshold” doesn’t help because the boundary moves too.
The hint ranker needs to trade off three things: how useful the hint looks so far, how much more we expect to learn by showing it, and how much screen real estate it burns. And the ordering has to be stable under noise.
Mental model
Think of each hint as a tiny RL arm:
- Utility is Bernoulli: did the user actually use the hint when shown? Maintain a Beta posterior updated with conjugate counts.
- Exploration matters early. Add a VOI bonus proportional to the posterior standard deviation — this is the Bayesian cousin of UCB.
- Every hint row costs screen cells. Subtract a display cost per hint.
- Sort by net value $V_i$. Swap only when a challenger’s net value exceeds the incumbent’s by a hysteresis margin — that is what kills oscillation.
The ranker is a stable Bayesian bandit. The Beta posterior gives calibrated expected utility, the VOI bonus gives exploration without tuning, and hysteresis makes the ordering a low-pass filter on the underlying scores.
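The scoring rule above can be sketched in a few lines. This is a minimal illustration with a hypothetical `HintArm` type, not the real `ftui_widgets` API; the formulas are the Beta posterior mean and standard deviation from the math section below.

```rust
// Hypothetical sketch of per-hint scoring; `HintArm` is not the real
// ftui_widgets type, just the Beta-posterior bookkeeping from the text.
struct HintArm {
    alpha: f64, // successes + prior_alpha
    beta: f64,  // failures + prior_beta
    cost: f64,  // display cost (screen cells, normalized)
}

impl HintArm {
    // Posterior mean of the Bernoulli utility: alpha / (alpha + beta).
    fn expected_utility(&self) -> f64 {
        self.alpha / (self.alpha + self.beta)
    }
    // Posterior standard deviation of Beta(alpha, beta).
    fn posterior_std(&self) -> f64 {
        let n = self.alpha + self.beta;
        (self.alpha * self.beta / (n * n * (n + 1.0))).sqrt()
    }
    // Net value: mean + exploration bonus - display cost.
    fn net_value(&self, w_voi: f64, lambda: f64) -> f64 {
        self.expected_utility() + w_voi * self.posterior_std() - lambda * self.cost
    }
}

fn main() {
    // Fresh hint under the uniform prior Beta(1,1): mean 0.5, wide posterior.
    let fresh = HintArm { alpha: 1.0, beta: 1.0, cost: 1.0 };
    // Mature hint with the same 0.5 mean but 60 observations: narrow posterior.
    let mature = HintArm { alpha: 31.0, beta: 31.0, cost: 1.0 };
    // Same expected utility, but the VOI bonus favors the unexplored hint.
    assert!(fresh.net_value(1.0, 0.01) > mature.net_value(1.0, 0.01));
    println!("fresh V = {:.3}", fresh.net_value(1.0, 0.01));
}
```

Note how two hints with identical posterior means get different scores purely through the exploration term: that is the cold-start behavior described above.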
The math
Net value

$$V_i \;=\; \hat{u}_i \;+\; w_{\mathrm{voi}}\,\sigma_i \;-\; \lambda\,c_i$$

With $\hat{u}_i = \alpha_i/(\alpha_i+\beta_i)$ the posterior mean utility, $\sigma_i$ the standard deviation of the $\mathrm{Beta}(\alpha_i,\beta_i)$ posterior, and $c_i$ the display cost of hint $i$.

Defaults: $\alpha_0 = \beta_0 = 1$ (uniform prior), $\lambda = 0.01$, $w_{\mathrm{voi}} = 1.0$.
Conjugate updates
When a hint is shown and acted on: $\alpha_i \leftarrow \alpha_i + 1$.
When a hint is shown and ignored: $\beta_i \leftarrow \beta_i + 1$.
No hyperparameters, no learning rate — the update is just counting with smoothing built in.
Hysteresis swap rule
Let $\pi$ be the current displayed order. A swap of adjacent ranks $j$ and $j+1$ (with $1 \le j < n$) is allowed only if:

$$V_{\pi(j+1)} \;>\; V_{\pi(j)} + h\,\bar{\sigma}$$

where $h$ is the hysteresis parameter and $\bar{\sigma}$ is the average posterior standard deviation across displayed hints. Because the ranker tracks $\bar{\sigma}$, the margin scales down as evidence accumulates: cold-start hints need a big gap to shuffle, while mature hints swap as soon as their expected utilities separate cleanly.
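A sketch of the swap gate, assuming the margin is the hysteresis parameter times the average posterior standard deviation as described above:

```rust
// Hysteresis swap gate (sketch): an adjacent challenger displaces the
// incumbent only by clearing it with margin h * sigma_bar.
fn should_swap(v_challenger: f64, v_incumbent: f64, h: f64, sigma_bar: f64) -> bool {
    v_challenger > v_incumbent + h * sigma_bar
}

fn main() {
    let h = 0.05;
    // Cold start: wide posteriors (sigma_bar ~ 0.29) -> a 0.01 gap is noise.
    assert!(!should_swap(0.51, 0.50, h, 0.29));
    // Mature: tight posteriors (sigma_bar ~ 0.03) -> the same gap swaps.
    assert!(should_swap(0.51, 0.50, h, 0.03));
}
```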
Why hysteresis matters for UI stability
Without hysteresis
```
frame 0: [Ctrl-F Ctrl-G Ctrl-H]  V = [0.51, 0.50, 0.43]
frame 1: [Ctrl-G Ctrl-F Ctrl-H]  V = [0.50, 0.51, 0.43]  // flip!
frame 2: [Ctrl-F Ctrl-G Ctrl-H]  V = [0.52, 0.50, 0.43]  // flop!
```

A single additional success for either command flips the order. The user’s eye tracks a moving target. Every overlay render looks like a slot machine.
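Replaying those same score sequences through a margin-gated reorder shows the flip-flop disappear. This is an illustrative one-pass adjacent-swap sketch with a fixed margin, not the real ranker:

```rust
// One adjacent-swap pass gated by a margin (sketch, not the real impl).
fn reorder(order: &mut Vec<usize>, v: &[f64], margin: f64) {
    for i in 0..order.len() - 1 {
        // Swap only when the lower-ranked hint clears the margin.
        if v[order[i + 1]] > v[order[i]] + margin {
            order.swap(i, i + 1);
        }
    }
}

fn main() {
    let mut order = vec![0usize, 1, 2]; // Ctrl-F, Ctrl-G, Ctrl-H
    let frames = [
        [0.51, 0.50, 0.43],
        [0.50, 0.51, 0.43], // would flip without hysteresis
        [0.52, 0.50, 0.43],
    ];
    for v in &frames {
        reorder(&mut order, v, 0.05);
        // Neither excursion clears the 0.05 margin: the order never moves.
        assert_eq!(order, vec![0, 1, 2]);
    }
}
```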
Rust interface
```rust
use ftui_widgets::hint_ranker::{HintRanker, HintStats, RankerConfig};

let cfg = RankerConfig {
    prior_alpha: 1.0,
    prior_beta: 1.0,
    lambda: 0.01,     // display cost weight
    w_voi: 1.0,       // exploration bonus
    hysteresis: 0.05, // minimum swap margin
};
let mut ranker = HintRanker::new(cfg);

// On every help-overlay render:
let visible: &[HintStats] = ranker.rank(&candidates);

// When the user acts on (or ignores) hint `id`:
ranker.record_outcome(id, acted_on);
```

`HintStats::expected_utility` returns $\alpha/(\alpha+\beta)$; `net_value` returns $\hat{u} + w_{\mathrm{voi}}\,\sigma - \lambda c$; the ranker’s `last_decision()` exposes the ordered slice along with per-hint scores for the debug overlay.
How to debug
The ranker emits `hint-ranking-v1` lines to the evidence sink:

```json
{"schema":"hint-ranking-v1","id":"editor.format","label":"Format",
 "expected_utility":0.61,"cost":0.08,"net_value":0.55,
 "voi":0.12,"rank":1}
```

```sh
FTUI_EVIDENCE_SINK=/tmp/ftui.jsonl cargo run -p ftui-demo-showcase

# Trace a single hint's trajectory:
jq -c 'select(.schema=="hint-ranking-v1" and .id=="editor.format")' \
  /tmp/ftui.jsonl
```

If you see `rank` oscillating across frames, the hysteresis margin is too small for the posterior variance; raise `hysteresis` or increase the priors.
Pitfalls
Conjugate priors can drown slow-moving truth. If `prior_alpha` and `prior_beta` are large (say, 50), dozens of real uses barely move the posterior. Keep the priors at $\alpha_0 = \beta_0 = 1$ unless you have strong evidence that the population rate is not 0.5.
VOI bonus is not UCB. The bonus scales with the posterior standard deviation $\sigma_i$, not with a $\sqrt{\ln t / n_i}$ term. That means it shrinks only as the posterior sharpens, which is what we want for a TUI (we never have asymptotic data), but it also means you cannot prove standard UCB regret bounds here. Validate via the evidence log before deploying.
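The difference in shrinkage is easy to see numerically. A sketch comparing the posterior-std bonus against a textbook UCB1-style term $\sqrt{2\ln t / n}$ (the constants are illustrative, not from the ranker):

```rust
// VOI bonus: posterior standard deviation of Beta(alpha, beta).
fn voi_bonus(alpha: f64, beta: f64) -> f64 {
    let n = alpha + beta;
    (alpha * beta / (n * n * (n + 1.0))).sqrt()
}

// Textbook UCB1-style bonus: sqrt(2 ln t / n), for comparison only.
fn ucb_bonus(t: f64, n: f64) -> f64 {
    (2.0 * t.ln() / n).sqrt()
}

fn main() {
    // After ~100 balanced observations the VOI bonus has collapsed...
    assert!(voi_bonus(51.0, 51.0) < 0.06);
    // ...while the UCB term is still large at t = 1000, n = 100.
    assert!(ucb_bonus(1000.0, 100.0) > 0.3);
    println!("voi = {:.3}, ucb = {:.3}", voi_bonus(51.0, 51.0), ucb_bonus(1000.0, 100.0));
}
```

The VOI bonus depends only on each hint's own counts, never on total time, which is why it needs no horizon tuning but inherits no UCB guarantees.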
Cross-references
- VOI sampling — same Beta posterior, different consumer.
- /widgets/composition — how the help overlay plugs into the widget tree.
- /runtime/evidence-sink — where the `hint-ranking-v1` lines land.