SLO schema (slo.yaml)
slo.yaml at the repo root is FrankenTUI’s machine-readable service
level objective definition. Every budget the kernel honours —
frame render p99, layout compute p99, Bayesian posterior update
latency, heap RSS — is named and bounded here. CI validates the
schema on every push and runs a deterministic replay that exercises
safe-mode when breaches are injected.
Source: /slo.yaml.
Why a single SLO file
The benchmark gate provides the enforcement mechanism. The SLO file states the intention: what the kernel promises its users. Keeping the two aligned in one place means:
- Budgets are reviewable in a single PR.
- A CI breach maps one-to-one to a documented promise.
- The runtime’s degradation cascade can key off the same metric names that appear in tests.
Top-level layout
# Global thresholds
regression_threshold: 0.10
noise_tolerance: 0.05
safe_mode_breach_count: 3
safe_mode_error_rate: 0.10
metrics:
  render_frame_p99_us:
    metric_type: latency
    max_value: 4000.0
    max_ratio: 1.25
    safe_mode_trigger: true
  # …many more metrics…

Global thresholds
| Field | Meaning |
|---|---|
| regression_threshold | Fractional overage above baseline that counts as a regression (default 10%). |
| noise_tolerance | Measurement variance absorbed before alerting (default 5%). |
| safe_mode_breach_count | How many safe_mode_trigger: true metrics must breach before the runtime enters safe mode. |
| safe_mode_error_rate | Error-rate metric above which safe mode engages independently of latency. |
Per-metric fields
| Field | Type | Meaning |
|---|---|---|
| metric_type | latency \| memory \| error_rate | Unit family. latency is microseconds; memory is bytes or count; error_rate is a fraction 0–1. |
| max_value | f64 | Absolute ceiling. Exceeding it is a breach. |
| max_ratio | f64 | Max ratio vs baseline. Exceeding it is a breach. Optional. |
| safe_mode_trigger | bool | When true, breaching this metric counts toward safe_mode_breach_count. |
Breach semantics: a metric breaches if value > max_value, or if
max_ratio is set and value / baseline > max_ratio.
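The breach rule can be sketched in a few lines of Rust. This is a minimal illustration, not the runtime's code: `MetricSpec` and `evaluate` are hypothetical names, and two details are assumptions the schema does not spell out, namely that the absolute ceiling is checked before the ratio and that the ratio rule is skipped when no positive baseline is available.

```rust
/// Why a metric breached; mirrors the BreachReason variants listed for BreachResult.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BreachReason {
    OverMaxValue,
    OverMaxRatio,
    None,
}

/// Hypothetical in-memory form of the budget fields of one `metrics:` entry.
#[derive(Debug, Clone)]
pub struct MetricSpec {
    pub max_value: f64,
    pub max_ratio: Option<f64>,
}

/// Returns the first rule that `value` violates.
/// Assumption: the absolute ceiling is checked before the baseline ratio.
pub fn evaluate(spec: &MetricSpec, value: f64, baseline: Option<f64>) -> BreachReason {
    if value > spec.max_value {
        return BreachReason::OverMaxValue;
    }
    // The ratio rule needs both a configured max_ratio and a positive baseline.
    if let (Some(max_ratio), Some(base)) = (spec.max_ratio, baseline) {
        if base > 0.0 && value / base > max_ratio {
            return BreachReason::OverMaxRatio;
        }
    }
    BreachReason::None
}
```

With the render_frame_p99_us budget (max_value 4000.0, max_ratio 1.25) and a baseline of 3000 µs, a reading of 3900 µs stays under the ceiling but breaches on ratio, because 3900 / 3000 = 1.3.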
Metric categories
The schema groups metrics into two planes.
Data plane — frame rendering pipeline
Budgets on the hot path: every render has to meet these to hit the 60 Hz target.
render_frame_p50_us: { max_value: 500.0, max_ratio: 1.15 }
render_frame_p95_us: { max_value: 2000.0, max_ratio: 1.20 }
render_frame_p99_us: { max_value: 4000.0, max_ratio: 1.25, safe_mode_trigger: true }
render_frame_p999_us: { max_value: 8000.0, max_ratio: 1.50, safe_mode_trigger: true }
layout_compute_p50_us: { max_value: 200.0 }
layout_compute_p95_us: { max_value: 800.0 }
layout_compute_p99_us: { max_value: 1500.0 }
diff_strategy_p50_us: { max_value: 100.0 }
diff_strategy_p95_us: { max_value: 500.0 }
diff_strategy_p99_us: { max_value: 1000.0 }
ansi_present_p50_us: { max_value: 150.0 }
ansi_present_p95_us: { max_value: 600.0 }
ansi_present_p99_us: { max_value: 1200.0 }

Memory and error budgets on the same plane:
heap_rss_bytes: { max_value: 104857600.0, max_ratio: 1.50, safe_mode_trigger: true }
allocations_per_frame: { max_value: 500.0, max_ratio: 1.30 }
false_positive_strategy_switch_rate: { max_value: 0.05, safe_mode_trigger: true }
malformed_ansi_rate: { max_value: 0.01 }

Decision plane — intelligence layer
Budgets on the statistical kernels behind the runtime’s adaptive behaviour.
posterior_update_p99_us: { max_value: 500.0, safe_mode_trigger: true }
voi_computation_p99_us: { max_value: 400.0 }
conformal_predict_p95_us: { max_value: 100.0 }
eprocess_update_p50_us: { max_value: 10.0 }
bocpd_update_p50_us: { max_value: 25.0 }
cascade_decision_p99_us: { max_value: 100.0 }

The decision plane’s budgets are deliberately tight — a sluggish posterior-update hurts every diff decision that follows. See intelligence overview for what these kernels actually do.
BreachResult
When the runtime evaluates a metric against the SLO, it produces a
BreachResult:
pub struct BreachResult {
    pub metric_name: String,
    pub metric_type: MetricType,   // Latency | Memory | ErrorRate
    pub value: f64,
    pub max_value: f64,
    pub max_ratio: Option<f64>,
    pub baseline: Option<f64>,
    pub breached: bool,
    pub safe_mode_trigger: bool,
    pub reason: BreachReason,      // OverMaxValue | OverMaxRatio | None
}

Breach results are emitted as events on the
evidence sink and counted toward
safe_mode_breach_count. Reaching that count flips the runtime into
safe mode — see frame budget for the
degradation cascade that kicks in.
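How breach results might roll up into the safe-mode decision can be sketched as follows. The names `BreachSummary`, `SafeModePolicy`, and `should_enter_safe_mode` are hypothetical, and the sketch assumes that reaching the count means "greater than or equal", that the observed error rate arrives as a separate input, and that all results belong to one evaluation pass. The real runtime may window or deduplicate breaches differently.

```rust
/// Trimmed-down stand-in for BreachResult: only the fields this sketch reads.
pub struct BreachSummary {
    pub breached: bool,
    pub safe_mode_trigger: bool,
}

/// The two safe-mode thresholds from the top of slo.yaml.
pub struct SafeModePolicy {
    pub safe_mode_breach_count: usize,
    pub safe_mode_error_rate: f64,
}

/// Decides whether one evaluation pass should flip the runtime into safe mode.
pub fn should_enter_safe_mode(
    policy: &SafeModePolicy,
    results: &[BreachSummary],
    error_rate: f64,
) -> bool {
    // Only breaches on metrics flagged safe_mode_trigger count toward the threshold.
    let trigger_breaches = results
        .iter()
        .filter(|r| r.breached && r.safe_mode_trigger)
        .count();
    // Either condition is sufficient; the error-rate rule ignores latency breaches.
    trigger_breaches >= policy.safe_mode_breach_count
        || error_rate > policy.safe_mode_error_rate
}
```

With the defaults shown earlier (safe_mode_breach_count: 3, safe_mode_error_rate: 0.10), two trigger breaches plus any number of non-trigger breaches would not enter safe mode under this reading, while an error rate of 0.2 would, regardless of latency.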
CI validation
CI runs two gates against this file on every push:
- Schema validation. Every metric declares a known metric_type, has a numeric max_value, and its safe_mode_trigger is a bool. Unknown keys fail.
- Deterministic safe-mode replay. A fixture injects breaches on the safe_mode_trigger: true metrics and asserts the runtime transitions into safe mode. If the cascade doesn’t fire, CI fails.
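The schema-validation gate could look roughly like the sketch below. It is not the project's validator: `Scalar` and `validate_metric` are hypothetical, the entry is assumed to arrive already parsed into key/value pairs, and treating max_ratio and safe_mode_trigger as optional is an assumption based on the abbreviated listings on this page.

```rust
/// A scalar as it might come out of a YAML parser (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub enum Scalar {
    Str(String),
    Num(f64),
    Bool(bool),
}

const KNOWN_TYPES: [&str; 3] = ["latency", "memory", "error_rate"];

/// Validates one metric entry given as (key, value) pairs.
pub fn validate_metric(name: &str, fields: &[(&str, Scalar)]) -> Result<(), String> {
    let mut has_type = false;
    let mut has_max_value = false;
    for (key, value) in fields {
        match (*key, value) {
            ("metric_type", Scalar::Str(t)) if KNOWN_TYPES.contains(&t.as_str()) => {
                has_type = true
            }
            ("max_value", Scalar::Num(_)) => has_max_value = true,
            ("max_ratio", Scalar::Num(_)) => {}
            ("safe_mode_trigger", Scalar::Bool(_)) => {}
            // A known key with the wrong type, or an unrecognised metric_type.
            ("metric_type" | "max_value" | "max_ratio" | "safe_mode_trigger", _) => {
                return Err(format!("{name}: invalid value for `{key}`"));
            }
            // Anything else is an unknown key, which fails validation.
            _ => return Err(format!("{name}: unknown key `{key}`")),
        }
    }
    if !has_type {
        return Err(format!("{name}: missing metric_type"));
    }
    if !has_max_value {
        return Err(format!("{name}: missing max_value"));
    }
    Ok(())
}
```

Failing on unknown keys means a typo such as max_valeu is rejected outright, so a misspelled budget cannot silently go unenforced.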
Relationship to the benchmark gate
- SLO (slo.yaml) is the promise — the outermost ceiling.
- Benchmark gate (tests/baseline.json) is the enforcement — the per-benchmark measured baseline with tolerance.
The gate’s budgets should always be ≤ the SLO’s max_value. If a
benchmark’s baseline creeps up past an SLO ceiling, either the SLO
must widen (deliberate promise change) or the gate has to fail.
See benchmark gate for the mechanics and telemetry events for the metric names in their canonical runtime form.
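The ceiling rule above (gate budget ≤ SLO max_value) is easy to check mechanically. The sketch below assumes both files have already been reduced to flat name-to-number maps; the actual layout of tests/baseline.json is not described here, and `gate_exceeds_slo` is a hypothetical helper.

```rust
use std::collections::BTreeMap;

/// Returns the names of metrics whose gate budget exceeds the SLO ceiling,
/// in sorted order. Benchmarks with no SLO entry are not judged here.
pub fn gate_exceeds_slo(
    slo_max: &BTreeMap<String, f64>,
    gate_budget: &BTreeMap<String, f64>,
) -> Vec<String> {
    gate_budget
        .iter()
        .filter(|(name, budget)| match slo_max.get(*name) {
            Some(ceiling) => **budget > *ceiling,
            None => false,
        })
        .map(|(name, _)| name.clone())
        .collect()
}
```

A non-empty result is the situation described above: either the SLO is widened deliberately or the gate fails.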
Adding a new metric
Pick a name consistent with existing conventions
Latency metrics end in _p{50,95,99,999}_us. Memory metrics end in
either _bytes or _per_frame. Error rates end in _rate.
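The suffix conventions can double as a lint. The following is a small sketch, assuming the suffix alone determines the family; `family_from_name` is a hypothetical helper, not part of the project.

```rust
/// The three metric families from the per-metric field table.
#[derive(Debug, PartialEq)]
pub enum MetricType {
    Latency,
    Memory,
    ErrorRate,
}

/// Infers the metric family from the name suffix, or returns None when the
/// name follows none of the conventions.
pub fn family_from_name(name: &str) -> Option<MetricType> {
    const LATENCY_SUFFIXES: [&str; 4] = ["_p50_us", "_p95_us", "_p99_us", "_p999_us"];
    if LATENCY_SUFFIXES.iter().any(|s| name.ends_with(*s)) {
        Some(MetricType::Latency)
    } else if name.ends_with("_bytes") || name.ends_with("_per_frame") {
        Some(MetricType::Memory)
    } else if name.ends_with("_rate") {
        Some(MetricType::ErrorRate)
    } else {
        None
    }
}
```

Comparing the inferred family with the declared metric_type would catch a latency metric accidentally declared as memory, or a name that fits no convention.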
Decide whether it should trigger safe mode
A metric should set safe_mode_trigger: true only if breaching it
means the kernel is genuinely unsafe for interactive use. A slow
VOI computation is annoying; a frame render p99 above 4 ms is
user-visible.
Add the metric and budget
metrics:
  my_new_metric_p99_us:
    metric_type: latency
    max_value: 750.0
    max_ratio: 1.30
    safe_mode_trigger: false

Wire a benchmark
Add a criterion benchmark that emits the same name and ensure the
tests/baseline.json percentile budget stays within the SLO ceiling.
Run the gates
./scripts/perf_regression_gate.sh --check-only
./scripts/bench_budget.sh --check-only

Confirm both pass at the new budget.
Pitfalls
Don’t raise an SLO to hide a regression. The SLO is a promise. Document a relaxation in the PR description and in the commit history; reviewers should push back on silent widening.
safe_mode_trigger cascades. Flipping a metric to true
without understanding the degradation cascade may cause the runtime
to enter safe mode more eagerly than intended. Test with the
deterministic safe-mode replay before landing.
Percentile choice is load-bearing. If the SLO promises p99 and the benchmark gate measures p95, the two are unrelated. Keep the percentile consistent across SLO, gate, and telemetry.
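A percentile mismatch can be caught from the metric names alone, assuming the naming convention from "Adding a new metric" is followed. `percentile_of` and `same_percentile` are hypothetical helpers sketched for illustration.

```rust
/// Extracts the percentile token from a latency metric name, for example
/// "p99" from "render_frame_p99_us". Returns None for non-latency names.
pub fn percentile_of(name: &str) -> Option<&str> {
    let stem = name.strip_suffix("_us")?;
    let (_, token) = stem.rsplit_once('_')?;
    let digits = token.strip_prefix('p')?;
    if !digits.is_empty() && digits.chars().all(|c| c.is_ascii_digit()) {
        Some(token)
    } else {
        None
    }
}

/// True when both names carry the same percentile token.
pub fn same_percentile(slo_name: &str, gate_name: &str) -> bool {
    match (percentile_of(slo_name), percentile_of(gate_name)) {
        (Some(a), Some(b)) => a == b,
        _ => false,
    }
}
```

Running such a check over every SLO and gate pair would flag the p99-versus-p95 case before the two budgets drift apart.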