E-Processes — Anytime-Valid Sequential Testing

What goes wrong with a naive approach

Every frame, the runtime asks: “is this observation a budget violation?” If you run a fixed-$\alpha$ test (say, a z-test at $\alpha = 0.05$) on every frame, the type-I error explodes. The familiar “don’t peek at your p-value” warning isn’t a polite suggestion: at 60 fps you are peeking 60 times per second, and after one second under $H_0$ you have an expected $60 \cdot 0.05 = 3$ false rejections.

Possible mitigations, all bad:

Bonferroni across frames: divide $\alpha$ by the number of frames (60 after one second, and still growing) and the per-frame threshold becomes so strict that you’ll miss every real violation.
Only test every $N$ frames: delay the alert and lose responsiveness, the whole point of continuous monitoring.
Group-sequential tests: need pre-specified looks; TUI alerts are not pre-specified.

The right tool is the e-process: here, a non-negative test supermartingale that starts at 1 and whose expected value under $H_0$ never exceeds 1. You can peek every single frame and the false-alarm probability is still capped by $\alpha$.

Mental model

Picture a betting game against $H_0$:

You start with wealth $W_0 = 1$.
Every frame you bet a fraction $\lambda_t$ of your wealth on the observation having mean greater than $\mu_0$.
If the observation is above $\mu_0$, your wealth grows; if below, it shrinks.
Under $H_0$, the expected wealth is non-increasing (no strategy beats the null on average).
Alert when $W_t \ge 1/\alpha$ — “you made so much money that either you’re a genius or $H_0$ is wrong”.

Because $W_t$ is a non-negative supermartingale under $H_0$ , Ville’s inequality gives:

$$P_{H_0}\!\left(\exists t:\; W_t \ge 1/\alpha\right) \le \alpha$$

— any stopping time, any number of looks, any number of frames.
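
The betting game above can be sketched in a few lines of Rust. This is a minimal illustration with a fixed betting fraction, not the crate's implementation; the function name `first_alert` and the sample streams are hypothetical, and the observations are assumed to keep the factor $1 + \lambda (X_t - \mu_0)$ positive.

```rust
/// Runs the betting game with a fixed fraction `lambda` and returns the
/// 1-based frame at which wealth first reaches 1/alpha, if it ever does.
/// Hypothetical helper for illustration only.
fn first_alert(xs: &[f64], mu_0: f64, lambda: f64, alpha: f64) -> Option<usize> {
    let threshold = 1.0 / alpha;
    let mut wealth = 1.0_f64; // W_0 = 1
    for (i, &x) in xs.iter().enumerate() {
        // Wealth grows when x > mu_0 and shrinks when x < mu_0.
        wealth *= 1.0 + lambda * (x - mu_0);
        if wealth >= threshold {
            return Some(i + 1);
        }
    }
    None
}

fn main() {
    // A stream sitting exactly at the null mean never moves the wealth.
    let null_stream = vec![0.1; 1000];
    println!("{:?}", first_alert(&null_stream, 0.1, 0.5, 0.05)); // None

    // A stream well above the null mean compounds by about 1.25x per frame.
    let hot_stream = vec![0.6; 100];
    println!("{:?}", first_alert(&hot_stream, 0.1, 0.5, 0.05)); // Some(14)
}
```

With $\alpha = 0.05$ the threshold is 20, so a constant factor of about 1.25 needs 14 frames to cross it ($1.25^{13} \approx 18.2$, $1.25^{14} \approx 22.7$).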

E-processes are the secret ingredient that makes “principled monitoring at 60 fps” tractable. Conformal prediction gives you calibrated bounds; the e-process gives you calibrated alerts that survive continuous peeking.

The math

Wealth recursion

$$W_t = W_{t-1} \left(1 + \lambda_t (X_t - \mu_0)\right)$$

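with $\lambda_t \in [0, \lambda_{\max}]$ (typically $\lambda_{\max} = 1/\sigma_0$). The cap is there to keep the factor positive, which requires $\lambda_t (X_t - \mu_0) > -1$ for every observation the stream can produce.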

Ville’s bound

$$P_{H_0}(\exists t:\; W_t \ge 1/\alpha) \le \alpha$$

for any adapted $\lambda_t$ — you are free to choose the betting fraction from the history as long as it doesn’t peek at the future.

GRAPA (gradient-of-log-wealth adaptive prior)

Fixing $\lambda$ is suboptimal: too small and the detector is slow, too large and the wealth crashes on noise. GRAPA adapts:

$$\lambda_{t+1} = \text{clip}\!\left(\lambda_t + \eta \cdot \nabla_\lambda \log W_t,\; 0,\; \lambda_{\max}\right)$$

Intuition: take a gradient step on the log-wealth with respect to $\lambda$. If the current $\lambda$ is too conservative under a true alternative, the gradient points up; if too aggressive and wealth is decaying, the gradient points down.
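
To make the update concrete, here is a sketch of a single step. It assumes that $\nabla_\lambda \log W_t$ is read as the derivative of the most recent log-factor, $\log(1 + \lambda_t (X_t - \mu_0))$, which is $(X_t - \mu_0) / (1 + \lambda_t (X_t - \mu_0))$; the crate may compute the gradient differently. The function name `grapa_step` is hypothetical.

```rust
/// One gradient step on the latest log-wealth increment,
/// log(1 + lambda * (x - mu_0)), followed by clipping to [0, lambda_max].
/// Assumes the factor 1 + lambda * (x - mu_0) is positive.
fn grapa_step(lambda: f64, x: f64, mu_0: f64, eta: f64, lambda_max: f64) -> f64 {
    let d = x - mu_0;
    // d/d(lambda) of log(1 + lambda * d) is d / (1 + lambda * d).
    let grad = d / (1.0 + lambda * d);
    (lambda + eta * grad).clamp(0.0, lambda_max)
}

fn main() {
    // Observation above mu_0: the gradient is positive, so lambda rises.
    let up = grapa_step(0.5, 0.6, 0.1, 0.1, 1.0);
    // Observation below mu_0: the gradient is negative, so lambda falls.
    let down = grapa_step(0.5, 0.0, 0.1, 0.1, 1.0);
    // Near the cap, the step is clipped to lambda_max.
    let capped = grapa_step(0.99, 0.6, 0.1, 0.1, 1.0);
    println!("up = {:.4}, down = {:.4}, capped = {:.4}", up, down, capped);
}
```

With the defaults from this page ($\lambda_0 = 0.5$, $\eta = 0.1$, $\mu_0 = 0.1$), an observation of 0.6 moves $\lambda$ to about 0.54, while an observation of 0.0 moves it to about 0.489.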

Defaults

Parameter	Default	Meaning
$\alpha$	0.05	Global false-alarm bound.
$\mu_0$	0.1	Null mean (normalised).
$\lambda_0$	0.5	Initial betting fraction.
$\eta_{\text{GRAPA}}$	0.1	Gradient step size.

Why fixed- $\alpha$ tests fail under peeking

Fixed-α z-test at 60 fps


H0: μ = 0.1, observations i.i.d. ~ N(0.1, σ²)
Per-frame test at α = 0.05.
Expected false rejections per second: 60 * 0.05 = 3.0

→ alert log is unusable.
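
The arithmetic above can be extended to the probability of seeing at least one false alarm. The sketch below assumes the per-frame tests are independent (each test uses only its own frame's observation), in which case that probability is $1 - (1 - \alpha)^n$ after $n$ looks. The function names are hypothetical.

```rust
/// Probability of at least one false rejection after `looks` independent
/// level-`alpha` tests, all run under H0.
fn prob_any_false_alarm(alpha: f64, looks: u32) -> f64 {
    1.0 - (1.0 - alpha).powi(looks as i32)
}

/// Expected number of false rejections over the same looks.
fn expected_false_alarms(alpha: f64, looks: u32) -> f64 {
    alpha * f64::from(looks)
}

fn main() {
    for seconds in [1u32, 10] {
        let looks = 60 * seconds; // 60 fps
        println!(
            "{}s: expected false alarms = {:.1}, P(at least one) = {:.3}",
            seconds,
            expected_false_alarms(0.05, looks),
            prob_any_false_alarm(0.05, looks)
        );
    }
}
```

After one second the chance of at least one false alarm is already about 0.95, compared with the $\alpha = 0.05$ cap that Ville's bound gives over the whole stream.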

Uses in FrankenTUI

Throttle / flake detector (eprocess_throttle.rs) — detects adversarial input bursts or flaky subscriptions.
Allocation budget — paired with CUSUM so a slow leak is detected by one or both.
Conformal top-layer — the anytime-valid layer sitting above vanilla conformal.

Rust interface

crates/ftui-runtime/src/eprocess_throttle.rs


use ftui_runtime::eprocess_throttle::{EProcess, EProcessConfig};
 
let mut ep = EProcess::new(EProcessConfig {
    alpha: 0.05,
    mu_0: 0.1,
    initial_lambda: 0.5,
    grapa_eta: 0.1,
});
 
// On each observation:
let rejected = ep.observe(x_t);
if rejected {
    trigger_alert();
}

observe returns true the first time $W_t \ge 1/\alpha$. After a rejection, the caller typically resets (or the subsystem takes a compensating action and the wealth drifts back down).
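
The "true the first time, then reset" behaviour can be illustrated with a self-contained stand-in. `ToyEProcess` below is hypothetical and is not the crate's `EProcess`: it uses a fixed $\lambda$ (no GRAPA), tracks log-wealth to avoid overflow on long runs, and assumes the factor stays positive. The `reset` method is likewise an assumption about how a caller might restart the test.

```rust
/// Hypothetical stand-in for illustration; not the crate's EProcess.
struct ToyEProcess {
    log_wealth: f64,
    log_threshold: f64,
    mu_0: f64,
    lambda: f64,
    rejected: bool,
}

impl ToyEProcess {
    fn new(alpha: f64, mu_0: f64, lambda: f64) -> Self {
        Self {
            log_wealth: 0.0, // log W_0 = log 1
            log_threshold: (1.0 / alpha).ln(),
            mu_0,
            lambda,
            rejected: false,
        }
    }

    /// Returns true only on the frame where wealth first crosses 1/alpha.
    fn observe(&mut self, x: f64) -> bool {
        self.log_wealth += (1.0 + self.lambda * (x - self.mu_0)).ln();
        if !self.rejected && self.log_wealth >= self.log_threshold {
            self.rejected = true;
            return true;
        }
        false
    }

    /// Starts a fresh test: wealth back to 1, latch cleared.
    fn reset(&mut self) {
        self.log_wealth = 0.0;
        self.rejected = false;
    }
}

fn main() {
    let mut ep = ToyEProcess::new(0.05, 0.1, 0.5);
    // Frames (1-based) on which observe returned true.
    let alerts: Vec<usize> = (1..=30).filter(|_| ep.observe(0.6)).collect();
    println!("alerts before reset: {:?}", alerts); // [14]
    ep.reset();
    let alerts: Vec<usize> = (1..=30).filter(|_| ep.observe(0.6)).collect();
    println!("alerts after reset: {:?}", alerts); // [14]
}
```

Without the reset, the wealth stays above the threshold and the latch keeps `observe` from returning true again on later frames.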

How to debug

Rejections emit eprocess_reject lines:


{"schema":"eprocess_reject","alpha":0.05,
 "wealth":24.6,"lambda":0.43,"x_t":0.18,
 "mu_0":0.1,"grapa_step":0.03}


FTUI_EVIDENCE_SINK=/tmp/ftui.jsonl cargo run -p ftui-demo-showcase
 
# Lambda trajectory over a session (GRAPA adaptation):
jq -c 'select(.schema=="eprocess_step")
       | [.frame, .lambda]' /tmp/ftui.jsonl | tail -40

A $\lambda$ stuck at 0 means GRAPA has decided the stream is indistinguishable from $H_0$ — not a bug. A $\lambda$ stuck at $\lambda_{\max}$ means the stream is far from $H_0$ and you should already have seen rejections.

Pitfalls

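The wealth factor can go negative if $\lambda_t (X_t - \mu_0) < -1$. That breaks the martingale. For bounded observations with $|X_t - \mu_0| \le B$, clip $\lambda_t \le 1/(B + \epsilon)$; the cap must come from the known bound $B$, not from the current $X_t$, because $\lambda_t$ has to be chosen before $X_t$ is seen. Alternatively, use the log version $W_t = \exp(\sum_s \lambda_s (X_s - \mu_0) - \text{var correction})$, for example with correction $\sum_s \lambda_s^2 \sigma^2 / 2$ for $\sigma$-sub-Gaussian observations.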
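
For bounded observations, one way to keep the factor positive is to derive the cap on $\lambda$ from the range the observations can take, fixed before any data arrives. The sketch below assumes observations in a known interval `[lo, hi]`; both function names are hypothetical.

```rust
/// Largest betting fraction that keeps 1 + lambda * (x - mu_0) > 0 for every
/// x in [lo, hi]. Uses only the a priori range, never the current observation.
fn safe_lambda_max(lo: f64, hi: f64, mu_0: f64, eps: f64) -> f64 {
    let worst_dev = (hi - mu_0).abs().max((mu_0 - lo).abs());
    1.0 / (worst_dev + eps)
}

/// Worst-case wealth factor over x in [lo, hi] for a given lambda >= 0.
/// The factor is increasing in x, so the minimum is attained at x = lo.
fn min_factor(lambda: f64, lo: f64, mu_0: f64) -> f64 {
    1.0 + lambda * (lo - mu_0)
}

fn main() {
    let (lo, hi, mu_0) = (0.0, 1.0, 0.1);
    let cap = safe_lambda_max(lo, hi, mu_0, 1e-6);
    println!("cap = {:.4}, worst factor = {:.4}", cap, min_factor(cap, lo, mu_0));

    // An uncapped lambda can drive the factor negative on a low observation.
    println!("worst factor at lambda = 20: {:.1}", min_factor(20.0, lo, mu_0));
}
```

Using the larger of the two deviations is conservative for one-sided bets with $\lambda \ge 0$, where only observations below $\mu_0$ can push the factor towards zero.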

GRAPA is not a free lunch. If the stream briefly looks adversarial and GRAPA spikes $\lambda$, a subsequent benign shift can obliterate the wealth. Keep $\eta_{\text{GRAPA}}$ small (≤ 0.1) so adaptation is gradual.

Cross-references

Alpha-investing — the sequential-FDR layer that budgets multiple e-processes.
Vanilla conformal — the base for the conformal-over-e-process stack.
CUSUM — the cheap partner for allocation-budget alerts.
