ADR-009: FrankenTerm WebGPU Renderer Architecture (Glyph Atlas, Batching, Present)

Status: Proposed

Context

We are replacing xterm.js with a first-party renderer (frankenterm-web) that:

Renders a terminal grid (cells, colors, attrs, hyperlinks) at interactive rates.
Supports resize/DPR/zoom changes with stable geometry.
Produces high-signal, machine-readable logs for correctness + perf gates.
Fits the determinism + golden trace strategy for FrankenTerm (bd-lff4p).

Constraints:

Safe Rust in-tree (#![forbid(unsafe_code)]).
Deterministic behavior when requested (seed + explicit time source; trace replay).
Correctness-first (no flicker, stable selection/hyperlinks).
Performance: 120x40 steady-state at 60fps on modern hardware.

Related specs:

Golden traces: docs/spec/frankenterm-golden-trace-format.md (bd-lff4p.5.1)
North Star architecture: docs/spec/frankenterm-architecture.md (bd-lff4p.6)

Decision

D1) WebGPU-first renderer

Implement the renderer on WebGPU (via wgpu) with a single primary pipeline:

Per-cell instanced quads.
Glyph alpha sampled from a cached atlas texture.
fg/bg/attrs applied in shader.

We do not rely on DOM text rendering or canvas text for the final path.

D2) Atlas-based glyph caching (monospace first)

Maintain an R8 (alpha) glyph atlas:

Glyph rasterization is done in WASM (pure Rust font rasterizer) to avoid browser-dependent glyph rendering.
Glyph cache is LRU-evicted under a fixed byte budget.

Monospace and a constrained shaping model are the initial target:

Terminal-class monospace font metrics.
Grapheme clusters mapped to glyph runs via a stable, deterministic shaping path.

D3) Patch-driven updates (dirty spans)

The renderer consumes patches (dirty spans or diff runs) from the engine:

Only changed cells update instance buffer ranges (queue.write_buffer slices).
Unchanged cells do not trigger GPU work beyond the draw.

This aligns with the engine’s determinism model: identical traces → identical patch stream.

Hard rule: frankenterm-web MUST NOT access the engine’s full Grid directly; it only consumes the patch stream (plus explicit geometry/config inputs). This is required for trace replay (renderer can be driven from recorded patches) and prevents nondeterministic reads from leaking into the present path.

D4) Correctness gates are patch/trace first; pixel gates are optional

Web pixel output can vary across GPU drivers. Therefore:

Primary correctness gates are engine state and patch/frame hashes from golden traces.
Pixel/framebuffer hashes are optional and must be bucketed by (DPR, size, renderer config) and treated as best-effort.

D5) Present model

Use a standard swapchain surface:

One draw call per frame (or a small constant number), plus optional overlay passes.
Avoid multi-pass compositing unless required (selection highlight, cursor, debug overlays can be done in-shader).

Architecture

Data flow

Engine produces a patch stream for the viewport:
- Dirty spans per row (preferred) or diff runs (compatible).
Renderer applies patch to CPU-side Vec<CellInstance> for the viewport.
Renderer uploads only modified instance slices to GPU.
Renderer issues instanced draw:
- Vertex shader: quad + position.
- Fragment shader: sample atlas alpha + apply fg/bg/attrs.

Instance format (sketch)

Per cell:

x, y (u16 or i32) in cell coords.
glyph_id (u32) into atlas metadata table.
fg_rgba, bg_rgba (packed u32).
attrs (bitfield: bold/italic/underline/reverse/dim/…).
link_id (optional u32) for hover/click mapping.

Geometry + DPR/zoom

The renderer owns a deterministic geometry model:

Inputs: container size, DPR, font metrics, user zoom.
Outputs: (cols, rows), cell pixel size, origin offsets.

Rules:

Cell size must be stable and reversible across resize storms.
Mouse hit testing must use the same geometry mapping as rendering.

Alternatives Considered

A1) Canvas2D / DOM text rendering

Pros: fast to prototype.
Cons (reject): inconsistent glyph rasterization across browsers/OS; harder to make deterministic gates; performance cliffs under large scrollback + frequent updates.

A2) WebGL2

Pros: widely supported.
Cons (reject): WebGPU is the strategic direction and provides better tooling and ergonomics.

A3) Precomposited textures per line / tile

Pros: can reduce per-cell instance work in some scenarios.
Cons (defer): higher complexity; introduce only if profiling proves instancing is insufficient.

Consequences

Positive

Deterministic patch-level correctness gates even when pixel output varies.
Scales with dirty spans: update cost proportional to changed area.
GPU-friendly design (single pipeline, instancing, stable buffers).

Negative / Costs

Need a deterministic font rasterization approach in WASM.
More up-front engineering than canvas-based approaches.
Pixel-perfect goldens are not the primary gate (by design).

Test Plan / Verification

Required (unit / wasm):

Atlas packer invariants (no overlap, bounded growth, LRU eviction correctness).
Geometry math tests (DPR/zoom/fit-to-container → cols/rows mapping).
Patch application invariants (apply patch stream → identical instance buffer).

Required (E2E / harness):

Web perf harness emitting JSON summary + JSONL detail (bd-lff4p.2.10).
Golden trace replay gate verifying patch/frame hashes (bd-lff4p.5.2).

Logging requirements:

Emit patch statistics per frame (dirty spans, bytes uploaded, draw calls).
Record DPR, font metrics, zoom, and renderer config in run header.

References

bd-lff4p.2.1 (this ADR)
bd-lff4p.2.4 (glyph rasterization + atlas cache)
bd-lff4p.2.10 (web perf harness)
bd-lff4p.5.1 (golden trace format)
bd-lff4p.5.2 (trace replayer + checksum gates)