Taos Engine ▦ Taos: Building a Modern WebGPU Game Engine

Terranaut — Technical Deep Dive

A spherical, fully editable marching-cubes planet you can walk around. Gravity always points toward the world origin, so the astronaut roams the whole sphere; the terrain is sculpted live by raycasting the camera's aim ray and editing the density volume under the hit point. Built on the new deferred render graph — cascaded shadows, GTAO, atmosphere sky, dynamic-sky IBL, spherical-shell volumetric clouds, TAA, bloom and auto-exposure — with a player-pinned sun so the whole planet stays lit no matter where you walk.

File map

Everything lives in samples/terranaut.{html,ts} plus the support folder samples/terranaut/:

File                                Role
terranaut.html                      Page shell + crosshair + #info HUD + #loading splash.
terranaut.ts                        main(): assets, scene objects, controller, all passes, the per-frame render graph, player-pinned sun + IBL re-bake, terraforming.
terranaut/planet_density.ts         CPU PlanetDensity: the 64³ density volume. generate (sphere SDF + ridged fbm + floating-island shell), edit (sphere brush), sampleDensity (trilinear), raycast (ray-march). The single source of truth for the terrain.
terranaut/planet_mc_pass.ts         PlanetMarchingCubesPass: GPU density mirror, vertex/counter/indirect buffers, the march compute, the G-Buffer render pipeline, the depth-only shadow caster pipeline.
terranaut/mc_march.wgsl             cs_march (classifies cells, interpolates edge crossings, emits triangles with density-gradient normals so the curved planet shades smooth) and cs_write_indirect.
terranaut/mc_gbuffer.wgsl           Programmable vertex-pulling shader. vs_main / fs_main fill the G-Buffer (slope-blended grass / rock); vs_shadow is depth-only for the cascade pipeline.
terranaut/mc_tables.ts              The standard 256-entry edge / triangle tables.
terranaut/astronaut_controller.ts   Planet-gravity third-person controller: tangent-plane movement, jump + radial gravity, orbit camera in the local tangent frame, animation state machine, pointer-lock mouse-look.

Controls

Action                Key
Move                  WASD
Run                   hold Shift
Jump                  Space
Look                  mouse (the canvas captures the cursor on first click)
Add terrain           left-mouse
Carve terrain         right-mouse
Brush size            [ / ]
Camera zoom           mouse wheel
Toggle auto-exposure  E
Regenerate planet     R

A small crosshair in the screen center shows where the brush will act. A translucent green (add) or red (carve) sphere at the camera-ray hit shows the brush volume.

Pipeline — the per-frame render graph

PlanetMC compute             cs_march + cs_write_indirect (only on edit frames)
ShadowPass                   skinned astronaut → cascade depth (clears)
  ↳ PlanetMC shadow ×N       MC terrain → each cascade depth (loads), drawIndirect
PlanetMC.addToGraph          MC → G-Buffer (clear), drawIndirect
SkinnedGeometryPass          astronaut → G-Buffer (load)
GTAOPass                     normals + depth → AO
AtmospherePass               sky → HDR (clear, planet mode: the zenith follows
                             the camera's radial direction, see §6)
DeferredLightingPass         G-Buffer + shadow + AO + dynamic-sky IBL → HDR
                             (disableHorizonFade=1 — see §8)
AutoExposurePass             histogram of the lit HDR → exposure buffer
BrushGizmoPass               translucent sphere at the aim hit (inline pass)
PlanetCloudPass              spherical shell ray-march → HDR (overlay, see §7)
TAA → Bloom → CompositePass → backbuffer

PlanetMC imports its indirect buffer via graph.importExternalBuffer so the compute pass isn't culled; the G-Buffer pass and every shadow sub-pass declare b.read(indirect, 'indirect') so that their draws are ordered after the compute.

§1 — Density volume (CPU, single source of truth)

The density volume lives on the CPU as a Float32Array of 64³ samples. Editing, ray-casting for grounding, and ray-casting for terraforming all run on the CPU array — exact, synchronous, no GPU read-back latency. After every edit the dirty volume is uploaded once via queue.writeBuffer and the GPU marcher re-polygonizes from it.

Density convention: negative = solid, positive = air, isosurface at 0. The planet is centered on the world origin.

PlanetDensity.generate() builds the field analytically:

density = r - PLANET_RADIUS
        - heightFbm(p) - ridgedMountains(p)    // surface relief in world units
        - islandMass(p) * shellMask(r)         // floating islands in the shell above

The outermost cell ring is forced to +40 (air) so the marched mesh is always closed even if you sculpt out to the volume bounds.
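The fill loop can be sketched roughly as follows. This is a minimal illustration, not the sample's code: the grid size, world extent, planet radius and the relief() placeholder (standing in for heightFbm, ridgedMountains and the island term) are all assumptions chosen so the block runs on its own.

```typescript
// Hypothetical constants for illustration; the sample's real values may differ.
const N = 16;              // samples per axis (the sample uses 64)
const HALF_EXTENT = 10;    // assumed half-size of the volume in world units
const PLANET_RADIUS = 6;   // assumed planet radius in world units
const AIR = 40;            // value forced onto the outermost ring

// Placeholder for heightFbm(p) + ridgedMountains(p); bounded by +/-0.3.
function relief(x: number, y: number, z: number): number {
  return 0.3 * Math.sin(1.7 * x) * Math.sin(1.3 * y) * Math.sin(1.1 * z);
}

// Grid index -> world coordinate along one axis.
function toWorld(i: number): number {
  return ((i / (N - 1)) * 2 - 1) * HALF_EXTENT;
}

function generate(): Float32Array {
  const d = new Float32Array(N * N * N);
  const onEdge = (c: number) => c === 0 || c === N - 1;
  for (let k = 0; k < N; k++) {
    for (let j = 0; j < N; j++) {
      for (let i = 0; i < N; i++) {
        const idx = i + N * (j + N * k);
        if (onEdge(i) || onEdge(j) || onEdge(k)) {
          d[idx] = AIR;    // boundary ring is always air, so the mesh stays closed
          continue;
        }
        const x = toWorld(i), y = toWorld(j), z = toWorld(k);
        const r = Math.hypot(x, y, z);
        d[idx] = r - PLANET_RADIUS - relief(x, y, z);   // negative = solid
      }
    }
  }
  return d;
}
```

Samples near the grid center come out negative (solid), samples beyond the radius come out positive (air), and the boundary ring is exactly +40 regardless of the analytic value.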

PlanetDensity.edit(center, radius, strength) applies a sphere brush with smoothstep falloff in place, marking the volume dirty. raycast steps along the ray at half-cell size, finds the first negative crossing, refines with five bisection steps, and returns the hit + the gradient normal — used for both astronaut grounding and the brush aim.
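A hedged sketch of the raycast described above: half-cell steps, first negative sample, then five bisection steps. The density is passed in as a function so the block is self-contained; in the sample it would be the trilinear sampleDensity. The function signature and the early-out when the origin is already inside solid are assumptions, and the gradient normal is omitted for brevity.

```typescript
type Vec3 = [number, number, number];
type DensityFn = (p: Vec3) => number;   // negative = solid, positive = air

const pointAt = (o: Vec3, d: Vec3, t: number): Vec3 =>
  [o[0] + d[0] * t, o[1] + d[1] * t, o[2] + d[2] * t];

// `dir` must be normalized. Returns the ray parameter and hit point, or null.
function raycast(
  sample: DensityFn, origin: Vec3, dir: Vec3, maxDist: number, cellSize: number,
): { t: number; point: Vec3 } | null {
  if (sample(origin) < 0) return null;   // assumption: no hit when starting in solid
  const step = cellSize * 0.5;           // half-cell marching step
  let tPrev = 0;
  for (let t = step; t <= maxDist; t += step) {
    if (sample(pointAt(origin, dir, t)) < 0) {
      // Crossing lies in [tPrev, t]; refine with five bisection steps.
      let lo = tPrev, hi = t;
      for (let i = 0; i < 5; i++) {
        const mid = 0.5 * (lo + hi);
        if (sample(pointAt(origin, dir, mid)) < 0) hi = mid; else lo = mid;
      }
      const tHit = 0.5 * (lo + hi);
      return { t: tHit, point: pointAt(origin, dir, tHit) };
    }
    tPrev = t;
  }
  return null;
}
```

With a half-cell step, five bisections narrow the bracket to 1/64 of a cell, which is well below anything visible in grounding or brush placement.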

§2 — Marching cubes (mc_march.wgsl, planet_mc_pass.ts)

The kernel is the standard pure-GPU pipeline: one compute invocation per cell classifies its 8 corners, looks up an edge mask + triangle list from the preloaded tables, interpolates the edge crossings, and appends the resulting vertices to a shared vertex buffer, reserving slots with an atomicAdd on the vertex counter. A tiny follow-up workgroup copies the counter into a 4-word drawIndirect args buffer (vertexCount, instanceCount, firstVertex, firstInstance) so the host never reads the triangle count.

Two refinements over the previous flat-terrain sample:

  • Gradient normals. The crossing point on each cell edge gets its normal from the central-differences gradient of the density field at the two integer corners, interpolated by the same t. Density rises toward "air", so the gradient already points outward — exactly the surface normal. The planet shades smooth instead of faceted.
  • Vertex-buffer overflow guard. grid_size.w carries the maxVertices capacity; per-cell writes are skipped past the limit and cs_write_indirect clamps the indirect vertex count. The vertex shader cannot read out of bounds.
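The gradient-normal refinement can be illustrated on the CPU. The sketch below is TypeScript rather than the WGSL the sample uses, and the Field type and function names are hypothetical; it shows only the per-edge step (crossing parameter t, then corner gradients interpolated by the same t).

```typescript
type Vec3 = [number, number, number];
type Field = (x: number, y: number, z: number) => number;  // density lookup

// Central-differences gradient at an integer grid corner (grid spacing 1).
function gradient(f: Field, c: Vec3): Vec3 {
  const [x, y, z] = c;
  return [
    (f(x + 1, y, z) - f(x - 1, y, z)) * 0.5,
    (f(x, y + 1, z) - f(x, y - 1, z)) * 0.5,
    (f(x, y, z + 1) - f(x, y, z - 1)) * 0.5,
  ];
}

const lerp3 = (a: Vec3, b: Vec3, t: number): Vec3 =>
  [a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t, a[2] + (b[2] - a[2]) * t];

// Vertex on the edge a-b. Only called for edges whose corner densities have
// opposite signs, so (da - db) is never zero.
function edgeVertex(f: Field, a: Vec3, b: Vec3): { position: Vec3; normal: Vec3 } {
  const da = f(a[0], a[1], a[2]);
  const db = f(b[0], b[1], b[2]);
  const t = da / (da - db);                        // where density crosses 0
  const position = lerp3(a, b, t);
  const g = lerp3(gradient(f, a), gradient(f, b), t);
  const len = Math.hypot(g[0], g[1], g[2]) || 1;   // avoid dividing by zero
  // Density rises toward air, so the normalized gradient points outward.
  return { position, normal: [g[0] / len, g[1] / len, g[2] / len] };
}
```

Because neighboring cells share corners, they compute identical gradients at shared edges, which is what makes the shading continuous across cell boundaries.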

The render side keeps the original sample's programmable vertex pulling: the vertex buffer is bound as a read-only storage buffer (no vertex attributes declared on the pipeline) and drawIndirect supplies the vertex count.

The G-Buffer fragment is a low-poly stylized surface: rock on cliffs, grass on flats (dot(N, radial_up)), plus a coarse per-block hash tint so flat fields aren't dead-flat.

§3 — Astronaut

AstronautSmooth.glb is loaded by the engine's GltfLoader. The model is an FBX2glTF export whose inverse-bind matrices omit the Armature node's 100× scale and -90° X rotation, so the skinned (animated) pose comes out 100× too big and sideways. The sample applies the same fix used by the glTF viewer: it pre-multiplies Mat4.rotationX(π/2) · Mat4.scale(0.01) onto skin.rootTransform, which corrects the animated pose without touching the bind-pose mesh.

The astronaut is rendered through the engine's existing SkinnedGeometryPass (GPU skinning into the G-Buffer) and cast into every shadow cascade through ShadowPass's ShadowSkinnedDraw path, so its shadow tracks the live pose.

§4 — Controller (planet gravity)

up = normalize(pos - planetCenter) is recomputed every frame from the astronaut's world position; movement is in the tangent plane around that up; the astronaut's facing and the orbit-camera's forward are both carried as persistent tangent vectors parallel-transported each frame (re-projected onto the new tangent plane). That means there's no fixed world-up reference — crossing the planet's poles is seamless.
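The re-projection step can be sketched in a few lines. This is an illustration under assumptions: the helper names, the degenerate-case fallback and the walk() driver are hypothetical, not the controller's actual code.

```typescript
type Vec3 = [number, number, number];
const dot = (a: Vec3, b: Vec3) => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
const add = (a: Vec3, b: Vec3): Vec3 => [a[0] + b[0], a[1] + b[1], a[2] + b[2]];
const scale = (a: Vec3, s: number): Vec3 => [a[0] * s, a[1] * s, a[2] * s];
const length = (a: Vec3) => Math.hypot(a[0], a[1], a[2]);
const normalize = (a: Vec3): Vec3 => scale(a, 1 / length(a));

// Re-project a persistent tangent vector onto the tangent plane of the new up.
function transport(v: Vec3, up: Vec3): Vec3 {
  const t = add(v, scale(up, -dot(v, up)));   // remove the radial component
  const len = length(t);
  return len > 1e-6 ? scale(t, 1 / len) : v;  // assumption: keep old vector if degenerate
}

// Walk straight ahead on a sphere centered at the origin, one small step per frame.
function walk(pos: Vec3, heading: Vec3, stepLen: number, steps: number) {
  for (let s = 0; s < steps; s++) {
    const radius = length(pos);
    pos = scale(normalize(add(pos, scale(heading, stepLen))), radius);
    heading = transport(heading, normalize(pos));   // up is recomputed from pos
  }
  return { pos, heading };
}
```

Starting just short of the north pole and walking toward it, the heading stays unit-length and tangent on every frame, and nothing special happens at the pole itself because no world-up vector is ever consulted.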

  • Movement is camera-relative: moveDir = camFwd*fwd + camRight*strafe, slewed along the surface, then re-projected.
  • Astronaut orientation is built from a full orthonormal basis (tangentRight, planetUp, heading) and converted to a quaternion via quatFromBasis (the standard matrix-to-quaternion).
  • The camera orbit lives in the local tangent frame: yaw around up, pitch off up; its world rotation is also built from a basis (not yaw·pitch, which silently assumes world up).
  • Jump integrates a radial velocity against gravity until the foot returns to the (live) ground radius.
  • Grounding raycasts the CPU density: origin pos + up·HEAD_HEIGHT, dir -up. The probe starts at head height (not several units above) so it stays in the air pocket under any overhang the astronaut physically fits under — otherwise the ray would punch through the overhang's bottom and snap the player onto its top. A headroom check (sampleDensity at head height) gates each horizontal step, so walking into an overhang with too little clearance is blocked instead. The same raycast powers the terraform aim ray from the camera.
  • Spawn scans 240 Fibonacci-sphere directions and picks the tallest dry-land point on the sunward side, so the first frame is a hilltop view in daylight.
  • Animation state machine: Idle / Walk / Run (clip sped up so the feet don't slip) / Jump, driven by the gait the controller infers from input.
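The spawn scan in the list above can be sketched as follows. The Fibonacci-sphere construction is the usual golden-angle form; pickSpawn, surfaceRadius and seaLevel are hypothetical names, since the source does not show how the sample measures terrain height or defines dry land.

```typescript
type Vec3 = [number, number, number];

// n roughly uniform unit directions using the golden-angle spiral.
function fibonacciSphere(n: number): Vec3[] {
  const goldenAngle = Math.PI * (3 - Math.sqrt(5));
  const out: Vec3[] = [];
  for (let i = 0; i < n; i++) {
    const y = 1 - (2 * (i + 0.5)) / n;          // evenly spaced in y
    const r = Math.sqrt(1 - y * y);             // ring radius at that y
    const phi = i * goldenAngle;
    out.push([Math.cos(phi) * r, y, Math.sin(phi) * r]);
  }
  return out;
}

// Tallest dry-land direction on the sunward hemisphere, or null if none.
function pickSpawn(
  dirs: Vec3[], towardSun: Vec3,
  surfaceRadius: (d: Vec3) => number,            // hypothetical terrain query
  seaLevel: number,                              // hypothetical dry-land threshold
): Vec3 | null {
  let best: Vec3 | null = null;
  let bestR = -Infinity;
  for (const d of dirs) {
    const sunward = d[0] * towardSun[0] + d[1] * towardSun[1] + d[2] * towardSun[2];
    if (sunward <= 0) continue;                  // night side
    const r = surfaceRadius(d);
    if (r <= seaLevel || r <= bestR) continue;   // under water or not taller
    best = d;
    bestR = r;
  }
  return best;
}
```

In the sample the terrain query would presumably be a raycast against the CPU density along each direction; 240 candidates is cheap enough to run once at startup or on regenerate.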

§5 — Terraforming

Each frame the sample casts a ray from camera.position along camera.forward (the cross-hair direction) into the density volume. When the ray hits a surface and a mouse button is held, density.edit(hit, brushRadius, ±BRUSH_RATE*dt) modifies the volume — left-click adds material (positive strength → density pushed negative → solid), right-click carves (negative strength → density pushed positive → air). The volume's dirty flag is checked once per frame; if set, the CPU array is re-uploaded to the GPU density buffer and the marcher is told to re-march.
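A minimal sketch of a sphere brush with smoothstep falloff that follows the sign convention above (positive strength pushes density negative, toward solid). It is not the sample's PlanetDensity.edit: here the center and radius are in grid units and the outermost ring is left untouched, both of which are assumptions made to keep the block short.

```typescript
type Vec3 = [number, number, number];

function smoothstep(e0: number, e1: number, x: number): number {
  const t = Math.min(1, Math.max(0, (x - e0) / (e1 - e0)));
  return t * t * (3 - 2 * t);
}

// Edits an n^3 volume in place and returns whether anything changed (the dirty flag).
function editSphere(
  vol: Float32Array, n: number, center: Vec3, radius: number, strength: number,
): boolean {
  let dirty = false;
  // Clamp the brush bounding box to the interior so the boundary ring stays air.
  const lo = (c: number) => Math.max(1, Math.floor(c - radius));
  const hi = (c: number) => Math.min(n - 2, Math.ceil(c + radius));
  for (let k = lo(center[2]); k <= hi(center[2]); k++) {
    for (let j = lo(center[1]); j <= hi(center[1]); j++) {
      for (let i = lo(center[0]); i <= hi(center[0]); i++) {
        const dist = Math.hypot(i - center[0], j - center[1], k - center[2]);
        if (dist >= radius) continue;
        const falloff = 1 - smoothstep(0, radius, dist);   // 1 at center, 0 at rim
        vol[i + n * (j + n * k)] -= strength * falloff;    // + strength => more solid
        dirty = true;
      }
    }
  }
  return dirty;
}
```

Per frame the call would look like editSphere(vol, n, hit, brushRadius, sign * BRUSH_RATE * dt), with sign = +1 for left-click and -1 for right-click, followed by a single upload if the function returned true.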

The brush gizmo is a translucent sphere drawn from an inline tiny pipeline into the HDR after lighting (so it never pollutes the G-Buffer), colored green for add and red for carve.

§6 — Sky

AtmospherePass.setPlanet({ radius: PLANET_RADIUS, atmosphereHeight: 12 }) puts the pass into planet mode, where the shader's "zenith" is computed per-camera as normalize(cameraPos) rather than world +Y — see atmosphere.wgsl. No host-side view-frame rotation is required; the per-spot horizon, day-night fade and stars all track the camera's radial direction natively. The dynamic-sky IBL is left in world space — it's an ambient contribution, so any slight inconsistency between visible sky (local-up) and ambient (world-up) is invisible.

§7 — Volumetric clouds (spherical shell)

Clouds use a new PlanetCloudPass (engine-side, under src/renderer/render_graph/passes/) that ray-marches a thin spherical shell between planetRadius + cloudBaseAlt and planetRadius + cloudTopAlt centered on the world origin. The flat-world CloudPass was unusable here — its slab assumes world +Y up and would slice across the screen as you walk.

Pass shape

  • Fullscreen triangle. Inputs: scene depth (clip clouds against terrain) + the same CloudNoiseTextures (createCloudNoiseTextures(device)) used by the engine's flat clouds.
  • Always overlay mode: outputs premultiplied (cloud_color, 1 − total_trans) and blends over the lit HDR. The sky itself is drawn earlier by AtmospherePass.
  • Slotted in the graph after BrushGizmo and before TAA (so TAA resolves cloud aliasing along with everything else).

Shell intersection — shell_range(ro, rd, r_inner, r_outer)

The single-range traversal handles three camera positions:

  • Below the shell (camera radius < r_inner — the surface case): leave the inner sphere at inner.y, exit the outer at outer.y.
  • Inside the shell: start at 0, exit at min(inner.x ahead, outer.y).
  • Above the shell: enter at outer.x, exit at the first hit ahead. (For flying high over the planet — not used by the on-surface astronaut.)
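The three cases can be written out on the CPU as a sketch. This is TypeScript, not the pass's WGSL, and the function names and the null return for a miss are assumptions; inner and outer are the (near, far) ray parameters of the two sphere intersections, corresponding to .x and .y in the list above.

```typescript
type Vec3 = [number, number, number];
const dot = (a: Vec3, b: Vec3) => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];

// Ray vs sphere centered at the origin; rd must be normalized.
// Returns [near, far] ray parameters, or null when the ray misses.
function raySphere(ro: Vec3, rd: Vec3, radius: number): [number, number] | null {
  const b = dot(ro, rd);
  const c = dot(ro, ro) - radius * radius;
  const h = b * b - c;
  if (h < 0) return null;
  const s = Math.sqrt(h);
  return [-b - s, -b + s];
}

// Single [start, end] span of the ray inside the shell, or null.
function shellRange(ro: Vec3, rd: Vec3, rInner: number, rOuter: number): [number, number] | null {
  const outer = raySphere(ro, rd, rOuter);
  if (!outer || outer[1] <= 0) return null;          // shell entirely missed or behind
  const inner = raySphere(ro, rd, rInner);
  const r = Math.sqrt(dot(ro, ro));
  if (r < rInner && inner) {
    return [inner[1], outer[1]];                     // below: leave inner, exit outer
  }
  const innerAhead = inner !== null && inner[0] > 0; // planet-side sphere is in front
  if (r <= rOuter) {
    return [0, innerAhead ? Math.min(inner[0], outer[1]) : outer[1]];   // inside
  }
  return [outer[0], innerAhead ? inner[0] : outer[1]];                  // above
}
```

Only the first span is returned, so a ray from above that passes through the shell, skims past the inner sphere and re-enters the shell on the far side is treated as one continuous span; that matches the single-range design described above.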

Planet-aware density — pc_sample_density(p, r_inner, r_outer, …)

Same Perlin-Worley field as the flat clouds (large + medium passes, detail erosion), with one critical difference: the height gradient is driven by length(p) (radial altitude from the planet center) instead of world p.y. The noise pattern is still sampled in world space, so clouds at a given world position look the same to every observer — there's just no "Y is up" assumption baked into the density profile.

The two-tap light_march toward the sun is identical to the flat-world version, except it clips against the radial shell (r < r_inner || r > r_outer) instead of a Y range.

Adaptive step count

A ray crossing the shell straight up traverses (cloudTopAlt − cloudBaseAlt) units; a grazing ray crosses several times that. The step count is clamp(span_len / (Δalt × 0.4), 16, 64), so short overhead rays get the minimum and long grazing spans get up to 64 taps. Step size is the same per pixel within a frame, jittered by an IGN hash so TAA resolves the cloud edge instead of dithering it.
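As a worked illustration of the clamp, the sketch below evaluates the formula on the CPU and adds the interleaved-gradient-noise hash in its commonly published form. Rounding to an integer and the function names are assumptions; the source only gives the clamp expression.

```typescript
// Step count for a ray that spends spanLen units inside the cloud shell.
function cloudStepCount(spanLen: number, cloudBaseAlt: number, cloudTopAlt: number): number {
  const dAlt = cloudTopAlt - cloudBaseAlt;          // shell thickness
  const raw = spanLen / (dAlt * 0.4);
  return Math.min(64, Math.max(16, Math.round(raw)));   // rounding is an assumption
}

// Interleaved gradient noise: a per-pixel value in [0, 1) used to offset the
// first sample along the ray.
function interleavedGradientNoise(px: number, py: number): number {
  const fract = (v: number) => v - Math.floor(v);
  return fract(52.9829189 * fract(0.06711056 * px + 0.00583715 * py));
}
```

For a shell 2 units thick, a vertical ray (span 2) gives 2 / 0.8 = 2.5 and clamps up to 16 steps, a span of 20 gives 25 steps, and a span of 100 gives 125 and clamps down to 64.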

Limitations (v1)

  • No wind. Adding wind cleanly on a sphere needs a per-sample tangent frame; a world-space XYZ offset would slide clouds wrongly as the player walks.
  • World-axis noise grain. cd_rotate_xz rotates around world Y to break the Perlin grid; a player at "world poles" sees a slightly different grain than one at the equator. Re-sampling in a local-tangent frame would fix it.
  • Cloud shell vs atmosphere thickness. If the cloud shell is set above atmosphereHeight, clouds render against empty space (no scattering around them), which reads as a wall of gray lit only by ambientColor. Tune cloudBaseAlt / cloudTopAlt to live inside (or just at the top of) the atmosphere band for the prettiest result.

§8 — Lighting (player-pinned sun)

There is no day/night cycle. The sun is pinned to the astronaut's local tangent frame every frame, so wherever you walk on the sphere the sun stays high in the local sky, tilted SUN_TILT (about 26°) off the zenith. Walking around the planet rotates the sun smoothly with the astronaut's up; turning in place does not.

Sun direction each frame

const SUN_TILT = 0.45;                     // ~26° off zenith
const up        = astro.up;                // radial — planet center → astronaut
let tangentX    = worldX − (worldX · up) · up;   // azimuth reference
if (tangentX is degenerate)                // up nearly parallel to world +X
  tangentX = worldZ − (worldZ · up) · up;
toward_sun     = up * cos(SUN_TILT) + normalize(tangentX) * sin(SUN_TILT);
sun.direction  = −toward_sun;              // engine wants light-travel dir
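The pseudocode above translates to runnable TypeScript roughly as follows. The tuple-based vector helpers and the 1e-3 degeneracy threshold are assumptions for this sketch; the engine's own vector type is not shown in the source.

```typescript
type Vec3 = [number, number, number];
const dot = (a: Vec3, b: Vec3) => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
const add = (a: Vec3, b: Vec3): Vec3 => [a[0] + b[0], a[1] + b[1], a[2] + b[2]];
const scale = (a: Vec3, s: number): Vec3 => [a[0] * s, a[1] * s, a[2] * s];
const length = (a: Vec3) => Math.hypot(a[0], a[1], a[2]);

const SUN_TILT = 0.45;   // radians, about 26 degrees off the zenith

// Component of a world axis that lies in the tangent plane of `up`.
const tangentOf = (axis: Vec3, up: Vec3): Vec3 => add(axis, scale(up, -dot(axis, up)));

// `up` is the unit radial direction at the astronaut.
// Returns the light-travel direction (pointing away from the sun).
function sunDirection(up: Vec3): Vec3 {
  let tangent = tangentOf([1, 0, 0], up);                    // azimuth reference
  if (length(tangent) < 1e-3) tangent = tangentOf([0, 0, 1], up);   // up ~ world X
  const t = scale(tangent, 1 / length(tangent));
  const towardSun = add(scale(up, Math.cos(SUN_TILT)), scale(t, Math.sin(SUN_TILT)));
  return scale(towardSun, -1);
}
```

Because up and the normalized tangent are orthogonal unit vectors, the result is already unit length, and the angle between the toward-sun direction and up is SUN_TILT at every point on the sphere, including where the fallback axis is used.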

The tilt reference is decoupled from the astronaut's heading on purpose — a heading-derived tilt would swing the sun every time you turn, which reads as flickering. A world-axis reference rotates the sun only when you actually translate across the sphere.

IBL re-bake on a sun-angle gate

DynamicSky.bake() re-renders the equirectangular sky panorama and the convolved IBL cubes so they track the (astronaut-pinned) sun. The bake costs ~1–2 ms, so it runs only when the toward-sun direction has rotated more than 0.02 rad (≈1.1°) since the last bake, i.e. when its dot product with the last baked direction drops below cos(0.02). A change that small is invisible at the IBL's effective angular resolution, and the gate keeps the panorama + IBL convolve off the per-frame hot path. The IBL bake exposure is set higher than in other samples (DynamicSky.create(ctx, 0.7)) so that shadowed pixels, which only see the IBL, don't crush to black against the bright lit side of the curved terrain.
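The gate itself is a dot-product comparison against the last baked direction. The class below is a hypothetical sketch of that logic only; it does not call DynamicSky, and the class and method names are illustrative.

```typescript
type Vec3 = [number, number, number];

const BAKE_COS = Math.cos(0.02);   // 0.02 rad, about 1.1 degrees

class SunBakeGate {
  private lastBaked: Vec3 | null = null;

  // Returns true when the caller should re-bake. `towardSun` must be unit length.
  shouldBake(towardSun: Vec3): boolean {
    const last = this.lastBaked;
    if (last !== null) {
      const c = last[0] * towardSun[0] + last[1] * towardSun[1] + last[2] * towardSun[2];
      if (c >= BAKE_COS) return false;   // rotated less than the threshold
    }
    // Remember the direction we are about to bake with.
    this.lastBaked = [towardSun[0], towardSun[1], towardSun[2]];
    return true;
  }
}
```

The comparison is against the last baked direction rather than the previous frame's, so slow continuous walking still triggers a bake once the accumulated rotation passes the threshold.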

disableHorizonFade — the bug that motivated the new flag

DeferredLightingPass's default shader behavior multiplies both the direct sun and the IBL ambient by smoothstep(−0.05, 0.05, L.y), a world-Y horizon fade designed so that a setting sun (day/night) smoothly extinguishes lighting. On a sphere with a player-pinned sun this is catastrophic: walking onto the world's southern hemisphere puts the sun direction below y = 0 in world space, even though it's still high in the astronaut's local sky, and the fade silently zeros every contribution. The screen goes black.

The lighting pass now takes an optional disableHorizonFade flag that short-circuits the fade to 1.0. Terranaut sets it; flat-world samples leave the default.

Plumbing:

  • LightUniforms in deferred_lighting.wgsl repurposes the _pad_light slot as disableHorizonFade: u32.
  • DeferredLightingPass.updateLight(..., disableHorizonFade = false) is the new last argument.
  • The shader replaces horizon_fade = smoothstep(...) with select(smoothstep(...), 1.0, light.disableHorizonFade != 0u).
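The effect of the flag can be checked numerically with a CPU emulation of the shader expression. This sketch is TypeScript, not the WGSL in deferred_lighting.wgsl; it uses the WGSL argument order smoothstep(low, high, x), and the function names are illustrative.

```typescript
// Same definition as the WGSL builtin smoothstep(low, high, x).
function smoothstep(low: number, high: number, x: number): number {
  const t = Math.min(1, Math.max(0, (x - low) / (high - low)));
  return t * t * (3 - 2 * t);
}

// sunY is the world-space Y of the toward-sun direction.
// Mirrors select(smoothstep(...), 1.0, disableHorizonFade != 0u).
function horizonFade(sunY: number, disableHorizonFade: boolean): number {
  return disableHorizonFade ? 1.0 : smoothstep(-0.05, 0.05, sunY);
}
```

With the astronaut on the southern hemisphere the toward-sun Y might be around -0.8: the default path yields a fade of 0, which multiplies both direct and ambient light to black, while the flagged path returns 1 and leaves the lighting untouched.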

Auto-exposure on by default (toggle with E)

autoExposurePass.enabled is true at startup, clamped to minExposure 0.8, maxExposure 2.6 so the dark-space limb doesn't drag the histogram percentile metering past anything useful.

The trade-off: with a player-pinned sun the scene's lighting is roughly constant, so letting the adapter run makes the perceived "ambient" change as the camera reframes — a shadow looks bright when you stare straight into it and dark when you pan off, because the histogram metering shifts with what's on screen. Press E to lock the tonemap to a fixed 1.5 (HUD shows exp: 1.5 vs exp: AUTO); ?noexp starts the page locked.

§9 — Notable techniques and gotchas

  • CPU density + GPU march. A single source of truth for the field. Brush edits and astronaut grounding are exact and synchronous; the marcher just reads the uploaded mirror.
  • March on edit only. The compute pass dispatches cs_march / cs_write_indirect only when the volume is dirty; on idle frames the cached vertex buffer + indirect args are reused so the GPU does no extra work.
  • MC shadow caster via drawIndirect. ShadowPass only knows vertex-attribute meshes; the marching-cubes geometry is in a vertex-pulled storage buffer. PlanetMarchingCubesPass.addShadowToGraph adds a small per-cascade depth-only sub-pass that reads the shared indirect handle and rasterises the MC vertices into each cascade layer.
  • Astronaut armature fix. AstronautSmooth.glb's inverse-bind matrices are off by 100× scale + 90° X rotation; the sample pre-multiplies the correction onto skin.rootTransform (same fix as the glTF viewer).
  • Atmosphere planet mode handles per-spot up. atmosphere.wgsl computes its "zenith" as normalize(cameraPos) when in planet mode, so the host doesn't need to rotate the view frame to keep the horizon glued to the player. A previous version of this sample applied a shortest-arc rotation to invViewProj/camPos/sunDir; it's unnecessary and was removed.
  • World-horizon fade on a player-pinned sun. The deferred lighting pass multiplies both direct and IBL contributions by smoothstep(−0.05, 0.05, L.y) to fade a setting sun. With a sun pinned to the astronaut's local zenith, walking onto the world's southern hemisphere pushes L.y below 0 in world space and silently kills every contribution. Terranaut sets the new disableHorizonFade flag on DeferredLightingPass.updateLight to bypass it; see §8.
  • Auto-exposure has a known wobble on a sphere. Letting the adapter run gives a shadow one apparent brightness when you stare straight into it and a different one when you reframe — the histogram metering moves with the framing. The adapter is on by default (clamped 0.8..2.6 so the dark-space limb doesn't drag the metering past anything useful); press E to lock the tonemap to setFixedExposure(1.5) when the wobble shows up.
  • Spherical cloud shell, not a slab. The engine's CloudPass raymarches a flat-world Y slab, which slices across a sphere as you walk; terranaut uses the new PlanetCloudPass instead, which integrates between two concentric spheres. The density module is also reimplemented so the height gradient reads from length(p) (radial) rather than p.y; see §7. Place the cloud shell inside the atmosphere band — clouds above the atmosphere top render against empty space and read as a wall of gray.
  • ?noclouds URL toggle. Useful when profiling the rest of the pipeline or to confirm what's atmosphere vs cloud in a screenshot.