Volumetric Fluid — Technical Deep Dive

The Volumetric Fluid sample is a GPU grid-based (Eulerian) fluid simulator that renders fire, smoke and water as a participating medium. A 3D "stable fluids" solver advances velocity / density / temperature fields entirely in compute shaders, and a fullscreen ray-march composites the density field as volumetric emission and absorption over a procedural sky and ground. It exercises the engine's render graph (compute and render passes), the RenderContext device / timing plumbing, the Camera + CameraController fly camera, and the shared BloomPass / TonemapPass post-processing passes.

1. Overview

You fly a free camera (WASD + mouse, no pointer lock) around a single simulation box that sits on a checkerboard ground plane. A control panel on the right switches between four templates and three grid resolutions, and a slider bar at the bottom tunes three live multipliers.

The four templates (all the same solver, re-tuned):

Key	Template	Render mode	Emitter	What you see
1	Campfire	`fire`	`plume`	Continuous hot plume, black-body emission, flicker
2	Fireball	`fire`	`burst`	Spherical bursts re-fired every 3.2 s
3	Smoke	`smoke`	`plume`	Cool buoyant plume, sun-lit and self-shadowed
4	Splashing Water	`water`	`drip`	Drops fall under gravity into a churning pool

Controls:

WASD + mouse — fly the camera (CameraController, pointerLock: false).
1–4 — select template; R — reset the current template; T — trigger (re-fire a burst/drip immediately).
G — toggle the render-graph visualization overlay.
Panel buttons mirror the keys, plus Low / Medium / High resolution buttons.
Sliders: Speed (0–2x), Opacity (0.3–2x), Detail (0–2x).

All panel state (template, resolution, three sliders) is persisted to localStorage under the key crafty.fluid.settings and restored on reload.
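A minimal sketch of that persistence round trip follows. Only the storage key comes from the sample; the settings shape, its field names, the default values and the `KeyValueStore` interface are assumptions made so the code runs outside a browser (in a page, `window.localStorage` satisfies the same two methods).

```typescript
// Hypothetical settings shape: the field names are assumptions, only the
// storage key below comes from the sample.
interface FluidSettings {
  template: string;
  resolution: number; // 0 = Low, 1 = Medium, 2 = High (assumed encoding)
  speed: number;
  opacity: number;
  detail: number;
}

// Minimal subset of the Web Storage API used by this sketch.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const SETTINGS_KEY = 'crafty.fluid.settings';

// Assumed defaults, used when nothing has been stored yet.
const DEFAULT_SETTINGS: FluidSettings = {
  template: 'campfire',
  resolution: 1,
  speed: 1,
  opacity: 1,
  detail: 1,
};

function saveSettings(store: KeyValueStore, s: FluidSettings): void {
  store.setItem(SETTINGS_KEY, JSON.stringify(s));
}

function loadSettings(store: KeyValueStore): FluidSettings {
  const raw = store.getItem(SETTINGS_KEY);
  if (raw === null) return { ...DEFAULT_SETTINGS };
  try {
    // Merge over the defaults so a missing field never comes back undefined.
    return { ...DEFAULT_SETTINGS, ...(JSON.parse(raw) as Partial<FluidSettings>) };
  } catch {
    return { ...DEFAULT_SETTINGS }; // corrupt JSON: fall back to defaults
  }
}
```

Merging over defaults keeps older stored payloads usable if a field is added later.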

2. Architecture

File map for the sample (samples/fluid_test.html, samples/fluid_test.ts, and everything under samples/fluid/):

File	Responsibility
`fluid_test.html`	Page shell: canvas, info HUD, the (empty) `#panel` populated by JS, the slider bar, and the `source_viewer.ts` include.
`fluid_test.ts`	App entry. Owns camera, UI, persisted settings, per-frame parameter packing, and wires the compute + render-graph passes each frame.
`fluid/presets.ts`	The four `FluidPreset` configs, the reference (Medium) `GRID` resolution, and the world-space box (`BOX_MIN` / `BOX_SIZE`).
`fluid/fluid_sim.ts`	`FluidSim` class — owns the field 3D textures and six compute pipelines, records the solve into its own command buffer.
`fluid/fluid_sim.wgsl`	The six solver compute kernels (advect, vorticity, forces, divergence, pressure, project).
`fluid/fluid_render_pass.ts`	`FluidRenderPass` — a render-graph `Pass` that ray-marches the density field into an HDR target.
`fluid/fluid_render.wgsl`	Vertex (fullscreen triangle) + fragment (volumetric ray-march) shader.

The compute simulation (FluidSim) is deliberately not a render-graph pass. It submits its own command buffer first; the render graph then consumes the density texture FluidSim produced. Only the ray-march, bloom and tonemap go through the graph.

3. The simulation

3.1 The grid and its fields

The reference grid is GRID = [64, 96, 64] cells (presets.ts) — tall in Y to give plumes vertical room. It maps to a world-space box BOX_MIN = [-6, 0, -6], BOX_SIZE = [12, 18, 12], sitting on the ground at y = 0.

FluidSim (fluid/fluid_sim.ts) allocates these 3D textures, each dimension: '3d' with TEXTURE_BINDING | STORAGE_BINDING usage:

Field	Format	Count	Holds
`_vel`	`rgba16float`	2 (ping-pong)	velocity xyz
`_den`	`rgba16float`	2 (ping-pong)	`r` = density, `g` = temperature
`_prs`	`r32float`	2 (ping-pong)	pressure
`_div`	`r32float`	1	velocity divergence
`_vort`	`rgba16float`	1	curl (vorticity) of velocity

Density and temperature share one rgba16float texture (.x / .y), so advection moves both in a single trilinear sample. Pressure and divergence use r32float — scalar fields needing the precision for the Jacobi relaxation.

The host owns the ping-pong: _vi and _di index the slot holding the current velocity / density state. Each step that produces a new field reads slot i and writes slot i ^ 1, then flips the index. _prs ping-pongs inside the pressure loop. _div and _vort are single-buffered: each is written by one kernel and consumed by a later kernel in the same step().
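The read / write / flip bookkeeping can be sketched as a small helper. The class and its method names are hypothetical, and `T` stands in for a GPUTexture; the sample itself is only described as keeping the indices `_vi` and `_di`.

```typescript
// Host-side ping-pong bookkeeping for one double-buffered field.
// T stands in for a GPUTexture; the class and method names are hypothetical.
class PingPong<T> {
  private slots: [T, T];
  private index: 0 | 1 = 0; // slot holding the current state (the role of _vi / _di)

  constructor(a: T, b: T) {
    this.slots = [a, b];
  }

  // Slot i: what the next kernel reads.
  get read(): T {
    return this.slots[this.index];
  }

  // Slot i ^ 1: what the next kernel writes.
  get write(): T {
    return this.slots[this.index === 0 ? 1 : 0];
  }

  // Call once the kernel that filled `write` has been recorded.
  flip(): void {
    this.index = this.index === 0 ? 1 : 0;
  }
}
```

After `flip()`, the texture that was just written becomes the one the following kernel reads, so no copy is ever needed.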

3.2 The solver pipeline

fluid_sim.wgsl is a single module with six @compute entry points, all @workgroup_size(4, 4, 4) (the constant WG = 4 in fluid_sim.ts; dispatch is ceilDiv(grid, 4) per axis). The order each step() records is:

advect  →  vorticity  →  forces/emit  →  divergence  →  pressure (×N)  →  project

This is a classic Stam "stable fluids" loop. Each kernel is wired by its own explicit GPUBindGroupLayout (LayoutSet), so the WGSL binding list (@binding(0)–@binding(11)) is a superset — any one entry point only touches the subset its layout declares.
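The dispatch arithmetic mentioned at the start of this section can be sketched as follows. `WG` and `ceilDiv` are named in the text; `dispatchSize` is a hypothetical wrapper added for illustration.

```typescript
const WG = 4; // workgroup edge length, matching @workgroup_size(4, 4, 4)

// Integer division rounded up, for positive n and d.
function ceilDiv(n: number, d: number): number {
  return Math.floor((n + d - 1) / d);
}

// Workgroup counts per axis for a grid of [x, y, z] cells.
function dispatchSize(grid: [number, number, number]): [number, number, number] {
  return [ceilDiv(grid[0], WG), ceilDiv(grid[1], WG), ceilDiv(grid[2], WG)];
}
```

For the Medium grid this gives 16 x 24 x 16 workgroups. All three resolutions are multiples of 4 on every axis, so no partial workgroups occur; a grid that was not a multiple of 4 would need an in-kernel bounds check.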

1. Advection — cs_advect. Semi-Lagrangian: for cell c, trace backward back = pos - velocity * dt and trilinearly resample both velocity and density at that point. Unconditionally stable for any dt.

let back = pos - loadVel(c) * dt;
let newVel = sampleVelField(back);
let newDen = sampleDenField(back);   // r = density, g = temperature

sampleVelField / sampleDenField convert cell-center coordinates to texture UVW with (p + 0.5) / grid and use a linear-filtering, clamp-to-edge sampler.
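As a CPU-side illustration of the same idea, the sketch below advects one scalar cell with a hand-written trilinear lookup. It is not the shader: the `ScalarGrid` layout and all function names are assumptions, and the GPU version gets the interpolation and clamp-to-edge behavior from the sampler instead.

```typescript
type Vec3 = [number, number, number];

// Scalar field stored x-fastest: index = x + nx * (y + ny * z).
interface ScalarGrid {
  size: Vec3;
  data: Float32Array;
}

// Integer load with clamp-to-edge addressing.
function loadCell(g: ScalarGrid, x: number, y: number, z: number): number {
  const [nx, ny, nz] = g.size;
  const cx = Math.min(Math.max(x, 0), nx - 1);
  const cy = Math.min(Math.max(y, 0), ny - 1);
  const cz = Math.min(Math.max(z, 0), nz - 1);
  return g.data[cx + nx * (cy + ny * cz)];
}

function lerp(a: number, b: number, t: number): number {
  return a + (b - a) * t;
}

// Trilinear sample at a cell-space position where integer coordinates are
// cell centers (the CPU analogue of sampling at uvw = (p + 0.5) / grid).
function sampleTrilinear(g: ScalarGrid, p: Vec3): number {
  const [px, py, pz] = p;
  const x0 = Math.floor(px), y0 = Math.floor(py), z0 = Math.floor(pz);
  const fx = px - x0, fy = py - y0, fz = pz - z0;
  const c00 = lerp(loadCell(g, x0, y0, z0), loadCell(g, x0 + 1, y0, z0), fx);
  const c10 = lerp(loadCell(g, x0, y0 + 1, z0), loadCell(g, x0 + 1, y0 + 1, z0), fx);
  const c01 = lerp(loadCell(g, x0, y0, z0 + 1), loadCell(g, x0 + 1, y0, z0 + 1), fx);
  const c11 = lerp(loadCell(g, x0, y0 + 1, z0 + 1), loadCell(g, x0 + 1, y0 + 1, z0 + 1), fx);
  return lerp(lerp(c00, c10, fy), lerp(c01, c11, fy), fz);
}

// Semi-Lagrangian step for one cell: trace back along the velocity, resample.
function advectCell(g: ScalarGrid, c: Vec3, vel: Vec3, dt: number): number {
  const back: Vec3 = [c[0] - vel[0] * dt, c[1] - vel[1] * dt, c[2] - vel[2] * dt];
  return sampleTrilinear(g, back);
}
```

Because the new value is always an interpolation of existing values, it can never exceed the field's current range, which is where the unconditional stability comes from.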

2. Vorticity — cs_vorticity. Computes the curl of the (advected) velocity field via central differences of the six neighbors and stores it per cell into _vort. This is precomputed so the forces pass can apply vorticity confinement cheaply.
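A central-difference curl for one cell might look like the sketch below. It assumes unit cell spacing and takes the velocity as a lookup function; both the `VelocityAt` type and `curlAt` are hypothetical names, not the kernel's.

```typescript
type Vec3 = [number, number, number];
type VelocityAt = (x: number, y: number, z: number) => Vec3;

// Central-difference curl at integer cell (x, y, z), unit cell spacing.
function curlAt(v: VelocityAt, x: number, y: number, z: number): Vec3 {
  const l = v(x - 1, y, z), r = v(x + 1, y, z); // -x / +x neighbors
  const d = v(x, y - 1, z), u = v(x, y + 1, z); // -y / +y neighbors
  const b = v(x, y, z - 1), f = v(x, y, z + 1); // -z / +z neighbors
  return [
    0.5 * ((u[2] - d[2]) - (f[1] - b[1])), // dVz/dy - dVy/dz
    0.5 * ((f[0] - b[0]) - (r[2] - l[2])), // dVx/dz - dVz/dx
    0.5 * ((r[1] - l[1]) - (u[0] - d[0])), // dVy/dx - dVx/dy
  ];
}
```

A rigid rotation about the z axis, v = (-y, x, 0), is a handy sanity check: its curl is (0, 0, 2) everywhere.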

3. Forces & emission — cs_forces. Everything that changes the fields outside advection and projection:

Buoyancy: vel.y += (buoyancy * (temperature - ambientTemp) - weight * density) * dt. Hot fluid rises; smoke weight drags its mass back down a little.
Gravity: vel.y -= gravity * density * dt — pulls dense fluid down (the water preset's main force).
Vorticity confinement: if vorticityStrength > 0, takes the gradient of |vort|, normalizes it, and adds eps * cross(N, vort) * dt. This re-injects the small swirls that semi-Lagrangian advection numerically smears away.
Source injection: when the emitter is active, emitterFalloff(c) gives a soft weight (flat disk for shape=0 plumes, soft sphere for shape=1 bursts/drips). Density is added, temperature is max'd up to the emitter temperature, and velocity is mix'd toward the injected velocity (plus an optional outward radial component for fireballs).
Cooling / dissipation: temperature decays linearly (-cooling * dt, clamped at 0); density and velocity decay exponentially (density *= exp(-densityDissipation * dt), likewise velocity).
Top fade: a smoothstep over the top ~6 cells multiplies density down by up to 70%, so fluid softly leaves the open top of the box instead of piling up against the (closed-pressure) ceiling.
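The buoyancy, gravity and decay terms above can be sketched per cell as follows. This is a reduced CPU illustration under stated assumptions: the record types and `applyForces` are hypothetical, only the y component of velocity is tracked, the order of the updates inside the real kernel is not specified in the text, and vorticity confinement, source injection and the top fade are left out.

```typescript
// Hypothetical per-cell state; only the y velocity matters for these terms.
interface CellState {
  velY: number;
  density: number;
  temperature: number;
}

interface ForceParams {
  buoyancy: number;
  weight: number;
  gravity: number;
  cooling: number;
  densityDissipation: number;
  velocityDissipation: number;
  ambientTemp: number;
}

function applyForces(s: CellState, p: ForceParams, dt: number): CellState {
  let velY = s.velY;
  // Buoyancy: hot fluid rises, smoke weight drags it back down.
  velY += (p.buoyancy * (s.temperature - p.ambientTemp) - p.weight * s.density) * dt;
  // Gravity: pulls dense fluid down.
  velY -= p.gravity * s.density * dt;
  // Exponential velocity decay.
  velY *= Math.exp(-p.velocityDissipation * dt);
  // Linear cooling clamped at zero, exponential density decay.
  const temperature = Math.max(s.temperature - p.cooling * dt, 0);
  const density = s.density * Math.exp(-p.densityDissipation * dt);
  return { velY, density, temperature };
}
```

With buoyancy set to 0 and a large gravity, as in the water preset, the same function makes any cell holding density accelerate downward.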

4. Divergence — cs_divergence. Central-difference divergence of the post-force velocity into _div: 0.5 * ((r-l) + (u-d) + (f-b)).

5. Pressure — cs_pressure. One Jacobi relaxation step of the pressure Poisson equation:

prsOut = (l + r + d + u + b + f - div) / 6.0;

The host runs this pipeline _iterations times in a loop, ping-ponging _prs[0] ↔ _prs[1] via two pre-built bind groups (pingA / pingB). The iteration count is forced even (iterations + (iterations & 1)) so the final relaxed pressure always lands back in _prs[0], which the projection pass then reads. iterationsFor() in fluid_test.ts scales it with grid size as round(20 * gridX / GRID[0]): 20 at Medium, 15 at Low (raised to 16 by the even rule) and 30 at High, because a bigger grid needs more relaxation sweeps to propagate pressure across it.
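The iteration arithmetic can be checked with a short sketch. The two formulas are the ones quoted above; `forceEven` and `finalSlot` are hypothetical helper names, and `finalSlot` simply simulates which pressure slot holds the result after a given number of passes starting from a read of slot 0.

```typescript
const GRID_X = 64; // GRID[0] of the Medium reference grid

// Jacobi sweeps scaled with grid width, as described for iterationsFor().
function iterationsFor(gridX: number): number {
  return Math.round(20 * gridX / GRID_X);
}

// Round an odd count up to the next even number.
function forceEven(iterations: number): number {
  return iterations + (iterations & 1);
}

// Slot that holds the newest pressure after n ping-pong passes,
// when the first pass reads slot 0 and writes slot 1.
function finalSlot(iterations: number): number {
  let newest = 0;
  for (let i = 0; i < iterations; i++) {
    newest ^= 1; // each pass writes the other slot
  }
  return newest;
}
```

For the Low grid this yields 15 sweeps, bumped to 16, and `finalSlot(16)` is 0; without the bump the result would sit in slot 1 and the projection would read stale pressure.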

6. Projection — cs_project. Subtracts the pressure gradient to make the velocity field divergence-free:

var vel = loadVel(c) - 0.5 * vec3<f32>(r - l, u - d, f - b);

It then enforces boundary conditions: zero normal velocity on the four side walls (x and z extremes), and a floor that fluid cannot sink through (if (c.y == 0) { vel.y = max(vel.y, 0.0); }). The top is left open. Reads of out-of-domain cells everywhere in the shader go through clampCoord, giving Neumann-style "no flux through the wall" sampling.
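A sketch of those boundary rules on the CPU follows. The floor rule is the one quoted above; zeroing the normal component on the side walls is an assumption about how "zero normal velocity" is enforced, and `applyBoundary` is a hypothetical name (`clampCoord` mirrors the shader helper mentioned in the text).

```typescript
type Vec3 = [number, number, number];

// Clamp an integer cell coordinate into the grid (out-of-domain reads
// therefore return the nearest in-domain cell).
function clampCoord(c: Vec3, grid: Vec3): Vec3 {
  return [
    Math.min(Math.max(c[0], 0), grid[0] - 1),
    Math.min(Math.max(c[1], 0), grid[1] - 1),
    Math.min(Math.max(c[2], 0), grid[2] - 1),
  ];
}

// Velocity boundary conditions for cell c after projection.
function applyBoundary(vel: Vec3, c: Vec3, grid: Vec3): Vec3 {
  const out: Vec3 = [vel[0], vel[1], vel[2]];
  if (c[0] === 0 || c[0] === grid[0] - 1) out[0] = 0; // x walls: no normal flow
  if (c[2] === 0 || c[2] === grid[2] - 1) out[2] = 0; // z walls: no normal flow
  if (c[1] === 0) out[1] = Math.max(out[1], 0);       // floor: cannot sink through
  return out;                                          // top row is left untouched
}
```

Note the asymmetry in y: the floor only blocks downward motion, while the top row keeps whatever vertical velocity the projection produced.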

3.3 The SimParams uniform

fluid_test.ts packs SIM_PARAM_FLOATS = 28 floats (7 × vec4, 112 bytes) each frame and FluidSim.step() uploads them. The WGSL SimParams struct maps them as:

vec4	Contents
`grid`	grid x, y, z, _
`timing`	`dt`, `time`, `vorticityStrength`, _
`emitterPos`	emitter center xyz (grid space), radius
`emitterVel`	injected velocity xyz, radial burst speed
`emitter`	rate, temperature, shape (0=plume/1=sphere), active
`forces`	buoyancy, weight, gravity, cooling
`dissipation`	densityDissipation, velocityDissipation, ambientTemp, _
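Packing that layout on the host could look like the sketch below. The vec4 order and slot contents follow the table; the `SimInputs` record and its field names are hypothetical, and unused `_` slots are left at zero.

```typescript
type Vec3 = [number, number, number];

const SIM_PARAM_FLOATS = 28; // 7 x vec4 = 112 bytes

// Hypothetical host-side input record; the field names are illustrative.
interface SimInputs {
  grid: Vec3;
  dt: number; time: number; vorticityStrength: number;
  emitterPos: Vec3; emitterRadius: number;
  emitterVel: Vec3; radialSpeed: number;
  rate: number; temperature: number; shape: 0 | 1; active: boolean;
  buoyancy: number; weight: number; gravity: number; cooling: number;
  densityDissipation: number; velocityDissipation: number; ambientTemp: number;
}

function packSimParams(p: SimInputs): Float32Array {
  const f = new Float32Array(SIM_PARAM_FLOATS);
  f.set([p.grid[0], p.grid[1], p.grid[2], 0], 0);                          // grid
  f.set([p.dt, p.time, p.vorticityStrength, 0], 4);                        // timing
  f.set([p.emitterPos[0], p.emitterPos[1], p.emitterPos[2], p.emitterRadius], 8);  // emitterPos
  f.set([p.emitterVel[0], p.emitterVel[1], p.emitterVel[2], p.radialSpeed], 12);   // emitterVel
  f.set([p.rate, p.temperature, p.shape, p.active ? 1 : 0], 16);           // emitter
  f.set([p.buoyancy, p.weight, p.gravity, p.cooling], 20);                 // forces
  f.set([p.densityDissipation, p.velocityDissipation, p.ambientTemp, 0], 24); // dissipation
  return f;
}
```

Because every member is a vec4, each one starts on a 16-byte boundary and the struct needs no extra padding beyond the `_` slots already shown.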

3.4 Reset and resize

reset() destroys and re-allocates every field texture (all zeroed by WebGPU) — used on template switch and R. resize() additionally updates gridX/Y/Z, the dispatch dimensions, and the pressure iteration count, so the grid can change resolution at runtime.

4. Presets

presets.ts defines four FluidPreset objects. Each is the same solver tuned differently, plus a renderMode for the ray-marcher. Key differentiators:

Param	Campfire	Fireball	Smoke	Water
`renderMode`	fire	fire	smoke	water
`emitterMode`	plume	burst	plume	drip
`buoyancy`	22	14	10	0
`weight`	0.9	0.6	0.4	0
`gravity`	0	0	0	30
`cooling`	1.15	0.9	0.6	0
`densityDissipation`	0.38	0.5	0.12	0.32
`velocityDissipation`	0.22	0.5	0.2	0.12
`vorticity`	9	13	6	3
`emitterRadius`	9	8	6	6
`emitterHeight`	9	36	8	80
`emitterRate`	2.7	16	1.9	24
`emitterTemp`	1.75	2.0	0.9	0
`emitterVelY`	16	7	9	−22
`emitterRadialSpeed`	0	27	0	0
`cycleSeconds`	0	3.2	0	0.85
`densityScale`	9	8	11	16
`emissiveStrength`	2.7	3.1	0	0
`tint`	warm gray	dark gray	light gray	blue

Reading the table:

Fire presets have high buoyancy and cooling and a non-zero emissiveStrength, which the ray-marcher turns into emission via the black-body fireRamp. Campfire is a steady mid-height plume; Fireball is a spherical burst placed higher in the box (emitterHeight 36) with a strong outward emitterRadialSpeed and a short cycleSeconds re-fire.
Smoke has low dissipation (long-lived density, 0.12), no emission, a light-gray albedo, and relies on lightMarch self-shadowing for shape.
Water has zero buoyancy/cooling and a large gravity of 30; the emitter drips from emitterHeight 80 with a negative emitterVelY (−22, downward), and the blue tint plus high densityScale make a dense glossy medium.

GRID, BOX_MIN, BOX_SIZE are also exported here so both the sim and the renderer agree on the domain.

5. Rendering

5.1 The pass

FluidRenderPass (fluid/fluid_render_pass.ts) is a render-graph Pass<void, FluidRenderOutputs>. Its create() builds a render pipeline with a fullscreen-triangle vertex shader, a 256-byte uniform buffer, a linear sampler, and a bind-group layout of { uniform, texture_3d<float>, sampler }. The volume texture is not a graph resource: it is owned by FluidSim, handed to the pass each frame via setVolume(sim.densityView), and bound directly in the execute callback.

addToGraph creates a transient FluidHDR texture (HDR_FORMAT = rgba16float, full canvas size), declares it as a cleared attachment, and in the execute callback binds the uniform + volume + sampler and issues enc.draw(3) — one fullscreen triangle.

5.2 The ray-march

fluid_render.wgsl's fs_main does the volumetric integration:

Ray reconstruction. The fullscreen triangle's NDC is unprojected with invViewProj at depths 0 and 1; ro is the camera position, rd the normalized world-space ray.
Background. skyColor(rd) is a vertical gradient with a sharp sun disc (pow(dot, 250)) and soft halo (pow(dot, 6)). If the ray points down, it intersects the y = 0 plane and shades groundColor — a checkerboard with distance fade and a Lambert term from the sun. bgDist records that hit.
Box intersection. intersectBox is a slab test returning (tNear, tFar). If the ray misses the box, or the box lies entirely behind the ground hit, fs_main returns just the background.
March. STEP_COUNT = 64 primary steps between tEnter and tExit = min(boxFar, bgDist), step size dh. The entry point is jittered by a static hash12(frag.pos.xy) dither to hide slice banding without TAA. The loop accumulates front-to-back and early-outs when transmittance < 0.01.
Per-sample shading. Density field.x becomes extinction sigma = density * densityScale. The shading branches on renderMode:
- Fire (mode 0): radiance = fireRamp(temperature) * emissiveStrength * density + scatter * sigma, where fireRamp is a 5-stop HDR black-body ramp (values up to ~3.2, so the hot core blows out past 1.0 for the bloom pass). scatter adds a little ambient + sun-lit smoke around the flame.
- Smoke (mode 1): a light-scattering medium — radiance = (ambient + sunColor * shadow) * tint * sigma, where shadow comes from lightMarch.
- Water (mode 2): the same scattering term, plus when the march first crosses a dense iso-surface (density > 0.4) it shades a one-off glossy splash highlight: a gradient-of-density surface normal feeds a Blinn-style spec (pow(dot(n,h), 56)) and a Fresnel-weighted sky reflection.
Self-shadowing — lightMarch. From each marched point it casts a short secondary ray of LIGHT_STEPS = 4 samples toward the sun, accumulates density, and returns exp(-sum * dh * densityScale * shadowDensity) — the light that survives to the sample. This gives smoke and water volumetric self-shadowing.
Composite. accum + bg * transmittance — the integrated volume over the remaining background visibility.
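The box intersection and the front-to-back accumulation can be sketched on the CPU as below. The slab test is the standard one; the per-step weighting in `marchChannel` (radiance times step length, Beer-Lambert attenuation) is an assumption, since the exact form used in fluid_render.wgsl is not shown here. `slab` and `marchChannel` are hypothetical names, and the dither and `bgDist` clipping are omitted.

```typescript
type Vec3 = [number, number, number];

// One axis of the slab test: entry and exit distances for that slab.
function slab(o: number, d: number, lo: number, hi: number): [number, number] {
  const inv = 1 / d; // +-Infinity when the ray is parallel to the slab
  const t0 = (lo - o) * inv;
  const t1 = (hi - o) * inv;
  return [Math.min(t0, t1), Math.max(t0, t1)];
}

// Returns [tNear, tFar]; the ray misses when tNear > tFar or tFar < 0.
// An origin lying exactly on a slab plane with a parallel ray produces NaN
// and is not handled in this sketch.
function intersectBox(ro: Vec3, rd: Vec3, boxMin: Vec3, boxSize: Vec3): [number, number] {
  const [x0, x1] = slab(ro[0], rd[0], boxMin[0], boxMin[0] + boxSize[0]);
  const [y0, y1] = slab(ro[1], rd[1], boxMin[1], boxMin[1] + boxSize[1]);
  const [z0, z1] = slab(ro[2], rd[2], boxMin[2], boxMin[2] + boxSize[2]);
  return [Math.max(x0, y0, z0), Math.min(x1, y1, z1)];
}

// Front-to-back accumulation of one color channel along [tEnter, tExit].
// sample(t) returns [radiance, sigma] at ray distance t.
function marchChannel(
  sample: (t: number) => [number, number],
  tEnter: number,
  tExit: number,
  bg: number,
  steps = 64,
): number {
  const dh = (tExit - tEnter) / steps;
  let accum = 0;
  let transmittance = 1;
  for (let i = 0; i < steps; i++) {
    const [radiance, sigma] = sample(tEnter + (i + 0.5) * dh);
    accum += transmittance * radiance * dh;   // light added at this step
    transmittance *= Math.exp(-sigma * dh);   // light surviving past it
    if (transmittance < 0.01) break;          // early-out: effectively opaque
  }
  return accum + bg * transmittance;          // composite over the background
}
```

With the sample's box (BOX_MIN = [-6, 0, -6], BOX_SIZE = [12, 18, 12]), a ray from (0, 9, -20) along +z enters at t = 14 and exits at t = 26. An empty medium returns the background unchanged, and a very dense one terminates after the first step.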

5.3 The RenderParams uniform

fluid_test.ts packs RENDER_PARAM_FLOATS = 44 floats (a mat4 + 7 × vec4, 176 bytes; the buffer itself is 256 bytes). The mat4 is the camera's inverse view-projection; the trailing vec4s carry camPos/time, boxMin/renderMode, boxSize/STEP_COUNT, sunDir/densityScale, sunColor/LIGHT_STEPS, tint/emissiveStrength, and misc (SHADOW_DENSITY, AMBIENT). Fixed scene constants in fluid_test.ts: SUN_DIR = [0.413, 0.731, 0.543], SUN_COLOR = [1.15, 1.05, 0.9], AMBIENT = 0.09, SHADOW_DENSITY = 1.0.

5.4 Post-processing

After FluidRenderPass produces the HDR target, the shared BloomPass blooms it (threshold 1.1, knee 0.6, strength 0.55 for fire vs 0.18 otherwise — so fire glows much harder), and TonemapPass applies exposure 1.1 with the ACES curve to the canvas backbuffer.

6. Editing / interaction

The control panel is built procedurally in fluid_test.ts into the #panel div: a Template section (one button per PRESET_ORDER entry), a Resolution row of three buttons, and Trigger / Reset action buttons. refreshButtons() keeps the on/off CSS classes and the mode description in sync.

Emitter. computeEmitter(elapsed) builds the per-frame EmitterState (position, radius, velocity, rate, temperature, shape, active) from the active preset. All grid-space quantities are multiplied by scale (see §7). Behavior by emitterMode:

plume — a continuous source at the box base. For fire it also adds a flicker: two beat-frequency sines pulse the rate and temperature and wander the source position and injected velocity.
burst — a spherical pulse. Every cycleSeconds, emitStart is reset; the emitter is active only for the first 0.13 s of each cycle.
drip — like burst but picks a fresh random (dropX, dropZ) offset each cycle and is active for the first 0.1 s.

T / Trigger sets emitStart = -1e9. Because computeEmitter re-fires when elapsed - emitStart >= cycleSeconds, that huge negative value forces an immediate burst/drip on the next frame.
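The cycle and trigger logic can be sketched as a tiny state machine. The re-fire condition and the -1e9 sentinel are the ones described above; `CycleState`, `updateCycle` and `trigger` are hypothetical names, and the random drip offset is left out.

```typescript
interface CycleState {
  emitStart: number; // time (seconds) at which the current cycle began
}

// Returns whether the emitter is active at `elapsed` seconds, starting a new
// cycle first if the current one has run its full length.
// activeSeconds is 0.13 for burst and 0.1 for drip.
function updateCycle(
  state: CycleState,
  elapsed: number,
  cycleSeconds: number,
  activeSeconds: number,
): boolean {
  if (elapsed - state.emitStart >= cycleSeconds) {
    state.emitStart = elapsed; // re-fire now
  }
  return elapsed - state.emitStart < activeSeconds;
}

// Manual trigger: push the cycle start far into the past so the next
// update sees an expired cycle and re-fires immediately.
function trigger(state: CycleState): void {
  state.emitStart = -1e9;
}
```

Using a sentinel instead of a separate "force" flag keeps the per-frame path to a single comparison.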

R / Reset calls selectPreset(currentKey), which calls sim.reset() (zeros all fields) and forces an immediate emit.

Sliders are multipliers layered on top of the active preset, read fresh each frame in frame():

Slider	Range	Effect
Speed	0–2x	Scales `dt`: `dt = min(deltaTime, 1/30) * speedMul`.
Opacity	0.3–2x	Scales the ray-march `densityScale` (`densityScale * opacityMul`).
Detail	0–2x	Scales the simulation `vorticity` strength.
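The slider math in the table amounts to a few multiplications. In this sketch the dt formula is the one quoted above; `MAX_DT`, `simDt` and `effectiveDensityScale` are hypothetical names.

```typescript
const MAX_DT = 1 / 30; // clamp applied before the speed multiplier

// Simulation time step for one frame.
function simDt(deltaTime: number, speedMul: number): number {
  return Math.min(deltaTime, MAX_DT) * speedMul;
}

// Ray-march density scale after the Opacity slider.
function effectiveDensityScale(presetDensityScale: number, opacityMul: number): number {
  return presetDensityScale * opacityMul;
}
```

Clamping before multiplying means the largest possible step is 2/30 s at Speed 2x, however long the frame took, and Speed 0 freezes the simulation without pausing rendering.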

Resolution buttons call setResolution(i), which sets activeGrid, recomputes scale = activeGrid[0] / GRID[0], calls sim.resize(...) with a new iteration count, and forces an emit. The three resolutions are Low [48,72,48], Medium GRID = [64,96,64], High [96,144,96].

7. Frame flow

frame() in fluid_test.ts, once per requestAnimationFrame:

ctx.update() advances frame timing; on a canvas resize it returns true and cache.trimUnused() is called. The FPS HUD shows ctx.fps and the grid.
cameraController.update(...) then camera.updateRender(ctx); ctx.activeCamera = camera.
Read the three slider multipliers; clamp dt to 1/30 s and scale by speed.
computeEmitter(elapsed) → pack the 28-float simParams → sim.step(simParams). FluidSim.step() writes the uniform, records all six kernels (with the pressure loop) into a FluidSimEncoder compute pass, and submits its own command buffer immediately.
Pack the 44-float renderParams; fluidRenderPass.setVolume(sim.densityView) and updateParams(...). Update bloom and tonemap params.
Build the render graph for the frame:

const graph = new RenderGraph(ctx, cache);
const backbuffer = graph.setBackbuffer('canvas');
// Ray-march the density volume into a transient HDR target.
const volume = fluidRenderPass.addToGraph(graph);
// Bloom the HDR result, then tonemap it onto the canvas backbuffer.
const bloom = bloomPass.addToGraph(graph, { hdr: volume.hdr });
tonemapPass.addToGraph(graph, { hdr: bloom.result, backbuffer });
const compiled = graph.compile();
graphViz.setGraph(graph, compiled); // render-graph visualization overlay (G)
void graph.execute(compiled);

So each frame is two submissions: first the compute solve (outside the graph), then the graph's single command buffer for ray-march → bloom → tonemap. The render graph never sees the simulation; it only consumes the density texture the compute pass already produced.

The scale factor. Because the templates are tuned for the Medium grid, fluid_test.ts multiplies every cell-space quantity by scale = activeGrid[0] / GRID[0] before packing: emitter position/radius/velocity, buoyancy, weight, gravity and vorticity. World-space rendering quantities (BOX_MIN, BOX_SIZE) do not scale — the box is the same size at every resolution; only the cell count (and thus detail and cost) changes.
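A sketch of that rescaling follows. The `scale` formula and the list of scaled quantities come from the paragraph above; the `CellSpaceParams` record and `scaleForGrid` are hypothetical, and the values in the usage note are the Campfire numbers from the preset table.

```typescript
type Vec3 = [number, number, number];

const GRID: Vec3 = [64, 96, 64]; // Medium reference grid

// Hypothetical subset of preset fields that live in cell space.
interface CellSpaceParams {
  emitterRadius: number;
  emitterHeight: number;
  emitterVelY: number;
  buoyancy: number;
  weight: number;
  gravity: number;
  vorticity: number;
}

// Rescale Medium-tuned values for the active grid.
function scaleForGrid(p: CellSpaceParams, activeGrid: Vec3): CellSpaceParams {
  const s = activeGrid[0] / GRID[0];
  return {
    emitterRadius: p.emitterRadius * s,
    emitterHeight: p.emitterHeight * s,
    emitterVelY: p.emitterVelY * s,
    buoyancy: p.buoyancy * s,
    weight: p.weight * s,
    gravity: p.gravity * s,
    vorticity: p.vorticity * s,
  };
}
```

At High (scale 1.5) the Campfire emitter radius of 9 cells becomes 13.5 cells, which covers the same fraction of the box as 9 cells did at Medium.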

8. Notable techniques and gotchas

Compute is not in the render graph. FluidSim records and submits its own command buffer. The graph only handles the screen passes. The density texture crosses that boundary as a plain GPUTextureView passed to FluidRenderPass.setVolume(), relying on WebGPU's automatic ordering between the two submissions on the same queue.
Even pressure iterations. The Jacobi loop ping-pongs _prs[0] ↔ _prs[1]. The iteration count is rounded up to even so the relaxed pressure always ends in _prs[0], which cs_project is hard-wired to read. iterationsFor() scales iterations with grid size so relaxation quality is roughly constant.
Shared density/temperature texture. Packing both into one rgba16float (.x / .y) means advection transports them together with a single sample, and the ray-marcher reads both at once via sampleDensity.
Unfilterable-float pressure. The pressure bind-group layout marks prsIn and divIn as unfilterable-float (tex(4, true) / tex(5, true) in fluid_sim.ts) because r32float is not filterable unless the optional float32-filterable device feature is enabled. Those kernels only ever use integer textureLoad, never the linear sampler.
Open-top boundary. Projection closes the four sides and the floor but leaves the top open, and cs_forces additionally fades density in the top ~6 cells so plumes dissipate gracefully instead of stacking against the ceiling.
Vorticity confinement. Semi-Lagrangian advection is stable but numerically diffusive — it smears small eddies. cs_vorticity + the confinement term in cs_forces re-inject that lost rotational detail; the Detail slider scales its strength, so turning Detail down visibly smooths the motion.
HDR fire ramp drives bloom. fireRamp returns values up to ~3.2, so flame cores exceed 1.0 and the BloomPass (threshold 1.1) picks them up as glow. Bloom strength is raised to 0.55 for fire vs 0.18 for smoke/water.
Dither instead of TAA. A static hash12 jitter on the ray entry point hides the banding from only 64 march steps without needing temporal accumulation.
dt clamping. dt is clamped to 1/30 s before the speed multiplier, so a frame-rate hitch cannot blow up the simulation (semi-Lagrangian advection stays stable, but forces and emission would over-inject on a huge dt).
scale keeps templates resolution-independent. Cell-space forces and emitter parameters are multiplied by scale so a campfire looks like a campfire at Low, Medium or High — just with more or less detail and GPU cost.
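To make the unfilterable-float point concrete, here is a sketch of what a layout entry for such a texture looks like. The entry fields (`binding`, `visibility`, `texture.sampleType`, `texture.viewDimension`) are the standard WebGPU bind-group-layout fields; `texEntry` is a hypothetical stand-in for the sample's `tex(binding, unfilterable)` helper, whose real signature is only inferred from the call sites quoted above.

```typescript
const SHADER_STAGE_COMPUTE = 0x4; // numeric value of GPUShaderStage.COMPUTE

// Hypothetical stand-in for the tex() helper: builds a 3D texture layout entry.
function texEntry(binding: number, unfilterable: boolean) {
  return {
    binding,
    visibility: SHADER_STAGE_COMPUTE,
    texture: {
      // r32float textures must be declared unfilterable-float unless the
      // float32-filterable feature is enabled on the device.
      sampleType: unfilterable ? 'unfilterable-float' : 'float',
      viewDimension: '3d',
    },
  };
}

// Pressure kernel inputs: prsIn at binding 4, divIn at binding 5.
const pressureInputs = [texEntry(4, true), texEntry(5, true)];
```

Declaring the sample type explicitly is what lets pipeline creation validate that the kernel never pairs these textures with a filtering sampler.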