Volumetric Fluid — Technical Deep Dive
The Volumetric Fluid sample is a GPU grid-based (Eulerian) fluid simulator that
renders fire, smoke and water as a participating medium. A 3D "stable fluids"
solver advances velocity / density / temperature fields entirely in compute
shaders, and a fullscreen ray-march composites the density field as volumetric
emission and absorption over a procedural sky and ground. It exercises the
engine's render graph (compute and render passes), the RenderContext device /
timing plumbing, the Camera + CameraController fly camera, and the shared
BloomPass / TonemapPass post-processing passes.
1. Overview
You fly a free camera (WASD + mouse, no pointer lock) around a single simulation box that sits on a checkerboard ground plane. A control panel on the right switches between four templates and three grid resolutions, and a slider bar at the bottom tunes three live multipliers.
The four templates (all the same solver, re-tuned):
| Key | Template | Render mode | Emitter | What you see |
|---|---|---|---|---|
| 1 | Campfire | fire | plume | Continuous hot plume, black-body emission, flicker |
| 2 | Fireball | fire | burst | Spherical bursts re-fired every 3.2 s |
| 3 | Smoke | smoke | plume | Cool buoyant plume, sun-lit and self-shadowed |
| 4 | Splashing Water | water | drip | Drops fall under gravity into a churning pool |
Controls:
- WASD + mouse: fly the camera (CameraController, pointerLock: false).
- 1–4: select template; R: reset the current template; T: trigger (re-fire a burst/drip immediately).
- G: toggle the render-graph visualization overlay.
- Panel buttons mirror the keys, plus Low / Medium / High resolution buttons.
- Sliders: Speed (0–2x), Opacity (0.3–2x), Detail (0–2x).
All panel state (template, resolution, three sliders) is persisted to
localStorage under the key crafty.fluid.settings and restored on reload.
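A minimal sketch of how that persistence could work, assuming the state is stored as a single JSON blob. Only the storage key comes from the text above; the FluidSettings field names, the defaults, and the StorageLike interface are illustrative assumptions rather than the actual fluid_test.ts code.

```typescript
// Hypothetical shape of the persisted panel state (field names are assumptions).
interface FluidSettings {
  template: string;   // e.g. "campfire"
  resolution: number; // index into Low / Medium / High
  speed: number;      // 0..2
  opacity: number;    // 0.3..2
  detail: number;     // 0..2
}

// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const SETTINGS_KEY = "crafty.fluid.settings"; // key named in the text

// Assumed defaults; the real sample may start from different values.
const DEFAULT_SETTINGS: FluidSettings = {
  template: "campfire",
  resolution: 1,
  speed: 1,
  opacity: 1,
  detail: 1,
};

function saveSettings(store: StorageLike, s: FluidSettings): void {
  store.setItem(SETTINGS_KEY, JSON.stringify(s));
}

function loadSettings(store: StorageLike): FluidSettings {
  const raw = store.getItem(SETTINGS_KEY);
  if (raw === null) return { ...DEFAULT_SETTINGS };
  try {
    // Merge over defaults so a stale or partial blob still yields a full object.
    return { ...DEFAULT_SETTINGS, ...(JSON.parse(raw) as Partial<FluidSettings>) };
  } catch {
    return { ...DEFAULT_SETTINGS }; // corrupt JSON: fall back to defaults
  }
}
```

In a browser, window.localStorage satisfies StorageLike and can be passed directly as the store.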
2. Architecture
File map for the sample (samples/fluid_test.html, samples/fluid_test.ts,
and everything under samples/fluid/):
| File | Responsibility |
|---|---|
| fluid_test.html | Page shell: canvas, info HUD, the (empty) #panel populated by JS, the slider bar, and the source_viewer.ts include. |
| fluid_test.ts | App entry. Owns camera, UI, persisted settings, per-frame parameter packing, and wires the compute + render-graph passes each frame. |
| fluid/presets.ts | The four FluidPreset configs, the reference GRID resolution, and the world-space box (BOX_MIN / BOX_SIZE). |
| fluid/fluid_sim.ts | FluidSim class: owns the field 3D textures and six compute pipelines, records the solve into its own command buffer. |
| fluid/fluid_sim.wgsl | The six solver compute kernels (advect, vorticity, forces, divergence, pressure, project). |
| fluid/fluid_render_pass.ts | FluidRenderPass: a render-graph Pass that ray-marches the density field into an HDR target. |
| fluid/fluid_render.wgsl | Vertex (fullscreen triangle) + fragment (volumetric ray-march) shader. |
The compute simulation (FluidSim) is deliberately not a render-graph pass.
It submits its own command buffer first; the render graph then consumes the
density texture FluidSim produced. Only the ray-march, bloom and tonemap go
through the graph.
3. The simulation
3.1 The grid and its fields
The reference grid is GRID = [64, 96, 64] cells (presets.ts) — tall in Y to
give plumes vertical room. It maps to a world-space box BOX_MIN = [-6, 0, -6],
BOX_SIZE = [12, 18, 12], sitting on the ground at y = 0.
FluidSim (fluid/fluid_sim.ts) allocates these 3D textures, each
dimension: '3d' with TEXTURE_BINDING | STORAGE_BINDING usage:
| Field | Format | Count | Holds |
|---|---|---|---|
| _vel | rgba16float | 2 (ping-pong) | velocity xyz |
| _den | rgba16float | 2 (ping-pong) | r = density, g = temperature |
| _prs | r32float | 2 (ping-pong) | pressure |
| _div | r32float | 1 | velocity divergence |
| _vort | rgba16float | 1 | curl (vorticity) of velocity |
Density and temperature share one rgba16float texture (.x / .y), so
advection moves both in a single trilinear sample. Pressure and divergence use
r32float — scalar fields needing the precision for the Jacobi relaxation.
The host owns the ping-pong: _vi and _di index the slot holding the
current velocity / density state. Each step that produces a new field reads
slot i and writes slot i ^ 1, then flips the index. _prs ping-pongs
internally inside the pressure loop. _div and _vort are single-buffered
(written then immediately consumed within the same pass).
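The host-side ping-pong described above can be sketched as a small generic helper. This is an illustration only: the PingPong class and its member names are assumptions, and T stands in for a GPUTexture so the sketch runs without a GPU.

```typescript
// Generic double buffer. `T` stands in for a GPUTexture in the real sample.
class PingPong<T> {
  private slots: [T, T];
  private index: 0 | 1 = 0; // plays the role of _vi / _di

  constructor(a: T, b: T) {
    this.slots = [a, b];
  }

  // Slot holding the current state (read by the next kernel).
  get read(): T { return this.slots[this.index]; }

  // The other slot (written by the next kernel).
  get write(): T { return this.slots[this.index === 0 ? 1 : 0]; }

  // Call after a kernel has written `write`, so it becomes the new state.
  flip(): void { this.index = this.index === 0 ? 1 : 0; }
}

const vel = new PingPong("vel0", "vel1");
const src = vel.read;  // slot i
const dst = vel.write; // slot i ^ 1
vel.flip();            // the written slot is now the current state
```

Keeping the index on the host means the shaders never need to know which physical texture is "current"; only the bind groups change.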
3.2 The solver pipeline
fluid_sim.wgsl is a single module with six @compute entry points, all
@workgroup_size(4, 4, 4) (the constant WG = 4 in fluid_sim.ts; dispatch is
ceilDiv(grid, 4) per axis). The order each step() records is:
advect → vorticity → forces/emit → divergence → pressure (×N) → project
This is a classic Stam "stable fluids" loop. Each kernel is wired by its own
explicit GPUBindGroupLayout (LayoutSet), so the WGSL binding list
(@binding(0)–@binding(11)) is a superset — any one entry point only touches
the subset its layout declares.
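As a quick check of the dispatch arithmetic mentioned above, the sketch below computes per-axis workgroup counts for the three grid resolutions the sample offers. WG and ceilDiv are named in the text, but the function bodies and dispatchSize are assumptions written for illustration.

```typescript
type Vec3 = [number, number, number];

const WG = 4; // workgroup edge length, matching @workgroup_size(4, 4, 4)

// Integer ceiling division for positive integers.
function ceilDiv(n: number, d: number): number {
  return Math.floor((n + d - 1) / d);
}

// Workgroups to dispatch per axis so every cell is covered by a thread.
function dispatchSize(grid: Vec3): Vec3 {
  return [ceilDiv(grid[0], WG), ceilDiv(grid[1], WG), ceilDiv(grid[2], WG)];
}

const low = dispatchSize([48, 72, 48]);    // Low
const medium = dispatchSize([64, 96, 64]); // Medium (GRID)
const high = dispatchSize([96, 144, 96]);  // High
```

All three grids are multiples of 4, so no threads fall outside the domain. A grid that is not a multiple of 4 would round up and need a bounds check in each kernel.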
1. Advection — cs_advect. Semi-Lagrangian: for cell c, trace backward
to the point back = pos - velocity * dt and trilinearly resample both velocity and density
there. The scheme is unconditionally stable for any dt.
let back = pos - loadVel(c) * dt;
let newVel = sampleVelField(back);
let newDen = sampleDenField(back); // r = density, g = temperature
sampleVelField / sampleDenField convert cell-center coordinates to texture
UVW with (p + 0.5) / grid and use a linear-filtering, clamp-to-edge sampler.
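To make the backtrace concrete, here is a CPU reference of the same idea reduced to one dimension. It is a sketch under stated assumptions: sampleLinear and advect1D are hypothetical helpers, coordinates are in cell-center units (so the (p + 0.5) / grid mapping is implicit), and the clamp mimics a clamp-to-edge sampler.

```typescript
// Clamp-to-edge linear sample of a 1D field at cell-center coordinate `x`
// (x = 0 is the center of cell 0).
function sampleLinear(field: Float32Array, x: number): number {
  const n = field.length;
  const xc = Math.min(Math.max(x, 0), n - 1);
  const i0 = Math.floor(xc);
  const i1 = Math.min(i0 + 1, n - 1);
  const t = xc - i0;
  return field[i0] * (1 - t) + field[i1] * t;
}

// One semi-Lagrangian advection step of quantity `q` by velocity `vel`.
function advect1D(q: Float32Array, vel: Float32Array, dt: number): Float32Array {
  const out = new Float32Array(q.length);
  for (let c = 0; c < q.length; c++) {
    const back = c - vel[c] * dt;   // trace backward along the velocity
    out[c] = sampleLinear(q, back); // resample the old field at that point
  }
  return out;
}

const q0 = Float32Array.from([0, 1, 0, 0]);
const v0 = new Float32Array(4).fill(1);
const shifted = advect1D(q0, v0, 1.0); // the bump moves one cell to the right
const blurred = advect1D(q0, v0, 0.5); // a half-cell move spreads it over two cells
```

Because every output value is an interpolation of existing values, the result never exceeds the input range, which is where the stability comes from. The half-cell case also shows the numerical smearing that vorticity confinement later compensates for.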
2. Vorticity — cs_vorticity. Computes the curl of the (advected)
velocity field via central differences of the six neighbors and stores it per
cell into _vort. This is precomputed so the forces pass can apply vorticity
confinement cheaply.
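The central-difference curl can be sketched on the CPU as follows. The curlAt function and the VelFn callback are assumptions for illustration, unit cell spacing is assumed, and boundary clamping is left to the callback.

```typescript
type Vec3 = [number, number, number];
type VelFn = (x: number, y: number, z: number) => Vec3;

// Central-difference curl at integer cell (x, y, z), unit cell spacing.
function curlAt(v: VelFn, x: number, y: number, z: number): Vec3 {
  const r = v(x + 1, y, z), l = v(x - 1, y, z);
  const u = v(x, y + 1, z), d = v(x, y - 1, z);
  const f = v(x, y, z + 1), b = v(x, y, z - 1);
  return [
    0.5 * ((u[2] - d[2]) - (f[1] - b[1])), // dVz/dy - dVy/dz
    0.5 * ((f[0] - b[0]) - (r[2] - l[2])), // dVx/dz - dVz/dx
    0.5 * ((r[1] - l[1]) - (u[0] - d[0])), // dVy/dx - dVx/dy
  ];
}

// Rigid rotation about the z axis: v = (-y, x, 0).
const rotation: VelFn = (x, y, _z) => [-y, x, 0];
const w = curlAt(rotation, 5, 3, 2);
```

For the rigid rotation above the analytic curl is (0, 0, 2), which makes it a convenient sanity check for the sign convention.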
3. Forces & emission — cs_forces. Everything that changes the fields
outside advection and projection:
- Buoyancy: vel.y += (buoyancy * (temperature - ambientTemp) - weight * density) * dt. Hot fluid rises; smoke weight drags its mass back down a little.
- Gravity: vel.y -= gravity * density * dt, which pulls dense fluid down (the water preset's main force).
- Vorticity confinement: if vorticityStrength > 0, takes the gradient of |vort|, normalizes it, and adds eps * cross(N, vort) * dt. This re-injects the small swirls that semi-Lagrangian advection numerically smears away.
- Source injection: when the emitter is active, emitterFalloff(c) gives a soft weight (flat disk for shape=0 plumes, soft sphere for shape=1 bursts/drips). Density is added, temperature is max'd up to the emitter temperature, and velocity is mix'd toward the injected velocity (plus an optional outward radial component for fireballs).
- Cooling / dissipation: temperature decays linearly (-cooling * dt, clamped at 0); density and velocity decay exponentially (density *= exp(-densityDissipation * dt), likewise velocity).
- Top fade: a smoothstep over the top ~6 cells multiplies density down by up to 70%, so fluid softly leaves the open top of the box instead of piling up against the (closed-pressure) ceiling.
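The buoyancy, gravity, and decay terms in the list above can be sketched for a single cell as below. This is not the shader: the Cell and ForceParams types and the order of operations are assumptions, and vorticity confinement, source injection, and the top fade are omitted.

```typescript
interface ForceParams {
  buoyancy: number;
  weight: number;
  gravity: number;
  cooling: number;
  densityDissipation: number;
  velocityDissipation: number;
  ambientTemp: number;
}

interface Cell {
  velY: number;
  density: number;
  temperature: number;
}

// Applies buoyancy, gravity, cooling, and dissipation to one cell for one step.
function applyForces(c: Cell, f: ForceParams, dt: number): Cell {
  let velY = c.velY;
  // Buoyancy: hot fluid rises, dense fluid is dragged back down.
  velY += (f.buoyancy * (c.temperature - f.ambientTemp) - f.weight * c.density) * dt;
  // Gravity acts in proportion to density.
  velY -= f.gravity * c.density * dt;
  // Temperature decays linearly and is clamped at zero.
  const temperature = Math.max(c.temperature - f.cooling * dt, 0);
  // Density and velocity decay exponentially.
  const density = c.density * Math.exp(-f.densityDissipation * dt);
  velY *= Math.exp(-f.velocityDissipation * dt);
  return { velY, density, temperature };
}

// Campfire-like values from the preset table; ambientTemp = 0 is an assumption.
const campfire: ForceParams = {
  buoyancy: 22, weight: 0.9, gravity: 0, cooling: 1.15,
  densityDissipation: 0.38, velocityDissipation: 0.22, ambientTemp: 0,
};
const heated = applyForces({ velY: 0, density: 1, temperature: 1 }, campfire, 0.01);
```

The mix of linear cooling and exponential dissipation means temperature reaches exactly zero in finite time, while density and velocity only approach it.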
4. Divergence — cs_divergence. Central-difference divergence of the
post-force velocity into _div: 0.5 * ((r-l) + (u-d) + (f-b)).
5. Pressure — cs_pressure. One Jacobi relaxation step of the pressure
Poisson equation:
prsOut = (l + r + d + u + b + f - div) / 6.0;
The host runs this pipeline _iterations times in a loop, ping-ponging
_prs[0] ↔ _prs[1] via two pre-built bind groups (pingA / pingB). The
iteration count is forced even (iterations + (iterations & 1)) so the
final relaxed pressure always lands back in _prs[0], which the projection
pass then reads. iterationsFor() in fluid_test.ts scales it with grid size:
round(20 * gridX / GRID[0]), giving 20 at Medium, 15 at Low (16 once rounded up to even) and 30 at High, because
a bigger grid needs more relaxation sweeps to propagate pressure across it.
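The iteration arithmetic can be checked with a few lines. iterationsFor follows the formula quoted above; forceEven and finalSlot are hypothetical helper names, and the sketch assumes the first iteration reads _prs[0] and writes _prs[1].

```typescript
const GRID_X = 64; // GRID[0], the Medium reference width

// Pressure iterations requested for a grid of width gridX.
function iterationsFor(gridX: number): number {
  return Math.round(20 * gridX / GRID_X);
}

// Round an iteration count up to the next even number.
function forceEven(iterations: number): number {
  return iterations + (iterations & 1);
}

// Index of the _prs slot holding the result after n ping-pong iterations,
// assuming iteration 1 reads slot 0 and writes slot 1.
function finalSlot(n: number): 0 | 1 {
  return n % 2 === 0 ? 0 : 1;
}

const counts = [48, 64, 96].map((gridX) => forceEven(iterationsFor(gridX)));
```

With an odd count the result would land in _prs[1], so the even rounding is what lets the projection pass bind a fixed texture.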
6. Projection — cs_project. Subtracts the pressure gradient to make the
velocity field divergence-free:
var vel = loadVel(c) - 0.5 * vec3<f32>(r - l, u - d, f - b);
It then enforces boundary conditions: zero normal velocity on the four side
walls (x and z extremes), and a floor that fluid cannot sink through
(if (c.y == 0) { vel.y = max(vel.y, 0.0); }). The top is left open. Reads of
out-of-domain cells everywhere in the shader go through clampCoord, giving
Neumann-style "no flux through the wall" sampling.
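A CPU reference of one Jacobi step with clamped neighbor reads, useful for checking the stencil by hand. It is a sketch: the flat x-major array layout, the load helper, and the body of clampCoord are assumptions; only the update formula is taken from the text.

```typescript
type Vec3 = [number, number, number];

// Clamp a cell coordinate into the grid, so out-of-domain reads return the edge cell.
function clampCoord(c: Vec3, grid: Vec3): Vec3 {
  return [
    Math.min(Math.max(c[0], 0), grid[0] - 1),
    Math.min(Math.max(c[1], 0), grid[1] - 1),
    Math.min(Math.max(c[2], 0), grid[2] - 1),
  ];
}

// Read a scalar field stored as a flat array, x fastest, with clamped coordinates.
function load(field: Float32Array, c: Vec3, grid: Vec3): number {
  const [x, y, z] = clampCoord(c, grid);
  return field[x + grid[0] * (y + grid[1] * z)];
}

// One Jacobi relaxation sweep: reads `prs`, returns the new pressure field.
function jacobiStep(prs: Float32Array, div: Float32Array, grid: Vec3): Float32Array {
  const out = new Float32Array(prs.length);
  for (let z = 0; z < grid[2]; z++) {
    for (let y = 0; y < grid[1]; y++) {
      for (let x = 0; x < grid[0]; x++) {
        const l = load(prs, [x - 1, y, z], grid), r = load(prs, [x + 1, y, z], grid);
        const d = load(prs, [x, y - 1, z], grid), u = load(prs, [x, y + 1, z], grid);
        const b = load(prs, [x, y, z - 1], grid), f = load(prs, [x, y, z + 1], grid);
        const dv = load(div, [x, y, z], grid);
        out[x + grid[0] * (y + grid[1] * z)] = (l + r + d + u + b + f - dv) / 6.0;
      }
    }
  }
  return out;
}

// 3x3x3 grid with a single divergent cell in the middle.
const tiny: Vec3 = [3, 3, 3];
const divField = new Float32Array(27);
divField[13] = 6; // cell (1, 1, 1)
const p1 = jacobiStep(new Float32Array(27), divField, tiny);
const p2 = jacobiStep(p1, divField, tiny);
```

After one sweep only the divergent cell has non-zero pressure; the second sweep spreads it to the six neighbors, which is why larger grids need more sweeps.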
3.3 The SimParams uniform
fluid_test.ts packs SIM_PARAM_FLOATS = 28 floats (7 × vec4, 112 bytes)
each frame and FluidSim.step() uploads them. The WGSL SimParams struct maps
them as:
| vec4 | Contents |
|---|---|
| grid | grid x, y, z, _ |
| timing | dt, time, vorticityStrength, _ |
| emitterPos | emitter center xyz (grid space), radius |
| emitterVel | injected velocity xyz, radial burst speed |
| emitter | rate, temperature, shape (0=plume/1=sphere), active |
| forces | buoyancy, weight, gravity, cooling |
| dissipation | densityDissipation, velocityDissipation, ambientTemp, _ |
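The table maps directly onto a Float32Array. The packing below is a sketch that assumes the floats appear in exactly the table's order, with padding lanes left at zero; the SimInputs interface and packSimParams are hypothetical names, not the actual fluid_test.ts code.

```typescript
type Vec3 = [number, number, number];

const SIM_PARAM_FLOATS = 28; // 7 x vec4 = 112 bytes

// Hypothetical grouping of the per-frame inputs.
interface SimInputs {
  grid: Vec3;
  dt: number;
  time: number;
  vorticityStrength: number;
  emitterPos: Vec3;
  emitterRadius: number;
  emitterVel: Vec3;
  radialSpeed: number;
  rate: number;
  temperature: number;
  shape: 0 | 1; // 0 = plume, 1 = sphere
  active: boolean;
  buoyancy: number;
  weight: number;
  gravity: number;
  cooling: number;
  densityDissipation: number;
  velocityDissipation: number;
  ambientTemp: number;
}

// Packs one vec4 per table row, in table order.
function packSimParams(p: SimInputs): Float32Array {
  const f = new Float32Array(SIM_PARAM_FLOATS);
  f.set([p.grid[0], p.grid[1], p.grid[2], 0], 0);                               // grid
  f.set([p.dt, p.time, p.vorticityStrength, 0], 4);                             // timing
  f.set([...p.emitterPos, p.emitterRadius], 8);                                 // emitterPos
  f.set([...p.emitterVel, p.radialSpeed], 12);                                  // emitterVel
  f.set([p.rate, p.temperature, p.shape, p.active ? 1 : 0], 16);                // emitter
  f.set([p.buoyancy, p.weight, p.gravity, p.cooling], 20);                      // forces
  f.set([p.densityDissipation, p.velocityDissipation, p.ambientTemp, 0], 24);   // dissipation
  return f;
}
```

Grouping every field into a vec4 sidesteps WGSL uniform alignment rules, since each row starts on a 16-byte boundary.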
3.4 Reset and resize
reset() destroys and re-allocates every field texture (all zeroed by
WebGPU) — used on template switch and R. resize() additionally updates
gridX/Y/Z, the dispatch dimensions, and the pressure iteration count, so the
grid can change resolution at runtime.
4. Presets
presets.ts defines four FluidPreset objects. Each is the same solver tuned
differently, plus a renderMode for the ray-marcher. Key differentiators:
| Param | Campfire | Fireball | Smoke | Water |
|---|---|---|---|---|
| renderMode | fire | fire | smoke | water |
| emitterMode | plume | burst | plume | drip |
| buoyancy | 22 | 14 | 10 | 0 |
| weight | 0.9 | 0.6 | 0.4 | 0 |
| gravity | 0 | 0 | 0 | 30 |
| cooling | 1.15 | 0.9 | 0.6 | 0 |
| densityDissipation | 0.38 | 0.5 | 0.12 | 0.32 |
| velocityDissipation | 0.22 | 0.5 | 0.2 | 0.12 |
| vorticity | 9 | 13 | 6 | 3 |
| emitterRadius | 9 | 8 | 6 | 6 |
| emitterHeight | 9 | 36 | 8 | 80 |
| emitterRate | 2.7 | 16 | 1.9 | 24 |
| emitterTemp | 1.75 | 2.0 | 0.9 | 0 |
| emitterVelY | 16 | 7 | 9 | −22 |
| emitterRadialSpeed | 0 | 27 | 0 | 0 |
| cycleSeconds | 0 | 3.2 | 0 | 0.85 |
| densityScale | 9 | 8 | 11 | 16 |
| emissiveStrength | 2.7 | 3.1 | 0 | 0 |
| tint | warm gray | dark gray | light gray | blue |
Reading the table:
- Fire presets have high buoyancy and cooling, non-zero emissiveStrength, and the ray-marcher emits via the black-body fireRamp. Campfire is a steady mid-height plume; Fireball is a high (emitterHeight 36) spherical burst with strong outward emitterRadialSpeed and a short cycleSeconds re-fire.
- Smoke has low dissipation (long-lived density, 0.12), no emission, a light-gray albedo, and relies on lightMarch self-shadowing for shape.
- Water has zero buoyancy/cooling and a large gravity of 30; the emitter drips from emitterHeight 80 with a negative emitterVelY (−22, downward), and the blue tint plus high densityScale make a dense glossy medium.
GRID, BOX_MIN, BOX_SIZE are also exported here so both the sim and the
renderer agree on the domain.
5. Rendering
5.1 The pass
FluidRenderPass (fluid/fluid_render_pass.ts) is a render-graph
Pass<void, FluidRenderOutputs>. Its create() builds a render pipeline with a
fullscreen-triangle vertex shader, a 256-byte uniform buffer, a linear sampler,
and a bind-group layout of { uniform, texture_3d<float>, sampler }. The volume
texture is not a graph resource: it is owned by FluidSim, handed to the pass
via setVolume(sim.densityView), and bound directly in the execute callback.
addToGraph creates a transient FluidHDR texture (HDR_FORMAT =
rgba16float, full canvas size), declares it as a cleared attachment, and in
the execute callback binds the uniform + volume + sampler and issues
enc.draw(3) — one fullscreen triangle.
5.2 The ray-march
fluid_render.wgsl's fs_main does the volumetric integration:
- Ray reconstruction. The fullscreen triangle's NDC is unprojected with invViewProj at depths 0 and 1; ro is the camera position, rd the normalized world-space ray.
- Background. skyColor(rd) is a vertical gradient with a sharp sun disc (pow(dot, 250)) and soft halo (pow(dot, 6)). If the ray points down, it intersects the y = 0 plane and shades groundColor, a checkerboard with distance fade and a Lambert term from the sun. bgDist records that hit.
- Box intersection. intersectBox is a slab test returning (tNear, tFar). If the ray misses, or the box is fully behind the ground, it returns just the background.
- March. STEP_COUNT = 64 primary steps between tEnter and tExit = min(boxFar, bgDist), step size dh. The entry point is jittered by a static hash12(frag.pos.xy) dither to hide slice banding without TAA. The loop accumulates front-to-back and early-outs when transmittance < 0.01.
- Per-sample shading. Density field.x becomes extinction sigma = density * densityScale. The shading branches on renderMode:
  - Fire (mode 0): radiance = fireRamp(temperature) * emissiveStrength * density + scatter * sigma, where fireRamp is a 5-stop HDR black-body ramp (values up to ~3.2, so the hot core blows out past 1.0 for the bloom pass). scatter adds a little ambient + sun-lit smoke around the flame.
  - Smoke (mode 1): a light-scattering medium, radiance = (ambient + sunColor * shadow) * tint * sigma, where shadow comes from lightMarch.
  - Water (mode 2): the same scattering term, plus when the march first crosses a dense iso-surface (density > 0.4) it shades a one-off glossy splash highlight: a gradient-of-density surface normal feeds a Blinn-style spec (pow(dot(n,h), 56)) and a Fresnel-weighted sky reflection.
- Self-shadowing (lightMarch). From each marched point it casts a short secondary ray of LIGHT_STEPS = 4 samples toward the sun, accumulates density, and returns exp(-sum * dh * densityScale * shadowDensity), the fraction of light that survives to the sample. This gives smoke and water volumetric self-shadowing.
- Composite. accum + bg * transmittance: the integrated volume over the remaining background visibility.
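The box intersection and front-to-back accumulation can be sketched on the CPU for a uniform medium. This is an illustration under assumptions: both function bodies are written from the description above rather than copied from fluid_render.wgsl, the per-step source term is assumed to be integrated as source * dh, and the dither jitter and per-mode shading are omitted.

```typescript
type Vec3 = [number, number, number];

// Slab test against an axis-aligned box. Returns [tNear, tFar]; the ray
// misses when tNear > tFar, and the box is behind the origin when tFar < 0.
// Edge case not handled: an axis-parallel ray starting exactly on a slab plane.
function intersectBox(ro: Vec3, rd: Vec3, boxMin: Vec3, boxSize: Vec3): [number, number] {
  let tNear = -Infinity;
  let tFar = Infinity;
  for (let a = 0; a < 3; a++) {
    const inv = 1 / rd[a]; // +/-Infinity for axis-parallel rays
    const t0 = (boxMin[a] - ro[a]) * inv;
    const t1 = (boxMin[a] + boxSize[a] - ro[a]) * inv;
    tNear = Math.max(tNear, Math.min(t0, t1));
    tFar = Math.min(tFar, Math.max(t0, t1));
  }
  return [tNear, tFar];
}

// Front-to-back march through a medium with constant extinction `sigma` and
// constant per-unit-length source radiance `source`, composited over `bg`.
function marchUniform(
  sigma: number, source: number, tEnter: number, tExit: number, bg: number, steps: number = 64,
) {
  const dh = (tExit - tEnter) / steps;
  let transmittance = 1;
  let accum = 0;
  for (let i = 0; i < steps; i++) {
    accum += transmittance * source * dh;   // light added by this slice, attenuated so far
    transmittance *= Math.exp(-sigma * dh); // Beer-Lambert absorption over the slice
    if (transmittance < 0.01) break;        // early-out: nothing behind is visible
  }
  return { accum, transmittance, color: accum + bg * transmittance };
}

// Ray along +z through the sample's box (BOX_MIN / BOX_SIZE from presets.ts).
const [tIn, tOut] = intersectBox([0, 9, -20], [0, 0, 1], [-6, 0, -6], [12, 18, 12]);
const thin = marchUniform(0.1, 0, tIn, tOut, 1);
```

For a non-emitting uniform medium the result reduces to the Beer-Lambert law, transmittance = exp(-sigma * pathLength), which gives an easy analytic check.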
5.3 The RenderParams uniform
fluid_test.ts packs RENDER_PARAM_FLOATS = 44 floats (a mat4 + 7 × vec4,
176 bytes; the buffer itself is 256 bytes). The mat4 is the camera's inverse
view-projection; the trailing vec4s carry camPos/time, boxMin/renderMode,
boxSize/STEP_COUNT, sunDir/densityScale, sunColor/LIGHT_STEPS,
tint/emissiveStrength, and misc (SHADOW_DENSITY, AMBIENT). Fixed scene
constants in fluid_test.ts: SUN_DIR = [0.413, 0.731, 0.543],
SUN_COLOR = [1.15, 1.05, 0.9], AMBIENT = 0.09, SHADOW_DENSITY = 1.0.
5.4 Post-processing
After FluidRenderPass produces the HDR target, the shared BloomPass blooms
it (threshold 1.1, knee 0.6, strength 0.55 for fire vs 0.18 otherwise — so fire
glows much harder), and TonemapPass applies exposure 1.1 with the ACES curve
to the canvas backbuffer.
6. Editing / interaction
The control panel is built procedurally in fluid_test.ts into the #panel
div: a Template section (one button per PRESET_ORDER entry), a Resolution row
of three buttons, and Trigger / Reset action buttons. refreshButtons() keeps
the on/off CSS classes and the mode description in sync.
Emitter. computeEmitter(elapsed) builds the per-frame EmitterState
(position, radius, velocity, rate, temperature, shape, active) from the active
preset. All grid-space quantities are multiplied by scale (see §7). Behavior
by emitterMode:
- plume: a continuous source at the box base. For fire it also adds a flicker: two beat-frequency sines pulse the rate and temperature and wander the source position and injected velocity.
- burst: a spherical pulse. Every cycleSeconds, emitStart is reset; the emitter is active only for the first 0.13 s of each cycle.
- drip: like burst but picks a fresh random (dropX, dropZ) offset each cycle and is active for the first 0.1 s.
T / Trigger sets emitStart = -1e9. Because computeEmitter re-fires when
elapsed - emitStart >= cycleSeconds, that huge negative value forces an
immediate burst/drip on the next frame.
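The cycle and trigger logic can be sketched as a small timer. The re-fire condition and the -1e9 trigger value come from the text above; the CycleEmitter class, its member names, and starting emitStart at -1e9 are assumptions for illustration.

```typescript
// Tracks burst/drip cycles: re-fires every `cycleSeconds` and reports the
// emitter as active for the first `activeSeconds` of each cycle.
class CycleEmitter {
  private emitStart = -1e9; // far in the past, so the first update fires at once
  private cycleSeconds: number;
  private activeSeconds: number;

  constructor(cycleSeconds: number, activeSeconds: number) {
    this.cycleSeconds = cycleSeconds;
    this.activeSeconds = activeSeconds;
  }

  // Returns whether the emitter is active at time `elapsed` (seconds).
  update(elapsed: number): boolean {
    if (elapsed - this.emitStart >= this.cycleSeconds) {
      this.emitStart = elapsed; // start a new cycle
    }
    return elapsed - this.emitStart < this.activeSeconds;
  }

  // Equivalent of T / Trigger: force a re-fire on the next update.
  trigger(): void {
    this.emitStart = -1e9;
  }
}

// Fireball timing: 3.2 s cycle, active for the first 0.13 s.
const fireball = new CycleEmitter(3.2, 0.13);
const firstFrame = fireball.update(0);  // new cycle starts, emitter active
const later = fireball.update(0.2);     // past the active window
```

Using a large negative timestamp instead of a separate "force" flag keeps the re-fire test to a single comparison.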
R / Reset calls selectPreset(currentKey), which calls sim.reset() (zeros
all fields) and forces an immediate emit.
Sliders are multipliers layered on top of the active preset, read fresh each
frame in frame():
| Slider | Range | Effect |
|---|---|---|
| Speed | 0–2x | Scales dt: dt = min(deltaTime, 1/30) * speedMul. |
| Opacity | 0.3–2x | Scales the ray-march densityScale (densityScale * opacityMul). |
| Detail | 0–2x | Scales the simulation vorticity strength. |
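The three multipliers in the table combine with the preset as in the sketch below. The dt formula is quoted from the table; the Sliders type, the frameParams helper, and the defensive clamping of slider values to their ranges are assumptions.

```typescript
interface Sliders {
  speed: number;   // 0..2
  opacity: number; // 0.3..2
  detail: number;  // 0..2
}

const MAX_DT = 1 / 30; // dt clamp, in seconds

function clamp(v: number, lo: number, hi: number): number {
  return Math.min(Math.max(v, lo), hi);
}

// Derives the per-frame values from the frame time, preset values, and sliders.
function frameParams(
  deltaTime: number, presetDensityScale: number, presetVorticity: number, s: Sliders,
) {
  const speed = clamp(s.speed, 0, 2);
  const opacity = clamp(s.opacity, 0.3, 2);
  const detail = clamp(s.detail, 0, 2);
  return {
    dt: Math.min(deltaTime, MAX_DT) * speed,       // simulation time step
    densityScale: presetDensityScale * opacity,    // ray-march extinction scale
    vorticity: presetVorticity * detail,           // confinement strength
  };
}

// A half-second hitch with neutral sliders still steps the sim by only 1/30 s.
const hitch = frameParams(0.5, 9, 9, { speed: 1, opacity: 1, detail: 1 });
```

Because the clamp is applied before the speed multiplier, Speed at 2x can still produce a step of up to 1/15 s.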
Resolution buttons call setResolution(i), which sets activeGrid,
recomputes scale = activeGrid[0] / GRID[0], calls sim.resize(...) with a new
iteration count, and forces an emit. The three resolutions are Low [48,72,48],
Medium GRID = [64,96,64], High [96,144,96].
7. Frame flow
frame() in fluid_test.ts, once per requestAnimationFrame:
- ctx.update() advances frame timing; on a canvas resize it returns true and cache.trimUnused() is called. The FPS HUD shows ctx.fps and the grid.
- cameraController.update(...) then camera.updateRender(ctx); ctx.activeCamera = camera.
- Read the three slider multipliers; clamp dt to 1/30 s and scale by speed.
- computeEmitter(elapsed) → pack the 28-float simParams → sim.step(simParams).
- FluidSim.step() writes the uniform, records all six kernels (with the pressure loop) into a FluidSimEncoder compute pass, and submits its own command buffer immediately.
- Pack the 44-float renderParams; fluidRenderPass.setVolume(sim.densityView) and updateParams(...). Update bloom and tonemap params.
- Build the render graph for the frame:
const graph = new RenderGraph(ctx, cache);
const backbuffer = graph.setBackbuffer('canvas');
const volume = fluidRenderPass.addToGraph(graph);
const bloom = bloomPass.addToGraph(graph, { hdr: volume.hdr });
tonemapPass.addToGraph(graph, { hdr: bloom.result, backbuffer });
const compiled = graph.compile();
graphViz.setGraph(graph, compiled);
void graph.execute(compiled);
So each frame is two submissions: first the compute solve (outside the graph), then the graph's single command buffer for ray-march → bloom → tonemap. The render graph never sees the simulation; it only consumes the density texture the compute pass already produced.
The scale factor. Because the templates are tuned for the Medium grid,
fluid_test.ts multiplies every cell-space quantity by scale = activeGrid[0] / GRID[0] before packing: emitter position/radius/velocity, buoyancy, weight,
gravity and vorticity. World-space rendering quantities (BOX_MIN, BOX_SIZE)
do not scale — the box is the same size at every resolution; only the cell
count (and thus detail and cost) changes.
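A sketch of that scaling, using the Campfire values from the preset table. The CellSpaceParams grouping and the scaleCellSpace helper are hypothetical; only the formula scale = activeGrid[0] / GRID[0] and the list of scaled quantities come from the text.

```typescript
type Vec3 = [number, number, number];

const GRID: Vec3 = [64, 96, 64]; // Medium reference grid

// Hypothetical grouping of the quantities that live in cell space.
interface CellSpaceParams {
  emitterRadius: number;
  emitterHeight: number;
  emitterVelY: number;
  buoyancy: number;
  weight: number;
  gravity: number;
  vorticity: number;
}

// Rescales Medium-tuned values to the active grid resolution.
function scaleCellSpace(p: CellSpaceParams, activeGrid: Vec3): CellSpaceParams {
  const scale = activeGrid[0] / GRID[0];
  return {
    emitterRadius: p.emitterRadius * scale,
    emitterHeight: p.emitterHeight * scale,
    emitterVelY: p.emitterVelY * scale,
    buoyancy: p.buoyancy * scale,
    weight: p.weight * scale,
    gravity: p.gravity * scale,
    vorticity: p.vorticity * scale,
  };
}

const campfireMedium: CellSpaceParams = {
  emitterRadius: 9, emitterHeight: 9, emitterVelY: 16,
  buoyancy: 22, weight: 0.9, gravity: 0, vorticity: 9,
};
const campfireHigh = scaleCellSpace(campfireMedium, [96, 144, 96]); // scale = 1.5
const campfireLow = scaleCellSpace(campfireMedium, [48, 72, 48]);   // scale = 0.75
```

At High the emitter covers 13.5 cells instead of 9, which is the same world-space footprint because each cell is two-thirds the size.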
8. Notable techniques and gotchas
- Compute is not in the render graph. FluidSim records and submits its own command buffer. The graph only handles the screen passes. The density texture crosses that boundary as a plain GPUTextureView passed to FluidRenderPass.setVolume(), relying on WebGPU's automatic ordering between the two submissions on the same queue.
- Even pressure iterations. The Jacobi loop ping-pongs _prs[0] ↔ _prs[1]. The iteration count is rounded up to even so the relaxed pressure always ends in _prs[0], which cs_project is hard-wired to read. iterationsFor() scales iterations with grid size so relaxation quality is roughly constant.
- Shared density/temperature texture. Packing both into one rgba16float (.x / .y) means advection transports them together with a single sample, and the ray-marcher reads both at once via sampleDensity.
- Unfilterable-float pressure. The pressure bind-group layout marks prsIn and divIn as unfilterable-float (tex(4, true) / tex(5, true) in fluid_sim.ts) because r32float is not filterable unless the optional float32-filterable device feature is enabled. Those kernels only ever use integer textureLoad, never the linear sampler.
- Open-top boundary. Projection closes the four sides and the floor but leaves the top open, and cs_forces additionally fades density in the top ~6 cells so plumes dissipate gracefully instead of stacking against the ceiling.
- Vorticity confinement. Semi-Lagrangian advection is stable but numerically diffusive: it smears small eddies. cs_vorticity plus the confinement term in cs_forces re-inject that lost rotational detail; the Detail slider scales its strength, so turning Detail down visibly smooths the motion.
- HDR fire ramp drives bloom. fireRamp returns values up to ~3.2, so flame cores exceed 1.0 and the BloomPass (threshold 1.1) picks them up as glow. Bloom strength is raised to 0.55 for fire vs 0.18 for smoke/water.
- Dither instead of TAA. A static hash12 jitter on the ray entry point hides the banding from only 64 march steps without needing temporal accumulation.
- dt clamping. dt is clamped to 1/30 s before the speed multiplier, so a frame-rate hitch cannot blow up the simulation (semi-Lagrangian advection stays stable, but forces and emission would over-inject on a huge dt).
- scale keeps templates resolution-independent. Cell-space forces and emitter parameters are multiplied by scale so a campfire looks like a campfire at Low, Medium or High, just with more or less detail and GPU cost.