Taos Engine ▦ Taos: Building a Modern WebGPU Game Engine

The Legend of Taos — Technical Deep Dive

A top-down action-adventure in the spirit of the original Zelda, built entirely on the Taos engine's render graph — using a single custom 2D pass.


1. The central idea: a 2D game through a 3D engine's render graph

Taos is a deferred + HDR 3D WebGPU engine: G-buffers, cascaded shadows, PBR materials, the works. None of that is useful for a flat pixel-art game. But the engine's render graph is — it gives us virtual resources, pooled physical GPU objects, automatic barriers, and a clean per-frame execute hook.

So this sample keeps the engine and its render graph, but the graph contains exactly one pass: a 2D sprite batcher that clears the backbuffer and blits a stream of textured quads. Everything you see — tiles, the hero, enemies, the boss, items, the HUD — is one of those quads, sampled from a single procedurally-generated atlas texture.

The result is the smallest possible "2D mode" for a 3D engine: no materials, no lighting, no depth buffer, no scene graph traversal of meshes. Just clear → batch → draw.

Engine frame
 ├─ beforeFrame ……… game.update(dt)          // simulate
 ├─ feature.addPasses … game.draw(batch)       // queue quads + add the 1 pass
 ├─ graph.compile()
 └─ graph.execute() …… SpriteBatchPass         // upload vertex buffer, 1 draw call

2. File map

File                           Responsibility
taos_quest.ts                  Entry point: boots the engine, wires the feature, the sim loop, the DOM overlay, and a dummy camera.
taos_quest/sprite_renderer.ts  The 2D renderer: SpriteRenderer (quad accumulator), SpriteBatchPass (the render-graph pass), SpriteFeature (the RenderFeature that owns them).
taos_quest/atlas.ts            Procedural pixel-art atlas: palette, char-grid templates, procedural tiles + Zia sun, shelf packer, texture upload.
taos_quest/rooms.ts            Tile semantics, the room builder, and the overworld + dungeon map.
taos_quest/game.ts             All gameplay: player, combat, enemies, boss, projectiles, pickups, puzzles, doors, screen transitions, HUD.
taos_quest/input.ts            Unified keyboard / touch / gamepad input.
taos_quest/touch_controls.ts   Lazily-built on-screen d-pad + buttons.

3. The sprite-batch renderer

3.1 The pass

SpriteBatchPass extends Pass<{ backbuffer }, void>. Its addToGraph declares a single write to the backbuffer as a cleared color attachment, then registers an execute callback:

graph.addPass(this.name, 'render', (b) => {
  b.write(deps.backbuffer, 'attachment', {
    loadOp: 'clear', storeOp: 'store', clearValue: [0, 0, 0, 1],
  });
  b.setExecute((pctx) => {
    // upload the CPU vertex staging, then one draw of quadCount * 6 verts
    this._device.queue.writeBuffer(this._vbuf, 0, data.buffer, 0, bytes);
    const enc = pctx.renderPassEncoder!;
    enc.setPipeline(this._pipeline);
    enc.setBindGroup(0, this._bindGroup);
    enc.setVertexBuffer(0, this._vbuf, 0, bytes);
    enc.draw(quads * 6);
  });
});

The clear (black) doubles as the letterbox fill — anything outside the scaled virtual screen stays black. The graph builds the GPURenderPassDescriptor from the attachment declaration; the pass never touches it directly.

3.2 The virtual screen and CPU-side NDC transform

The game thinks in a fixed 256×240 "virtual screen" (NES-ish), regardless of window size. On each begin(), the renderer computes an integer scale that fits that into the device-pixel canvas and centers it:

const scale = Math.max(1, Math.floor(Math.min(canvasW / 256, canvasH / 240)));
this._offX = Math.floor((canvasW - 256 * scale) * 0.5);
this._offY = Math.floor((canvasH - 240 * scale) * 0.5);

Because the batch is rebuilt from scratch every frame, there is no transform uniform and no matrix in the shader. Each quad is transformed virtual-px → device-px → NDC directly in JS as it is pushed:

const px0 = x * scale + offX;          // virtual → device px
const py0 = y * scale + offY;
const nx0 = (px0 / canvasW) * 2 - 1;   // device px → NDC
const ny0 = 1 - (py0 / canvasH) * 2;   // (y flipped: virtual y is down)

The vertex shader is therefore a pass-through; the fragment shader samples the atlas (nearest filtering, set on the sampler — this is what keeps the pixels crisp) and multiplies by a per-vertex color used for tinting (hit-flash, invuln-flicker) and solid rects (HUD panel, screen flash):

@fragment fn fs(in: VSOut) -> @location(0) vec4<f32> {
  let c = textureSample(atlasTex, atlasSamp, in.uv) * in.color;
  if (c.a < 0.01) { discard; }
  return c;
}
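
The integer-scale fit and the CPU-side NDC transform from this section can be combined into one small, testable sketch. The names fitVirtualScreen, virtualToNdc and Letterbox are illustrative assumptions, not the engine's API; only the arithmetic comes from the snippets above.

```typescript
// Assumed constants: the fixed 256x240 virtual screen described above.
const VIRTUAL_W = 256;
const VIRTUAL_H = 240;

interface Letterbox { scale: number; offX: number; offY: number; }

// Largest integer scale that fits the virtual screen, centered in the canvas.
function fitVirtualScreen(canvasW: number, canvasH: number): Letterbox {
  const scale = Math.max(1, Math.floor(Math.min(canvasW / VIRTUAL_W, canvasH / VIRTUAL_H)));
  return {
    scale,
    offX: Math.floor((canvasW - VIRTUAL_W * scale) * 0.5),
    offY: Math.floor((canvasH - VIRTUAL_H * scale) * 0.5),
  };
}

// Virtual px -> device px -> NDC. The y axis is flipped because virtual y grows downward.
function virtualToNdc(
  x: number, y: number, lb: Letterbox, canvasW: number, canvasH: number,
): [number, number] {
  const px = x * lb.scale + lb.offX;
  const py = y * lb.scale + lb.offY;
  return [(px / canvasW) * 2 - 1, 1 - (py / canvasH) * 2];
}
```

On a 1920×1080 canvas this gives scale 4 with offsets (448, 60), and the virtual center (128, 120) lands at NDC (0, 0).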

3.3 Batching

Vertices are 8 floats (pos.xy, uv.xy, rgba), 6 per quad (two triangles, no index buffer). They accumulate in a growable Float32Array; the GPU vertex buffer is created once and only reallocated if a frame exceeds its capacity. A whole frame — tiles, sprites, HUD — is a single draw call. Standard alpha blending is enabled so transparent sprite pixels composite over the tiles.
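
A minimal sketch of such an accumulator, assuming positions arrive already in NDC. The class name QuadBatchSketch, its fields, and the initial capacity of 64 quads are hypothetical; only the layout (8 floats per vertex, 6 vertices per quad, growable staging array) is taken from the description above.

```typescript
const FLOATS_PER_VERTEX = 8; // pos.xy, uv.xy, rgba
const VERTS_PER_QUAD = 6;    // two triangles, no index buffer

class QuadBatchSketch {
  data = new Float32Array(64 * VERTS_PER_QUAD * FLOATS_PER_VERTEX);
  quadCount = 0;

  // Reset the write cursor; the staging array itself is reused every frame.
  begin(): void { this.quadCount = 0; }

  // (x0, y0) is the top-left corner and (x1, y1) the bottom-right, both in NDC.
  pushQuad(
    x0: number, y0: number, x1: number, y1: number,
    u0: number, v0: number, u1: number, v1: number,
    rgba: [number, number, number, number],
  ): void {
    const floatsPerQuad = VERTS_PER_QUAD * FLOATS_PER_VERTEX;
    const needed = (this.quadCount + 1) * floatsPerQuad;
    if (needed > this.data.length) {
      // Grow geometrically so reallocation is rare.
      const grown = new Float32Array(Math.max(needed, this.data.length * 2));
      grown.set(this.data);
      this.data = grown;
    }
    // Triangles (TL, TR, BL) and (BL, TR, BR).
    const corners = [
      [x0, y0, u0, v0], [x1, y0, u1, v0], [x0, y1, u0, v1],
      [x0, y1, u0, v1], [x1, y0, u1, v0], [x1, y1, u1, v1],
    ];
    let o = this.quadCount * floatsPerQuad;
    for (const [x, y, u, v] of corners) {
      this.data.set([x, y, u, v, rgba[0], rgba[1], rgba[2], rgba[3]], o);
      o += FLOATS_PER_VERTEX;
    }
    this.quadCount++;
  }

  get vertexCount(): number { return this.quadCount * VERTS_PER_QUAD; }
  get byteLength(): number { return this.vertexCount * FLOATS_PER_VERTEX * 4; }
}
```

The temporary arrays in pushQuad keep the sketch short; a hot-path version would presumably write the floats directly into the staging array to avoid per-quad allocations.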

3.4 The feature

SpriteFeature is the RenderFeature. Its async setup() builds the atlas (the engine awaits feature setup before the first frame). Its addPasses(frame) runs once per graph build — i.e. every frame — and does three things in order:

this.renderer.begin(ctx.width, ctx.height); // reset batch + compute scale
this._draw(this.renderer);                  // the game queues all its quads
this._pass.addToGraph(frame.graph, { backbuffer: frame.backbuffer });

this._draw is just game.draw. Since addPasses runs after beforeFrame (where the game simulates), the quads always reflect the current frame's state.


4. The procedural atlas#

No external images. Everything is drawn at runtime into one 256×256 RGBA8 buffer and uploaded with queue.writeTexture. Two techniques:

Char-grid templates for things that need a deliberate silhouette: the hero (4 facings × 2 walk frames), sword, enemies, items, and a digit font. Each pixel is one character indexed into a shared PAL palette; a '.' or a space is transparent:

tmpl('hero_down_a', [
  '......XXXX......',
  '.....XggggX.....',   // green cap
  '....XgSSSSgX....',   // face
  '....XSXSSXSX....',   // eyes
  '...XSGgggGSX....',   // arms + tunic
  // …
]);
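
As a rough illustration of how such a template could be turned into pixels, the sketch below rasterizes rows of palette characters into an RGBA8 buffer. The palette entries and the names PAL_SKETCH and rasterizeTemplate are assumptions for the example; the real PAL table and tmpl implementation are not shown in this article.

```typescript
// Hypothetical palette: one character maps to one RGBA color.
const PAL_SKETCH: Record<string, [number, number, number, number]> = {
  X: [0, 0, 0, 255],       // outline
  g: [60, 160, 60, 255],   // green
  S: [240, 200, 160, 255], // skin
};

// Rasterize rows of palette characters into a tightly packed RGBA8 buffer.
function rasterizeTemplate(rows: string[]): { w: number; h: number; pixels: Uint8Array } {
  const h = rows.length;
  const w = rows.reduce((m, r) => Math.max(m, r.length), 0);
  const pixels = new Uint8Array(w * h * 4); // zero-filled, so every pixel starts transparent
  rows.forEach((row, y) => {
    for (let x = 0; x < row.length; x++) {
      const ch = row.charAt(x);
      if (ch === '.' || ch === ' ') continue; // transparent markers
      const rgba = PAL_SKETCH[ch];
      if (!rgba) throw new Error(`unknown palette char '${ch}' at ${x},${y}`);
      pixels.set(rgba, (y * w + x) * 4);
    }
  });
  return { w, h, pixels };
}
```

Throwing on an unknown character is a design choice for the sketch: a typo in a template then fails at atlas build time instead of producing a silently missing pixel.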

Procedural drawing for things that read fine as patterns — tiles (grass noise, brick courses, water dashes) and the boss (concentric discs for the big eye). This is far more compact than a 16×16 template per tile. The Zia sun (the treasure, replacing the Triforce) is drawn this way: a gold disc with a red core and four cardinal ray bundles (two long inner rays + two shorter outer), matching the New Mexico Zia symbol.

Sprites are packed with a simple shelf packer (alloc(w, h) advances a cursor, wrapping to a new shelf row when full), and each name records its texel rect in lookup[name]. SpriteRenderer.drawSprite('octorok_a', x, y) looks up that rect, optionally flips or tints it, and pushes a quad.
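
A shelf packer of this kind fits in a few lines. The sketch below is one plausible shape, not the code in atlas.ts: the class name, the place helper, and the error handling are assumptions; alloc(w, h) and lookup[name] follow the description above.

```typescript
interface AtlasRect { x: number; y: number; w: number; h: number; }

class ShelfPackerSketch {
  private cursorX = 0; // next free x on the current shelf
  private shelfY = 0;  // top of the current shelf
  private shelfH = 0;  // tallest sprite placed on the current shelf
  private readonly size: number;
  readonly lookup: Record<string, AtlasRect> = {};

  constructor(size: number) { this.size = size; }

  alloc(w: number, h: number): AtlasRect {
    if (w > this.size) throw new Error("sprite wider than atlas");
    if (this.cursorX + w > this.size) {
      // Shelf is full: wrap to a new row below the tallest sprite so far.
      this.shelfY += this.shelfH;
      this.cursorX = 0;
      this.shelfH = 0;
    }
    if (this.shelfY + h > this.size) throw new Error("atlas is full");
    const rect = { x: this.cursorX, y: this.shelfY, w, h };
    this.cursorX += w;
    this.shelfH = Math.max(this.shelfH, h);
    return rect;
  }

  // Hypothetical convenience: allocate and record the rect under a name.
  place(name: string, w: number, h: number): AtlasRect {
    const rect = this.alloc(w, h);
    this.lookup[name] = rect;
    return rect;
  }
}
```

Shelf packing wastes the space above shorter sprites on a shelf, which is usually acceptable when most sprites share one or two sizes, as 16×16 tiles and a 32×32 boss do here.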


5. Coordinates and the world model#

The virtual screen splits into a 64px HUD at the top and a 256×176 playfield below. The playfield is a 16×11 grid of 16px tiles. Entity positions live in playfield space (origin at the playfield's top-left); drawing adds the HUD offset.

VIRTUAL_W 256
┌───────────────────────────┐
│ HUD  (rupees, key, item, ♥)│ 64px
├───────────────────────────┤
│                           │
│   playfield 16×11 tiles   │ 176px   (COLS×ROWS, TILE=16)
│                           │
└───────────────────────────┘  VIRTUAL_H 240
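
The layout arithmetic above can be checked with a few constants. TILE, COLS, ROWS, VIRTUAL_W and VIRTUAL_H appear in the diagram; HUD_H, playfieldToVirtual and tileUnder are names assumed for this sketch.

```typescript
const TILE = 16;
const COLS = 16;
const ROWS = 11;
const HUD_H = 64;                      // assumed name for the HUD height
const VIRTUAL_W = COLS * TILE;         // 16 * 16 = 256
const VIRTUAL_H = HUD_H + ROWS * TILE; // 64 + 176 = 240

// Entities live in playfield space; drawing adds the HUD offset.
function playfieldToVirtual(x: number, y: number): [number, number] {
  return [x, y + HUD_H];
}

// Tile under a playfield-space point, or null when the point is outside the grid.
function tileUnder(x: number, y: number): { tx: number; ty: number } | null {
  const tx = Math.floor(x / TILE);
  const ty = Math.floor(y / TILE);
  if (tx < 0 || ty < 0 || tx >= COLS || ty >= ROWS) return null;
  return { tx, ty };
}
```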

Rooms are built, not hand-typed

Authoring full 16×11 ASCII screens by hand is error-prone, so rooms.ts builds each room from a spec: fill the interior with the base tile (. grass / f floor), draw a border (T trees / W walls), carve doors into the edges that have a neighbor, then stamp feature tiles on top:

{
  id: 'd_mid', kind: 'dungeon',
  edges: {
    down: { to: 'd_entry' }, up: { to: 'd_key' },
    left: { to: 'd_block' }, right: { to: 'd_boss', lock: true },
  },
  features: [ /* statues, enemy markers… */ ],
}
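
To make the fill, border, carve, stamp sequence concrete, here is a minimal sketch of a builder for a spec like the one above. Several details are assumptions made for the example and may differ from rooms.ts: doors are one tile wide and centered on each edge, an open door is carved as the base tile, and a feature is a hypothetical { x, y, tile } stamp.

```typescript
type Dir = 'up' | 'down' | 'left' | 'right';
interface EdgeSpec { to: string; lock?: boolean; }
interface FeatureStamp { x: number; y: number; tile: string; } // hypothetical shape
interface RoomSpecSketch {
  id: string;
  kind: 'overworld' | 'dungeon';
  edges: Partial<Record<Dir, EdgeSpec>>;
  features: FeatureStamp[];
}

const COLS = 16;
const ROWS = 11;

function buildRoomSketch(spec: RoomSpecSketch): string[] {
  const base = spec.kind === 'dungeon' ? 'f' : '.'; // floor or grass
  const wall = spec.kind === 'dungeon' ? 'W' : 'T'; // walls or trees
  // 1) Fill the interior and draw the border.
  const grid: string[][] = [];
  for (let y = 0; y < ROWS; y++) {
    const row: string[] = [];
    for (let x = 0; x < COLS; x++) {
      const border = x === 0 || y === 0 || x === COLS - 1 || y === ROWS - 1;
      row.push(border ? wall : base);
    }
    grid.push(row);
  }
  // 2) Carve a door into every edge that has a neighbor ('L' marks a locked door).
  const midX = Math.floor(COLS / 2);
  const midY = Math.floor(ROWS / 2);
  const doorPos: Record<Dir, [number, number]> = {
    up: [midX, 0], down: [midX, ROWS - 1], left: [0, midY], right: [COLS - 1, midY],
  };
  for (const dir of Object.keys(spec.edges) as Dir[]) {
    const edge = spec.edges[dir];
    if (!edge) continue;
    const [x, y] = doorPos[dir];
    grid[y][x] = edge.lock ? 'L' : base;
  }
  // 3) Stamp feature tiles on top.
  for (const f of spec.features) grid[f.y][f.x] = f.tile;
  return grid.map((r) => r.join(''));
}
```

Because the border and doors are derived from the edges table, two rooms that reference each other cannot disagree about where the door is, which is the main failure mode of hand-typed screens.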

Tile semantics live in a TILES table (sprite + default solid). Some tiles are dynamic and flip at runtime: a locked door L becomes passable once its doorId is in the unlocked set; a shutter S becomes floor when the room's puzzle is solved. Enemy/item/player markers (1 octorok, 2 moblin, 9 boss, k key, i item, P start, …) are extracted into entities at room-load and replaced with floor.
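
The "default solid plus runtime override" rule can be sketched as a single predicate. The table contents, sprite names, and the tileSolid signature below are assumptions; the L and S behavior follows the paragraph above.

```typescript
interface TileDef { sprite: string; solid: boolean; }

// Hypothetical subset of the TILES table (sprite names are made up).
const TILES_SKETCH: Record<string, TileDef> = {
  '.': { sprite: 'grass', solid: false },
  f: { sprite: 'floor', solid: false },
  T: { sprite: 'tree', solid: true },
  W: { sprite: 'wall', solid: true },
  L: { sprite: 'door_locked', solid: true },
  S: { sprite: 'shutter', solid: true },
};

function tileSolid(
  ch: string,
  doorId: string | undefined,
  unlocked: Set<string>,
  puzzleSolved: boolean,
): boolean {
  // Dynamic tiles: state decides, not the table default.
  if (ch === 'L') return !(doorId !== undefined && unlocked.has(doorId));
  if (ch === 'S') return !puzzleSolved;
  const def = TILES_SKETCH[ch];
  return def ? def.solid : true; // assumption: unknown tiles block movement
}
```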

The full level: two overworld screens and a cave leading into a five-room dungeon (d_entry → d_mid hub → d_key, d_block, d_boss).


6. Gameplay systems#

6.1 Movement and collision#

The hero moves in continuous pixels, not on a grid. Collision is a swept AABB vs. the tile grid, resolved one axis at a time (moveBody): move on X, scan the tiles spanned by the box's leading edge, snap out on a hit; repeat on Y using the already-resolved X. The collision box ({ox:3, oy:6, w:10, h:9}) is smaller than the 16×16 sprite — the classic feel where your "feet" decide collision, so you can tuck under overhangs and slip through doorways.

moveBody takes a solid(tx, ty) predicate, so the same routine serves the hero (playerSolid — respects unlocked doors, open shutters, blocks), projectiles, and enemies (enemySolid — keeps them on floor/grass and inside the room).
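
A minimal sketch of axis-separated resolution against a tile grid, under two stated assumptions: each per-frame step is shorter than one tile (so checking only the leading edge cannot tunnel), and the body's x/y is the sprite's top-left with the collision box offset inside it. The names and the EPS trick are illustrative; this is not the moveBody source.

```typescript
const TILE = 16;
const EPS = 1e-4; // keeps a box that is flush with a tile from counting as overlapping it

interface CollisionBox { ox: number; oy: number; w: number; h: number; }
interface BodySketch { x: number; y: number; box: CollisionBox; }
type SolidFn = (tx: number, ty: number) => boolean;

function moveBodySketch(b: BodySketch, dx: number, dy: number, solid: SolidFn): void {
  const { ox, oy, w, h } = b.box;
  if (dx !== 0) {
    b.x += dx;
    const top = Math.floor((b.y + oy) / TILE);
    const bottom = Math.floor((b.y + oy + h - EPS) / TILE);
    const edge = dx > 0 ? b.x + ox + w - EPS : b.x + ox; // leading edge on X
    const tx = Math.floor(edge / TILE);
    for (let ty = top; ty <= bottom; ty++) {
      if (solid(tx, ty)) {
        // Snap the box flush against the blocking tile column.
        b.x = dx > 0 ? tx * TILE - ox - w : (tx + 1) * TILE - ox;
        break;
      }
    }
  }
  if (dy !== 0) {
    b.y += dy;
    // Uses the already-resolved X.
    const left = Math.floor((b.x + ox) / TILE);
    const right = Math.floor((b.x + ox + w - EPS) / TILE);
    const edge = dy > 0 ? b.y + oy + h - EPS : b.y + oy; // leading edge on Y
    const ty = Math.floor(edge / TILE);
    for (let tx = left; tx <= right; tx++) {
      if (solid(tx, ty)) {
        b.y = dy > 0 ? ty * TILE - oy - h : (ty + 1) * TILE - oy;
        break;
      }
    }
  }
}
```

Resolving X before Y is what lets a body slide along a wall when moving diagonally: the blocked axis is snapped while the free axis keeps its full step.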

6.2 Combat#

Pressing attack freezes the hero for the swing and spawns a short-lived sword hitbox in the facing direction (swordHitbox()), active only in the first part of the swing. Overlap tests against enemies apply damage with a brief per-enemy i-frame so one swing is one hit. At full health the White Sword fires a beam projectile — exactly the original's reward for staying topped up.

Taking a hit applies knockback (velocity away from the source, decayed in updatePlayer) plus ~1s of invulnerability (rendered as a flicker). Health is tracked in half-hearts and drawn as full/half/empty heart sprites.
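
The half-heart bookkeeping maps to sprites with a small loop. The sprite names and the heartSprites function are hypothetical; only the full/half/empty split over half-heart units comes from the text.

```typescript
type HeartSprite = 'heart_full' | 'heart_half' | 'heart_empty'; // assumed sprite names

// halfHearts: current health in half-heart units; maxHearts: number of heart containers.
function heartSprites(halfHearts: number, maxHearts: number): HeartSprite[] {
  const out: HeartSprite[] = [];
  for (let i = 0; i < maxHearts; i++) {
    const remaining = halfHearts - i * 2; // half-hearts left for this container
    out.push(remaining >= 2 ? 'heart_full' : remaining === 1 ? 'heart_half' : 'heart_empty');
  }
  return out;
}
```

For example, 5 half-hearts across 3 containers draws full, full, half.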

6.3 Enemies and the boss#

  • Octorok — cardinal wander, periodically fires a rock in its facing dir.
  • Moblin — tougher cardinal wander, contact damage only.
  • Keese — fast, erratic free-flight that bounces off walls.
  • Boss — drifts horizontally with a vertical bob, bounces off the inner walls, and fires a 3-way fireball spread aimed at the hero. Its 32×32 sprite has two frames (eye narrow / wide). On death it drops the Zia sun and flashes the screen.

6.4 Projectiles, pickups, drops#

Projectiles carry a friendly flag: friendly beams hit enemies, enemy rocks and fireballs hit the hero; all die on walls. Pickups (key, rupee, heart, item, Zia sun) are AABB-collected. Slain enemies sometimes drop a heart or rupee.

6.5 The three "progression gates"#

These are what make it a dungeon and not just rooms:

  1. The key is hidden until the room is cleared. In d_key the key spawns into a gated list, not the live pickups. revealGatedIfCleared() promotes it (with a sparkle + flash) only once room.enemies.length === 0.

  2. The block puzzle. d_block's White Sword sits in a chamber sealed by a full-height shutter wall (col 5, rows 1–9 — fully enclosed, no way around). Pushing the movable block onto the floor switch (checkBlockSwitch) converts every S tile in the room to floor, opening the chamber. Block pushing is intentional: you must lean into a block for ~0.35s before it slides one tile, and only if the destination is clear.

  3. The locked door + small key. The d_mid → d_boss edge is a locked door. Walking into it with a key (tryUnlockAhead) consumes the key and records the door's shared doorId in the unlocked set, so both rooms treat it as open thereafter.
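
Two of the gates above reduce to very small state machines. The sketch below shows one way to write the cleared-room reveal and the lean-to-push timer; the GatedRoom shape, the string pickup ids, and the BlockPushSketch class are assumptions, while revealGatedIfCleared and the ~0.35s delay come from the list.

```typescript
interface GatedRoom { enemies: unknown[]; pickups: string[]; gated: string[]; }

// Promote gated pickups to live ones once the room is cleared.
// Returns true when something was revealed so the caller can play the sparkle + flash.
function revealGatedIfCleared(room: GatedRoom): boolean {
  if (room.enemies.length !== 0 || room.gated.length === 0) return false;
  room.pickups.push(...room.gated);
  room.gated.length = 0;
  return true;
}

const PUSH_DELAY = 0.35; // seconds of leaning before the block slides

class BlockPushSketch {
  private lean = 0;

  // Returns true on the frame the block should slide one tile.
  update(dt: number, leaning: boolean, destinationClear: boolean): boolean {
    if (!leaning || !destinationClear) {
      this.lean = 0; // letting go, or a blocked destination, resets the timer
      return false;
    }
    this.lean += dt;
    if (this.lean >= PUSH_DELAY) {
      this.lean = 0;
      return true;
    }
    return false;
  }
}
```

Resetting the timer whenever the hero stops leaning is what keeps a brush against the block from moving it.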

6.6 Screen transitions#

When the hero's center crosses a playfield edge that has a neighbor, the game enters a transition state for 0.5s. Both rooms are drawn offset and slid across the playfield (the HUD stays put), and the hero rides in with the incoming room — the NES "screen scroll" between rooms. Caves and stairs are instant cuts instead (checkTriggers).

Persistence is global, not per-room snapshot: unlocked doors stay open, takenPickups don't respawn, the boss stays dead — but ordinary enemies refill on re-entry, exactly like the original.
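
The slide itself is just two draw offsets driven by a normalized timer. This sketch assumes a linear slide (the article does not say whether easing is used) and hypothetical names; the 0.5s duration and the 256×176 playfield are from the text.

```typescript
const PLAYFIELD_W = 256;
const PLAYFIELD_H = 176;
const TRANSITION_TIME = 0.5; // seconds

type SlideDir = 'up' | 'down' | 'left' | 'right';

// Draw offsets (in playfield pixels) for the outgoing and incoming rooms.
function transitionOffsets(
  dir: SlideDir, elapsed: number,
): { outgoing: [number, number]; incoming: [number, number] } {
  const t = Math.min(1, Math.max(0, elapsed / TRANSITION_TIME));
  // Leaving through the right edge pans the view right, so both rooms slide left.
  const sx = dir === 'right' ? -1 : dir === 'left' ? 1 : 0;
  const sy = dir === 'down' ? -1 : dir === 'up' ? 1 : 0;
  const ox = sx * t * PLAYFIELD_W;
  const oy = sy * t * PLAYFIELD_H;
  return {
    outgoing: [ox, oy],
    incoming: [ox - sx * PLAYFIELD_W, oy - sy * PLAYFIELD_H],
  };
}
```

Halfway through a rightward transition the old room sits at x = -128 and the new one at x = +128; at the end the new room is at the origin.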


7. Input#

input.ts folds three sources into one tiny action set (up/down/left/right/attack/item/start):

  • Keyboard — arrows/WASD, Z/Space (sword), Enter/R (start). Edge presses are latched on keydown so a fast tap is never missed.
  • Gamepad — sampled each tick in poll(): left stick (with deadzone) or d-pad for movement, face buttons for sword/item, Start/Select for start. Edge presses come from diffing this frame's button set against last frame's.
  • Touch — on-screen buttons call setTouch(action, isDown).

down(a) is true if any source holds the action; pressed(a) is the latched edge for the frame, cleared by endFrame(). Movement reads down; one-shot actions (attack, start) read pressed.
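
The down/pressed/endFrame contract can be sketched independently of any real device. In this version each source (keyboard, gamepad, touch) reports state changes through one hypothetical setSource method; that method, the class name, and the GameAction type are assumptions, while down, pressed and endFrame follow the description above.

```typescript
type GameAction = 'up' | 'down' | 'left' | 'right' | 'attack' | 'item' | 'start';

class ActionInputSketch {
  private held = new Map<string, Set<GameAction>>(); // per-source held actions
  private latched = new Set<GameAction>();           // edge presses for this frame

  // Called by a source when an action goes down or up.
  setSource(source: string, action: GameAction, isDown: boolean): void {
    let set = this.held.get(source);
    if (!set) {
      set = new Set<GameAction>();
      this.held.set(source, set);
    }
    if (isDown) {
      if (!set.has(action)) this.latched.add(action); // latch the up -> down edge
      set.add(action);
    } else {
      set.delete(action);
    }
  }

  // True if any source currently holds the action.
  down(action: GameAction): boolean {
    let any = false;
    this.held.forEach((set) => { if (set.has(action)) any = true; });
    return any;
  }

  // True if the action had a press edge this frame, even if already released.
  pressed(action: GameAction): boolean { return this.latched.has(action); }

  endFrame(): void { this.latched.clear(); }
}
```

Because the edge is latched at event time, a key that goes down and up between two ticks still registers as pressed for exactly one frame.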

Lazy touch UI. touch_controls.ts does not build the overlay just because the device reports touch support (a touchscreen laptop driven by mouse/keyboard shouldn't get a d-pad over the game). Instead it registers one-shot touchstart / pointerdown(type=touch) listeners and builds the DOM + CSS only on the first real touch, then removes the listeners. The controls support multi-touch (move + attack at once) and diagonals (two d-pad buttons).


8. Engine integration notes#

A few small things make a 2D game cooperate with a 3D engine:

  • Dummy camera. The engine's frame loop short-circuits if there is no active camera, so the entry adds one orthographic Camera. Nothing in the 2D pass reads it; it just satisfies the gate.
  • Sim in beforeFrame, draw in addPasses. This ordering guarantees the batch reflects the just-simulated state. The graph is rebuilt every frame (cheap — one pass), so addPasses runs every frame and re-queues all quads.
  • Async atlas in feature setup. Returning a promise from setup lets the engine await texture creation before the first frame.
  • DOM for big text. The in-canvas HUD draws hearts/keys/rupees/digits as sprites, but the large centered messages (title, item-get, game-over, win) are a DOM overlay — robust regardless of letterbox scaling, and no need for a full text layout system in the batcher.

9. Performance#

The hot path allocates almost nothing per frame: the vertex staging array and GPU buffer are reused, sprite rects are looked up by name, and a full frame is a single draw. The atlas is built once. The render graph is rebuilt each frame but that is a handful of JS objects for one pass. At 256×240 internal resolution the GPU is essentially idle — this would run on anything with WebGPU.


10. Where to take it#

  • More dungeon floors — the room builder + doorId model scale to a grid of levels; add a level field and a stairs-between-floors transition.
  • Items with verbs — the item action and HUD item-box are already wired; add a bow (reuse the projectile system) or a boomerang.
  • Animation polish — add attack-pose hero frames, enemy death animations beyond the sparkle, and a damage-number/combo layer.
  • Audio — the engine ships an AudioEngine (spatial SFX, music, buses); a few sword/hit/pickup cues and a looping track would go a long way.

The whole thing is ~1.5k lines across seven files, and the only engine surface it touches is Pass / RenderFeature / RenderGraph (plus the dummy Camera that satisfies the frame loop). That's the takeaway: the render graph is general enough that "2D game" is just another pass.