Interfaces

An interface has 16.67 ms to draw the next frame and roughly 100 ms before a user calls it slow. Inside that envelope the browser parses HTML into a DOM, computes styles for every element, lays out the page, paints layers, composites them on the GPU, and ships pixels to the display. A 250 KB JavaScript bundle on a mid-range Android phone takes longer to parse than a frame budget allows. A button positioned three pixels off lands outside the thumb's natural arc. Interface work is the engineering of these constraints — the runtime, the render pipeline, layout, components, rendering strategies, platform APIs, performance budgets, accessibility, and the native and design surfaces that surround the web.

[Figure: The interface stack — from user input through the browser runtime down to the silicon. Layers: user (eye, fingers) · design system (tokens, components) · framework (React, Vue, Svelte, Solid) · web platform (DOM, CSSOM, Fetch, Workers) · render pipeline (parse, style, layout, paint, composite) · JS engine (V8, JSC, SpiderMonkey) · OS (GPU, CPU, pixels on glass). Every layer competes for the same 16.67 ms frame budget.]
The interface stack. Every layer is replaceable; none is optional. A slow choice anywhere shows up as a missed frame at the top.

The browser as a runtime

A native app ships with one runtime: the language VM, the OS's UI toolkit, and the libraries the developer chose. A web app ships into a runtime it does not control — a browser that another team built, on a device the user picked, configured for a network the developer never measured. The web runtime is the most heterogeneous deployment target in software, and every design decision starts from that fact.

The browser exposes three primary languages. HTML declares the document tree — elements, attributes, the parent/child structure that becomes the DOM. CSS describes how every element looks and lays out — colour, geometry, font, position. JavaScript runs as the imperative layer that mutates both at runtime: it adds nodes, changes styles, listens to user input, and talks to the network. The DOM is the shared data structure all three engines mutate; the main thread is the single CPU the JS engine runs on.

The JavaScript engine is itself a small compiler. V8 (Chrome, Edge, Node), JavaScriptCore (Safari, Bun), and SpiderMonkey (Firefox) all share a similar shape: a bytecode interpreter for cold code, and an optimising JIT (TurboFan in V8, FTL in JSC, Warp in SpiderMonkey) that recompiles hot functions to machine code based on observed types. Hidden classes and inline caches let property lookups on regular objects compile down to a single load instruction; the moment an object's shape changes unexpectedly, the cache invalidates and the lookup falls back to a slow generic path. Writing fast JS is mostly about keeping object shapes stable.
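Shape stability is easiest to see in code. A minimal sketch (function names illustrative): both constructors below return the same data, but one assigns properties in one fixed order, keeping a single hidden class, while the other adds a property conditionally after construction, forcing a shape transition.

```javascript
// Two ways to build the same point — one keeps a stable hidden class,
// one forces a shape transition per object.
function makeStable(x, y) {
  // Every object gets the same shape {x, y}, assigned in the same order.
  return { x, y };
}

function makeUnstable(x, y) {
  const p = { x };
  if (y !== undefined) p.y = y; // shape change after construction
  return p;
}

// A monomorphic call site: if every `p` passed in shares one hidden class,
// the engine can inline-cache the property loads down to single instructions.
function magnitudeSq(p) {
  return p.x * p.x + p.y * p.y;
}

const points = Array.from({ length: 1000 }, (_, i) => makeStable(i, i + 1));
let sum = 0;
for (const p of points) sum += magnitudeSq(p);
console.log(sum);
```

Both versions compute the same result; the difference only shows up in the inline caches. Mixing objects from both constructors at one call site makes it polymorphic, and the engine falls back toward the generic lookup path.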

[Figure: JS engine pipeline — parse (AST, roughly 1 ms / 100 KB) → bytecode (Ignition, interpret) → baseline JIT (Sparkplug, non-optimising) → optimising JIT (TurboFan, type-specialised) → deopt (shape changed, type guard failed). Hot-path triggers: a function called roughly 100 times, a loop with stable types, a monomorphic call site. Deopt triggers: adding a property after construction, passing a different type to a hot function, delete obj.x on a shaped object.]
Tiering lets the engine spend optimisation budget only on hot code. A single deopt can drop a function from 10 ns/call back to 200 ns/call.

Three caveats sit underneath everything else. The main thread is single-threaded — long JS blocks every other engine. The event loop processes one task at a time; if a task runs for 200 ms, no scroll, no click, no animation lands during that window. Memory is managed by a generational garbage collector — most objects die young and get freed cheaply, but a long-lived data structure that gets mutated each frame can drag old-generation collections that pause for 30 ms or more. And the DOM is not a JS data structure — it's a C++ tree exposed across a binding layer, so each element.style.color = "red" is a cross-boundary call that the engine cannot optimise the way it optimises pure JS arithmetic.

[Figure: The event loop — task queue (setTimeout, I/O, postMessage: click handlers, fetch completions, setTimeout callbacks; one task per loop iteration), microtask queue (Promise.then, queueMicrotask: resolved promises, mutation observers; drained completely before the next task), and the render step (style, layout, paint, composite; runs roughly every 16.67 ms). A microtask that spawns microtasks can starve the render step — the source of "the page froze".]
Promises drain before paint. An infinite await Promise.resolve() chain looks like a hang because no render step fires.
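The ordering rules can be observed directly. A runnable sketch: synchronous code finishes first, then the microtask queue drains completely, and only then does the next task (the timer callback) run.

```javascript
// Observe task vs microtask ordering: the microtask queue drains
// completely before the next (macro)task runs.
const order = [];

const timerDone = new Promise((resolve) => {
  setTimeout(() => {        // macrotask: runs on a later loop iteration
    order.push("task");
    resolve();
  }, 0);
});

Promise.resolve().then(() => order.push("microtask")); // drains first
queueMicrotask(() => order.push("microtask 2"));       // still before the task

order.push("sync"); // the current task finishes before any queue drains

timerDone.then(() => console.log(order.join(" → ")));
// sync → microtask → microtask 2 → task
```

In a browser, the render step would slot in between tasks — which is exactly why an unbroken microtask chain never lets a frame through.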

The async primitives all fall out of that loop. Promise schedules a microtask; async / await is the same with sugar. requestAnimationFrame schedules a callback for the next paint, ideal for visual updates because it runs at the right moment in the pipeline. requestIdleCallback fires only when the main thread is idle — useful for background work like prefetch or analytics. scheduler.yield() (Chrome) and setImmediate (Node, partial browser support) let a long task voluntarily cede the thread between input events. Posting to a MessageChannel is the fastest way to schedule a fresh task — it avoids setTimeout's nested-timer clamping — and is the trick behind several time-sliced library implementations.
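Those primitives combine into the standard pattern for keeping long work off the input path: chunk the work and yield between chunks. A sketch (helper names illustrative) that prefers scheduler.yield() where it exists and falls back to a setTimeout-based yield everywhere else:

```javascript
// Break a long task into chunks that yield back to the event loop.
// scheduler.yield() is Chrome-only at the time of writing, so this
// falls back to a setTimeout-based yield that works everywhere.
const yieldToLoop =
  typeof globalThis.scheduler?.yield === "function"
    ? () => globalThis.scheduler.yield()
    : () => new Promise((resolve) => setTimeout(resolve, 0));

async function processInChunks(items, workFn, chunkSize = 100) {
  const results = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    for (const item of items.slice(i, i + chunkSize)) {
      results.push(workFn(item));
    }
    await yieldToLoop(); // input events and rAF callbacks can land here
  }
  return results;
}

processInChunks([1, 2, 3, 4, 5], (n) => n * n, 2).then((r) =>
  console.log(r) // [1, 4, 9, 16, 25]
);
```

The chunk size is a trade: bigger chunks waste less time yielding, smaller chunks keep input latency lower. Fifty milliseconds of work per chunk is the usual ceiling, since that is where the long-task threshold sits.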

The trade-off is the price of the web's reach. Every interactive site runs on top of three coupled engines you do not control, on hardware that might be a 16-core M3 or a five-year-old Android phone with thermal throttling. The mental model that survives both is: HTML and CSS are declarations the browser can optimise heavily; JavaScript is the imperative escape hatch that costs main-thread time. Spend the budget where it pays the user back.

The render pipeline

A web page is not painted once and forgotten. Each scroll, each animation tick, each setState call risks producing a new frame, and a new frame is a small pipeline the browser walks from the DOM down to the GPU. Engineering performant interfaces is mostly about knowing which mutations re-enter which stage of that pipeline, and how expensive the re-entry is.

The pipeline has six named stages. Parse converts raw HTML and CSS bytes into the DOM and CSSOM trees. Style recalc matches every element to its applicable CSS rules, producing a computed style. Layout (sometimes called reflow) computes the geometric box for every element — position, width, height — given the styles and the viewport. Paint turns each layer into a list of drawing commands (filled rect, glyph run, image blit). Composite decides how the painted layers stack and what transforms apply to each. Raster runs the drawing commands into actual GPU textures the compositor swaps into the next frame.

[Figure: The render pipeline — six stages with rough timings on a mid-range laptop: parse (HTML → DOM, 2–8 ms per page) → style (match rules, 0.5–4 ms) → layout (geometry, 1–10 ms) → paint (display list, 0.5–3 ms) → composite (layer tree, roughly 0.5 ms) → raster (GPU blit, on the compositor). Re-entry points: a DOM insert or class change re-runs style → layout → paint → composite; a color/background change re-runs style → paint → composite (no layout); transform/opacity are composite-only and skip the main thread; reading offsetTop after a write forces a synchronous layout — forced reflow. The cheapest mutation is the one that only re-enters the compositor.]
The pipeline is one-directional but re-entrant. Knowing which property reaches which stage is the difference between a 60 fps animation and a 12 fps slideshow.

A worked trace makes the costs concrete. On a 2023 MacBook Air rendering a typical news article — around 2,400 DOM nodes, 8,000 CSS rules across three stylesheets, three above-the-fold images — Chrome's DevTools Performance tab shows roughly: HTML parse 6 ms, style recalc 3 ms, layout 7 ms, paint 2 ms, composite 0.4 ms. Total first-frame work on the main thread: around 18 ms, plus image decode that runs on a worker thread. Subsequent frames during a scroll cost almost nothing — composite-only — because scroll position is just a transform on the composited document layer.

The pipeline is engineered around one optimisation: composited layers. Properties on the GPU-friendly list — transform, opacity, filter, will-change — can be applied entirely on the compositor thread without re-running layout or paint. A 60 fps card-flip animation that uses transform: rotateY(180deg) runs on the compositor and never touches the main thread. The same animation written as width and left mutations forces layout and paint every frame and chokes on any device under a flagship phone. Force a layout in the middle of a JS task and the whole frame budget collapses — reading offsetTop or getBoundingClientRect after a write flushes pending style and layout work synchronously, a pattern called layout thrashing.
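The card-flip contrast reads clearest as stylesheet rules. A minimal sketch (class names illustrative): the first version animates a compositor-friendly property, the commented-out second version forces layout and paint every frame.

```css
/* Compositor-only card flip: transform never re-enters layout or paint,
   so the animation runs on the compositor thread at full frame rate. */
.card {
  transform-style: preserve-3d;
  transition: transform 300ms ease;
  will-change: transform; /* promotes to its own layer — use sparingly */
}
.card.flipped {
  transform: rotateY(180deg);
}

/* The layout-triggering version of a "similar" effect — avoid:
.card.flipped { width: 0; left: 150px; }
   width and left re-run layout + paint on every frame of the transition. */
```

will-change is a hint, not a free lunch: every promoted layer costs GPU memory, and blanket will-change on hundreds of elements can be slower than none at all.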

[Figure: Layout thrashing — interleaving reads and writes forces a synchronous layout per pair.]

Thrashing · 50 ms:

    for (let el of items) {
      el.style.width = "200px";
      let h = el.offsetHeight;          // sync layout
      el.style.height = h + 10 + "px";
      let w = el.offsetWidth;           // sync layout
    }
    // 100 items × 2 reads × 0.25 ms = 50 ms · drops 3 frames

Batched · 4 ms:

    // read phase
    const ws = items.map(e => e.offsetWidth);
    const hs = items.map(e => e.offsetHeight);
    // write phase
    items.forEach((e, i) => {
      e.style.width = ws[i] + "px";
      e.style.height = hs[i] + 10 + "px";
    });
    // 1 layout total · 1 paint

Read-then-write batching is the cheapest performance win on a list-heavy page.
The browser tries to defer layout until the next frame. Reading a geometry property mid-write defeats the deferral and pays the full cost per iteration.

The frame budget is the honest limit. At 60 fps the browser has 16.67 ms between vsync ticks to produce a new frame, and the OS compositor itself eats one or two milliseconds. At 120 fps — common on flagship phones and recent laptops — the budget shrinks to 8.33 ms. The compositor itself never blocks; the main thread does, and the main thread is where your JS, your style recalc, and your layout all live. Every frame you ship is a contract with the main thread to finish in under 10 ms.

CSS as a layout engine

CSS is the most underestimated programming language on the platform. Every selector is a pattern that matches against the DOM, every property cascades from parent to child unless reset, every rule competes against every other rule for the same slot on the same element. The browser does all of this on every style invalidation. Getting it right is the difference between an interface that scales to a million DOM nodes and one that stutters at ten thousand.

The cascade is the algorithm that picks a winner when multiple rules set the same property on the same element. It walks four ladders in order: origin and importance (user-agent vs author, plus !important), then specificity (a tuple counting ID selectors, class/attribute/pseudo-class selectors, and type selectors — so #nav .item beats .menu .item), then source order (later rules win ties), then inheritance (some properties — color, font-family, line-height — inherit from the parent if unset; most do not). Cascade layers (@layer) added in 2022 let you bound that fight: rules in an earlier layer always lose to rules in a later layer regardless of specificity, which makes a design-system base layer reliably overridable by app-level rules.
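The layer mechanics are easier to believe in code. A sketch (layer and class names illustrative): a high-specificity rule in an earlier layer still loses to a low-specificity rule in a later one.

```css
/* Cascade layers: a later layer wins regardless of selector specificity. */
@layer reset, base, components, app;

@layer components {
  /* Design-system rule with heavy specificity (1,2,0)… */
  #nav .button.primary { background: purple; }
}

@layer app {
  /* …still loses to this (0,1,0) rule, because `app` is declared
     after `components` in the @layer order above. */
  .button { background: green; }
}
```

Unlayered styles sit above all layers, which is why a design system shipped entirely inside layers stays overridable by any plain app stylesheet.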

[Figure: The cascade — how the browser picks a winner from competing CSS rules. Origin and importance: user !important > author !important > author normal > user-agent; later wins. Specificity: an a/b/c tuple of #id / .class and :hover / tag and pseudo-element counts — (1,2,1) > (0,3,5). Source order: across stylesheets and inline styles, the last declaration wins. Inheritance: color, font-family, line-height, visibility inherit; most properties reset. Cascade layers (@layer reset, base, components, app): a later layer always wins regardless of specificity, moving the fight out of selector arms-races into named order.]
Specificity is the most common source of "why doesn't my style apply." Layers turn the answer from a count to a name.

Layout has three engines worth knowing. Flexbox lays out children along one axis, with growing, shrinking, and alignment along the cross axis — perfect for toolbars, button rows, single-row card lists. Grid lays out children in two dimensions with named tracks and explicit areas — perfect for full-page layouts, dashboards, and any design where alignment crosses rows and columns. Containment (contain: layout style paint) tells the browser that a subtree's layout, style, or paint cannot affect the rest of the document, so the browser can skip work on the rest of the page when the contained subtree changes. Container queries (@container) finally let a component respond to its parent's size rather than the viewport — the missing piece that made truly reusable design-system components possible.
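The three engines compose. A sketch (selector names illustrative): Grid for the collection, containment plus a container query so each card responds to its slot rather than the viewport.

```css
/* Grid handles the two-dimensional collection. */
.card-list {
  display: grid;
  grid-template-columns: repeat(auto-fill, minmax(16rem, 1fr));
  gap: 1rem;
}

/* Each slot is a size container; its layout is also contained,
   so a card changing inside it cannot dirty the rest of the page. */
.card-slot {
  container-type: inline-size;
  contain: layout paint;
}

/* Fires on the slot's own width — the same card component lays out
   differently in a narrow sidebar and a wide main column. */
@container (min-width: 24rem) {
  .card { display: flex; gap: 0.75rem; }
}
```

Note the direction of the query: the component asks about its parent's size, which is exactly what viewport media queries could never express for a reusable component.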

[Figure: Flexbox versus Grid — one-dimensional flow versus two-dimensional placement. Flexbox (1D): main axis plus cross alignment, flex: 1 / flex: 2 / auto; use for rows of buttons, wrapping card lists, toolbar alignment. Grid (2D): tracks plus named areas (header, nav, main, aside); use for page-level layout, dashboards with cross-row alignment, photo galleries with grid-auto-flow: dense.]
Flexbox aligns along one axis at a time; Grid pins both axes simultaneously. Most pages use both — Grid for the page shell, Flexbox inside cells.

CSS also reshaped colour over the last few years. oklch() specifies colour in a perceptually uniform space — oklch(70% 0.15 240) means 70% lightness, 0.15 chroma, 240° hue, and a 10% lightness step looks like the same brightness change anywhere on the wheel. color-mix(in oklch, var(--brand), white 20%) blends two colours predictably without manually computing intermediates. Relative colour syntax lets you derive a colour from another (oklch(from var(--brand) calc(l + 0.1) c h)), turning a design system's tokens into a self-consistent system instead of a flat palette. These features remove the need for a Sass build step for most teams.
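Put together, the new colour functions turn a palette into a derivation. A sketch (custom property names illustrative):

```css
/* A self-consistent token ramp derived from one brand colour. */
:root {
  --brand: oklch(55% 0.15 260);

  /* Perceptually even lightness steps — each reads as the same jump,
     because oklch lightness is uniform across the hue wheel. */
  --brand-hover:  oklch(from var(--brand) calc(l + 0.1) c h);
  --brand-active: oklch(from var(--brand) calc(l - 0.1) c h);

  /* Predictable tint without hand-picking an intermediate value. */
  --brand-subtle: color-mix(in oklch, var(--brand), white 80%);
}
```

Changing --brand now re-derives the whole ramp — the work a Sass colour-function pipeline used to do at build time happens in the cascade instead.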

The trade-off is that CSS executes on every change. Style invalidation propagates from a changed element up, down, and across the DOM depending on which selectors might now match. A descendant combinator (.modal .button) forces a re-match across the modal subtree on any structural change inside it. A :has() selector — useful but expensive — can invalidate ancestors when descendants change. Avoiding deep selectors, scoping work with containment, and preferring class toggles over inline-style writes are not micro-optimisations on a complex app; they decide whether style recalc fits in the frame budget. CSS earned its place as a serious engineering surface the same way SQL did: people kept trying to escape it, and kept ending up writing worse versions of it.

The component model and state

Once an app grows past a few hundred elements, raw DOM manipulation stops being viable. Every interaction risks updating the wrong node, leaving stale state, or stacking event handlers. The component model is the response: split the UI into named, reusable units, each owning a slice of state and a render function, each receiving inputs (props) from its parent and emitting events upward. Data flows down, events flow up; nothing else crosses the boundary.

That contract sounds simple and isn't. The hard problem is reactivity — when a piece of state changes, which components need to re-render, and how does the framework know? Three families of answers compete, and the choice has real performance consequences past 10,000 components.

Virtual DOM (React) re-runs the component function on state change, produces a new tree of JSX-described nodes, and diffs it against the previous tree to compute a minimal patch list applied to the real DOM. Easy to reason about — your render function is a pure description of "what the UI looks like at this state" — but the diff costs CPU on every update, and the developer is responsible for telling React when not to re-run (memoisation via useMemo, memo, and the new React Compiler). At scale, careless React apps spend more time diffing than applying the patches.

Fine-grained reactivity (Vue 3, Solid, Svelte 5) flips the model. Each piece of state is a reactive primitive — a signal in Solid, a ref in Vue, a $state rune in Svelte 5 — and the framework tracks at compile or run time which DOM nodes depend on which signal. When the signal changes, only those nodes update. No diff, no re-execution of the component function. Solid's component function runs exactly once per mount; the closures it creates are the subscriptions. The cost is a sharper conceptual model — you can't freely destructure or read state outside a tracking scope — and a smaller ecosystem.
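The tracking mechanism fits in a few lines. A toy sketch of the fine-grained model — not any framework's actual implementation: reads register the currently running effect as a subscriber, writes notify only those subscribers.

```javascript
// A toy signal/effect system illustrating fine-grained reactivity.
let activeEffect = null;

function createSignal(value) {
  const subscribers = new Set();
  const read = () => {
    if (activeEffect) subscribers.add(activeEffect); // track the reader
    return value;
  };
  const write = (next) => {
    value = next;
    for (const fn of [...subscribers]) fn(); // notify only what read it
  };
  return [read, write];
}

function createEffect(fn) {
  activeEffect = fn;
  fn(); // the first run registers the subscriptions
  activeEffect = null;
}

// Only the effect that read `count` re-runs on write — no diff,
// no component re-execution.
const [count, setCount] = createSignal(0);
const log = [];
createEffect(() => log.push(`count is ${count()}`));
setCount(1);
setCount(2);
console.log(log); // ["count is 0", "count is 1", "count is 2"]
```

The sketch also shows where the "losing reactivity" footgun lives: `const c = count()` outside an effect copies the value once and subscribes nothing, which is why signal frameworks insist reads happen inside a tracking scope.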

Compiled reactivity (Svelte) takes the same idea further. Svelte's compiler reads your component at build time and emits imperative DOM-mutation code directly — no virtual DOM, no runtime reconciler beyond a tiny scheduler. Bundle sizes drop by tens of kilobytes; updates are pointer-fast. The trade is a heavier compiler and the need to learn Svelte's template syntax rather than plain JSX.

[Figure: Three reactivity models. Virtual DOM (React): state.x = 5 → component() returns vDOM → diff(old, new) → apply patches; cost per update O(component subtree), memoise to scope; main risk: over-rendering big trees. Signals (Solid, Vue 3, Svelte 5 runes): signal.set(5) → notify subscribers of the tracked node; cost per update O(subscribers), tracked at runtime; main risk: losing the reactivity scope. Compiled (Svelte templates): the compiler emits the write (textNode.data = 5) statically; cost per update O(1) for known writes, tracked at compile time; main risk: bespoke template syntax. All three converge on the same goal — update only what changed — by different routes.]
vDOM is "describe everything, diff." Signals are "subscribe what reads, notify what writes." Compiled is "emit the writes statically." The right choice depends on team size, app shape, and where the bottleneck actually is.

State scope matters as much as state mechanism. Local state lives in one component and disappears with it. Lifted state moves up to a common ancestor when two siblings need to share. Global state — a store, a context, a signal exposed at module scope — survives the component tree. Three layers, each with a cost: local is cheap but doesn't share; lifted forces re-renders down the whole subtree; global needs an explicit subscription model so that not every component re-renders on every change. Frameworks differ on the global story: Redux made reducers explicit; Zustand and Jotai went smaller; signal-based libraries (Solid, Vue) make global state just another signal.
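The "explicit subscription model" for global state is worth seeing concretely. A minimal sketch of the idea behind selector-scoped stores like Zustand — not its actual API: listeners subscribe to a slice, and a write only notifies listeners whose slice actually changed.

```javascript
// A minimal global store with selector-scoped subscriptions.
function createStore(initialState) {
  let state = initialState;
  const listeners = new Set();

  return {
    getState: () => state,
    setState(partial) {
      state = { ...state, ...partial };
      for (const l of listeners) l();
    },
    // Subscribe to a slice; the callback fires only when that slice changes.
    subscribe(selector, callback) {
      let prev = selector(state);
      const listener = () => {
        const next = selector(state);
        if (!Object.is(next, prev)) {
          prev = next;
          callback(next);
        }
      };
      listeners.add(listener);
      return () => listeners.delete(listener);
    },
  };
}

// The sidebar listener never fires on keystrokes into `draft`.
const store = createStore({ draft: "", user: "ada" });
const sidebarRenders = [];
store.subscribe((s) => s.user, (user) => sidebarRenders.push(user));
store.setState({ draft: "h" });    // user unchanged — sidebar does nothing
store.setState({ draft: "he" });
store.setState({ user: "grace" }); // fires exactly once
console.log(sidebarRenders); // ["grace"]
```

This is the narrowing described above: the store is global, but each component pays only for the slice it selects.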

[Figure: Data flows down, events flow up — three state scopes across an App → Header / Sidebar → Form / List tree (user · onLogout, user · onNav as props down, events up). Local: form input, disclosure open. Lifted: shared between siblings, still tree-local. Global: cross-tree, explicit subscription, selector-scoped (e.g. a List subscribing to a store).]
The three scopes are not stylistic — picking the wrong one is the most common cause of "the whole app re-renders on every keystroke."

The honest limit: state architecture is the lever that decides whether the app scales. A 200-component app survives anything; a 20,000-component dashboard with a single global store and no memoisation will drop frames on every keystroke. The fix is rarely "pick a different framework" — it's usually "narrow the subscription so the right components, and only the right components, re-render on the right writes."

Rendering strategies

A web page exists at one of four times: at build, on a server, at the edge, or in the browser. Picking which one renders the HTML decides what TTFB looks like, what LCP looks like, what the first interactive moment looks like, and how big the JavaScript bundle has to be. The named strategies are the recognisable points on that curve.

Client-side rendering (CSR) ships an empty HTML shell and a JavaScript bundle that builds the whole page in the browser. The bundle parses, the framework hydrates, the data loads, and only then does the user see content. Great for app-shaped interfaces behind a login, where SEO does not matter and the user is already committed; brutal for content sites, where Time-to-First-Byte arrives in 100 ms and the page is still blank at 3 seconds.

Server-side rendering (SSR) generates the HTML on each request, sends it down, then ships the JavaScript that hydrates the static markup into an interactive page. First paint arrives early because the server already shaped the content; interactivity arrives later because the hydration JS still has to run. Next.js, Remix, Nuxt, SvelteKit all default here.

Static site generation (SSG) does the same render at build time and serves the pre-rendered HTML from a CDN. Cheapest TTFB on the planet (10–30 ms from an edge POP), best caching story, no server cost per request — at the price of a build that takes longer as the site grows and a content-update lag bounded by the build pipeline. Blogs, docs, marketing sites live here.

Streaming SSR pipelines the render. The server flushes the HTML shell first, then streams in chunks of content as data resolves — using <Suspense> in React, await blocks in SvelteKit, <Suspense> in Vue. The user sees pixels at 100 ms even if the slowest data dependency takes 800 ms; the JS hydrates each chunk as it arrives. The improvement on perceived speed is large and the cost on engineering is mostly understanding which boundaries to draw.
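The shape of a streaming render is simple to sketch without a framework. An illustrative sketch (all names hypothetical): the shell flushes immediately, the slow data dependency resolves later, and its chunk streams in behind it.

```javascript
// Streaming SSR sketch: flush the shell first, stream slow content
// as its data resolves. Framework-free; in React this boundary would
// be a <Suspense> fence instead of an explicit await.
async function* renderPage(fetchComments) {
  yield "<html><body><h1>Article</h1><p>Body text…</p>"; // shell paints now
  yield '<div id="comments">loading…</div>';             // placeholder
  const comments = await fetchComments();                // slow dependency
  yield `<template id="c">${comments.join("")}</template>`; // late chunk
  yield "</body></html>";
}

async function collect(gen) {
  const chunks = [];
  for await (const chunk of gen) chunks.push(chunk);
  return chunks;
}

// Simulated slow data source.
const slowComments = () =>
  new Promise((r) => setTimeout(() => r(["<p>First!</p>"]), 50));

collect(renderPage(slowComments)).then((chunks) => {
  console.log(chunks[0].includes("<h1>")); // shell arrived before comments
});
```

In a real server, each yielded chunk is a res.write() flush; the browser parses and paints the early chunks while the late ones are still pending, which is where the perceived-speed win comes from.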

Islands (Astro, Qwik, Fresh) ship mostly static HTML with small interactive components — islands — hydrated independently. A blog post might be 100% static except for a comment widget and a like button, each its own small bundle. Total JS shipped: 5–20 KB instead of 200 KB. The model trades the convenience of "the whole page is one app" for a hard split between static and interactive zones.

[Figure: Rendering strategies on the TTFB / LCP / TTI trade-off. CSR: 200 KB JS bundle, no SEO content. SSR: 200 KB JS, good SEO. SSG: 0–80 KB JS, best SEO. Streaming SSR: 200 KB JS, good SEO. Islands: 5–20 KB JS, best SEO. Bar height ≈ time on a fast 4G connection; darker = main-thread blocking.]
No strategy is universally best. Islands win for content; SSR with streaming wins for dynamic pages; CSR survives for behind-login app shells where the first paint is a loading state anyway.

Hydration is the cost shared by every strategy except pure SSG. The server-rendered HTML is dead until the JS engine runs the framework, attaches event handlers, and rebuilds the component tree in memory. On a mid-range Android phone with 200 KB of JS to parse, that's 400–800 ms after the HTML arrived — a window during which clicks land on a page that looks interactive but isn't. Partial hydration (Astro), resumability (Qwik, which serialises the entire framework state into HTML so the client never re-runs setup), and progressive hydration (hydrate above-the-fold first, defer the rest) are the responses. React Server Components add a new shape — components that run only on the server, never ship JS for themselves, and stream their rendered output into a client tree — letting the client bundle skip everything that doesn't need interactivity.

[Figure: Hydration timeline, 0 ms to 2 s. SSR (full hydrate): HTML, then JS download + parse + hydrate, interactive well after first paint. Islands (partial): HTML, then each island hydrates independently, page interactive on first paint. Resumable (Qwik): HTML plus serialised state at first paint; listeners wired lazily on first interaction.]
The same first paint, three different paths to interactivity. The gap between "looks ready" and "is ready" is the bug users describe as "the site froze for a second."

Incremental Static Regeneration (ISR) sits between SSG and SSR — pages are served from a static cache, regenerated on a schedule or on demand when stale. Cheap reads, fresh content, no per-request server cost during steady state. Edge rendering moves the server to the CDN's edge POPs (Cloudflare Workers, Vercel Edge, Netlify Edge) — the server is closer to the user, so even SSR can hit 50 ms TTFB anywhere in the world. The trade is a constrained runtime (no Node-specific APIs, limited execution time per request) and a different deployment model.

The trade-off across the table: every shift left on the rendering curve (toward static / server) cuts JS bundle and improves first paint; every shift right (toward client) buys richer interactivity per byte. Pick the cheapest strategy that meets your interactivity floor. Most marketing and content sites should be SSG with islands; most logged-in app surfaces should be SSR with streaming; pure CSR survives only in app-shell scenarios where SEO does not exist.

The web platform

A browser used to be a renderer that ran a sandboxed scripting language. The shape now is closer to a portable OS: storage, networking, threading, peer-to-peer media, low-level graphics, and background execution are all exposed through standardised JavaScript APIs. Building a sophisticated interface is partly knowing which capability lives behind which API.

Networking. fetch() superseded XMLHttpRequest as the request primitive — promise-based, with explicit Request and Response objects, streaming bodies, and AbortController for cancellation. Streams (ReadableStream, WritableStream, TransformStream) let you process a 100 MB response incrementally instead of buffering it. Server-Sent Events (EventSource) deliver a one-way stream of text events over HTTP — perfect for token streams from an LLM. WebSockets open a full-duplex, message-oriented channel — perfect for chat, collaborative editing, live game state. WebTransport runs on HTTP/3 with QUIC underneath, giving you unreliable datagrams and reliable streams on the same connection.
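Incremental processing with streams looks like this in practice. A sketch that stands in a hand-built ReadableStream for a real response.body — the same pipeline applies to a fetch() — and re-chunks the bytes on newlines, roughly what an SSE or NDJSON consumer does. (Async iteration over ReadableStream is recent in browsers; it works in Node 18+.)

```javascript
// Process a "response body" incrementally instead of buffering it.
function makeSource() {
  return new ReadableStream({
    start(controller) {
      const enc = new TextEncoder();
      // The network delivers arbitrary splits, not tidy lines.
      for (const chunk of ["data: hel", "lo\n", "data: world\n"]) {
        controller.enqueue(enc.encode(chunk));
      }
      controller.close();
    },
  });
}

async function readLines(stream) {
  const lines = [];
  let buffer = "";
  const decoded = stream.pipeThrough(new TextDecoderStream());
  for await (const part of decoded) {
    buffer += part;
    let idx;
    while ((idx = buffer.indexOf("\n")) >= 0) { // re-chunk on newlines
      lines.push(buffer.slice(0, idx));
      buffer = buffer.slice(idx + 1);
    }
  }
  return lines;
}

readLines(makeSource()).then((lines) => console.log(lines));
// ["data: hello", "data: world"]
```

Memory stays bounded at one partial line, no matter how large the response — the property that matters when the body is 100 MB or an unbounded event stream.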

Storage. localStorage is synchronous, string-only, and capped around 5 MB — fine for a theme preference, wrong for anything structured. sessionStorage is the same but cleared on tab close. IndexedDB is the structured client database — async, indexed, supports binary blobs, capped by available disk minus a safety margin (often gigabytes). The Cache API stores Request/Response pairs and is what Service Workers use to implement offline. Origin Private File System (OPFS) gives you a sandboxed POSIX-ish filesystem accessible only to your origin — used by SQLite-WASM and other in-browser databases.

Workers. The main thread is single-threaded, but the browser is not. Web Workers run JS in a separate OS thread with no DOM access, communicating over postMessage with structured-clone serialisation. SharedWorker is one worker shared across tabs of the same origin. Service Workers are network proxies — installed once, they intercept fetch requests and can serve them from cache, modify them, or synthesise responses entirely. The Service Worker is what makes an app installable, offline-capable, and push-notification-capable.

[Figure: The browser as an operating system — capability APIs grouped by domain. Networking: fetch(), Streams, WebSocket, EventSource (SSE), WebRTC, WebTransport, Beacon, Service Worker. Storage: localStorage, sessionStorage, IndexedDB, Cache API, OPFS, Cookie Store, FileSystem API, storage quota. Compute: Web Workers, SharedWorker, Service Worker, WebAssembly, WebGPU, WebGL2, Canvas / OffscreenCanvas, Web Audio. Media, input, device: getUserMedia, Pointer Events, Web MIDI, WebHID, WebUSB, Geolocation, MediaStream, Keyboard Lock, Gamepad, WebNFC (Android), Vibration, Wake Lock. Availability varies by browser and platform — check caniuse.com before reaching for the obscure ones.]
The capability surface is wider than any single team uses. The shape of an app is largely a choice of which corners to live in.

Media and peer-to-peer. getUserMedia() opens camera and microphone (with user permission). WebRTC adds peer-to-peer audio, video, and data channels — the substrate behind Google Meet, Discord voice, and most browser-based video calls. The signalling is left to you (typically WebSocket); WebRTC handles the NAT traversal, the SRTP encryption, and the codec negotiation.

Low-level graphics. WebGL2 exposes OpenGL ES 3.0 — adequate for most 3D needs. WebGPU, shipping in stable browsers from 2023 onward, exposes the explicit-graphics model used by Vulkan, Metal, and D3D12 — bind groups, compute pipelines, command encoders — at performance levels competitive with native. Used for browser-based ML inference, GPU-accelerated image processing, and the next generation of in-browser games.

WebAssembly sits alongside JS as a second compile target. Compiled C, C++, Rust, Go, AssemblyScript, and an increasing list of other languages produce .wasm modules the browser executes at near-native speed. The boundary with JS is explicit: shared memory through SharedArrayBuffer, function calls through imports and exports, no DOM access without a JS bridge. WebAssembly earns its place when CPU-bound work is the bottleneck — Photoshop on the web, Figma's vector engine, in-browser SQLite (sql.js, OPFS-backed wa-sqlite), AV1 decoders, ML inference runtimes. The ergonomic cost is the boundary itself; nothing is free across it, and a chatty call pattern can be slower than pure JS.
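The explicit boundary is small enough to show whole. The bytes below are a hand-assembled module — magic number, then type, function, export, and code sections — equivalent to what `wat2wasm` emits for a one-function module exporting `add(i32, i32) -> i32`.

```javascript
// The smallest useful WebAssembly module: exports add(i32, i32) -> i32.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // "\0asm" + version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32,i32)->i32
  0x03, 0x02, 0x01, 0x00,                               // func 0 uses type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = func 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section, 1 body
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

WebAssembly.instantiate(wasmBytes).then(({ instance }) => {
  // Every call crosses the JS↔wasm boundary — cheap, but not free,
  // which is why a chatty call pattern can lose to pure JS.
  console.log(instance.exports.add(2, 3)); // 5
});
```

Real modules arrive via WebAssembly.instantiateStreaming(fetch("mod.wasm")), which compiles while the bytes download; the inline array here just keeps the sketch self-contained.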

The trade-off is fragmentation. Capability detection (if ('serviceWorker' in navigator)) is non-negotiable; a feature that works in Chrome may be stage-2 in Safari and unimplemented in Firefox. Bridge APIs (WebUSB, WebHID, WebNFC) ship in Chromium-derived browsers and not in the others by design — Apple has chosen not to implement them. The practical rule: lean on the platform where it's broadly supported (Fetch, Streams, IndexedDB, Service Workers, WebSockets, WebRTC are universal), and treat the long-tail capability APIs as enhancements that gracefully degrade.

Performance for users

User-perceived performance is not a feeling. It is a measurable distribution of milliseconds across the page lifecycle, and Google's Core Web Vitals are the three numbers most teams now treat as the contract. Largest Contentful Paint (LCP) measures when the largest above-the-fold image or text block becomes visible — target under 2.5 seconds at the 75th percentile across real users. Interaction to Next Paint (INP) replaced First Input Delay in 2024 and measures the worst tap-to-paint latency across the session — target under 200 ms. Cumulative Layout Shift (CLS) measures how much visible content jumps around during load — target under 0.1.

Three thresholds are worth holding in mind beneath those numbers. Under 200 ms feels instantaneous. Under 1 second feels responsive and keeps attention. Above 10 seconds breaks the user's task — by then they have switched tabs, lost the goal, or left. The Core Web Vitals thresholds are tuned to keep most interactions inside the first two windows.

[Figure: Core Web Vitals with target ranges, measured at the 75th percentile across real users — synthetic numbers do not count. LCP (largest paint): good ≤ 2.5 s, needs work 2.5–4 s, poor > 4 s; levers: preload the LCP image, AVIF/WebP, font swap, cut render-blocking JS. INP (tap → next paint): good ≤ 200 ms, needs work 200–500 ms, poor > 500 ms; levers: break long tasks, debounce expensive work, workers for heavy compute, CSS containment. CLS (layout shift): good ≤ 0.1, needs work 0.1–0.25, poor > 0.25; levers: explicit width/height, reserve space for ads, size-adjust for fonts, min-height skeletons.]
Three numbers, three thresholds. Every product team should know its current values and the worst-performing route in the app.

A worked Core Web Vitals budget turns the targets into concrete decisions. Consider a product landing page: hero image, headline, paragraph, CTA, three feature cards. On a mid-range Android phone over a 4G connection (RTT 100 ms, throughput 5 Mbps after slow-start), the budget walks like this.

LCP target 2.5 s. The largest element is the hero image. Network goes: DNS 30 ms, TCP+TLS 200 ms, server TTFB 200 ms — 430 ms before any byte of HTML. The HTML itself is 30 KB gzipped: 50 ms. Critical CSS inline in the head: 0 ms extra. The browser sees the <img> tag at 480 ms; the image is 80 KB AVIF: 130 ms to download. Decode: 30 ms. LCP arrives around 640 ms — well under budget. AVIF over JPEG is the single biggest LCP lever — typical 50% size reduction at equal quality. WebP gives most of the win with broader Safari support pre-2023. Preloading the hero (<link rel="preload" as="image">) saves another 50–100 ms.
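The transfer arithmetic behind that walk is worth making explicit, since it is the part teams re-run for their own pages. A sketch using the same assumptions as above (5 Mbps effective throughput after slow-start; function name illustrative):

```javascript
// Transfer time for a compressed payload at a given effective throughput.
function downloadMs(sizeKB, throughputMbps) {
  const bits = sizeKB * 1024 * 8;           // payload in bits
  return (bits / (throughputMbps * 1e6)) * 1000;
}

console.log(Math.round(downloadMs(30, 5)));  // 30 KB HTML:      ~49 ms
console.log(Math.round(downloadMs(80, 5)));  // 80 KB AVIF hero: ~131 ms
console.log(Math.round(downloadMs(160, 5))); // same hero, JPEG: ~262 ms
```

The JPEG row is the format lever in numbers: the roughly 50% size win of AVIF is 130 ms of LCP on this connection, before any code changes at all.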

INP target 200 ms. Every interactive element — the CTA, the cards, the menu — must respond within 200 ms of tap. The bundle is the constraint. A 250 KB gzipped JS bundle parses in roughly 100 ms on a mid-range phone; if the framework runtime is 60 KB of that, only the rest is your code. Long tasks (over 50 ms on the main thread) block input handling; scheduler.yield() and requestIdleCallback break work into chunks that fit between input events. Heavy synchronous work — hashing, JSON parsing of a 10 MB blob, image manipulation — belongs in a Worker.
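
The chunking pattern needs no framework. A sketch in which yieldToMain is a hand-rolled stand-in (a timeout-resolved promise) for the newer scheduler.yield(), which is not available in every browser:

```javascript
// Give the event loop a chance to handle pending input between chunks.
// Falls back to setTimeout(0) where scheduler.yield() is unavailable.
function yieldToMain() {
  if (globalThis.scheduler?.yield) return scheduler.yield();
  return new Promise((resolve) => setTimeout(resolve, 0));
}

// Process items in chunks small enough to stay under the 50 ms long-task line.
async function processInChunks(items, processOne, chunkSize = 100) {
  const results = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    for (const item of items.slice(i, i + chunkSize)) {
      results.push(processOne(item));
    }
    if (i + chunkSize < items.length) await yieldToMain(); // let taps through
  }
  return results;
}
```

Each await returns control to the event loop, so a tap that arrives mid-batch paints within one chunk's duration instead of waiting for the whole job.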

[Figure: INP — input → next paint, with and without a long task in the way, on a 0–400 ms timeline. Good (80 ms INP): tap → short delay → 40 ms handler → paint. Bad (340 ms INP): tap → blocked by a 144 ms long task → handler → 96 ms style + layout → paint.]
INP measures the worst tap-to-paint of the session. One pathological interaction on a 5-year-old phone defines the field score.

CLS target 0.1. Every image and embed needs explicit width and height (or aspect-ratio) so the browser can reserve space before the asset loads. Ads and embeds need their own reserved slots — a 250 px ad that arrives after the page rendered will push 250 px of content down, eating CLS budget instantly. Custom fonts shift text when they swap: font-display: swap makes the swap visible (better LCP, risks CLS), optional blocks the swap (worse LCP, no CLS); size-adjust and the @font-face overrides line the metrics up so the swap is invisible.
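
CLS is not a raw sum: shifts are grouped into session windows (a gap under 1 s between shifts, each window capped at 5 s), and the page's score is the worst window. A sketch of that aggregation over hypothetical layout-shift entries, in the shape a PerformanceObserver delivers:

```javascript
// entries: [{ startTime: ms, value: shift score }], sorted by startTime.
function cumulativeLayoutShift(entries) {
  let worst = 0;
  let windowSum = 0, windowStart = 0, prevTime = -Infinity;
  for (const { startTime, value } of entries) {
    const sameWindow =
      startTime - prevTime < 1000 && startTime - windowStart < 5000;
    if (sameWindow) {
      windowSum += value;
    } else {
      windowSum = value;       // start a new session window
      windowStart = startTime;
    }
    prevTime = startTime;
    worst = Math.max(worst, windowSum);
  }
  return worst;
}

// Two shifts close together form one window; a late shift starts a new one.
console.log(cumulativeLayoutShift([
  { startTime: 100, value: 0.05 },
  { startTime: 600, value: 0.05 },   // same window → 0.1
  { startTime: 9000, value: 0.06 },  // new window → 0.06
])); // 0.1
```

The windowing is why one late-arriving ad can blow the budget on its own: its shift lands in a fresh window with nothing to dilute it.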

[Figure: Worked LCP budget — request through paint on a mid-range phone over 4G. Target 2.5 s, arrived ~640 ms: DNS 30 ms → TCP + TLS 200 ms → TTFB 200 ms → HTML 50 ms → image download 130 ms → decode 30 ms → paint, with LCP firing at the end. Every box is a lever — TTFB, format, preload, decode priority.]
Walked end to end, the LCP budget is mostly network and decode, not framework code. Optimising hero delivery is more leverage than any framework swap.

The toolbox underneath those targets is consistent across teams. Code-splitting with dynamic imports keeps the initial bundle small — route-level chunks load only the JS the current route needs. Tree-shaking in current bundlers (Vite, esbuild, Rollup) drops unused exports automatically. Lazy loading with loading="lazy" on images defers below-the-fold work to scroll time. Font subsetting drops glyphs you do not use — a Latin-only subset of a typical Google Font drops from 400 KB to 30 KB. fetchpriority="high" on the LCP image tells the browser to schedule the download ahead of others. HTTP/2 server push is dead; 103 Early Hints is the surviving mechanism for telling the client to preload critical assets before the main response arrives.
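
Route-level code-splitting leans on dynamic import() returning a promise, and on caching that promise so each chunk loads once per session. A framework-agnostic sketch; loadSettings and its loader are hypothetical stand-ins for () => import("./settings.js"):

```javascript
// Wrap a chunk loader so repeated navigations reuse the in-flight promise.
function lazyOnce(loadChunk) {
  let promise = null;
  return () => (promise ??= loadChunk());
}

// Hypothetical usage — in an app the loader would be a dynamic import().
let loads = 0;
const loadSettings = lazyOnce(() => {
  loads++;
  return Promise.resolve({ render: () => "settings page" });
});

loadSettings();
loadSettings();      // same promise, no second network request
console.log(loads);  // 1
```

Caching the promise rather than the module means concurrent navigations to the same route share one request instead of racing two.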

The honest limit is JavaScript. Every kilobyte of JS costs parse, compile, and execution time on the device's CPU, not just download time on the network. On flagship hardware the cost is invisible; on the 5-year-old Android phone in a developing market, 500 KB of JS is the difference between a 2-second LCP and a 9-second one. Performance work is mostly the discipline of shipping less code, later, on fewer threads.

Accessibility as engineering

About 16% of the world's population — over a billion people — lives with some form of disability. The fraction of users who navigate your site by keyboard, by screen reader, by voice, or with adapted input is larger than any browser's market share except Chrome. Accessibility is the engineering work that lets one codebase serve all of them. Done right, it improves the interface for every user; done wrong, it locks out a measurable fraction and exposes you to legal risk in most jurisdictions.

The foundation is semantic HTML. A <button> is keyboard-focusable, has a default Enter/Space activation, announces as a button to screen readers, has correct focus styling, and participates in the form-submission contract. A <div onclick> has none of these. Every native element — <a>, <button>, <input>, <label>, <select>, <form>, <table>, <dialog> — encodes a contract with assistive technology that an arbitrary div does not. The first accessibility rule is: use the native element if one exists.

ARIA (Accessible Rich Internet Applications) is the patch when the platform falls short. role tells assistive tech what an element is when the HTML can't (role="tab" on a custom tab). aria-label provides an accessible name when no visible text exists. aria-live="polite" makes a region announce its updates to screen readers. aria-expanded, aria-selected, aria-checked describe state. The hard rule, drilled into every accessibility engineer, is "no ARIA is better than bad ARIA" — a wrong role or stale aria-expanded is worse than no attribute at all because it lies to the user.

[Figure: The accessibility layer — semantic HTML, ARIA, focus, keyboard, screen reader. DOM nodes map to accessibility-tree nodes: <button>Save</button> → button "Save" (focusable, pressable, enabled — native semantics and state); <a href="/x"> → link (focusable, activates the URL; href required for link role); <input aria-label="search"> → textbox "search" (accessible name patched in); <div role="tab" tabindex="0"> → tab "Overview" (selected) in a tablist — custom, so keyboard handling and role must be supplied by hand. Screen readers read the accessibility tree, not the DOM — every visual element needs a tree counterpart.]
The accessibility tree is what screen readers see. Native HTML maps cleanly; custom components need ARIA to build their tree node.

A worked accessibility audit makes the decisions concrete. Consider a sign-up form: email, password, confirm password, marketing-opt-in checkbox, submit button. Across the form, the engineering questions cascade.

Each input needs a programmatic label. <label for="email">Email</label><input id="email" type="email"> ties them with for/id. Screen readers will announce "Email, edit text" when the input gets focus. A placeholder is not a label — it disappears the moment the user types, leaving the user unable to recall the field's purpose. Visually-hidden labels (<label class="sr-only">) work when design dictates a label-less look, but they must exist.

Focus order follows reading order. Tab moves Email → Password → Confirm → Checkbox → Submit. Natively focusable elements join the tab order as if they had tabindex="0"; positive tabindex values (tabindex="3") override document order and are almost always wrong. The checkbox is a real <input type="checkbox">, not a styled div with a click handler — Space toggles it, and screen readers announce "checked" or "not checked."

[Figure: Focus order on a sign-up form — Tab moves through fields in reading order: (1) Email (type=email, aria-describedby="email-err"), announced "Email, edit text, required"; (2) Password, "Password, edit text"; (3) Confirm password, "Confirm password, edit text"; (4) marketing checkbox, "Email me, checkbox, not checked"; (5) Create account, "Create account, button". Tab order matches visual order matches screen-reader order — three views of the same sequence.]
Right-side annotations are what a screen reader announces. The first regression test is "can a sighted keyboard user complete this form without a mouse?"

Inline errors after a failed submit need to be announced. Three pieces work together. The input gets aria-invalid="true". The error message gets an id, and the input gets aria-describedby="email-error" pointing at it. A live region (<div aria-live="assertive" role="alert">) announces the summary error once submission fails — "Sign-up failed. 2 fields have errors." Focus moves programmatically to the first invalid field so the user lands on the problem instead of hunting for it.

Colour contrast must hit WCAG 2.2 AA: 4.5:1 for body text against background, 3:1 for large text and UI components. A subtle "Forgot password?" link at #999 on white background is around 2.8:1 — fails. The fix is one design-token swap. Errors marked only in red colour also fail — colour alone cannot carry meaning for users with red-green colour blindness or screen-reader users. Add an icon, the word "Error:", or both.
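
The 4.5:1 figure comes from WCAG's relative-luminance formula, which is short enough to compute directly. A sketch that reproduces the failing #999-on-white example:

```javascript
// WCAG relative luminance of a hex colour like "#999999" or "#999".
function luminance(hex) {
  const h = hex.replace("#", "");
  const full = h.length === 3 ? [...h].map((c) => c + c).join("") : h;
  const [r, g, b] = [0, 2, 4].map((i) => {
    const channel = parseInt(full.slice(i, i + 2), 16) / 255;
    return channel <= 0.03928
      ? channel / 12.92
      : ((channel + 0.055) / 1.055) ** 2.4;
  });
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colours, per WCAG 2.x.
function contrast(fg, bg) {
  const [hi, lo] = [luminance(fg), luminance(bg)].sort((a, b) => b - a);
  return (hi + 0.05) / (lo + 0.05);
}

console.log(contrast("#999", "#fff").toFixed(2)); // "2.85" — fails AA's 4.5:1
console.log(contrast("#767676", "#fff") >= 4.5);  // true — a passing grey
```

Running this in CI over the design-token palette catches contrast regressions before a designer's eye has to.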

Keyboard navigation must complete the task. A user with no mouse must reach Submit, fill every field, and submit. A keyboard-only walkthrough catches the bugs visual testing misses: focus traps in modals (Tab cycles within an open dialog and Esc closes it), skip-to-content links (a <a href="#main"> first in the tab order lets keyboard users skip the nav), and visible focus rings (:focus-visible styling that's clearly visible against the background).
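
The core of a modal focus trap is index arithmetic: Tab past the last focusable element wraps to the first, and Shift+Tab from the first wraps to the last. A sketch of the wrapping logic separated from the DOM; the DOM half is querying the dialog's focusable elements and calling .focus():

```javascript
// Next index in a focus trap of `count` focusable elements.
// shiftKey reverses direction; both directions wrap around.
function nextFocusIndex(current, count, shiftKey) {
  const step = shiftKey ? -1 : 1;
  return (current + step + count) % count;
}

console.log(nextFocusIndex(2, 3, false)); // 0 — Tab off the last wraps to first
console.log(nextFocusIndex(0, 3, true));  // 2 — Shift+Tab off the first wraps to last
console.log(nextFocusIndex(1, 3, false)); // 2 — ordinary forward move
```

In the real handler this runs on keydown for "Tab" inside an open dialog with preventDefault(), and Escape closes the dialog and returns focus to the element that opened it.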

WCAG 2.2 AA is the minimum level most legal frameworks treat as "accessible." It defines 55 testable criteria across four principles — Perceivable, Operable, Understandable, Robust. Modern testing tools (axe-core, Lighthouse, Pa11y) catch around a third of WCAG violations automatically; the rest need manual review with a screen reader (NVDA on Windows, JAWS for the enterprise market, VoiceOver on macOS and iOS, TalkBack on Android). The engineering practice is to run automated checks on every PR (axe-core in unit tests) and manual screen-reader testing on every flow before release.

The trade-off is engineering time, and the honest answer is: less than you think, if you start with semantic HTML; far more than you think, if you bolt accessibility on after launch. Every framework component you write is either accessible by default or accessible by retrofit — and retrofits are always more expensive than getting the contract right the first time.

Mobile, native, and design as a discipline

The web reaches further than any other distribution channel, but it is not the only surface. Phones are the dominant general-purpose computing device, and the constraints there — thermal, battery, network, screen size, touch input, app-store rules — shape interface engineering in ways the web does not. Three paths reach a phone.

Native means writing once per platform in the platform's language and toolkit. iOS: Swift with SwiftUI (declarative, modern) or UIKit (imperative, mature). Android: Kotlin with Jetpack Compose (declarative) or the older XML-layout + View system. Native delivers the best performance, lowest battery use, and full access to every OS API the day it ships — Live Activities, Dynamic Island, ARKit, Health, CarPlay, WidgetKit. The cost is two codebases, two engineering tracks, and two release cycles. Native pays off when the app is the product, the user is in it daily, and the surface justifies the duplication.

Cross-platform toolkits target the duplication directly. React Native runs JS in a separate engine (Hermes by default on both platforms, historically JSC on iOS) and bridges to native UIKit / Android Views; the new architecture (Fabric + TurboModules) shrinks the bridge cost. Flutter does not use the native widgets at all — it ships its own rendering engine (Skia, now Impeller on iOS) and draws every pixel from a Dart codebase, achieving consistent visuals at the cost of platform-feel divergence. Both let one team ship to both phones; both also lag the platforms by months when new OS features arrive.

Progressive Web Apps (PWAs) bring the web to phones via the Service Worker, the manifest, and install-to-home-screen. Installable, offline-capable, push-notifiable on most platforms (Apple is selectively restrictive). No app-store gatekeeper, instant updates, and the same codebase as the web. The trade is reduced access to OS capabilities — file pickers, Bluetooth, NFC, contacts work in pockets of the spec, and the experience on iOS is more constrained than on Android.

[Figure: Mobile delivery paths — native, cross-platform, PWA — on the cost/capability trade-off. Native (Swift, Kotlin): full day-1 OS APIs, best performance with native widgets, two codebases and two teams, app-store review (1–7 days); wins for a flagship app in daily use. Cross-platform (React Native, Flutter): most capability with a plugin gap, near-native performance (Fabric), one codebase plus platform shims, app-store review still applies; wins for two platforms with a small team. PWA (web, Service Worker): web-platform subset, good performance on main-thread JS, one codebase owned by the web team, instant shipping with no review; wins for content and light interaction. Most teams ship a web product first, then a native app if engagement justifies it.]
The phone surface is three routes to the same screen. Pick the one whose constraint shape matches the team you have.

Across all three surfaces, design as a discipline decides whether the interface succeeds. Design is not the colour of a button; it is the set of decisions about what the product does and how the user moves through it. A designer who only makes screens hands a developer pictures; a designer who decides product behaviour hands a developer a spec for how the system responds to a user goal. The second one is harder to hire and unambiguously more valuable.

A working design system centralises decisions before each surface re-decides them. Design tokens are the foundational primitives — --color-bg-primary, --radius-md, --space-4, --font-size-lg — that propagate across web, iOS, and Android in a single source (often JSON via Style Dictionary, Tokens Studio, or the W3C Design Tokens spec). A token change updates every surface in one PR. Above the tokens sit component primitives — Button, Input, Modal, Toast — each accessible by default, each with documented props and states. Above those sit patterns — Sign-up flow, Empty state, Error recovery — that compose primitives into recognisable interaction shapes.

[Figure: Design tokens flow — one source propagates to every platform. tokens.json (color.bg.primary, space.4, radius.md) feeds the web (CSS custom properties, --bg-primary, consumed by React / Vue / Svelte), iOS (Swift constants, Color.bgPrimary, consumed by SwiftUI / UIKit), and Android (Compose tokens, MaterialTheme, consumed by Jetpack Compose). Style Dictionary or similar emits per-platform output from one JSON source of truth.]
A rebrand becomes a one-PR change. Without tokens it becomes a six-month migration touching every surface.
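
The one-source-many-outputs step is a small tree walk. A sketch of the web half — flattening a nested token object into CSS custom properties; the token names are illustrative, not a real design system:

```javascript
// Flatten { color: { bg: { primary: "#fff" } } } into
// [["--color-bg-primary", "#fff"], ...] — the CSS custom-property output.
function flattenTokens(tokens, prefix = "-") {
  return Object.entries(tokens).flatMap(([key, value]) =>
    typeof value === "object"
      ? flattenTokens(value, `${prefix}-${key}`)
      : [[`${prefix}-${key}`, String(value)]]
  );
}

const tokens = {
  color: { bg: { primary: "#ffffff" }, text: { primary: "#111111" } },
  radius: { md: "8px" },
  space: { 4: "16px" },
};

const css = flattenTokens(tokens)
  .map(([name, value]) => `  ${name}: ${value};`)
  .join("\n");
console.log(`:root {\n${css}\n}`);
```

The iOS and Android emitters are the same walk with different string templates, which is essentially what Style Dictionary's format plugins do.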

Motion deserves its own engineering treatment. A 200 ms ease-out transition tells the user "this thing came from that thing" — a property panel sliding in from the right hints at the spatial relationship between the trigger and the result. A 600 ms transition feels slow; a 100 ms one feels abrupt. Spring physics (overshoot, settle) produces motion that matches how physical objects move and signals tactility. Motion that does not communicate — a sparkle effect after every click, a fade-in on every page load — is noise that costs frame budget. Every animation should answer "what does this teach the user about the system?"

Perception research provides the constants. Fitts's law — formalised by Paul Fitts in 1954 — says the time to point at a target grows with distance and shrinks with target size: T = a + b * log₂(D / W + 1). The practical version: bigger targets are faster to hit, and targets at screen edges (corners, the edges of a phone) are effectively infinite in size because the cursor or finger cannot overshoot. Apple's minimum tap target is 44×44 pt; Google's is 48×48 dp. The 200 ms attention threshold is the rough window during which a user's gaze remains on an interaction's result before moving on; feedback that arrives after 200 ms feels disconnected from the action. Hick's law says decision time grows logarithmically with the number of options — a menu of 50 items is not 5 times slower to choose from than one of 10; by the log₂(n + 1) model it is closer to 1.6 times slower. The real cost of a long menu is scanning and recall, not the decision itself.
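
Both laws are one-line formulas, and plugging numbers in shows how slowly the logarithm grows. The intercept a and slope b are device-specific; the values here are illustrative constants, not measured ones:

```javascript
// Fitts's law: time to acquire a target of width W at distance D.
const fitts = (D, W, a = 50, b = 150) => a + b * Math.log2(D / W + 1);

// Hick's law: decision time among n equally likely options.
const hick = (n, b = 150) => b * Math.log2(n + 1);

// Doubling target size at the same distance cuts acquisition time:
console.log(fitts(400, 20).toFixed(0)); // "709" — 20 px target, 400 px away
console.log(fitts(400, 40).toFixed(0)); // "569" — same distance, twice the size

// A 50-item menu vs a 10-item menu: 5x the options, nowhere near 5x the time.
console.log((hick(50) / hick(10)).toFixed(2)); // "1.64"
```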

Typography on screens is a precise engineering surface. Type ramp sets a small number of sizes (12 / 14 / 16 / 20 / 24 / 32 / 48) and forbids the rest — five sizes carry every screen instead of fifty ad-hoc ones. Line height rises with font size for body text (1.5×) and falls for display text (1.1×). Measure — characters per line — stays in the 45–75 range for sustained reading. Variable fonts (woff2 files with wght, wdth, opsz axes) collapse what used to be six weight files into one, cutting font payload by 60–80% while letting design vary weight continuously.

The honest limit on design as engineering: most teams either underinvest (the designer makes pictures, the engineer makes ad-hoc decisions on every spec gap) or overinvest (a design system big enough to need its own engineering team consumes more capacity than it returns). The sweet spot is a small system — tokens, 15–25 components, a handful of patterns — owned by a designer-engineer pair and used by every product team. At that size, each product built on the system pays back its share of the investment in weeks; past it, payback stretches to months and the system becomes a product of its own.

Standards

The web platform is the most heavily specified surface in software. Most of the standards below are either WHATWG living specs (continuously updated), W3C Recommendations (snapshot standards), or vendor docs that have become de facto canon.

Web platform specs:

  • WHATWG HTML — html.spec.whatwg.org. The living spec for HTML, the DOM, parsing, the event loop, and most of what a browser implements at the document level.
  • WHATWG DOM — dom.spec.whatwg.org. The tree model, mutation observers, custom elements, shadow DOM.
  • ECMAScript — tc39.es/ecma262, plus the TC39 proposals tracker. The JavaScript language spec; proposals advance through stages 0–4 before becoming part of the annual edition.
  • CSS Working Group specs — drafts.csswg.org, with the canonical CSS Snapshot. Cascade, selectors, flexbox, grid, container queries, colour, and every module published or in draft.
  • URL Standard — url.spec.whatwg.org. The parsing and serialisation rules every fetch and link obeys.
  • Fetch — fetch.spec.whatwg.org. Defines Request, Response, CORS, and the network fetching semantics behind fetch() and Service Workers.
  • Streams — streams.spec.whatwg.org. ReadableStream, WritableStream, TransformStream; the substrate behind streaming responses and pipe chains.
  • Service Worker — w3.org/TR/service-workers. The installable network proxy that powers offline and PWAs.
  • IndexedDB — w3.org/TR/IndexedDB. The async structured client database.
  • WebSockets — RFC 6455, plus websockets.spec.whatwg.org. The full-duplex protocol layered on an HTTP/1.1 upgrade.
  • WebRTC — w3.org/TR/webrtc. Peer connection, data channels, getUserMedia integration.
  • WebGPU — gpuweb.github.io/gpuweb. Modern explicit-graphics API for the web; the shading language is WGSL.
  • Web Components — w3c.github.io/webcomponents. Custom elements, shadow DOM, HTML templates; the browser-native component model.
  • WebAssembly — webassembly.github.io/spec. The portable bytecode that runs alongside JS in every major browser.

Accessibility:

  • WCAG 2.2 — w3.org/TR/WCAG22. The current Web Content Accessibility Guidelines; AA is the practical legal floor in most jurisdictions, AAA is aspirational.
  • WAI-ARIA 1.3 — w3.org/TR/wai-aria-1.3. Accessible Rich Internet Applications: roles, states, properties.
  • ARIA Authoring Practices Guide (APG) — w3.org/WAI/ARIA/apg. Reference patterns for combobox, dialog, listbox, tabs, treegrid — the worked examples every component library cribs from.
  • ATAG 2.0 — w3.org/TR/ATAG20. Authoring Tool Accessibility Guidelines; for tools that generate web content.
  • Accessible Name and Description Computation (AccName) — w3.org/TR/accname. How browsers compute the string a screen reader announces.

Performance:

  • Core Web Vitals — web.dev/vitals. LCP, INP, CLS definitions, thresholds, and field-measurement methodology.
  • web-vitals JS library — github.com/GoogleChrome/web-vitals. Reference implementation for measuring Core Web Vitals in production.
  • Performance Timeline — w3.org/TR/performance-timeline. The PerformanceObserver API and entry types (navigation, resource, paint, largest-contentful-paint, event).
  • HTTP Archive Web Almanac — almanac.httparchive.org. Annual report on the state of the web platform from real-site crawls.
  • caniuse.com — the reference for browser-feature support tables; check before relying on any spec less than three years old.

JavaScript engines:

  • V8 — v8.dev. Chrome, Edge, Node, and Deno run V8. Deep blog posts on TurboFan, Sparkplug, Maglev, Liftoff.
  • JavaScriptCore (JSC) — webkit.org/blog and the JSC wiki. Safari, Bun, and Tauri on macOS.
  • SpiderMonkey — firefox-source-docs.mozilla.org/js. Firefox; Warp is the current optimising tier.

Frameworks and libraries:

  • React — react.dev. The reference for React 19+, including Server Components and the React Compiler.
  • Vue — vuejs.org/guide. Vue 3 docs cover the Composition API, reactivity (ref, reactive), and SFCs.
  • Svelte — svelte.dev/docs. Svelte 5 documentation with the runes ($state, $derived, $effect) reactivity model.
  • Solid — docs.solidjs.com. Fine-grained signal-based reactivity; no virtual DOM.
  • Angular — angular.dev. The full-framework stack; recently added signals alongside the older zone-based change detection.
  • Astro — docs.astro.build. Islands architecture for content-led sites.
  • Qwik — qwik.dev. Resumable framework that serialises the framework state into HTML so the client never re-bootstraps.
  • Next.js — nextjs.org/docs. The React full-stack framework that popularised the App Router and Server Components.
  • SvelteKit / Nuxt / Remix — official docs at kit.svelte.dev, nuxt.com/docs, remix.run/docs.

Mobile platforms:

  • Apple Developer — developer.apple.com. SwiftUI, UIKit, and the Human Interface Guidelines.
  • Android Developers — developer.android.com. Kotlin, Jetpack Compose, and Material Design guidance.
  • React Native — reactnative.dev. The cross-platform docs, including the new architecture and Hermes.
  • Flutter — docs.flutter.dev. Dart, the widget catalogue, and the Impeller renderer.

Design references:

  • Refactoring UI — Adam Wathan and Steve Schoger, 2018, refactoringui.com. Practical visual-design heuristics for engineers.
  • The Design of Everyday Things — Don Norman, MIT Press, revised edition 2013. The foundational text on affordances, signifiers, and feedback in interface design.
  • Fitts (1954) — Paul M. Fitts, "The information capacity of the human motor system in controlling the amplitude of movement," Journal of Experimental Psychology 47(6). The original derivation of the relationship between target size, distance, and pointing time.
  • Designing Interfaces — Jenifer Tidwell, Charles Brewer, Aynne Valencia, O'Reilly, 4th ed. 2020. The pattern-language reference for UI interactions.
  • Inclusive Design Principles — inclusivedesignprinciples.org. The Paciello Group's seven principles for designing for the full range of users.

Cross-act references:

  • Image, audio, and font encoding — every JPEG, AVIF, WebP, WOFF2, and Unicode glyph the browser renders is a byte pattern decided in Act I. Performance work at the interface layer often becomes encoding work upstream.
  • Browser tabs are processes; tabs use threads; the OS schedules them. The reality underneath the main-thread metaphor lives in Act IV — virtual memory, file descriptors, the scheduler that decides when your tab gets CPU.
  • HTTP/2, HTTP/3, TLS, DNS — every request that produces a pixel travels through the protocol stack documented in Act Va. Latency at the interface is mostly network latency in disguise.
  • Caching, CDNs, observability, and the back-end performance story — the systems that sit behind every API call from the browser — are Act Vc.
  • The team practice that ships the interface and keeps it improving — version control, code review, testing, CI/CD, decisions on paper — is Act IXb. Once the interface is live, keeping it alive under load — capacity, profiling, on-call, incident response, SLOs — is Act IXc.

Going deeper

Branches that earn their own article.

  • Browser engine internals (Blink, WebKit, Gecko).
  • JavaScript engine deep dives (V8 hidden classes, inline caches, TurboFan).
  • CSS engine internals (style invalidation, layout algorithms).
  • Framework internals (React fiber, Vue reactivity, Svelte compiler, Solid signals).
  • Mobile platform deep dives (UIKit/SwiftUI, Jetpack Compose).
  • WebGPU and graphics on the web.
  • WebAssembly in production.
  • Accessibility testing tooling and audits.
  • Design tokens and design-system architecture.
  • Typography for screens.