Performance & atlas
flowgl is built for graphs that overflow SVG-based libraries. The benchmarks live in PERFORMANCE.md; this page is the working guide for hitting them in your own app.
Headline numbers
| Graph size | Target | Floor | Current (SwiftShader headless) |
|---|---|---|---|
| 1,000 nodes | 60 fps | 60 fps | 120 fps |
| 5,000 nodes | 60 fps | 30 fps | 113.6 fps |
| 10,000 nodes | 30 fps | 30 fps | 114.1 fps |
SwiftShader is the floor of what's available — real GPUs do substantially better. We publish the SwiftShader numbers because that's what a CI runner without a discrete GPU produces, and we'd rather under-promise.
Eight critical optimizations
The current numbers come from the following stack, in roughly decreasing-impact order:
- Instanced rendering for nodes: one draw call, an `InstancedArray` for per-node attributes. Adding a node is a `bufferSubData` call, not a new draw.
- Per-frame VBO packing for edges: beziers are tessellated on the CPU and the triangle strip is uploaded as one packed VBO. Edge count drives memory, not draw calls.
- Frustum culling: nodes and edges outside the viewport are dropped before upload. Zoomed-in graphs pay the cost of what's visible, not what exists.
- Text atlas with shelf packing: every unique (label, font, color) is cached in a 2048×2048 atlas and looked up in O(1) per frame.
- Pass-1 atlas pre-warm + generation re-check: `text-program` writes all entries to the atlas before computing any quads. Eviction is re-checked at the end of pre-warm; if it fired, every quad rebuilds so no UV is stale (the 0.4.1 fix).
- Per-entry OffscreenCanvas + drawImage for atlas writes: every glyph is rasterized in a fresh isolated canvas and `drawImage`'d into the live atlas. This isolates the write from any cumulative ctx state (the 0.2.6 structural fix for the CJK glyph drop).
- Render-time caches: `cachedTextNodes`, `cachedHasAnimated`, and `nodeEdgeIndex` eliminate per-frame `filter`/`some` calls.
- 2-pass text render: pre-warm fills the atlas, then a second pass generates vertices. This prevents mid-frame atlas eviction from showing a half-baked frame.
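To make the frustum-culling step concrete, here is a minimal sketch of a 2D viewport test. The `Rect` and `Viewport` shapes, the `cullNodes` helper, and the pan/zoom convention (screen = world × zoom + pan) are all assumptions made for this example; flowgl's internal types and culling code may look different.

```typescript
// Hypothetical shapes for illustration; not flowgl's internal types.
interface Rect { x: number; y: number; width: number; height: number; }

interface Viewport {
  panX: number;     // screen-space offset of the world origin
  panY: number;
  zoom: number;     // screen pixels per world unit
  widthPx: number;  // canvas size in pixels
  heightPx: number;
}

// Convert the canvas rectangle into world (graph) coordinates,
// assuming screen = world * zoom + pan.
function visibleWorldRect(v: Viewport): Rect {
  return {
    x: -v.panX / v.zoom,
    y: -v.panY / v.zoom,
    width: v.widthPx / v.zoom,
    height: v.heightPx / v.zoom,
  };
}

// Axis-aligned rectangle overlap test.
function intersects(a: Rect, b: Rect): boolean {
  return a.x < b.x + b.width && a.x + a.width > b.x &&
         a.y < b.y + b.height && a.y + a.height > b.y;
}

// Keep only the nodes whose bounds overlap the visible world rectangle.
function cullNodes<T extends Rect>(nodes: T[], v: Viewport): T[] {
  const view = visibleWorldRect(v);
  const visible: T[] = [];
  for (const node of nodes) {
    if (intersects(node, view)) visible.push(node);
  }
  return visible;
}
```

With a test of this kind, the per-frame upload scales with what is on screen: zooming in shrinks the visible world rectangle, so fewer nodes survive the filter.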
Performance Tenet (T6) — what we promise
From PRODUCT.md:
A PR that drops any tier below its floor is a regression, regardless of feature value.
There is no "we'll optimize it later" — a PR that ships at 25 fps for 10k nodes does not merge until it's back above the floor or has an explicit Tenet-exception entry approved by the maintainer and recorded in HISTORY.md.
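The floor rule can be expressed as a small check over benchmark results. The sketch below copies the floors from the headline table; the `Measurement` shape and the `floorViolations` helper are hypothetical names for this example, not the project's actual release gate.

```typescript
// Floor fps per tier, copied from the headline table.
const FLOORS_FPS = new Map<number, number>([
  [1000, 60],
  [5000, 30],
  [10000, 30],
]);

// Hypothetical shape of one benchmark result.
interface Measurement { nodes: number; fps: number; }

// Return every measurement that falls below the floor for its tier.
// Tiers without a published floor are ignored.
function floorViolations(results: Measurement[]): Measurement[] {
  return results.filter((r) => {
    const floor = FLOORS_FPS.get(r.nodes);
    return floor !== undefined && r.fps < floor;
  });
}
```

A run that reports 25 fps at 10,000 nodes would come back as a violation, which is the case the tenet blocks from merging.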
What you can do to keep fps up
- Don't drive thousands of nodes through React reconciliation every frame. The framework wrappers diff their props at the chart's level, not through the framework's render tree. Incremental mutations via `chart.addNode()` / `chart.updateNode()` / `chart.removeNode()` are cheaper than rewriting an entire `nodes` array.
- Set `animated: true` only on edges that actually need to animate. Animated edges fall outside the cache and re-tessellate each frame.
- Keep `historyLimit` reasonable (default 100). Each undo snapshot is a full graph copy; at 50k nodes that costs a lot of memory.
- Use `batchUpdate(fn)` for multi-mutation operations. Layout, import, import-merge: wrap them in `batchUpdate` so they create one history entry and one render flush instead of many.
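The effect of batching can be illustrated with a stand-in chart. `SketchChart` below is a hypothetical stub written for this example, with counters in place of a real history stack and renderer. It only assumes that `batchUpdate` takes a callback and that each unbatched mutation commits on its own, as described above; the real flowgl signatures may differ.

```typescript
// Hypothetical node shape and chart stub; not flowgl's actual classes.
interface NodeData { id: string; x: number; y: number; }

class SketchChart {
  private nodes = new Map<string, NodeData>();
  private batching = false;
  historyEntries = 0; // stands in for undo snapshots
  renderFlushes = 0;  // stands in for GPU uploads + redraws

  addNode(node: NodeData): void {
    this.nodes.set(node.id, node);
    this.commit();
  }

  updateNode(id: string, patch: Partial<NodeData>): void {
    const current = this.nodes.get(id);
    if (!current) return;
    this.nodes.set(id, { ...current, ...patch });
    this.commit();
  }

  // Run fn with per-mutation commits suppressed, then commit once.
  batchUpdate(fn: () => void): void {
    if (this.batching) { fn(); return; } // a nested batch joins the outer one
    this.batching = true;
    try {
      fn();
    } finally {
      this.batching = false;
    }
    this.commit();
  }

  private commit(): void {
    if (this.batching) return;
    this.historyEntries += 1;
    this.renderFlushes += 1;
  }
}

const chart = new SketchChart();

// Unbatched: every mutation produces its own history entry and flush.
for (let i = 0; i < 100; i++) chart.addNode({ id: `a${i}`, x: i, y: 0 });
const unbatched = chart.historyEntries;

// Batched: the same number of mutations collapses into one commit.
chart.batchUpdate(() => {
  for (let i = 0; i < 100; i++) chart.addNode({ id: `b${i}`, x: i, y: 1 });
});
const batched = chart.historyEntries - unbatched;
```

In this stub the unbatched loop records 100 history entries and the batched loop records one. With full-graph undo snapshots, that difference is also what keeps `historyLimit` from filling up during a single import.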
Atlas eviction (the thing that bit 0.4.0)
The atlas is 2048×2048 and holds however many entries fit. When you add a new unique label that doesn't fit, the atlas evicts every entry and starts over. Eviction is correct (the rendered output stays right) but costs a full text re-pack.
A reasonable rule: if your visible graph has more than a few hundred distinct labels with distinct (font, color, line-height) combinations, profile to confirm eviction isn't the bottleneck. The DevTools Performance tab will show repeated getImageData + drawImage calls in the same frame.
When in doubt, the atlas-cjk-diag.mjs script overflows the atlas on purpose and verifies every label still maps correctly. It's the same script the release gate runs.
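For readers who want a mental model of the evict-everything behavior and the generation re-check, here is a minimal sketch. `ShelfAtlas`, `layoutLabels`, and the generation counter are hypothetical names; the sketch tracks only slot rectangles (no pixels are drawn) and is not a copy of flowgl's `text-program`.

```typescript
// Hypothetical slot and label shapes for illustration only.
interface Slot { x: number; y: number; w: number; h: number; }
interface Label { key: string; w: number; h: number; }

class ShelfAtlas {
  readonly size: number;
  generation = 0;          // bumped on every evict-all
  private slots = new Map<string, Slot>();
  private shelfY = 0;      // top edge of the current shelf
  private shelfH = 0;      // tallest entry on the current shelf
  private cursorX = 0;     // next free x on the current shelf

  constructor(size = 2048) {
    this.size = size;
  }

  // Return the slot for key, allocating it if needed. If the atlas is
  // full, evict every entry and start over.
  ensure(key: string, w: number, h: number): Slot {
    const hit = this.slots.get(key);
    if (hit) return hit;
    if (w > this.size || h > this.size) {
      throw new Error(`entry larger than atlas: ${key}`);
    }
    let slot = this.place(w, h);
    if (!slot) {
      this.evictAll();
      slot = this.place(w, h)!; // always fits in an empty atlas
    }
    this.slots.set(key, slot);
    return slot;
  }

  private place(w: number, h: number): Slot | undefined {
    if (this.cursorX + w > this.size) { // row is full: open a new shelf below
      this.shelfY += this.shelfH;
      this.shelfH = 0;
      this.cursorX = 0;
    }
    if (this.shelfY + h > this.size) return undefined; // atlas is full
    const slot = { x: this.cursorX, y: this.shelfY, w, h };
    this.cursorX += w;
    this.shelfH = Math.max(this.shelfH, h);
    return slot;
  }

  private evictAll(): void {
    this.slots.clear();
    this.shelfY = this.shelfH = this.cursorX = 0;
    this.generation += 1;
  }
}

// Pass 1 pre-warms the atlas; pass 2 reads slots. If the generation changed
// during pre-warm, slots handed out earlier are stale, so pre-warm once more
// before reading anything.
function layoutLabels(atlas: ShelfAtlas, labels: Label[]): Slot[] {
  for (let attempt = 0; attempt < 2; attempt++) {
    const gen = atlas.generation;
    for (const l of labels) atlas.ensure(l.key, l.w, l.h); // pass 1
    if (atlas.generation === gen) {
      // No eviction fired, so every lookup below is a cache hit.
      return labels.map((l) => atlas.ensure(l.key, l.w, l.h)); // pass 2
    }
  }
  throw new Error("one frame's labels do not fit in the atlas");
}
```

The design point is that slots are only read after a pre-warm pass that left the generation unchanged, so no quad can reference a region that was reassigned mid-frame.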
Reporting a performance regression
Open a bug. Include:
- `chart.getNodes().length` / `getEdges().length`
- Browser + GPU (in Chrome, paste the `chrome://gpu` summary)
- Frame timeline from DevTools Performance (a screenshot is fine)
- Whether the same scene is fine in an older version (if so, which)
pnpm dev + the bundled demo/benchmark.html is a deterministic starting point — we can usually narrow a regression by running it on the same revision.