There is a class of feature that used to be impossible to ship for free: anything that needed a language model. You wired up an API key, you ate the per-token bill, and every prompt your users typed went off to someone else's server. For a small public tool, that math usually killed the idea before it started.
That changed. Recent versions of Chrome ship a language model, Gemini Nano, and expose it to any web page through the Prompt API. The model runs on the user's own machine. No API key. No inference bill. No data leaving the browser.
We put this into a real, live tool: a free Mermaid diagram editor where you describe a diagram in plain English and the browser writes the Mermaid code for you. This post is the developer's version of that story: how the API actually works, the code that makes a small on-device model trustworthy, and an honest accounting of what you gain and what you give up.
What "AI in the browser" means in 2026
The important word is built-in. This is not WebGPU plus a 4 GB model you download and run yourself. The model ships with Chrome, and you talk to it through a small standards-track JavaScript API.
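To make "small" concrete, here is a minimal sketch of a single call, assuming the `LanguageModel` global that Chrome exposes for the Prompt API. The function name `describeToMermaid`, the prompt wording, and the shape of the return value are hypothetical and chosen for illustration; this is not the editor's actual code. The API surface has shifted between Chrome releases, so treat the method names as something to verify against the current documentation.

```javascript
// Minimal sketch of one round trip through the Prompt API.
// `describeToMermaid` and the prompt text are hypothetical, for illustration only.
async function describeToMermaid(description, api = globalThis.LanguageModel) {
  // Feature-detect: the global is missing in browsers without the Prompt API.
  if (!api) return { ok: false, reason: "unsupported" };

  // availability() resolves to "unavailable", "downloadable",
  // "downloading", or "available".
  const status = await api.availability();
  if (status === "unavailable") return { ok: false, reason: status };

  // create() may start a model download if the model is not on disk yet.
  const session = await api.create();
  try {
    const text = await session.prompt(
      `Write Mermaid code for this diagram. Reply with code only.\n\n${description}`
    );
    return { ok: true, text };
  } finally {
    // Release the session's resources whether or not the prompt succeeded.
    session.destroy();
  }
}
```

In a page you would call `describeToMermaid("a login flow with three steps")` and let it pick up the global. Taking the API object as a parameter is only a design choice for testability: it lets you hand in a stub when no model is present.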