TL;DRAI

Chrome 148 ships Gemini Nano model on-device via Prompt API, eliminating per-token inference costs. Enables free AI features for small tools without backend infrastructure; on-device execution reduces compliance risks and shifts AI economics from per-inference billing to zero-cost innovation.

There is a class of feature that used to be impossible to ship for free: anything that needed a language model. You wired up an API key, you ate the per-token bill, and every prompt your users typed went off to someone else's server. For a small public tool, that math usually killed the idea before it started.

That changed. Recent versions of Chrome ship a language model, Gemini Nano, and expose it to any web page through the Prompt API. The model runs on the user's own machine. No API key. No inference bill. No data leaving the browser.

We put this into a real, live tool, a free Mermaid diagram editor where you describe a diagram in plain English and the browser writes the Mermaid code for you. This post is the developer's version of that story: how the API actually works, the code that makes a small on-device model trustworthy, and an honest accounting of what you gain and what you give up.

What "AI in the browser" means in 2026

The important word is built-in. This is not WebGPU plus a 4 GB model you download and run yourself. The model ships with Chrome, and you talk to it through a small standard-track JavaScript API.

dev.to

Supercharge your web app with free AI that runs in your users' browser

There is a class of feature that used to be impossible to ship for free: anything that needed a...

sabato 20 giugno 2026 New tab

TL;DRAI

1,971 words~9 min read

What "AI in the browser" means in 2026

The important word is built-in. This is not WebGPU plus a 4 GB model you download and run yourself. The model ships with Chrome, and you talk to it through a small standard-track JavaScript API.

Supercharge your web app with free AI that runs in your users' browser

Supercharge your web app with free AI that runs in your users' browser

Related reading

Chrome Put a 4GB AI Model on Your Computer: What Gemini Nano Means for Privacy

I Ran AI Models Directly in the Browser and Measured What It Did to Core Web…

Google brings more Gemini AI features to Chrome browser

# Zero-Cost AI Agent Stack: Cloudflare Workers + Gemini Web = 24/7 Free AI

I Built a Local AI Gateway That Talks to Claude, ChatGPT, DeepSeek and Gemini —…

I Ran a 2-Billion Parameter AI Model in a Browser Tab. No Server.

Related reading

Chrome Put a 4GB AI Model on Your Computer: What Gemini Nano Means for Privacy

I Ran AI Models Directly in the Browser and Measured What It Did to Core Web…

Google brings more Gemini AI features to Chrome browser

# Zero-Cost AI Agent Stack: Cloudflare Workers + Gemini Web = 24/7 Free AI

I Built a Local AI Gateway That Talks to Claude, ChatGPT, DeepSeek and Gemini —…

I Ran a 2-Billion Parameter AI Model in a Browser Tab. No Server.