The headline from Apple’s developer conference was a reborn Siri. The more interesting story sits underneath it: the AI models Apple built to run the thing, one of which is far too big to fit in an iPhone’s memory, yet runs on the device anyway.

In a technical post published alongside WWDC, Apple detailed the third generation of its Apple Foundation Models, a family of five models it describes as “custom-built in collaboration with Google.”

Two run on-device: AFM 3 Core, a 3-billion-parameter model for everyday tasks, and AFM 3 Core Advanced, its most powerful on-device model. Three more run in the cloud: AFM 3 Cloud, a server workhorse; ADM 3 Cloud, an image model behind Image Playground and Genmoji; and AFM 3 Cloud Pro, the heavyweight built for agentic tool use and complex reasoning.

The clever engineering is in Core Advanced.

It is a 20-billion-parameter, natively multimodal model, the kind of size that normally lives in a data centre, not a phone. Apple’s trick is to keep the entire model in flash storage rather than the much smaller pool of working memory. Using a technique its researchers call Instruction-Following Pruning, the model makes routing decisions once per prompt, loading only a small set of “expert” parameters into memory, between 1 and 4 billion at a time, while keeping a core of shared experts always on.
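To make the idea concrete, here is a minimal Python sketch of per-prompt expert selection. It is an illustration under stated assumptions, not Apple's implementation: the expert names, sizes, relevance scores, and the `select_experts` and `ExpertCache` helpers are all hypothetical, and a real router would be a learned network rather than a hand-written score table. The only details taken from Apple's description are that routing happens once per prompt, that shared experts stay resident, and that the routed experts fit within a budget of up to 4 billion parameters.

```python
def select_experts(scores, sizes_b, budget_b=4.0):
    """Greedily pick the highest-scoring experts that fit the memory budget.

    scores   -- hypothetical relevance of each expert to the prompt
    sizes_b  -- expert sizes in billions of parameters (made-up values)
    budget_b -- cap on routed parameters held in memory at once
    """
    chosen, used = [], 0.0
    for name in sorted(scores, key=scores.get, reverse=True):
        if used + sizes_b[name] <= budget_b:
            chosen.append(name)
            used += sizes_b[name]
    return chosen


class ExpertCache:
    """Tracks which experts are in working memory; everything else stays on flash."""

    def __init__(self, shared, sizes_b):
        self.shared = set(shared)   # always resident, never evicted
        self.sizes_b = sizes_b      # routed experts only
        self.routed = set()         # swapped once per prompt

    def start_prompt(self, scores, budget_b=4.0):
        # Routing runs once here, not once per generated token.
        self.routed = set(select_experts(scores, self.sizes_b, budget_b))
        return self.shared | self.routed

    def resident_params_b(self):
        return sum(self.sizes_b[name] for name in self.routed)


# Assumed expert pool: names and sizes are placeholders.
sizes = {"code": 1.5, "vision": 2.0, "math": 1.0, "chat": 1.0}
cache = ExpertCache(shared=["core"], sizes_b=sizes)

# Assumed router output for a coding-flavoured prompt.
prompt_scores = {"code": 0.9, "vision": 0.1, "math": 0.7, "chat": 0.5}
resident = cache.start_prompt(prompt_scores)
print(sorted(resident), cache.resident_params_b())
```

In this toy run the low-scoring vision expert stays on flash because loading it would exceed the budget. The design trade-off the sketch highlights is that routing once per prompt means one batch of flash reads up front, rather than paying a storage round trip for every token.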