During the WWDC26 keynote, Apple announced its third generation of Apple Foundation Models (AFM), comprising five models: some run locally, some run in the cloud, and one lives on Google’s servers, running on Nvidia chips. Here’s a breakdown of how that will work.

A bit of background

When Apple first announced its foundation models in 2024, the lineup included an on-device language model with roughly 3 billion parameters, and “a larger server-based language model available with Private Cloud Compute and running on Apple silicon servers,” as the company put it at the time.

Private Cloud Compute was an ambitious undertaking, as it aimed to deliver cloud-based AI capabilities while preserving the same privacy guarantees users expect from on-device processing.

For this reason, keeping everything in-house was essential. Private Cloud Compute ran in Apple data centers, on servers powered by Apple silicon. Even so, its privacy guarantees could be independently verified by third-party security researchers.