After GTC 2026, one thing is basically settled: the hardware layer for on-device AI is no longer the bottleneck.

NVIDIA's RTX Spark packs Blackwell GPU + Grace CPU + 128GB unified memory into a desktop form factor. Apple's M-series chips with unified memory architecture and efficiency-first design let 4B and even 7B parameter models run smoothly on a MacBook. Two different approaches, same destination: consumer hardware now has the compute foundation for running on-device AI agents.

Chip vendors have done their part. The next question is: how many layers are still missing between "chip can run an AI model" and "an on-device agent can actually complete useful tasks"?

This post maps out the full technology stack for on-device AI agents, examining each layer's maturity, identifying gaps, and tracking what the open-source community has built so far.

Layer 1: Silicon (Ready)