Nvidia just dropped a chip that wants to make cloud-dependent creative workflows feel like dial-up internet. The RTX Spark, announced at GTC Taipei during COMPUTEX on May 31, 2026, is an Arm-based superchip that fuses Nvidia’s Grace CPU with a Blackwell RTX GPU, delivering up to 1 petaflop of AI compute in a package small enough to fit inside a 14mm-thick laptop.

The pitch is straightforward: take the kind of agentic AI that currently lives in data centers, and run it locally on your desk. That means connecting design tools like Rhino and Blender through AI agents that can turn rough architectural sketches into photorealistic renders, all without pinging a server farm in Virginia.

What’s under the hood

The headline number is 128 GB of unified memory, a single pool shared by the CPU and GPU. That’s enough to support local inference for models with up to 120 billion parameters and large context windows.
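
To see how a 120-billion-parameter model could fit in 128 GB, it helps to run the arithmetic. The sketch below is a back-of-the-envelope estimate, not anything Nvidia has published: the list of weight formats and the helper name weight_memory_gb are illustrative assumptions, and the estimate counts only the model weights, ignoring the KV cache, activations, and operating-system overhead.

```python
# Rough estimate of the memory needed just to hold model weights.
# Assumption: ignores KV cache, activations, and runtime/OS overhead.

UNIFIED_MEMORY_GB = 128   # figure quoted in the article
PARAMS = 120e9            # 120 billion parameters, from the article

# Bits per weight for some common formats (illustrative, not Nvidia-specified).
FORMATS = {"fp16": 16, "int8": 8, "4-bit": 4}


def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


for name, bits in FORMATS.items():
    needed = weight_memory_gb(PARAMS, bits)
    verdict = "fits" if needed < UNIFIED_MEMORY_GB else "does not fit"
    print(f"{name}: {needed:.0f} GB of weights, {verdict} in {UNIFIED_MEMORY_GB} GB")
```

Under these assumptions, 16-bit weights would need about 240 GB and would not fit, 8-bit weights would take about 120 GB and leave very little headroom, and 4-bit weights would take about 60 GB. That suggests the 120-billion-parameter figure relies on aggressively quantized weights, with the remaining memory going to context.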

Nvidia is also bringing its full CUDA/RTX software ecosystem to the chip. That matters because the enormous library of CUDA-accelerated tools that developers and creators already use should run natively, with no porting required.