Jun 1, 2026


Nvidia used GTC Taipei to launch a series of models for robots, autonomous vehicles, and video systems. The centerpieces are the new world model Cosmos 3, a significantly scaled-up driving model called Alpamayo 2 Super, and an open reference platform for humanoid robots.

Cosmos 3 is Nvidia's next version of its open "omnimodel," which processes text, images, video, ambient audio, and action data in a single system. Developers building robots, autonomous vehicles, and video surveillance systems can use it to generate synthetic training data, interpret scenes, and predict future world states without having to painstakingly recreate those situations in the real world.

Nvidia names three use cases. As a vision-language model, Cosmos 3 analyzes video, for example to detect traffic anomalies in smart cities, something partner Linker Vision is already doing.