NVIDIA has released Nemotron 3 Ultra, a 550B total (55B active) open Mixture-of-Experts hybrid Mamba-Transformer for long-running agents. It pairs a 1M-token context with up to ~6x higher inference throughput than comparable open LLMs at on-par accuracy, and ships with open weights, training data, and recipes under OpenMDW-1.1.

NVIDIA unveiled Nemotron 3 Ultra at Computex 2026: a 550B-parameter MoE model with 55B active parameters, a 1M-token context, 300+ tok/s throughput, and a score of 48 on the AA Intelligence Index. Complete developer deploy…

Enterprises can now use AibleClaw to run long-running AI agents, or claws, securely and cost-effectively with NVIDIA Nemotron 3 Ultra for frontier-class planning and use its…

Single-turn chatbots are evolving into long-running agents that can reason, maintain context, use tools, and run efficiently across many turns to complete complex workflows.…

Deploy NVIDIA Nemotron 3 Ultra on Amazon SageMaker JumpStart. Get 5x faster inference and 30% lower cost for agentic AI workloads with this frontier reasoning model.
