Nvidia just built a system that lets AI agents walk into a robot lab and work out on their own how to teach the robots. No human hand-holding, no step-by-step programming. Just coding agents, robotic arms, and what the researchers describe as a “generous token budget.”
The framework is called ENPIRE, short for Environment, Policy Improvement, Rollout, Evolution. It was developed by Nvidia’s GEAR (Generalist Embodied Agent Research) lab in collaboration with Carnegie Mellon University and UC Berkeley. The result: robots that learned to cut zip ties and insert GPUs into thin motherboard sockets with a 99% success rate.
How ENPIRE actually works
The framework has four distinct modules. The Environment module handles automated resets and verification, so a robot can fail at a task and immediately set itself up to try again without someone walking over to press a button. Policy Improvement is where the agent analyzes performance data and adjusts the robot’s behavioral code. Rollout runs physical evaluations on multiple robots in parallel. And Evolution is the agent-driven code refinement layer that ties the whole cycle together.
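To make the cycle concrete, here is a toy sketch of how the four modules could fit together. It is not Nvidia's code: every class and function name is hypothetical, the "robot" is a simulated number-matching task, and the improvement step is a simple numeric update standing in for the coding agent that rewrites behavioral code in the real system.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Policy:
    gain: float  # hypothetical stand-in for the robot's behavioral code


class Environment:
    """Simulated robot cell with automated reset and success verification."""

    def __init__(self, seed: int):
        self.rng = random.Random(seed)  # per-robot noise source
        self.target = 1.0

    def reset(self) -> None:
        # Automated reset: nobody has to walk over and press a button.
        self.target = 1.0

    def verify(self, result: float) -> bool:
        # Automated verification: did the attempt land close enough?
        return abs(result - self.target) < 0.15


def rollout(env: Environment, policy: Policy):
    """One attempt on one robot; returns (success, signed error)."""
    env.reset()
    result = policy.gain + env.rng.uniform(-0.05, 0.05)
    return env.verify(result), result - env.target


def parallel_rollouts(envs, policy):
    """Rollout module: evaluate the same policy on every robot at once."""
    with ThreadPoolExecutor(max_workers=len(envs)) as pool:
        return list(pool.map(lambda env: rollout(env, policy), envs))


def improve(policy: Policy, outcomes) -> Policy:
    """Policy Improvement module: use performance data to adjust the policy.

    Assumption: a fixed-step numeric correction replaces the agent's
    analysis and code edits.
    """
    mean_error = sum(err for _, err in outcomes) / len(outcomes)
    return Policy(gain=policy.gain - 0.5 * mean_error)


def evolve(policy: Policy, envs, rounds: int):
    """Evolution module: repeat rollout and improvement, tracking success rate."""
    history = []
    for _ in range(rounds):
        outcomes = parallel_rollouts(envs, policy)
        history.append(sum(ok for ok, _ in outcomes) / len(outcomes))
        policy = improve(policy, outcomes)
    return policy, history


if __name__ == "__main__":
    robots = [Environment(seed=i) for i in range(8)]
    final_policy, success_rates = evolve(Policy(gain=0.0), robots, rounds=10)
    print(final_policy, success_rates)
```

The point of the sketch is the shape of the loop: each robot resets and grades itself, all robots run the same candidate policy at once, and the pooled results feed the next revision.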
The research team deployed eight dual-arm robots running parallel policy rollouts. The researchers tracked efficiency using metrics they call Mean Robot Utilization (MRU) and Mean Token Utilization (MTU), essentially measuring how much useful work each robot and each AI token produced.
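The article does not give formulas for MRU or MTU, so the sketch below shows only one plausible reading: each metric as an average of per-unit "useful over total" ratios. The function name, the notion of "productive" tokens, and all of the numbers are assumptions made for illustration, not the researchers' definitions or data.

```python
def mean_utilization(useful, total):
    """Average the per-unit useful/total ratios, skipping units with zero total."""
    ratios = [u / t for u, t in zip(useful, total) if t > 0]
    return sum(ratios) / len(ratios) if ratios else 0.0


# Hypothetical robot data: seconds spent executing rollouts vs. wall-clock seconds.
busy_seconds = [540, 480, 600, 300]
wall_seconds = [600, 600, 600, 600]
mru = mean_utilization(busy_seconds, wall_seconds)

# Hypothetical agent data: tokens spent on iterations that improved the policy
# vs. all tokens consumed (an assumed notion of "useful" tokens).
productive_tokens = [30_000, 10_000]
total_tokens = [40_000, 40_000]
mtu = mean_utilization(productive_tokens, total_tokens)

print(f"MRU: {mru:.2f}, MTU: {mtu:.2f}")
```

Under this reading, a low MRU would point to robots sitting idle while the agent thinks, and a low MTU would point to tokens burned on iterations that did not help.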










