Developing autonomous vehicle (AV) policies requires bridging an important gap between training and deployment. Vision-language-action (VLA) models, which can reason over complex driving scenes and produce rich intermediate reasoning, are predominantly trained in open-loop: model outputs are compared directly to ground-truth behaviors, without accounting for the effect those outputs would have on the environment.

In deployment, however, a driving policy runs in closed-loop, where every braking, steering, and navigation decision affects the environment, and small errors can compound over time.
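To make the distinction concrete, here is a minimal toy sketch in Python. It is not AlpaSim or AlpaGym code. The expert, the policy, its 0.01 rad steering bias, and the one-dimensional vehicle model are all hypothetical and chosen only for illustration. In open-loop evaluation the policy is scored on the expert's states, so its error stays small and constant. In a closed-loop rollout the same policy drives the vehicle, and the lateral offset grows with every step.

```python
import math

def expert_action(state):
    # Hypothetical expert: drives straight, so the heading change is zero.
    return 0.0

def policy_action(state):
    # Hypothetical learned policy: matches the expert up to a small bias (radians).
    return expert_action(state) + 0.01

def step(state, action, speed=1.0):
    # Toy vehicle model: state is (lateral offset, heading).
    lateral, heading = state
    heading = heading + action
    lateral = lateral + speed * math.sin(heading)
    return (lateral, heading)

def open_loop_errors(num_steps):
    # Score the policy on states visited by the expert.
    # The policy's own actions never influence the next state.
    state = (0.0, 0.0)
    errors = []
    for _ in range(num_steps):
        errors.append(abs(policy_action(state) - expert_action(state)))
        state = step(state, expert_action(state))
    return errors

def closed_loop_offsets(num_steps):
    # Let the policy drive: each action changes the state it sees next.
    state = (0.0, 0.0)
    offsets = []
    for _ in range(num_steps):
        state = step(state, policy_action(state))
        offsets.append(abs(state[0]))
    return offsets

if __name__ == "__main__":
    print("max open-loop action error:", max(open_loop_errors(50)))
    print("final closed-loop lateral offset:", closed_loop_offsets(50)[-1])
```

The open-loop metric reports the same small per-step error throughout, while the closed-loop rollout shows that error accumulating into a large deviation from the lane center. This is the gap that closed-loop training is meant to address.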

NVIDIA Alpamayo provides a systematic way to address this challenge. It is an open portfolio of AI models, simulation frameworks, and physical AI datasets for AV development, and it includes the AlpaSim AV simulation platform and the AlpaGym closed-loop training framework (coming soon).

This post explains how to train AV models in closed-loop with NVIDIA Alpamayo. Specifically, it walks through how to:

Install and configure AlpaGym