Published on June 28, 2026
Introduction
Embodied navigation is not only a perception problem, and it is not only a control problem. A robot must understand what the user wants, observe a changing scene, follow a target or route, avoid obstacles, and continuously correct its next movement as new observations arrive.
VLX-Go is designed for the practical layer between perception and control. It is a lightweight vision-language waypoint planner: it takes recent monocular frames, the current observation, and a natural-language instruction as input, and predicts short-horizon local waypoints for a downstream controller or simulator.
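To make the input/output contract described above concrete, here is a minimal sketch in Python. It is not VLX-Go's actual API: `PlannerInput`, `Waypoint`, `straight_ahead_planner`, and `replan_loop` are hypothetical names, the waypoint convention (metres in the robot frame) is an assumption, and the planner is a stand-in that ignores its input. The loop also assumes a receding-horizon style of use, where only the first waypoint is executed before replanning on the next frame.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Placeholder image type; a real system would likely use an array or tensor.
Frame = List[List[int]]
# Assumed convention: (x forward, y left) in metres, in the robot frame.
Waypoint = Tuple[float, float]


@dataclass
class PlannerInput:
    """Hypothetical container for the three inputs named in the text."""
    history: Sequence[Frame]  # recent monocular frames, oldest first
    current: Frame            # the current observation
    instruction: str          # natural-language instruction


Planner = Callable[[PlannerInput], List[Waypoint]]


def straight_ahead_planner(inp: PlannerInput,
                           horizon: int = 5,
                           step_m: float = 0.2) -> List[Waypoint]:
    """Stand-in for the model: ignores its input and proposes evenly
    spaced waypoints straight ahead."""
    return [(step_m * (i + 1), 0.0) for i in range(horizon)]


def replan_loop(planner: Planner,
                observations: Sequence[Frame],
                instruction: str,
                history_len: int = 4) -> List[Waypoint]:
    """Call the planner on every new frame and keep only the first
    waypoint of each plan (assumed receding-horizon execution)."""
    history: List[Frame] = []
    executed: List[Waypoint] = []
    for frame in observations:
        plan = planner(PlannerInput(history=list(history),
                                    current=frame,
                                    instruction=instruction))
        executed.append(plan[0])  # would be handed to the controller
        # Slide the frame window forward, keeping the most recent frames.
        history = (history + [frame])[-history_len:]
    return executed


if __name__ == "__main__":
    frames = [[[0]] for _ in range(3)]  # three dummy 1x1 frames
    print(replan_loop(straight_ahead_planner, frames, "follow the person"))
```

Swapping `straight_ahead_planner` for a learned model would leave the loop unchanged, which is the point of keeping the planner behind a narrow interface: the controller or simulator only ever sees a short list of local waypoints.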







