Building autonomous robots requires robust, low-latency visual perception for depth estimation, obstacle recognition, localization, and navigation in dynamic environments. These capabilities demand heavy compute. NVIDIA Jetson platforms offer powerful GPUs for deep learning, but growing AI complexity and the need for real-time performance can lead to GPU oversubscription. Relying solely on the GPU for all perception tasks can cause bottlenecks, higher power consumption, and thermal challenges, especially in the power-sensitive and thermally constrained environments common in mobile robotics.

The NVIDIA Jetson platform addresses these challenges by combining powerful GPUs with dedicated hardware accelerators. Jetson devices like NVIDIA Jetson AGX Orin and NVIDIA Jetson Thor house specialized hardware accelerators designed to execute image-processing and computer-vision tasks with high efficiency. Offloading those tasks frees up the GPU for more demanding deep-learning workloads. The NVIDIA Vision Programming Interface (VPI) unlocks the full potential of these diverse hardware accelerators.

In this blog, we explore the benefits of using these accelerators and explain how developers can access them through VPI. As an example, we walk through the development of a low-latency, low-power perception application for stereo disparity that runs on these accelerators. We start with a single stereo camera pipeline, then move on to a multi-stream pipeline with eight stereo cameras running at 30 FPS on Thor T5000, about 10x faster than AGX Orin 64 GB.