Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation

Motivation

NVIDIA Cosmos Predict 2.5 is a large-scale world model capable of generating physically plausible videos conditioned on text, images, or video clips. As a general-purpose model, it still needs targeted fine-tuning to adapt to a specific domain, such as robot manipulation or a particular camera viewpoint.

Training robot policies requires demonstration data, but collecting real-robot trajectories is slow and expensive. Generating synthetic trajectories with a fine-tuned video world model offers a scalable alternative. However, full fine-tuning of a 2B-parameter model is expensive and risks catastrophic forgetting of general knowledge. LoRA and DoRA inject small trainable adapter modules into the frozen base model, reducing memory requirements while keeping the adapter files small and portable. This makes it practical to fine-tune on a single GPU and flexibly swap adapters for different domains at inference.
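To make the memory argument concrete, the sketch below counts the trainable parameters a low-rank adapter adds to a single linear layer. It is illustrative arithmetic only, not the model's actual configuration: the layer size (2048 x 2048) and the rank (16) are hypothetical values chosen for the example, and the DoRA count assumes one learnable magnitude entry per output feature, a convention that can differ between implementations.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Weights in a dense d_out x d_in linear layer (bias ignored)."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights LoRA adds: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def dora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """LoRA factors plus a magnitude vector.

    Assumption: the magnitude vector has one entry per output feature.
    """
    return lora_param_count(d_in, d_out, rank) + d_out


# Hypothetical layer size and rank, chosen only for illustration.
D_IN, D_OUT, RANK = 2048, 2048, 16

full = full_param_count(D_IN, D_OUT)
lora = lora_param_count(D_IN, D_OUT, RANK)
dora = dora_param_count(D_IN, D_OUT, RANK)

print(f"full layer: {full:,} weights")
print(f"LoRA adapter: {lora:,} trainable ({100 * lora / full:.2f}% of the layer)")
print(f"DoRA adapter: {dora:,} trainable ({100 * dora / full:.2f}% of the layer)")
```

Under these assumptions the adapter trains under 2% of the layer's weights, which is why gradients and optimizer state shrink so much and why the saved adapter file stays small.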

This guide walks through parameter-efficient fine-tuning of Cosmos Predict 2.5 with LoRA and DoRA, using the diffusers and accelerate libraries with support for both single- and multi-GPU training. We then show how to use the fine-tuned model to generate synthetic robot trajectories for downstream robot learning tasks.
