Teaching robot policies without new demonstrations: interview with Jiahui Zhang and Jesse Zhang - Robohub

The ReWiND method, which consists of three phases: learning a reward function, pre-training, and using the reward function and pre-trained policy to learn a new language-specified task online.

In their paper ReWiND: Language-Guided Rewards Teach Robot Policies without New Demonstrations, which was presented at CoRL 2025, Jiahui Zhang, Yusen Luo, Abrar Anwar, Sumedh A. Sontakke, Joseph J. Lim, Jesse Thomason, Erdem Bıyık and Jesse Zhang introduce a framework for learning robot manipulation tasks solely from language instructions without per-task demonstrations. We asked Jiahui Zhang and Jesse Zhang to tell us more.

What is the topic of the research in your paper, and what problem were you aiming to solve?

Our research addresses the problem of enabling robot manipulation policies to solve novel, language-conditioned tasks without collecting new demonstrations for each task. We begin with a small set of demonstrations in the deployment environment, train a language-conditioned reward model on them, and then use that learned reward function to fine-tune the policy on unseen tasks, with no additional demonstrations required.
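To make the three phases described above concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the 1-D "robot", the word-counting reward model, the per-word action values, and every name in it (`DEMOS`, `learn_reward`, `pretrain_policy`, `finetune_online`, `greedy_action`) are assumptions made purely for illustration. It only mirrors the control flow: learn a language-conditioned reward from a few demonstrations, pre-train a policy on them, then adapt online to an instruction that never appears in the demonstrations, using the learned reward alone.

```python
import random
from collections import defaultdict

ACTIONS = (-1, +1)  # step left or right on an unbounded 1-D line

# Hypothetical demonstrations: (instruction, visited positions).
DEMOS = [
    ("move left", [5, 4, 3, 2]),
    ("slide left", [6, 5, 4]),
    ("move right", [5, 6, 7, 8]),
    ("slide right", [3, 4, 5]),
]

def learn_reward(demos):
    """Phase 1 (toy): fit a language-conditioned reward from the demos."""
    word_score = defaultdict(float)
    for instruction, states in demos:
        direction = 1.0 if states[-1] > states[0] else -1.0
        for word in instruction.split():
            word_score[word] += direction  # direction-neutral words cancel out

    def reward(instruction, state, next_state):
        # Positive when the step moves in the direction the words imply.
        score = sum(word_score.get(w, 0.0) for w in instruction.split())
        sign = (score > 0) - (score < 0)
        return sign * (next_state - state)

    return reward

def pretrain_policy(demos, reward):
    """Phase 2 (toy): initialise per-word action values from reward-labelled demos."""
    values = defaultdict(float)  # (word, action) -> value
    for instruction, states in demos:
        for s, s_next in zip(states, states[1:]):
            for word in instruction.split():
                values[(word, s_next - s)] += reward(instruction, s, s_next)
    return values

def greedy_action(instruction, values):
    """Pick the action with the highest summed per-word value."""
    words = instruction.split()
    return max(ACTIONS, key=lambda a: sum(values[(w, a)] for w in words))

def finetune_online(instruction, values, reward, steps=50, eps=0.2, lr=0.5, seed=0):
    """Phase 3 (toy): epsilon-greedy updates driven only by the learned reward."""
    rng = random.Random(seed)
    state = 5
    for _ in range(steps):
        if rng.random() < eps:
            action = rng.choice(ACTIONS)
        else:
            action = greedy_action(instruction, values)
        next_state = state + action
        r = reward(instruction, state, next_state)  # no new demonstration used
        for word in instruction.split():
            values[(word, action)] += lr * (r - values[(word, action)])
        state = next_state
    return values

reward_fn = learn_reward(DEMOS)
policy_values = pretrain_policy(DEMOS, reward_fn)
finetune_online("go left", policy_values, reward_fn)  # "go left" is not in DEMOS
print(greedy_action("go left", policy_values))
```

In this sketch the new instruction reuses a word seen during reward learning, which is what lets the toy reward generalize; a real system would rely on a learned language and vision model rather than word counts.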

Tell us about ReWiND – what are the main features and contributions of this framework?
