Teaching robot policies without new demonstrations: interview with Jiahui Zhang and Jesse Zhang - Robohub
The ReWiND method consists of three phases: learning a reward function, pre-training a policy, and using the reward function and pre-trained policy to learn a new language-specified task online.
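To make the flow between the three phases concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the 1-D world, the `encode` stand-in for a language encoder, the direction-matching reward rule, and the table of one-step values used as a "policy" are all assumptions chosen so the example stays small and runnable. Only the three-phase structure comes from the caption above.

```python
# Toy 1-D world with states 0..4. Every name below is a hypothetical stand-in.
N_STATES = 5
ACTIONS = (-1, +1)

def encode(instruction):
    """Stand-in for a language encoder: 'reach 3' -> goal state 3."""
    return int(instruction.split()[-1])

def sign(x):
    return (x > 0) - (x < 0)

def step(state, action):
    """Move left or right, clipped to the ends of the line."""
    return max(0, min(N_STATES - 1, state + action))

def learn_reward(demos):
    """Phase 1: learn a language-conditioned reward from demonstrations.
    Toy rule (an assumption): +1 if the demos took this action when the goal
    lay in this direction, otherwise -1."""
    seen = set()
    for instruction, trajectory in demos:
        goal = encode(instruction)
        for state, action in trajectory:
            seen.add((sign(goal - state), action))

    def reward(instruction, state, action):
        key = (sign(encode(instruction) - state), action)
        return 1.0 if key in seen else -1.0
    return reward

def pretrain(demos, reward):
    """Phase 2: pre-train a policy offline on demo data labeled by the reward.
    The 'policy' is a table of one-step values keyed by (goal offset, action)."""
    q = {}
    for instruction, trajectory in demos:
        goal = encode(instruction)
        for state, action in trajectory:
            q[(goal - state, action)] = reward(instruction, state, action)
    return q

def act(q, offset):
    """Greedy action; entries never visited default to a value of 0."""
    return max(ACTIONS, key=lambda a: q.get((offset, a), 0.0))

def learn_online(q, reward, instruction, start, max_steps=50):
    """Phase 3: learn a new instruction online, scored only by the learned
    reward. Returns the number of steps taken to reach the goal, or None."""
    goal, state = encode(instruction), start
    for steps in range(max_steps):
        if state == goal:
            return steps
        action = act(q, goal - state)
        q[(goal - state, action)] = reward(instruction, state, action)
        state = step(state, action)
    return None

# Demonstrations exist only for "reach 2"; "reach 4" is the new task.
demos = [
    ("reach 2", [(0, +1), (1, +1)]),
    ("reach 2", [(4, -1), (3, -1)]),
]
reward = learn_reward(demos)
policy = pretrain(demos, reward)
first_try = learn_online(policy, reward, "reach 4", start=0)
second_try = learn_online(policy, reward, "reach 4", start=0)
print(first_try, second_try)
```

In this toy, the first attempt at "reach 4" takes 7 steps because the policy has to try actions at goal offsets it never saw in the demos, while the second attempt takes 4. No demonstration of "reach 4" is ever supplied: the learned reward is the only feedback during the online phase.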