VAGEN: Teaching Vision-Language Models to Build World Models Through Reinforcement Learning
Kangrui Wang, Pingyue Zhang, Zihan Wang, Yaning Gao, Linjie Li, Qineng Wang, Hanyang Chen, Chi Wan, Yiping Lu, Zhengyuan Yang, Lijuan Wang, Ranjay Krishna, Jiajun Wu, Li Fei-Fei, Yejin Choi, Manling Li
We introduce VAGEN, a reinforcement learning framework that trains vision-language model (VLM) agents to build internal world models through explicit visual state reasoning.
Fantastic Bugs and Where to Find Them in AI Benchmarks
Sang T. Truong, Yuheng Tu, Michael Hardy, Anka Reuel, Zeyu Tang, Jirayu Burapacheep, Jonathan Jude Perera, Chibuike Uwakwe, Benjamin W. Domingue, Nick Haber, Sanmi Koyejo