Teaching a robot to see is hard. Teaching it to talk is harder. Teaching it to feel things, and then react to what it feels in real time, while also seeing and understanding language? That’s the problem a team from UC Berkeley, Nvidia, Stanford, and collaborating institutions just took a serious swing at.
The framework is called T-Rex, short for Tactile-Reactive Dexterous Manipulation. It was submitted to arXiv on June 15 under paper ID 2606.17055, and it represents a meaningful leap in how robots handle physical contact during complex tasks.
What T-Rex actually does
Most modern robot brains, known as Vision-Language-Action (VLA) models, are good at processing what they see and understanding instructions. But the moment something unexpected happens during physical contact, like an object slipping or deforming, these systems tend to fall apart.
T-Rex tackles this by adding a third sensory channel: high-frequency tactile data. The robot can feel what’s happening at its fingertips and adjust its grip or motion many times per second, instead of reacting only to what it sees.
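To make the idea of a fast tactile loop running alongside a slower vision-and-language loop concrete, here is a minimal Python sketch. It is not T-Rex's actual architecture or code. The function names, the 20-to-1 rate ratio, the gain, the threshold, and the scalar "slip" signal are all assumptions chosen for illustration.

```python
from typing import List


def slow_policy_target(step: int) -> float:
    """Hypothetical stand-in for a slow vision-language policy.

    Returns a nominal grip force (arbitrary units). A real policy would
    use camera images and an instruction; this one is a constant.
    """
    return 1.0


def slip_correction(slip: float, gain: float = 0.5, threshold: float = 0.1) -> float:
    """Fast reflex: extra grip force proportional to slip above a threshold.

    The gain and threshold are made-up illustrative values.
    """
    return gain * max(0.0, slip - threshold)


def run_control_loop(
    slip_trace: List[float],
    fast_per_slow: int = 20,   # assumed: 20 tactile updates per policy update
    decay: float = 0.9,        # assumed: reflex offset relaxes when slip stops
    max_force: float = 5.0,    # assumed: safety cap on commanded force
) -> List[float]:
    """Simulate a two-rate loop and return the commanded force at each fast step."""
    target = 0.0
    offset = 0.0
    forces = []
    for step, slip in enumerate(slip_trace):
        # Slow loop: refresh the nominal target only every `fast_per_slow` steps.
        if step % fast_per_slow == 0:
            target = slow_policy_target(step)
        # Fast loop: update the tactile reflex offset on every step.
        offset = decay * offset + slip_correction(slip)
        forces.append(min(max_force, target + offset))
    return forces


if __name__ == "__main__":
    # Five quiet steps, three steps of simulated slip, five quiet steps.
    trace = [0.0] * 5 + [0.6] * 3 + [0.0] * 5
    for step, force in enumerate(run_control_loop(trace)):
        print(step, round(force, 3))
```

In this toy version the grip force rises within a single fast step of the slip appearing and relaxes once it stops, without waiting for the slower policy to update. That is the general behavior the article describes, reduced to one number per step.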