Generalist Is Betting Its Robot-Training Gloves Will Usher In Robotics’ ChatGPT Moment

Generalist co-founders Pete Florence and Andy Zeng pose for a photo in their office. (Credit: Generalist)

The robot, a pair of disembodied arms with crablike pincers at the end, wasn’t supposed to pick up the bag. It had been told to do a single tedious job: open plastic bags on a conveyor belt and stuff toy potted plant plushies inside them.

Then one plushie snagged halfway in. The robot paused briefly, as if assessing its work, then did something it had not been programmed to do. It raised its other arm, grabbed the other side of the bag, gave it a quick shake so the toy slid all the way down, and then placed it back on the belt.

For a human worker, that’s muscle memory. For the engineers at Generalist, a Silicon Valley startup developing robot “brains,” it was a tell: the robot wasn’t just replaying a scripted task. It was improvising.

Those kinds of “emergent” behaviors are the reason Generalist’s CEO, Pete Florence, who was a lead on PaLM-E, one of Google’s foundational robotics papers, thinks robotics is nearing its “ChatGPT moment.” The startup, which raised $140 million at a $440 million valuation in 2025, and which Florence founded with Google co-worker Andy Zeng and Boston Dynamics roboticist Andy Barry, has largely stayed under the radar. (Its backers include Spark Capital, Nvidia’s NVentures, Bezos Expeditions and Boldstart Ventures.) Now it’s releasing a new model called GEN-1, and Florence says it can help off-the-shelf robots handle a wider range of high-dexterity tasks usually performed by humans, such as folding laundry and “kitting” (packing multiple different types of items into a single box), while improvising in the messy, unpredictable edge cases that have historically stumped robots.

“What’s happening now with robotics parallels when people opened GPT-3 and asked it to write a completely new limerick,” he told Forbes. “The limerick didn’t exist before. To achieve that, you need an improvisational level of intelligence. What we’re doing applies to robotics and beyond.”

His thesis is simple, expensive and proven (to a point): stop treating robotics like custom machinery and start treating it like a large language model. It’s the same thesis that ushered in the dramatic explosion of AI capabilities in ChatGPT, except with robotics data swapped in for the text-based data large language models are trained on: build ever larger models, feed them tons of data, iterate relentlessly and trust (or hope) that new capabilities will emerge.

“We’re doing whatever we need to do to scale,” he says.

After years of playing second fiddle to software, robots are back in fashion in Silicon Valley. Nvidia CEO Jensen Huang helped ignite the latest frenzy last year when he declared robots were entering the ChatGPT era. Since then, the internet has been flooded with videos of humanoid robots performing backflips, breakdancing and vaulting. Meanwhile, most real-world robots still struggle outside carefully defined tasks. ChatGPT may write code and boilerplate emails, but robots still don’t make lunch, handle DoorDash deliveries or run factories without an army of human babysitters.

Generalist’s approach is similar to that of its more highly valued competitor, Physical Intelligence (reportedly raising $1 billion at an $11 billion valuation): pair off-the-shelf robotics hardware with transformer-based AI models in the same family as those behind ChatGPT.

The data problem

There’s one thing on which nearly everyone in robotics agrees: data collection is a fundamental bottleneck. Large language models can train on the vast corpus of the internet. Robots can’t. There’s no Wikipedia for physical labor; you can’t scrape “if the toy doesn’t slip into the bag, try shaking it.”

The most common workaround is teleoperation: bulky rigs that let humans remotely control robotic systems to generate training examples. Rival Physical Intelligence is leaning heavily on that approach, building staged environments like kitchens and bedrooms for training. It has even rented local Airbnbs to practice in real-world settings.

Generalist believes it has found a more scalable alternative.

Years before Generalist existed, co-founder Zeng was walking in Newport Beach when he noticed someone picking up trash with a simple grabber tool. It was an “aha” moment for Zeng, who wondered if a tool like that could be used to generate the data to train robot pincers like the ones described above.

The result of that idea is what Generalist calls “data hands”: strap-on devices worn on the wrists that effectively turn a person’s hands into pincer-like robot hands, collecting visual and sensory data. Generalist declined to explain what exactly is collected and how it’s processed, but claims the devices are intuitive enough to be used in homes, warehouses and workplaces to perform everyday tasks. At Generalist’s offices in San Mateo, “data hands” operators work side by side with researchers, practicing tasks like bundling together a bouquet of flowers or futzing around with electronics.

A robot trainer uses the “data hands” to generate training data for Generalist’s AI models. (Credit: Generalist)

Florence says the payoff is a dataset that’s both large (now over half a million hours) and rich enough to train models that can generalize across tasks, rather than simply memorize them.

Right now, these results still require some squinting. The robots can fold boxes nearly as fast as humans, Florence says, and roughly three times faster than competing systems. But the hardware itself is rudimentary, with pincer-style grippers that don’t have the grace of human-like hands with opposable thumbs. Generalist’s rebuttal is pragmatic: fancy hands are great until they break or fail outside tightly controlled lab conditions, and the pincers can perform a pretty varied menu of tasks typically carried out by human hands.

“If you looked at GPT-2, which was released in 2019, you’d be super dismissive of it,” said Fraser Kelton, an investor in Generalist at Spark Capital who previously led product at OpenAI during the commercialization of GPT-3 and ChatGPT. “But since then, every time they’ve scaled up these models, the returns on generalization have been profound… And all of a sudden the language model companies that were building vertical or domain-specific models have been eclipsed. Literally, the exact same thing is happening within robotics.”

Not everyone buys Generalist’s “scale is all you need in robotics” hypothesis. Brad Porter, a former Amazon robotics executive and now CEO of Cobot, argues that robotics still needs significant architectural advances before scale can be applied effectively.

“Just brute forcing a huge amount of data against a not-perfect architecture is really expensive and not necessarily going to get you the result you want,” he told Forbes. “ImageNet didn’t work without CNNs, and OpenAI didn’t work without transformers,” he added, referring to the breakthroughs that have made modern AI possible. “Scaling has always gone hand-in-hand with architectural breakthroughs.”
