Forget data labeling: Tencent’s R-Zero shows how LLMs can train themselves

A new training framework developed by researchers at Tencent AI Lab and Washington University in St. Louis enables large language models (LLMs) to improve themselves without requiring any human-labeled data. The technique, called R-Zero, uses reinforcement learning to generate its own training data from scratch, addressing one of the main bottlenecks in creating self-evolving AI systems. R-Zero pairs two independent models that co-evolve, each interacting with and challenging the other.
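To make the co-evolution idea concrete, here is a deliberately tiny numerical sketch. It is not R-Zero's implementation: the names `challenger_reward`, `solve_rate` and `co_evolve`, the logistic success curve, the "succeed about half the time" reward, and the update rule are all illustrative assumptions. The point is only to show how one model that proposes tasks and another that attempts them can push each other upward with no labeled data in the loop.

```python
import math


def solve_rate(skill: float, difficulty: float) -> float:
    """Toy stand-in (assumption) for how often a solver answers correctly."""
    return 1.0 / (1.0 + math.exp(difficulty - skill))


def challenger_reward(p: float) -> float:
    """Assumed reward: highest when the solver succeeds about half the time."""
    return 1.0 - abs(2.0 * p - 1.0)


def co_evolve(rounds: int = 5, skill: float = 0.0, lr: float = 2.0):
    """Alternate a task-proposing step and a task-solving step.

    Returns a list of (chosen difficulty, solver skill after the round).
    """
    candidates = [d * 0.5 for d in range(21)]  # difficulties 0.0 .. 10.0
    history = []
    for _ in range(rounds):
        # Challenger step: propose the difficulty that maximizes its reward
        # against the current solver.
        difficulty = max(
            candidates,
            key=lambda d: challenger_reward(solve_rate(skill, d)),
        )
        p = solve_rate(skill, difficulty)
        # Solver step: hypothetical update whose size p * (1 - p) is
        # largest on tasks the solver gets right about half the time.
        skill += lr * p * (1.0 - p)
        history.append((difficulty, round(skill, 3)))
    return history


if __name__ == "__main__":
    for difficulty, skill in co_evolve():
        print(f"difficulty={difficulty:.1f} solver_skill={skill:.3f}")
```

In this toy setup the proposed difficulty and the solver's skill climb together round after round, which is the qualitative behavior the co-evolution description implies. A real system would replace both functions with LLMs and the update with reinforcement learning.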

Experiments show that R-Zero substantially improves reasoning capabilities across different LLMs, which could lower the complexity and costs of training advanced AI. For enterprises, this approach could accelerate the development of specialized models for complex reasoning tasks without the massive expense of curating labeled datasets.

The challenge of self-evolving LLMs

Related reading

Teaching the model: Designing LLM feedback loops that get smarter over time

Beyond static AI: MIT’s new framework lets models teach themselves

This AI Model Never Stops Learning

AlphaOne gives AI developers a new dial to control LLM ‘thinking’ and boost…

IEEE Rolls Out Large Language Models Virtual Training Course

Small language models: Rethinking enterprise AI architecture