The team uses rewards to teach the AI to solve problems, allowing them to bypass conventional training barriers.

Unlike US AI firms, which publish the findings of their frontier risk evaluations, Chinese companies do not announce such details.

The Chinese artificial intelligence model DeepSeek-R1 learns more, and learns better, when it receives 'rewards' for solving problems.