Rapidly advancing Chinese artificial intelligence models are showing early signs of “evaluation awareness” – the ability to recognise when they are being tested – sparking fears that they could bypass safety audits, a Singapore-based research lab has found.

Evaluation awareness refers to a model’s understanding that it is undergoing testing, evaluation or experimentation by human researchers rather than operating in a real-world setting.

The phenomenon was raising alarms because it could allow AI systems to deliberately game human evaluators to pass safety tests, according to Clement Neo, founder of Neo Research, a frontier AI safety evaluation lab.

“It would mean that whatever testing the model developers themselves do might not reflect the actual behaviour of a model once it gets deployed,” he said. “And that’s a really big problem.”

Neo Research’s findings, published last week, detail a jump in evaluation awareness among Chinese AI models. Over just a few months, these systems had risen from near-zero awareness to within striking distance of their US counterparts, propelled by a broader leap in overall capabilities, the report said.

Anthropic’s Claude 4.5 Opus scored nearly 80 per cent in evaluation awareness.

Neo and his co-founder Miro Pluckebaum tested models from DeepSeek, Moonshot AI and Zhipu AI. They used a popular AI misalignment test originally developed by US company Anthropic, which places models in fictional scenarios where their goals or continued operations are threatened.
Like US models, Chinese AI is learning to ‘game’ safety tests, research lab says
In just a few months, Chinese AI models have risen from near-zero ‘evaluation awareness’ to within striking distance of their US counterparts.








