Once covered models are defined, Nguyen warned, the effectiveness of the safety testing will likely depend on whether AI firms are fully transparent and treat the process as a “genuine collaboration.”
“Underneath the definitional problem sits an observability problem,” Nguyen wrote. “The government cannot assess what it cannot see, and frontier capabilities are visible only to the labs that build them.”
Ferren suggested that “the window for erecting proper cyber defenses to new AI models may also close quickly,” and that even a well-designed government program may struggle to properly vet frontier models in such a short timeframe. “Even when well implemented, pre-deployment testing has limits,” Ferren said, noting that Google’s threat intelligence team has found state-aligned actors using frontier models to automate cyberattacks and “researchers have shown that Mythos-style vulnerability reasoning can be reproduced with open-weight systems.”
So while AI firms may voluntarily submit to testing, they may be financially motivated to seek a rubber stamp rather than work with the government to test known frontier capabilities to their fullest extent.
“It will likely prove difficult to develop models that are incapable of malicious hacking yet remain commercially compelling,” Ferren said.










