The block on the powerful AI models Fable 5 and Mythos 5 was allegedly triggered not by a jailbreak but by a model following the simple instruction “fix this code.” That, at least, is the claim of IT security researcher Katie Moussouris, who was allowed to review the report that led the US government to impose the block, which it justified by citing an export control directive. According to the report, researchers presented the AI models Fable 5, Mythos, and Opus with open-source code, some of it containing known vulnerabilities and some with deliberately inserted flaws. Fable 5 refused the request “review the code for security issues.” However, the model complied with the request “fix this code.”

Jailbreak or not?

Moussouris believes she is the only independent expert to have seen the report in question. The report further states that the response to “fix this code” was converted, through a “multistep and manual process,” into scripts that check the applied patches. That was all, she says, and in her opinion it should never have led to an export ban. Those dedicated to defending against cyberattacks, she argues, must be allowed to use AI technology to fix vulnerabilities in code, explain their significance, and write tests to verify them. In her view, this is not a circumvention of guardrails, as has been claimed. On the contrary, it is the most valuable thing AI models can do for cyber defense.