A new benchmark for assessing how the largest artificial intelligence models could be used for terrorist attacks has found the industry has already facilitated deadly attacks.
Tech against Terrorism conducted a series of tests across 27 leading AI systems to show how robust they were at refusing to provide information that could be used to kill. The group created the Counter-Terrorism AI benchmark, which found one third of responses could help a terrorist perpetrate an attack.
A report issued on Wednesday said there were real concerns that AI could assist terrorist plots and identified 30 such cases with a combined death toll exceeding 70.
"What a model produces when asked to assist terrorism is therefore a major safety concern, not a hypothetical one, and as model capability rises and access widens it becomes a property that can and should be measured," the report said. "The agentic nature of the newer models also increases the risk that 'single-shot' failures could be compounded in autonomous reinforcement loops."
A single-shot prompt is when the AI model is given one example of the task in hand before it receives the prompt. Because a template has been offered to follow, it is said to hugely improve consistency in AI performance.
Mass attacks
The report cited an incident last November when a car bomb exploded near the Red Fort in New Delhi, India, killing at least a dozen people and injuring more than 20 others.
"The plot was linked to an Al Qaeda-aligned module whose 'in-house engineer' had used ChatGPT and YouTube to research device construction and explosive chemistry – an early, lethal example of AI used as an operational assistant in terrorism," it said.
In the real-world incident there were links to at least 11 AI tools. Establishing the benchmark involved testing 27 models with almost 2,500 single-shot prompts.
Two open models with their safety stripped out, a process called abliteration, complied with 89 per cent and 100 per cent of dangerous requests, respectively. These models cannot be recalled, the report notes.
Overall, the tests found that around a third of responses handed over usable content. Tech against Terrorism said that AI assistance in the inception of terrorist activity was a safety and control problem across the board. The issue was not confined to the extremes of testing protocols.
"Until now, there was no AI benchmark focused specifically on terrorism, so we built one. With nothing more than simple, single-shot questions, many of the models we tested handed over meaningful help towards making a bomb or planning a mass-casualty attack," founder Adam Hadley said.
"This is not acceptable. This is a control problem as much as a safety one. The real risk is that AI developers are inadvertently creating models they cannot control."
What the benchmark tests is the extent to which a model provides or compiles insights and information above the baseline of what can ordinarily be found online through search.
It found that rates of initial full refusal to provide information varied widely, with some models registering almost 90 per cent but one leading product dropping below 50 per cent. It also looked at the wider help offered to a potential bomber by weighting the information, a process that generated concerning outcomes.
"With rudimentary single-shot examples, many models provide meaningful uplift for terrorist use cases such as making bombs or planning mass casualty attacks," it said.
Even where refusals were high, this could be offset by the severity of the assistance on offer. So where refusals accounted for 57 per cent of responses, "hedged compliance" still accounted for 15 per cent. This is where output opens with a refusal or warning and then supplies the full requested content anyway.
The report says that in these instances the refusal is only cosmetic.
Tech against Terrorism is offering its benchmarking process to the industry as a task force to raise safety.
Recommendations
It makes a series of recommendations to the tech giants and governments.
Treat terrorist misuse as a distinct safety category.
Test models for changes in stated intent, which currently defeat many guardrails.
Extend refusal training beyond the most recognisable threat.
Treat the circulation of de-restricted, or abliterated, open models as a major national security concern, tracking their distribution and planning for it.
AI models provide assistance to terror attacks | The National
Tech against Terrorism model finds one third of requests would boost plots







