Cisco used AI to write security incident reports, with mixed results

You’ll need a lot of detailed prompts to get solid output - and even then it may have errors and typos

Cisco tested AI’s ability to write an accurate report on a tabletop security incident response exercise, and found that while the tech can save time, many risks remain.

The networking giant revealed its results in a Thursday blog post (https://blogs.cisco.com/security/ai-generated-reporting-lessons-learned-from-talos-incident-response) by Nate Pors, a senior incident commander in the Cisco Talos Incident Response team.

Pors opened by observing that when used to generate long-form technical content, large language models can deliver “significant inaccuracies, unusual conclusions, and inconsistent writing styles.”

LLMs make those mistakes because they’re essentially fancy autocomplete systems that make educated guesses. Pors wrote that the nature of LLMs therefore leads them to mess up in four ways:

Using different data for each query, which means it’s “difficult to rely on an LLM for repeatable, standardized research outcomes.”

Reaching different conclusions from the same data. “In a data breach scenario, a model might suggest a full organization-wide password reset in one instance and a targeted reset in another,” Pors wrote, adding that AI then “often defaults to whichever recommendation it generates first” and may therefore give bad advice.

Producing inconsistent layouts. Because LLMs generate content token-by-token, they can create documents with different structure and formatting on each new run. “This unpredictability is problematic for professional environments where standardized layouts, such as consistent executive summaries or recommendation sections, are essential for quality control,” the Talos man observed.

Discarding data, so that output might ignore critical information.

Talos developed several techniques to stop this sort of thing happening.

One involves giving an LLM “granular, single-task instructions” that focus on “a specific, small portion of the report.” Doing so means “risk of hallucination or cross-contamination between sections is significantly reduced.” Telling an LLM which sources to use also helps. So does setting rules about the style and format of output.
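To make those three techniques concrete, here is a minimal Python sketch of how one prompt per report section might be assembled. It is an illustration only, not Talos’s tooling: the `SectionTask` class, the `STYLE_RULES` wording, the section titles and the file names are all hypothetical, and the sketch only builds prompt text rather than calling any particular model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectionTask:
    """One granular, single-task instruction (hypothetical structure)."""
    title: str                 # the one report section this prompt covers
    instruction: str           # what the model should do for that section
    sources: tuple[str, ...]   # the only sources the model may draw on


# Assumed style and format rules, shared by every section prompt.
STYLE_RULES = (
    "Write in the third person and the past tense.",
    "Use only the listed sources; write 'not stated' where they are silent.",
    "Return plain paragraphs for this section and nothing else.",
)


def build_prompt(task: SectionTask) -> str:
    """Build the prompt text for a single section of the report."""
    lines = [
        f"Write only the '{task.title}' section of the report.",
        f"Task: {task.instruction}",
        "Sources you may use:",
    ]
    lines += [f"- {source}" for source in task.sources]
    lines.append("Style and format rules:")
    lines += [f"- {rule}" for rule in STYLE_RULES]
    return "\n".join(lines)


def build_report_prompts(tasks: list[SectionTask]) -> list[str]:
    """Return one independent prompt per section, in report order."""
    return [build_prompt(task) for task in tasks]


if __name__ == "__main__":
    tasks = [
        SectionTask(
            title="Executive Summary",
            instruction="Summarize the exercise outcome in one paragraph.",
            sources=("exercise_timeline.md",),
        ),
        SectionTask(
            title="Recommendations",
            instruction="List remediation steps supported by the notes.",
            sources=("facilitator_notes.txt", "exercise_timeline.md"),
        ),
    ]
    for prompt in build_report_prompts(tasks):
        print(prompt)
        print("---")
```

Because each prompt names a single section and its own source list, material meant for one section is never shown to the model while it writes another, which is the cross-contamination the blog post warns about. A human would still need to review whatever comes back.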

