Top AI agents scored zero percent on expert-level professional tasks, according to the ALE benchmark. Not a handful. Not a disappointing few. Not even one.

Enjoy this satisfying round number while your timeline fills up with threads about how agents will replace your entire engineering team by Q3.

What ALE Actually Showed

ALE, short for Agents' Last Exam, is a benchmark that tests AI agents on problems demanding real professional expertise. Not "summarize this PDF" problems, but the hard, domain-specific work that experts in a field actually do.

The findings were grim. Fable 5 and GPT-5.5 were among the models tested. On the most difficult "Last-Exam" tier of expert-level problems, they posted a 0% pass rate, though partial credit was non-zero. A coin flip would have been more impressive.
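
If a 0% pass rate alongside non-zero partial credit sounds contradictory, here is a rough sketch of how both can be true at once. The scores are made up for illustration and are not ALE data. The pass rule (a task passes only at a full score of 1.0) is an assumption, not ALE's documented grading scheme.

```python
# Hypothetical per-task rubric scores in [0, 1]. NOT actual ALE results.
scores = [0.4, 0.25, 0.0, 0.6, 0.1]

# Assumption: a task counts as passed only when it earns full marks.
PASS_THRESHOLD = 1.0


def pass_rate(task_scores, threshold=PASS_THRESHOLD):
    """Fraction of tasks whose score reaches the pass threshold."""
    return sum(s >= threshold for s in task_scores) / len(task_scores)


def mean_partial_credit(task_scores):
    """Average score across tasks, counting partially completed work."""
    return sum(task_scores) / len(task_scores)


print(f"pass rate: {pass_rate(scores):.0%}")
print(f"mean partial credit: {mean_partial_credit(scores):.0%}")
```

Under these assumptions, an agent can get part of the way on most tasks and still finish none of them, which is exactly the gap between "shows promise" and "did the job."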