Frontier Bakeoff: We Benchmarked Fable 5 Hours Before the Shutdown

Fable 5 didn't win.

I need to say that up front because the timing of this post is going to make it sound like a very different story. Yes, we benchmarked Claude Fable 5 on our homelab harness. Yes, the US government suspended it about three hours later. But the actual result? Fable 5 scored 89.3. Opus 4.8 scored 91.9. The model everyone's eulogizing right now lost to a model you can still use today.

That's the real story. The suspension is just what makes it weird.

What We Tested

This is Round 6 of our homelab bakeoff series — but with a twist. Rounds 1 through 5 tested quantized local models on an RTX 5090 via llama.cpp. This time we pointed the same task suite at four frontier cloud models:
