AstaBench’s latest update adds new frontier-model results, including GPT-5.5, and highlights growing adoption from groups including the UK AISI, General Reasoning, Elicit, SciSpace, Distyl AI, and EvoScientist.

TL;DR: Presenting the ultimate benchmark: getting models to create benchmarks for each other. GPT 5.2 is the current (and only) winner.
