Story in 3 sources

Humans outperform AI at this highly rigorous mathematics test

A new benchmark pitting AI against previously unseen maths problems shows systems still fall short of top human expertise.

Reported by

scientificamerican.com

nature.com

cryptobriefing.com

Source comparison

3 perspectives on the same story

AI · summaries

nature.com · Currently reading · 1 month ago

Humans outperform AI at this highly rigorous mathematics test

First Proof test on ten unpublished research math problems: the top model (ETH's ChatGPT harness with advisory council) scored 6/10, below expert mathematicians. This reveals critical limits in autonomous mathematical reasoning, constraining AI as a research assistant or proof-checker in enterprise R&D.


scientificamerican.com · 1 month ago

AI scores a ‘C–’ on its hardest math test yet

The second batch of “First Proof” problems is meant to evaluate AI’s usefulness for research-level math. The best model got six or seven of the 10 questions right.


Chronological timeline

Wednesday, 10 June 2026 · scientificamerican.com
AI scores a ‘C–’ on its hardest math test yet
The second batch of “First Proof” problems is meant to evaluate AI’s usefulness for research-level math. The best model got six or seven of the 10 questions right.
Friday, 12 June 2026 · nature.com
Humans outperform AI at this highly rigorous mathematics test
A new benchmark pitting AI against previously unseen maths problems shows systems still fall short of top human expertise.

Mathematicians grade AI performance on complex problem set at Harvard
