WARPTECH LAB NEWS

Warptech Lab News aggregates the most relevant news from more than 700 international sources, with AI classification, concise TL;DR summaries, and clustered timelines for individual stories.

© 2026 Sparktech S.R.L. All rights reserved. Site operated and maintained by Sparktech S.R.L.

Registered office: Corso Libertà 55, 13100 Vercelli (VC), Italy · VAT no. / Tax code 02835910023 · Contact: admin@warptechlab.com

Story in 3 sources

Humans outperform AI at this highly rigorous mathematics test

A new benchmark pitting AI against previously unseen maths problems shows systems still fall short of top human expertise.

Reported by scientificamerican.com, nature.com, cryptobriefing.com

Source comparison

3 perspectives on the same story
AI · summaries
nature.com · Currently reading · 6 days ago

Humans outperform AI at this highly rigorous mathematics test

First Proof test on ten unpublished research math problems: the top model (ETH's ChatGPT harness with advisory council) scored 6/10, below expert mathematicians. This reveals critical limits in autonomous mathematical reasoning, constraining AI as a research assistant or proof-checker in enterprise R&D.

original
scientificamerican.com · 7 days ago

AI scores a ‘C–’ on its hardest math test yet

The second batch of “First Proof” problems is meant to evaluate AI’s usefulness for research-level math. The best model got six or seven of the 10 questions right

Read this version → original

Chronological timeline

  1. Wednesday, 10 June 2026 · scientificamerican.com

    AI scores a ‘C–’ on its hardest math test yet

    The second batch of “First Proof” problems is meant to evaluate AI’s usefulness for research-level math. The best model got six or seven of the 10 questions right

  2. Friday, 12 June 2026 · nature.com

    Humans outperform AI at this highly rigorous mathematics test

    A new benchmark pitting AI against previously unseen maths problems shows systems still fall short of top human expertise.

cryptobriefing.com · 3 days ago

Mathematicians grade AI performance on complex problem set at Harvard

Harvard mathematicians tested OpenAI and Google AI on 10 unpublished research math problems; models solved 7 of 10, versus 2-of-10 earlier. The jump signals genuine reasoning capacity beyond pattern-matching—essential intel for CTOs evaluating AI for research-grade technical work.

Read this version → original
  3. Monday, 15 June 2026 · cryptobriefing.com

    Mathematicians grade AI performance on complex problem set at Harvard

    Thirty mathematicians at Harvard blind-graded AI solutions to 10 original research-level math problems. AI passed seven but struggled with the hardest