WARPTECHNEWS · LAB
HomeAIBusinessTechArchive
WARPTECH LAB NEWS

Warptech Lab News aggrega le notizie più rilevanti da oltre 700 fonti internazionali, con classificazione AI, TL;DR sintetici e timeline cluster su singole storie.

Navigazione

  • Home
  • Archivio
  • Editor's Brief
  • Cerca
  • Il tuo account
  • Newsletter tech/AI

Informazioni legali

  • Privacy Policy
  • Termini di servizio
  • Cookie Policy

© 2026 Sparktech S.R.L. — Tutti i diritti riservati. Sito gestito e manutenuto da Sparktech S.R.L.

Sede legale: Corso Libertà 55, 13100 Vercelli (VC), Italia · P.IVA / C.F. 02835910023 · Contatti: admin@warptechlab.com

Home
Storia in 1 fonti

New math benchmark reveals AI models confidently solve problems that have no solution

A consortium of 64 mathematicians built SOOHAK, a new AI benchmark with 439 handwritten tasks, including 99 that are deliberately unsolvable. Google's Gemini 3 Pro leads on research-level problems at 30 percent. But no model cracks 50 percent on spotting broken tasks. More compute makes models better at solving. It doesn't improve them at admitting a problem has no answer. SOOHAK tries to pin down the gap between a few flashy results and the broad research skills AI systems still lack.

Raccontata dathe-decoder.com

Timeline cronologica

  1. domenica 17 maggio 2026·the-decoder.com

    New math benchmark reveals AI models confidently solve problems that have no solution

    A consortium of 64 mathematicians built SOOHAK, a new AI benchmark with 439 handwritten tasks, including 99 that are deliberately unsolvable. Google's Gemini 3 Pro leads on…