Story from 1 source

Evaluating agents for scientific discovery | Ai2

Two benchmarks developed at Ai2 – ScienceWorld and DiscoveryWorld – reveal that even incredibly strong AI science agents struggle with problems human scientists solve routinely.

Reported by

allenai.org

Timeline

Tuesday, May 26, 2026 · allenai.org