I put 12 AI models into a public World Cup prediction arena.

Not because I think anyone should use LLMs for betting. They should not. The page says "entertainment only" for a reason.

I did it because sports prediction is a surprisingly clean stress test for models:

structured facts

stale priors