A few months ago, I got tired of manually checking which AI model had the longest context window. Every week, some provider would quietly update a model card, or a new release would drop with a bigger number, and the leaderboard would shift without anyone noticing.

So I built something simple but obsessive: an automatically updating database that scrapes and ranks 360+ AI models by their advertised context windows (https://modelatlas.net/blog/long-context-models) or pricing (https://modelatlas.net/blog/cheapest-ai-models). It pulls from OpenRouter, official provider docs, and model cards. Every time a provider changes a spec, the database picks up the change within hours.
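
To make the ranking step concrete, here is a minimal sketch of how sorting by advertised context window could work. It is an illustration, not the actual pipeline: the record fields `id` and `context_length`, the function name `rank_by_context`, and the sample records are all assumptions. The figures are placeholders chosen to match the numbers discussed in this post, with one made-up entry.

```python
from typing import Iterable


def rank_by_context(models: Iterable[dict]) -> list[dict]:
    """Return models sorted by advertised context window, largest first.

    Records without an integer `context_length` are skipped rather than
    guessed at, since a missing spec should not earn a leaderboard slot.
    """
    usable = [m for m in models if isinstance(m.get("context_length"), int)]
    ranked = sorted(usable, key=lambda m: m["context_length"], reverse=True)
    # Attach a 1-based rank to each record.
    return [{"rank": i, **m} for i, m in enumerate(ranked, start=1)]


# Hypothetical sample records (illustrative values, not scraped data).
sample = [
    {"id": "example-model", "context_length": 128_000},   # made-up entry
    {"id": "grok", "context_length": 2_000_000},          # assumed runner-up
    {"id": "llama-4-scout", "context_length": 10_000_000},
    {"id": "no-spec-model"},                              # missing spec, skipped
]

leaderboard = rank_by_context(sample)
for row in leaderboard:
    print(f'#{row["rank"]} {row["id"]}: {row["context_length"]:,} tokens')
```

A real scraper would also need to reconcile conflicting numbers from different sources before ranking; this sketch assumes that has already happened.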

Then, one day while I was watching it, the ranking algorithm did something that made me stop everything.

Llama 4 Scout appeared at the #1 position with a context window of 10,000,000 tokens.

I stared at the number for a solid minute. Ten million. That wasn’t just bigger than GPT-4. That was bigger than Claude, bigger than Gemini, bigger than everything. It was five times the second-largest window on the list (Grok, which now trails Llama’s claim). My first thought was exactly what you’d expect: "... What?"