Tuesday afternoon I kicked off a re-grading job. About 18,000 prompts against claude-opus-4-7, eight workers, each one looping messages.create as fast as it could.
Forty minutes in, every call started coming back with a 429 and a header that said anthropic-ratelimit-tokens-remaining: 0. Fine, I thought. Back off. I cut workers to four and waited. Still 429. Cut to two. Still 429.
Then I noticed the cap-clear timestamp was not minutes. It was rolling. I had pushed past the daily token budget for the whole org, and a daily window does not reset in five minutes.
I emailed support. They acknowledged Wednesday morning. They cleared the cap Friday afternoon. 72 hours.
I am not going to claim the engineering was elegant after that. I sat there refreshing the dashboard for three days. When the cap finally cleared, I built llmfleet so I would never sit there again.







