Coinbase CEO Brian Armstrong has moved his company to cheap Chinese AI models. The company is using more tokens than ever but paying half what it used to.
Coinbase now runs on models like GLM 5.2 and Kimi 2.7, according to Armstrong. Developers can still pick whatever model they want, but 91 percent never hit their old usage limits anyway.
The CEO of startup Lindy recently made the same move, switching to DeepSeek v4. Snowflake is also testing Chinese models as cheaper alternatives to OpenAI and Anthropic. The shift puts real pricing pressure on Western AI labs and adds risk just as some are eyeing IPOs. It's a stress test for the growth numbers they need to hit to justify the money they've raised.
Coinbase also runs an automatic routing system that picks the best model for each request based on task, price, and caching potential. Better caching alone pushed the cache hit rate from 5 percent to 60 percent. Developers are told to keep context lean and start fresh sessions for new tasks, a strategy that falls under the broader umbrella of context engineering.
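Coinbase has not published how its router works, so the following is only a minimal sketch of the general idea: filter models by task, then pick the one with the lowest estimated cost once expected cache hits are priced in. Every name here (`Model`, `Request`, `route`, the `model-a`/`model-b`/`model-c` entries) and every price is a made-up placeholder for illustration, not Coinbase's configuration or any vendor's real pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                 # hypothetical model identifier
    input_price: float        # placeholder price per million uncached input tokens
    cached_price: float       # placeholder price per million cached input tokens
    tasks: frozenset          # task types this model is considered suitable for


@dataclass(frozen=True)
class Request:
    task: str                 # e.g. "code" or "chat"
    input_tokens: int
    expected_cache_hit: float  # assumed fraction of input tokens served from cache (0..1)


def estimated_cost(model: Model, request: Request) -> float:
    """Estimate input cost, charging cached tokens at the cached rate."""
    cached = request.input_tokens * request.expected_cache_hit
    fresh = request.input_tokens - cached
    return (fresh * model.input_price + cached * model.cached_price) / 1_000_000


def route(models: list, request: Request) -> Model:
    """Return the cheapest model among those suitable for the request's task."""
    candidates = [m for m in models if request.task in m.tasks]
    if not candidates:
        raise ValueError(f"no model supports task {request.task!r}")
    return min(candidates, key=lambda m: estimated_cost(m, request))


# Illustrative catalog with invented prices.
MODELS = [
    Model("model-a", input_price=3.0, cached_price=0.3, tasks=frozenset({"code", "chat"})),
    Model("model-b", input_price=1.0, cached_price=1.0, tasks=frozenset({"code", "chat"})),
    Model("model-c", input_price=0.5, cached_price=0.5, tasks=frozenset({"chat"})),
]

if __name__ == "__main__":
    # Low cache hit rate: the model with the lower list price wins.
    print(route(MODELS, Request("code", 1_000_000, 0.05)).name)  # model-b
    # High cache hit rate: the model with the steep cache discount wins.
    print(route(MODELS, Request("code", 1_000_000, 0.90)).name)  # model-a
```

In this toy setup the same coding request goes to a different model depending on the expected cache hit rate, which is one way caching potential can matter as much as list price. A real router would presumably also weigh output tokens, quality, and latency.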
Coinbase's token usage has shot up in recent months as agentic reasoning models like GPT-5.x-Thinking and Opus 4.5 hit the market. | Image: Armstrong










