I was paying as much as $480/month for GPT-4o API access. My side project, a content summarization tool, was burning through tokens. Every week I'd check the bill and wince. $120. $140. Then $480 in a bad month.

I knew Chinese AI models existed, but I had assumptions: harder to access, lower quality, complicated setup. I was wrong on all three.

After a weekend of benchmarking, I switched. My bill dropped to $28/month. The quality? My users didn't notice a difference. Here's exactly how.

The Setup

I'm running a Python app that summarizes long articles, support tickets, and docs. It's heavy on text processing: about 15-20 million tokens per month, mostly through GPT-4o, with some GPT-4o-mini for simpler tasks.
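
As a rough sanity check on those numbers, here's a small Python sketch that turns a monthly bill and a token volume into an effective price per million tokens. It assumes the $480 and $28 bills both cover the same 15-20 million token workload, which is a simplification, and the function names are just illustrative helpers, not part of any SDK.

```python
def blended_cost_per_million(monthly_bill_usd: float, monthly_tokens: float) -> float:
    """Effective price per one million tokens implied by a monthly bill."""
    return monthly_bill_usd / (monthly_tokens / 1_000_000)


def savings_percent(old_bill_usd: float, new_bill_usd: float) -> float:
    """Percentage reduction going from the old bill to the new one."""
    return (old_bill_usd - new_bill_usd) / old_bill_usd * 100


if __name__ == "__main__":
    # Assumption: both bills correspond to the same monthly token volume.
    for tokens in (15_000_000, 20_000_000):
        before = blended_cost_per_million(480, tokens)
        after = blended_cost_per_million(28, tokens)
        print(f"{tokens / 1e6:.0f}M tokens: ${before:.2f}/M before, ${after:.2f}/M after")
    print(f"Savings: {savings_percent(480, 28):.1f}%")
```

Under that assumption, the worst-month bill works out to roughly $24-32 per million tokens before the switch and about $1.40-1.87 after, a drop of around 94%.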