Introduction: The Wrong Question

GitHub's shift from premium requests to usage-based billing has triggered a wave of anxiety across engineering teams. The question echoing through Slack channels and leadership meetings is some variation of: "How do we reduce our token spend?"

It's the wrong question.

Focusing purely on cost diminishes the value you get from agents. A better framing is: "How do we get the most out of the tokens we spend?" That subtle reframing changes everything: how you write prompts, which model you reach for, how you architect your codebase, and how you organize your team's workflows.

This article walks through the full case for quality-first token optimization, the foundational mental models you need to reason about it, and the concrete controls and techniques that move the needle.