I'm a data science student learning how AI APIs are priced. I run agentic coding sessions through OpenRouter, where every code-generation loop pulls a fresh batch of tokens from a model I picked from a list of 300+ names I barely understood. Some loops cost a few cents. Some cost dollars. The total adds up faster than I expected.
So I built a calculator for it. The blog post below is what I learned along the way.
The cheapest paid models on OpenRouter in mid-2026 are clustered in the open-weights category. Prices below are in USD per 1M tokens, written as input/output. Llama 3.1 8B Instruct from Meta is $0.02/$0.03. Phi-4 from Microsoft is $0.07/$0.14. Llama 3.3 70B at $0.10/$0.32 is the cheapest 70B-class model. Mistral Small 3.1 24B at $0.35/$0.56 is the cheapest non-Meta option in the mid-tier band.
At low volume, the per-call cost on these models is close to zero. A chat-shaped call (1,000 input tokens + 500 output tokens) on Llama 3.1 8B costs $0.000035: $0.00002 for the input plus $0.000015 for the output. At 1 million calls per month, that is $35. The same call on GPT-4o costs $0.0075, or $7,500 per month. The cheap models buy a roughly 200x cost reduction ($7,500 / $35 is about 214) in exchange for some quality on hard tasks.
So the cheap tier is the right starting point for any product where the per-call cost is a meaningful fraction of revenue.
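The arithmetic above is easy to get wrong by a factor of a thousand, so here is a minimal sketch of it in Python. It is not the full calculator, just the core formula. The dictionary keys are labels I made up for this example (not official OpenRouter model IDs), and the GPT-4o rates of $2.50/$10.00 are an assumption: they are the per-1M prices that reproduce the $0.0075 per-call figure quoted above.

```python
# Prices in USD per 1M tokens as (input, output), taken from the figures in this post.
# Keys are illustrative labels, not official OpenRouter model IDs.
PRICES_PER_MILLION = {
    "llama-3.1-8b-instruct": (0.02, 0.03),
    "phi-4": (0.07, 0.14),
    "llama-3.3-70b": (0.10, 0.32),
    "mistral-small-3.1-24b": (0.35, 0.56),
    # Assumed rates: chosen because they reproduce the $0.0075 per-call figure.
    "gpt-4o": (2.50, 10.00),
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one call, given its input and output token counts."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def monthly_cost(model: str, input_tokens: int, output_tokens: int, calls: int) -> float:
    """Return the USD cost of `calls` identical calls."""
    return call_cost(model, input_tokens, output_tokens) * calls


if __name__ == "__main__":
    # Chat-shaped call from the post: 1,000 tokens in, 500 tokens out, 1M calls per month.
    for model in PRICES_PER_MILLION:
        per_call = call_cost(model, 1_000, 500)
        per_month = monthly_cost(model, 1_000, 500, 1_000_000)
        print(f"{model:24s} ${per_call:.6f} per call   ${per_month:,.2f} per month")
```

Keeping input and output prices separate matters because output tokens are priced higher on every model listed here, so a long-output workload shifts the comparison more than the input price alone suggests.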






