TL;DR

Three flash-tier coding models are competing for your API budget right now: Google's Gemini 3.5 Flash (May 19, 2026), Anthropic's Claude Haiku 4.5 (the reigning budget pick since October 2025), and Microsoft's MAI-Code-1-Flash (June 2, 2026). Haiku wins on output cost at $5/M tokens and structured output reliability. Gemini 3.5 Flash leads on agentic benchmarks (76.2% Terminal-Bench 2.1) and offers a 1M-token context window. MAI-Code-1-Flash beats both on SWE-Bench Pro by 16 points at 51.2%, but you can only use it inside GitHub Copilot. Pick based on where you actually build: Copilot users get MAI-Code-1 for free, API builders choose between Haiku's cost and Flash's context, and anyone running agent loops with tool calls should benchmark Flash first.

Three Flash Models, Three Different Bets

I've spent the last three weeks routing coding tasks through all three of these models: code reviews in Copilot, agent loops via API, and batch refactors across a 40-file Python project. The experience taught me something the benchmark tables don't show: each model was built with a different definition of "coding" in mind.

Google optimized Gemini 3.5 Flash for agents that run in terminals, call tools, and iterate. Anthropic built Haiku 4.5 for developers who need a cheap, fast model that follows instructions precisely and returns clean JSON. Microsoft trained MAI-Code-1-Flash end-to-end inside the GitHub Copilot harness, so it knows how VS Code works, what diffs look like, and how to stay concise in inline completions.