The Gemini 3.5 Flash announcement made the rounds on Hacker News this week, and I've been getting pinged by teammates asking whether to migrate our internal tools off Claude Haiku or GPT-4o mini. After spending a weekend running informal evals on classification and code-routing tasks, I have some takes.

Upfront caveat: I'm hedging on specific Gemini 3.5 Flash feature claims because marketing benchmarks and actual API behavior rarely match. Check the official Gemini docs before you bet your architecture on any number — including the ones I quote.

Why migrate small models at all?

Most teams I work with use small, fast models for the boring stuff:

Classification (is this ticket billing, bug, or feature-request?)