Running Brand-New Gemma 4 12B on an 8-Year-Old GTX 1080 Ti: Speed, 3 Gotchas, and Why Q8 Beat Q4 in My Own Field
I pulled the just-released Gemma 4 12B and ran it on a GTX 1080 Ti. It reached ~28 tok/s at Q4 on one card, but three things broke first. Q8 (2 cards, 30% slower) fixed both the token glitches and a domain answer that Q4 got confidently wrong.