TL;DR (Quick Answer)

Gemma 4 12B just dropped, so I ran it on a GTX 1080 Ti (Pascal, 2017) to see what an 8-year-old card does with a 2026 model. Real numbers, and a few honest surprises:

Speed: ~28 tok/s at Q4_K_M on a single 1080 Ti, using ~8 GB of the card's 11 GB VRAM. The 12B fits on one card, so the second GPU sits idle.

Three things broke before it worked: the GGUF is multimodal and its vision projector crashes Ollama; it's a reasoning model that hides its answer in a thinking channel; and Q4 produces visible token glitches.

The interesting part: Q4 vs Q8. I asked it real bioinformatics questions. At Q4 it answered concepts and code well but got a niche method (the HEIDI test) confidently backwards, with garbled characters sprinkled in. Switching to Q8_0 (12.7 GB, split across both 1080 Tis, ~30% slower at ~19.5 tok/s) removed the glitches and fixed the wrong answer.
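For anyone who wants to sanity-check the Q4 vs Q8 trade-off, here is a minimal sketch of the arithmetic behind those figures. The throughput and file-size numbers are the ones quoted above. The 11 GB per-card figure is the GTX 1080 Ti's spec. The helper names are made up for illustration, and the card count is a lower bound that only considers model weights, not KV cache or runtime overhead.

```python
import math

# Figures quoted in this post.
Q4_TOK_S = 28.0      # Q4_K_M throughput on one card
Q8_TOK_S = 19.5      # Q8_0 throughput split across two cards
Q8_SIZE_GB = 12.7    # Q8_0 model size
CARD_VRAM_GB = 11.0  # GTX 1080 Ti VRAM per card


def slowdown_pct(fast_tok_s: float, slow_tok_s: float) -> float:
    """Relative throughput loss of the slower config, in percent."""
    return (fast_tok_s - slow_tok_s) / fast_tok_s * 100


def min_cards_for_weights(model_gb: float, vram_gb: float) -> int:
    """Lower bound on cards needed to hold the weights alone.

    Assumption: ignores KV cache, activations, and driver overhead,
    so real requirements can be higher.
    """
    return math.ceil(model_gb / vram_gb)


if __name__ == "__main__":
    print(f"Q8_0 slowdown vs Q4_K_M: {slowdown_pct(Q4_TOK_S, Q8_TOK_S):.1f}%")
    print(f"Cards needed for Q8_0 weights: {min_cards_for_weights(Q8_SIZE_GB, CARD_VRAM_GB)}")
```

The slowdown works out to roughly 30%, and 12.7 GB of weights cannot fit in a single 11 GB card, which is why Q8_0 has to be split across both GPUs.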