Shipping Gemma 4 speech recognition in a Windows .NET desktop app: a 5-variant model-selection tour

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

What I Built

Parlotype is a voice-to-text desktop app for Windows. It is built with .NET 10 and Avalonia UI. You hold a global hotkey, speak, then release it. Your text appears in whatever app you were typing into. All speech recognition runs on your machine: nothing goes to the cloud, and no audio leaves the machine.

Google released Gemma 4 in April 2026. It has a native multimodal audio path. I added it as an alternative speech engine alongside the existing Whisper.net pipeline. You pick Whisper or Gemma 4 in Settings. The rest of the audio pipeline (WASAPI capture, then Silero VAD, then text injection) stays the same.
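To make the "swap the engine, keep the pipeline" idea concrete, here is a minimal sketch. It is written in Python for brevity; Parlotype itself is a .NET app, and every name below (`SpeechEngine`, `run_pipeline`, `capture`, `detect_speech`, `inject`) is hypothetical and not taken from the Parlotype codebase. It only illustrates that the capture, VAD, and injection stages do not need to know which engine is selected.

```python
from typing import Callable, Optional, Protocol


class SpeechEngine(Protocol):
    """Anything that turns speech audio into text (hypothetical interface)."""

    def transcribe(self, pcm: bytes) -> str: ...


def run_pipeline(
    capture: Callable[[], bytes],                     # stands in for WASAPI capture
    detect_speech: Callable[[bytes], Optional[bytes]],  # stands in for Silero VAD
    engine: SpeechEngine,                             # Whisper or Gemma 4, chosen in Settings
    inject: Callable[[str], None],                    # stands in for text injection
) -> str:
    """Run one hotkey press through the pipeline and return the injected text."""
    audio = capture()
    speech = detect_speech(audio)
    if not speech:
        # The VAD stage found no speech, so the engine is never called.
        return ""
    text = engine.transcribe(speech)
    inject(text)
    return text
```

Under this assumption, adding Gemma 4 means writing one more class with a `transcribe` method and passing it in; the other three stages are untouched.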

The interesting part, and what this post is mostly about, is which Gemma 4 variant to ship. The ggml-org GGUF repo publishes five variants: E2B and E4B in BF16, Q4_K_M, and Q8_0, minus the one size/quantization combination the repo skips. The model card does not tell you which combination of accuracy, speed, and disk footprint you will actually get. So I ran each one on the same dataset, picked a default, and shipped.
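As a rough sketch of what "run each one, then pick a default" can look like, the Python below enumerates the size and quantization grid and applies one possible selection rule. The rule (most accurate variant that fits a latency and disk budget) is an assumption for illustration, not necessarily the rule used for Parlotype. `Result`, `list_variants`, and `pick_default` are hypothetical names, and which combination the repo skips has to be supplied by the caller.

```python
from dataclasses import dataclass
from itertools import product

SIZES = ("E2B", "E4B")
QUANTS = ("BF16", "Q4_K_M", "Q8_0")


def list_variants(skipped=frozenset()):
    """Return every (size, quant) pair except the ones in `skipped`.

    `skipped` is a set of (size, quant) pairs the repo does not publish.
    """
    return [(s, q) for s, q in product(SIZES, QUANTS) if (s, q) not in skipped]


@dataclass(frozen=True)
class Result:
    size: str
    quant: str
    wer: float      # word error rate on the shared dataset, lower is better
    seconds: float  # wall-clock transcription time per clip
    disk_gb: float  # size of the GGUF file on disk


def pick_default(results, max_seconds, max_disk_gb):
    """Assumed rule: the lowest-WER variant that fits both budgets, or None."""
    eligible = [
        r for r in results
        if r.seconds <= max_seconds and r.disk_gb <= max_disk_gb
    ]
    if not eligible:
        return None
    # Ties on WER are broken by speed, then by disk footprint.
    return min(eligible, key=lambda r: (r.wer, r.seconds, r.disk_gb))
```

Measuring every variant on the same clips is what makes the `wer` and `seconds` fields comparable; the budgets then encode what a desktop user will tolerate.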
