I run my own smart home: Home Assistant, a voice assistant pipeline, the whole self-hosted thing. The speech-to-text step (Parakeet TDT 0.6B v3 over the Wyoming protocol) had been running for months on my Intel NUC (i3-1220P) with a 12 GB RTX 3060 eGPU. I recently upgraded my home server to a full desktop with an AMD 7900 XTX, and since I want to save as much of the VRAM as I can for LLMs, I've been running NVIDIA's Parakeet on the CPU ever since.

It works fine, but one thing kept nagging me: my new home server has an Intel Core Ultra 7 265K (Arrow Lake) with a built-in "AI Boost" NPU, and that silicon was sitting completely idle.

With all the AI hype, chip manufacturers have started slapping NPUs on their chips, mostly so they can put "AI" in the product name. Little to no software actually makes use of them, although a few projects are starting to pop up here and there.

So I decided to see if I could actually put that stupidly underused chunk of silicon to work on a workload that should, on paper, be ideal for it.

And it worked remarkably well, but the road was bumpy.