Part 1 of a 4-part series. Full fine-tuning a tiny Gemma 3 model for intent classification — the generative framing, the loss-masking trick, and why full FT is so learning-rate sensitive.

Part 1 of a 4-part series. Full fine-tuning a tiny Gemma 3 model for intent classification — the generative framing, the loss-masking trick, and why full FT is so learning-rate…

Part 2 of a 4-part series. How LoRA works (the low-rank trick), a working PEFT config, and three real GPU walls I hit — the FP16 unscale error, an OOM, and a 2-GPU speed mystery.

Part 4 (finale) of a 4-part series. Three model sizes tied on the same task — so when does bigger actually earn its keep? And the bug no model size could fix.