I Fine-Tuned a 270M Model on My Laptop (Full Fine-Tuning, From Scratch)

Part 1 of a 4-part series. Full fine-tuning a tiny Gemma 3 model for intent classification — the generative framing, the loss-masking trick, and why full FT is so learning-rate sensitive.

Sunday, 21 June 2026

TL;DR

The author fine-tuned Gemma 3 (270M parameters) on Banking77 intent classification, reaching ~96% accuracy with full fine-tuning on a laptop. Full fine-tuning updates all weights (4× memory cost) and is fragile: a learning rate of 5e-5 trains stably, while 2e-4 crashes. That makes it the expensive baseline to benchmark LoRA and QLoRA against.
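The summary does not say what the 4× is measured against. One plausible reading, sketched below as back-of-the-envelope arithmetic, is that an Adam-style optimizer keeps a gradient plus two moment estimates for every weight, so the training state comes to about four times the size of the weights alone. The fp32 precision, the choice of optimizer, and the decision to ignore activation memory are all assumptions made for illustration, not details taken from the article; `full_ft_memory_bytes` is a hypothetical helper.

```python
# Rough memory estimate for full fine-tuning with an Adam-style optimizer.
# Assumptions (not from the article): everything stored in fp32,
# activation memory and framework overhead ignored.

BYTES_PER_FP32 = 4


def full_ft_memory_bytes(n_params: int) -> dict:
    """Return a per-component byte count for the training state."""
    weights = n_params * BYTES_PER_FP32       # the model itself
    gradients = n_params * BYTES_PER_FP32     # one gradient per weight
    adam_first = n_params * BYTES_PER_FP32    # Adam first-moment estimate
    adam_second = n_params * BYTES_PER_FP32   # Adam second-moment estimate
    return {
        "weights": weights,
        "gradients": gradients,
        "adam_first_moment": adam_first,
        "adam_second_moment": adam_second,
        "total": weights + gradients + adam_first + adam_second,
    }


estimate = full_ft_memory_bytes(270_000_000)  # 270M parameters
for name, n_bytes in estimate.items():
    print(f"{name}: {n_bytes / 1e9:.2f} GB")
```

Under these assumptions the weights take about 1.08 GB and the full training state about 4.32 GB, which is small enough for a laptop but scales linearly with parameter count.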

513 words · ~2 min read

Series — Fine-Tuning, Smallest to Largest (same task, three techniques, smallest model to largest):

Full Fine-Tuning (270M) ← you are here

LoRA (1.5B)

QLoRA (7B)

If the small model worked, why go bigger?

Related reading

If a 270M Model Already Worked, Why Did I Fine-Tune a 7B One?

LoRA: I Trained <1% of a 1.5B Model and Matched a Full Fine-Tune

Beyond LoRA: Can you beat the most popular fine-tuning technique?

Fine-Tuning Platform Upgrades: Larger Models, Longer Contexts, Enhanced Hugging…

QLoRA: Fine-Tuning a 7B Model on a 16GB GPU (It Shrank to 5.4GB in Front of Me)

Fine-Tuning Small Open-Source LLMs to Outperform Large Closed-Source Models by…
