What Happened This Week
Week 3 produced a working fine-tuned model: one epoch, one dataset, a clear improvement over the base model. Week 4 was supposed to make it better with more data (a second dataset), two epochs, and a cleaner setup.
The eval loss dropped from 2.495 to 2.275. By that number alone, Week 4 was a success.
The model was worse.
This is the story of how a better loss number hid a serious regression, how I diagnosed it, and what it took to actually fix it. It is one of the most useful things I have learned in this project.









