A community thread on r/learnmachinelearning landed on a sharp claim this week: 20% of ML theory handles 80% of production work. The post, written by a data scientist six months into an engineering role, named the algorithms (logistic regression, gradient-boosted trees, transformers) and the shipping skills (Docker, SQL, data validation). It left the theory itself implicit. The four classical concepts below are the ones production reliably tests; the rest is what reliably falls away.
Bias-variance, but as a deployment forecast
Bias-variance is taught as a U-curve and a training-set anecdote. In production it shows up earlier, as a forecast of whether a model will quietly degrade between offline metrics and live traffic. High-variance fits look brilliant on a held-out set and embarrass themselves on the long tail; high-bias fits look mediocre offline and stay mediocre live. The framework earns its keep because it answers the question every team asks in week three ("training looked fine, deployment didn't, why?") without inventing new vocabulary for the diagnosis.
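The week-three diagnosis can be sketched as a comparison of two error rates. This is a minimal illustration, not a method from the thread: the function name `diagnose_fit`, the `acceptable_error` target, and the `gap_tolerance` threshold are all hypothetical, and the held-out error could just as well come from a sample of live traffic.

```python
def diagnose_fit(train_error, holdout_error, acceptable_error, gap_tolerance=0.02):
    """Label a fit from two error rates (lower is better).

    All thresholds are illustrative assumptions; pick values that
    match the metric and the product requirement.
    """
    # A large gap between training and held-out error suggests variance.
    gap = holdout_error - train_error
    overfit = gap > gap_tolerance
    # Training error above the target suggests bias: the model cannot
    # fit even the data it has already seen.
    underfit = train_error > acceptable_error

    if overfit and underfit:
        return "high bias and high variance"
    if overfit:
        return "high variance"
    if underfit:
        return "high bias"
    return "balanced"


# Looks brilliant in training, degrades on unseen data.
print(diagnose_fit(0.02, 0.15, acceptable_error=0.10))
# Mediocre in training, equally mediocre on unseen data.
print(diagnose_fit(0.22, 0.23, acceptable_error=0.10))
```

The point of the sketch is that the same two numbers most teams already log are enough to tell the two failure modes apart before anyone reaches for a new explanation.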
Why regularization is a data-budget question
The textbook frames regularization as a way to discourage large weights. The production frame is cheaper: regularization is the lever for "how much data does this model have, really, after the duplicates and the leakage are gone?" Stronger L2, higher dropout, and smaller learning rates are the same answer to the same problem: the effective dataset is smaller than the row count suggests. Tuning regularization without first auditing data quality is how teams burn a week chasing a number that data cleaning would have moved more.
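A rough way to audit the data budget before touching a regularization knob is to count what survives deduplication and a leakage check. The sketch below assumes rows are hashable tuples and treats "leakage" narrowly as training rows that also appear verbatim in the test set; `effective_row_count` is a hypothetical helper, and real audits usually need fuzzier matching than exact equality.

```python
def effective_row_count(train_rows, test_rows):
    """Summarize how many training rows remain after exact-duplicate
    removal and after dropping rows that also appear in the test set.

    Rows must be hashable (for example, tuples of column values).
    """
    unique_train = set(train_rows)
    # Training rows that leak into evaluation inflate offline metrics.
    leaked = unique_train & set(test_rows)
    return {
        "raw": len(train_rows),
        "unique": len(unique_train),
        "leaked": len(leaked),
        "effective": len(unique_train - leaked),
    }


train = [(1, "a"), (1, "a"), (2, "b"), (3, "c"), (3, "c")]
test = [(3, "c"), (4, "d")]
print(effective_row_count(train, test))
```

If the "effective" count is far below the "raw" count, that gap is the more likely reason the model needs heavy regularization, and cleaning the data is usually the cheaper fix to try first.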












