Back to Articles
When a model’s training history is moved close enough to its deployment task, parameter count stops being the decisive variable. A 3-billion-parameter specialized model outperformed every commercial frontier API tested in a well-measured enterprise domain — at roughly fifty times lower cost. The Strategic Default What the Empirical Record Actually Shows The Variable That Mattered Specialization Compounds The Strategic Questions That Change A Bounded Reframe Sources:
When a model’s training history is moved close enough to its deployment task, parameter count stops being the decisive variable. A 3-billion-parameter specialized model outperformed every commercial frontier API tested in a well-measured enterprise domain — at roughly fifty times lower cost.
In April, we released DharmaOCR — a pair of specialized small language models for structured OCR, alongside a benchmark and the accompanying paper. The models and the benchmark are available on Hugging Face. Together they form part of a broader effort at Dharma to study how specialization, alignment, and inference economics interact in production AI systems.
This article isolates one strategic implication from those findings: the relationship between specialization, distributional alignment, and parameter scale. What follows develops it within the boundaries the paper supports.








