The 55.8 Percent Productivity Number From Doshi And Vaishnav Is Narrower Than People Think

When Doshi and Vaishnav published their controlled experiment on AI code completion in Science (2023), the headline that propagated everywhere was "55.8% faster." Repeat it enough and it becomes received wisdom.

The actual paper measured time-to-completion on a single well-defined HTTP server task. A problem with a known shape, a stable target, and a scoring function that rewarded a specific solution path. The 55.8% lift was real for that task. It is also the narrowest possible reading of what "AI productivity" means in software work.

A more careful follow-up at HICSS-59 (Stray et al., 2026) looked at sustained workflow integration over weeks instead of a single benchmarked task. Numbers compressed. Across mixed work (greenfield, debugging, refactoring, code review) aggregate time savings landed closer to 10-20%, with high variance across task class. Debugging and code review barely moved. Greenfield CRUD work moved the most.

That gap between single-task lab benchmark and integrated weekly workflow is where most engineering org AI productivity decisions are silently going wrong.

The mechanism gap

The 55.8 Percent Productivity Number From Doshi And Vaishnav Is Narrower Than People Think

Other newsrooms on this story

Related reading

Frontier Radar #2: Why AI productivity gets lost between benchmarks and the…

Other newsrooms on this story

Related reading

Frontier Radar #2: Why AI productivity gets lost between benchmarks and the…

Zoho founder Sridhar Vembu flags AI coding ‘productivity paradox’; says…

Stop Measuring AI Agents by How Much Code They Produce

Best AI Agents for Software Development Ranked: A Benchmark-Driven Look at the…

Who Will Solve the AI Productivity Puzzle?

Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer…