Two Layers Your AI-SDLC Metrics Are Missing

Originally published at devopsdiary.blog. Post F2 in the "Governing AI in the Enterprise" series.

DORA worked because shipping software was a pretty stable thing to measure. You changed code, you deployed it, you watched whether prod fell over. The four metrics (deployment frequency, lead time for changes, change failure rate, time to restore service) held up for a decade because the underlying activity didn't change much underneath them.

Then Copilot showed up. And Cursor. And whatever your team is piloting this quarter that nobody told platform engineering about.

The activity changed. The metrics didn't. That's the gap.

I keep landing in the same place on this. You need two layers. Most teams have neither. One is an evaluation layer that watches the AI itself. The other is a governance layer that decides what the evaluation results mean. Skip either one and you end up with dashboards that look healthy while the work underneath them quietly drifts.
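To make the split concrete, here is a minimal sketch of how the two layers could hand off to each other. Everything in it is an assumption for illustration: the names (`EvalResult`, `Policy`, `govern`), the two measurements (acceptance rate and defect rate), and the threshold values are hypothetical, not something taken from a specific tool or from DORA.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REVIEW = "review"
    BLOCK = "block"


@dataclass(frozen=True)
class EvalResult:
    """Evaluation layer output: what was observed about the AI tool itself."""
    tool: str               # hypothetical tool identifier
    acceptance_rate: float  # share of AI suggestions accepted, 0..1 (assumed metric)
    defect_rate: float      # share of accepted AI changes later reverted or fixed, 0..1 (assumed metric)


@dataclass(frozen=True)
class Policy:
    """Governance layer input: thresholds that give the numbers a meaning.

    The default values are illustrative placeholders, not recommendations.
    """
    review_defect_rate: float = 0.05
    block_defect_rate: float = 0.15


def govern(result: EvalResult, policy: Policy) -> Decision:
    """Governance layer: turn an evaluation result into a decision."""
    if result.defect_rate >= policy.block_defect_rate:
        return Decision.BLOCK
    if result.defect_rate >= policy.review_defect_rate:
        return Decision.REVIEW
    return Decision.ALLOW


if __name__ == "__main__":
    # High acceptance looks healthy on a dashboard; the defect rate says otherwise.
    drifting = EvalResult(tool="assistant-a", acceptance_rate=0.80, defect_rate=0.09)
    print(govern(drifting, Policy()).value)
```

The point of keeping `EvalResult` and `Policy` separate is that the evaluation layer only reports what it saw, while the thresholds live somewhere a team can review and change them without touching the measurement code.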

