Most developers using Claude Code have no idea whether they're using it well. You can feel productive, but productive and effective aren't the same thing. You might be over-steering in every session, delegating the wrong kinds of tasks, or letting Claude run without meaningful oversight. The problem is that there has been no way to measure any of this.

Until Anthropic published the data.

In late 2025, Anthropic released "How AI Is Transforming Work at Anthropic", a study drawing on a survey of 132 engineers, 53 interviews, and 200,000 Claude Code session transcripts spanning February to August 2025. It's one of the most concrete datasets on what high-quality AI collaboration looks like in practice.

I used that data to build a free tool: Claude Code Session Analyzer. Upload your .jsonl session files, pick an AI provider, and get a behavioral score across six dimensions, each benchmarked directly against the Anthropic engineering cohort.

This post explains the methodology, how the scoring works technically, and what the numbers actually mean.