You open a pull request. Thirty seconds later, an AI reviewer drops a comment: "Looks good to me. No issues found."

You feel a tiny chemical reward. Approval. Speed. You're one click closer to merging. The cognitive cost of waiting for a human reviewer just got compressed into half a minute, and the diff you spent two hours wrestling with is now blessed by a system that has read more code than any single engineer alive.

Then, a week later, the same code ships a subtle authorization bug to production, and you're staring at an incident channel wondering how that "Looks good to me" ever passed for a real review.

This is the question that follows every AI code reviewer around like a shadow. Is it a helpful assistant - a faster, calmer, more patient version of the senior engineer who used to leave you twelve comments before lunch? Or is it a false confidence machine - a tool that says reassuring things about code it doesn't fully understand, and convinces you to merge anyway?

The honest answer is both, and which one you get depends almost entirely on how you set up the workflow around it. Let's break down the angles that actually matter - what these reviewers catch well, where they fail on correctness, where they fail on security, where they make things up, and how to design a review process that gets the upside without buying the downside.