Detecting API anomalies behind a 200 OK — with statistics, not AI

Monday, June 15, 2026

784 words · ~4 min read

Most uptime monitors answer one question: is it up or down? But some of the worst incidents I've dealt with returned a perfectly happy 200 OK:

an endpoint that started serving a cached error page

a JSON API returning {"error": ...} with status 200

a response that quietly got 10× slower

a payload that dropped from 14 KB to 800 bytes because a backend started returning empty results

A plain up/down check sails straight past all of these. I wanted my monitor to notice "it's up, but it's wrong." Here's how I built that, and why I deliberately didn't reach for machine learning (or the word "AI").
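To make the four failure modes above concrete, here is a minimal sketch of what an "up, but wrong" check could look like with plain statistics. It is an illustration under my own assumptions, not necessarily the implementation this post goes on to describe: the function names (`zscore`, `check_response`), the z-score approach, the threshold of 3 standard deviations, and the idea of keeping recent latencies and payload sizes in simple lists are all hypothetical choices.

```python
import json
import statistics


def zscore(value, history):
    """Standard score of value against history.

    Returns 0.0 when the history is too short or has no spread, so a
    brand-new or perfectly flat baseline never raises an alarm here
    (an assumption of this sketch; a real monitor may want a fallback).
    """
    if len(history) < 2:
        return 0.0
    stdev = statistics.stdev(history)
    if stdev == 0:
        return 0.0
    return (value - statistics.fmean(history)) / stdev


def check_response(status, body, latency_ms, latency_history, size_history,
                   threshold=3.0):
    """Return a list of reasons a response looks wrong, even with a 200."""
    reasons = []
    if status != 200:
        reasons.append(f"status {status}")

    # A JSON API that reports an error while still answering 200.
    try:
        parsed = json.loads(body)
        if isinstance(parsed, dict) and "error" in parsed:
            reasons.append("error key in JSON body")
    except ValueError:
        pass  # body is not JSON; nothing to inspect here

    # Latency: only a slowdown is suspicious, so the test is one-sided.
    if zscore(latency_ms, latency_history) > threshold:
        reasons.append("latency far above baseline")

    # Payload size: both shrinking and ballooning are suspicious.
    size = len(body.encode("utf-8"))
    if abs(zscore(size, size_history)) > threshold:
        reasons.append("payload size far from baseline")

    return reasons
```

Called with a baseline of roughly 100 ms and 14 KB, a 200 response carrying `{"items": []}` after a full second would be flagged for both latency and size, while a response close to the baseline returns an empty list. Each check is a couple of lines of arithmetic, so when an alert fires you can say exactly why.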

Related reading

HTTP 200 Is a Lie: A 30-Line Schema Canary for Source Drift

Why Your AI Agent Monitoring is Wrong (And How to Fix It)

Why your uptime monitor says everything's fine while users see a white screen

When APIs Lie: A Lesson in Defensive Debugging

A Link Can Be Up and Still Be Wrong

I Monitored 10,000 AI API Calls. Here's What Went Wrong.
