The Node.js bug that's invisible to your monitoring

Your health check is a single line. res.send('ok'). It used to take a millisecond. Then traffic ramped up one afternoon and p99 went to 400ms, and you spent the next three hours staring at dashboards that all said the same thing, which was nothing.

CPU is moderate, event loop lag is flat, memory looks healthy, and your APM is reporting that the request took 400ms while telling you nothing about why. No slow database spans, no slow downstream calls, no errors or GC pauses. The time was spent in a place your APM can't see.

The place is the libuv thread pool. Standard Node observability is built around the event loop, and the pool is a different queue with different occupants, sitting just out of reach of every dashboard you have.

What lives on the pool

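Node's event loop runs your JavaScript. Some built-in work would block the loop if it ran there, either because it's CPU-heavy (crypto functions such as pbkdf2 and scrypt, zlib compression) or because it's blocking I/O on a syscall that has no real async kernel variant (file system calls, dns.lookup). Node pushes that work to a separate pool of OS threads inside libuv. That pool defaults to four threads. Four, total, for the whole process, regardless of how many cores the machine has, unless you change it with the UV_THREADPOOL_SIZE environment variable.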
