How I Built a Real-Time Fraud Detection System That Handles 71,000 RPS at p95 <6ms

A deep dive into building Sentinel — an ML inference pipeline that processes 7.8M requests with zero errors, using XGBoost, ONNX, and Go.

The Problem

Fraud detection is a classic hard problem in systems design. You need to:

Classify transactions in real time: users can't wait 100ms for a payment to go through