I Built a Fraud Detection System That Catches 99.76% of Fraud — Here's Everything I Learned

There is a number that haunts every fraud detection engineer: 0.13%.

That is the fraud rate in the PaySim dataset: 8,213 fraudulent transactions buried among 6,362,620 transactions in total. It sounds small. It is not. At that ratio, a model that predicts "legitimate" for every single transaction achieves 99.87% accuracy and catches exactly zero fraud.
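To make that arithmetic concrete, here is a minimal sketch of the "always legitimate" baseline. It uses only the two counts quoted above and assumes 6,362,620 is the total number of rows, which is what the 0.13% and 99.87% figures imply. The function name is hypothetical and not part of TrustGuard AI.

```python
# Counts quoted in this article for PaySim.
# Assumption: TOTAL is the full row count, fraud included.
TOTAL = 6_362_620
FRAUD = 8_213


def always_legitimate_metrics(total: int, fraud: int) -> dict:
    """Metrics for a baseline that labels every transaction as legitimate."""
    true_positives = 0               # the baseline never flags anything
    false_negatives = fraud          # so every fraudulent row is missed
    true_negatives = total - fraud   # and every legitimate row is "correct"

    accuracy = (true_positives + true_negatives) / total
    recall = true_positives / (true_positives + false_negatives)
    return {
        "fraud_rate": fraud / total,
        "accuracy": accuracy,
        "recall": recall,
    }


metrics = always_legitimate_metrics(TOTAL, FRAUD)
print(f"fraud rate: {metrics['fraud_rate']:.2%}")
print(f"accuracy:   {metrics['accuracy']:.2%}")
print(f"recall:     {metrics['recall']:.2%}")
```

Accuracy rewards the baseline for the 99.87% of rows that were never fraud in the first place, while recall, the share of actual fraud that gets caught, sits at zero.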

This is the problem I set out to solve with TrustGuard AI, a course project that turned into one of the most technically demanding things I have built. By the end of it, our deployed XGBoost model achieved an AUC-ROC of 0.9995 and a recall of 0.9976, meaning it catches 99.76% of all fraud on a 6.3-million-row test set. It also explains every single prediction using SHAP, and grounds each fraud alert in real State Bank of Pakistan regulatory documents through a RAG pipeline.

This article is the full story — what worked, what broke, and why accuracy is the wrong metric for fraud detection.

The Problem With Accuracy
