I'm starting an Electronics and Communication Engineering degree this year. A few weeks before classes began, I decided to build something real instead of waiting for a syllabus to tell me what to learn: a model that detects abnormal heartbeats from raw ECG signals and is small enough to run on a microcontroller rather than a cloud GPU.

The first version of this project hit up to 98% accuracy. That number was almost meaningless, and it took me two separate rounds of being wrong to find out why.

The number that looked great

The task is beat classification on the MIT-BIH Arrhythmia Database, a public dataset of annotated heartbeats used across decades of cardiology ML research. Each heartbeat gets sorted into one of five categories defined by the AAMI standard: Normal, Supraventricular ectopic, Ventricular ectopic, Fusion, and Unclassifiable.
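To make the five categories concrete, here is a minimal sketch of the grouping usually cited for turning MIT-BIH beat annotation symbols into AAMI classes (N, S, V, F, Q). The names `AAMI_GROUPS`, `SYMBOL_TO_AAMI`, and `to_aami` are illustrative, and returning `None` for symbols that are not beat labels is an assumption of this sketch rather than part of the standard.

```python
# Commonly cited grouping of MIT-BIH beat annotation symbols into AAMI classes.
# The dictionary and function names are illustrative, not from any library.
AAMI_GROUPS = {
    "N": ["N", "L", "R", "e", "j"],  # Normal, bundle branch block, escape beats
    "S": ["A", "a", "J", "S"],       # Supraventricular ectopic beats
    "V": ["V", "E"],                 # Ventricular ectopic beats
    "F": ["F"],                      # Fusion of ventricular and normal beats
    "Q": ["/", "f", "Q"],            # Paced, paced fusion, unclassifiable beats
}

# Invert the grouping so each symbol looks up its class directly.
SYMBOL_TO_AAMI = {
    symbol: aami_class
    for aami_class, symbols in AAMI_GROUPS.items()
    for symbol in symbols
}


def to_aami(symbol):
    """Return the AAMI class letter for a beat symbol.

    Assumption: symbols outside the table (rhythm changes, noise
    markers, and so on) are not beats, so they map to None.
    """
    return SYMBOL_TO_AAMI.get(symbol)
```

With this table, `to_aami("L")` gives `"N"` because a left bundle branch block beat counts as Normal under AAMI, while a non-beat marker such as `"+"` gives `None` and can be dropped before training.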

My first pipeline extracted twenty features from each heartbeat's waveform, covering amplitude, statistical moments, and frequency content. I then trained three tree-based models at different size tiers, the smallest of which could be exported to plain C for a microcontroller with a few kilobytes of flash. I evaluated with five-fold cross-validation and got 96–98% accuracy depending on model size.
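As a rough illustration of what per-beat features in those three families can look like, here is a standard-library sketch. It is not the actual twenty-feature set: the function name `beat_features`, the six features chosen, and the idea of passing one beat as a plain list of samples are all assumptions. The default `fs=360.0` matches the MIT-BIH sampling rate.

```python
import cmath
import math
import statistics


def beat_features(beat, fs=360.0):
    """Compute a few illustrative features for one heartbeat window.

    beat: non-empty list of float samples for a single beat (hypothetical input format).
    fs:   sampling rate in Hz.
    """
    n = len(beat)
    mean = statistics.fmean(beat)
    std = statistics.pstdev(beat)
    centered = [x - mean for x in beat]

    # Standardized third and fourth moments; defined as 0 for a flat window.
    if std > 0:
        skewness = sum(c ** 3 for c in centered) / (n * std ** 3)
        kurtosis = sum(c ** 4 for c in centered) / (n * std ** 4)
    else:
        skewness = kurtosis = 0.0

    # Naive O(n^2) DFT over the positive-frequency bins (DC excluded).
    # Fine for a short beat window; a real pipeline would use an FFT.
    power = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            c * cmath.exp(-2j * math.pi * k * i / n)
            for i, c in enumerate(centered)
        )
        power.append(abs(coeff) ** 2)

    # Bin index of the strongest component, or 0 if there is no variation.
    if power and std > 0:
        peak_bin = 1 + max(range(len(power)), key=power.__getitem__)
    else:
        peak_bin = 0

    return {
        "peak_to_peak": max(beat) - min(beat),  # amplitude
        "mean": mean,
        "std": std,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "dominant_freq_hz": peak_bin * fs / n,  # frequency content
    }
```

Each beat becomes a short fixed-length row of numbers, which is the shape of input that tree-based models work with and part of why they can stay small enough for a microcontroller.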