Deep dive into Android Runtime (ART) memory management during ML inference: using profile-guided compilation hints, large object space pinning, and region-based allocation to eliminate the GC stalls that cause frame drops when running on-device models. Covers RegionSpace tuning, concurrent copying (CC) collector behavior during tensor allocation bursts, and JNI boundary strategies that keep native inference buffers from adding to managed heap pressure.
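
As one illustration of the JNI boundary idea, here is a minimal sketch of reusing direct `ByteBuffer`s for tensor data, so that per-frame inference does not allocate fresh managed arrays. The class name `TensorBufferPool` and its methods are hypothetical and not taken from any library; the sketch assumes the native side would read the backing memory through the standard JNI call `GetDirectBufferAddress`, and it is not a statement of how any particular inference runtime manages its buffers.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayDeque;

/** Hypothetical pool of fixed-size direct buffers, allocated once and reused. */
final class TensorBufferPool {
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();
    private final int capacityBytes;

    TensorBufferPool(int capacityBytes, int count) {
        this.capacityBytes = capacityBytes;
        // Allocate up front, outside the per-frame path.
        for (int i = 0; i < count; i++) {
            free.push(newBuffer());
        }
    }

    private ByteBuffer newBuffer() {
        // Direct buffers keep their payload outside the managed heap;
        // only a small wrapper object is visible to the collector.
        return ByteBuffer.allocateDirect(capacityBytes).order(ByteOrder.nativeOrder());
    }

    /** Borrows a buffer; falls back to a fresh allocation if the pool is empty. */
    synchronized ByteBuffer acquire() {
        ByteBuffer buffer = free.poll();
        if (buffer == null) {
            buffer = newBuffer();
        }
        buffer.clear(); // reset position and limit; contents are not zeroed
        return buffer;
    }

    /** Returns a buffer to the pool; buffers of the wrong kind or size are dropped. */
    synchronized void release(ByteBuffer buffer) {
        if (buffer.isDirect() && buffer.capacity() == capacityBytes) {
            free.push(buffer);
        }
    }

    synchronized int available() {
        return free.size();
    }
}
```

A caller would `acquire()` a buffer before each inference, pass it across JNI, and `release()` it afterwards, so a steady-state frame loop performs no tensor-sized allocations on the managed side.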