Speculative decoding: how a small draft model plus a single-pass target verification cuts TTFT, the variants that actually work, and the gotchas that bite you in production.