DeepSeek's new DSpark framework boosts per-user response speed by 60 to 85 percent. A small model proposes token candidates that the larger model checks in batches, squeezing more performance out of fewer chips. That could further reduce China's dependence on US high-end hardware.
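To make the propose-and-check idea concrete, here is a minimal sketch of generic greedy speculative decoding. It is not DSpark's code: the function names (`speculative_step`, `speculative_generate`) and the two toy "models" are assumptions made up for illustration, and the verification loop calls the large model once per position where a real system would score all drafted positions in a single batched forward pass.

```python
from typing import Callable, List, Tuple

Token = int
# A "model" here is just a greedy next-token function (hypothetical stand-in).
Model = Callable[[List[Token]], Token]


def toy_target(ctx: List[Token]) -> Token:
    """Stand-in for the large model: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    """Stand-in for the small model: agrees with the target except after a 4."""
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


def greedy_decode(prefix: List[Token], model: Model, max_new: int) -> List[Token]:
    """Baseline: one large-model call per generated token."""
    out = list(prefix)
    for _ in range(max_new):
        out.append(model(out))
    return out


def speculative_step(prefix: List[Token], draft: Model, target: Model, k: int) -> List[Token]:
    """Draft k tokens, then keep the longest run the target agrees with."""
    proposed: List[Token] = []
    ctx = list(prefix)
    for _ in range(k):
        tok = draft(ctx)
        proposed.append(tok)
        ctx.append(tok)

    accepted: List[Token] = []
    ctx = list(prefix)
    for tok in proposed:
        expected = target(ctx)  # batched into one pass in a real system
        if expected != tok:
            accepted.append(expected)  # first mismatch: use the target's token
            return accepted
        accepted.append(tok)
        ctx.append(tok)
    accepted.append(target(ctx))  # all drafts accepted: one bonus token
    return accepted


def speculative_generate(
    prefix: List[Token], draft: Model, target: Model, k: int, max_new: int
) -> Tuple[List[Token], int]:
    """Return the generated sequence and the number of verification rounds."""
    out = list(prefix)
    rounds = 0
    while len(out) - len(prefix) < max_new:
        out.extend(speculative_step(out, draft, target, k))
        rounds += 1
    return out[: len(prefix) + max_new], rounds


if __name__ == "__main__":
    tokens, rounds = speculative_generate([0], toy_draft, toy_target, k=3, max_new=8)
    print(tokens, rounds)
```

In this toy setup the speculative output matches what the large model would produce on its own, but it needs fewer verification rounds than tokens generated. The size of that saving depends on how often the small model's guesses are accepted.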

DeepSeek releases DSpark, an open-source speculative decoding framework that accelerates DeepSeek-V4 per-user generation by 57–85% over MTP-1

DeepSeek's new DSpark framework delivers 60% to 85% faster inference for its V4 models through speculative decoding, with throughput gains up to