Part 1 introduces scheduling algorithms for foundation model training over heterogeneous networks, achieving throughput close to data-center levels on connections 100x slower.

Part 2 explores activation compression techniques for decentralized training over slow networks, reducing communication overhead while maintaining model quality.