Posted September 8, 2025 by nevillelyh and gandalfhz

We now cache torch.compile artifacts to reduce boot times for models that use PyTorch.

Models like black-forest-labs/flux-kontext-dev, prunaai/flux-schnell, and prunaai/flux.1-dev-lora now start 2-3x faster.

We’ve published a guide to improving model performance with torch.compile that covers more of the details.

What is torch.compile?

Many models, particularly those in the FLUX family, use a variety of torch.compile techniques to improve inference speed.
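As a rough illustration of what this looks like in code, here is a minimal sketch of wrapping a PyTorch module with torch.compile. The `TinyNet` module and the `BACKEND` constant are hypothetical names made up for this example and do not come from any of the models mentioned above. The sketch uses the `"eager"` backend so that it runs without a compiler toolchain; that backend only captures the graph and is not expected to be faster. Speedups would typically come from the default `"inductor"` backend.

```python
import torch


class TinyNet(torch.nn.Module):
    """Hypothetical stand-in for a real model; used only for illustration."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 4)

    def forward(self, x):
        return torch.relu(self.linear(x))


# Assumption: "eager" is chosen here so the sketch runs without a C++/Triton
# toolchain. Swap in "inductor" (the default backend) to generate optimized code.
BACKEND = "eager"

model = TinyNet().eval()

# torch.compile returns a wrapper around the module; nothing is compiled yet.
compiled_model = torch.compile(model, backend=BACKEND)

x = torch.randn(2, 8)
with torch.no_grad():
    expected = model(x)          # plain eager execution
    actual = compiled_model(x)   # the first call triggers tracing and compilation
    again = compiled_model(x)    # later calls with the same input shape reuse that work
```

The first call to the compiled module is the slow one, because tracing and compilation happen there. That one-time cost at startup is the kind of work that caching compile artifacts aims to avoid repeating on every boot.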