Posted September 8, 2025 by nevillelyh, gandalfhz
We now cache torch.compile artifacts to reduce boot times for models that use PyTorch.
Models like black-forest-labs/flux-kontext-dev, prunaai/flux-schnell, and prunaai/flux.1-dev-lora now start 2-3x faster.
We’ve published a guide to improving model performance with torch.compile that covers more of the details.
What is torch.compile?
Many models, particularly those in the FLUX family, apply various torch.compile techniques to improve inference speed.
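As a rough illustration of what applying torch.compile looks like, here is a minimal sketch. TinyNet, build_compiled, and time_calls are hypothetical names made up for this example, and the tiny network only stands in for a real model; this is not how any of the models above are actually set up.

```python
import time

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for a real model (e.g. a diffusion transformer)."""

    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(64, 64),
            nn.GELU(),
            nn.Linear(64, 64),
        )

    def forward(self, x):
        return self.layers(x)


def build_compiled(backend="inductor"):
    """Return (eager_model, compiled_model) sharing the same weights.

    "inductor" is PyTorch's default backend and generates optimized kernels;
    it needs a working C++ compiler (CPU) or Triton (GPU). The "eager"
    backend only traces the model and is handy for smoke tests.
    """
    model = TinyNet().eval()
    # torch.compile returns immediately; the actual compilation is deferred
    # until the first call with real inputs.
    compiled = torch.compile(model, backend=backend)
    return model, compiled


def time_calls(fn, x, n_calls=3):
    """Return the wall-clock seconds taken by each of n_calls runs of fn(x)."""
    timings = []
    with torch.no_grad():
        for _ in range(n_calls):
            start = time.perf_counter()
            fn(x)
            timings.append(time.perf_counter() - start)
    return timings
```

Calling time_calls(compiled, torch.randn(8, 64)) on a compiled model typically shows a first call that is much slower than the rest, because that is when tracing and compilation happen. That one-time cost is paid again whenever a fresh process starts without previously compiled artifacts available, which is why it shows up in boot times.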
















