Toto 2.0: Time series forecasting enters the scaling era

Today we’re releasing Toto 2.0, a family of open-weights time series forecasting models, on Hugging Face. Spanning 4M to 2.5B parameters, Toto 2.0 is designed to answer a simple, open question: Can time series foundation models (TSFMs) improve as they scale? Our results show they can. The highlights:

Scaling that works. Every size improves on the one below it, with no sign of saturation at 2.5B.

Best in class on every benchmark we tested. Toto 2.0 takes the top spots on BOOM (Datadog’s observability forecasting benchmark), GIFT-Eval (the standard general-purpose benchmark), and TIME (a new contamination-resistant zero-shot benchmark).

A generational jump from Toto 1.0. Toto 2.0 is 7× more parameter-efficient at matching quality and dramatically faster at inference time.

Trained on observability and synthetic data, generalizes broadly. Toto 2.0 does not see any public forecasting data during pretraining, yet leads the field on general-purpose benchmarks.

CRPS rank vs. parameter count on BOOM (left) and GIFT-Eval (right) for top foundation models; lower is better. The Pareto frontier traces the best CRPS rank achievable at each parameter budget; points on or near it represent the best quality-for-size tradeoff available. Every Toto 2.0 size sits on or near the frontier on both benchmarks, and CRPS rank improves monotonically with model size across the family.
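To make the Pareto frontier in the figure concrete, here is a minimal sketch of how such a frontier can be computed from (parameter count, CRPS rank) pairs. The `ModelPoint` type, the `pareto_frontier` helper, the model labels, and every number below are hypothetical placeholders chosen for illustration; they are not benchmark results and not the code used to produce the figure.

```python
from typing import List, NamedTuple


class ModelPoint(NamedTuple):
    name: str         # hypothetical model label
    params: float     # parameter count
    crps_rank: float  # average CRPS rank; lower is better


def pareto_frontier(points: List[ModelPoint]) -> List[ModelPoint]:
    """Keep the models that no smaller-or-equal model matches or beats on rank."""
    # Sort by size; on a size tie, the better (lower) rank comes first.
    ordered = sorted(points, key=lambda p: (p.params, p.crps_rank))
    frontier = []
    best_rank = float("inf")
    for p in ordered:
        # A point joins the frontier only if it strictly improves on the
        # best rank seen so far among models of equal or smaller size.
        if p.crps_rank < best_rank:
            frontier.append(p)
            best_rank = p.crps_rank
    return frontier


# Illustrative, made-up values only.
points = [
    ModelPoint("model_a", 4e6, 9.0),
    ModelPoint("model_b", 2e7, 7.5),
    ModelPoint("model_c", 5e7, 8.0),   # dominated by model_b
    ModelPoint("model_d", 3e8, 5.0),
    ModelPoint("model_e", 1e9, 5.5),   # dominated by model_d
    ModelPoint("model_f", 2.5e9, 3.0),
]

frontier_names = [p.name for p in pareto_frontier(points)]
print(frontier_names)  # ['model_a', 'model_b', 'model_d', 'model_f']
```

In this toy data, a model that is larger than another yet ranks worse (such as `model_c`) falls off the frontier, which is the sense in which points on the frontier offer the best quality-for-size tradeoff.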

