New Study Explains Why Neural Networks Prefer “Flatter” Solutions

The paper adds to ongoing work in the field aimed at demystifying deep learning, suggesting that what once appeared to be unstable behaviour is a key ingredient in AI’s effectiveness.

Neural networks, core tools in modern AI for tasks from image recognition to language processing, are known for their ability to learn complex patterns, but their training dynamics remain poorly understood. This research contributes to a growing effort to uncover the mathematical principles behind their success. The study investigates the training instabilities that occur as neural networks learn and finds that they play a constructive role. Rather than being problematic, they guide models toward “flatter” regions of the loss landscape, which are associated with better performance on new data.

“What’s exciting here is that it turns an abstract observation about the orientation of dominant curvature into a concrete mechanism that helps explain the strong generalisation performance of neural networks.”

Lead author, Dr. Lawrence Wang

Central to the paper is a newly identified mechanism, Rotational Polarity of Eigenvectors. The concept describes how the dominant directions of curvature in the loss landscape rotate during training. During training instabilities, this rotation gives rise to a coupled dynamical system that captures the dynamics of learning and helps explain how gradient descent navigates the high-dimensional loss landscapes of modern deep learning models. This connection also helps account for the strong generalisation performance observed in modern deep neural networks despite their vast numbers of parameters.
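For readers who want a feel for the terms, the short Python sketch below is a toy illustration only, not the method used in the paper. It assumes a hypothetical two-parameter loss, 0.5 * (w0 * w1 - 1)^2, and the function names and the step size are likewise illustrative choices. It measures "sharpness" as the largest eigenvalue of the Hessian (the matrix of second derivatives of the loss) and shows that the matching eigenvector, the dominant direction of curvature, points a different way at two different minima of the same loss. It also applies the standard rule of thumb that, for a locally quadratic loss, plain gradient descent with step size lr is only stable where the sharpness is below 2 / lr.

```python
import numpy as np

def hessian(w):
    """Hessian of the toy loss 0.5 * (w0 * w1 - 1)**2 (an assumed example loss)."""
    w0, w1 = w
    off_diag = 2.0 * w0 * w1 - 1.0
    return np.array([[w1 ** 2, off_diag],
                     [off_diag, w0 ** 2]])

def top_curvature(w):
    """Return (sharpness, direction): the largest Hessian eigenvalue and its eigenvector."""
    eigvals, eigvecs = np.linalg.eigh(hessian(w))  # eigenvalues in ascending order
    return eigvals[-1], eigvecs[:, -1]

def angle_between_deg(u, v):
    """Angle in degrees between two directions, ignoring the sign of each vector."""
    cos = abs(np.dot(u, v)) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos, 0.0, 1.0)))

# Two minima of the same loss (both satisfy w0 * w1 = 1).
flat_minimum = np.array([1.0, 1.0])
sharp_minimum = np.array([2.0, 0.5])

sharp_flat, dir_flat = top_curvature(flat_minimum)
sharp_sharp, dir_sharp = top_curvature(sharp_minimum)
rotation_deg = angle_between_deg(dir_flat, dir_sharp)

# Quadratic-approximation stability rule for gradient descent: sharpness < 2 / lr.
lr = 0.6  # illustrative step size, chosen so that only the flatter minimum is stable
flat_is_stable = sharp_flat < 2.0 / lr
sharp_is_stable = sharp_sharp < 2.0 / lr

print(f"sharpness at flat minimum:  {sharp_flat:.2f} (stable: {flat_is_stable})")
print(f"sharpness at sharp minimum: {sharp_sharp:.2f} (stable: {sharp_is_stable})")
print(f"rotation of dominant curvature direction: {rotation_deg:.1f} degrees")
```

In this toy setting, the step size makes the sharper minimum unstable while the flatter one stays stable, which gives a rough intuition for how instability can push training toward flatter regions. The coupled dynamics described in the study are considerably richer than this two-parameter picture.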

