Summary

Dan Fu, our VP of Kernels, has published a new post challenging the idea that AI is hitting a hardware wall. He argues that we are vastly underutilizing current chips and that better software-hardware co-design will unlock the next order of magnitude in performance.

Is progress toward AGI hitting a wall?

In the fast-moving world of AI, there is a growing debate about whether we are approaching the “limits of digital computation.” Some recent analysis suggests that hardware constraints and stalled GPU progress might bottleneck the road to generally useful AI.

Dan Fu, who leads our kernels research team, offers a different, more optimistic perspective in his latest post: "Yes, AGI Can Happen – A Computational Perspective."

While acknowledging the real constraints we face, Dan argues that we are far from hitting a ceiling and that today’s AI systems are nowhere near their theoretical limits. In his deep dive, he breaks down the numbers to show where the "headroom" lies:

We are underutilizing current hardware: Today’s state-of-the-art training runs (like DeepSeek-V3 or Llama-4) often achieve only ~20% Model FLOPs Utilization (MFU), and inference utilization is often in the single digits. There is massive efficiency to be unlocked through better software-hardware co-design and innovations like FP4 training.

Models are a lagging indicator: The models we use today were trained on "old" hardware. The next generation of compute, meaning massive clusters of 100k+ latest-generation GPUs, hasn’t fully entered the equation yet.

Utility is already here: Even without future leaps, current models are already transforming complex workflows, such as writing high-performance GPU kernels with human-in-the-loop guidance.

If you are interested in the intersection of systems engineering, hardware efficiency, and the future of AI scaling, this is a must-read.

Read Dan’s full analysis here.
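
For readers who want to see what the ~20% MFU figure above measures, here is a minimal Python sketch. It is not taken from Dan's post. It assumes the common rule of thumb that training a dense transformer costs roughly 6 FLOPs per parameter per token, and every number in it (model size, throughput, GPU count, per-GPU peak) is a hypothetical value chosen for illustration, not a measurement of any real training run or chip.

```python
def model_flops_utilization(n_params, tokens_per_second, n_gpus, peak_flops_per_gpu):
    """Estimate MFU as achieved training FLOP/s divided by peak hardware FLOP/s.

    Assumption: a dense transformer needs about 6 * n_params FLOPs per
    training token (forward plus backward pass). This is a rule of thumb,
    not an exact count, and it ignores attention FLOPs.
    """
    achieved_flops_per_second = 6.0 * n_params * tokens_per_second
    peak_flops_per_second = n_gpus * peak_flops_per_gpu
    return achieved_flops_per_second / peak_flops_per_second


# Hypothetical inputs, chosen only to illustrate the arithmetic.
mfu = model_flops_utilization(
    n_params=70e9,             # 70B-parameter dense model (hypothetical)
    tokens_per_second=1.0e6,   # cluster-wide training throughput (hypothetical)
    n_gpus=2048,               # cluster size (hypothetical)
    peak_flops_per_gpu=1.0e15, # 1 PFLOP/s peak per GPU (illustrative, not a real spec)
)
print(f"MFU: {mfu:.1%}")
```

With these made-up inputs the cluster sustains about a fifth of its theoretical peak. Under the same assumptions, the remaining fraction is the headroom the post attributes to better software-hardware co-design.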