Musk has also apparently used the Grok chatbots as an automated extension of his trolling habits, showing examples of Grok 3 producing “based” opinions that criticized the media in February. In May, Grok on X began repeatedly generating outputs about white genocide in South Africa, and most recently, we’ve seen the Grok Nazi output debacle. It’s admittedly difficult to take Grok seriously as a technical product when it’s linked to so many examples of unserious and capricious applications of the technology.

Still, the technical achievements xAI claims for various Grok 4 models seem to stand out. The Arc Prize organization reported that Grok 4 Thinking (with simulated reasoning enabled) achieved a score of 15.9 percent on its ARC-AGI-2 test, which the organization says nearly doubles the previous commercial best and tops the current Kaggle competition leader.

“With respect to academic questions, Grok 4 is better than PhD level in every subject, no exceptions,” Musk claimed during the livestream. We’ve previously covered nebulous claims about “PhD-level” AI, finding them to be generally specious marketing talk.

Premium pricing amid controversy

During Wednesday’s livestream, xAI also announced plans for an AI coding model in August, a multi-modal agent in September, and a video generation model in October. The company also plans to make Grok 4 available in Tesla vehicles next week, further expanding Musk’s AI assistant across his various companies.