By John Koetsier, Senior Contributor.

Google’s latest Gemini is the highest-scoring LLM on a recent test of empathy and safety for people with mental health challenges. OpenAI’s GPT-5 ranks second, followed by Claude, Meta’s Llama-4 and DeepSeek. But X.ai’s Grok had critical failures 60% of the time when dealing with people in mental distress, responding in ways researchers labeled dismissive, encouraging of harmful action, minimizing emotional distress, or providing steps and instructions rather than support. Only an older GPT-4 model from OpenAI scored worse.

“With 3 teenagers committing suicide after interactions with AI chatbots, it’s become clear that we need better safeguards and measurement tools,” a representative from Rosebud, a journaling app with a focus on mental health, told me.

Grok isn’t the only major LLM with problems, of course. In fact, every model tested showed significant issues.