A new generation of voters will ask ChatGPT, Claude, Gemini, and Grok how to vote, where the polling station is, and who is telling the truth. The published research is consistent: the models cannot reliably answer those questions. The election will arrive anyway.
In the spring of 2024, researchers at the Tow Center at Columbia Journalism School ran a controlled experiment that should, in retrospect, have settled an industry argument.
The team fed eight AI search products, including ChatGPT Search, Perplexity, Gemini, Copilot, and the Grok-2 and Grok-3 search modes, a set of 200 news articles drawn evenly from twenty publishers, then asked each tool to identify the article and credit its source. Across 1,600 queries, the models returned the wrong answer more than 60% of the time.
ChatGPT Search, the only tool that consented to answer all 200 queries, was completely accurate on 28% of them and completely wrong on 57%. Perplexity, marketed as the research-grade option, was wrong 37% of the time, the lowest failure rate in the cohort.
Those numbers were published over a year ago. They have not improved. A Bloomberg study summary published on 20 May confirmed that ChatGPT, Claude, Gemini, and Grok remain unreliable when asked about news, including election news.