A major scientific repository’s decision to ban authors whose work contains “hallucinated” references written by generative artificial intelligence (AI) has been welcomed by research integrity campaigners despite concerns about how the policy can be properly enforced.
In a landmark move, the popular preprint platform arXiv has said it will impose an immediate one-year ban if it finds “incontrovertible evidence” that submissions contain “inappropriate language, plagiarized content, biased content, errors, mistakes, incorrect references, or misleading content” written by large language models (LLMs).
“If a submission contains incontrovertible evidence that the authors did not check the results of LLM generation, this means we can’t trust anything in the paper,” explained Thomas Dietterich, who chairs arXiv’s computing section, as he announced the policy on social media platform X.
Examples of incontrovertible evidence would include “hallucinated references” and “meta-comments from the LLM”, continued Dietterich, who cited cases of a researcher failing to delete phrases such as “here is a 200 word summary; would you like me to make any changes?” or “the data in this table is illustrative, fill it in with the real numbers from your experiments”.