Researchers who use hallucinated references to face arXiv ban

The preprint server arXiv’s policy on the use of generative AI by authors has drawn a slew of positive and negative comments from the community. Credit: Thomas Fuller/SOPA Images/LightRocket/Getty

The physical-sciences repository arXiv is banning researchers from posting their manuscripts on the platform for one year if a submission is found to contain references that have been hallucinated by artificial-intelligence tools. The ban also applies to authors who submit manuscripts containing other “incontrovertible” signs of generative AI usage that demonstrate the AI results haven’t been carefully checked.

What’s more, after a researcher’s one-year penalty is over, they will not be able to post any manuscripts to arXiv unless the work has already been accepted at a “reputable peer-reviewed venue”, according to Thomas Dietterich, a computer scientist at Oregon State University in Corvallis and chair of arXiv’s computer science section.

ArXiv’s new policy, which has triggered a torrent of both positive and negative comments from researchers on social media, is one of the latest and most far-reaching examples of how preprint servers are grappling with the rising tide of AI ‘slop’, meaning low-quality or meaningless content made using generative AI. Some, such as arXiv, are imposing bans on authors who do not follow their guidelines. Others have ruled out entire categories of submissions that raise concerns about generative AI use.

Scientists increasingly use large language models (LLMs) for a variety of legitimate tasks, such as literature reviews, but arXiv’s announcement drew approval from many researchers. “Great move and I fully support it! The only question I have is: why only AI hallucinations, folks? Let’s fight the slop in general”, Valeri Kremnev, co-founder of the AI startup sci2sci in Berlin, posted on social media.

But not everyone is convinced that such measures are the right approach.
Natalie Khalil, the founder of Reviewer 3, a platform based in San Francisco, California, that uses AI to help researchers to conduct peer review, argues that arXiv is treating the symptom, not the root cause. “If a researcher is banned from arXiv, they will still do research, just elsewhere,” she notes. In response, Dietterich says that various platforms need to work together to cull faulty references and other questionable output from LLMs. “The fact that an irresponsible researcher can publish irresponsible research elsewhere is not a justification for allowing them to post it on arXiv.”

Too much trust

In Dietterich’s announcement on social media, he wrote that arXiv “can’t trust anything” in a submission that contains strong evidence “that the authors did not check the results of LLM generation”. This includes hallucinated references and LLM comments such as “here is a 200-word summary; would you like me to make any changes?”

In an interview with Nature, Dietterich said that although arXiv had already been issuing penalties for various violations of its code of conduct, the server didn’t have standardized sanctions for inappropriate generative AI use until recently. It is now publicizing the sanctions to deter such behaviour, he said, noting that the site’s moderators will consider authors’ appeals.

Dietterich thinks that researchers put too much trust in outputs from LLMs and are not spending enough time analysing the models’ results.
“The trouble is, if they’re not checking for these simple things, what else are they not checking for?” He also notes that some of this AI-generated content originates from paper mills, which are companies that sell authorship slots and citations on manuscripts that have already been accepted for publication in journals.

AI slop is most prevalent in arXiv’s computer science section, which posts around half of all papers submitted to the preprint server, Dietterich says. “The authors there are the early adopters of LLM technology, and the earlier abusers of it.”
