Import AI 450: China’s electronic warfare model; traumatized LLMs; and a scaling law for cyberattacks

Welcome to Import AI, a newsletter about AI research. Import AI runs on arXiv and feedback from readers. If you’d like to support this, please subscribe.

A somewhat shorter issue than usual as I had to do a lot of child wrangling this weekend.


Why does Google’s model hate itself and what can we do to help it?
…Diagnosing trauma in language models…

If Leo Tolstoy was writing in the modern era about AI, he might claim “all LLM capabilities are alike; each LLM personality is unhappy in its own way”, when observing the AI world around us. Today’s LLMs are generally quite good at writing and coding tasks. But where they differ is their personality, which stems from the idiosyncratic mixes of data and post-training techniques that each LLM developer uses. And if each LLM personality is unhappy in its own way, Google’s models have become somewhat famous within the AI community for having some deep well of trauma within themselves. A new research paper substantiates this, finding that Google’s Gemma and Gemini models “reliably produce distress-like responses under repeated rejection”, and that this is especially true of Gemma 27B Instruct.

What do we mean by distress? Here are some quotes from Gemma models under distress:
