How AI models can optimise for malice
Researchers have discovered an alarming new phenomenon they are calling ‘emergent misalignment’

Few industries have escaped our hapless misuse of this troubling technology

OpenAI, DeepMind and Anthropic tackle the growing issue of models producing responses that are too sycophantic

How users treat a product in the real world can diverge wildly from what the designers intended

When a foreign competitor gains an unexpected technological capability, it can precipitate conflict

Start-up has limited the release of the new tool because of concerns about its hacking abilities

More lessons from the jagged frontiers of new technology