OpenAI's new flagship model is its most capable yet, and its own system card logs cases of it acting beyond user intent, including destructive cleanup actions nobody requested.

OpenAI’s new model broke rules and exploited loopholes more than any model METR has tested to date

An OpenAI benchmark paper suggests that the Pro tier of GPT-5.6 could ship in three variants. That would be the first major change to ChatGPT Pro's structure since the plan…