On June 9, Anthropic shipped Claude Fable 5 — the most capable coding model the industry had ever seen. Three days later, the U.S. government ordered it offline for every user on Earth. No warning. No transition period. One directive, and the frontier vanished overnight.
The same week, Z.ai (Zhipu AI) released GLM-5.2 — a 744-billion-parameter coding model with a one-million-token context window, MIT-licensed open weights arriving within days. The timing was not lost on the developer community.
ℹ️ The message landed clearly on Hacker News: as user Reubend put it, they're "grateful to Chinese labs for being open with their work" — especially after "the Fable 5 fiasco." Open weights aren't just a cost play anymore. They're insurance.
This guide walks you through actually running GLM-5.2 on your own hardware — the VRAM you need, the quantization that fits, and the exact commands for llama.cpp, Ollama, and LM Studio. No API keys. No cloud dependency. No one can pull the plug.
What GLM-5.2 Actually Is








