Snowflake CEO finds GLM-5.2 competitive with Opus 4.7 at a fraction of the cost

Zhipu AI's GLM-5.2 nearly matches Claude Opus 4.7 on a 103-task Snowflake coding benchmark, at one-fifth the cost per output token. But the Chinese model burns through nearly twice as many tokens per task. Still, that pricing gap is putting real pressure on Anthropic and OpenAI, and could rattle the valuations of Western AI labs.

Wednesday, June 24, 2026

Snowflake compared GLM-5.2 and Opus 4.7 in a hands-on benchmark. The Chinese model held its own.

The test covered 103 tasks, each run three times, where models had to write code that works on both DuckDB and Snowflake. When each model got three attempts per task, the two were neck and neck: 66% vs. 67% of tasks solved.
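The difference between counting a task as solved when any of its three runs succeeds and counting only the first run can be made concrete with a small sketch. The function name `solve_rates` and the toy outcomes below are illustrative assumptions; they are not Snowflake's scoring code or data.

```python
from typing import Sequence


def solve_rates(results: Sequence[Sequence[bool]]) -> tuple[float, float]:
    """Return (first_attempt_rate, any_attempt_rate) as fractions of tasks.

    `results` holds one entry per task; each entry lists the pass/fail
    outcome of every attempt on that task, in order.
    """
    if not results:
        raise ValueError("need at least one task")
    n = len(results)
    # A task counts toward the first-attempt rate only if attempt 0 passed.
    first = sum(1 for attempts in results if attempts[0])
    # A task counts toward the any-attempt rate if at least one attempt passed.
    solved = sum(1 for attempts in results if any(attempts))
    return first / n, solved / n


# Made-up outcomes for four tasks, three attempts each (not benchmark data).
toy = [
    (True, True, True),
    (False, True, False),
    (False, False, False),
    (False, False, True),
]
first_rate, any_rate = solve_rates(toy)
print(f"first attempt: {first_rate:.0%}, any of three: {any_rate:.0%}")
```

A model whose successes are scattered across attempts scores well on the any-of-three measure while trailing on the first-attempt measure, which is one way two models can look tied on the former and still differ on the latter.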

First-attempt accuracy diverges: Opus hit 53.7%, GLM only 47.6%, showing GLM's output is less consistent. The Chinese model also averaged 99 runs per task versus Opus's 80 and burned through 860 million tokens, nearly double Opus's 439 million.
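A rough back-of-the-envelope calculation shows how the token overhead eats into the price advantage. It assumes the one-fifth price ratio applies to every token counted, which the article does not state (input and cached tokens are often priced differently), so treat the result as an estimate only.

```python
# Token totals reported for the benchmark runs.
GLM_TOKENS = 860_000_000
OPUS_TOKENS = 439_000_000

# Assumption: GLM's price per token is one-fifth of Opus's across ALL tokens.
# The article gives this ratio for output tokens only.
PRICE_RATIO = 1 / 5

token_ratio = GLM_TOKENS / OPUS_TOKENS       # how many more tokens GLM used
relative_cost = token_ratio * PRICE_RATIO    # GLM's total bill relative to Opus's

print(f"token ratio: {token_ratio:.2f}x")
print(f"GLM total cost relative to Opus: {relative_cost:.0%}")
```

Under that assumption, GLM's total bill for the benchmark lands at roughly two-fifths of Opus's rather than one-fifth: still cheaper, but by a smaller margin than the per-token price suggests.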

Opus 4.7 is the better model, but GLM is competitive in Snowflake's code benchmark and costs far less. | Image: via X

GLM's strength is validating code reliably across both platforms (DuckDB and Snowflake) at the same time. According to Snowflake CEO Sridhar Ramaswamy, that's why only GLM could solve certain tasks.

Its weaknesses are giving up too early and obsessively checking the wrong things. On one task, GLM fired off 411 tool calls in 24 minutes, checking row counts, distributions, null values, and column types, and still failed all three attempts. Opus solved the same task with 49 calls in 9 minutes.


