We Asked 10 LLMs to Write Efficient Code. Only 4 Got Better.

Wednesday, 27 May 2026

By Vilius Vystartas | May 2026

Every LLM can write code that works. The question is: can they write code that's efficient — and does telling them to be efficient actually help?

I tested 10 models on 10 coding tasks, each in two phases: unprompted (the model writes its own code) and prompted (explicitly told to write clean, DRY, efficient code). That's 200 API calls, $0.56 total. The results are... not what most prompt engineers would predict.

GPT-5.4 was the only model where prompting gave a substantial boost (+0.20). For most models, the "write efficient code" prompt was meaningless or actively harmful.
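
To make the setup concrete, here is a minimal sketch of the two-phase grid described above. It only enumerates the planned calls and checks the arithmetic; it does not call any API. The model and task names are hypothetical placeholders (the actual lists are not given here), and the wording of the efficiency instruction is an assumption based on the description of the prompted phase.

```python
from itertools import product
from typing import Optional

# Hypothetical placeholders: the actual models and tasks are not listed here.
MODELS = [f"model-{i}" for i in range(1, 11)]
TASKS = [f"task-{i}" for i in range(1, 11)]

# Assumed wording, paraphrasing the prompted phase ("clean, DRY, efficient code").
EFFICIENCY_HINT = "Write clean, DRY, efficient code."

# Phase name -> extra instruction (None means the model gets the bare task).
PHASES = {
    "unprompted": None,
    "prompted": EFFICIENCY_HINT,
}


def build_prompt(task_description: str, hint: Optional[str]) -> str:
    """Return the task text, with the hint appended only in the prompted phase."""
    if hint is None:
        return task_description
    return f"{task_description}\n\n{hint}"


def plan_calls() -> list:
    """Enumerate one planned API call per (model, task, phase) combination."""
    return [
        {"model": m, "task": t, "phase": phase, "prompt": build_prompt(t, hint)}
        for m, t, (phase, hint) in product(MODELS, TASKS, PHASES.items())
    ]


calls = plan_calls()
TOTAL_COST_USD = 0.56  # total reported in the article
cost_per_call = TOTAL_COST_USD / len(calls)

print(len(calls))              # 200
print(f"{cost_per_call:.4f}")  # 0.0028
```

At 10 models, 10 tasks, and 2 phases, the grid comes to the 200 calls quoted above, which works out to roughly $0.0028 per call on average.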

How the Metric Works
