Poetiq has just published some very interesting results showing its Meta-System reached a new state-of-the-art on LiveCodeBench Pro (LCB Pro), a competitive coding benchmark, by automatically building and optimizing its own inference harness — without fine-tuning any underlying model or accessing model internals.
The result: GPT 5.5 High with Poetiq’s harness scores 93.9% on LCB Pro (25Q2), up from its baseline of 89.6%. Gemini 3.1 Pro, the model the harness was specifically optimized on, jumps from 78.6% to 90.9% — surpassing Google’s own Gemini 3 Deep Think (88.8%), a model that isn’t even accessible via API for external verification.
Before getting into the mechanics, it helps to understand why the benchmark matters. LiveCodeBench Pro (LCB) is designed to test AI coding ability in a way that resists two common failure modes in benchmarks: data contamination and overfitting.
LCB Pro pulls problems from major competitive programming competitions and withholds public ground-truth code. Instead, solutions are validated against a comprehensive testing framework. Correct output alone isn’t enough — solutions must also satisfy specific memory and runtime constraints. The benchmark is also subject to continuous updates, which distinguishes it from many standard benchmarks that become stale.









