Closing the verification loop, Part 2: Fully autonomous optimization

What can be verified marks the boundary of what can be created, regardless of whether a human or an LLM wrote the code. Distributed systems—backend systems, distributed protocols, and data pipelines—are notoriously difficult to build. They require reasoning about concurrency, partial failure, network nondeterminism, and subtle invariants that hold across multiple machines.

Distributed systems are also notoriously difficult to test: the interesting bugs appear only under specific timing conditions that unit tests rarely exercise. If AI-assisted development could produce a verifiably correct distributed system, that would be immensely valuable. When verification fails, knowing exactly where it failed is equally valuable. The cheaper and faster it is to say “this is wrong” without waiting for human review, the more aggressively the agent can explore.

In Part 1, we described how we built redis-rust and Helix under harness-first engineering: design contracts, verification pyramids, and deterministic simulation testing. The human designed the harness, and the agent iterated against it.
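To make deterministic simulation testing concrete, here is a minimal sketch in Python. It is not taken from redis-rust or Helix. The toy model (two replicas, two broadcast writes, a convergence invariant) and the names `simulate` and `find_failing_seed` are assumptions made for illustration. The key idea is that all nondeterminism, here the message delivery order, flows from a single seed, so any failing schedule can be replayed exactly.

```python
import random


def simulate(seed, use_versions):
    """Run one simulated delivery schedule; return the final value on each replica."""
    rng = random.Random(seed)  # every source of nondeterminism derives from this seed
    # Two writes (versions 1 and 2), each broadcast to replicas 0 and 1.
    in_flight = [(replica, version, f"v{version}")
                 for version in (1, 2) for replica in (0, 1)]
    replicas = [(0, None), (0, None)]  # (version, value) held by each replica
    while in_flight:
        # The simulated network chooses which pending message arrives next.
        replica, version, value = in_flight.pop(rng.randrange(len(in_flight)))
        if use_versions and version < replicas[replica][0]:
            continue  # stale write: drop it instead of overwriting newer data
        replicas[replica] = (version, value)
    return [value for _, value in replicas]


def find_failing_seed(use_versions, max_seeds=1000):
    """Return the first seed whose schedule violates convergence, or None."""
    for seed in range(max_seeds):
        a, b = simulate(seed, use_versions)
        if a != b:  # invariant: both replicas must end with the same value
            return seed
    return None
```

Without the version check, some seeds reorder the two writes differently on each replica and the replicas diverge. Calling `simulate` again with the seed returned by `find_failing_seed(False)` reproduces the same divergence every time, which is what makes the failure cheap for an agent to diagnose. With the version check enabled, the search over the same seeds finds no violation.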

In this post, we remove the human from the loop. The goal is fully autonomous optimization that produces optimal proof-carrying code per workload, per tenant, in real time.
