I built an open source SDK to catch AI agent regressions before they ship.

I built an open source SDK to catch AI agent regressions before they ship

You fix a bug in your agent. A week later you change the prompt or swap the model. The same bug comes back. Nobody notices until a user does.

Regular software has regression tests for this. AI agents mostly do not. So I built replayd.

When your agent fails, you capture that run and save it as a test. Before you ship a new version, you replay the saved failures against it. If the same failure comes back, you catch it before your users do.

pip install replayd

I built an open source SDK to catch AI agent regressions before they ship.

Related reading

Three things AI agents keep getting wrong (and why I'm rebuilding the platform…

Your AI Agent Made a Decision. Now Prove Why.

A note on building reliability infrastructure for AI agents — and why…

Why Your AI Agent Works in Dev and Breaks in Prod

I built a GitHub App that auto-generates adversarial tests for AI-written code…

# Peek Inside the Black Box: Why Your AI Agent Needs OpenTelemetry and SigNoz