When TestSmith generates tests with --llm, it calls an LLM for every public member of every source file being processed. A project with 20 files and 5 public functions each means up to 100 API calls in a single run. That's a lot of surface area for things to go wrong.
Here's the reliability stack we built, layer by layer.
Layer 1: Retry with Exponential Backoff
LLM APIs fail transiently. Rate limits, timeouts, occasional 5xx responses — all of these are recoverable if you wait and retry. We built a retry middleware that wraps any Provider:
type RetryProvider struct {








