TL;DR: Our internal flaky-test summariser at Buildkite was firing ~40k LLM calls a day, and most were...

TL;DR: We run an LLM-backed build-failure summariser at Buildkite. To stop a provider wobble from...
