Our job is to run infrastructure you can rely on. We want you to be able to get on with building your business, launching your project, or just making cool things with machine learning models.

Unfortunately, we’re not perfect (yet!) and sometimes things go wrong. When they do, you deserve to be kept in the loop about what’s happening, so we’ve shipped a status page at replicatestatus.com with real-time updates on the health of Replicate.

We’re also going to publish detailed reports after major incidents, so you can understand what happened and see what we’re doing to improve our systems for the future. We had a significant outage on 11 May, and you can find our report about it below.

Incident report: 11 May outage

On 11 May, Replicate experienced a significant outage affecting both the replicate.com website and our API. For about two hours from 05:45 UTC, many customers received slow responses or HTTP 500 errors from our API or website, and had trouble running predictions on Replicate’s platform as a result.