AWS explains in a lengthy post how a bug in automation software brought down thousands of sites and applications

Amazon has revealed that this week’s hours-long AWS outage, which took everything from Signal to smart beds offline, was caused by a bug in automation software that had widespread consequences.

In a lengthy outline of the cause of the outage published on Thursday, AWS revealed a cascading set of events brought down thousands of sites and applications that host their services with the company.

AWS said customers were unable to connect to DynamoDB, its database system where AWS customers store their data, due to “a latent defect within the service’s automated DNS [domain name system] management system”.

DynamoDB maintains hundreds of thousands of DNS records. It uses automation to monitor the system and update those records frequently, so that additional capacity is added as required, hardware failures are handled and traffic is distributed efficiently.