{"UUID":"99d5650b-6c18-4c33-b5e5-683b10054a7d","URL":"https://aws.amazon.com/message/4372T8/","ArchiveURL":"","Title":"AWS Sydney Region EC2 and EBS power disruption","StartTime":"2016-06-05T05:25:00Z","EndTime":"2016-06-05T15:00:00Z","Categories":["automation","cloud","config-change","hardware"],"Keywords":["sydney","ec2","ebs","power","outage","availability zone","drups","voltage sag"],"Company":"Amazon","Product":"EC2, EBS","SourcePublishedAt":"0001-01-01T00:00:00Z","SourceFetchedAt":"2026-05-04T17:51:04.142442Z","Summary":"At 10:25 PM PDT on June 4, a loss of power at an AWS Sydney facility caused by severe weather in the area led to the disruption of a significant number of instances in an Availability Zone. Due to the signature of the power loss, power isolation breakers did not engage, resulting in backup energy reserves draining into the degraded power grid.","Description":"On June 4th, at 10:25 PM PDT, a severe weather event caused a regional substation power loss, leading to a total utility power outage at multiple AWS facilities in the Sydney Region. This primarily affected EC2 instances and associated EBS volumes within a single Availability Zone. Power was manually restored by 11:46 PM PDT. Automated recovery began, with over 80% of impacted instances and volumes online by 1:00 AM PDT on June 5th, and DNS resolution issues resolved by 2:49 AM PDT. Nearly all instances were recovered by 8:00 AM PDT, though a small number of EBS volumes remained unrecoverable.\n\nThe root cause was a failure in the power redundancy system. An unusually long voltage sag, rather than a complete outage, prevented breakers from quickly isolating the Diesel Rotary Uninterruptible Power Supply (DRUPS) from the degraded utility power grid. This caused the DRUPS's energy reserves to drain into the grid, leading to its shutdown and preventing the integrated generators from engaging and connecting to datacenter racks.\n\nCustomer impact included the loss of power to EC2 instances and EBS volumes. During recovery, some instances experienced DNS resolution failures, and a latent bug in instance management software slowed the recovery of the remaining instances. A small percentage of EBS volumes (less than 0.01%) were unrecoverable due to the simultaneous failure of both replicated storage servers. Additionally, customers faced errors when launching new instances or scaling auto-scaling groups, and API state propagation delays caused issues with resource visibility and Elastic Load Balancing (ELB) latency.\n\nRemediation efforts include adding additional breakers to ensure quicker isolation of the DRUPS from degraded utility power during voltage sags, allowing generators to activate sooner. AWS is also addressing the latent bug in its instance management software and implementing a program for regularly testing recovery processes on long-running hosts. Changes to improve API resilience are planned for rollout in the Sydney Region in July. AWS continues to recommend running applications across multiple Availability Zones for highest availability."}