Lightning Takes Down Amazon And Microsoft Clouds

Lightning strikes cut the power to two major Amazon and Microsoft data centres and disabled backup systems in Dublin on Sunday, resulting in up to twelve hours of downtime.

Amazon said lightning struck a transformer, causing a fire and an explosion and then a total power outage. Amazon’s Elastic Compute Cloud (EC2) and Elastic Block Store (EBS) services were affected, and Microsoft BPOS services also went down.

The bolt was powerful enough to disable part of the phase control system that synchronises the backup generators, Amazon said on its Service Health Dashboard. The company said it began investigating connectivity issues at around 03:00 GMT yesterday; twelve hours later it was still struggling to restore full access.

Customers Left With Nowhere To Go

Microsoft told eWEEK Europe UK in a statement that at around the same time a widespread power outage caused connectivity issues for European BPOS customers. Services were restored to all customers around seven hours later, it said. In the past year, the BPOS service has seen several outages worldwide and at least one data breach. Microsoft is trying to move customers across to the recently launched Office 365.

Just six days ago, an article on the Daily Telegraph Website said Microsoft’s Dublin data centre includes a “comprehensive system of secondary electricity sources” and that the whole operation could switch seamlessly to Amsterdam in the event of a “major catastrophe”. When asked by eWEEK Europe UK, Microsoft would not say whether this system had come into play during yesterday’s power outage, but it appears it did not.

By 15:00, Amazon’s dashboard had reported that 75 percent of the EC2 instances affected had been recovered but the large scale of disruption meant manual intervention was necessary before the remaining EBS volumes and EC2 instances could be restored.

“While many volumes will be restored over the next several hours, we anticipate that it will take 24-48 hours until the process is completed,” it said at that time. “In some cases EC2 instances or EBS servers lost power before writes to their volumes were completely consistent.

“Because of this, in some cases we will provide customers with a recovery snapshot instead of restoring their volume so they can validate the health of their volumes before returning them to service,” Amazon promised.

Among the Websites affected were the Telegraph’s puzzles page (an Amazon customer) and the Edinburgh Book Festival. Service-level agreement (SLA) terms are rarely made public, but it would be reasonable to assume that, barring further downtime this year, Amazon at 99.86 percent and Microsoft at 99.92 percent uptime will have some penalties to pay to their customers, assuming most of them hold a 99.99 percent SLA.
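The uptime percentages above are consistent with spreading the reported downtime over a full year. The short sketch below reproduces that arithmetic. It assumes, and the article does not confirm, that the figures are based on roughly twelve hours of downtime for Amazon and seven for Microsoft, measured against a 365-day year; the function names are illustrative only.

```python
# Assumption: a non-leap year of 365 days, i.e. 8,760 hours.
HOURS_PER_YEAR = 365 * 24


def annual_uptime_percent(downtime_hours: float) -> float:
    """Return uptime over one year as a percentage, given total downtime in hours."""
    return 100.0 * (1.0 - downtime_hours / HOURS_PER_YEAR)


def allowed_downtime_minutes(sla_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, implied by an uptime SLA."""
    return HOURS_PER_YEAR * 60 * (1.0 - sla_percent / 100.0)


if __name__ == "__main__":
    # Downtime figures taken from the article: up to 12 hours (Amazon),
    # around 7 hours (Microsoft BPOS).
    print(f"Amazon uptime:    {annual_uptime_percent(12):.2f} percent")  # 99.86
    print(f"Microsoft uptime: {annual_uptime_percent(7):.2f} percent")   # 99.92
    # Budget under the 99.99 percent SLA the article assumes customers hold.
    print(f"99.99 percent SLA allows about {allowed_downtime_minutes(99.99):.1f} minutes a year")
```

On these assumptions, a 99.99 percent SLA leaves a budget of under an hour of downtime per year, which either outage would have exceeded several times over.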

Microsoft’s Dublin site is its largest data centre outside the US and its green credentials are heavily touted. For example, it uses Dublin’s naturally cool air for cooling rather than relying on power-intensive refrigeration. Amazon opened its data centre in Dublin in 2008 and is planning to expand the centre with the conversion of a 240,000 sq ft (22,300 sq metres) building.

David Jamieson

View Comments

  • In one incident Amazon have set back the cloud computing market by years.

    Amazon and other cloud computing vendors make the case that they are the IT experts with regard to hosting and provisioning of utility based infrastructure and suggest to customers they can manage the infrastructure better than they can.

    They have been proven wrong.

    All other cloud vendors will get tarred with the same brush.

    http://grahamsblog4444.blogspot.com/

  • Gabriel Chaher, vice president, EMEA/APAC marketing, Quantum:

    “The latest outage from Amazon, where lightning caused a service disruption to Amazon's EC2 cloud computing platform in Dublin, is proof that basic data availability housekeeping must not be neglected. Simply moving data into a more flexible cloud based environment will not eliminate availability problems.

    “A resilient backup strategy will help to restore customer trust in public cloud services and will encourage widespread adoption. By keeping more than one logical, physical and site copy of data, the customer, or service provider, can be assured that data is always available for recovery.”
