AWS users fret over downtime ahead of Amazon's massive EC2 reboot

Amazon is preparing for a massive reboot of EC2 instances across the globe to remedy a security flaw.
Written by Liam Tung, Contributing Writer

Amazon Web Services has kicked off a large-scale maintenance reboot of Elastic Compute Cloud (EC2) instances across the world in what appears to be a response to a critical security flaw.

Amazon has told EC2 customers by email that it's gearing up to reboot a range of EC2 instance types across all availability zones over the next few days, and that users won't be able to do anything to stop the process.

Amazon occasionally schedules instances for a reboot to apply patches, upgrades, or maintain a host, but in this case it hasn't told users what's behind this month's overhaul. That has led some to speculate that it is applying a pre-released but embargoed fix (XSA-108) for a bug in the open source hypervisor Xen.

Some AWS users have also expressed concern on the AWS user forum that they've been given too little notice to monitor services that may be affected during the maintenance event. Meanwhile, others have commended AWS for forcing a reboot at the expense of some downtime rather than allowing instances to continue running insecurely.

Responding to downtime concerns, one AWS support staffer said on the forum that while the reboots will occur across all availability zones, they will be staggered, offering users some redundancy. The staffer added that there was no way to reschedule the reboots since they are "very timely security and operational patches" for a portion of the AWS EC2 instance fleet.

AWS's director of EC2 fleet operations, Doug Grismore, has also acknowledged that customers need a "more effective way to validate that all their instances (even the ones that launched a half-hour ago) are or are not subject to near-term maintenance or a previously planned reboot".

"We are working non-stop to attempt to deliver something useful that you can rely on as soon as possible," wrote Grismore.

In the meantime, Thorsten Von Eicken, founder of cloud management firm RightScale, has published a blog post offering AWS users advice on how to manage the reboot, which, depending on timezone, kicks off on 25 September or 26 September.

Von Eicken notes a few challenges users may face, which stem from the scale of the patching that Amazon is undertaking.

"If you relaunch an instance before the maintenance, you are not guaranteed to get an already-patched host," he noted.

"Normally, whenever our Ops team receives a maintenance notice regarding a specific set of instances, we relaunch them as soon as possible at our convenience so that by the time the maintenance windows arrives, our instances are already on hosts that have had the maintenance done. This time, due to the scale of the patching, there is not enough patched capacity available to guarantee this."

Von Eicken also noted that T1, T2, M2, R3, and HS1 instance types are not affected.

Von Eicken added that not all instances of the affected instance types will be rebooted. AWS users should check their AWS console for information on the instances that will be rebooted.
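For users who would rather script the check than click through the console, the following is a minimal sketch, not an official AWS procedure. It assumes the boto3 library and its EC2 `describe_instance_status` call, whose response lists scheduled events per instance. The helper names, the sample response, and the instance IDs are hypothetical, and the sketch ignores pagination of large result sets.

```python
from datetime import datetime, timezone

# Instance-type families the article lists as unaffected.
EXEMPT_FAMILIES = {"t1", "t2", "m2", "r3", "hs1"}

# Event codes that indicate a scheduled reboot.
REBOOT_CODES = ("instance-reboot", "system-reboot")


def is_exempt(instance_type):
    """True if the type (e.g. "t2.micro") is in a family listed as unaffected."""
    return instance_type.split(".")[0].lower() in EXEMPT_FAMILIES


def scheduled_reboots(status_response):
    """Return (instance_id, event_code, not_before) for each reboot event.

    `status_response` is assumed to have the shape returned by boto3's
    EC2 client method describe_instance_status.
    """
    found = []
    for status in status_response.get("InstanceStatuses", []):
        for event in status.get("Events", []):
            if event.get("Code") in REBOOT_CODES:
                found.append(
                    (status["InstanceId"], event["Code"], event.get("NotBefore"))
                )
    return found


def fetch_status(region_name):
    """Query EC2 for instance status; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above work without it

    client = boto3.client("ec2", region_name=region_name)
    return client.describe_instance_status(IncludeAllInstances=True)


# Hypothetical response used only to illustrate the expected shape.
SAMPLE_RESPONSE = {
    "InstanceStatuses": [
        {
            "InstanceId": "i-0001",
            "Events": [
                {
                    "Code": "system-reboot",
                    "NotBefore": datetime(2014, 9, 26, 2, 0, tzinfo=timezone.utc),
                }
            ],
        },
        {"InstanceId": "i-0002", "Events": []},
    ]
}
```

Calling `scheduled_reboots(fetch_status("us-east-1"))` would list the affected instances in one region, with the region name being just an example. Because not every instance of an affected type will be rebooted, `is_exempt` is only a first filter; the per-instance events remain the authoritative signal.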

As for the cause of the massive patch update, Von Eicken notes that Amazon won’t be able to disclose it until 1 October — the date the embargo on the Xen bug is scheduled to be lifted.

"As usual, AWS is totally tight-lipped about the underlying cause. It seems obvious that the company is patching a security vulnerability, but it will not disclose which one until October 1 — that is, after they have patched all hosts."

The reboot starts on September 26, 2014, at 2:00 UTC/GMT (September 25, 2014, at 7:00 PM PDT) and ends on September 30, 2014, at 23:59 UTC/GMT (September 30, 2014, at 4:59 PM PDT), according to Von Eicken.
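The window above can be checked with a few lines of Python. This is only an illustrative sketch: it models PDT as a fixed UTC-7 offset rather than using a full timezone database, and the `in_window` helper is a hypothetical name introduced here.

```python
from datetime import datetime, timedelta, timezone

# PDT modeled as a fixed UTC-7 offset (an assumption that holds for these dates).
PDT = timezone(timedelta(hours=-7), "PDT")

# Maintenance window as reported, in UTC.
WINDOW_START = datetime(2014, 9, 26, 2, 0, tzinfo=timezone.utc)
WINDOW_END = datetime(2014, 9, 30, 23, 59, tzinfo=timezone.utc)


def in_window(moment):
    """True if a timezone-aware datetime falls inside the reported window."""
    return WINDOW_START <= moment <= WINDOW_END


# The same instants expressed in PDT.
start_pdt = WINDOW_START.astimezone(PDT)
end_pdt = WINDOW_END.astimezone(PDT)

print(start_pdt.strftime("%B %d, %Y at %I:%M %p %Z"))
print(end_pdt.strftime("%B %d, %Y at %I:%M %p %Z"))
```

The conversion shows why the start date differs by timezone: 2:00 UTC on September 26 is still the evening of September 25 on the US west coast.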
