Amazon Web Services (AWS) is operating normally for most customers, and the company said it will post a detailed post mortem on what went wrong in its Northern Virginia data center.
On its status dashboard, AWS wrapped up a multi-day outage with the following post:
As we posted last night, EBS (Elastic Block Store) is now operating normally for all APIs and recovered EBS volumes. The vast majority of affected volumes have now been recovered. We're in the process of contacting a limited number of customers who have EBS volumes that have not yet recovered and will continue to work hard on restoring these remaining volumes...
We are digging deeply into the root causes of this event and will post a detailed post mortem.
That post mortem is going to be interesting. In the meantime, tech observers have posted a few post mortems of their own. The takeaways appear to be:
- Architect for failure.
- If you're solely on AWS, use multiple Availability Zones.
- Use multiple cloud providers for infrastructure.
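As a rough illustration of what those takeaways mean in practice, here is a minimal Python sketch of failover across redundant backends. It is not taken from AWS or from any of the posts mentioned here: the function `call_with_failover` and the backend names `zone-a`, `zone-b` and `other-provider` are hypothetical stand-ins for whatever service calls an application makes to each Availability Zone or provider.

```python
from typing import Callable, List, Sequence, Tuple


def call_with_failover(
    backends: Sequence[Tuple[str, Callable[[], str]]]
) -> Tuple[str, str]:
    """Try each backend in order; return (name, result) from the first success."""
    errors: List[str] = []
    for name, call in backends:
        try:
            return name, call()
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(f"{name}: {exc}")
    # Every backend failed, so surface all of the collected errors at once.
    raise RuntimeError("all backends failed: " + "; ".join(errors))


# Hypothetical backends: the first zone is down, the others are healthy.
def zone_a() -> str:
    raise ConnectionError("volume unavailable")


def zone_b() -> str:
    return "ok from zone-b"


def other_provider() -> str:
    return "ok from other-provider"


served_by, response = call_with_failover([
    ("zone-a", zone_a),
    ("zone-b", zone_b),
    ("other-provider", other_provider),
])
print(served_by, response)
```

The ordering of the list is the design choice: the preferred zone comes first, a second zone in the same provider next, and a separate provider last. A production setup would also need health checks, timeouts and data replication, none of which this sketch attempts.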
Items worth a read:
- Amazon's Web Services outage: End of cloud innocence?
- Whether it's Amazon or Microsoft, there's (still) no foolproof cloud