Managing risk in the wake of Amazon's cloud outage

With the recent Amazon Cloud outage, many are suggesting that Cloud Computing has been brought into question and that it will now be more difficult to sell. I disagree.
Written by Gery Menegaz, Contributor

You hear that, Mr. Anderson? That is the sound of inevitability... Risk! With the recent Amazon Cloud outage, many are suggesting that Cloud Computing has been brought into question and that it will now be more difficult to sell. I disagree.
Whether you’re considering implementing a cloud strategy, taking on a merger in the hope of increasing revenue, or thinking about implementing a new, emerging technology, there is risk involved. How you manage that risk is critical to business continuity.
The first step in managing risk is to understand the types of risk that organizations face. According to Robert S. Kaplan and Anette Mikes in their recent article Managing Risks: A New Framework, risk falls into one of three categories: preventable, strategic, or external.


Preventable risk – this category of risk is internal, arising from within the organization. These risks are preventable and ought to be avoided, as they add no value to the organization. They arise from employees’ actions that are inappropriate, unethical or downright illegal. No doubt you have seen or taken part in training intended to educate employees and prevent this sort of behavior.
The New York Times, citing an internal report at the bank, reported that JPMorgan's trading loss may reach $9 billion. JPMorgan's initial estimate was $2 billion when it disclosed the trade in May, although CEO Jamie Dimon said then that the loss could grow. Given the enormity of the recent trading losses, it appears that simple online training is not enough to combat greed. This was definitely a preventable risk!
Strategic risk – this is a risk that the organization accepts as part of a new plan in an effort to generate higher returns. It is not inherently bad; it is accepted as part of a strategic plan to capture potential gains.
An example of this is Microsoft's purchase of aQuantive. On Friday, CNNMoney reported that Microsoft spent $6.3 billion in cash buying online display advertising company aQuantive in 2007. Microsoft bought the company in an effort to beef up Bing, but has never made money on its online services division. On Monday, the company wrote off almost the entire value of the acquisition, taking a $6.2 billion write-down.
External risk – some risks arise from events that occur outside an organization and are beyond its control. These include natural disasters and political or economic events, and they ought to be identified and planned for in an effort to mitigate their impact. The Amazon Cloud outage is a good example. Sites like Netflix, Instagram and Pinterest were offline for hours.
The Boston Globe reported “The weekend’s disruption happened after a lightning storm caused the power to fail at the Amazon Web Services center in Northern Virginia containing thousands of computer servers. For reasons Amazon was still unsure of on Sunday, the data center’s backup generator also failed.”
The key thing to note here is that the backup generator failed. The purpose of the backup generator is to allow systems to come down gracefully; it has nothing to do with site redundancy. So, we can assume that for sites like Netflix, Instagram and Pinterest, not having a redundant site, such as banks have, was a risk they were willing to accept.

That is to say, they were not willing to accept an outage should a server go down (the value of Cloud Computing), but they were willing to accept an outage should the site go down.
So, the cloud outage was bad, but it ought not to reflect badly on Cloud Computing. It ought to reflect poorly on these sites’ level of planning with regard to business continuity, and it may cause many of Amazon’s customers to take a second look at their risk profile.
Are there other examples that you can think of that better exemplify the risks highlighted above? Talk Back and Let Me Know.
