Cloud outages: Why storage is the real villain

Summary: Amazon's EC2 suffered a four-day outage in April, but the cloud itself is not at fault, says Lori MacVittie.


Critics seize gleefully on every cloud outage as an indictment of the technology itself, but the true blame lies elsewhere, says Lori MacVittie.

Amazon's outage in April drew a lot of coverage — with most of it unfairly focused on cloud computing. Yet if the culprit wasn't the cloud for Amazon Web Services, what was to blame?

The detailed technical descriptions of what went wrong clearly indicate the source of the real issue. Unfortunately it's an issue that's rarely discussed, let alone in the context of cloud computing. So what is it?

As my storage-minded colleague Don MacVittie expounded in a recent blog, many people have had much to say about the Amazon outage, but the one thing that it brings to the fore is not a problem with cloud, but a problem with storage. Yes, we have a problem with storage and it's compounded by cloud computing.

The issue of availability

While application and network virtualisation have enabled architectures designed for failure — that is, supportive of failover — storage virtualisation has not.

The underlying problem is that storage virtualisation is about aggregating resources to expand the capacity of the storage network as a whole, not of individual files. Storage virtualisation controllers, unlike application delivery controllers, do not provide failover.

If a resource such as a file system becomes unavailable, it's unavailable. There's no backup, no secondary, no additional copies of that file system to which the storage virtualisation controller can redirect users. Storage virtualisation products just aren't designed with redundancy in mind, and redundancy is critical for enabling availability of resources.

Redundancy is critical, but it's not the only technological feature required. Interfaces to storage must also be normalised across redundant resources. A common interface allows transparent failover from one resource to another in the event of failure, making it possible to take advantage of redundancy.
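
A minimal sketch of that idea in Python may help. It assumes every redundant resource exposes the same read call, which is what lets a client move from a failed copy to a healthy one without noticing. The names Store, InMemoryStore and FailoverReader are hypothetical and exist only for illustration; they do not describe any real storage product or Amazon's architecture.

```python
from typing import Protocol


class Store(Protocol):
    """The common (normalised) interface each redundant resource exposes."""

    def read(self, path: str) -> bytes: ...


class InMemoryStore:
    """Hypothetical stand-in for one copy of a file system."""

    def __init__(self, name: str, files: dict[str, bytes], healthy: bool = True):
        self.name = name
        self.files = files
        self.healthy = healthy

    def read(self, path: str) -> bytes:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unavailable")
        return self.files[path]


class FailoverReader:
    """Tries each redundant copy in order until one answers."""

    def __init__(self, replicas: list[Store]):
        self.replicas = list(replicas)

    def read(self, path: str) -> bytes:
        errors = []
        for replica in self.replicas:
            try:
                # Same call on every replica: this is the normalised interface.
                return replica.read(path)
            except ConnectionError as exc:
                errors.append(str(exc))
        raise ConnectionError(f"all replicas failed for {path}: {errors}")


# Usage: the primary copy is down, so the read is served by the secondary.
primary = InMemoryStore("primary", {"/data/report.txt": b"v1"}, healthy=False)
secondary = InMemoryStore("secondary", {"/data/report.txt": b"v1"})
reader = FailoverReader([primary, secondary])
data = reader.read("/data/report.txt")
```

Failover here depends on both conditions at once: a second copy exists, and it answers the same call as the first. Take away either one and the reader has nowhere to go.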

Multiple copies of the application

Applications, for example, especially in cloud-computing environments, generally take advantage of the ubiquitous nature of HTTP. Availability of applications is made possible by the existence of multiple copies of the application and because all clients are accessing the application via HTTP. If one instance fails, the same protocol can seamlessly interact with a secondary or tertiary copy of the application.

Storage virtualisation products have addressed the problem of normalised interfaces by acting as a go-between, a proxy, to provide a single interface to clients while managing the complexity of heterogeneous storage systems in the background.
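
The go-between role can be sketched roughly as follows, again in Python and again under assumptions of my own: two backends with deliberately different native calls, and a proxy that hides the difference behind a single read and write. The class names, method names and mount prefixes are all hypothetical and are not taken from any real storage virtualisation product.

```python
class FileServerBackend:
    """Hypothetical backend with file-style calls."""

    def __init__(self):
        self._files = {}

    def write_file(self, path, data):
        self._files[path] = data

    def read_file(self, path):
        return self._files[path]


class ObjectStoreBackend:
    """Hypothetical backend with object-style calls."""

    def __init__(self):
        self._objects = {}

    def put_object(self, key, data):
        self._objects[key] = data

    def get_object(self, key):
        return self._objects[key]


class StorageProxy:
    """Presents one interface to clients and routes each path to a backend."""

    def __init__(self):
        self._mounts = {}  # path prefix -> (read function, write function)

    def mount(self, prefix, read_fn, write_fn):
        self._mounts[prefix] = (read_fn, write_fn)

    def _resolve(self, path):
        matches = [p for p in self._mounts if path.startswith(p)]
        if not matches:
            raise LookupError(f"no backend mounted for {path}")
        prefix = max(matches, key=len)  # longest matching prefix wins
        return self._mounts[prefix], path[len(prefix):]

    def read(self, path):
        (read_fn, _), rest = self._resolve(path)
        return read_fn(rest)

    def write(self, path, data):
        (_, write_fn), rest = self._resolve(path)
        write_fn(rest, data)


# Usage: clients see a single namespace over two different systems.
files = FileServerBackend()
objects = ObjectStoreBackend()
proxy = StorageProxy()
proxy.mount("/home/", files.read_file, files.write_file)
proxy.mount("/archive/", objects.get_object, objects.put_object)
proxy.write("/archive/2011/april.log", b"outage notes")
archived = proxy.read("/archive/2011/april.log")
```

Note what the sketch does not do: each path lives on exactly one backend. The proxy aggregates capacity and normalises the interface, but if the object store goes away, everything under /archive/ goes with it.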

But the protocols used to manage storage resources internal to the storage architecture are not always...

About Lori MacVittie

Lori MacVittie is responsible for application services education and evangelism at app delivery firm F5 Networks. Her role includes producing technical materials and participating in community-based forums and industry standards organisations. MacVittie has extensive programming experience as an application architect, as well as in network and systems development and administration.

  • Unfairly focused on cloud computing? But the true blame lies elsewhere?

    I disagree that cloud computing is not to blame. A key aspect of cloud computing is that you allow providers to manage your infrastructure, and you expect them to keep it secure and available at all times. The advantage is not having to worry about the ins and outs of management. The downside is the same: you are relying on cloud providers to manage your system, so when things go wrong, they get fixed at the provider's speed, not yours.

    I'm not a cloud basher; it's clearly the future, and it would seem that data management needs to catch up. However, this incident clearly must be considered when looking into cloud technologies.
  • What a load of waffle! Of course storage solutions can allow for failure: you just store the data in more than one place. This is what I would have expected from Amazon S3.
    It looks to me as though this is evidence that the availability of some cloud services is being exaggerated.