How to avoid the amateur cloud

When people take outdated data center management practices and label them as cloud computing, that's amateur cloud. There's a lot of it about and it won't be easy to spot, making the transition to cloud computing lengthy, troubled and painful.
Written by Phil Wainewright, Contributor

Someone this week asked me, what's the cloud equivalent of SoSaaS? What do we call it when people take outdated data center management practices and label them as cloud computing, even when they fall far short of what's required? We already have the name, I replied, thanks to the events of the past week: amateur cloud.

There's going to be a lot of amateur cloud in the market for the next few years, and businesses of all sizes will have to be intensely wary of the pitfalls when they go shopping for cloud services. Amateur cloud won't be easy to spot, and often it'll be operated by huge, reputable companies with long, honorable track records in computing and data center operations. In many cases, businesses will knowingly choose amateur-cloud providers for reasons of cost or habit. As a result, the transition to the cloud computing era is going to be lengthy, troubled and painful.

The past week's Sidekick debacle has been an object lesson in the full perils of amateur cloud. The hit to the reputations and brand image of Microsoft and T-Mobile has been massive. To its credit, Microsoft has pulled out all the stops and seems well on the way to recovering the lost user data, which will go a long way towards restoring its cloud credibility. But at what cost? There are not only the direct resource costs but also the unseen cost of the top-level crisis management that has had to be devoted to the rescue exercise. One silver lining (though scant comfort for those who suffered directly) is that every such failure has the welcome side-effect of driving home to all cloud providers the risk exposure that amateur cloud represents. Many will now be re-examining their vulnerability and tightening up procedures or strengthening their infrastructure, all of which helps raise expected operating norms a few notches higher.

One can't help feeling sorry for venerable, established players like IBM, berated by Air New Zealand's CEO for last week's data center outage, and Hitachi Data Systems, caught up in the incident that caused the Sidekick data loss. As several of the Talkback commenters to my previous post have argued, it's not as if they've done anything different from what they've always done in the past. They weren't even attempting to operate as cloud computing facilities (although Sidekick's users certainly regarded it as a cloud service and trusted it as such).

Yet somehow in the space of a few short months, the world has changed. Suddenly, every online service is being measured against cloud standards. What was once state-of-the-art in data center operations has now become second-rate — and it's happened almost overnight, as though someone flicked a switch. Not so long ago, the IBM team would have won plaudits for bringing back an unexpectedly powered-off mainframe transaction system in less than six hours, especially on a Sunday morning. But now, the old standard operating procedures are no longer acceptable for today's always-on, ultra-connected, high-throughput cloud computing environments. Today's higher cloud computing standards mean any failure is exceptional and top management had better be on the phone apologising profusely for any disruption from the get-go.

So what are the missing features to look out for when assessing whether you're being offered amateur cloud in place of the real thing? The key attributes that tend to get overlooked by conventional providers fall into three main categories:

  • Cloud-scale operation. A cloud data center has to serve hundreds or thousands of separate businesses and this requires an infrastructure far beyond the capabilities most individual enterprise data centers aspire to. We're talking resilient, high performance that's always on, ready to flex up or down on demand and able to sustain high-volume peak loads to support users in their millions across all hours, business cycles, geographies and timezones. Even planned downtime is kept to a minimum, using redundant capacity where possible to apply patches and upgrades. Naturally, there's the highest level of hardened security, and a fully tested disaster recovery plan.
  • As-a-service model. As well as scaling up and down on demand on a pay-for-usage basis, the as-a-service model emphasizes a customer's need to have:
    • visibility into operational performance;
    • enough choice to ensure proper governance of the environment;
    • a fine degree of delegated control over matters such as provisioning, configuration and service level commitments.

    This depends on a highly instrumented and automated service delivery infrastructure that few conventional data center environments support.

  • Web-scale infrastructure. A true cloud operates a pooled, multi-tenant infrastructure — sharing not just the compute platform but every aspect of the infrastructure, including connectivity, service delivery components and integration services. This is a very different environment than you'll find in a data center designed to support individual enterprises using single-tenant architectures. It requires that all infrastructure resources be rigorously componentized and flexibly accessible via a web API. Pooling the infrastructure produces huge savings in aggregate cost of ownership and operation, while its exposure to the varied demands of many different customer requirements ensures it is tuned and strengthened in the most cost-effective way to deliver optimum security, reliability and performance to all.

One of the biggest problems facing the industry today is that customers expect all this extra availability and customer service at no extra cost. Real-time failover capability to a fully functional alternative data center doesn't come cheap. Many providers are meeting market price pressures while keeping their fingers crossed behind their backs in the hope of avoiding catastrophe until they can afford to invest in higher resilience. Few are prepared to come out straight and tell customers the true cost of adding full redundancy, and even if they did, many customers would still choose to forgo the extra cost in favor of cheaper options. Just like the subprime mortgage market, there's a conspiracy of silence around the true long-term risks as providers and customers alike close their eyes to the underlying fundamentals.
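To put rough numbers on what full redundancy buys, here's a back-of-the-envelope sketch in Python. Every figure in it is an illustrative assumption rather than data from any provider: a single site with 99.9% availability, a second site whose failures are assumed to be completely independent of the first, and failover that is assumed to be instantaneous. Real-world failures are often correlated, so treat the two-site result as a best case.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def combined_availability(site_availability: float, sites: int) -> float:
    """Availability of N redundant sites.

    Assumes failures are independent and failover is instant,
    both of which are optimistic simplifications.
    """
    return 1 - (1 - site_availability) ** sites


def annual_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year at a given availability."""
    return (1 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    single = 0.999  # hypothetical single-site availability
    for n in (1, 2):
        a = combined_availability(single, n)
        print(f"{n} site(s): {a:.6f} available, "
              f"~{annual_downtime_hours(a):.2f} hours down per year")
```

Under those assumptions, a second site takes expected downtime from nearly nine hours a year to well under a minute. That gap is what customers are implicitly asking for when they expect always-on service, and the second data center is what someone has to pay for.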

Meanwhile, many of the best-known and most trusted operators of enterprise data centers are running infrastructure that dates from up to a decade ago. Old-school data center operators, especially the likes of IBM, EDS/HP and Accenture, are lumbered with acres of costly, outdated plant that's rapidly becoming unfit for purpose in the cloud computing era. This leaves them in an unenviable position, outclassed as amateur cloud and yet facing a huge hit in profitability or write-offs if they're to upgrade their facilities to meet today's expectations.

That's why I'm warning that we'll see many more failures of so-called cloud services in the next couple of years. Not because of fundamental flaws in the cloud model itself, but because few providers and customers are economically or culturally ready to properly embrace cloud computing. That will perpetuate confusion about what the cloud really is and what standards and architectures are required to implement it. It'll take a succession of amateur cloud failures before we arrive at mainstream acceptance of what constitutes an acceptable standard and best practice for cloud infrastructure. Until that time arrives, many cohorts of unwary or ill-advised enterprises and individuals will find themselves unwitting victims of the amateurism of would-be cloud computing providers.

[Disclosure: over the past few months I've explored the concepts outlined above in several white papers or analyst reports funded by consulting clients, all of whom operate cloud services. I'd like to acknowledge their funding and in particular cite The Acid Test for the On-Demand Data Center (NetSuite), Enterprise Meet Cloud (OpSource) and Redefining Software Platforms (Intuit). These companies contract my services not to influence my opinion but because they know I'm already on their wavelength. Regular readers of this blog will be familiar with my incipient bias in favor of cloud services.]
