Amazon's EC2 service offers a potential shortcut past some of the hurdles ISVs face on the road to SaaS, not least the need to build and operate a high-availability, shared-services data center infrastructure, says the company's CTO Dr Werner Vogels.
ISVs that want to deliver software-as-a-service have to learn a lot of new skills, not least how to build and operate a high-availability, shared-services data center infrastructure. Anyone who heard this month's keynote presentation at SIIA's OnDemand Summit by Amazon.com CTO Dr Werner Vogels will have a keen appreciation of how tough a technological challenge that is — and thus will want to evaluate whether Amazon's Elastic Compute Cloud (EC2) service offers a potential shortcut past some of those hurdles. More on that in a moment.
Those who like to own their own infrastructure will also no doubt be tempted to take a look at IBM's 'Blue Cloud' offering, announced last week and due for first availability in 2008. Despite the excitement this announcement has raised in some quarters of the ZDNet blogging fraternity, personally I would counsel caution. I gather that Blue Cloud relies on IBM's Tivoli software to manage resources in the cloud infrastructure. To me, this instantly sets alarm bells ringing. If it were possible to manage cloud infrastructure with Tivoli (or OpenView, or Unicenter) then how come it took Amazon.com five years to bring EC2 to market? IBM has plenty of enterprise customers who will pay over the odds for the privilege of beta-testing its cloud computing offerings for as long as it takes the company to bring them up to scratch. My view is that ISVs who want to be successful in the on-demand market can afford neither the time that will take nor the margins IBM will ask.
The key point that came across from Werner Vogels' keynote was (to paraphrase from my own perspective) that everyone underestimates what it really takes to build a constantly available, high-volume, shared-services data center. Vogels described the two-stage journey that Amazon went through as it first scaled its data center infrastructure and then started providing services to third parties.
The first stage was to break down Amazon's initially monolithic 'get-big-fast' architecture into a services model, which the company finally had the breathing space to do in 2001, during the dot-com downturn. The resulting infrastructure had close to a thousand services. At first, the developers thought all these services would be equal, but it turned out that some became foundation services that ran below a second aggregation layer. The breadth of services Vogels described includes:
A management infrastructure that provides services such as authentication, authorization, auditing, performance management, metering, accounting and billing;
A connection infrastructure with messaging, notification and workflow services; and finally
A resource knowledge management layer, consisting of directories, resources and brokers, along with indexing and search.
The second stage came when Amazon.com started offering its services to the outside world, beginning with Target, but soon extending to other retailers wishing to set up online stores. Core services that Amazon.com had to provide included identity management, order processing, fulfillment and customer service, product and offers management, content generation and discovery.
The interesting discovery that Amazon.com made when it embarked on this stage of its journey was that there was an extra stage that inserted itself in the usual cycle of develop -> test -> operate, and which could soak up as much as 70 to 80% of the time the developers spent on a project. Vogels called this extra stage "the elephant in the room", and defined it as the "undifferentiated heavy lifting" required to hone the infrastructure so that it would deliver reliable services. "We needed to fix that problem," he said.
The fix that Amazon.com's developers evolved, the automation of all those infrastructure issues, is embodied in the Amazon Web Services product set of compute (EC2), storage (S3) and messaging (SQS). Vogels calls this Web scale computing, which he defines as, "Scalable infrastructure that allows your applications to meet infinite demand, cheaply and reliably." He believes the economic argument for having a third-party provider take care of that is compelling: "You can have your engineers spend most of their time on infrastructure work, things that do not matter. All of you [SaaS providers] have to invest in this infrastructure. It would be silly to go into that in my eyes."
No ISV embarking on a SaaS initiative wants four-fifths of its development effort bogged down in infrastructure work, especially not when it would be reinventing the wheel, repeating a journey that providers like Amazon.com, Salesforce.com and many others have already completed. In an ideal world, these infrastructure issues would already be automated in off-the-shelf packages from systems vendors such as IBM, Microsoft, Oracle and others. But with the noteworthy exception of Progress Software, no conventional platform vendor has yet got to grips with the SaaS proposition. IBM, as noted above, has just promised to release a first attempt at a packaged cloud computing platform, while the architecture team at Microsoft has been busy prodding the various server development teams into prioritizing SaaS readiness. Oracle can point to a number of SaaS providers that use its database and other middleware products, as well as its own OnDemand business unit, but it doesn't offer a toolkit or any documentation of best practice for deploying its technologies in a SaaS environment. Amazon, by contrast, is advanced enough that it claims its infrastructure can survive entire data center failures: "Amazon has built ourselves as able to survive multiple data center failures," said Vogels, all as part of the service.
Given the lack of packaged solutions, the Amazon services provide a useful alternative, despite some shortcomings, in particular the lack of database persistence and, SLA notwithstanding, a troubling lack of information and responsiveness when problems do occur. It may not be the best route in the long term (something I'll elaborate on in future postings). But, either on their own or coupled with additional services such as RightScale and rPath (see my next posting; disclosure: rPath just had me do a webinar), Amazon's services do seem to have proven a viable shortcut to SaaS for some vendors.