Google hopes its cloud will have greater redundancy and reliability than that provided by Amazon Web Services, thanks to some of the secretive technology the cloud sits on.
Craig McLuckie, product manager for Google Compute Engine, explained to me last week that Google has "worked very hard to make sure we're not subject to those types of [Amazon] situations", referring to the severe AWS failure that happened last month.
Google is able to do this because Google Compute Engine (GCE) runs on the same advanced technology that powers its search engine, he says.
"For the most part, GCE is positioned as a way for customers to benefit from years and years of infrastructure investments, which span everything from our datacentre design to our operational practices, our hardware design and software design, [and] includes the software stack," McLuckie said.
Google is designing its cloud to be "similar" in structure to Amazon Web Services, he said, to let customers put workloads in specific datacentres located geographically close to where requests for data are coming from. However, Google hopes its technology will help it defend against the type of cascading software failure that bit into Amazon's cloud in July.
To this end, McLuckie said Google is working to make a component of GCE — Google Cloud Storage — immune to the types of failures that have hit AWS. He says it is trying to make sure that workloads in the storage cloud can function as part of a global state, rather than be regionally bound.
McLuckie would not go into the specific bits of software that let this happen. But based on information that has trickled out of the company over the past few years, it's possible to form a picture.
Engine uses Spanner
To start with, McLuckie says Google is able to offer "deep replication [of data] out of zone, spread out across the world", paired with "high consistency guarantees", for data kept in Google Cloud Storage.
This is almost certainly made possible by a Google technology named Spanner, which lets the web giant seamlessly migrate and replicate datacentre workloads across the world.
Spanner was described by Google Fellow Jeff Dean in a presentation given at the LADIS conference in 2009 (PDF) as an in-development technology that gave Google a "storage and computation system that spans all of our datacentres", whose goal was to assure the movement and replication of data and tasks according to constraints and usage patterns.
It strikes me that it must be this technology, or a variant of it, that lets Google have confidence in its ability to shift workloads across datacentres in the case of a cloud outage.
The reason why it's worth paying attention to Google's cloud lies less in what it does for customers — so far, it seems like a capable Amazon competitor, backed by proven infrastructure — and more in the technology it uses.
This is because some of Google's in-house technologies spawned the industry-leading big data platform Hadoop. What people sometimes forget is that Hadoop is based on technology developed by Yahoo, which was in turn based on a Google academic paper released in 2004 (PDF). This means the most advanced open-source data system around is based on technology that, for the Mountain View-based search engine provider, is almost a decade old.
McLuckie was coy when asked about whether some of Google's cloud-computing technology may some day be pushed into open source.
"Practically, we invest a tremendous amount of R&D into producing some of the world's finest infrastructure," he said. "Frankly, it's really important that we can maintain that level of investment and protect these levels of investments."
But Google — like Amazon, Microsoft and all other major cloud providers excluding OpenStack — seems set on keeping its technology secret for the time being.
"We don't necessarily aim to explain in every detail," McLuckie said, noting Google has a lot of "secret sauce" powering its cloud — and for the time being, it will keep the recipe in-house.