How Google Compute Engine hopes to sidestep AWS failures

Summary: Google is calling on some of its most sensitive technologies in an attempt to give its cloud greater redundancy and reliability than Amazon Web Services's cloud, though it plans to keep them proprietary.

TOPICS: Cloud, Google

Google hopes its cloud will have greater redundancy and reliability than that provided by Amazon Web Services, thanks to some of the secretive technology the cloud sits on.

Craig McLuckie, product manager for Google Compute Engine, explained to me last week that Google has "worked very hard to make sure we're not subject to those types of [Amazon] situations", referring to the severe AWS failure that happened last month.

Google hopes that its Google Compute Engine can avoid the sort of outages that have downed Amazon Web Services in the past. Image credit: Google

Google is able to do this because Google Compute Engine (GCE) runs on the same advanced technology that powers its search engine, he says. 

"For the most part, GCE is positioned as a way for customers to benefit from years and years of infrastructure investments, which span everything from our datacentre design to our operational practices, our hardware design and software design, [and] includes the software stack," McLuckie said.

Google is designing its cloud to be "similar" to Amazon Web Services's in structure, he said, to let customers put workloads in specific datacentres located geographically close to where requests for data are coming from. However, Google hopes its technology will help it defend against the type of cascading software failure that bit into Amazon's cloud in July.

To this end, McLuckie said Google is working to make a component of GCE, Google Cloud Storage, immune to the types of failures that have hit AWS. He says the company is trying to make sure that workloads in the storage cloud can function as part of a global state, rather than being regionally bound.

McLuckie would not go into the specific bits of software that let this happen. But based on information that has trickled out of the company over the past few years, it's possible to form a picture.

Engine uses Spanner

To start with, McLuckie says Google is able to offer "deep replication [of data] out of zone, spread out across the world", paired with "high consistency guarantees", for data kept in Google Cloud Storage.

This is almost certainly made possible by a Google technology named Spanner, which lets the web giant seamlessly migrate and replicate datacentre workloads across the world.

Spanner was described by Google Fellow Jeff Dean in a presentation given at the LADIS conference in 2009 (PDF) as an in-development technology that gave Google a "storage and computation system that spans all of our datacentres", whose goal was to assure the movement and replication of data and tasks according to constraints and usage patterns.

It strikes me that it must be this technology, or a variant of it, that lets Google have confidence in its ability to shift workloads across datacentres in the case of a cloud outage.

The reason why it's worth paying attention to Google's cloud lies less in what it does for customers — so far, it seems like a capable Amazon competitor, backed by proven infrastructure — and more in the technology it uses.

This is because some of Google's in-house technologies spawned the industry-leading big data platform Hadoop. What people sometimes forget is that Hadoop is based on technology developed by Yahoo, which was in turn based on a Google academic paper released in 2004 (PDF). This means the most advanced open-source data system around is based on technology that, for the Mountain View-based search engine provider, is almost a decade old.

Open source

McLuckie was coy when asked about whether some of Google's cloud-computing technology may some day be pushed into open source.

"Practically, we invest a tremendous amount of R&D into producing some of the world's finest infrastructure," he said. "Frankly, it's really important that we can maintain that level of investment and protect these levels of investments."

But Google — like Amazon, Microsoft and all other major cloud providers excluding OpenStack — seems set on keeping its technology secret for the time being.

"We don't necessarily aim to explain in every detail," McLuckie said, noting Google has a lot of "secret sauce" powering its cloud — and for the time being, it will keep the recipe in-house.

Jack Clark

About Jack Clark

Currently a reporter for ZDNet UK, I previously worked as a technology researcher and reporter for a London-based news agency.

Talkback

5 comments
  • What happened to their GAE

    That's what they were touting back in 2008/9. It's all silence now. I doubt anything has changed with this latest GCE stuff.
    LBiege
    • How about using this technology for Gmail?

      Why do we see outages in Gmail then? Or is it that the sites which make money for Google like Search are the only ones to get the privilege of using these technologies?
      mm71
      • Gmail was "once in a blue moon"...

        The Gmail failure was very extreme, a top Google engineer told me a year ago (http://www.zdnet.com/google-at-scale-everything-breaks_p2-3040093061/). It seemed to be a very severe cascading failure with unknown bugs. They had to recover data from TAPE!?
        However, I take your point - Google is not immune from this, even with Spanner, but it does seem like a better technical approach than AWS from my (admittedly info-limited) position.
        JC
        Jack Clark
  • GAE folded into GCE

    LBiege,
    Thanks for commenting. Funnily enough I'm working on a story on this at the minute. GAE was their PaaS and had little uptake (from what I can tell) and they did some pricing tweaks that infuriated its developers. GAE has basically been folded into GCE. This is because most money/use for the enterprise seems to be, today, in IaaS rather than PaaS, so Google shifted down. I think GCE has a good proposition (one of the original architectures of Amazon Web Services said some nice things about it over here http://blog.b3k.us/2012/07/04/cloud-independence-day.html ) so I think it's worth keeping an eye on Google in this area.
    Thanks for commenting!
    Jack
    Jack Clark
    • Spelling errors

      * architectures >> architects
      Jack Clark