Worldwide Gmail, Chrome crash caused by sync server error

Summary: Gmail crumbled for 40 minutes while Google Chrome browsers crashed almost simultaneously around the world (in some cases, several times). Despite the apparent connection, the two problems were not directly linked. Here's what happened.

Google's email service crumbled yesterday for about 40 minutes, leaving millions of enterprise and consumer users without access to their cloud-stored email. 

Gmail didn't fall down due to a denial-of-service attack, as was initially reported yesterday. That report was quickly amended, and there was no evidence to support it in the first place. The search giant said on its status dashboard: "Although our engineering team is still fully engaged on investigation, we are confident we have established the root cause of the event and corrected it."

At around the same time, millions of Google Chrome browsers crashed. In some cases, Chrome crashed multiple times within a short period. (It happened to me. Chrome crashed about three times in the space of 20 minutes, annoyingly, as I was -- ironically -- writing about the Gmail outage and Chrome crashes.)

However, in spite of Google Chrome's sandboxing feature, which isolates each tab and plug-in so that a misbehaving plug-in or bad bit of Web site code cannot bring down the whole browser, the entire browser crashed, taking any unsaved work with it.

Google engineer Tim Steele took to the firm's developer forums to confirm that, in spite of the apparent link between Gmail's outage and the Chrome crashes, it was Google Sync that was causing the browser to crash worldwide. The problem ultimately had a knock-on effect on other Google services, including but not limited to Gmail, Google Docs, Drive and Apps.

Google Sync keeps a user's Chrome data in sync across machines. When a user logs in to Chrome on another machine, their bookmarks, extensions, apps and settings are transferred across to that browser.

But this back-end service's failure had a knock-on effect on Chrome browsers. (Presumably, browsers that aren't set up to synchronize settings were not affected.) Steele noted that Google's Sync Server relies on a component to enforce quotas on per-datatype sync traffic, and that component failed. The quota service "experienced traffic problems today due to a faulty load balancing configuration change."

He added: "That change was to a core piece of infrastructure that many services at Google depend on. This means other services may have been affected at the same time, leading to the confounding original title of this bug."

As a result, Google's Sync Server "reacted too conservatively" by telling the Chrome browser to "throttle 'all' data types," without taking into account the fact that the browser doesn't support all of these data types. This caused Chrome to crash en masse around the world.
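
To make that failure mode concrete, here is a minimal sketch in Python. It is an illustration only, not Chrome's actual sync code: Steele's post does not describe the code path, so the function names, the set of supported types (borrowed from the list of synced items above) and the "future_type" entry are all hypothetical. It contrasts a client that treats an unrecognised data type as fatal with one that simply skips what it doesn't know.

```python
# Hypothetical illustration only; none of these names come from Chrome.
# Assumed set of data types this client build knows how to sync.
SUPPORTED_TYPES = {"bookmarks", "extensions", "apps", "settings"}


def apply_throttle_strict(throttled, supported=SUPPORTED_TYPES):
    """Fragile client: any unrecognised data type is treated as fatal."""
    paused = set()
    for data_type in throttled:
        if data_type not in supported:
            # In this sketch the "crash" is an unhandled exception.
            raise KeyError(f"unknown data type: {data_type}")
        paused.add(data_type)
    return paused


def apply_throttle_tolerant(throttled, supported=SUPPORTED_TYPES):
    """Defensive client: pause the types it knows, skip the rest."""
    requested = set(throttled)
    paused = requested & supported    # types this client can throttle
    ignored = requested - supported   # types this client has never heard of
    return paused, ignored
```

Given a server instruction such as ["bookmarks", "settings", "future_type"], the strict version raises on "future_type", while the tolerant version pauses bookmarks and settings and reports "future_type" as ignored. Whether Chrome's fix took this shape is not stated in Steele's post.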

The 'too-long, didn't-read' version is that Google changed something, it didn't work, and it caused the crashes. No hackers were involved, and the outage and crashes certainly were not a result of a denial-of-service attack.

(via Wired)

Topics: Google, Browser, Networking

Talkback

61 comments
  • Who edited this?

    The "More worryingly, however..." sentence is brought to you by the Department of Redundancy Department.
    GTGeek88
    • Incorrect wording?

      And shouldn't this wording - "which allows each tab and process to run in a separate process" - be changed to "which allows each tab and process to run in a separate thread"?
      GTGeek88
      • That I agree with

        And have changed. Thanks for the spot! -- Zack.
        zwhittaker
  • Clear Skies over Google

    Using the cloud is one thing; relying on it, quite another.
    jdm12@...
    • Quite right

      While "the cloud" offers some tempting features, I'm still not buying it. Why would I put myself at the mercy of "them" (whoever that may be) and the myriad of inter-connectivities between "them" and me? Cloud computing is an Illuminati / globalist wet dream. And they didn't even have to pay people to give up their privacy and independence. Indeed, people pay THEM to "host" their stuff. Of course, no faceless corporation (part of the infamous Corporatocracy that actually runs the planet) is ever going to rummage through our data, right? They promise they won't. Really. :-\
      naibeeru
  • The important thing is that millions of customers had no e-mail ...

    ... and no reliable Web access. Enterprise customers can translate that loss of service directly into lost dollars. Dollars in lost productivity. Dollars in lost sales. Since the browser crash lost unsaved data too, there could even be lost mission-critical data.

    Google could at least sequester enterprise services from consumer services. Not doing so is just plain stupid.
    M Wagner
    • Downtime happens

      Enterprise or cloud-based, downtime happens. I can say that in my year of living exclusively on cloud-based services that my downtime has been much less than what I experienced with enterprise based services (all 18 years of them). All companies are responsible for determining their risk mitigation strategies regardless of which style of hosting they choose. No service will have 100% up time, and for those who expect otherwise...there's this bridge you need to consider purchasing.
      jcraig36
      • High availability is possible with careful system design

        In the old telecom world, we designed the computer-based telecom switches to run 24/7 with only one hour of downtime per 20 system years of operation. It can be done if the consumers require it, but there is some extra cost in HW and engineering effort.

        Jeffd
        jeffrey.denenberg@...
        • Agree. This casual acceptance of mediocrity

          ensures you'll only get more of it.
          baggins_z
      • Have to agree

        This is the first 40 minutes of down time I have heard of from Google. If I have only 40 minutes of (network) down time a month at work (managed by CSC) I feel lucky.

        Yeah, the cloud should work better, but the complaints are overwrought.
        wiseoldbird
        • i think you forgot to add "since last summer"

          google is not known for robustness.
          Johnny Vegas
      • Amen

        For lack of a more appropriate affirmative reply outside of church; "Amen!"
        StuEZWebPlayer
    • "... lost mission critical data..."

      I am a work-from-home media designer; focusing 90% of my time on one mission; EZWebPlayer.com. If I completely relied upon one avenue of communication in order to fulfill EZWebPlayer's requirements for my services, I would be a fool, and my mission critical tasks would become mission unemployed in that theater.

      It is bad enough that I am tethered to my work by an unreliable Comcast which I have attempted to remedy by paying for AT&T as a backup, which often uses the same infrastructure that Comcast utilizes.

      Relying completely on Google and Gmail for one's "mission critical" tasks is something for novices, including those who live by Huffington Post and other mainstream media for their reality news checks. But a professional must have some backup plans.

      Do I believe Zack's iteration 100% for what it attempts to explain? Nope. Though it goes a long way to explain what might have happened, an incomplete infrastructure that continues to operate in constant Beta Mode can never be entirely explained away.

      Does anyone else ever get tired of living in the box where we are waiting for the next failure so we can get out of our work station chair to get a coffee refill or put another log on the fire? Or is it just me?
      StuEZWebPlayer
      • ...Lost Mission Critical ...

        That's why I trust the cloud about as far as I can throw a feather. All the cloud amounts to is a bunch of computers in some boiler room somewhere that still depend on the computer technology of today, including hard drives. If they lose information, so what! It's no skin off their noses.

        I use various clouds. But I only keep information there that I won't miss if the system blows up and burns down.
        pjones
  • hmm

    Why does the Chrome browser, which is a client application, need to sync to a Google server?
    FADS_z
    • As you are logged in

      How do you think the server keeps track of the content in each browser you are logged in to on different computers? It syncs to the server so you can have your bookmarks and surf history in the cloud.
      DannyGM
      • IE bookmark is local, surf history is local

        That is real personal, and I don't want anyone else keeping them.....
        FADS_z
        • well some of us use chrome on multiple devices

          so having all of this in the cloud is what we need.

          I use chrome on multiple pc's/laptops, my nexus phone, and nexus7 tablet.
          otaddy
          • Multiple devices

            Firefox has this feature as well. Would be nice if we could sync across browsers.
            William_OP
    • so big brother will know

      so they can spy on everything you do - Just kidding - or am I ??????
      dave0420