Worldwide Gmail, Chrome crash caused by sync server error
Summary: Gmail crumbled for 40 minutes while Google Chrome browsers crashed almost simultaneously around the world (in some cases, numerous times). Despite the apparent connection, the two problems were not linked. Here's what happened.

Google's email service crumbled yesterday for about 40 minutes, leaving millions of enterprise and consumer users without access to their cloud-stored email.
Gmail didn't fall down due to a denial-of-service attack, as was initially reported yesterday (a report that was quickly amended) despite there being no evidence to suggest one. The search giant said on its dashboard status pages: "Although our engineering team is still fully engaged on investigation, we are confident we have established the root cause of the event and corrected it."
Millions of Google Chrome browsers crashed at around the same time. In some cases, Chrome crashed multiple times within a short period. (It happened to me. Chrome crashed about three times in the space of 20 minutes, annoyingly, as I was, ironically, writing about the Gmail outage and Chrome crashes.)
However, in spite of Google Chrome's sandboxing feature, which runs each tab and plug-in in its own process to prevent the browser from fully crashing if a plug-in or bad bit of Web site code causes issues, the entire browser crashed, taking any unsaved work with it.
Google engineer Tim Steele took to the firm's developer forums to confirm that, in spite of the apparent link between Gmail's outage and the Chrome crashes, it was Google Sync that was causing the browser to crash worldwide, which ultimately had a knock-on effect on other Google services, including, but not limited to, Gmail, Google Docs, Drive and Apps.
Google Sync keeps a user's Chrome browser in sync when they log in to their browser. Bookmarks, extensions, apps and settings are transferred across to the new Chrome browser on another machine when a user logs in.
But this back-end service's failure had a knock-on effect on Chrome browsers. (Presumably, browsers that aren't set up to synchronize settings were not affected.) Steele noted that Google's Sync Server relies on a component to enforce quotas on per-datatype sync traffic, which failed. The quota service "experienced traffic problems today due to a faulty load balancing configuration change."
He added: "That change was to a core piece of infrastructure that many services at Google depend on. This means other services may have been affected at the same time, leading to the confounding original title of this bug."
As a result, Google's Sync Server "reacted too conservatively" by telling the Chrome browser to "throttle 'all' data types," without taking into account the fact that the browser doesn't support all these data types. This caused Chrome to crash en masse around the world.
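The failure mode Steele describes, a client being told to act on data types it does not support, can be illustrated with a short sketch. This is not Chrome's or Google's code: the `DataType` enum, the `types_to_throttle` function and the shape of the server response are all hypothetical, chosen only to show one defensive way a client might handle an instruction that names unfamiliar types.

```python
from enum import Enum


class DataType(Enum):
    """Hypothetical set of data types this client knows how to sync."""
    BOOKMARKS = "bookmarks"
    EXTENSIONS = "extensions"
    APPS = "apps"
    SETTINGS = "settings"


def types_to_throttle(server_types):
    """Return the supported data types the server asked us to throttle.

    Names the client does not recognize are skipped instead of raising,
    so an overly broad instruction cannot take the client down.
    """
    known = {t.value: t for t in DataType}
    throttled = set()
    for name in server_types:
        data_type = known.get(name)
        if data_type is None:
            # Unknown to this client: ignore it rather than fail.
            continue
        throttled.add(data_type)
    return throttled


# A (made-up) server response naming a type this client has never heard of.
response = ["bookmarks", "settings", "hypothetical_future_type"]
print(sorted(t.value for t in types_to_throttle(response)))
# prints ['bookmarks', 'settings']
```

The design choice here is simply to treat the server's list as untrusted input and filter it against what the client actually supports before acting on it.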
The 'too-long, didn't-read' version is that Google changed something, it didn't work, and it caused the crashes. No hackers were involved, and the outage and crashes certainly were not a result of a denial-of-service attack.

Talkback
Who edited this?
Incorrect wording?
That I agree with
Clear Skies over Google
Quite right
The important thing is that millions of customers had no e-mail ...
Google could at least sequester enterprise services from consumer services. Not doing so is just plain stupid.
Downtime happens
High availability is possible with careful system design
Jeffd
Agree. This casual acceptance of mediocrity
Have to agree
Yeah, the cloud should work better, but the complaints are overwrought.
i think you forgot to add "since last summer"
Amen
"... lost mission critical data..."
It is bad enough that I am tethered to my work by an unreliable Comcast which I have attempted to remedy by paying for AT&T as a backup, which often uses the same infrastructure that Comcast utilizes.
Relying completely on Google and Gmail for one's "mission critical" tasks is something for novices, including those who live by Huffington Post and other mainstream media for their reality news checks. But a professional must have some backup plans.
Do I believe Zack's iteration 100% for what it attempts to explain? Nope. Though it goes a long way to explain what might have happened, an incomplete infrastructure that continues to operate in constant Beta Mode can never be entirely explained away.
Does anyone else ever get tired of living in the box where we are waiting for the next failure so we can get out of our work station chair to get a coffee refill or put another log on the fire? Or is it just me?
...Lost Mission Critical ...
I use various clouds. But I only keep information there that I won't miss if the system blows up and burns down.
hmm
As you are logged in
IE bookmark is local, surf history is local
well some of us use chrome on multiple devices
I use chrome on multiple pc's/laptops, my nexus phone, and nexus7 tablet.
Multiple devices
so big brother will know