Google gets a cold and the world gets pneumonia
Summary: Well, maybe not pneumonia, but at least a nasty case of bronchitis.
Gmail went down on Monday. Not for a particularly long time. 33 minutes from outage to complete resolution, in fact. Late risers on the west coast probably wouldn't even have known about it if not for panicking tech pundits from the east coast. To hear Wired talk about it, this portends the end of the world as we know it. OK, they weren't quite that over-the-top, but they, like many news outlets, had some very dramatic sound bites about the issue.
I'm not dismissing this outage, by the way. I live, eat, and breathe Google and the Gmail outage (caused by a bad update to their load-balancing software) had ripple effects across many related services (including the Chrome browser for users who, like me, choose to sync data across their various services). This isn't a small thing and, in fact, leads to the title of this article.
For Google, it was a hiccup. A bit of bad software rolls out, doesn't work, and gets rolled back. For the millions of people who rely on Google to get their jobs done, to enable important (and sometimes critical) business and personal communications, to write and calculate and advertise and sell, even a minor blip is cause for concern. As one analyst posed in the aforementioned Wired article,
“Imagine a scenario where you can’t even open your Android phone or you can’t get phone calls on Google Voice. It’s not just your browser.”
Given the market penetration of Android and projected domination of the mobile space, this sounds like a nightmare scenario. One wrong move from Google and all of our phones, tablets, Chromebooks, browsers, and communication tools go dead, assuming we've bought into the whole Google ecosystem (and many of us have). Doctors don't get urgent messages, stocks don't get traded, teenagers around the world stop texting for half an hour...you get the idea.
In reality, it's also a pretty damned unlikely scenario. Problems like those encountered Monday are rare, in part because Google's business model relies on the trust of its users. Google has the ultimate vested interest in ensuring problems like these don't happen.
Let's also keep in mind that Google detected the problem via its own monitoring software within 21 minutes and took action 7 minutes later. Just a few minutes later, the bad update was rolled back off of its production servers. There aren't many IT departments that can claim that sort of response time for on-premise communication and collaboration software. All users had to do was tweet about the Gmail outage for half an hour and they were back up and running.
Yes, there are risks involved in putting all of your IT eggs in one basket, whether that basket is in Mountain View, Redmond, Seattle, or somewhere else. What's the alternative, though? Several disparate systems from several vendors, requiring either separate federation systems or countless user logins? Or expensive, highly redundant on-premise solutions? Even Microsoft and its partners are doing a healthy business selling hosted solutions because they generally save time and money.
Whether your system of choice comes from Google, Microsoft, Amazon, Apple, or sits in your own datacenter, someday it's going to go down. Service providers strive for "five nines" or 99.999% uptime. That's a great goal, but even that goal (a stretch for many) implies that some downtime is inevitable.
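To put a number on "some downtime," here is a rough back-of-the-envelope sketch of what a given availability target allows per year. The function names are made up for illustration, and the last line assumes (which this article does not claim) that Monday's 33-minute outage was the only downtime in a 365-day year.

```python
# Rough availability arithmetic; a sketch, not any provider's actual SLA math.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def availability_after_outage(outage_minutes: float) -> float:
    """Yearly availability if this one outage were the only downtime (assumption)."""
    return 1.0 - outage_minutes / MINUTES_PER_YEAR


if __name__ == "__main__":
    targets = [("three nines", 0.999), ("four nines", 0.9999), ("five nines", 0.99999)]
    for label, target in targets:
        print(f"{label}: about {downtime_budget_minutes(target):.1f} minutes per year")
    # Hypothetical: a single 33-minute outage and nothing else all year.
    print(f"33-minute outage alone: {availability_after_outage(33):.4%} availability")
```

Under those assumptions, five nines leaves a budget of a little over five minutes a year, so a single half-hour outage already overshoots it while still sitting comfortably above four nines.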
Google's success means that even that tiny amount of downtime has wide-ranging, worldwide effects and commensurate headlines and Twitter outrage. However, it's important to keep this in perspective. When a plane crashes, it makes headlines for days. Hundreds of people might die at once. And yet 3000 people die every day worldwide in car accidents, very few of which we ever hear about. It's a matter of scale that makes front-page news.
Is the scale of Google or Amazon reason enough to avoid the cloud? Not at all. For most businesses and individuals, the conveniences and cost savings make occasional downtime an extremely reasonable risk. The key is managing panic when things do go wrong, as well as demanding that cloud providers (the big guns in particular) continue to innovate and offer better reliability at better prices than we can achieve ourselves.
Talkback
you address some good points
Yes, but...
Sure, their outage was longer, but it was a hardware issue, not a bad software patch that could be rolled back.
Perspective
Office 365 goes down for between 6 and 9 1/2 hours, twice in one month, and there is hardly a murmur.
Managing expectations can be a challenge.
Try something like
that's typical of Google. You can't point to one issue, ignoring all the others, and claim "see how great they are". If what you described was typical of them, then yeah.
But it's not
Makes sense, but explain that to an executive
Where an "executive" is someone who reached their level of incompetence
But since they are now paying as little as $35K for overworked system administrators, largely so there is someone else in the organization to blame and threaten for their crappy decisions, can we stop pretending IT is a place where there are good jobs worth trying to keep? Because those days are gone.
People seem to happily choose such a level of pay
I might have forgotten to add the sarcasm tag...
They're better off with Google than the alternatives
Google Down?
Or what they believed they would get
Remember it's free
No, it's not free; the price is scanning through everything you write and
The conspiracy theory mockery is starting to get old.
NO! Get your head out of the sand!
They have automated computers to sift through your stuff picking up on keywords to sell you stuff... And you know what? That is a completely reasonable trade-off for a free email and web search service that happens to be better than all the rest IMO, not to mention all the rest of their free services.
Most businesses using Gmail...
Summer Wars
The issue is allowing Google to put all IT needs into a single basket; the problem is centralization. I keep a lot of redundancy and decentralize my needs. For example, my domain name is registered with a separate company from the one I use for web hosting. I will be getting a separate email service for the domain registrant address and to serve as a backup if I suddenly need to transfer my email accounts to a different company.
Bah
For the vast majority of users, it meant a whole half-hour of unaccustomed peace and quiet they had to find a way to fill or a minor inconvenience to be ridden out. The "lost productivity" people always scream about was in large part avoided by a little creative reshuffling of schedules and tasks. Some sales were put off a little while or went to a competitor who happened to be using a different provider, which will no doubt go down one of these days and send a few sales back. And the world went on. (Anyone relying on Google--or any other single service provider, without any fallback plan for outages--for true life-or-death matters needs to have their head examined in the first place.)
How the hell did some of these people manage--or how would they have managed, if they're too young to remember--before there was an internet and cell phones? The world doesn't stop turning because of a few missed messages.
Real service companies do rollouts of new service software like this to a
No system is foolproof.
Remember "America Offline"?
In this case, I'm pretty sure that at the very least, Google will fix Chrome so that it doesn't crash if Google services go down again.