updated below with responses from Google, AT&T:
I realize that emotions can run high when things like e-mail suddenly stop working. This morning, the twittersphere blew up with reports of widespread troubles with Gmail and other Google services. Just as quickly, some used this opportunity to bash the reliability of the cloud.
ZDNet's own Larry Dignan did some IT detective work this morning to see if he could pinpoint the problem - at least the problem as it existed from the CBS offices in New York. His initial assessment, which has since changed, is that this appeared to be an AT&T problem. He grabbed an image to illustrate what he found and wrote in one of his live updates:
In the stray email department, I was informed that it’s an AT&T routing issue. Anything that touches Google via AT&T is down... Got our resident IT guru to explain this in English. It does appear Google is stopped at the AT&T border. Note: This may just be New York City specific.
Obviously, there's no clear-cut answer here yet. Other readers - non-AT&T users - chimed in to say that they were also experiencing problems with Google services. So what gives? I can't answer that question but I'm sure Google will provide an answer soon enough. (updated below)
Until then, businesses should use this opportunity to put more thought into contingency plans. Maybe companies that are thinking about adopting a cloud strategy - such as Google Apps - need to look into backup clouds. Maybe there should be a hybrid approach - at least initially.
I remain a fan of the cloud and wouldn't let this outage steer me back to the old way. There have been plenty of instances in the past when Exchange servers have gone down and e-mail has been crippled because of it. I can think of several incidents when e-mail was down for hours, even a couple of days, while the IT folks at the office scrambled to solve the problem.
From what I can tell, this outage lasted only about an hour or two - and things are starting to get back to normal. Maybe that's because the folks at Google - or some ISPs - who specialize in what they do were able to jump right into the problem and start making repairs. As crazy as it sounds, that quick reaction may actually work in Google's favor: it shows that, even when problems happen, Google has the on-site experts who can be dispatched to fix things - fast.
update: It turns out that Google is taking the blame for this one, pointing to a systems error that re-routed Web traffic through Asia, where it got stuck in a traffic jam. The company took its lumps for the error, saying in a blog post this afternoon:
We've been working hard to make our services ultrafast and "always on," so it's especially embarrassing when a glitch like this one happens. We're very sorry that it happened, and you can be sure that we'll be working even harder to make sure that a similar problem won't happen again.
The company said that 14 percent of its users experienced slowdowns or outages. In the meantime, AT&T - picking up some buzz about possibly being to blame - issued a statement (and a tweet) that read:
After receiving speculative reports in the media that Google experienced an outage related to the AT&T network, we looked into the matter. We have not identified any specific problems in our network that could have caused the reported outage.