Talk about strange timing: Yesterday, I heard from a business user of Microsoft's Windows Azure cloud platform who said that his company had been taken down by an Azure storage outage that lasted for six hours on April 15.
A day later, the Web is abuzz with news about an Amazon EC2 outage (going on 10 hours as I type this post) that seems to be centered on the company's cloud storage components.
As Amazon does with AWS, Microsoft maintains public dashboard pages showing the real-time status of all of its Azure-related components. From the Azure Storage page, it looks like there have been Azure storage problems resulting in "service degradations" not just on April 15 (in the North Central and South Central regions), but also on April 19 (in East Asia and Western Europe).
I've asked Microsoft for more details about what specifically happened on April 15 that caused the reported downtime and am awaiting word back.
Update (4/22): Microsoft isn't saying much about the outage, other than to acknowledge it happened. The official response, delivered through a company spokesperson:
"At 6:40 AM PDT on April 15th, Microsoft became aware of an issue that affected some customers using the Windows Azure Storage service in the North Central and South Central US regions. This issue has been resolved. We regret any inconvenience the outage may have caused our impacted customers. As always, we will investigate the cause of this issue and take steps to better ensure it doesn’t happen again."
The user who contacted me -- who asked not to be named -- said he believed a misconfiguration during a storage deployment hit both the North Central and South Central U.S. regions at the same time and affected the way the load balancers were routing traffic. He wanted to know more about exactly what happened and what Microsoft is doing to head off similar problems in the future.
I'm not posting this to downplay what's going on with Amazon's EC2. Nor am I doing so because I've heard Microsoft or Microsoft partners trying to use Amazon's EC2 outage as a way to paint Azure as superior. (In fact, one member of the Azure team tweeted today that he hoped no one at Microsoft would do such a thing.)
Outages and glitches happen across the cloud, not just on the infrastructure side, but on the cloud apps side, too. They're a good reminder of the importance of backup and redundancy, and of the need to distribute one's cloud storage across multiple geographic locations if and when possible, as one of my ZDNet UK colleagues tweeted today.