Whether it's Amazon or Microsoft, there's (still) no foolproof cloud
Talk about strange timing: Yesterday, I heard from a business user of Microsoft's Windows Azure cloud platform who said that his company had been taken down by an Azure storage outage that lasted for six hours on April 15.
A day later, the Web is abuzz with news about an Amazon EC2 outage (going on 10 hours as I type this post) that seems to be centered around the company's cloud storage components.
As Amazon does with AWS, Microsoft maintains public dashboard pages showing the real-time status of all of its Azure-related components. From the Azure Storage page, it looks like there've been Azure storage problems resulting in "service degradations" not just on April 15 (in the North Central and South Central regions), but also on April 19 (in East Asia and Western Europe).
I've asked Microsoft for more details about what specifically happened on April 15 that caused the reported downtime and am awaiting word back.
Update (4/22): Microsoft isn't saying much about the outage, other than to acknowledge it happened. The official response, delivered through a company spokesperson:
"At 6:40 AM PDT on April 15th, Microsoft became aware of an issue that affected some customers using the Windows Azure Storage service in the North Central and South Central US regions. This issue has been resolved. We regret any inconvenience the outage may have caused our impacted customers. As always, we will investigate the cause of this issue and take steps to better ensure it doesn’t happen again."
The user who contacted me -- who asked not to be named -- said he believed that a misconfiguration during a storage deployment hit both the North Central and South Central U.S. regions at the same time and affected the way the load balancers were sending traffic. The user wanted to know more details about exactly what happened and what Microsoft is doing to head off similar types of problems in the future.
I'm not posting this to downplay what's going on with Amazon's EC2. Nor am I doing so because I've heard Microsoft or Microsoft partners trying to use Amazon's EC2 outage as a way to paint Azure as superior. (In fact, one member of the Azure team tweeted today that he hoped no one at Microsoft would do such a thing.)
Outages and glitches happen across the cloud, not just on the infrastructure side, but on the cloud apps side, too. They're a good reminder about the importance of backup/redundancy and the need to distribute one's cloud storage across multiple geographic locations, if and when possible, as one of my ZDNet UK colleagues tweeted today.
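To make the point about geographic redundancy concrete, here is a minimal sketch in Python of writing each object to more than one region and falling back to another region on reads. Everything in it is an assumption for illustration: `RegionStore` is an in-memory stand-in rather than a real Azure or AWS client, and the region labels are hypothetical names loosely based on the regions mentioned above.

```python
class RegionUnavailable(Exception):
    """Raised when a (simulated) storage region cannot serve requests."""


class RegionStore:
    """Hypothetical in-memory stand-in for one region's storage endpoint."""

    def __init__(self, name):
        self.name = name
        self.available = True  # flip to False to simulate an outage
        self._blobs = {}

    def put(self, key, data):
        if not self.available:
            raise RegionUnavailable(self.name)
        self._blobs[key] = data

    def get(self, key):
        if not self.available:
            raise RegionUnavailable(self.name)
        return self._blobs[key]  # KeyError if this region never got the write


class ReplicatedStore:
    """Writes to every reachable region; reads from the first one that answers."""

    def __init__(self, regions):
        self.regions = list(regions)

    def put(self, key, data):
        written = []
        for region in self.regions:
            try:
                region.put(key, data)
                written.append(region.name)
            except RegionUnavailable:
                continue  # skip regions that are down
        if not written:
            raise RegionUnavailable("no region accepted the write")
        return written

    def get(self, key):
        for region in self.regions:
            try:
                return region.get(key), region.name
            except (RegionUnavailable, KeyError):
                continue  # try the next region
        raise RegionUnavailable("no region could serve " + key)


# Usage: store a blob in two regions, then simulate an outage in the first.
north = RegionStore("north-central-us")
europe = RegionStore("western-europe")
store = ReplicatedStore([north, europe])

written_to = store.put("invoice-42", b"payload")
north.available = False
data, served_by = store.get("invoice-42")
print(served_by)
```

The sketch leaves out the hard parts: a region that misses writes during its own outage has to be resynchronized afterward, and replicating every write adds cost and latency.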
Talkback
RE: Whether it's Amazon or Microsoft, there's (still) no foolproof cloud
Banking on cloud is like banking your retirement on Social Security
RE: Whether it's Amazon or Microsoft, there's (still) no foolproof cloud
Don't worry, if your machines don't break, your people will get sick. Murphy's Law.
In-house outages are a different animal entirely.
First off, if the outage is purely an in-house problem, then it's at least a problem of your own making, so to speak. Of course, in-house outages may stem from sources you have little or no direct control over, but at least you have the sense that it's your problem and that you are taking steps to deal with it.
And therein lies the rub. When you are the one dealing with the problem, it brings an entirely different dynamic to the whole coping strategy and optics, as opposed to simply knowing your services are down and hoping to heaven that those out there in control will bring them up again ASAP.
When it's your problem and you're the one working on it, the situation brings many things to the table that are important to long-term decision making. Firstly, you are likely to get far more informative and timely updates on the current status of your outage. This, above all, is of paramount importance for those in positions of responsibility who make the decisions. It at least brings some confidence to the recovery process if the answers one is getting indicate that everything that can be done is being done. It also helps to know that those working on the problem are working on your problem specifically, as opposed to doing what is best in general for the company hosting your services, even if that means further delays for you specifically.
Could the in-house outage be longer than the cloud outage? Of course; it's kind of a crazy question, actually. It's like asking which is likely to be worse, falling ill at home or in a hospital. Without any further parameters characterizing the question, it's almost pointless. You might catch a cold at home but catch some kind of flesh-eating disease at a hospital. Or vice versa. Or whatever. A more important point is this: any company providing reliable cloud-based services should by all accounts have numerous backup and fail-safe protocols that the average small business just doesn't get into. In that respect, the cloud-based service is generally more reliable when serious issues arise that need those kinds of things in place. On the other hand, when something goes really wrong at one of these big service providers, it could be very bad, because with the kind of backup they have, only something really bad would typically have an impact.
It's like my father used to say about four-wheel-drive vehicles: they don't get stuck often, but when they do, it's a disaster.
Without an in-house backup, you are really left to the mercy of the powers that be with cloud computing. You don't get detailed, timely, on-the-spot updates on the recovery process, and you have little to no say in any remedial plans to avoid the same thing happening in the future. And of course, as an individual person or organization, your interests have little to no priority over those of the provider or of any of the multitudes it is servicing. And the fact is, many of the better in-house backup plans might well call into question the need for cloud-based services at all.
Cloud; not yet ready for today.
using the cloud is fine IF...
mmichalik, the offense is from you
:|
"We don't need no steenkin' server outages!"
I'll stick to running my own cloud services
In the end, it's still hardware
These outages are all about money not technology.
Our business case was to save the company huge dollars, and it worked well. So the technology is there to do it. It would literally take a natural disaster hitting all 4 of our major data centers at once to cause the type of outage that Amazon and MS are having.
However, and this is a BIG however: it was costly to do. The investment was in the billions of dollars.
What these companies are doing is a cost analysis, determining for you what acceptable downtimes are. They are probably right for 90% of their customers, or whatever percent they choose to plan for. They have obviously not designed for the type of business case we have. It is one of the reasons I think cloud computing has its place, but it is not the silver bullet everyone is proclaiming. It is like public education, where everyone gets taught the same; it does not work for everyone. Imagine the application for my pacemaker monitor going down for 6 hours. I could die before my Dr. is notified of any problem.
There is no reason Amazon and MS cannot eliminate these outages, except that the cost would probably price them out of the market for the average cloud user.
For this reason, you will not see mission-critical applications in the cloud unless these HA costs come down.
RE: Whether it's Amazon or Microsoft, there's (still) no foolproof cloud
The difference here is that your company created, in essence, a private cloud in which you invested the necessary capital and which you control; therefore it works as well as you designed it. MS and Amazon are operating under the premise of "how can we do this as cheaply as possible and still attract and keep our customer base?" 90% is what they believe customers will tolerate, since most customers either 1. cannot afford to implement it themselves or 2. are lazy and would rather just have the outage and take it in the rear.