Google explained why its App Engine cloud service failed earlier this week. Following a similar failure at Amazon, the outage raises questions about whether cloud computing is ready for mission-critical application deployment.
A company representative described the solution in a discussion forum:
We've identified the root cause of the issue and implemented a fix. Specifically, we've instituted a set of controls to ensure 1) that datastore queries no longer trigger this particular bug and 2) that bugs like this in the future don't affect the stability of the system as a whole.
Google's outage gives cloud computing a black eye and may erode user confidence, leading developers to delay adopting the service. When Amazon's S3 cloud service went down last February, developers experienced real pain.
Google's message acknowledged that the company is still immature in how it responds to users:
We're also trying to make sure that we build effective ways to communicate with developers about the hiccups that occasionally occur with large and complex systems like this, and we'd welcome your feedback and ideas.
While the admission is welcome, it reflects a more basic set of customer service problems at Google.
ZDNet blogger Garett Rogers wrote about this topic:
One beef I’ve had for quite a while now is Google’s noticeable lack of commitment to personal support for people using their products and services.
Of course, the broader question of cloud computing's readiness won't be resolved today. At the very least, however, Google should immediately implement real-time service-level dashboards to increase transparency.
Amazon implemented service reporting after its failure, and Salesforce.com offers a best-in-class example of dashboard openness.