The glitch that stole

Summary: The company didn't act quickly enough to explain to users what was going on when it suffered an outage yesterday.


Pity tech consultant and blogger Jason Klernow, who has lost the use of both his blog and his sales automation service in the past week, as he reported on his blog yesterday:

"[The service] has been down for most of the day. No information at all on the outage. Only one other posting from Sales Force Watch. Why can't the company put up a simple web page with update information?

"Typepad (which I use for this blog) was out earlier this week and was quick to provide information and updates. While I can live without typepad for a day or several, it is hard to work without [the service]."

Today, news of the outage has hit the company's share price, and only now has it issued a statement to explain what went wrong:

On Tuesday, December 20th, some users experienced intermittent access (between approximately 9:30 am and 12:41 pm ET & 2:00 pm and 4:45 pm ET) on one of the company’s four global nodes. The root cause of the intermittent access was an error in the database cluster. [The company] addressed the issue with the database vendor. By Tuesday afternoon EST, the system was running normally for all users.

All four global nodes are currently operational and running normally.  There are no outstanding issues in the system at this time. No other aspects of the system were involved.

Although it appears that the outage only affected a minority of customers, those customers deserve an explanation, and should have had one faster. As one user told CNET yesterday:

"I wonder if we're in the cheap seats," [Charlie Crystle, CEO of Mission Research in Lancaster, Pa] said. "I wonder if the small customers get relegated to a cheaper service solution or something like that."

There's even a new blog called Gripeforce:

"I have started this blog as a forum for other disgruntled users of [the service]!

"I am sick of all the downtime, tired of the arrogant sales people (I feel like CS only contacts me when they want to sell more licenses), and if I never hear or see another interview with Marc Benioff again, it will be too soon."

All of this simply reinforces the message that it's not enough, as another provider has been boasting to me in an email today, to "have historically achieved 99.996% uptime." What's more important is to have a plan in place to contact customers and keep them informed when the unthinkable 0.004% of downtime unexpectedly wipes out their service for hours at a time.
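
To put those percentages in perspective, here is a rough back-of-the-envelope sketch in Python. It rests on several assumptions of mine rather than on anything the vendors have published: a 365-day year, the two outage windows from the statement above treated as fully down (an upper bound, since the statement describes access as "intermittent" and limited to one node), and the 99.996% figure, which comes from a different provider, used only as a yardstick. The function names are illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year


def downtime_budget_minutes(uptime_percent, period_minutes=MINUTES_PER_YEAR):
    """Minutes of downtime allowed in the period at a given uptime percentage."""
    return period_minutes * (100.0 - uptime_percent) / 100.0


def window_minutes(start, end):
    """Length in minutes of a same-day window given as 24-hour 'HH:MM' strings."""
    start_h, start_m = map(int, start.split(":"))
    end_h, end_m = map(int, end.split(":"))
    return (end_h * 60 + end_m) - (start_h * 60 + start_m)


# Yearly downtime allowance implied by a 99.996% uptime claim.
budget = downtime_budget_minutes(99.996)

# The two windows named in the statement, counted as if fully down (upper bound).
outage = window_minutes("09:30", "12:41") + window_minutes("14:00", "16:45")

# Share of a whole year that the outage would represent.
outage_percent_of_year = 100.0 * outage / MINUTES_PER_YEAR

print(f"Yearly budget at 99.996% uptime: {budget:.1f} minutes")
print(f"Outage windows on December 20th: {outage} minutes")
print(f"That is {outage_percent_of_year:.3f}% of a year")
```

On those assumptions, a 99.996% record allows roughly 21 minutes of downtime a year, while the two windows add up to 356 minutes. A single bad day can use up many years' worth of "allowance", which is why the communication plan matters more than the headline percentage.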



Phil Wainewright

About Phil Wainewright

Since 1998, Phil Wainewright has been a thought leader in cloud computing as a blogger, analyst and consultant.

  • Third Party Vendors

    I would like to thank Mr. Wainewright for this article. It is good to see a Third Party Vendor own up to their problems. However, it highlights the problem with TPVs. As with the recent spate of articles regarding TypePad, I really need to ask why in the world anyone would outsource not just a critical business process such as CRM, but critical, confidential data to a TPV. Wake up, people! This is your business!

    What if this article had been not about a database bug that took [the service] offline, but about a database bug that revealed one client's data to another client? Not unimaginable, not by a long shot. A few days ago, a client of mine dropped a configuration file from the wrong local directory onto the server, and poof! Their application started working just fine... pulling data from a different database schema. That is all it takes. The application appears to be working correctly, because it is working correctly. It's just getting the wrong data from an identical data stack.

    Every time I read something like this, I like to save the bookmark, so when my boss asks me why I am spending so much time installing software onto our server, or writing customer applications for ourselves, or requesting funds for additional hardware, redundancy, and software, I can show him why I refuse to ever have any part of our business be in the clutches of a third party vendor if possible.

    At this point, our business has these items sourced by a TPV:

    * Office space (we are working to buy our own building)
    * Utilities (electricity, water)
    * Telecommunications (phone service, Internet connectivity)

    We have some measure of redundancy on all of these items. We perform regular backups with offsite storage; in a major "crunch" situation I can have our server duplicated in a new location. I can (at the speed of backup recovery) have users able to access important files on an ad hoc basis. We have UPS protection for critical pieces of equipment, and are looking to extend that coverage. We all have personal cell phones in case the lines are down. I am currently investigating a redundant data carrier.

    Is this expensive? Heck yes. But thanks to articles like this, my boss sees the need for us to never be held hostage.

    Folks: Get off the TPVs and end your pain. You are using a TPV because you are too lazy to learn how to run the system yourself, and too cheap to run the system yourself. These are the same reasons why home users don't run antivirus products, and then get viruses. This is why people who don't lock their doors get robbed. Laziness and cheapness are two of the biggest drivers behind computer problems, if not most problems in this world. End your addiction to laziness and cheapness. Recognize that doing IT *right* requires a large investment in time, money, effort, sweat, frustration, and knowledge, just like any other aspect of your business. Recognize that yes, IT is a cost center, but cheap IT is a disaster center.

    And get away from any TPV that you can, immediately.

    Justin James
  • SOA = DOA?

    or "D" later on . . .
    Roger Ramjet