If you trade in dependency, you have to earn trust

Summary: Yahoo's Cyber Monday blackout is just the latest in a lengthy catalog of 'bad news' stories about on-demand applications and services. Is the entire online model for applications fatally flawed?


For a while now, I've been keeping track of what I've been calling 'bad news' stories about on-demand applications and services. Yahoo's Cyber Monday blackout is just the latest in what has become a lengthy catalog:

  • Yesterday, Yahoo! Small Business: "Yahoo's small business merchant systems go down during the peak Cyber Monday shopping season for much of the day." (ZDNet's Larry Dignan)
  • Last week, Skype: "Internet telephony firm Skype could have kept the London numbers it stripped from customers this week for only a modest fee ... Business customers were given just a month's notice, leaving many with thousands of pounds worth of printed advertising and stationery that was now effectively useless." (PC Pro)
  • Two weeks ago, TinyURL: "TinyURL is apparently down for the count right now. Whether you're trying to compress a URL into a tiny one (via TinyURL.com) or attempting to visit a TinyURL-based URL that was created and distributed to you by someone else (for example, one of the TinyURLs that appears in my Twitter feed on the right of this page), visits to TinyURL.com are returning '500 - Internal Server Error'." (ZDNet's David Berlind)

Google and Microsoft have had their share of on-demand mishaps, too.

  • Ongoing, Gmail: "A small but steady stream of Gmail users ... regularly report losing some, many, or all of their messages without a clue as to why. It seems that hardly a week goes by without at least several users reporting this problem on discussion boards, such as the official Gmail Help forum." (IDG/InfoWorld)
  • Earlier this month, Windows Live Foldershare: "... it’s been dead for days, which is really bad, as it has become a key part of my infrastructure: I sync three computers using Foldershare, and run Mozy to create online backups on one." (Zoli's Blog)
  • Last month, Google FeedBurner: "Went to check my FeedBurner account, only to be informed I'm missing a 'spoon' ... Note to error-page authors at Enterprise 2.0 companies: the fact your application is down and interrupting your users' work is neither funny nor cute." (ZDNet's Michael Krigsman)
  • In August, Windows Genuine Advantage: "The server that verified users went down and began to disable ... the operating systems of computers that checked into the home base to affirm their legitimacy. The WGA server outage hit on Friday evening, Aug. 24, and was finally repaired on the next day. It was down for 19 long hours." (John Dvorak)
  • And throughout, a series of calamities at hosting providers: 365 Main, Navisite, Rackspace and Jatol.com, which apparently ceased trading without warning.

After amassing all this evidence (and the list is incomplete; there have been other examples that I didn't make a note of), should we conclude, along with John Dvorak, that the entire online model for applications is fatally flawed? Or should we at least agree with Michael Krigsman that "SaaS may be here, but it’s not ready for mission-critical applications"?

Personally, I draw quite different conclusions than these. I think the frequency of these stories over the past couple of months underlines that the on-demand model of consuming applications and services from Web-based providers has now become so prevalent that it's simply an accepted mode of behavior for mainstream computer users, whether for work or for leisure, for business or for personal consumption.

Lots of people do this now, for a huge variety of purposes and in a huge variety of contexts, so the number of operational instances is expanding rapidly. When one of them fails, it's still a newsworthy exception. Most of the time, they stay running, and everyone takes that continuous service for granted, even though it's actually a lot more reliable and predictable than the computing operations that run in their own homes and offices.

Having said that, too many of these stories demonstrate that providers don't fully appreciate the nature of the relationship they've entered into. Their clients depend on them, quite literally, and that creates strenuous obligations on the part of the provider, including:

  • Stating clearly in advance what service levels clients can expect
  • Providing mechanisms for clients to monitor service levels
  • Keeping clients informed when things go wrong
  • Minimizing risks of failure at times of high demand or vulnerability
  • Maintaining established service levels throughout the contracted life of a service
  • Providing enough notice to allow clients to change provider before implementing any reduction in service levels
  • Giving clients direct access to their assets if the provider goes out of business
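
To make the first two obligations concrete, here is a minimal Python sketch of the arithmetic behind a stated service level. The function names are hypothetical, and the 30-day month is an assumption chosen for illustration. It converts an availability target into a downtime allowance (99.999% works out to a little over five minutes a year) and checks a measured outage against it, using the 19-hour Windows Genuine Advantage outage quoted above as the sample input.

```python
MINUTES_PER_YEAR = 365 * 24 * 60       # 525,600 minutes; leap years ignored
MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # assumed reporting period


def downtime_budget_minutes(availability_pct, period_minutes=MINUTES_PER_YEAR):
    """Minutes of downtime a provider may incur while still meeting the target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return period_minutes * (100 - availability_pct) / 100


def measured_availability_pct(downtime_minutes, period_minutes):
    """Availability actually delivered over the period, as a percentage."""
    return 100 * (1 - downtime_minutes / period_minutes)


def meets_target(downtime_minutes, availability_pct, period_minutes):
    """True if the observed downtime fits inside the stated allowance."""
    return downtime_minutes <= downtime_budget_minutes(availability_pct, period_minutes)


if __name__ == "__main__":
    # Yearly allowance at "five nines".
    print(round(downtime_budget_minutes(99.999), 2))

    # A 19-hour outage measured against a hypothetical 99.9% monthly target.
    outage = 19 * 60
    print(round(measured_availability_pct(outage, MINUTES_PER_30_DAY_MONTH), 2))
    print(meets_target(outage, 99.9, MINUTES_PER_30_DAY_MONTH))
```

A published service level only means something if clients can run this kind of check themselves, which is why the monitoring obligation sits right next to the obligation to state the target in advance.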

Banks learnt all of this long ago. Their industry has taken steps to make sure depositors are protected even after a bank goes bust. That is a price they are willing to pay because they understand (and have learnt from bitter experience over the past two hundred years) that it is the price of trust. On-demand application providers are in exactly the same kind of relationship; their clients depend on them for everyday functions and operations, and therefore trust is paramount. Get it wrong, even for a few hours, and unless you put it right and show your commitment to earning and retaining client trust, it's gone for ever.

Phil Wainewright

About Phil Wainewright

Since 1998, Phil Wainewright has been a thought leader in cloud computing as a blogger, analyst and consultant.

  • Trusting/Depending on any on line anything is

    a sure way to find a great deal of disappointment. Hey, I love the internet as much as anyone, but I fully understand that it is nowhere near ready for any critical use. The desktop is going to be with us for a very long time...
  • More reliable than operations running...

    locally is so far out there it is ridiculous. If this is the case where you work, maybe you should take a good hard look at your IT department. I wonder if they share your sentiment?
  • SaaS Breakdown

    As a hosted lead response management software and dialer platform, our company became painfully aware of the SaaS "service" implications earlier this year when a full fiber-optic line on the local telecom backbone got cut, dropping service to half-a-dozen of our company's core servers.

    We were clearly not to blame--how in the world do you plan for a fiber-optic line getting cut?--but still had to shoulder the responsibility for our clients. As a whole, though, system uptime far exceeds downtime for applications like ours.

    --Steven R. Watts
    • You're not charging your customers money, are you?

      I have to ask why any of your customers signed on to depend on a provider that doesn't have multiple geographically diverse data centers on each continent where they do business, so that multiple failures can be tolerated while still sustaining the load for the entire continent with < 20% degradation. For the US you should have a couple on the East coast, one in the Mountain time zone and a couple on the West coast. A cut fiber optic line is no different or less foreseeable than a natural disaster, brownout or blackout, or anything else that could disrupt the service.

      The bottom line is it is your fault if your service fails due to something as predictable as a fiber optic line being cut. In fact I mentioned this as a specific example (track hoe at a construction site down the block) in another response to another post just about a week ago. This has in fact happened several times to several providers over the last few years. Your infrastructure should have accounted for this with very up to date (15min or so) data replication across your data centers.

      You should in fact be able to sustain operation with multiple simultaneous failures of different natures at different sites and should test for such regularly.

      Geographic diversity is a must, not just because there might be an earthquake in one place or a hurricane in another, but because infrastructure failures can be widespread. Remember several years back when higher than normal cyclical solar activity started a power failure in Canada? As the automated response systems tried to shut down failing systems and reroute supply around them, it started a chain reaction of failure after failure until about half of Canada was without power.

      I would expect to have about 50% normal response if all your North American data centers go out and I have to get service here in the States from one of your European, South/Latin American, or Asian data centers.

      Remember five 9's uptime means down for just over 5 minutes per year. There's no way you're going to recover from a severed fiber cable, let alone a natural disaster, in 5 minutes. You must have diversity and redundancy that's routinely tested.

      Unless you can deliver this you really should be providing your service for free on a trial basis, not charging for it...
      Johnny Vegas
  • SLAs

    Excellent summary. While it's obviously caveat emptor for the users of SaaS, the providers really need to take the reliability of their services more seriously. If nothing else, they should have some form of service level agreement, so at least their customers know what to expect. Most importantly, the SLA should be published in a form that is easy to understand, rather than buried in a mountain of legalese in their ToS or EULA.

    In other words, providers should say what they're willing to support, clearly and openly; their customers will then decide whether that's good enough. If it isn't, they've got some work to do if they want to stay in business!
    Jason Etheridge
  • RE: If you trade in dependency, you have to earn trust

    Good summary. I think this article shines a light on the fact that providers of SaaS have to focus as much time and energy on the delivery aspect of their offering as they do on the application itself. You can't simply delegate it to the hosting provider and assume that they will do the same for your business as you would. SLAs are absolutely important, but make sure that they are not just marketing: the infrastructure you've deployed has to be robust enough, and you need enough insight into its performance to feel confident that you've put some meat behind the SLAs.
  • Key is Transparency

    I think SFDC pointed the way with trust.salesforce.com (which we've copied) ... no matter what an SLA says, no on-demand service can promise 100% uptime for every customer, everywhere, every time, from every single data center. Salesforce learned the real issue was transparency to their customers, so their customers understood what was happening when there were issues, what the status was, and when the issues would be resolved.
  • Next level for SaaS market

    I think vendors are already preparing to offer the next level of reliability for their services. I think solutions like Amazon S3 (http://aws.amazon.com/s3) are what most SaaS providers need to adopt to catch up with this trend.