Glitch and recent system outages

Summary: Recent data center and software failures have people pointing fingers. Should we look at a broader causal factor and could we be tempting fate?

There have been some well-publicized system outages recently. Cloud hosting sites have been down, as have some cloud services and heavily used applications.

These failures can be attributed to a number of causes. Some are due to hardware failures, some to operator error, and some to software errors. And several systems fail without any of us being aware of it or ever learning what caused the failure.

As these events occur, I keep remembering a book called Glitch (from Prentice-Hall), written by Jeff Papows (Jeff was Lotus' CEO and President of Cognos). While I read it months ago, recent events made me recall the words in those pages.

Glitch by Jeff Papows

In Glitch, Jeff tells how software bugs trigger some spectacular problems. Some of the examples he cites are even tragic. But through them all, he paints a picture that leaves no doubt as to the enormity of the economic damage these glitches create.

Not all the glitches he describes are due to the willful disregard or negligence of programmers. Some occur when two independently created pieces of software (or software and hardware) are mated together without people understanding all the possible ways these different products will be used together. While many of us can test code for the predictable/knowable situations, we have trouble testing for the unknown and unforeseeable events. That's when really hinky things can happen.

Yes, more testing is beneficial, and for certain situations (like aviation navigation software) it is imperative. But, as with all business decisions, human beings must decide when an adequate level of testing has been completed. That thought alone makes me uncomfortable.

I'm not saying that we should accept anything less than perfection but I am also enough of a realist to understand that nothing in this world is perfect. The challenges Jeff describes in his book are only the tip of the problem - and - it's a problem that will only grow more apparent as we incorporate more software into our automobiles, homes, work, etc.

Several times this year, I've been chided for having a really old cell phone. It's not a smart phone but it does exactly what I need it to do: phone calls, voice mail and the odd text message. There's something to simplicity and that phone is one of the simplest pieces of technology I own. It has never had a software upgrade. It can't do much and, as a result, it has very limited means of failing me.

I worry when we surround ourselves with overly engineered 'solutions' as someday, sometime, somehow they will fail. When you see some spectacularly over-engineered products, think about the beach houses on U.S. barrier islands. People construct those homes thinking the engineering of the building will spare it from major hurricane damage. While the roof might survive the winds and the pilings might mitigate some of the flood damage, nothing will save the structure when the wind topples a utility pole onto the house, causing the transformer to short out and start a massive fire. You can't predict every adverse situation, but sometimes you can avoid over-engineered solutions.

Maybe the Luddites had it right after all....

Topics: Software, CXO, Mobility, Outage, IT Employment

About

Brian is currently CEO of TechVentive, a strategy consultancy serving technology providers and other firms. He is also a research analyst with Vital Analysis.

Talkback

3 comments
  • RE: Glitch and recent system outages

    Reality is an infinite series of IF-THEN statements. No wonder programmers have problems. ;-)
    tonymcs@...
  • RE: Glitch and recent system outages

    It's really very simple: consider the "infrastructure" needed to access the cloud, as opposed to a local machine (the one under your desk, or the company servers).

    You have to first use the machine under your desk, then probably the company server, then a DSL modem, then to a phone line, then to another DSL modem (exchange MUX), then through more phone lines, and modems for fibre, then to another phone line, then to another DSL line (via more modems), then onto some other remote computer (probably exactly the same as the one under your desk).

    THEN ALL THE WAY BACK AGAIN.

    It's 'links in a chain': you will never know which is the weakest link or which link is going to break, but you know that if your chain is VERY VERY long, it has many many more links in it, and each one could be the 'weakest link'...

    And how could all that possibly be more secure, more stable, more reliable, or more controllable than hosting your system locally, with a chain of 3 or 4 links rather than possibly hundreds or thousands?

    That is apart from all the issues (legal) of 'record keeping', and many companies would NEVER want their data and files anywhere NEAR the internet (thank you very much)...

    So far I have yet to see any article that actually states what the benefits of being 'in the cloud' are.

    Moving to 'the cloud' appears to be a retrograde move that I do not think will ever be 'mainstream'; it's just too 'iffy'.

    The bigger the 'system', the more outages you MUST have. How big is the 'cloud'?
    Aussie_Troll
  • RE: Glitch and recent system outages

    I totally agree with this: "I'm not saying that we should accept anything less than perfection but I am also enough of a realist to understand that nothing in this world is perfect. The challenges Jeff describes in his book are only the tip of the problem - and - it's a problem that will only grow more apparent as we incorporate more software into our automobiles, homes, work, etc." More power.

    apollosan