Questioning IT

If Unix is your basic systems environment, you can usually afford full redundancy - and one consequence is that your business should never find itself in a data recovery situation.

This is the 15th excerpt from the first book in the Defen series: The Board Member's IT Brief.

This section is concerned with things you should talk to your CIO about - informally, but with attention.

Topic one: disaster avoidance

Unix Architectures

One of the big benefits of Unix is that it needs a smaller systems staff, which enables both a leadership approach to getting the job done and the use of redundant data centers, even in relatively small organizations.

Once properly set up, particularly if you use smart display desktops like older X-terminals or Sun's newer Sun Rays, dual Unix data centers are virtually unstoppable short of a national-scale catastrophe.

Partners to whom you have legal obligations, for example, can be given access to both systems - in fact, if their access is via the internet, the switch between centers can be automated to the point of complete invisibility. Similarly, data, permissions, licensing, and other components can be shared between the centers so that users cannot tell which center they're on - and will simply not notice if one gets blown up.
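For readers who want to see what "automated to the point of complete invisibility" might look like on the partner side, here is a minimal sketch in Python. It is an illustration only, not a description of any particular product: the host names, the port, and the function names `tcp_probe` and `pick_site` are all hypothetical, and a real deployment would more likely do this in DNS or a load balancer than in client code.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

Endpoint = Tuple[str, int]  # (host, port); the values used below are hypothetical


def tcp_probe(endpoint: Endpoint, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the endpoint can be opened."""
    try:
        with socket.create_connection(endpoint, timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unresolvable: treat the site as down.
        return False


def pick_site(
    endpoints: Iterable[Endpoint],
    probe: Callable[[Endpoint], bool] = tcp_probe,
) -> Optional[Endpoint]:
    """Return the first endpoint that answers the probe, or None if none do."""
    for endpoint in endpoints:
        if probe(endpoint):
            return endpoint
    return None


if __name__ == "__main__":
    # Assumed names for the two data centers; replace with real ones.
    sites = [("dc1.example.com", 443), ("dc2.example.com", 443)]
    print("Using site:", pick_site(sites))
```

The point of the sketch is that the caller never names a data center: it asks for "a working site" and gets whichever one answers, which is why the loss of one center need not be visible to the user.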

Those who have done it, can do it; those who haven't, usually can't.

Of course, setting things up that way is one thing - having the courage to drill the team by randomly shutting down one of the sites is quite another. And yet, it's the only way to be sure - because there are all the usual gotchas, and teams that don't practice will turn out to have forgotten something important when it's for real.

What counts when reality strikes is having done it, regularly and successfully. Anything else, like a beautifully documented plan with nicely color-printed communications and responsibilities charts of the type considered critical in the mainframe and Wintel worlds, is garbage. My advice? Throw those out and consider sending the CIO with them.

The bottom line on Unix recovery is simple: you can afford redundancy and so should never have to recover, but you still have to insist on doing shutdown drills because teams that have done it, can do it, while those who haven't, usually can't.


Some notes:

  1. These excerpts don't include footnotes, and most illustrations have been dropped as simply too hard to insert correctly. (The WordPress HTML "editor" as used here enables a limited HTML subset and is implemented to force frustrations like the CP/M line delimiters from MS-DOS.)

  2. The feedback I'm looking for is what you guys do best: call me on mistakes, add thoughts/corrections on stuff I've missed or gotten wrong, and generally help make the thing better.

  3. When I make changes suggested in the comments, I make those changes only in the original, not in the excerpts reproduced here.