A lesson for the cloud: 100 percent uptime achieved -- for 16 years

It's all about expecting reliability.
Written by David Chernicoff, Contributor

For more than 16 years, a NetWare 3.12 server had been doing its job. Its current administrator, who was just a child when the server first went into service, was faced with the prospect of decommissioning it, as its antique 5 1/4" full-height 800 MB hard drives were finally dying. And as reported by Ars Technica, the administrator decided it was time to put the server to rest last Friday.

Today, the achievement seems amazing: more than 16 years without a glitch. But what is truly remarkable is that the company still had a use for the server over the last decade or so, not that the operating system kept working. Because that was what NetWare did. It just worked. And while this story stands out for how long the physical hardware survived, it’s only the most recent story about NetWare doing something that seems extraordinary in a world where servers are reset, rebooted, and reconfigured at a rapid pace.

In 2001, a university found a NetWare server that had been lost for four years. IT knew it existed; they could see it and manage it on their network, but no one had any idea where it was physically located. It was eventually discovered during some renovations, when a hole was punched in a wall and it was found that a previous renovation had built a wall that blocked off the server closet where it was located.

Back in the NetWare days, it was not uncommon for servers to stay up for years. Novell even had a page of customer-submitted screenshots from consoles showing extended uptimes. Domain controllers, file and print servers, application servers: it usually didn’t matter. The software just didn’t crash; the majority of users surveyed back then said that their servers were brought down only when they needed to be updated.

That’s the level of reliability that cloud service providers need to strive for. For users to depend on cloud services, those services need to just work. No excuses, no finger pointing, no questions. When clients boot up their systems, the provider’s services just need to be available.

We’re getting there, but unplanned, unexpected, unprepared-for outages still plague the business, as does a general unwillingness among providers to just say, “We screwed up; we will fix it and make it better.”
