On Wednesday, hosting company GigeNet had to shut down their Chicago datacenter, which resulted in outages for their customers around the world. The shutdown only lasted two hours, but anytime a hosting facility has to go offline, you know that a variety of customers are going to be impacted.
Normally, when I hear about a commercial datacenter going offline, I drill down into the provider’s backup, disaster recovery and business continuity plans, just to see where they went wrong. Unsurprisingly, it is often simply a case of the lowest-cost option meaning that customers aren’t paying for significant reliability. But in the case of GigeNet, the failure happened because they had to turn the power off, and it wasn’t their idea.
GigeNet reported that a smoke alarm went off early on Wednesday afternoon. In response to a customer post, they followed up their public message about a possible service interruption by confirming that there was no fire, just the smoke alarm.
In most situations the next step would be to identify the source of the alarm, check for an actual fire, isolate the source and, if the alarm turns out to be real, perhaps shut down a rack or module to pin down the problem. But GigeNet didn’t have that luxury.
When the smoke alarm report hit the local fire department, their response was to request that GigeNet shut off the power to the entire datacenter. GigeNet complied, I’m sure due to legal and insurance requirements, and did so in an apparently orderly fashion, as they were back online in just a few hours. But this meant that however well GigeNet had designed and implemented technology and practices to keep their facility running through minor disruptions like this, it all went out the window when the local authorities said “shut it down.”
Certainly, concerns over a fire are reasonable, but in a highly instrumented datacenter it should be possible to quickly determine the actual problem. And your relationship with local authorities will help determine whether they trust you to assess the problem yourself or react the way the fire department did in GigeNet’s case.