When a power outage that affects a mere 30,000 to 50,000 customers knocks out some of the more popular sites on the Web, you know disaster recovery plans have some holes.
On Tuesday afternoon a host of sites such as Craigslist, Red Envelope, Yelp, Technorati and ZDNet had trouble delivering pages. The San Francisco Chronicle detailed the outage, which affected the 365 Main data center (see Techmeme).
These outages are disturbing. In theory, these sites should have gone to backup power. Where were 365 Main's backup generators? Data Center Knowledge reports that 365 Main's generators failed too. Meanwhile, a disaster recovery site should have kicked in. Preferably those recovery sites would not be on the same power grid as their headquarters.
One question about this outage irks me: What if this outage had been something worse, say a terrorist attack or another Katrina? I'll tell you what would happen: Sites would have been out of business. And when your business is a Web site, that fact is a tad alarming. I've noticed a few holes on the disaster recovery front of late. For instance, NetSuite relies on one third-party data center facility in California to deliver its services. That means one power outage or earthquake and NetSuite customers have issues. At least NetSuite plans on adding another data center.
Perhaps I'm a little more in tune with the importance of disaster recovery since I'm based primarily in New York City. I also remember the disaster recovery tips from the likes of Cantor Fitzgerald and the New York Board of Trade. They lost buildings and/or employees on Sept. 11, 2001. Rest assured they have their plans buttoned down. In New York your plans have to be set. Some businesses are in disaster recovery mode now after a blown steam pipe near Grand Central shut down the area.
And if you don't have the resources of the big guys, hopefully you at least have some space on layaway with a vendor like Sungard.
The basics are: Have your backup site on a separate grid; test your backup plan quarterly; and keep your site running with data centers in multiple locations. The financial services folks have this drill down. They have prepped for everything from avian flu to another terrorist attack.
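For the more hands-on readers, those three basics can be turned into a rough self-audit. What follows is only a sketch in Python under my own assumptions: the Site record, the audit_plan function, the grid labels and the 92-day stand-in for "quarterly" are all hypothetical names and numbers made up for illustration, not anything 365 Main or its tenants actually run.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Site:
    """Hypothetical record describing one data center in a recovery plan."""
    name: str
    location: str
    power_grid: str
    last_failover_test: date


def audit_plan(primary, backups, today, max_test_age=timedelta(days=92)):
    """Return a list of problems found when checking the three basics.

    Assumption: "quarterly" is approximated as 92 days between tests.
    """
    problems = []
    if not backups:
        problems.append("no backup site")
    for backup in backups:
        # Basic 1: the backup should sit on a different power grid.
        if backup.power_grid == primary.power_grid:
            problems.append(f"{backup.name} shares a power grid with {primary.name}")
        # Basic 2: the failover plan should have been tested this quarter.
        if today - backup.last_failover_test > max_test_age:
            problems.append(f"{backup.name} has not been tested in the last quarter")
    # Basic 3: the data centers should span more than one location.
    locations = {primary.location} | {b.location for b in backups}
    if len(locations) < 2:
        problems.append("all data centers are in one location")
    return problems


if __name__ == "__main__":
    primary = Site("primary", "San Francisco", "grid-a", date(2007, 7, 1))
    backup = Site("backup", "San Francisco", "grid-a", date(2007, 1, 15))
    for problem in audit_plan(primary, [backup], today=date(2007, 7, 25)):
        print(problem)
```

A plan that passes a check like this is not automatically sound, but one that fails it has the kind of hole this outage exposed.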
Apparently, others haven't learned the disaster recovery lessons. Some people just have to learn the hard way.