
Twitter datacenter failure highlights fundamental flaws

Fast-growing company decides to go with their own dedicated datacenter without a knowledgeable hand in control. Chaos ensues.
Written by David Chernicoff, Contributor

A Reuters story Friday night publicly confirmed rumors that had been floating about: Twitter's move, in July 2010, to their own datacenter facility in the Salt Lake City area, a turning point that many analysts had marked as the moment at which Twitter became a major entity in their own right, had been an abject failure. In fact, Twitter had quietly packed up their hardware and moved it into a shared hosting facility in Sacramento, CA.

The move from renting servers from NTT America to their own dedicated facility was a big one and, in many ways, was quite the coming-of-age statement by Twitter, which has been valued by investors at over $7 billion. But it appears that soon after moving into the new facility, they started moving out.

Twitter is well known for the Fail Whale images shown when it has been unable to keep its services up and running, but their experiences with their own datacenter seem to be fundamental failures in understanding how to implement a datacenter, if sources can be believed. And in many ways, that's a Fail Leviathan: a failure orders of magnitude beyond the commonplace whale.

The Reuters story reports that at least two fundamental infrastructure pieces were not in place when Twitter made their move into the facility. There was no redundancy in the links from the datacenter to the Internet, and less than half the promised power was available. Given those conditions, a move to the facility would seem ill-advised. In fact, the single point of failure in Internet connectivity would seem to rule out its use by a company that is effectively non-existent without that link to the net.

And I'll give them the benefit of the doubt on power availability and presume that there was sufficient power for their current needs, but that the lack of contracted power meant growth would need to be curtailed until it was available. Oh, and to add the proverbial insult to injury, the roof leaked, requiring datacenter staff to physically move servers out of the way of the internal rainfall after every shower.

Someone made the decision to ignore these very basic facility issues. It wasn't Twitter's newly hired VP of Operations, a position supposedly non-existent when the decision to move into the SLC facility was made, but someone at a fairly high level had to make the choice that these deficiencies were acceptable and could be fixed before a problem developed. Someone who should have realized that decisions of this magnitude shouldn't be made without a basic understanding of how datacenter facilities are supposed to work.

I'm guessing that someone else decided that it was a really bad choice.
