Twitter datacenter failure highlights fundamental flaws

Summary: A fast-growing company decides to move into its own dedicated datacenter without a knowledgeable hand in control. Chaos ensues.

A Reuters story Friday night publicly confirmed rumors that had been floating about: Twitter's move, in July 2010, to its own datacenter facility in the Salt Lake City area, a turning point many analysts had marked as the moment Twitter became a major entity in its own right, had been an abject failure. In fact, Twitter had quietly packed up its hardware and moved it into a shared hosting facility in Sacramento, CA.

The move from renting servers from NTT America to its own dedicated facility was a big one and, in many ways, quite the coming-of-age statement by Twitter, which investors have valued at over $7 billion. But it appears that soon after moving into the new facility, the company started moving out.

Twitter is well known for its Fail Whale image, displayed whenever it has been unable to keep its services up and running, but its experience with its own datacenter seems to point to fundamental failures in understanding how to implement a datacenter, if sources can be believed. And in many ways, that's a Fail Leviathan: a failure orders of magnitude beyond the commonplace whale.

The Reuters story reports that at least two fundamental infrastructure pieces were not in place when Twitter moved into the facility: there was no redundancy in the links from the datacenter to the Internet, and less than half the promised power was available. Given those conditions, the move seems to have been ill-advised. In fact, the single point of failure in Internet connectivity alone would seem to rule out the facility's use by a company that is effectively non-existent without that link to the net.

And I'll give them the benefit of the doubt on power availability and presume there was sufficient power for their needs at the time, but the lack of contracted power meant that growth would have to be curtailed until more became available. Oh, and to add the proverbial insult to injury, the roof leaked, requiring datacenter staff to physically move servers out of the way of the internal rainfall after every shower.

Someone made the decision to ignore these very basic facility issues. It wasn't Twitter's newly hired VP of Operations, a position that reportedly didn't exist when the decision to move into the SLC facility was made, but someone at a fairly high level had to decide that these deficiencies were acceptable and could be fixed before a problem developed. Someone who should have realized that decisions of this magnitude shouldn't be made without a basic understanding of how datacenter facilities are supposed to work.

I'm guessing that someone else decided that it was a really bad choice.

Talkback

10 comments
  • RE: Twitter datacenter failure highlights fundamental flaws

    Really old news
    mrlinux
  • RE: Twitter datacenter failure highlights fundamental flaws

    Executive management never listens to their IT people. I'm sure somebody in IT said "this is a bad idea" and was ignored because "the data center says it's a good idea, you must be wrong, go back to your desk and work". They were also probably required to work nights and weekends to make it happen while everybody else in the company was partying... Has anybody checked to see if they're having trouble paying bills?
    dbeecher@...
  • RE: Twitter datacenter failure highlights fundamental flaws

    @james347 Just because you can't effectively use a tool doesn't mean it's useless!
    While Twitter has more than its fair share of vapid comments, it's a very useful tool for marketing & communications monitoring when used properly.
    jmwells21
  • RE: Twitter datacenter failure highlights fundamental flaws

    The story is an amazing tale of how a company could be taken down by a bad decision, and one wonders how the shortcomings were communicated such that the decision to go forward was still made. Unbelievable.
    i_jessica
  • RE: Twitter datacenter failure highlights fundamental flaws

    Sadly, we have a lot of new upstarts in the online world who think they know more than they do. I'm sure someone thought a data center is "just like a desktop, only more computers, right?"

    This is how Windows gets into data centers and how bad data centers in general get built.
    jeffpk
    • RE: Twitter datacenter failure highlights fundamental flaws

      @jeffpk - I don't understand why you have mentioned Windows - surely not because of power consumption?
      chrisjrmason@...
    • RE: Twitter datacenter failure highlights fundamental flaws

      @jeffpk ... +1
      tom@...
  • Windows gets into data centers because of ignorance

    @chrisjrmason
    I believe JeffPK is referring to the fact that data centers get built when a small company gets bigger and needs reliable power and networking for office computers. They were previously using Windows, and even if they just need file sharing, they'll put in Windows machines simply because it's what they know, not because they've done a cost-benefit analysis of the whole end-to-end proposition.

    Because Windows has so much software running out of the box, there's more software that can be exploited, and because of the historic design of Windows (DOS-era inter-process comms via COM/DCOM), all of the processes suckle off of each other, which allows one virus process to infect and essentially infiltrate a system completely with little real effort.

    Sure, over time, MS has tried to remedy these things, but practically speaking, the lack of separation means that you get to completely reload your machine when something falls apart, instead of just reinstalling/rebuilding the failed software bits as you would with a more reliable server OS such as Solaris, Linux, BSD, or other UNIX-like systems.

    Those systems were designed from the start to support multiple users with authentication and authorization, so that support is pervasive, reliable, and configurable. In the '80s and '90s, DOS/Windows users on the Internet newsgroups stood around and said, "We don't need no stinking multi-user, multi-tasking OS. Our system works great as it is." They also said, "We don't need protected mode execution; you UNIX software bozos just need to write correct programs that don't crash."

    Now, with three-plus decades of virus/worm infiltration into the OS "that works fine the way it is," I think most of them have in fact learned why all of those things are advantageous in the Internet environment.

    Server data centers need to run as small an amount of software as needed to perform their specific tasks, so that there are far fewer exploits available and far fewer software bugs/bad behaviors to deal with.

    Segregation of responsibility and distribution for redundancy is the way. Look at what Google has done to make sure that they never have a problem delivering services. That's proper Data Center design.
    greggwon@...
  • RE: Twitter datacenter failure highlights fundamental flaws

    I'm thinking the facility oversold its capability, and the poor techie sent to investigate bought every word. Middle management then bought the techie's story and convinced the tops that the move was "good". Those risks were probably not even known until after move-in day.
    William_P
  • RE: Twitter datacenter failure highlights fundamental flaws

    @james347 ... Ditto for ALL such scraping, I mean, personal data collection points. Facebook is the worst of the bunch, LinkedIn the best, and it's mediocre at best. No worry; the "cloud" will fix it all. Yeah, right!
    tom@...