How Twitter tweets your tweets with open source

Summary: Twitter couldn't exist without open-source software. The company knows it, and it shares its own code in return.

Twitter works wing in feather with Linux.

San Diego, CA: Some people may have been surprised when Twitter recently joined The Linux Foundation. They shouldn't have been: you couldn't tweet about your dinner, your latest game, or the newest political rumor without open-source software.

Chris Aniszczyk, open-source manager at Twitter, explained just how much Twitter relied on open source and Linux at LinuxCon, the Linux Foundation's annual North American technology conference. “Twitter's philosophy is to open-source almost all things. We take our software inspiration from Red Hat's development philosophy: 'default to open.'”

Specifically, according to the company, “The majority of open-source software exclusively developed by Twitter is licensed under the liberal terms of the Apache License, Version 2.0. The documentation is generally available under the Creative Commons Attribution 3.0 Unported License. In the end, you are free to use, modify and distribute any documentation, source code or examples within our open source projects as long as you adhere to the licensing conditions present within the projects.” Twitter's open-source software is kept on GitHub.

You're welcome to use this code. Indeed, Aniszczyk strongly encourages others to use and build on it. 

Twitter itself is famous, or infamous in some circles, for having been built on Ruby on Rails. Today, though, Aniszczyk said, Twitter has moved to Java and a list of open-source programs longer than your arm.

If Unix and Linux are operating systems made of many loosely coupled utilities, then Twitter is a social network made up of many open-source programs loosely coupled together. Some parts will be familiar to anyone in Linux or Web development circles.

Twitter's core operating system is Linux 2.6.39, and for its core database it uses MySQL. To manage the source code for the rest, Twitter uses Git, Linus “Linux” Torvalds' other software baby.

But, let's cut to the chase, what actually happens when you tweet?

First, unless you've never used “The Twitter,” you know that a tweet is a short message of up to 140 characters, or about 200 bytes. When you send this tweet, it will soon be “fanned out” to the people who read your tweets. Sounds easy, right? “Wrong!” proclaimed Aniszczyk.

The problem is Twitter's scale. Twitter handles 2.8 billion tweets during a typical week. That comes to about 5,000 tweets per second (TPS) on average. But, Aniszczyk said, things aren't always average. When someone noticed the singer Beyoncé showing a baby bump, traffic went up to 8,800 TPS. The last Super Bowl? 12,000-plus TPS. And when someone got the idea that everyone should go see an anime movie and then tweet about it, Twitter faced one of its greatest challenges: 25,088 TPS.
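
To put those peaks in perspective, here is a small back-of-the-envelope sketch in Python. It uses only the figures quoted above (5,000 TPS on average, the three peak rates, and roughly 200 bytes per tweet); the function and variable names are made up for illustration, and the byte rate counts tweet payload only, not protocol overhead.

```python
# Figures quoted in the article; everything else here is illustrative.
AVERAGE_TPS = 5_000
BYTES_PER_TWEET = 200  # "about 200 bytes" per tweet

PEAKS_TPS = {
    "Beyonce baby bump": 8_800,
    "Super Bowl": 12_000,
    "anime movie": 25_088,
}


def peak_multiple(peak_tps: int, average_tps: int = AVERAGE_TPS) -> float:
    """How many times the average rate a given peak represents."""
    return peak_tps / average_tps


def payload_bytes_per_second(tps: int, bytes_per_tweet: int = BYTES_PER_TWEET) -> int:
    """Raw tweet payload arriving per second at a given tweet rate."""
    return tps * bytes_per_tweet


if __name__ == "__main__":
    for event, tps in PEAKS_TPS.items():
        print(
            f"{event}: {peak_multiple(tps):.2f}x the average, "
            f"{payload_bytes_per_second(tps):,} payload bytes per second"
        )
```

The raw data volume is modest. The hard part is that the record peak is about five times the average load, and every one of those tweets still has to be delivered to every follower.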

Each of these tweets is first registered as a status update. Then each one is given a unique ID by a program called Snowflake. Next, its geolocation data is noted by Rockdove, a program that hasn't been made open source yet.
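
For readers curious how a service can hand out unique IDs at this rate without a central counter, here is a minimal Python sketch of a Snowflake-style generator. It assumes the bit layout described in the open-source Snowflake project's documentation (a millisecond timestamp, then 10 bits of machine ID, then a 12-bit per-millisecond sequence) and that project's custom epoch. The class name and the injectable clock are hypothetical, and Twitter's production service is not written in Python.

```python
import threading
import time

# Assumption: custom epoch and bit widths as documented for open-source Snowflake.
EPOCH_MS = 1288834974657
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_WORKER_ID = (1 << WORKER_BITS) - 1
SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1


class SnowflakeLikeGenerator:
    """Hypothetical generator producing roughly time-ordered 64-bit IDs."""

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id <= MAX_WORKER_ID:
            raise ValueError("worker_id out of range")
        self._worker_id = worker_id
        self._clock = clock  # returns the current time in milliseconds
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                # Same millisecond: bump the sequence number.
                self._sequence = (self._sequence + 1) & SEQUENCE_MASK
                if self._sequence == 0:
                    # Sequence exhausted; spin until the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self._worker_id << SEQUENCE_BITS)
                | self._sequence
            )
```

Because the timestamp occupies the high bits, IDs from one worker always increase, and IDs from different workers never collide as long as each worker has its own `worker_id`.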

Each tweet is then checked by a combination URL shortener and spam detector called t.co. Once past this stage, each tweet is stored in MySQL by Gizzard, a flexible sharding framework for creating eventually-consistent distributed datastores. Now, and only now, is an HTTP 200 status, meaning all has gone well, sent back to your Web browser.
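
The idea behind sharding can be sketched in a few lines of Python. This is a deliberately simplified illustration, not Gizzard's actual code: it assumes a forwarding table that maps ranges of hashed keys to shards, uses in-memory dictionaries where the real system uses MySQL, and leaves out replication and eventual consistency entirely. All class, function, and shard names are hypothetical.

```python
import bisect
import hashlib

HASH_SPACE = 1 << 32  # size of the hashed key space used in this sketch


def hash_key(key: int) -> int:
    """Map a key to a stable position in the hash space."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:4], "big")


class ForwardingTable:
    """Maps ranges of hashed keys, identified by lower bound, to shard names."""

    def __init__(self, shard_names):
        count = len(shard_names)
        # Split the hash space into equal ranges, one per shard.
        self._lower_bounds = [i * HASH_SPACE // count for i in range(count)]
        self._shard_names = list(shard_names)

    def shard_for(self, key: int) -> str:
        # Find the last range whose lower bound is <= the hashed key.
        index = bisect.bisect_right(self._lower_bounds, hash_key(key)) - 1
        return self._shard_names[index]


class ShardedStore:
    """Stand-in for a set of database shards; each shard is a plain dict."""

    def __init__(self, shard_names):
        self._table = ForwardingTable(shard_names)
        self._shards = {name: {} for name in shard_names}

    def put(self, tweet_id: int, text: str) -> None:
        self._shards[self._table.shard_for(tweet_id)][tweet_id] = text

    def get(self, tweet_id: int):
        return self._shards[self._table.shard_for(tweet_id)].get(tweet_id)
```

For example, `ShardedStore(["shard_a", "shard_b", "shard_c"])` spreads tweets across three stores, and the same tweet ID always lands on, and is read back from, the same shard.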

Of course, at this point your tweet hasn't gone out to the world. First, your tweets get started on their way to Bing and other search programs using the Firehose application programming interface (API). Finally, your tweets are ready for fanout, that is, heading to your friends, family, and fans.

The actual process is handled by FlockDB. This is an open-source graph database that sits on Gizzard and pulls data from MySQL. FlockDB contains all of Twitter's users and their relationships to one another. Now, armed with your followers' addresses, your tweets are finally on their way.
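
Fanout itself is simple to picture: look up who follows the author, then add the tweet to each follower's timeline. The Python sketch below shows that shape with an in-memory follower graph. It illustrates the concept only and is not FlockDB's API; the class and function names are hypothetical, and the real system spreads this data across many database shards.

```python
from collections import defaultdict


class FollowerGraph:
    """Toy stand-in for a graph store: who follows whom."""

    def __init__(self):
        self._followers = defaultdict(set)  # followee -> set of followers

    def follow(self, follower: str, followee: str) -> None:
        self._followers[followee].add(follower)

    def followers_of(self, user: str) -> set:
        return set(self._followers.get(user, ()))


def fan_out(graph: FollowerGraph, timelines: dict, author: str, tweet_id: int) -> int:
    """Append tweet_id to every follower's timeline; return the delivery count."""
    recipients = graph.followers_of(author)
    for user in recipients:
        timelines[user].append(tweet_id)
    return len(recipients)


if __name__ == "__main__":
    graph = FollowerGraph()
    graph.follow("alice", "carol")
    graph.follow("bob", "carol")
    timelines = defaultdict(list)
    delivered = fan_out(graph, timelines, "carol", 1001)
    print(delivered, dict(timelines))
```

The cost of one tweet grows with the author's follower count, which is why a burst of tweets from widely followed accounts is so much harder to absorb than the raw tweets-per-second figure suggests.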

The average time all this takes? About 350 milliseconds. Not bad for a system handling 5,000 TPS, 24 hours a day, every day.

Twitter may be causing some of its would-be partners grief with tighter API rules, but the company itself does an exceptional job of delivering thousands of messages every moment of the day with open-source software.

Topics: Social Enterprise, Enterprise 2.0, Software Development, Open Source, Networking, Linux, Enterprise Software, Data Management, Data Centers, Big Data

Talkback

15 comments
  • This is the power of Open Source

    We all use it, we all depend on it.

    It might never get 40% desktop penetration, but it still does far more heavy lifting than all the desktop market.

    And, being Linux, it will only get better.
    Michael Alan Goff
    • Thank you.

      Linux is about choice.
      RickLively
  • Would open source matter in this case?

    The power of Twitter or Facebook or even Google comes from the idea & vision which they had. They used the tools which they thought would be best suitable, taking parameters which they considered like performance, cost, etc into account. Do you seriously believe that a service like Twitter cannot be created using Windows Server/Unix with Oracle/DB2/SQL Server?
    mm71
    • You already answered your own question ^^

      "They used the tools which they thought would be best suitable, taking parameters which they considered like performance, cost, etc into account." Just because the above parameters were not fulfilled effectively with other technologies they aint being used.
      Sridhar Mane
    • Yes, and no...

      As Sridhar mentioned, one of the parameters considered is "cost". If you're rolling your own anyway, why would you pay for a less flexible license with Windows Server or commercial Unix or Oracle or DB2 databases? Unless any of these provide any additional value, it is a waste of money.

      Sure Twitter *could* just as easily have been developed on these systems (SJVN hurts his own credibility by saying Twitter "wouldn't exist" without Open Source - a claim most people know is patently false), but there would have been a much greater expense involved, without any real extra benefit, and they would have had less flexibility to develop their platforms, as they would have had to develop within the bounds of a commercial license. This last issue could be a killer, because it would only take a more nimble set of developers working with more flexible, open systems to surpass them.

      Twitter, Google, Facebook and IBM have all proven pretty convincingly where open source software (particularly Linux) shines - custom, purpose-built, behind-the-scenes systems that are generally maintained by developers who know how to squeeze the most performance out of a system.
      daftkey
      • patently false?

        really, daft? just do the math and see if twitter was possible without OSS: start free service business model, take per-click ads on the plus side. Now - start subtracting: OS cost (per server license); app clustering and message passing; RAD tools; scripting language; system and network monitoring software; messaging software (you don't really think they could afford running MS Exchange, do you?). And just keep going.
        On the lighter note - i kinda like your new "subtle" FUD approach. :-P
        vgrig
        • Much more complex systems were built using commercial software.

          Not sure how suggesting that a system as complex as Twitter would have been possible - albeit more expensive - using commercial development platforms and databases is spreading any kind of fear, or uncertainty, or doubt.

          What is it you fear? That it might actually be a correct assertion that some very complex systems - far more complex than Twitter - can be built using commercial closed-source software?

          Does it cause you to doubt that open source software is, in fact, the be-all and end-all of the software development world, just to suggest that there are alternatives?

          Or does it give you uncertainty in the level of benefit that Twitter actually derived from open source software as opposed to their own developers' blood sweat and tears?

          I was at least giving credit where credit was due. It appears you can't even handle it when someone who you dislike is agreeing with you.
          daftkey
  • inspiring story.

    leverage open source, design for failure.
    LeoRenCn
  • Steven - good post.

    Overall I have to say, this is probably one of the best articles I've read from you in a long time, Steven. Thanks for an actually very informative article - I'm sure more than a few of us weren't aware of the actual complexity of a tweet.

    The only thing in your article that I take issue with (and it was mentioned by others above) - "Open Source first" was a strategic decision that they made early on - not a direction that was somehow foisted upon them. To say they "wouldn't exist" without open source software is in a way a back-handed comment, and really discounts the amount of work these developers have done themselves to build such a powerful platform. Twitter deserves kudos for making a strategic decision to use open source software.
    daftkey
    • "To say they "wouldn't exist" without open source software...

      ...is in a way a back-handed comment"

      No , it's not - they would either need to write equivalent software (including OS) which would be impossible or buy commercial software (which even with Windows HPC license - that didn't exist btw when twitter started - times number of servers, well...).
      Bottom line - no preexisting OSS - no twitter; no twitter - no software "these developers" wrote... Not even a single line.
      vgrig
      • The fanboi doth protest too much..

        "No , it's not - they would either need to write equivalent software (including OS) which would be impossible or buy commercial software (which even with Windows HPC license - that didn't exist btw when twitter started - times number of servers, well...)."

        Well if we're going to muddy the waters by talking about what existed when Twitter started, and then try to compare to the number of servers Twitter has, now, then I guess we can come up with all sorts of ridiculous arguments, can't we?

        Firstly, Windows isn't the only commercial software out there, and it's not the only OS that can host commercial applications or database systems. I don't know if you've heard, but DB2 and Oracle - they can run on Linux. Or Solaris. or zOS. I think Oracle can even run on Mac OS X if they wanted. And "when Twitter started", they probably wouldn't have been large enough that, if they DID go with Windows, they would have had to use an HPC license, even if it did exist.

        "Bottom line - no preexisting OSS - no twitter; no twitter - no software "these developers" wrote... Not even a single line."

        The method of developing the software would have been different; the platform that the software runs on would have been different; but the overall functionality would be the same. It would have been more expensive and less flexible to do so, but it would have been possible. To say that it would be "impossible" is stretching just a tad too far.

        The use of Open Source software was a strategic decision and direction taken by choice, not by necessity. There are plenty of technologies in the commercial world that would have facilitated Twitter's development.
        daftkey
        • Only problem is...

          "The use of Open Source software was a strategic decision and direction taken by choice, not by necessity. There are plenty of technologies in the commercial world that would have facilitated Twitter's development."

          ...why didn't they choose them. Which defeats the whole purpose of your spin.
          Cylon Centurion
          • I said exactly why they likely chose open source tools..

            It provides benefits that closed source software cannot provide - flexibility and cost savings.. No spin there - unless you want to dispute the fact that open source software generally provides these kinds of benefits.

            Just because something is the best solution doesn't mean that it is the only solution. You might think it is "impossible" to deliver a load of lumber to a job site with anything but a picker truck, but that would also be patently false - you could also deliver it with a regular pick-up truck.
            daftkey
          • But didn't you know, daffy?

            Free is bad. Free means inferior. Never take anything for 'free'.

            See, one has to pay tens of thousands of dollars out the ying-yang in order to feel legitimate. That's why if you take anything for free, you are a "freetard".

            The Windows fanbois told me so, so that makes it true. Right?

            ;)
            Cylon Centurion
          • And your point is... what exactly?

            There are fanbois on both sides of the fence - they're pretty easy to spot. They tend to talk in absolutes and think the world is always black-and-white.

            Unfortunately, these people sometimes get into the back offices of large organizations. Fortunately, they don't stay there very long.
            daftkey