How Twitter tweets your tweets with open source

Twitter couldn't exist without open-source software. The company knows it, and it shares its own code in return.
Written by Steven Vaughan-Nichols, Senior Contributing Editor
Twitter works wing in feather with Linux.

San Diego, CA: Some people may have been surprised when Twitter recently joined The Linux Foundation. They shouldn't have been: you couldn't tweet about your dinner, your latest game, or the newest political rumor without open-source software.

Chris Aniszczyk, open-source manager at Twitter, explained just how much Twitter relied on open source and Linux at LinuxCon, the Linux Foundation's annual North American technology conference. “Twitter's philosophy is to open-source almost all things. We take our software inspiration from Red Hat's development philosophy: 'default to open.'”

Specifically, according to the company, “The majority of open-source software exclusively developed by Twitter is licensed under the liberal terms of the Apache License, Version 2.0. The documentation is generally available under the Creative Commons Attribution 3.0 Unported License. In the end, you are free to use, modify and distribute any documentation, source code or examples within our open source projects as long as you adhere to the licensing conditions present within the projects.” Twitter's open-source software is kept on GitHub.

You're welcome to use this code. Indeed, Aniszczyk strongly encourages others to use and build on it. 

Twitter itself is famous, or infamous in some circles, for having been built on Ruby on Rails. Today, though, Aniszczyk said, Twitter has moved to Java and a list of open-source programs longer than your arm.

If Unix and Linux are operating systems made of many loosely coupled utilities, then Twitter is a social network made up of many open-source programs loosely coupled together. Some parts will be familiar to anyone in Linux or Web development circles.

Twitter's core operating system is Linux 2.6.39, and for its core database it uses MySQL. To manage the source code for the rest, Twitter uses Git, Linus “Linux” Torvalds' other software baby.

But let's cut to the chase: What actually happens when you tweet?

First, unless you've never used “The Twitter,” you know that a tweet is a short message of up to 140 characters, or about 200 bytes. When you send this tweet, it will soon be “fanned out” to the people who read your tweets. Sounds easy, right? “Wrong!” proclaimed Aniszczyk.

The problem is Twitter's scale. Twitter handles 2.8 billion tweets during a typical week. That works out to about 5,000 tweets a second on average. But, Aniszczyk said, things aren't always average. When someone noticed the singer Beyonce showing a baby bump, traffic went up to 8,800 tweets per second (TPS). The last Super Bowl? 12,000-plus TPS. And when someone got the idea that everyone should go see an anime movie and then tweet about it, Twitter faced one of its greatest challenges: 25,088 TPS.
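To put those numbers side by side, here is a small back-of-the-envelope sketch. It uses only the rates quoted above; the function and constant names are made up for illustration, and the Super Bowl figure is treated as a lower bound because the talk only gave “12,000-plus.”

```python
# Rates as quoted in the talk; names here are illustrative only.
AVERAGE_TPS = 5_000
PEAKS = {
    "Beyonce baby bump": 8_800,
    "Super Bowl": 12_000,   # reported as "12,000-plus", so a lower bound
    "anime movie": 25_088,
}

SECONDS_PER_DAY = 24 * 60 * 60


def tweets_per_day(tps: int) -> int:
    """Tweets handled in one day at a sustained rate of `tps`."""
    return tps * SECONDS_PER_DAY


def peak_ratio(peak_tps: int, average_tps: int = AVERAGE_TPS) -> float:
    """How many times larger a peak rate is than the average rate."""
    return peak_tps / average_tps


if __name__ == "__main__":
    print(f"{tweets_per_day(AVERAGE_TPS):,} tweets per day at the average rate")
    for event, tps in PEAKS.items():
        print(f"{event}: {tps:,} TPS, {peak_ratio(tps):.2f}x the average")
```

In other words, the record spike was roughly five times the everyday load, which is the gap the rest of the pipeline has to absorb.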

What happens with each of these tweets is that it is registered as a status update. Then each one is given a unique ID by a program called Snowflake. Next, its geolocation data is noted by Rockdove, a program that hasn't been made open source yet.
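For readers curious how a unique ID can be handed out at thousands of tweets per second without a central counter, here is a minimal Python sketch in the spirit of Snowflake. It is not Twitter's code (the real service is written in Scala). The bit layout and epoch follow what the open-source project documents (41 bits of millisecond timestamp, 10 bits of machine ID, 12 bits of sequence), but treat them as assumptions; the class name and the injectable `clock` parameter are invented for this example.

```python
import threading
import time

# Assumed constants, based on the open-source Snowflake project's documentation.
TWITTER_EPOCH_MS = 1288834974657
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_WORKER_ID = (1 << WORKER_BITS) - 1
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class IdGenerator:
    """Hypothetical Snowflake-style generator: time | worker | sequence."""

    def __init__(self, worker_id: int, clock=None):
        if not 0 <= worker_id <= MAX_WORKER_ID:
            raise ValueError(f"worker_id must be in 0..{MAX_WORKER_ID}")
        self.worker_id = worker_id
        # The clock returns milliseconds since the Unix epoch.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards; refusing to generate IDs")
            if now == self._last_ms:
                # Same millisecond: bump the per-millisecond sequence.
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted: spin until the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - TWITTER_EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )
```

Because the timestamp sits in the high bits, IDs from one worker sort roughly by creation time, and two workers never collide as long as their worker IDs differ.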

Each tweet is then checked by a combination URL shortener and spam detector called t.co. Once past this stage, each tweet is stored in MySQL by Gizzard, a flexible sharding framework for creating eventually-consistent distributed datastores. Now, and only now, is an HTTP 200 status code, meaning all has gone well, sent back to your Web browser.
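The idea behind a sharding layer like the one described can be sketched in a few lines of Python. This is not Gizzard's API (Gizzard is a Scala framework). It is a simplified illustration of one common approach: hash the key, then look up which shard owns that range of the hash space. The class name, shard names, and choice of SHA-256 are all assumptions made for the example.

```python
import bisect
import hashlib


class ForwardingTable:
    """Hypothetical table mapping ranges of a 64-bit hash space to shards."""

    def __init__(self, lower_bounds_to_shards: dict[int, str]):
        if 0 not in lower_bounds_to_shards:
            raise ValueError("one shard must start at 0 to cover the whole space")
        # Sorted lower bounds; each shard owns [its bound, next bound).
        self._bounds = sorted(lower_bounds_to_shards)
        self._shards = [lower_bounds_to_shards[b] for b in self._bounds]

    @staticmethod
    def hash_key(key: int) -> int:
        """Spread keys evenly over 0 .. 2**64 - 1."""
        digest = hashlib.sha256(str(key).encode("ascii")).digest()
        return int.from_bytes(digest[:8], "big")

    def shard_for(self, key: int) -> str:
        """Return the shard whose range contains the hashed key."""
        index = bisect.bisect_right(self._bounds, self.hash_key(key)) - 1
        return self._shards[index]


if __name__ == "__main__":
    table = ForwardingTable({
        0: "mysql_a",
        1 << 62: "mysql_b",
        1 << 63: "mysql_c",
        3 << 62: "mysql_d",
    })
    for tweet_id in (1001, 1002, 1003):
        print(tweet_id, "->", table.shard_for(tweet_id))
```

Splitting a busy shard then comes down to adding a new lower bound to the table and migrating the rows in that range, without touching the other shards.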

Of course, at this point your tweet hasn't gone out to the world. First, your tweets get started on their way to Bing and other search programs using the Firehose application programming interface (API). Finally, your tweets are ready for fanout, that is, heading to your friends, family, and fans.

The actual process is handled by FlockDB. This is an open-source graph database that sits on Gizzard and pulls data from MySQL. FlockDB contains all of Twitter's users and their relationships to one another. Now, armed with your followers' addresses, your tweets are finally on their way.
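Fanout itself is simple to picture in code. The in-memory Python sketch below stands in for the real thing: a `SocialGraph` plays the role of the follower lookup, and `fan_out` copies a tweet ID into each follower's timeline. None of these names come from FlockDB or Twitter, and the timeline cap is an arbitrary number chosen for the example.

```python
from collections import defaultdict, deque

TIMELINE_LENGTH = 100  # arbitrary cap for this sketch


class SocialGraph:
    """Hypothetical stand-in for a graph store: who follows whom."""

    def __init__(self):
        self._followers = defaultdict(set)

    def follow(self, follower: str, followee: str) -> None:
        self._followers[followee].add(follower)

    def followers_of(self, user: str) -> frozenset:
        return frozenset(self._followers.get(user, ()))


class Timelines:
    """Per-user lists of tweet IDs, newest first, capped in length."""

    def __init__(self, max_length: int = TIMELINE_LENGTH):
        self._timelines = defaultdict(lambda: deque(maxlen=max_length))

    def deliver(self, user: str, tweet_id: int) -> None:
        self._timelines[user].appendleft(tweet_id)

    def timeline(self, user: str) -> list:
        return list(self._timelines.get(user, ()))


def fan_out(graph: SocialGraph, timelines: Timelines,
            author: str, tweet_id: int) -> int:
    """Deliver one tweet to every follower; return how many got it."""
    followers = graph.followers_of(author)
    for follower in followers:
        timelines.deliver(follower, tweet_id)
    return len(followers)


if __name__ == "__main__":
    graph = SocialGraph()
    graph.follow("bob", "alice")
    graph.follow("carol", "alice")
    timelines = Timelines()
    delivered = fan_out(graph, timelines, "alice", tweet_id=1)
    print(delivered, timelines.timeline("bob"))
```

The work grows with the number of followers, not the number of tweets, which is why one tweet from a celebrity account costs far more to deliver than one from the rest of us.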

The average time all this takes? About 350 milliseconds. Not bad for a system handling 5,000 TPS every day, 24 hours a day.

Twitter may be causing some of its would-be partners grief with tighter API rules, but the company itself does an exceptional job of delivering thousands of messages every moment of the day with open-source software.
