Q&A with a Twitter technologist: 'Think: we're a startup'

Adams discusses what Twitter is really doing about security, and, quite frankly, why people should stop telling them what to do.

Yesterday afternoon when U.S. Airways flight 1549 crashed into New York's Hudson River, microblogging site Twitter erupted as it always does during a major news event. At the time I happened to be chatting with John Adams, Twitter operations engineer and security industry veteran. "Twitter is going nuts," he said, and then he was gone.

Since Twitter launched, countless bloggers have earned a great many page views by presuming they understand the inner workings of Twitter's technology and business practices. Despite a history of stability issues and recent security concerns, however, Twitter is a strong company with a solid offering, rock star leadership and a quickly growing team and infrastructure. I spoke with Adams yesterday to find out more about what Twitter is really doing about security, the challenges of being a startup with such a rapidly growing user base, what the team does during crises such as U.S. Airways flight 1549, and, quite frankly, why people should stop telling them what to do.

Q. [Jennifer] What exactly happens when Twitter goes crazy like that (re: U.S. Airways)? What do you do?

A. [John] The traffic surge was way worse than MacWorld, but we'd fixed the issues that caused us problems during MacWorld. When we saw the traffic, we watched our graphs to make sure that Twitter could handle the load adequately. If we begin to see telltale signs of issues, we react.

Q. Do you ever go to work and think, "Please no major events today" so you can get regular work done?

A. Yes.

Q. Last week's unrelated hack and phishing scams were all over the news. It was bold of Biz (Stone) to so openly discuss what happened -- what have you guys learned and changed?

A. Twitter's compromise last week was of an admin account, as you know. We should not have allowed those tools to be accessible from the Internet, and that access has been greatly restricted now.

Q. Why were they accessible in the first place?

A. It's typical of startups to have a permissive environment until they grow larger and start taking security more seriously. I'm not saying this was intentional at Twitter; it's my opinion that many startups do not spend adequate time on security because they are far too busy keeping the lights on and developing code. Also, startups are usually coordinating many engineers and business people without a good infrastructure (say, VPN, best practices, and strong crypto).

Q. So was this all part of Twitter's "coming of age," so to speak?

A. As any service becomes larger and larger, people will want to attack it because it becomes an attractive target. Saying "I hacked <joebob's web2.0 company>" vs. "I hacked Twitter" has a completely different feel to it, right?


Q. What are you doing differently?

A. At the time of compromise, Twitter put together a tiger team of engineers, and we reviewed every access point in and out of the application. We increased the security of the sign-in mechanism, added security around the password and email change process, and restricted the support tools greatly. We've also put in a number of rate-limiting systems to throttle brute-force attacks -- in addition to the abuse based rate limiting we use on a day to day basis.
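To make the idea of throttling brute-force attempts concrete, here is a minimal sketch of a sliding-window limiter on failed logins. It is not Twitter's implementation, which the interview does not describe; the class name `LoginRateLimiter`, the thresholds, and the choice of keying on an account name are all assumptions made for illustration.

```python
import time
from collections import defaultdict, deque


class LoginRateLimiter:
    """Hypothetical sliding-window limiter for failed login attempts."""

    def __init__(self, max_attempts=5, window_seconds=60.0, clock=time.monotonic):
        self.max_attempts = max_attempts      # failures tolerated per window
        self.window_seconds = window_seconds  # length of the sliding window
        self.clock = clock                    # injectable so tests can fake time
        self._failures = defaultdict(deque)   # key -> timestamps of recent failures

    def _prune(self, key):
        # Drop failures that have aged out of the window.
        cutoff = self.clock() - self.window_seconds
        recent = self._failures[key]
        while recent and recent[0] <= cutoff:
            recent.popleft()
        return recent

    def is_allowed(self, key):
        # A new attempt is allowed while recent failures are under the limit.
        return len(self._prune(key)) < self.max_attempts

    def record_failure(self, key):
        # Call this after a failed password check for `key`.
        self._prune(key).append(self.clock())
```

A login handler would call `is_allowed(key)` before checking the password and `record_failure(key)` after a mismatch. Keying on the account slows a dictionary attack against one user; keying on the source address instead (or as well) is a separate design choice that the interview does not settle.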

Q. In the case of the direct message phishing issue, what might you have done differently?

A. Forcing SSL on our login page and using Extended-Validation SSL certs may have helped users identify that they were not connecting to Twitter. Currently we process logins in SSL but the body of the main page is HTTP. That may change.

Q. A lot of questions have come up about email verification during sign-up and Twitter's lack thereof. While not a surefire security step, is it something you're considering?

A. Possibly, that's a product decision.

Q. Without going into proprietary specifics, what security tools are you using?

A. A pretty standard array of network security tools like Nmap, Nessus, Metasploit and BackTrack. These tools provide good coverage of network- and server-level issues. To examine the Twitter site for potential security issues, we use Firebug, Tamper Data and some homegrown Perl scripts. Much of our work is on the defensive, not the offensive, side of things. We use graphs, abuse logs and automated monitoring to alert us when things are being messed with.

Q. I assume you're tracking how many times someone tries to compromise Twitter, to some degree. Anything you can share?

A. The Twitter operations and security groups have an ungodly number of graphs that we look at. I do not think it's a quantifiable number. We experience all of the usual attacks. We have large numbers of people attempting to game our search engines, spam Twitter, breach accounts, etc. After last week's attack we saw a large number of "me too" copycat dictionary attacks. So, probably the best answer is "all day every day." It's a bit like spam really. The spam flood doesn't stop -- ever.

Q. There's a misconception that Twitter doesn't have a security team, just maybe a couple "security guys." What's the set-up?

A. Think: We are a start-up with people who have to do many different roles. There is no "security guy" -- there's a security group. We also have a dedicated spam team led by a couple of Ph.D.s we inherited from the Summize acquisition.

Q. Does Twitter use any outside security consultants?

A. People who work here have many good contacts in the industry and we use our contacts to test things on the site, but we have not hired anyone officially.


Q. Twitter has a very open developer network -- does that introduce risks?

A. We've made it so easy to interact with our API (for example, OAuth hasn't launched yet) that anyone can talk to the API and get data as another user, provided they have the user's credentials. OAuth is a tiny bit harder to implement, and it probably would have slowed down developers by increasing the barrier to entry. Now we're in the unfortunate position of having to deploy OAuth after the fact, which will force developers into changing their code to support it. There will probably be a three- to six-month window in which we use both basic auth and OAuth, though, to aid in the transition.
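The risk Adams describes follows from how HTTP basic auth works: the client sends the user's actual name and password, merely base64-encoded, with every request, so any third-party application must hold those credentials. The sketch below shows that encoding, plus a simple way a server could tell the two schemes apart during a transition window. The function names are hypothetical, and nothing here reflects Twitter's actual code.

```python
import base64


def basic_auth_header(username, password):
    # Basic auth is just "username:password", base64-encoded (not encrypted).
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header):
    # Anyone who sees the header can recover the credentials.
    scheme, _, token = header.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


def auth_scheme(header):
    # During a dual-support window a server could branch on the scheme name.
    scheme = header.split(" ", 1)[0].lower()
    return scheme if scheme in ("basic", "oauth") else "unsupported"
```

With OAuth, the application presents a signed token in an `Authorization: OAuth ...` header instead of the password itself, which is why an application that speaks only basic auth has to change its code to support it.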

Q. Creating a higher barrier to entry for developers will slow them down, as you said. Is that a concern?

A. I don't think slowing down the rate of API uptake is good for anyone, but we must secure the service. There are a number of open OAuth implementations and examples that developers can use, so hopefully they won't be slowed down much. It should be as easy as adding another library. OAuth would only be required for API methods that require authentication. We still have many public methods.

Q. At Twitter's size what are the challenges you face in keeping forensics records?

A. Disk space is a big one. Log rates and messages per second are another. We have been able to overwhelm standard Unix syslog with our servers and have had to look at other technologies like Scribe and Thrift, both open source projects out of Facebook.

Q. Is it annoying when people write blogs suggesting what you should do with your infrastructure / security?

A. Yes, because they do not have all of the data that we do. They do not know what we know. I feel that most of the infrastructure discussions have faded away as we've improved the service's reliability.

Q. But now everyone is trying to tell you guys how to do security. With your strong background in security -- dating back 20 years -- does it get under your skin?

A. No, I don't take it personally. It stings when people suggest things to us or when we are bitten by exploits, like the dictionary attack, when there are tickets in our database from months ago that say "fix this." Sometimes when we have spare cycles we will audit parts of the code base or identify things that could possibly harm us. We file tickets on these and wait for them to pass through the developer queue. People get busy, stuff doesn't get fixed as quickly as we'd like, and when something gets exploited that we've already identified as a problem, it's not a good moment.


Q. What is more important, the security of the servers running Twitter or the code that is Twitter?

A. Funny you should ask. We had someone here from Google Code earlier today who was a big advocate of open source. He said that giving away source code was a good idea because what makes the company is the company's database, operations, and marketing. If someone stole Twitter's source code it wouldn't be much of a competitive advantage, right? We'd never want that to happen, but we have many open source projects here and we do give away a fair amount of code. Security of the database is far more important than security of the code, as far as I am concerned.

Q. Finally, anything else you want to say to the folks giving you unsolicited security advice?

A. I want to state that all of these efforts are serious team efforts and that Twitter has gone to a very serious, science- and metric-driven approach to getting things done. We make decisions based on numbers. I think what I mean to say is that we do not react until we have numbers to prove that what we are doing will work. Lots of people make technology decisions by saying "oh, such and such company uses X, we should use product X." With us it is "We will try to use product X in staging, and we will see if it works. If it works, we will push data at it until we start to see failure" and then we know the upper bound of such a product. This doesn't get around issues like when memcached randomly fails, or a cache server deadlocks for no apparent reason. Open source code is wonderful but it's certainly not free from bugs or security issues.
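The "push data at it until we start to see failure" approach can be sketched as a simple load ramp. This is only an illustration under stated assumptions: `handles_load` stands in for whatever staging probe reports success or failure at a given load level, and the step sizes are arbitrary. The interview does not describe Twitter's actual tooling.

```python
def find_upper_bound(handles_load, start, step, limit):
    """Ramp load upward and return the highest level that still succeeded.

    handles_load: hypothetical callable taking a load level (for example,
    messages per second) and returning True if the system coped with it.
    Returns None if even the starting level fails.
    """
    last_good = None
    load = start
    while load <= limit:
        if not handles_load(load):
            break              # first observed failure: stop ramping
        last_good = load       # remember the highest level that worked
        load += step
    return last_good
```

For example, with a probe that fails above 4,500 messages per second, ramping from 1,000 in steps of 1,000 reports 4,000 as the last level known to work. A smaller step narrows the estimate at the cost of more test runs.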

Photo credit Duncan Davidson
