Facebook's new commenting system: engineered for real-time updates, reduced latency

Summary: Facebook's new commenting system, which the company said today it rolled out a couple of weeks ago, does more than just bring an IM-like conversation to the comments under wall posts on Facebook pages. It also lightens the load on Facebook's servers with a "write locally, read globally" approach that separates read traffic from comment-writing traffic.

Facebook's new commenting system, which the company said today it rolled out a couple of weeks ago, does more than just bring an IM-like conversation to the comments that appear under wall posts on Facebook pages.

The system lightens the load on Facebook's servers by using a "write locally, read globally" approach that separates the traffic of people who are just reading content from the traffic of those posting comments on it.

In a post on Facebook's Engineering blog today, the company's Ken Deeter offered a behind-the-scenes look at how Facebook directs that traffic to reduce server load and decrease latency. To understand the challenge Facebook's engineers faced, Deeter noted that every minute the site serves more than 100 million pieces of content that may receive comments. In that same minute, users hammer out 650,000 comments, all of which need to be routed to the pages of the people viewing that content. He wrote:

To be able to push information about comments to viewers, we need to know who may be viewing the piece of content that each new comment pertains to. Because we serve 100 million pieces of content per minute, we needed a system that could keep track of this "who's looking at what" information, but also handle the incredible rate at which this information changed.

Storing these one-to-one, viewer-to-content associations in a database is relatively easy. Keeping up with 16 million new associations per second is not.

Up until this point, Facebook engineering had built up infrastructure optimized for many more reads than writes. But now we had flipped the game. Every page load now requires multiple writes (one for each piece of content being displayed). Each write of a comment requires a read (to figure out the recipients of an update). We realized that we were building something that was fundamentally backwards from most of our other systems.
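To make that inverted workload concrete, here is a minimal sketch of a "who's looking at what" store in Python. The names (ViewershipStore, record_view, viewers_of) and the TTL-based expiry are assumptions for illustration only; Facebook's actual system runs on distributed storage tiers, not an in-process dictionary.

```python
# Hypothetical sketch -- ViewershipStore is not Facebook's actual API.
from collections import defaultdict
import time

class ViewershipStore:
    """Tracks which users are currently viewing which pieces of content."""

    def __init__(self, ttl_seconds: float = 300.0):
        # content_id -> {user_id: last_seen timestamp}
        self._viewers = defaultdict(dict)
        # Drop viewers who haven't refreshed within the TTL (assumed policy).
        self._ttl = ttl_seconds

    def record_view(self, user_id: str, content_id: str) -> None:
        # Hot path: one write per piece of content on every page load,
        # so the workload is dominated by writes.
        self._viewers[content_id][user_id] = time.monotonic()

    def viewers_of(self, content_id: str) -> set[str]:
        # Cold path: read only when a new comment needs to be fanned out.
        now = time.monotonic()
        return {user for user, seen in self._viewers[content_id].items()
                if now - seen <= self._ttl}

store = ViewershipStore()
store.record_view("alice", "post:42")   # page load -> write
print(store.viewers_of("post:42"))      # new comment -> read: {'alice'}
```

The inversion Deeter describes is visible in the two methods: record_view runs on every page load, while viewers_of runs only when a comment actually arrives.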

The "write locally, read globally" approach means that the company is "deploying distributed storage tiers" to manage the "writes" but less frequently has to collect info from across its data centers. Deeter's example does a better job of explaining it:

For example, when a user loads his News Feed through a request to our data center in Virginia, the system writes to a storage tier in the same data center, recording the fact that the user is now viewing certain pieces of content so that we can push them new comments. When someone enters a comment, we fetch the viewership information from all of our data centers across the country, combine the information, then push the updates out. In practice, this means we have to perform multiple cross-country reads for every comment produced. But it works because our commenting rate is significantly lower than our viewing rate. Reading globally saves us from having to replicate a high volume of writes across data centers, saving expensive, long-distance bandwidth.
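Here is a hedged sketch of that flow, with hypothetical region names and a toy LocalTier class standing in for the per-data-center storage tiers (none of these names come from Facebook's post):

```python
# Hypothetical sketch -- the region names, LocalTier, and push_update are
# invented for illustration; Facebook's actual tiers are not public.

REGIONS = ("virginia", "oregon")

class LocalTier:
    """Stands in for one data center's viewership storage tier."""
    def __init__(self):
        self.viewing = {}  # content_id -> set of user_ids

    def write_view(self, user_id, content_id):
        self.viewing.setdefault(content_id, set()).add(user_id)

    def read_viewers(self, content_id):
        return self.viewing.get(content_id, set())

tiers = {region: LocalTier() for region in REGIONS}

def on_page_load(region, user_id, content_ids):
    # "Write locally": a page load only writes to the tier in its own
    # region, so the high write volume is never replicated cross-country.
    for cid in content_ids:
        tiers[region].write_view(user_id, cid)

def push_update(user_id, content_id, comment):
    print(f"push to {user_id}: new comment on {content_id}: {comment!r}")

def on_new_comment(content_id, comment):
    # "Read globally": the comparatively rare comment event reads from
    # every region (multiple cross-country reads), unions the viewer
    # sets, then pushes the update to everyone watching.
    viewers = set()
    for tier in tiers.values():
        viewers |= tier.read_viewers(content_id)
    for user_id in viewers:
        push_update(user_id, content_id, comment)

# Two viewers load the same post from different data centers...
on_page_load("virginia", "alice", ["post:42"])
on_page_load("oregon", "bob", ["post:42"])
# ...and a single comment fans out to both.
on_new_comment("post:42", "hello!")
```

The trade-off sits in those two functions: on_page_load never leaves its own region, while on_new_comment pays for cross-country reads, which is affordable precisely because comments are so much rarer than views.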

Bottom line: the next time you see a new comment magically appear on a post while you're hammering out your own comment, or just scrolling along catching up on your friends' latest adventures, know that Facebook's engineers flipped their usual read-optimized architecture on its head to make it possible.

