Facebook reveals ranking system behind searching posts via Graph Search


Summary: Accumulating hundreds of terabytes of data daily is a daunting challenge for any company -- even Facebook.


Facebook has had a number of notable (and sometimes controversial) alterations to its publishing and search abilities.

One that might not have garnered as much attention is the ability to search individual posts using Graph Search.

With more than one billion new posts published to the world's largest social network each day, the posts index now holds more than one trillion posts in total, amounting to hundreds of terabytes of data.

Accumulating that much data on a daily basis (and then repurposing it as useful information) is a daunting challenge for any company -- even Facebook.

But the Menlo Park, Calif.-based company is well known for building its own engineering systems from the ground up, with a growing datacenter presence worldwide, spanning from Oregon to Sweden.

Post search was originally conceived as a hackathon project two years ago. Ashoat Tevosyan, an engineer on Facebook's search quality and ranking team, explained in a blog post on Thursday that most search queries typically return more results than any user cares to navigate.

He also admitted that the general Facebook posts index is much larger than other search indexes on the platform. Thus, the objective has become to develop algorithms that can determine and rank which results should be deemed the most relevant.

"To surface content that is valuable and relevant to the user, we use two primary techniques: query rewriting and dynamic result scoring. Query rewriting happens before the execution of the query, and involves tacking on optional clauses to search queries that bias the posts we retrieve towards results that we think will be more valuable to the user. Result scoring involves sorting and selecting documents based on a number of ranking 'features,' each of which is based on the information available in the document data. In total, we currently calculate well over a hundred distinct ranking features that are combined with a ranking model to find the best results."
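As a rough illustration of the two techniques named in the passage above, the following Python sketch rewrites a query by appending optional clauses and then scores candidate posts by combining a handful of features. It is not Facebook's implementation: the clause syntax, the four feature names, the weights, and the simple linear ranking model are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    text: str
    author_is_friend: bool
    likes: int
    age_days: float


@dataclass
class Query:
    terms: list[str]
    optional_clauses: list[str] = field(default_factory=list)


def rewrite_query(query: Query) -> Query:
    """Query rewriting: runs before execution and appends optional clauses
    meant to bias retrieval. The clause strings are hypothetical."""
    hints = ["author:friend", "recency:high"]
    return Query(terms=list(query.terms),
                 optional_clauses=query.optional_clauses + hints)


def extract_features(post: Post, query: Query) -> dict[str, float]:
    """Compute illustrative ranking features from the document data."""
    words = post.text.lower().split()
    matched = sum(1 for term in query.terms if term.lower() in words)
    return {
        "term_match": matched / len(query.terms) if query.terms else 0.0,
        "friend": 1.0 if post.author_is_friend else 0.0,
        "popularity": min(post.likes, 100) / 100,
        "freshness": 1.0 / (1.0 + post.age_days),
    }


# Assumed weights standing in for a trained ranking model.
WEIGHTS = {"term_match": 3.0, "friend": 1.5, "popularity": 1.0, "freshness": 1.0}


def score(post: Post, query: Query) -> float:
    """Result scoring: combine the features with a linear model."""
    features = extract_features(post, query)
    return sum(WEIGHTS[name] * value for name, value in features.items())


def rank(posts: list[Post], query: Query, limit: int = 10) -> list[Post]:
    """Sort candidates by score, highest first, and keep the top results."""
    return sorted(posts, key=lambda p: score(p, query), reverse=True)[:limit]


if __name__ == "__main__":
    query = rewrite_query(Query(terms=["pizza"]))
    posts = [
        Post("pizza night", author_is_friend=False, likes=5, age_days=30),
        Post("best pizza in town", author_is_friend=True, likes=10, age_days=1),
    ]
    for post in rank(posts, query):
        print(round(score(post, query), 3), post.text)
```

A production system would use the optional clauses during retrieval and a far richer model (the article mentions well over a hundred features). Here the clauses are only attached to the query to show where rewriting sits in the flow.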

Tevosyan noted that the engineering team must also closely monitor the heavy workloads bombarding datacenters given high amounts of traffic to the site via desktop and mobile channels.

Facebook has more than one billion users.

Thus, Facebook engineers have identified 70 different kinds of data for sorting and indexing, housed in a production MySQL database.

Tevosyan revealed that there are a few dozen engineers on the Graph Search team, and he also hinted that more search abilities and modifications will be added in the near future.



1 comment
  • Searching posts of specific users is much easier.

    Most Facebook content has privacy settings that limit who can see it. So search queries restricted to posts created by the friends of the user doing the search should be much faster and easier. In fact, what would be most beneficial is simply to search the posts made by a specific user. For instance, I have more than a thousand photo albums posted on Facebook. I'd love to be able to quickly find an album by querying it by name.

    Fortunately Facebook does let you do that by using their API. And that prompted me to write an app that lets users search for posts they made or for the posts of a specific user or page. You can check it out here: http://searchforposts.com

    I really have no idea why Facebook has not built the functionality to search for your own posts.
    Raphael Pungin