Facebook has a new tool to spot spammers, and it's already taken down billions of accounts

Even equipped with a brand-new, high-performance algorithm, Facebook is unsure whether it will ever win the war against fake accounts.
Written by Daphne Leprince-Ringuet, Contributor

Chasing fake accounts on social networks is a high-tech game of cat and mouse: as soon as one troll is down, another one pops up. But Facebook has revealed that it has a new trick up its sleeve to better identify spammers – an improved weapon of choice that attackers won't be able to dodge as easily as before, according to the social media giant.

Facebook's engineers have developed a more efficient machine-learning tool, which has already helped take down 6.6 billion fake accounts in the past year. And that number doesn't even include the additional "millions" of "attempts" to create fake accounts that are blocked daily, according to Bochra Gharbaoui, data science manager at Facebook.

The reason for the platform's recent success in the war against spammers and scammers is a technology called Deep Entity Classification (DEC). The name reflects the complexity of the tool, which leverages machine learning to analyse not only the accounts that are active on Facebook, but also each individual profile's behaviour and interaction with the rest of the community.

Facebook's engineers refer to the "deep features" of each account, which are the behavioral patterns of profiles, rather than the direct characteristics of an account. In other words, instead of only registering details like the creation date of an account or the number of friend requests it has sent, DEC also looks at all of the properties of the profiles, groups or pages that a particular user has made contact with. 

DEC doesn't only register which groups a particular individual has joined, but also the number of admins in the group, its members, or the time since the group was created. The tool won't only look at the number of friend requests sent by one profile, but also the number of requests sent by the accounts that this profile has befriended.

After extracting these "deep features", the algorithm can aggregate them to find out things like the mean number of users in all the groups that one profile has joined, or the maximum number of groups that a given profile's friends are part of. DEC can effectively map the complexity of every Facebook profile's network of friends, groups, pages and so on. 
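To make the aggregation step concrete, here is a minimal sketch of how such "deep features" could be computed on a toy graph. It is an illustration only, not Facebook's implementation: the data, the dictionary layout, the function name `deep_features` and the feature names are all hypothetical, and a real system would work over far larger graphs and many more signals.

```python
from statistics import mean

# Hypothetical toy data: every name and number below is made up for illustration.
friends = {
    "alice": ["bob", "carol"],
    "bob": ["alice"],
    "carol": ["alice", "bob"],
}
groups_joined = {
    "alice": ["hiking", "chess"],
    "bob": ["chess"],
    "carol": ["hiking", "chess", "cooking"],
}
group_size = {"hiking": 120, "chess": 40, "cooking": 800}
friend_requests_sent = {"alice": 5, "bob": 2, "carol": 30}


def deep_features(user):
    """Aggregate properties of the entities a user is connected to.

    Assumed feature set, chosen to mirror the examples in the article:
    statistics over the user's groups and over the user's friends.
    """
    user_groups = groups_joined.get(user, [])
    user_friends = friends.get(user, [])

    # Properties of neighbouring entities, one value per neighbour.
    sizes = [group_size[g] for g in user_groups]
    groups_per_friend = [len(groups_joined.get(f, [])) for f in user_friends]
    requests_per_friend = [friend_requests_sent.get(f, 0) for f in user_friends]

    # Aggregates over those values; empty neighbourhoods fall back to zero.
    return {
        "mean_group_size": mean(sizes) if sizes else 0.0,
        "max_groups_per_friend": max(groups_per_friend, default=0),
        "mean_groups_per_friend": mean(groups_per_friend) if groups_per_friend else 0.0,
        "mean_requests_sent_by_friends": mean(requests_per_friend) if requests_per_friend else 0.0,
    }


if __name__ == "__main__":
    print(deep_features("alice"))
```

In this sketch, none of the four values is a property that "alice" sets directly: each one depends on the groups she joined or the accounts she befriended, which is the distinction the engineers draw between deep and direct features.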


Figure: The algorithm can aggregate deep features to find out things like the mean number of groups per friend for a given profile. (Image: Facebook)

"You end up with tens of thousands of features per account, that reflect the whole model around the account," said Daniel Bernhardt, engineering manager at Facebook. "So you can see, through a number of signals, if a user is trying to misrepresent their identity. It's not so much about the content of an account, but about how that account interacts with others on the platform."

With about 20,000 features gathered for each profile, the main advantage of the new tool is how difficult it is to reverse engineer. Traditional machine-learning methods, said Bernhardt, are too easily tricked by attackers. If an algorithm only looks at a handful of features to determine if an account is fake, spammers can easily work out how to look real – "just like a virus", explained Bernhardt, that only needs a couple of mutations to get past the body's defense system.

Most of the tools put forward so far to detect fake users follow the model that Bernhardt wants to avoid. Engineers train algorithms on a limited set of characteristics directly linked to the account, such as "spam commenting", or "engagement rate", to detect fake profiles. "The problem is adversarial robustness," said Bernhardt's colleague Gharbaoui. "Adversaries can control direct features quite easily, for example by managing the number of friend requests they send out. That's why you need to look at deep features – the graph and the model around the account."

Facebook's team said that DEC has already delivered promising results. In the months since it was deployed on the social network, the tool has taken down billions of fake accounts and reduced the estimated volume of spammers and scammers. Gharbaoui said that the estimated share of fake accounts on Facebook is now 5%.

But while the new algorithm might let the social media giant win one battle, Facebook is nowhere near certain to win the war against fake accounts. It is only a matter of time before attackers figure out a way around the improved tool. The platform's engineers say so themselves: "Adversaries move fast," said Gharbaoui. "Their adaptation cycle is fierce, and it's getting more sophisticated."


And the stakes are getting higher. Only last year, Facebook removed a global network of more than 900 accounts, pages and groups that had used sophisticated methods such as AI-generated profile pictures to spread pro-Trump narratives to 55 million users. Operated by accounts based in Vietnam and in the US, the fake profiles were feeding into a complex network, by administering groups, increasing the membership of these groups, and liking posts on pages. 

That was a few months after the social media giant removed 2,600 fake pages, groups and accounts that were engaging in "coordinated inauthentic behavior". Facebook's head of cybersecurity policy Nathaniel Gleicher said at the time: "While we are making progress rooting out this abuse, it's an ongoing challenge because the people responsible are determined and well-funded. We constantly have to improve to stay ahead."

Given the unprecedented level of sophistication seen in recent attacks, it is unclear how long Facebook's new algorithm will withstand the creativity and resilience of attackers motivated by financial gain and other forms of abuse. Bernhardt and Gharbaoui, although enthused by the performance of DEC so far, conceded that the field is still "very strongly attacker-controlled".

As adversaries change their behaviors, Facebook's new tool will need retraining. And even enhancing the algorithm will have to wait until engineers figure out the new tactics adopted by their opponents. "The adversarial nature of this space is new," said Bernhardt. "It continues to be a big problem, and it's an on-going evolution. We are still at the beginning." At the moment, it's not even certain there will ever be an end.
