Facebook hits back at claims its AI has minimal success in fighting hate speech

Facebook's integrity VP has responded to new claims that its efforts to fight hate speech are not working.
Written by Aimee Chanthadavong, Contributor

Facebook integrity VP Guy Rosen has rejected claims that the AI technology the company uses to fight hate speech is having little impact, saying it's "not true". Instead, he claimed the prevalence of hate speech on Facebook has fallen by almost 50% over the last three quarters.

"We don't want to see hate on our platform, nor do our users or advertisers, and we are transparent about our work to remove it," Rosen wrote in a blog post.

"What these documents demonstrate is that our integrity work is a multi-year journey. While we will never be perfect, our teams continually work to develop our systems, identify issues and build solutions."

Rosen's post was in response to a Wall Street Journal article that reported, based on leaked internal documents, that the social media giant's AI technology created to remove offensive content such as hate speech and violent images has had little success.

The report pointed out that a team of Facebook employees in March estimated the AI systems were removing posts that generated 3% to 5% of the views of hate speech on the platform, and 0.6% of all content that violated the company's policies against violence and incitement.

However, Rosen said "focusing just on content removals is the wrong way to look at how we fight hate speech".

"That's because using technology to remove hate speech is only one way we counter it. We need to be confident that something is hate speech before we remove it," he said.

"If something might be hate speech but we're not confident enough that it meets the bar for removal, our technology may reduce the content's distribution or won't recommend Groups, Pages, or people that regularly post content that is likely to violate our policies. We also use technology to flag content for more review."

Instead, he said Facebook measures its success by the prevalence of hate speech people see on its platform, declaring it is now only five views in every 10,000.

"Prevalence tells us what violating content people see because we missed it. It's how we most objectively evaluate our progress, as it provides the most complete picture," he said.
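As a rough illustration of the arithmetic, prevalence can be read as the share of all content views that landed on violating content. The sketch below is not Facebook's methodology; the function name and the view counts are hypothetical, chosen only to reproduce the five-in-10,000 figure.

```python
def prevalence(violating_views: int, total_views: int) -> float:
    """Share of all content views that were views of violating content."""
    if total_views <= 0:
        raise ValueError("total_views must be positive")
    return violating_views / total_views


# Hypothetical counts chosen to match the figure quoted above.
rate = prevalence(violating_views=5, total_views=10_000)
print(f"{rate:.2%}")  # 0.05%
```

On these numbers, five views in every 10,000 works out to 0.05% of views.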

Rosen also took the opportunity to point out that the WSJ report "misconstrued" Facebook's proactive detection rate, another metric the company supposedly uses to tell how good its technology is at finding offensive content before people report it to the company.

"When we began reporting our metrics on hate speech, only 23.6% of content we removed was detected proactively by our systems; the majority of what we removed was found by people. Now, that number is over 97%," Rosen claimed.
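As Rosen describes it, the proactive detection rate is a share of the content that was removed, not of all hate speech on the platform. A minimal sketch of that ratio follows; the function name and the counts are hypothetical, picked only to reproduce the 23.6% figure from the quote.

```python
def proactive_rate(removed_proactively: int, removed_total: int) -> float:
    """Share of removed content that systems detected before any user report."""
    if removed_total <= 0:
        raise ValueError("removed_total must be positive")
    return removed_proactively / removed_total


# Hypothetical counts: 236 of 1,000 removed items were found by systems first.
early = proactive_rate(removed_proactively=236, removed_total=1_000)
print(f"{early:.1%}")  # 23.6%
```

Because the denominator is removed content only, this figure on its own says nothing about how much violating content was never removed.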

Last month, Facebook said it made advancements to its AI used to help with content moderation, including introducing its Reinforcement Integrity Optimizer (RIO), which guides an AI model to learn directly from millions of current pieces of content to evaluate how well it was doing its job.  

This blog post by Rosen is the latest statement issued by Facebook as it tries to dispel scathing claims about its operations. Earlier in the month, CEO Mark Zuckerberg publicly addressed allegations that the social media giant prioritises profit over safety and wellbeing, saying that was also "just not true".

"The argument that we deliberately push content that makes people angry for profit is deeply illogical," he said.

The response came after Facebook whistleblower Frances Haugen fronted the US Senate as part of its inquiry into Facebook's operations, accusing the social media giant of intentionally hiding vital information from the public for profit. During her testimony, she labelled the company "morally bankrupt" and cast "the choices being made inside of Facebook" as "disastrous for our children, our privacy, and our democracy".
