Head fake: MIT work shows fake news detection isn't quite there yet

Efforts to detect fake news are not as advanced as they would appear, given that the best practices so far rely on pattern detection that can itself be exploited by malicious actors, according to new research from MIT.
Written by Tiernan Ray, Senior Contributing Writer

How far does the world have to go to detect fake, computer-generated writing? Quite a bit farther, if recent research by MIT scientists is correct. Fake detection relies heavily on artificial intelligence picking up statistical patterns in text, and those patterns can themselves be faked.

On Thursday, MIT artificial intelligence scientist Tal Schuster and colleagues from Israel's Tel Aviv University and Cornell University posted a blog item about two recent research reports they published regarding "fake news" and how to spot it. (Facebook's AI research team had a hand in supporting the work.)

The upshot of the research is that picking out machine-generated text is not enough: A neural network will also have to separate valid, truthful text, whether created by a human or by a machine, from text that is malicious and misleading. 


The conundrum of fake news detection, say the MIT researchers, is that valid, factually correct writing can come from automatic, machine-generated text, and false information can come from human hands, so detection has to go deeper than merely sorting what is generated by a machine from what is generated by a person.


The basic problem is that AI, when used to spot a fake, often relies on statistical clues in the text, clues that can be misleading. In the first of the two papers, Schuster and colleagues pick up where scientists at the Allen Institute for Artificial Intelligence left off earlier this year. You'll recall that the Allen Institute scientists in May introduced a neural network called "Grover" that could be used to uncover text that was automatically generated by similar networks, such as OpenAI's "GPT2" language network. In other words, one neural net was used to catch another. 

The key to Grover was that GPT2 and language models like it, such as Google's "Bert," leave a kind of trace or signature in how they construct text: they pick combinations of words that are more mundane, less inventive, than those of human writers. By detecting that signature, Grover was able to tell when a piece of text was made by a machine. That approach to detecting fake news has come to be referred to as the "provenance" approach, meaning it tells fake from real by looking at where the words come from, human or machine. 
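To make the signature idea concrete, here is a toy sketch, not Grover itself: it scores a text by how common its words are in a background corpus, on the premise that generated text leans on predictable word choices. The corpus and both sample sentences are invented for illustration; a real detector would use a neural language model's token probabilities rather than raw word frequencies.

```python
from collections import Counter

# Hypothetical background corpus used to estimate word frequencies.
CORPUS = ("the report said the results were good and the team was happy "
          "a strange iridescent quagga sauntered past the observatory").split()
FREQ = Counter(CORPUS)
TOTAL = sum(FREQ.values())

def predictability(text):
    """Mean relative frequency of a text's words (higher = more mundane)."""
    words = text.lower().split()
    return sum(FREQ[w] / TOTAL for w in words) / len(words)

# Machine-like text reuses common words; human-like text is more inventive.
machine_like = "the team said the results were good"
human_like = "a strange iridescent quagga sauntered past"
assert predictability(machine_like) > predictability(human_like)
```

The gap between the two scores is the "signature" a provenance detector exploits; the MIT work's point is that this statistical trace says nothing about whether the text is true.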

Grover is good, the MIT team acknowledged, but the problem is that not all machine-generated text is fake in the sense of being misleading. More and more, machines could be writing valid text to aid publishers. The same programs that help automate news article production for legitimate news sources could be used to make up misleading articles if a malicious party got hold of the code. How, then, do you tell the good from the bad?

Also: To Catch a Fake: Machine learning sniffs out its own machine-written propaganda

That's not easy. Schuster and colleagues take CNN news articles, written by people, and have Grover complete each original article with a novel, machine-generated sentence, either true or false. A second network had to tell which sentences were true and which false. Sometimes it did okay, but only if it was first exposed to training examples of the fake and true sentences. That way, it could see the patterns of language the neural network used in constructing true versus false statements. When it wasn't given those specific examples during training, the accuracy of the detector plunged.

In a second, subtler test, the human-written text was lightly modified, say, by having Grover insert negation words such as "not"; the detector then failed to sort true from false, meaning its accuracy was no better than random guessing. 
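The trick works because a single inserted negation flips a sentence's meaning while leaving its surface statistics almost untouched. A minimal sketch of the idea, with an invented sentence and a deliberately simplistic insertion rule:

```python
def negate(sentence, verb="was"):
    """Insert 'not' after the first occurrence of the given verb."""
    words = sentence.split()
    i = words.index(verb)
    return " ".join(words[:i + 1] + ["not"] + words[i + 1:])

original = "The bridge was completed in 1937"
flipped = negate(original)
print(flipped)  # "The bridge was not completed in 1937"
# Nearly every word is shared between the two versions, yet they now
# contradict each other, so a detector keyed to word statistics is blind.
```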

The conclusion Schuster and colleagues reach is that without very specific examples to work from, neural nets like Grover are hopeless. Given that, they suggest, the neural net needs something more: it needs to incorporate some knowledge that reveals the "veracity" of the text. 

"We recommend to extend our datasets and create a benchmark that represents content's veracity in a wide range of human-machine collaborating applications, from whole article generation to hybrid writing and editing," they write. 

"This reflects a definition of fake news that incorporates veracity rather than provenance."

In the second paper, the authors find a similar kind of problem with a popular dataset for fake news detection, called "FEVER," which stands for "Fact Extraction and Verification." FEVER was introduced last year by Cambridge University and Amazon researchers and is meant as a resource upon which to train neural nets to detect fake articles and other fake texts such as product descriptions. Human annotators pored over Wikipedia articles to extract sentences and supporting text to form a collection of 185,445 "claims," statements of fact that can be either true or false, such as "Barbara Bush was a spouse of a US president during his term" (true; she was the wife of the first President Bush, George H.W. Bush). 

FEVER is supposed to tell how good a neural net is at figuring out whether a claim is true based on the related sentences. But Schuster and colleagues found that patterns of words in the claim were a tip-off to the neural network, so that it could guess correctly without even consulting the evidence. For example, if sentences contained the words "did not" or "yet to" or other similar word pairs, they were more likely than not to be claims that would be refuted by evidence. In this way, the neural net wasn't really learning anything about truth and falsity; it was just keeping track of word-pair statistics. 
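The bias is easy to see in miniature. The sketch below uses invented claims in the FEVER style, not the FEVER data itself, to show how a phrase like "did not" can predict the label on its own, letting a model score well without ever reading the evidence:

```python
# Toy claims with FEVER-style labels, invented for illustration.
claims = [
    ("Barbara Bush was a spouse of a US president", "SUPPORTED"),
    ("The film did not win any awards",             "REFUTED"),
    ("The senator did not attend the vote",         "REFUTED"),
    ("The bridge opened in 1937",                   "SUPPORTED"),
    ("The album has yet to be released",            "REFUTED"),
]

def label_odds(phrase):
    """P(REFUTED | phrase appears in the claim), estimated from the data."""
    hits = [label for text, label in claims if phrase in text]
    return hits.count("REFUTED") / len(hits)

# Every claim containing "did not" happens to be refuted, so a model can
# guess "REFUTED" for such claims while ignoring the evidence entirely.
print(label_odds("did not"))  # 1.0 on this toy sample
```

On the real dataset the correlation is weaker than 1.0 but still strong enough, the authors found, for a model to exploit.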

Also: High energy: Facebook's AI guru LeCun imagines AI's next frontier

Indeed, when the authors reformulated the sentences in FEVER, they could cause the neural net's performance to plunge. "Unsurprisingly, the performance of FEVER-trained models drop significantly on this test set, despite having complete vocabulary overlap with the original dataset," they write. 

The moral of the experiment, they write, is that going forward, neural nets for fake detection need to be trained on a dataset that is cleansed of such biases. They offer such a dataset, a version of FEVER where the individual sentences are re-weighted so that the giveaway phrases carry less impact. The authors express the hope that such a more balanced dataset will lead to natural language models "performing the reasoning with respect to the evidence."
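One way to picture the re-weighting idea is to give less training weight to claims whose word pairs already predict the label on their own. This is a simplified stand-in for the authors' balancing scheme, again using invented claims:

```python
from collections import Counter

# Toy claims with FEVER-style labels, invented for illustration.
claims = [
    ("The film did not win awards",  "REFUTED"),
    ("The show did not air",         "REFUTED"),
    ("The film won three awards",    "SUPPORTED"),
    ("The show aired in 2001",       "SUPPORTED"),
]

def bigrams(text):
    words = text.lower().split()
    return list(zip(words, words[1:]))

# How often each bigram appears overall, and with each label.
pair_label = Counter((bg, lbl) for text, lbl in claims for bg in bigrams(text))
pair_total = Counter(bg for text, _ in claims for bg in bigrams(text))

def weight(text, label):
    """Down-weight a claim whose most biased repeated bigram predicts its label."""
    biases = [pair_label[(bg, label)] / pair_total[bg]
              for bg in bigrams(text) if pair_total[bg] > 1]
    bias = max(biases, default=0.5)   # 0.5 = uninformative baseline
    return 1.0 / (2 * bias)           # pure giveaway (bias 1.0) -> weight 0.5

# The "did not" claim carries less weight than the unbiased claim.
assert weight(*claims[0]) < weight(*claims[2])
```

With the giveaway phrases discounted like this, a model can no longer lower its training loss by memorizing word-pair statistics and has to lean on the evidence instead.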

Taken together, the two reports are another reminder that performance metrics for neural nets on tests can be misleading. Understanding what's true and what's false in sentences appears to be a harder task for a computer than might originally have been supposed. 

A lot more work will be needed to move AI beyond pattern recognition and toward something that can stand up to algorithms in malicious hands. 
