If you're reading these words, rest assured, they were written by a human being. Whether they amount to intelligence, that's for you to say.
The age of machine-generated writing that can pass muster with human readers is not quite upon us, at least not if one reads closely.
Scientists at the not-for-profit OpenAI this week released a neural network model that not only gobbles up tons of human writing (40 gigabytes' worth of Web-scraped data) but also discovers what kind of task it should perform, from answering questions to writing essays to performing translation, all without being explicitly told to do so, what's known as "zero-shot" learning of tasks.
The debut set off a swarm of headlines about new and dangerous forms of "deep fakes." The reality is that these fakes, while impressive, should easily yield to human discernment.
The singular insight of the OpenAI team, and it is a truly fascinating breakthrough, is that the probability of predicting the next word in a sentence can be extended to predicting the point of an utterance, that is, the objective of a task.
As they write, "Language provides a flexible way to specify tasks, inputs, and outputs all as a sequence of symbols." That leads to unsupervised learning by the machine, where no explicit goal needs to be set to train it.
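The underlying mechanism the paper builds on, predicting the next word from raw, unlabeled text, can be sketched with a toy bigram model. This is a minimal illustration only, not OpenAI's method: GPT-2 itself is a large Transformer trained on 40 gigabytes of text, but the training signal is the same in spirit.

```python
import random
from collections import defaultdict

# Toy next-word predictor: count which words follow which in unlabeled
# text, then sample a continuation from the observed distribution.
corpus = "the model reads text and the model predicts the next word".split()

bigrams = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev].append(nxt)

def predict_next(word):
    # Sample a plausible continuation; unseen words get no prediction.
    options = bigrams.get(word)
    return random.choice(options) if options else None

print(predict_next("the"))  # one of: "model", "next"
```

No labels were attached to the corpus; the statistics of the text alone drive the predictions, which is the unsupervised setup the authors describe, scaled down to a few lines.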
The project, by researchers Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever, was the shot heard round the world on Valentine's Day, and the press went to town with it.
"This AI is Too Powerful to Release to the Public" was the headline by PC Mag, fairly representative of the tone in the past 24 hours.
It was not merely the breakthrough test results of the new neural net, dubbed "GPT-2," that grabbed headlines. Even more striking to many was the decision by Radford and colleagues not to disclose the details of their network model, or release the code, for fear it could be used for malicious purposes.
As the authors explained in a blog post Thursday, in addition to many good uses of the technology that are imaginable, "We can also imagine the application of these models for malicious purposes," including "generating misleading news articles."
Adding fuel to things is the fact that OpenAI is backed by, among others, Tesla CEO Elon Musk.
As Marrian Zhou of ZDNet's sister site CNET wrote, "Musk-backed AI group: Our text generator is so good it's scary." Ed Baig of USA Today led with, "Too scary? Elon Musk's OpenAI company won't release tech that can generate fake news."
Adding to the general dismay is the fact that the work was done with the participation of Ilya Sutskever, who has contributed so much to advancing the art of natural language processing. He was instrumental in creating widely used approaches for "embedding" words and strings of words in computer representations, including "word2vec" and "seq2seq."
Although the code is not being released, some journalists were given a demo of the technology this week and seemed generally impressed. Vox's Kelsey Piper used the tool to finish an article she had started about GPT-2. Given a single sentence about GPT-2, the machine dashed off multiple paragraphs in keeping with the theme, convincing enough, perhaps, to pass as an article at a glance.
The results discussed in the formal paper, "Language Models are Unsupervised Multitask Learners," show the system performed well in several benchmark tests, beating previous state-of-the-art natural language processing models.
But fear not, much of the output of GPT-2 doesn't hold up under careful scrutiny.
The examples provided by OpenAI show a distinct lack of logical coherence. In addition, some all-too-familiar artifacts of computer output, such as duplication of terms, appear in many examples.
The overall feel of the texts is not unlike the feeling of the most advanced chat bots, where one has an experience of something less-than-intelligent at work.
The best examples OpenAI produced are fake news stories, where the form of the genre, which is already fairly disjointed, smooths over the lack of logic. It's a bit like Stephen Colbert's coinage "truthiness."
Two fake news pieces, one about the theft of nuclear material, and one about Miley Cyrus being caught shoplifting, convincingly ape the typical bag of facts contained in newswire copy.
The best example offered is a fictitious news account about unicorns being discovered in the Andes. The nine paragraphs of the piece are a compelling read that smells like standard journalistic fare. It's a bit hard to judge, though, because it has no basis in any actual logic about scientific process or facts of the Andes region (nor facts about unicorns).
When GPT-2 moves on to tackle writing that requires more development of ideas and of logic, the cracks break open fairly wide. A sample essay about the U.S. Civil War, prompted merely by the single sentence, "For today's homework assignment, please describe the reasons for the US Civil War," shapes up as something that could well be submitted in a class. But it is a jumble of disjointed and inchoate factoids and opinions. Some high-school essays are just as much of a mess, but they, too, would be marked down as gibberish.
Examples contained in the formal research paper show similar weaknesses. One short piece takes as input the human-written paragraphs of a 2016 fashion blog post by Ethan M. Wong of Street x Sprezza. The machine proceeds to bollix up all the references into an utter mess.
In another instance, the machine is fed some human-written text about tourist attractions in Spain. The machine proceeds to generate fine English sentences about the history of Moorish Spain, but the information is not internally consistent. The "Reconquista" of Spain is described first as the establishment of a Muslim dynasty in Spain, and then subsequently as the end of Muslim rule. This machine historian, in other words, roams all over the place without discipline.
None of which, however, should diminish what appears to be a substantial accomplishment for the OpenAI team. Not only have they trained a machine to produce perfectly valid sequences of words based on human examples, without any labels on those examples; they have also shown that the computer can guess the task simply by inferring it from the text itself.
The authors sum up with the observation that despite some fine achievements on benchmarks, much work remains to be done.
"There are undoubtedly many practical tasks where the performance of GPT-2 is still no better than random," they write. "Even on common tasks that we evaluated on, such as question answering and translation, language models only begin to outperform trivial baselines when they have sufficient capacity."