Boaty McBoatface, meet Parsey McParseface.
Okay, so Boaty McBoatface never actually became the name of the British government's new polar research vessel, but that didn't stop tech giant Google from using its own version of the moniker for its newly open-sourced English language parser.
More specifically, Google today is releasing SyntaxNet, its open-source neural network framework implemented in TensorFlow. Today's release includes all the code needed to train new SyntaxNet models, as well as Parsey McParseface, which is essentially the English language plug-in for SyntaxNet.
Google said SyntaxNet provides a foundation for its Natural Language Understanding (NLU) systems, such as the voice recognition capabilities of Google Now. Parsey McParseface, Google explained, is built on machine learning algorithms that analyze the linguistic structure of a sentence in order to understand the functional role of every word and grammatical building block in it.
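To make "the functional role of every word" concrete, here is a minimal Python sketch of one common way to write down a parse: each word points to the word it depends on and carries a label for its role. The sentence, the hand-written parse, the `Token` type and the label names (`nsubj`, `dobj`) are illustrative assumptions, not SyntaxNet's actual output format.

```python
from typing import NamedTuple


class Token(NamedTuple):
    index: int     # 1-based position in the sentence
    word: str
    head: int      # index of the word this token depends on; 0 means the root
    relation: str  # grammatical role of this token relative to its head


# Hand-written parse of "Alice saw Bob" (illustrative, not real parser output).
PARSE = [
    Token(1, "Alice", 2, "nsubj"),  # Alice is the subject of "saw"
    Token(2, "saw", 0, "ROOT"),     # "saw" is the root of the sentence
    Token(3, "Bob", 2, "dobj"),     # Bob is the direct object of "saw"
]


def describe(parse):
    """Render each token as 'word --relation--> head word'."""
    words = {t.index: t.word for t in parse}
    return [
        f"{t.word} --{t.relation}--> {words.get(t.head, 'ROOT')}"
        for t in parse
    ]


for line in describe(PARSE):
    print(line)
```

A parser's job, in this picture, is to fill in the `head` and `relation` columns for every word of a sentence it has never seen before.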
"One of the main problems that makes parsing so challenging is that human languages show remarkable levels of ambiguity," Google explained in a blog post. "It is not uncommon for moderate length sentences -- say 20 or 30 words in length -- to have hundreds, thousands, or even tens of thousands of possible syntactic structures. A natural language parser must somehow search through all of these alternatives, and find the most plausible structure given the context."
Google claims Parsey McParseface achieved 94 percent accuracy when put to the test on English-language news articles. While the accuracy is not perfect, Google insists it's high enough to be useful in a bevy of applications. That's because correctly interpreting the grammatical structure of a sentence is critical if a computer is to properly act on a sentence's meaning.
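The article does not say how that accuracy figure is calculated. One common way to score a parser, assumed here purely for illustration, is to count the share of words attached to the correct head word. The Python sketch below scores a hypothetical prediction against a hypothetical reference parse of the ambiguous sentence "I saw the man with the telescope"; the function name and both parses are made up for the example.

```python
def attachment_accuracy(reference_heads, predicted_heads):
    """Fraction of words whose predicted head matches the reference head."""
    if len(reference_heads) != len(predicted_heads):
        raise ValueError("parses must cover the same number of words")
    if not reference_heads:
        raise ValueError("cannot score an empty sentence")
    correct = sum(r == p for r, p in zip(reference_heads, predicted_heads))
    return correct / len(reference_heads)


# Words:  I  saw  the  man  with  the  telescope   (positions 1..7, 0 = root)
# Reference reading: "with" attaches to "saw" (the telescope was used to see).
reference = [2, 0, 4, 2, 2, 7, 5]
# Hypothetical prediction: "with" attaches to "man" (the man has the telescope).
predicted = [2, 0, 4, 2, 4, 7, 5]

score = attachment_accuracy(reference, predicted)
print(f"{score:.1%} of words attached correctly")
```

Here a single wrong attachment out of seven words changes who is holding the telescope. Under this assumed measure, 94 percent would mean roughly six words in every hundred end up attached to the wrong head.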