Google announces Neural Machine Translation to improve Google Translate

The Google Neural Machine Translation system 'surpasses' the results of all other machine-translation solutions currently available, with GNMT now being used for Chinese-to-English translations.
Written by Corinne Reichert, Contributor

Google has announced a neural machine translation (NMT) system that it says will reduce translation errors across its Google Translate service by between 55 percent and 85 percent, calling the achievement by its team of scientists a "significant milestone".

In an announcement made on the 10-year anniversary of Google Translate, Google Brain Team research scientists Quoc Le and Mike Schuster said Google has used machine intelligence to improve its image- and speech-recognition systems, but had found it "challenging" to improve machine translation until now.

"Ten years ago, we announced the launch of Google Translate, together with the use of Phrase-Based Machine Translation as the key algorithm behind this service," they said.

"Today, we announce the Google Neural Machine Translation system (GNMT), which utilizes state-of-the-art training techniques to achieve the largest improvements to date for machine translation quality."

Unlike the currently used phrase-based machine translation (PBMT) system -- which translates words and phrases independently within a sentence, and is notorious for its mistranslations -- neural machine translation considers the entire sentence as one unit to be translated.

Researchers have been working on improving neural machine translation over the past few years, with Google's scientists finding a way to make it work on large data sets while maintaining speed and accuracy.

The accompanying technical report, Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation [PDF], by Quoc Le, Mike Schuster, Yonghui Wu, Zhifeng Chen, and Mohammad Norouzi et al, said the system is "state of the art" for English-to-French and English-to-German translations in particular, and reduces translation errors by 60 percent on average.

"Our model consists of a deep LSTM [long short-term memory] network with eight encoder and eight decoder layers using residual connections as well as attention connections from the decoder network to the encoder. To improve parallelism and therefore decrease training time, our attention mechanism connects the bottom layer of the decoder to the top layer of the encoder," the technical report said.

"To accelerate the final translation speed, we employ low-precision arithmetic during inference computations. To improve handling of rare words, we divide words into a limited set of common sub-word units ('wordpieces') for both input and output. This method provides a good balance between the flexibility of 'character'-delimited models and the efficiency of 'word'-delimited models, naturally handles translation of rare words, and ultimately improves the overall accuracy of the system."
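The idea of splitting rare words into common sub-word units can be illustrated with a minimal sketch. It assumes a greedy longest-match-first split against a fixed vocabulary, which is a simplification chosen for illustration and not necessarily the procedure GNMT uses; the function name `wordpiece_segment`, the `<unk>` token, and the toy vocabulary are all hypothetical.

```python
def wordpiece_segment(word, vocab, unk="<unk>"):
    """Greedy longest-match-first split of one word into sub-word units."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate span until it matches a vocabulary entry.
        while end > start and word[start:end] not in vocab:
            end -= 1
        if end == start:
            # Not even a single character matched, so give up on this word.
            return [unk]
        pieces.append(word[start:end])
        start = end
    return pieces


# Hypothetical toy vocabulary; a real one would be learned from training data.
VOCAB = {"trans", "lat", "ion", "s", "un", "able"}

if __name__ == "__main__":
    for w in ("translations", "untranslatable", "xyz"):
        print(w, "->", wordpiece_segment(w, VOCAB))
```

A word the model has never seen as a whole can still be represented as long as its pieces are in the vocabulary, which is the trade-off between character-level flexibility and word-level efficiency that the report describes.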

Google used the example of a Chinese sentence being translated into English:

(Image: Google)

"First, the network encodes the Chinese words as a list of vectors, where each vector represents the meaning of all words read so far ('Encoder'). Once the entire sentence is read, the decoder begins, generating the English sentence one word at a time ('Decoder'). To generate the translated word at each step, the decoder pays attention to a weighted distribution over the encoded Chinese vectors most relevant to generate the English word," Google said.
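The "weighted distribution" over encoder vectors can be sketched in a few lines. This is a minimal illustration that assumes plain dot-product scoring between the decoder state and each encoder vector; that scoring function is a stand-in picked for brevity, not a claim about how GNMT computes attention. The names `attend`, `encoder_vectors`, and `decoder_state`, and the toy numbers, are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_vectors):
    """Return (weights, context) for one decoding step."""
    # Score each source position by its dot product with the decoder state.
    scores = [
        sum(d * e for d, e in zip(decoder_state, vec))
        for vec in encoder_vectors
    ]
    weights = softmax(scores)
    # The context vector is the weighted average of the encoder vectors.
    dim = len(encoder_vectors[0])
    context = [
        sum(w * vec[k] for w, vec in zip(weights, encoder_vectors))
        for k in range(dim)
    ]
    return weights, context


# Toy example: three source positions, two-dimensional vectors.
encoder_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_vectors)

if __name__ == "__main__":
    print("weights:", weights)
    print("context:", context)
```

Source positions whose vectors align with the current decoder state receive larger weights, so they contribute more to the context used to produce the next output word.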

"Using human-rated side-by-side comparison as a metric, the GNMT system produces translations that are vastly improved compared to the previous phrase-based production system."

Google has launched GNMT in production across Google Translate on web and mobile for the Chinese-to-English translation pair, accounting for around 18 million translations daily. It will roll out GNMT with more languages over the coming months.

(Image: Google)

"Our key findings are: That wordpiece modeling effectively handles open vocabularies and the challenge of morphologically rich languages for translation quality and inference speed; that a combination of model and data parallelism can be used to efficiently train state-of-the-art sequence-to-sequence NMT models in roughly a week; that model quantization drastically accelerates translation inference, allowing the use of these large models in a deployed production environment; and that many additional details like length-normalization, coverage penalties, and similar are essential to making NMT systems work well on real data," the technical report concluded.
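Length normalization and coverage penalties are adjustments applied when ranking candidate translations. The sketch below shows one common form: the log-probability is divided by a length-dependent factor so longer outputs are not unfairly penalized, and a coverage term lowers the score of candidates that leave source words unattended. The constants (the offsets 5 and 6, and the default `alpha` and `beta`), the small floor `eps`, and all function names are assumptions for illustration and should be checked against the technical report.

```python
import math


def length_penalty(length, alpha=0.2):
    """Grows slowly with output length; equals 1.0 when length is 1."""
    return ((5.0 + length) / 6.0) ** alpha


def coverage_penalty(attention, beta=0.2, eps=1e-9):
    """attention[j][i] is the weight output step j puts on source word i."""
    num_src = len(attention[0])
    total = 0.0
    for i in range(num_src):
        covered = sum(step[i] for step in attention)
        # Cap at 1.0 so fully covered words add nothing; floor at eps
        # (an assumption of this sketch) to avoid taking log(0).
        total += math.log(max(min(covered, 1.0), eps))
    return beta * total


def score(log_prob, length, attention, alpha=0.2, beta=0.2):
    """Rank a candidate: normalized log-probability plus coverage term."""
    return log_prob / length_penalty(length, alpha) + coverage_penalty(attention, beta)


if __name__ == "__main__":
    full = [[1.0, 0.0], [0.0, 1.0]]     # both source words attended
    partial = [[1.0, 0.0], [1.0, 0.0]]  # second source word ignored
    print(score(-3.0, 2, full), score(-3.0, 2, partial))
```

With the same raw log-probability, the candidate that attends to every source word scores higher than the one that ignores part of the input.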

Google used its machine-learning toolkit TensorFlow and its Tensor Processing Unit (TPU) chips to bring GNMT into production with sufficiently low latency and enough computational power.

At Google I/O in May, Google said machine learning and artificial intelligence will be its big bet on the future of the business.

"We're working hard at core use cases on mobile," said CEO Sundar Pichai during the Google I/O keynote.

"We are pushing ourselves really hard so Google is staying a step ahead of our customers."

Staying ahead in mobile, according to Pichai, means providing a better, more assisted experience by integrating machine-learning tools through its cloud platform and open-source community via TensorFlow.

"We can do things [we] never could do before," said Pichai.

"We believe we're at a seminal moment in [the] next 10 years and want to take [the] next step to be more assistive. We want to be there for our users. We want to help you get things done in the real world and understanding your context.

"The real test is whether humans can achieve a lot more with AI assisting them. We look forward to building this future."
