Google Translate might be the go-to translator service for most people, but Microsoft's Translator is catching up with the addition of 12 new languages and dialects.
Microsoft Translator now supports 103 languages with the addition of 12 languages spoken by 84.6 million people: Bashkir, Dhivehi, Georgian, Kyrgyz, Macedonian, Mongolian (Cyrillic), Mongolian (Traditional), Tatar, Tibetan, Turkmen, Uyghur, and Uzbek (Latin).
Google announced support for 108 languages in Google Translate after a rare update to language support last February, which added Kinyarwanda, Odia, Tatar, Turkmen, and Uyghur to the list.
Both companies are using artificial intelligence in their cloud infrastructure to reach different language groups across the world.
"With this release, the Translator service can translate text and documents to and from languages natively spoken by 5.66 billion people worldwide," the Microsoft Research group said in a blog post.
While Microsoft now covers the vast majority of people, its advances in translation, like Google's, come as many of the world's roughly 7,000 languages are at risk of dying out.
Microsoft began building its machine translation systems more than 20 years ago, initially targeting its Knowledge Base (KB) articles, such as those that accompany its Patch Tuesday release notes.
In 2003, a machine translation system translated the entire Microsoft Knowledge Base from English into Spanish, French, German, and Japanese. The translated content was published on Microsoft's website, making it the largest public-facing application of raw machine translation on the internet at the time.
"Microsoft evolved the systems further based on statistical machine translation (SMT) models and made it available to the public through Windows Live Translator, the Translator API, and as a built-in function in Microsoft Office applications," the company said.
The big change came when Microsoft moved its translation systems to neural machine translation (NMT) and to models based on the transformer architecture, which allowed it to train models on smaller amounts of material, such as documents, than was previously possible.
"Using multilingual transformer architecture, we could now augment training data with material from other languages, often in the same or a related language family, to produce models for languages with small amounts of data – commonly referred to as low-resource languages," it notes.
But Microsoft still needs human translators to build models for rarer languages: people have to translate documents from one language to another before a model can be trained on them.
The goal is for Microsoft to develop Azure cloud services that help businesses expand their reach to customers in other markets where different languages are spoken.
The key tools to enable this are Azure Cognitive Services Translator APIs in the public cloud and in Microsoft's Azure Government Cloud. The Text Translation API is available in Docker containers, allowing customers to process content on-premises.
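For developers, a call to the Text Translation API might look roughly like the sketch below. It is not taken from Microsoft's announcement: it assumes version 3.0 of the Translator REST API on the public global endpoint, uses placeholder values for the subscription key and region, and assumes "ka" is the code for Georgian, one of the newly added languages. The helper names build_translate_request and translate are made up for illustration, and a container or Azure Government deployment would use a different base URL.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public global endpoint for Translator v3. A container or
# Azure Government deployment would expose a different base URL.
ENDPOINT = "https://api.cognitive.microsofttranslator.com"


def build_translate_request(text, to_lang, key, region, from_lang=None):
    """Build (but do not send) a POST request for the v3 /translate route.

    `key` and `region` are the caller's own Translator resource credentials.
    """
    params = {"api-version": "3.0", "to": to_lang}
    if from_lang:
        # When omitted, the service attempts to detect the source language.
        params["from"] = from_lang
    url = f"{ENDPOINT}/translate?{urllib.parse.urlencode(params)}"

    # The request body is a JSON array of objects, each with a "Text" field.
    body = json.dumps([{"Text": text}]).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Ocp-Apim-Subscription-Region": region,
        "Content-Type": "application/json; charset=UTF-8",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def translate(text, to_lang, key, region, from_lang=None):
    """Send the request and return the first translated string."""
    req = build_translate_request(text, to_lang, key, region, from_lang)
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    return result[0]["translations"][0]["text"]


# Example usage (needs a real key and region, so it is left commented out):
# print(translate("Hello, world", "ka", key="<your-key>", region="<your-region>"))
```

Keeping request construction separate from the network call makes the sketch easy to inspect or adapt without credentials.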