Google has launched its new near real-time transcription feature as part of Google Translate, which is available for eight languages.
Google showed off the AI-powered feature this January and will now roll it out on Android with support for transcribed translations between any of a set of eight languages: English, French, German, Hindi, Portuguese, Russian, Spanish, and Thai.
The Transcribe feature will roll out over the course of this week, letting Google Translate users turn spoken audio into translated text on the fly.
Google explains that Translate didn't previously cater to longer translated discussions, for example, at a conference, lecture or for story-telling.
"We identified that gap and consider it an important part of human communication. This feature that transcribes audio into translated text fills that gap," said a Google spokesperson.
Android users can try out the Transcribe feature after installing an update to the Google Translate app. Users then need to press the 'Transcribe' option available on the home screen of the app. They also need to choose the source and target languages from a dropdown menu. To start a transcription, the user needs to tap the mic icon.
The feature isn't available on the iOS Google Translate app yet, but Google does plan to deliver it in the future.
The machine learning behind Transcribe builds on Google's previous work on its Live Transcribe accessibility feature on Android, which offers hearing-impaired users a real-time captioning service. Similar to Live Transcribe, the Google Translate Transcribe feature is powered by Google Cloud and its Tensor Processing Units (TPUs), so transcription doesn't happen on the device.
The Live Transcribe accessibility feature is built on the same automatic speech-recognition technology used to provide automated captions on YouTube videos and Google Slides presentations.
The Google Translate Transcribe feature combines Google's real-time continuous translation with TPU hardware.
As a Google spokesperson put it, Google engineers stacked machine translation on top of a real-time automatic speech-recognition system, allowing the system to generate a new translation for every update to a recognized transcript.
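To make that stacking concrete, here is a minimal Python sketch of the general idea, not Google's implementation. It assumes the speech recognizer emits a stream of progressively revised transcripts, and it re-runs translation whenever the transcript changes. The names retranslate_stream, toy_translate and TOY_LEXICON are hypothetical, the word-for-word lexicon stands in for a real machine-translation model, and skipping unchanged transcripts is an added assumption.

```python
from typing import Callable, Iterable, Iterator, List, Optional


def retranslate_stream(
    partials: Iterable[str],
    translate: Callable[[str], str],
) -> Iterator[str]:
    """Yield a fresh translation each time the recognized transcript changes.

    `partials` stands in for a streaming speech recognizer that keeps
    revising its best guess at what has been said so far.
    """
    last: Optional[str] = None
    for transcript in partials:
        # Assumption: an unchanged transcript does not count as an update.
        if transcript == last:
            continue
        last = transcript
        # Translate the whole transcript again rather than only the new
        # words, so earlier output can be corrected when recognition changes.
        yield translate(transcript)


# Hypothetical toy lexicon standing in for a real translation model.
TOY_LEXICON = {"hola": "hello", "mundo": "world", "amigos": "friends"}


def toy_translate(text: str) -> str:
    """Word-for-word lookup; unknown words pass through unchanged."""
    return " ".join(TOY_LEXICON.get(word, word) for word in text.lower().split())


def demo() -> List[str]:
    # Simulated recognizer output: a repeat, an extension, then a revision.
    partials = ["hola", "hola", "hola mundo", "hola amigos"]
    return list(retranslate_stream(partials, toy_translate))


if __name__ == "__main__":
    for line in demo():
        print(line)
```

In this sketch the last simulated update revises "mundo" to "amigos", and because the full transcript is translated again, the displayed translation changes with it. That is the practical effect of generating a new translation for every transcript update.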