
Google Cloud Platform launches text-to-speech service to compete with AWS Polly

The new service from Google Cloud Platform highlights how it is leveraging models and technology from the search giant's DeepMind subsidiary.
Written by Larry Dignan, Contributing Editor


Google Cloud outlined Cloud Text-to-Speech, a machine learning service that uses a model by Google's DeepMind subsidiary to analyze raw audio.

With the move, developers will get more access to the natural-sounding text-to-speech technology used in Google Assistant, Search, Maps and other products.

According to Google, Cloud Text-to-Speech can be used to power call center voice response systems, enable Internet of Things devices to talk and convert text-based media into spoken formats.

Google Cloud Text-to-Speech allows customers to choose from 32 voices in 12 languages. Customers can also adjust pitch, speaking rate and volume gain, and choose the audio format.
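
For developers, those options correspond to a handful of request fields. The sketch below is illustrative only and is not taken from Google's announcement: it builds a JSON body in the shape of the service's public `text:synthesize` REST method as best recalled, without making any network call. The helper name `build_synthesize_request`, the field names and the numeric limits are assumptions to check against the current API reference.

```python
import json

# Allowed ranges as recalled from the public API reference.
# Treat these as assumptions and verify them before relying on them.
LIMITS = {
    "speakingRate": (0.25, 4.0),   # 1.0 is normal speed
    "pitch": (-20.0, 20.0),        # semitones relative to the default
    "volumeGainDb": (-96.0, 16.0), # decibels relative to the default
}


def build_synthesize_request(text, language_code="en-US", voice_name=None,
                             audio_encoding="MP3", speaking_rate=1.0,
                             pitch=0.0, volume_gain_db=0.0):
    """Return a dict shaped like a text:synthesize request body (hypothetical helper)."""
    audio_config = {
        "audioEncoding": audio_encoding,
        "speakingRate": speaking_rate,
        "pitch": pitch,
        "volumeGainDb": volume_gain_db,
    }
    # Reject values outside the assumed ranges before anything is sent.
    for key, (low, high) in LIMITS.items():
        if not low <= audio_config[key] <= high:
            raise ValueError(f"{key}={audio_config[key]} is outside [{low}, {high}]")

    voice = {"languageCode": language_code}
    if voice_name:
        # Only pin a specific voice when the caller asks for one.
        voice["name"] = voice_name

    return {"input": {"text": text}, "voice": voice, "audioConfig": audio_config}


if __name__ == "__main__":
    body = build_synthesize_request("Hello from the cloud.", speaking_rate=1.1)
    print(json.dumps(body, indent=2))
```

Keeping request construction separate from the HTTP call makes the pitch, rate and gain settings easy to validate and test before any billable request is made.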


The primary competition for Google Cloud Text-to-Speech will be Amazon Web Services' Polly, which offers 47 voices. Polly is also used in call centers and applications.


The rollout of the service also highlights how Google is leveraging DeepMind technology for Google Cloud Platform. The DeepMind technology used in Cloud Text-to-Speech is called WaveNet. A year ago, WaveNet would create raw audio waveforms from scratch using a neural network trained on speech samples.

When given text, WaveNet would generate speech from scratch, one sample at a time, for accuracy.
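
To see why sample-by-sample generation is slow, consider a toy illustration. This is not WaveNet: the hypothetical `predict_next` function below stands in for a trained neural network with a fixed two-term formula that happens to trace a sine wave, and the 16 kHz sample rate is an assumed figure. What the sketch shares with the approach described above is the loop structure, in which every new sample is computed from the samples already generated.

```python
import math

SAMPLE_RATE = 16_000  # samples per second; an assumed rate for illustration


def predict_next(context, freq_hz=440.0):
    """Hypothetical stand-in for a trained model.

    Computes the next sample from the two most recent samples using the
    recurrence x[n] = 2*cos(w)*x[n-1] - x[n-2], which yields a sine wave.
    """
    w = 2 * math.pi * freq_hz / SAMPLE_RATE
    return 2 * math.cos(w) * context[-1] - context[-2]


def generate(num_samples, freq_hz=440.0):
    """Generate audio one sample at a time, each step depending on the last."""
    w = 2 * math.pi * freq_hz / SAMPLE_RATE
    samples = [0.0, math.sin(w)]  # two seed samples to start the recurrence
    while len(samples) < num_samples:
        # Each step must wait for the previous one, so steps cannot run in parallel.
        samples.append(predict_next(samples[-2:], freq_hz))
    return samples[:num_samples]


if __name__ == "__main__":
    one_second = generate(SAMPLE_RATE)
    print(f"{len(one_second)} sequential predictions for one second of audio")
```

At the assumed rate, one second of audio requires 16,000 predictions made strictly in order, which is why the speed of each step matters so much.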

But with an update, WaveNet now runs on Google Cloud's TPU infrastructure and can generate raw waveforms 1,000 times faster than before. That combination of fidelity and speed allows WaveNet to create more human-sounding audio.
