Google improves speech recognition for Contact Center tools

A series of new and improved features for Google's contact center AI should make it easier for developers to build high-quality voice bots and generate more accurate transcriptions.
Written by Stephanie Condon, Senior Writer

Google on Tuesday announced new and enhanced contact center tools, with improvements to the underlying speech recognition technology. The improvements, which are the most significant since Google announced its Contact Center AI last July, will impact services for building voice bots as well as services for transcribing conversations. 

For building better bots, Google is introducing a new feature to Dialogflow, its development suite for building conversational interfaces. Called Auto Speech Adaptation, the feature effectively adds context to conversations. Context can help a live person or virtual agent understand, for instance, when a customer is talking about "mail" rather than "male" or a similar-sounding word like "nail."

Auto Speech Adaptation, which is available in beta, automatically adds appropriate context to Dialogflow from the training phrases and other agent-specific information available. A developer can activate Auto Speech Adaptation by clicking the "on" switch in the Dialogflow console. In some cases, Google said, the feature can improve the accuracy of virtual agents by more than 40 percent.

"With the flip of a switch, you're basically getting custom speech recognition," Google Cloud's Dan Aharon said to ZDNet. 

By tackling a common business challenge -- efficiently building high-quality Interactive Voice Response tools (IVRs) that can actually help customers -- Google's contact center tools are giving the AI company a foothold in the enterprise market.

"Up until now, IVRs were pretty basic and the user experience was such that people just wanted to press zero or shout 'representative' and escape the IVR as soon as possible," Aharon said. "We want to help build experiences that actually help people get a high-quality service they appreciate and doesn't require them to repeat themselves too much or take them through complicated menus."

Google Cloud's AI tools are rapidly gaining traction among developers building bots, Aharon said. At the Google Cloud Next conference in April, the company said there were more than 850,000 developers in the Dialogflow community -- up from just over 150,000 two years prior.

"A lot of those are longtail developers, but we also have thousands of enterprise customers engaged with Contact Center AI and Dialogflow and Speech," Aharon said. "In terms of numbers of transactions, we've crossed into the billions a long time ago. It's at scale and growing really, really fast."

In addition to the new Dialogflow feature, Google is rolling out baseline model improvements to its Speech-to-Text transcription tools for IVRs and phone-based virtual agents. The new model is 15 percent more accurate than the version extended to all customers in February. 

Meanwhile, three updates to Google's SpeechContext parameters, all in beta, should also significantly help developers building contact center applications. With SpeechContext parameters, developers can add contextual information -- such as industry jargon -- in Cloud Speech-to-Text to make transcriptions more accurate. 

First, Google is adding SpeechContext classes, so developers can add a whole class of words, rather than adding them one by one. Next, with SpeechContext boost, developers can fine-tune the likelihood that conversations will include a certain phrase. Lastly, Google has expanded the number of "phrase hints" per API request from 500 to 5,000. 
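As a rough illustration of how these three updates might fit together in one request, here is a minimal Python sketch that assembles a request configuration as a plain dictionary. It makes no API calls. The helper `build_recognition_config` is hypothetical, and the field names (`speechContexts`, `phrases`, `boost`, `languageCode`), the `$MONTH` class token and the boost value of 10.0 are assumptions that should be checked against Google's current Speech-to-Text documentation. Only the 5,000-phrase limit comes from the announcement.

```python
import json

# Limit on phrase hints per request, as stated in the announcement.
MAX_PHRASES_PER_REQUEST = 5000


def build_recognition_config(phrases, boost=None, language_code="en-US"):
    """Return a dict shaped like a Speech-to-Text recognition config.

    Hypothetical helper: field names are assumed, not confirmed by the article.
    """
    phrases = list(phrases)
    if len(phrases) > MAX_PHRASES_PER_REQUEST:
        raise ValueError(
            f"{len(phrases)} phrases exceeds the limit of {MAX_PHRASES_PER_REQUEST}"
        )
    context = {"phrases": phrases}
    if boost is not None:
        # A higher boost is meant to make the listed phrases more likely.
        context["boost"] = float(boost)
    return {"languageCode": language_code, "speechContexts": [context]}


config = build_recognition_config(
    [
        "mail",               # a single word the caller is likely to say
        "certified mail",     # a multi-word phrase
        "deliver by $MONTH",  # assumed class token standing in for any month
    ],
    boost=10.0,  # arbitrary illustrative value
)
print(json.dumps(config, indent=2))
```

The class token is the point of the first update: one entry covers every month name, instead of twelve separate phrases.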

Speech-to-Text also now supports MP3 files, as well as streaming audio for up to five minutes. Developers can start a new streaming session where a previous one left off, which allows for effectively endless streaming.
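The idea of chaining sessions can be sketched without touching the API at all. The hypothetical Python generator below, `split_into_sessions`, groups fixed-length audio chunks into consecutive sessions that each stay within the five-minute limit. In a real client, each group would be sent as its own streaming request; that part is omitted here, and the chunk size is an arbitrary assumption.

```python
def split_into_sessions(chunks, chunk_ms, limit_ms=300_000):
    """Group consecutive audio chunks into sessions no longer than limit_ms.

    Hypothetical helper: each yielded list stands for one streaming session.
    """
    if chunk_ms <= 0 or chunk_ms > limit_ms:
        raise ValueError("chunk_ms must be positive and no larger than limit_ms")
    max_chunks = limit_ms // chunk_ms  # whole chunks that fit in one session
    session = []
    for chunk in chunks:
        session.append(chunk)
        if len(session) == max_chunks:
            # Session is full: hand it off and start the next one
            # exactly where this one left off.
            yield session
            session = []
    if session:
        yield session  # final, shorter session


# 700 one-second chunks span more than two five-minute sessions.
audio = [b"\x00"] * 700
print([len(s) for s in split_into_sessions(audio, chunk_ms=1000)])
# prints [300, 300, 100]
```

Working in whole milliseconds rather than fractional seconds avoids floating-point drift when counting how many chunks fit in a session.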

"In the past for the type of leading IVRs like a lot of customers are creating, they'd need to hire a professional services firm to build these experiences, and it could cost millions of dollars," Aharon said. "Now we're giving you a lot of that power through APIs."
