Building a voice interface: The do's and don'ts

There's huge potential in conversational interfaces, but there are also pitfalls to avoid, Google's Daniel Padgett explained at the Google I/O conference.
Written by Stephanie Condon, Senior Writer

With speech recognition technology at human parity, voice has been touted as the next big interface. But as with any new technology, there are major challenges that come with the opportunities.

At the Google I/O conference, Daniel Padgett, the conversation design lead for Google Assistant, walked through some of the strategies and considerations for developers interested in building conversational interfaces.

The value of voice, he said, "boils down to three things: speed, simplicity, ubiquity."

Done well, voice interfaces are "faster than pulling your mobile phone out of your back pocket," Padgett said.

On top of that, they're easy to use because just about everyone knows how to have a conversation. "They've been doing this... since they were born," he said. "There's really nothing for them to learn. The promise is: say what you want, get what you want."

Meanwhile, there's massive potential in the growing number of devices that can be outfitted with a microphone and a speaker.

That said, not every use case is suited for voice: "The thing you need to make sure is it's easier than the alternative," Padgett said. "If it isn't, it probably isn't worth pursuing."

With that in mind, here are some pointers Padgett offered for building voice UIs:

Don't: Overestimate the technology

While the word error rate is impressively low, "recognizing the words they say is much different than understanding what they mean when they say them," Padgett said. To some extent, recognition is a "solved problem," but understanding is a tougher nut to crack. If a user asks, "What's the weather in Springfield?" you'll have to figure out which Springfield. Turning to context -- such as the user's location or past activity -- may help.

Do: Make people comfortable

Use everyday language users can relate to. Ask questions that are easy to answer. Structure information in a way that supports easy recall.

Do: Acknowledge ambiguity

When you don't know the answer, let users clarify. "Taking the extra step is way better than the corrections they'll have to make if you get it wrong," Padgett said.

Don't: Try to "voice-ify" your app

When it comes to functionality, "there's a tendency to say everything needs to be covered," Padgett said. Yet not all functions are simple enough for voice. Voice is best suited to functions that take only a few steps, such as quick transactions.

Do: Listen to your users and keep iterating

Pay attention to what people are asking for, and try to design for that. "This is not once and done, you want to make sure this is an iterative thing," Padgett said. "People are literally telling you what they want, so listen to them."
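For developers who want to see how two of these pointers might fit together, here is a minimal Python sketch of the "which Springfield?" problem. It is an illustration only, not something Padgett presented and not how Google Assistant works: the `Place` type, the `KNOWN_PLACES` list and the `resolve_city` function are all hypothetical names invented for this example. The idea is to lean on context (here, an assumed `user_state` value) when it settles the question, and to ask the user to clarify when it does not.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Place:
    city: str
    state: str


# Hypothetical candidate list; a real system would query a place database.
KNOWN_PLACES = [
    Place("Springfield", "Illinois"),
    Place("Springfield", "Massachusetts"),
    Place("Springfield", "Missouri"),
    Place("Portland", "Oregon"),
]


def resolve_city(
    spoken_city: str, user_state: Optional[str] = None
) -> Tuple[Optional[Place], Optional[str]]:
    """Return (place, prompt); exactly one of the two is not None."""
    matches = [p for p in KNOWN_PLACES if p.city.lower() == spoken_city.lower()]

    if not matches:
        return None, f"Sorry, I couldn't find {spoken_city}. Which city did you mean?"

    if len(matches) == 1:
        return matches[0], None

    # Several candidates: use context (assumed here to be the user's state).
    if user_state is not None:
        nearby = [p for p in matches if p.state == user_state]
        if len(nearby) == 1:
            return nearby[0], None

    # Context did not settle it, so ask rather than guess.
    options = ", ".join(p.state for p in matches)
    return None, f"Which {matches[0].city} did you mean: {options}?"
```

Calling `resolve_city("springfield", user_state="Illinois")` in this sketch returns the Illinois entry with no prompt, while `resolve_city("Springfield")` with no context returns a clarifying question listing the three candidate states.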
