iPhone 5: The hurdles to overcome for voice control

The next iPhone is expected to make voice control a big part of the interface, but there are some big hurdles that must be overcome to get owners to use it.
Written by James Kendrick, Contributor

The next iPhone will be unveiled this week by Apple, and as is usually the case, information has trickled out about what we can expect from Cupertino. One of the expected new features of the iPhone 5 is Assistant, a spoken interface that lets users tell the phone what to do by speaking to it.

Assistant is reported to be the result of Apple's purchase of Siri, and early reports say it is thoroughly integrated into the iPhone system. The fact is this is nothing new, and Apple may find it as hard to get iPhone owners to adopt voice control as others have before it.

Voice control, or speech recognition, has been around for a very long time. I have been using it since the premier solution was IBM's ViaVoice. The technology is quite complex, as voice control systems must take any spoken phrase, parse it, and then correctly convert it into text a system can understand. Since no two people speak the same phrases the same way, it is difficult to implement properly on systems with limited processors and memory.

Speech recognition is already used quite a bit in the real world. If you have ever called a company and spoken your menu choices to get sent where you needed to be, you have experienced speech recognition first-hand. Owners of Android phones and tablets have enjoyed voice control for a while, with Google search having a very good speech interface.

Many Android phone owners are not aware that a year ago Google released Voice Actions, which does a lot of the things that the next iPhone's Assistant is supposed to do. This voice control has been integrated into Voice Search, and makes it possible to send text messages and emails by speaking them into the phone. It is integrated into Google Navigation, and the simple phrase "navigate to Reliant Stadium Houston" is all you need to speak to fire up the proper navigation action.
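
To make the last step concrete, here is a minimal sketch of how already-recognized text might be mapped to an action. It is an illustration only: the patterns, the action names, and the `parse_command` function are hypothetical and are not Google's or Apple's actual grammar. It also assumes the hard part, turning audio into text, has already been done.

```python
import re

# Hypothetical command patterns, invented for this illustration.
PATTERNS = [
    ("navigate", re.compile(r"^navigate to (?P<destination>.+)$", re.IGNORECASE)),
    ("send_text", re.compile(r"^send text to (?P<recipient>\w+) (?P<message>.+)$", re.IGNORECASE)),
]


def parse_command(transcript):
    """Map recognized text to an action dict, falling back to a plain search."""
    text = transcript.strip()
    for action, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            # Named groups become the action's parameters.
            return {"action": action, **match.groupdict()}
    # Anything that matches no command is treated as a search query.
    return {"action": "search", "query": text}


print(parse_command("navigate to Reliant Stadium Houston"))
```

A real assistant has to cope with far looser phrasing than fixed patterns like these allow, which is part of why natural-feeling voice control is hard to build.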

Demo of Voice Actions by Google

Apple will no doubt make Assistant very natural for the user, as that is a hallmark of how it implements technology like this. I have no doubt we will see ads showing how easy it is to use voice to make the iPhone do things, and user awareness will not be a problem for Apple like it is for other platforms. Even with good user awareness, Apple will have to overcome significant hurdles to get widespread adoption of its speech recognition system.

One technical challenge that such systems must overcome is the handling of background noise. People tend to be in noisy environments when using a gadget like a phone, and this makes it difficult for the recognition system to accurately determine what the speaker is saying. It is essential to isolate and remove background noise from the spoken commands, especially when that includes other people speaking in the same room, as is often the case. Background noise must be completely removed from the sessions for the system to have any shot at accurately interpreting what the user speaks into the phone. This is a lot harder than it sounds.
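
As a rough illustration of the simplest form of this problem, the sketch below flags which short frames of audio are loud enough to be speech rather than steady background noise. It is a basic energy gate under stated assumptions, not what any shipping recognizer is known to use: the function names, the frame size, the `margin` value, and the assumption that the quietest tenth of frames contain only noise are all mine.

```python
import math


def frame_rms(samples, frame_size):
    """Return the RMS level of each consecutive frame of samples."""
    levels = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        levels.append(math.sqrt(sum(s * s for s in frame) / len(frame)))
    return levels


def speech_frames(samples, frame_size=160, margin=3.0):
    """Flag frames whose level is well above the estimated noise floor."""
    levels = frame_rms(samples, frame_size)
    if not levels:
        return []
    # Assumption: the quietest 10% of frames contain only background noise.
    quiet = sorted(levels)[:max(1, len(levels) // 10)]
    noise_floor = sum(quiet) / len(quiet)
    threshold = max(noise_floor * margin, 1e-9)
    return [level > threshold for level in levels]


# Synthetic example: quiet hiss, a loud burst standing in for speech, then hiss.
audio = [0.01] * 1600 + [0.5] * 800 + [0.01] * 800
flags = speech_frames(audio)
```

A gate like this only handles a steady hum. It does nothing about another person talking nearby, since that voice is as loud and speech-like as the user's own, and that is exactly the case that makes the problem so hard.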

Another challenge that Apple faces with its Assistant feature on the new iPhone is social in nature. Users have been slow to adopt speech recognition in the past because they are self-conscious when using it. Many folks are embarrassed to speak commands into a phone. This is human nature: as individuals we don't like to give the appearance that we are doing things out of the norm, and telling your phone what to do falls in that category. Apple's implementation of speech recognition will have to be done in a way that doesn't make the user feel self-conscious when using it.

If Apple can deal with these two hurdles, it has a shot at making speech recognition a part of everyday use for iPhone owners. As stated, this is not new technology by any means, but if Apple can get iPhone owners to use it regularly, it will be a big move forward in mobile.

