The problem with Google Voice Actions for Android

Written by Garett Rogers

Google Voice Actions is a "cool" new feature for Android 2.2 (Froyo). With this new feature you are able to tell your phone what to do, with just your voice. How about some of these examples:

1) "send text to bob hey are you coming for lunch or what?"
2) "note to self don't forget our anniversary"
3) "listen to bob marley"
4) "call the hot tub factory"

Pretty cool stuff -- everything just works the way you would expect. Kind of. The feature is awesome, but after all is said and done, I think there may still be a few issues that need to be worked out before people can effectively make use of this.

The biggest problem with things like this is that people don't know how to talk to computers yet. What do I mean by that? Well, if you look at a similar problem, searching the web, you will notice that talking to a search engine and talking to your buddy across the cubicle wall are completely different.

In real life, the more detailed your question is, the better your answer is. But when talking to a computer, the more detailed your question is, the worse your answer is -- people have learned how to use keywords to get the best results.

The same is true with voice actions -- when talking to a human, the more detailed your "command" is, the better your results. If you make any kind of detailed call to action for a computer, you will get worse than bad results. The problem is that people don't know how to effectively use keywords in everyday speech.

When a user attempts to use a keyword, they end up having to think -- and people get frustrated when they have to think while they speak. Speaking is supposed to be effortless -- and if the recipient of a message doesn't understand, its job is to clarify. Currently, there is no "clarify" option when talking to your phone, so it's really tough to formulate an accurate command on the fly until you have practiced a lot.

So, if practice makes it workable, then why is that an issue? People lose interest when it's easy to fail -- and right now, it's really easy to fail. This feature isn't much different from other stuff already on the market, but hopefully in later versions, I will be able to say stuff like:

1) "uh, can you get collin on the line?"
2) "send a text to uhh... tony that says i'm... umm... out of the office and.... .... ... not to bother trying to call me until tomorrow"
3) "what's that band playing at GM Place... err, I mean the cube tonight?"

People don't formulate perfect sentences when they speak naturally -- but computers currently expect it. I'll be interested to see how this type of technology works in 5 years -- but as of right now, I don't care for it. I used it once, and it failed -- I'm done.
