Googling Google

Christopher Dawson, Sam Diaz and Matt Weinberger

The problem with Google Voice Actions for Android

By Garett Rogers | August 13, 2010, 2:34pm PDT

Google Voice Actions is a “cool” new feature for Android 2.2 (Froyo). With this new feature, you can tell your phone what to do with just your voice. How about some of these examples:

1) “send text to bob hey are you coming for lunch or what?”
2) “note to self don’t forget our anniversary”
3) “listen to bob marley”
4) “call the hot tub factory”

Pretty cool stuff — everything just works the way you would expect. Kind of. The feature is awesome, but after all is said and done, I think there may still be a few issues that need to be worked out before people can effectively make use of this.

The biggest problem with things like this is that people don’t know how to talk to computers yet. What do I mean by that? Well, if you look at a similar problem, searching the web, you will notice that talking to a search engine and talking to your buddy across the cubicle wall are completely different.

In real life, the more detailed your question is, the better your answer is. But when talking to a computer, the more detailed your question is, the worse your answer is — people have learned how to use keywords to get the best results.

The same is true with voice actions — when talking to a human, the more detailed your “command” is, the better your results. If you make any kind of detailed call to action for a computer, you will get worse than bad results. The problem is that people don’t know how to effectively use keywords in everyday speech.

When a user attempts to use a keyword, they end up having to think — and people get frustrated when they have to think while they speak. Speaking is supposed to be effortless — and if the recipient of a message doesn’t understand, its job is to clarify. Currently, there is no “clarify” option when talking to your phone, and therefore it’s really tough to formulate an accurate command on the fly until you have practiced a lot.
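To make the keyword problem concrete, here is a minimal sketch of a rigid keyword-prefix matcher with a "clarify" fallback. It is an illustration only, not how Google's feature is implemented: the `COMMAND_PREFIXES` table, the `parse_command` and `respond` functions, and the wording of the clarifying question are all hypothetical names and values made up for this example.

```python
# Hypothetical sketch of keyword-prefix command matching.
# The prefixes, action names, and prompt text are assumptions for illustration.
COMMAND_PREFIXES = {
    "send text to": "sms",
    "note to self": "note",
    "listen to": "music",
    "call": "call",
}


def parse_command(utterance):
    """Return (action, argument) if the utterance starts with a known keyword, else None."""
    text = utterance.lower().strip()
    # Try longer prefixes first so a longer keyword wins over a shorter one.
    for prefix in sorted(COMMAND_PREFIXES, key=len, reverse=True):
        # Require a word boundary after the keyword ("call bob", not "callum").
        if text == prefix or text.startswith(prefix + " "):
            return COMMAND_PREFIXES[prefix], text[len(prefix):].strip()
    return None


def respond(utterance):
    """Act on a recognized command, or ask a clarifying question instead of failing."""
    parsed = parse_command(utterance)
    if parsed is None:
        return "Sorry, did you want to call, text, take a note, or play music?"
    action, argument = parsed
    return f"{action}: {argument}"
```

With this sketch, "call the hot tub factory" is recognized, while a more natural phrasing such as "could you ring the hot tub factory" matches no keyword and falls through to the clarifying question, which is the kind of follow-up a human listener would offer.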

So, if practice makes it workable, then why is that an issue? People lose interest when it’s easy to fail — and right now, it’s really easy to fail. This feature isn’t much different from other stuff already on the market, but hopefully in later versions, I will be able to say stuff like:

1) “uh, can you get collin on the line?”
2) “send a text to uhh… tony that says i’m… umm… out of the office and…. …. … not to bother trying to call me until tomorrow”
3) “what’s that band playing at GM Place… err, I mean the cube tonight?”
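As a rough illustration of what handling speech like this would involve, the sketch below strips filler words before any command matching happens. The `FILLERS` list and the `strip_fillers` function are assumptions made up for this example, not part of any real speech API.

```python
import re

# Assumed filler words, for illustration only; real speech has many more.
FILLERS = {"uh", "uhh", "um", "umm", "err"}


def strip_fillers(utterance):
    """Lowercase the utterance, drop punctuation, and remove filler words."""
    # Normalize the curly apostrophe so contractions like "i'm" stay one word.
    text = utterance.lower().replace("’", "'")
    words = re.findall(r"[a-z0-9']+", text)
    return " ".join(word for word in words if word not in FILLERS)
```

Applied to the second example, this yields "send a text to tony that says i'm out of the office", which is cleaner but still begins with "send a text to" rather than the exact keyword "send text to". It also does nothing about the self-correction in the third example ("err, I mean the cube"), so removing fillers is only a small part of the problem.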

People don’t formulate perfect sentences when they speak naturally — but computers currently expect it. I’ll be interested to see how this type of technology works in 5 years — but as of right now, I don’t care for it. I used it once, and it failed — I’m done.


Garett Rogers has always had a deep interest in computers and the Internet, which led him to a degree in Computer Information Systems. He is currently employed as a programmer for iQmetrix.

Disclosure

Garett Rogers

Garett Rogers is employed as a programmer for iQmetrix, which specializes in retail management software for the wireless industry. He has no other formal associations with any software or hardware companies.

Biography

Garett Rogers

Garett Rogers has always had a deep interest in computers and the Internet, which led him to a degree in Computer Information Systems. He is currently employed as a programmer for iQmetrix, which specializes in retail management software designed specifically for the cellular and electronics industry.

Garett's journey into Google started with his employer asking him to "get a better rank on Google." Diving into search engine optimization sparked his curiosity for how things work and led him to create a blog dedicated to what interests him most--Google.

Talkback Most Recent of 17 Talkback(s)

  • It's awesome! you're full of hot air!
    If you had actually tried it while driving on a busy highway, you would realize just how great it is.

    Beats the pants out of Microsoft SYNC.

    You will see first Microsoft follow and Apple shall resist for a while, though eventually it will cave in like it did with multitasking.
    Uralbas
    13th Aug 2010
  • Uh no, Microsoft can't follow because they have already demo'd this on WP7
    It's impossible for them to follow Google on this because they already beat Google to it. They demo'd it at SpeechTEK 2010 in New York.
    Johnny Vegas
    13th Aug 2010
  • WoW! That was quick, as usual...
    It never takes long for Microsoft/ZDNet to poop on any new cool TECH from Google or Apple. Usually less than 24 hours. Some "expert" or pundit is always at hand, ready to go.
    zato_3@...
    13th Aug 2010
  • Absolutely correct
    If you have to worry about syntax _at all_ the feature ends up being more complicated than just interacting with your screen the way we currently do.

    It's great for things like voice dialing, where there isn't really anything to parse, but for everything else? I'll use it when I don't have to _think_ about using it.
    RidleyGriff
    13th Aug 2010
  • RE: The problem with Google Voice Actions for Android
    @RidleyGriff
    Even for voice dialing, unless you know very few people, with very few numbers, it is useless. Say you know three Bobs, and they have home phones and office phones, and home mobiles and office mobiles, and maybe numbers in other countries, by the time you get it figured out, it is easier to handle the phone for the correct call.

    It is great for simple people with simple lives, but they don't really need the feature.
    jorjitop
    15th Aug 2010
  • RE: The problem with Google Voice Actions for Android
    1) I disagree about keywords in search. I find that a natural language query finds people who are asking the same question I have, and then leads me to good answers: "How do I boil a soft-boiled egg"

    2) I do agree about the uh-umm factor in speech recognition.

    3) I also agree about phrasing with speech recognition. Have you used goog411? It expects certain words in a certain order. If you ask for a business with three or four words and your city and state have more than one word each, it gets confused.
    emellaich
    13th Aug 2010
  • RE: The problem with Google Voice Actions for Android
    This is cool, but I want natural language conversations with the computer. The computer should act like a 'real' human with whom I can talk. Siri is like that, I guess, from what I've heard. I hope Apple does not patent Siri and we get similar technology in all OSes. But Google voice for actions is a start, no doubt.
    kdsandeep@...
    13th Aug 2010
  • totally wrong.
    "The problem is that people don’t know how to effectively use keywords in every day speech."

    Replace "keyword" by mouse, keyboard, gestures, buttons, sliders, switches and we see why this is monumentally bad logic. We've been adapting complex ways to adapt to machines or electronics for centuries that all are far less efficient than what we'd like. We can't "think drive" a car yet but when it comes be sure it will consistently beat using a steering wheel. Any difficulties that will emerge will be deemed insignificant compared to the gains of switching to such an efficient technology (thought=action)...Google didn't just use any keywords either: "directions to", "note to self", "call", "map of" are going to be picked up and sucked into the lexicon of Android devices faster than the button set up of a feature phone did because any inaccuracy is marginal compared to the huge gain in productivity such a device offers. "call john martin at home" takes all of 2 seconds to say and initiates in seconds...actually finding John in an address book and calling take orders of magnitude more effort. Even in the case where call user icons are placed on the screen top as short cuts (assuming it were possible) the efficiency of navigating through the user icons falls over time as more icons pile up (you must visually search and tap)...the voice method is linear efficiency with time ...always "call blah blah". This linearity across action is why the blogger is completely wrong.
    sent2null
    14th Aug 2010
  • Re: totally wrong.
    @sent2null I'm not arguing against the possibility of the future -- I think we are all in agreement that talking to computers will one day be second nature. Right now it isn't though.

    You currently have to think before you speak, and while you are speaking to make use of features like this -- that's the problem. It's not a matter of how fast you can say something: "takes all of 2 seconds to say"... but it takes you 5 seconds to decide what to say -- and there is still a relatively high probability of failure.

    One day we won't have to think before we speak, and that's what I'm waiting for.
    Garett
    14th Aug 2010
  • RE: The problem with Google Voice Actions for Android
    @Garett

    I wasn't arguing that computers will be second nature as much as I was saying that new ways to interface with our technologies win when they are more efficient than what they replace...even if they also introduce new bugs.

    "You currently have to think before you speak, and while you are speaking to make use of features like this -- that's the problem. "

    It's LESS of a problem than actually speaking it, that is what you aren't seeing. It is linearly efficient to say "call blah blah" when compared to searching an address book. Another example "navigate to blah blah" today, how do you do that on a smart phone (forget a desktop), on the smart phone you take the phone, find and tap the maps icon and then have to type in the location. With voice you tap the voice icon and say it, if you are clear it gets your intent straight away if you are not you have an immediate partial destination ready to correct to send. It's less work in either case compared to the alternative...those are just two easy examples.

    I am not at all convinced that the technology using the very carefully chosen keywords (the smart move) is not ready...people will adapt to those key words just as some people have adapted to that annoying (to me) little ball on the blackberry. If they can learn that paradigm...being a bit clearer with their words is trivial by comparison.
    sent2null
    14th Aug 2010
  • "I used it once, and it failed — I’m done."
    Wow...really? How do you even type an article? I used Android voice commands long before 2.2 Froyo for texting and some web searching. It isn't perfect, but it sure beats trying to type while driving and possibly parking your car up another car's rear end or finding yourself dead in a ditch on the side of the road after the crash!

    Do you discard everything when it fails once? I know if I did that I could not use an operating system (I've had crashes of sorts in Windows, Linux and MacOS) nor could I use most programs as they have all had their share of problems and failures as well.

    No offense Garett, but you should either rewrite this article and remove the emotional bias or just plain delete the article as it is not based in reality. In "the real world" most technologies have little quirks and issues that are mediated by a number of factors. For example, with the voice commands, do you have an accent? Were you in your car with all the background noise? Was the radio on? Were you in an office with others talking around you? Environment plays a huge factor in any voice recognition.
    ExploreMN
    14th Aug 2010
  • Google always does things half azz.
    There isn't a single feature in Android that works right. The reason, Google releases everything with MINIMAL testing and no SQA.

    Most features are hacks, done in agile development with basically no requirements. The end result: Half-azz features that barely work.
    wackoae
    14th Aug 2010
  • RE: The problem with Google Voice Actions for Android
    @wackoae Maybe you're just rubbish at understanding things that aren't spoon-fed to you? Android works just fine, and is more flexible than the competition.

    As for the article, it's true that some bits of the new actions are a bit dodgy. I still don't understand why they implemented "set alarm" when the current version of Desk Clock doesn't support it and Google isn't offering an update. But "note to self" is really useful and clever and eliminates the need to stop and open a notepad when you want to remember something.
    cyberc9000
    15th Aug 2010
  • Android works just fine???
    Only if you barely use anything and/or ignore the bugs. Just take a look at the bug lists (filter on defect)

    Dude, the reason why Android is updated with a ridiculous frequency is because it is buggy. Even then, every update introduces an array of new bugs .... because new features are barely tested before they are released.

    That is not FUD. It is a fact that even people from Google agree to in the support forums.
    wackoae
    15th Aug 2010
