How Apple's Siri really works

Summary: How does Apple's Siri really work? A SmartPlanet article lays out how voice recognition on a smartphone really works, step by step.

Apple's Siri is sassy, clever and occasionally useful.

But how the hell does it really work?

"Voice recognition" is what Siri does, but those words alone don't reveal how the system actually gets your words right when you say, "Send message to Jason Perlow: Go get a shave, Linux Lover."

But a lengthy feature article over at our sister site SmartPlanet has the dirt, step by step:

The sounds of your speech were immediately encoded into a compact digital form that preserves its information.

The signal from your connected phone was relayed wirelessly through a nearby cell tower and through a series of land lines back to your Internet Service Provider where it then communicated with a server in the cloud, loaded with a series of models honed to comprehend language.

Simultaneously, your speech was evaluated locally, on your device. A recognizer installed on your phone communicates with that server in the cloud to gauge whether the command can be best handled locally -- such as if you had asked it to play a song on your phone -- or if it must connect to the network for further assistance. (If the local recognizer deems its model sufficient to process your speech, it tells the server in the cloud that it is no longer needed: "Thanks very much, we're OK here.")

The server compares your speech against a statistical model to estimate, based on the sounds you spoke and the order in which you spoke them, what letters might constitute it. (At the same time, the local recognizer compares your speech to an abridged version of that statistical model.) For both, the highest-probability estimates get the go-ahead.

Based on these opinions, your speech -- now understood as a series of vowels and consonants -- is then run through a language model, which estimates the words that your speech is comprised of. Given a sufficient level of confidence, the computer then creates a candidate list of interpretations for what the sequence of words in your speech might mean.

If there is enough confidence in this result, and there is -- the computer determines that your intent is to send an SMS, Erica Olssen is your addressee (and therefore her contact information should be pulled from your phone's contact list) and the rest is your actual note to her -- your text message magically appears on screen, no hands necessary. If your speech is too ambiguous at any point during the process, the computers will defer to you, the user: did you mean Erica Olssen, or Erica Schmidt?
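The last two steps of that excerpt (pick the highest-probability interpretation, then either act on it or defer to the user) can be sketched in a few lines of Python. Treat this as a rough illustration only: the `Interpretation` type, the `resolve` function, the made-up confidence scores and the 0.6 threshold are all assumptions for the sake of the example, not anything taken from Apple, Nuance or the SmartPlanet article.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the article never says how much confidence is "enough".
CONFIDENCE_THRESHOLD = 0.6


@dataclass
class Interpretation:
    intent: str        # e.g. "send_sms"
    recipient: str     # name as spoken, e.g. "Erica"
    body: str          # the rest of the utterance
    confidence: float  # made-up score between 0 and 1


def resolve(candidates, contacts):
    """Act on the best interpretation, or hand the decision back to the user.

    Returns one of:
      ("send", contact, body)   exactly one contact matched
      ("ask", [contacts...])    several contacts matched; user must choose
      ("unsupported", intent)   this sketch only handles text messages
      ("retry",)                nothing usable was recognized
    """
    if not candidates:
        return ("retry",)

    # The highest-probability estimate gets the go-ahead.
    best = max(candidates, key=lambda c: c.confidence)
    if best.confidence < CONFIDENCE_THRESHOLD:
        return ("retry",)
    if best.intent != "send_sms":
        return ("unsupported", best.intent)
    if not best.recipient:
        return ("retry",)

    # Look the spoken name up in the contact list (case-insensitive prefix match).
    spoken = best.recipient.lower()
    matches = [name for name in contacts if name.lower().startswith(spoken)]

    if len(matches) == 1:
        return ("send", matches[0], best.body)
    if len(matches) > 1:
        return ("ask", matches)  # "Did you mean Erica Olssen, or Erica Schmidt?"
    return ("retry",)


if __name__ == "__main__":
    contacts = ["Erica Olssen", "Erica Schmidt", "Jason Perlow"]
    candidates = [
        Interpretation("send_sms", "Erica", "running late", 0.82),
        Interpretation("web_search", "", "send message to erica", 0.11),
    ]
    print(resolve(candidates, contacts))
```

With two Ericas in the contact list, the sketch returns an "ask" result rather than guessing, which mirrors the "did you mean" fallback the excerpt describes.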

There's a whole lot more to learn in the article, including a history of research around the technology and exploration into what Google, Microsoft and others want to do with it. (What are you waiting for? Go read it.)

Voice recognition has been around in some form for years, but it's pretty neat to see exactly what happens when you press that button.

Topic: Apple

Andrew Nusca

About Andrew Nusca

Andrew Nusca is a former writer-editor for ZDNet and contributor to CNET. During his tenure, he was the editor of SmartPlanet, ZDNet's sister site about innovation.

Talkback

30 comments
  • Imitating Kinect

    So, it's true!

    Apple's "Siri" is in fact just copying one of the features of Kinect, as launched way back in 2010!
    Tim Acheson
    • Both Google and Microsoft said they do not need AI assistant

      Google's Rubin was especially clear about that.

      However, the same Rubin said that the original, genuine Android UI was the best thing, only to "change his mind" about it after Apple announced the iPhone.

      Then Rubin (Schmidt, Page, etc.) decided to commit IP theft, throw away the "perfect" genuine Android UI and use Apple's finger-based UI as the core principle.

      Given this history, we can conclude that Rubin obviously lied, and Google will offer an Android version with a built-in AI assistant (not just "Voice Actions", as Android has now) sooner rather than later.
      DDERSSS
      • And Kinect is only licensed technology, so MS is out of the picture here

        And it is certainly not an AI assistant.
        DDERSSS
      • Licensed technologies do not count?

        @DeRSSS
        Then Siri does not count. Apple is licensing voice recognition from Nuance so Apple is out of the picture here.

        Care to reconsider your stance on licensed technologies?
        toddybottom
    • My Android Phone Has Been Doing This For A Long Time...

      @Tim Acheson
      As you've pointed out, the Kinect blows the doors off of anything that Apple has ever created, but truly gets no credit from the masses. They simply use it and take it for granted. My Android phone blew me away when I started using its voice command feature, and yet Apple is in the process of convincing the world that it started the revolution. This is what Apple has always done... No revolution here.
      Steve@...
      • RE: How Apple's Siri really works

        @Steve@... Google voice is crap. That comes from many people who use Android devices. Now you're just making yourself out to be a fanboi.
        illwill112
      • RE: How Apple's Siri really works

        @Steve@...

        Sorry but the Android voice capability is on par with what Windows Mobile 6.x series offered four years ago.

        I like Android, but let's get real.
        toadlife
      • Never Was A "Fanboi" For Android...

        @Steve@... I am a big fan of WebOS, which outdoes Android in many respects, but the voice feature on my Sprint phone is outstanding, and will do all of the things that Apple is bragging about. PERIOD... So don't contradict what I've said, unless you have a Sprint Android phone. If you have Apple's latest piece of copyware, you haven't got a clue what mine will do.
        Steve@...
      • RE: How Apple's Siri really works

        @Steve@... So, to turn it around on you: unless you have the 4S, you don't know crap about what it can do. See how that works?
        non-biased
  • RE: How Apple's Siri really works

    Interesting, and based on the funny but smart AI responses it gives, I think Apple and AI agents have a future.
    http://thetechnologycafe.com/siri-and-its-jokesmore-shit-and-funny-stuff-siri-says/
    samzbest@...
    • RE: How Apple's Siri really works

      @samzbest@... I believe the SRI natural language processing system (from Siri) is as good as IBM's Watson. Too bad both Apple and IBM depend on a third-party technology vendor called Nuance to convert speech to text. If this could be developed internally by Apple or IBM, it would reduce network bandwidth, since there would be only one transmission and all processing would be done in one place. This could improve the speed of the response.
      Gabriel Hernandez
      • RE: How Apple's Siri really works

        @Gabriel Hernandez

        Perhaps it is done this way to save the device battery as well?
        cowboys2000
    • RE: How Apple's Siri really works

      @samzbest@...

      Wow. Apple finally managed to copy ELIZA ;-)
      tonymcs@...
  • RE: How Apple's Siri really works

    This should be titled how Siri doesn't work!

    I have set appointments, found directions and asked humorous questions successfully with the App but I have only had very limited success sending text messages or calling friends with it!

    Here's an idea you bozos developing the product should really try (I believe V Lingo works this way)...

    Limit your dictionary to the address book when somebody says any of the following...

    Text
    Send message
    Call

    For the word or words that immediately follow, it should at least weigh the address book more heavily than the normal dictionary!

    When I say "text Ann", it should not search the damn dictionary and return "an" or "and"!

    V Lingo gets my wife's name first time, every time, but Siri struggles with the most basic of names!
    slickjim
    • RE: How Apple's Siri really works

      @Peter Perry It learns names as you go. Maybe you are not correcting it, but just abandoning a task when it gets a name wrong? If you have a particularly difficult name, you can also speak it into a phonetic pronunciation field on the contact.
      teetee1970
      • I would sooner bet he has never used Siri.

        @teetee1970

        I send texts, call people, make appointments, change appointments, make notes, create reminders and set alarms with no issue.

        The best part is I had to learn very little to interact with it. There are dozens of ways you can interact with Siri to get similar responses.
        Bruizer
      • RE: How Apple's Siri really works

        @teetee1970

        He tried it on his wife's phone but Siri knew exactly who he was and toyed with him just for kicks.
        rfoto
    • RE: How Apple's Siri really works

      @Peter Perry No one believes you own an iPhone. Let me tell you this: my wife and I have the 4S, and my Puerto Rican accent is deep and my wife's Filipino accent is strong as well. We have no problems sending text messages at all.

      Your post sounds like your regular post history: spreading FUD about Apple because you don't like them. I have gotten a little too used to Siri now to do my appointments, send texts and do web searches for me.
      illwill112
    • RE: How Apple's Siri really works

      @Peter Perry

      Then you've accomplished something. Sending SMS is one of its easiest functions, very difficult to screw up. Good work.
      dhmccoy
    • i thought your wife has an iPhone 4.. Siri only works on 4S..

      @Peter Perry.. to be frank.. i think you're fibbing and have never used Siri..
      doctorSpoc