How voice recognition will change the world

It's Friday morning and you're thinking about rewarding a week's worth of hard work by escaping the fluorescent lights of the office for a lunch hour in the sun.

Eating alone is no fun, of course, so you might as well invite a good friend.

You pick up your new Apple iPhone 4S smartphone and press and hold the home button.

At the bottom of the screen, a little microphone appears. "What can I help you with?" it asks.

It beeps. You begin to speak.

"Send text to Erica: Let's have lunch."

In an instant, your intended message appears as text on the screen, loaded into an abbreviated version of the phone's standard SMS form. Erica's name has populated the necessary field. Without missing a beat, it asks: "Ready to send it?" You reply "Yes". And off it goes.

It couldn't be easier, short of the phone reading your mind. The entire process takes three seconds at most.

In reality, the process you just experienced is the culmination of decades of intense interdisciplinary research. What was a nearly seamless transaction between your mouth and your phone was, in fact, a complex series of computational decisions designed to understand what took you, as an infant, years to master: language.

In the last 70 years, computers have amazed us with their ability to conduct simple calculations at astonishing speeds. With each passing decade, that ability has advanced, to the point where today we task computers with our most pressing, complex problems: mapping global climate change, particle physics, materials science and more, at the atomic level and beyond. These are things that would take hundreds of humans hundreds of lifetimes to calculate on their own.

All of these phenomena operate under a strict series of laws, of course. Our ability to understand them is only as good as our ability to assemble a large enough data set to represent them. And aside from the occasional once-in-a-lifetime discovery that redefines those rules (such as when scientists last year discovered bacteria that survived on arsenic, an element not among the six known to constitute life), the rules remain constant.

But language, that's a different story. Language is, essentially, a complex system of communication — but it's developed by humans, not the natural world, and certainly not by a computer operating in a binary existence. It is, then, inherently imperfect; it develops in different ways at different rates and it does not subscribe to a firm set of rules. In the world of computers and Mother Nature, slang like "You be illin'" is an anomaly of epic proportions. Multiple words for a single meaning collide with multiple meanings for a single word. Speaking it aloud complicates understanding even further. And worst of all, the rules (and accents) for the more than 3000 languages spoken worldwide are in a constant state of change.

It is, therefore, a tremendous feat to marry the rigid world of computers to the squishy, volatile world of spoken language. And yet your new smartphone just managed to accomplish it in seconds.

Andrew Nusca is a former writer-editor for ZDNet and contributor to CNET. He is also the former editor of SmartPlanet, ZDNet's sister site about innovation. He writes about business, technology and design now but used to cover finance, fashion and culture. He was an intern at Money, Men's Vogue, Popular Mechanics and the New York Daily Ne...
