Touch isn't Microsoft's only next-generation interface technology
While Microsoft's multi-touch capabilities (and lack thereof) are in the news daily, the company's speech engine and algorithms don't often merit a mention.
At the SpeechTEK conference in New York City on August 3, Microsoft officials attempted to explain what the Redmondians have coming in the voice recognition and synthesis space, without going so far as to announce undisclosed products. And yes, before you ask, there is a cloud angle, like there seems to be for every Microsoft product and technology these days.
Zig Serafin, the General Manager of the "Speech at Microsoft" group, outlined for SpeechTEK attendees Microsoft's evolution in speech, a technology area that has been part of the natural user interface (NUI) focus for the Softies since 1993.
In 1999, Microsoft made its first speech-specific acquisition, the speech-toolkit vendor Entropic. In 2007, Microsoft spent $1 billion to buy speech-recognition vendor TellMe. But it wasn't until a little over a year ago that Microsoft consolidated its various speech-focused products and technologies into the Speech at Microsoft team, whose charter is "bringing speech to everyday life," Serafin said.
These days, Microsoft execs don't look at speech as a standalone product or technology. They see it as an enabler of other products. They also see it as an increasingly integrated piece of Microsoft's overall NUI plan.
Over the next 12 months, Microsoft will be bringing to market four new products that use its various speech technologies. The four:
- Auto entertainment systems, like the Kia UVO announced at the Consumer Electronics Show at the start of this year. The first cars with UVO are due out this summer.
- Windows Phone 7 devices, which have TellMe's speech technology embedded right into the device shell. The phones will allow users to control dialing and search using voice, and integrated text-to-speech means the phones also will be able to "talk back" to users. (This is an example of what Microsoft execs mean when they talk about an "Internet of things" that connects up to the cloud.)
- Kinect sensors for Xbox, which incorporate voice-recognition capabilities, allowing users to pause, play, advance and stop games, TV shows and movies via voice commands.
- Corporate productivity products. There are more than 100 million Exchange users today who can make use of voice mail preview, voice translation and other voice-powered technologies that are built into the product (and will be built into Exchange Online, as Microsoft makes those features available to cloud users). Meanwhile, Microsoft's TellMe product currently is handling 2.5 billion calls a year, making use of TellMe's cloud back-end. (Interestingly, Serafin didn't mention Office Communications Server 14, which Microsoft is touting as its entry into the "enterprise voice" market.)
In the longer term, Microsoft is trying to help answer the question "When can we deploy systems with a human level of conversational understanding?" said Larry Heck, Chief Speech Scientist in the Speech at Microsoft group.
Heck told SpeechTEKers that there are three drivers that will help the company address this question:
- Data and relevant machine-learning algorithms
- Cloud-computing platforms, like Azure and TellMe Network's back-end platform
- Search
There needs to be a lot more data collected on user-machine interaction before Microsoft and others can realistically expect machine interfaces, including speech, to be more natural, Heck said. NUIs can help provide ubiquity, by enabling users to access data wherever they are, he acknowledged. But currently entry points like search engines aren't doing much to help advance work in making computers and devices more conversational. Users are accustomed to typing in a few keywords, rather than naturally phrased queries, but voice search on mobile devices more closely mimics human conversation, Heck explained.
Heck told attendees to "stay tuned" for new Microsoft products coming in the next few years that will reflect advances in conversational expression and understanding. (I'm guessing something like the client-plus-cloud patient-information systems Microsoft demonstrated at its Financial Analyst Meeting last week might be among those products to which Heck was alluding.)
Anywhere else you think Microsoft could, should or will incorporate speech recognition or synthesis technologies?
Talkback
RE: Touch isn't Microsoft's only next-generation interface technology
Ink
RE: Touch isn't Microsoft's only next-generation interface technology
Depends on how you define it, I guess.
If you mean something that is "good enough," I think we can reach that via something like ALICEbot-like technologies.
Thing is, ALICE doesn't actually parse the words and assign meanings to them or anything like that - ALICE is just a matching engine with pre-determined responses.
But it works, and perhaps we could use it for something that resembles Star Trek computers, where you can pattern match speech well enough to create a command system.
For something more like actual human level intelligence, however, I don't think that's happening any time soon.
Depends in the human...
RE: Touch isn't Microsoft's only next-generation interface technology
Touch is another dead-end
Voice, body and face recognition with gesture based computing means the computer is doing the work. Glasses with head-up displays and other wearable interfaces and inevitably implants will all offer a new way to use computers.
When Scottie faced a Mac in one of the Star Trek movies, he first tried talking and then talking into the mouse before he realised he had to use the keyboard. It will not be long before people will be staring at a computer in bewilderment before finally realising they have to touch it to make it work - very quaint ;-)
I do recall talk of that on board the ship
RE: Touch isn't Microsoft's only next-generation interface technology
The same advertisers that brought us Seinfeld (let's play footsie and wiggle our shorts, Bill), Laptop Hunters (which got all sorts of bad press for lies (incorrect pricing, and the customer never actually went into an Apple store) and for portraying Windows as "cheap"), and "Windows 7 was Mac's idea" (where a college kid who can't get laid and gets kicked out of his dorm room (by his Mac roommate) has to watch TV in the hall because he doesn't even have a friend whom he could visit).
I bet Kinect will not be magical either.
IE8 had multi-process architecture before Chrome launched, and in fact was the first browser to announce the feature. That's why both Chrome and IE use far more memory than the other browsers. Chrome is a bit more strict than IE; IE will allow tabs with the same integrity level to share a single process. Outside of that MS beat Google to the punch. Good try though. If MS came out with touch UIs for at least Word, Excel, OneNote, and Outlook, with super slick and highly effective integrated virtual keyboards, that would be mind blowing! I think that would be like lighting a rocket under PC touch computing.
RE: Touch isn't Microsoft's only next-generation interface technology
It was Star Trek IV: The Voyage Home
RE: Touch isn't Microsoft's only next-generation interface technology
The only difference is you're touching a remote, not a touch screen.
You're the dead end.
RE: Touch isn't Microsoft's only next-generation interface technology
Remember the context of Star Trek. There is Geordi or Scott and they are the only ones talking. And the computer seems to focus on THEM! Not the minions that are floating around, who don't happen to be talking.
So if Geordi or Scott were not talking you would have to guess what they were doing. Or they would have to have characters saying, "hey Mr Scott what are you doing?" By having voice activated computers the show can offload the responsibility of the plot to the computer, which is you.
Maybe the 'big brains' at Microsoft shouldn't...
Not sure speech is all that useful
Later, I started using voice with a Mac in the early 1990s. There have been various voice control efforts on Windows too over the decades; I used one in the late 1990s. But they just aren't appealing: a keyboard and mouse (or other pointing device like touch) is very much faster and more efficient, and makes far less noise. Go into a call center sometime and listen to the cacophony of sound. That is what busy offices will be turned into if they use voice control for their PCs.
As for games, if you have 4 kids playing a game of Super Mario Cart, which voice will the console listen to? How will it know commands from general "conversation"? Voice dialling has been available on many phones for years, but it's never been a killer feature, or even a sought after feature. I had a phone capable of voice dialling for years and never used it (voice dialling that is).
I think voice control systems are interesting and great for research, likely there are a few niches they fill perfectly. But they are few and far between, so don't expect a voice UI to shake the world just yet.
RE: Touch isn't Microsoft's only next-generation interface technology
>As for games, if you have 4 kids playing a game of Super Mario Cart, which voice will the console listen to?
Kinect on Xbox solves this in terms of motion. Each player is mapped so it knows when it's actually you moving and not someone behind you walking by. I'm guessing they will map your voice as well eventually.
>Voice dialling has been available on many phones for years, but it's never been a killer feature, or even a sought after feature.
Depends on the appropriateness of the feature, I guess. I used to use Live Search on my previous non-"smartphone." I couldn't stand typing in long city names and then the names of places I needed to search for, so the voice input was a godsend. My current smartphone doesn't do that.