UK-based Novauris and Existor have joined forces to bring Siri-like speech recognition and AI chat technology to almost any device with a powerful enough processor.
UK-based speech technology specialists Novauris and Existor have teamed up on Siri-like speech recognition software for smartphones and other devices. Image credit: Novauris
The companies expect the technology to eventually feature in a range of devices, from smartphones and tablets to in-car systems and toys, they said in revealing their partnership on Tuesday at Mobile World Congress (MWC) 2012. It will also be released in the second quarter as an app on Apple's App Store, for iPads and iPhones, and in the Android Market via an as-yet unnamed partner.
London-based Existor is the company behind Cleverbot, a chat bot that promises 'human-like interaction'. It has developed an embedded version of this technology that could be made to run on different devices, and that is being combined with Cheltenham-based Novauris's embedded automatic speech recognition engine to enable spoken conversations with the devices.
The technology behind it works in a similar way to the Siri virtual personal assistant found on Apple's iPhone 4S, according to Melvyn Hunt, co-founder of Novauris. However, instead of capturing voice samples and sending these back to remote servers, the embedded technology does all the processing locally. In addition to side-stepping the privacy issues thrown up by Siri's handling of data, this means it does not need an internet connection to work, according to Hunt.
"It takes a fraction of a second because it is done locally on the device. That's the difference to something like Siri, which is done over the network. That would be hell in this environment," he told ZDNet UK, referring to the scarce network connectivity at MWC.
"There are things that Siri can do that this thing can't do," he added. "But for everything this thing can do, we do it a lot better. We can't create text, for example; but what we can do is access things from very large sets of things very quickly and very accurately."
Devices need less than 10MB of RAM to run Novauris's embedded speech engine, and the company recommends a 400MHz ARM processor. However, it also has a 'NovaLite' system that can work with less than 5MB of RAM and a 300MHz ARM chip.
The app version for iPhones and iPads could have a cloudy future. Cambridge-based True Knowledge has run into problems with Evi, an app that performs many of the same functions as Apple's Siri. According to reports, the company has been told that its app will be pulled from Apple's App Store.
However, Novauris and Existor are aiming to have their software pre-loaded on mobile devices by default. Hunt hinted at possible partnerships with at least two hardware makers, but could not confirm these.
"Now we have an alliance with LG and with Panasonic... we would expect them to be pre-loading it," he said. "I can't say for sure, but we would hope they will."
The Novauris technology can respond to natural language, Hunt said, in contrast with other voice control tools that require commands and queries to be spoken in a specific way. As an example, he described the technology in use for translation: if the user says "My TV is broken", wanting it translated to Japanese, it will display the Japanese result for that phrase; and if the user says, "Unfortunately, there is a fault with my television", it will display the same result, as it understands what the user wants.
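To illustrate the idea of two differently worded requests landing on the same result, here is a minimal Python sketch. It is not Novauris's method, which the company did not describe; it assumes a simple approach in which each utterance is reduced to a set of concepts and compared against a phrasebook. The synonym table, stop-word list, phrasebook entries and phrase IDs are all hypothetical.

```python
import re

# Hypothetical synonym table: maps surface words to a shared concept.
SYNONYMS = {
    "tv": "television",
    "telly": "television",
    "broken": "fault",
    "faulty": "fault",
}

# Hypothetical filler words that carry no meaning for the lookup.
STOPWORDS = {"my", "is", "a", "the", "there", "with", "unfortunately"}

# Hypothetical phrasebook: a set of concepts identifies one stored phrase.
PHRASEBOOK = {
    frozenset({"television", "fault"}): "TV_BROKEN",
    frozenset({"room", "key", "lost"}): "LOST_ROOM_KEY",
}


def concepts(utterance):
    """Lower-case the utterance, drop filler words and fold synonyms."""
    words = re.findall(r"[a-z0-9]+", utterance.lower())
    return {SYNONYMS.get(w, w) for w in words if w not in STOPWORDS}


def match_phrase(utterance):
    """Return the ID of the phrase with the highest concept overlap, or None."""
    found = concepts(utterance)
    best_id, best_score = None, 0.0
    for keys, phrase_id in PHRASEBOOK.items():
        # Jaccard overlap: shared concepts divided by all concepts seen.
        score = len(found & keys) / len(found | keys)
        if score > best_score:
            best_id, best_score = phrase_id, score
    return best_id


print(match_phrase("My TV is broken"))
print(match_phrase("Unfortunately, there is a fault with my television"))
```

Under these assumptions both sentences reduce to the same two concepts, so they select the same stored phrase, whose translation could then be displayed.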
Existor and Novauris plan to unveil a product aimed at children soon, which will chat and respond to children while they are learning or playing.
Hunt also suggested the technology could spare users from having to learn a new way of navigating each device. For example, on an Android or iOS handset, a user would not need to know their way around the menus; they would simply speak a command to be taken to the chosen option. This means it has the potential to make any content on a handset searchable, he said, for example music tracks, contacts or apps.
One of the limiting factors for using the technology on an Android smartphone is having fine enough control over the audio settings, Hunt noted.
"As soon as you go in a noisy environment the A2D [analogue-to-digital] fills up with background noise. Android phones don't give developers the control over analogue gain... The gain is just too high," Hunt said. "That's our biggest hardware limitation, rather than processor [power] or battery life,"
He added that the iPhone fared better with audio, although the gain is still a little too high. However, the technology has its own set of issues on the Apple smartphone.
"On the iPhone we can't do the full address search, because it's much more tricky to handle large memory [intensive tasks] on the iPhone than it is on Android," Hunt said.