The flowering of voice control leads to a crop of security holes

The predicted explosion of voice-enabled apps in 2017 will increase the attack surface of mobile devices and the systems they control, and create brand-new privacy risks.
Written by Stilgherrian, Contributor

'Tis the season of cybersecurity threat predictions for 2017. Vendors' glossy reports shower onto the desks of customers and journalists like gentle Christmas snow. But so many of these reports, like so many snowfalls, are nothing but slush.

All year we've been hearing about the spreading plague of ransomware, and how the Internet of Things (IoT) will be a security nightmare. Remember the botnet made of video cameras? Vendors have been waving around phrases like "artificial intelligence" and "machine learning" and "threat intelligence sharing" like magic wands.

In too many of these reports, most of the predictions are obvious, even banal.

It's still early in the threat predictions season, though. Some vendors don't issue their predictions until March. But one thing has already stood out: Forcepoint's 2017 Security Predictions.

"The number of apps designed to leverage voice-activated AI [artificial intelligence] such as Siri, Alexa and others, will explode in 2017, allowing a whole new threat vector to emerge," Forcepoint wrote.

"They may also pose unwanted risks, especially as regards access controls. New interface-based security risks will also accompany this app proliferation, allowing hackers to bypass existing security protection, leading to an increase in AI app-associated data breaches."

Want an example? Think back to early 2015, and Samsung's voice-activated Smart TV that could be controlled simply by talking to it. With just a few words, you could change the channel, adjust the volume, whatever.

Consumers weren't happy when they discovered how it worked.

"What people learned quickly is that in order to do that, it had to be listening all the time. So it was listening to everything going on, including when these televisions were being used in conference rooms and places like that," said Bob Hansmann, Forcepoint's director of security technologies, in a webinar last week.

"It was connected to the internet, and all of your conversations were being sent back to the manufacturer, and the manufacturer had the AI in their facility, on their server. It was a cloud service. And that cloud service would then send the command back to the TV."

Some of the newest voice-activated systems can even figure out which household member is talking. It's great for tailoring actions to individuals' preferences, but it's also yet another way to collect data about us humans. Perhaps for more targeted advertising, perhaps for something worse.

A proliferation of voice-activated apps means a proliferation of vendors tapping into the data stream, all wanting to know what's being said. Will that flow of voice data and command streams be architected and managed in a way that protects our security and privacy?

Relax. I'm sure the designers of all these new IoT devices will bring to their task the level of care and attention to security that we've become accustomed to.


As Hansmann reminded us: "Even when you aren't giving it commands, it's listening." Which brings me to my final point.

How will voice-activated systems know that the command they've received is legitimate?

Consider the case of the guy who fitted his home with an August Smart Lock. It's operated via Bluetooth and Apple's HomeKit, and can receive voice commands via Siri. This guy had set up an iPad Pro in his living room to control his HomeKit devices.

"Unfortunately, the setup opened up a huge security hole that serves as lesson [sic] of how smart home technology can backfire," reported Forbes. "His neighbor, who was coming by to borrow some flour, was able to let himself in by shouting, 'Hey Siri, unlock the front door'."

There's a big difference between speech recognition and voice authentication, folks.
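The distinction is easy to show in code. In this hypothetical sketch — the voiceprint values and enrollment check are invented for illustration, not drawn from August's or Apple's actual APIs — one handler acts on *what* was said, the other also checks *who* said it:

```python
# Voiceprints an owner has enrolled with the (hypothetical) lock.
ENROLLED_VOICEPRINTS = {"owner-voiceprint-001"}

def recognition_only_unlock(transcript: str, speaker_print: str) -> bool:
    """Speech recognition alone: any speaker with the right phrase gets in."""
    return transcript == "unlock the front door"

def authenticated_unlock(transcript: str, speaker_print: str) -> bool:
    """Voice authentication: the phrase must also come from an enrolled speaker."""
    return (transcript == "unlock the front door"
            and speaker_print in ENROLLED_VOICEPRINTS)

# The flour-borrowing neighbor from the Forbes story:
neighbor = ("unlock the front door", "neighbor-voiceprint-999")
owner = ("unlock the front door", "owner-voiceprint-001")
```

With recognition only, the neighbor's shout opens the door; with authentication, the same phrase in an unenrolled voice is rejected.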

Consider also, though, Adobe's VoCo voice project. Record enough of the target's voice, and VoCo can then synthesize literally any chosen speech in that person's voice. Feed in the written words, and out comes the sound.

VoCo is only a proof-of-concept at this stage, but still, you have to wonder about the life expectancy of voice authentication.

Meanwhile, if you've run out of flour, you know what to do.
