Researchers have found that MEMS microphones are so sensitive they can interpret light as sound, allowing an attacker to shoot silent commands to voice assistants from afar.
Since the weakness is general to MEMS (microelectromechanical systems) microphones, the attack works against any device that uses them, including hardware running Google Assistant, Amazon Alexa, Facebook Portal, and Apple Siri.
Injecting voice commands into smart speakers from long range might not sound like a major threat, but devices from Google, Amazon, and Apple are shaping up to be the central hub for controlling smart-home gadgets, including lights, smart locks, and garage doors.
Amazon says that 85,000 smart home gadgets now integrate with Alexa, while Apple is trying to get more gadgets to work with its HomeKit system.
Given smart gadgets' central role, the MEMS mic vulnerability could allow an attacker to issue commands to do things like open a garage door, open doors protected by smart locks, or even unlock and start a Tesla that's connected to a Google account.
The laser study was conducted by researchers at the University of Electro-Communications in Tokyo and the University of Michigan, who detail their work in a new paper, 'Light Commands: Laser-Based Audio Injection Attacks on Voice-Controllable Systems'.
"We show how an attacker can inject arbitrary audio signals to the target microphone by aiming an amplitude-modulated light at the microphone's aperture," they explain.
"We then proceed to show how this effect leads to a remote voice-command injection attack on voice-controllable systems. Examining various products that use Amazon's Alexa, Apple's Siri, Facebook's Portal, and Google Assistant, we show how to use light to obtain full control over these devices at distances up to 110 meters and from two separate buildings."
The attack, dubbed Light Commands, works because the diaphragm in a MEMS microphone converts sound-induced vibrations into electrical signals. The research details how a silent, amplitude-modulated laser beam causes the same vibrations in the diaphragm, letting an attacker issue commands.
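In signal terms, the attacker's first step is simply to amplitude-modulate the laser's intensity with a recorded command, much as a carrier is modulated in AM radio, so the diaphragm's light-induced motion tracks the audio waveform. A minimal sketch of that modulation step, with illustrative parameter names and values that are assumptions, not figures from the paper:

```python
import numpy as np

def amplitude_modulate(audio, dc_bias=0.5, depth=0.4):
    """Map a normalized audio waveform (values in [-1, 1]) onto a laser
    drive level: a constant bias plus the scaled audio signal, clipped
    so the modulated intensity never goes negative. The bias and depth
    values here are illustrative, not taken from the paper."""
    audio = np.asarray(audio, dtype=float)
    drive = dc_bias + depth * audio
    return np.clip(drive, 0.0, 1.0)

# A 1 kHz test tone sampled at 16 kHz stands in for a recorded command.
t = np.linspace(0, 0.01, 160, endpoint=False)
tone = np.sin(2 * np.pi * 1000 * t)
drive = amplitude_modulate(tone)
print(drive.min(), drive.max())  # intensity stays within a physical range
```

The clipping matters because light intensity, unlike a sound wave, cannot go negative: the command audio has to ride on top of a constant bias.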
The researchers' video shows how Light Commands works. Source: YouTube
The key condition for the attack is a line of sight to the device. The researchers demonstrated the laser-based audio injection only up to 110 meters because that was the longest hallway available to them.
Accurately focusing a laser on a target at that distance required only a commercially available telephoto lens, a tripod and, to spot the target device from afar, possibly a telescope.
A key issue that could force OEMs to adapt their threat models is that most voice-command systems lack proper user authentication: they assume the user is close to the device, which is typically shielded by walls, doors, and windows. Light-based command injection changes that equation.
The attack is also hard to spot: there is no immediate, automated way to detect that someone is using a laser to commandeer a device with a MEMS microphone, and because no sound is involved, the only clue a user might get is the light beam reflecting off the device.
And the researchers theorize that the attacker's first step would be to set the device's volume to zero to avoid detection. From there, the attacker could buy things on Amazon or Google, or worse, open the garage door. How vulnerable a house is to the attack depends on how many smart things are connected to it.
Interestingly, the researchers found that Google Home and Amazon Alexa smart speakers block purchasing from unrecognized voices, but they do allow previously unheard voices to execute commands like unlocking connected smart locks.
Voice-controlled systems such as smart speakers also open up the possibility for PIN eavesdropping, allowing a remote attacker to use a laser microphone to steal codes.
The researchers describe several software and hardware mitigations that manufacturers can use to block laser command-injection attacks. For example, the voice-controlled system could ask the user a simple randomized question before executing a command. However, that solution could also annoy users.
Alternatively, because smart speakers typically use multiple microphones, and a laser beam can strike only one of them at a time, a command that arrives on a single microphone could simply be ignored.
On the hardware side, manufacturers could add a barrier that physically blocks laser beams while letting sound waves through. However, a very determined attacker could boost the laser's power and burn through such a barrier.