Since my initial report on the Vista analog hole and Microsoft's confirmation of the flaw, Microsoft's MSRC blog has downplayed the significance of this exploit, saying that "there is little if any need to worry about the effects of this issue on your new Windows Vista installation".
Fundamentally, they acknowledge the problem and say that they are looking into it, and in the meantime they give you an excellent pointer to where the issue could cause real harm, i.e. healthcare.
I also object to the claim that you can't do anything dangerous with it: downloading and executing a local privilege escalation is still eminently possible; you just need a suitable 0-day local privilege escalation for Vista. Indeed, any way to download and run arbitrary code as a valid user is never good news; this one just happens to be from the "neat trick" pile.
Scott M. Fulton III of BetaNews characterized this best as the "low-tech attack":
After well over a year of unprecedented beta testing, with engineers and amateurs alike poring over the possibilities of rootkits evading API queries deep in the recesses of memory, perhaps it's no wonder that obvious exploits such as this one went unnoticed until Vista was finally released.
InfoWorld's Paul Roberts wrote:
Successful attackers would need to be physically present at the machine, or figure out a way to trick the computer's owner to download and play an audio recording of the malicious commands. Even then, the commands would somehow have to be issued without attracting the attention of the computer's owner.
That is not actually correct, Paul. If you've ever been to one of those annoying MySpace pages, or seen one of those annoying pop-up or pop-under ads that automatically start blasting music or sounds, you know how easy it is to play unwanted sounds on a computer. People leave their desks all the time with webpages open, and webpages can have rotating ads that eventually play sounds.
Finally, attackers’ commands are limited to the access rights of the logged on user, which may prevent access to any administrative commands, Microsoft said in a statement.
As I've mentioned before, this is not a system-level attack. The simulated attack that I pulled off deleted the documents folder and emptied the trash. Another attack I suggested, which uses TinyURL to shorten a long URL pointing to an EXE payload for download and execution, was verified by a security analyst. That means user-level code can be executed through this "analog hole". User-level code can easily steal, delete, or encrypt all of your user data for ransom. Lastly, Paul, this is NOT a SHOUTING hack. The sound levels did not have to be that loud; normal speaker levels worked fine.
The fundamental problem here is that Microsoft "extended" speech to be able to control the operating system and applications without considering the full security implications. If Microsoft had merely assigned a user-defined password with an automatic lockout after a certain amount of idle time, it would have made the generic attack impossible, but they failed to do that. So I'm asking Microsoft to reconsider their stance that "there is little if any need to worry" and implement some sort of safety mechanism rather than relying on users to be vigilant on their own. It doesn't matter that there aren't that many people using this feature; Microsoft should fix it if they're going to offer it and market it as a key Vista advantage. Since Microsoft is promoting voice recognition for healthcare, we should consider the safety of patient health records.
At present, Vista Speech Recognition wakes up to the command "start listening". How hard would it be for Microsoft to make that a user-definable phrase or word? For example, one user would pick "Zelda" as the word to wake speech mode while someone else picks "439" as their wake word. How hard would it be for Microsoft to implement a wake timeout so that Speech Recognition would sleep after 5 minutes of idle time? How hard would it be for Microsoft to apply the excellent echo cancellation algorithm from Windows Messenger to Speech Recognition? I don't believe this is too much to ask.
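To show how little logic the first two requests involve, here is a minimal sketch in Python of a user-defined wake word combined with a 5-minute idle timeout. It is not how Vista Speech Recognition works internally and it does not call any Windows speech API; the WakeGate class, its hear method, and the injectable clock are all hypothetical names made up for this illustration. It assumes that some recognizer has already turned audio into text phrases.

```python
import time


class WakeGate:
    """Hypothetical gate that passes phrases through only while 'awake'."""

    def __init__(self, wake_phrase, idle_timeout=300.0, clock=time.monotonic):
        self.wake_phrase = wake_phrase.strip().lower()  # user-chosen, e.g. "zelda"
        self.idle_timeout = idle_timeout                # seconds; 300 = 5 minutes
        self.clock = clock                              # injectable for testing
        self.awake = False
        self.last_activity = 0.0

    def hear(self, phrase):
        """Return the phrase if it should run as a command, otherwise None."""
        now = self.clock()
        # Go back to sleep if nothing was accepted for idle_timeout seconds.
        if self.awake and now - self.last_activity >= self.idle_timeout:
            self.awake = False

        text = phrase.strip().lower()
        if text == self.wake_phrase:
            # The wake phrase wakes the gate (or refreshes the idle timer)
            # but is never itself treated as a command.
            self.awake = True
            self.last_activity = now
            return None
        if not self.awake:
            # Asleep: ignore everything, including the stock "start listening".
            return None

        self.last_activity = now
        return text
```

With a gate like this, a generic audio clip that says "start listening" followed by commands does nothing, because the attacker would have to know the wake word that this particular user chose. It would not stop someone who has overheard the wake word, and it does nothing about sound played while the user is actively dictating, which is where echo cancellation would come in.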