At the Interspeech 2018 conference in Hyderabad, India, this week, Microsoft researchers will be talking up advances in overlapped speech recognition that they've achieved. Part of the solution they'll be outlining involves a new circular microphone array -- seemingly the one that attendees of Microsoft's Build 2018 conference saw in a demonstration, but about which Microsoft has declined to reveal specifics.
Microsoft and others working in the speech recognition field have been attempting to address the "cocktail party problem," i.e., the situation where speakers overlap in a noisy environment. Systems need to be able to handle a varying number of speakers with unknown identities and speech patterns, along with extraneous noise.
From an image that accompanies the September 5 blog post about the research paper (which I've embedded in my post above), it looks like Microsoft researchers have built a seven-channel conical mic array for meeting transcription as part of their solution. The system handles dereverberation, speech separation and automatic speech recognition, the research paper says.
I asked Microsoft if this is, indeed, the same device and if the company has considered turning the mic into a marketable product (by either Microsoft itself or its OEMs) at some point. No word back so far.
To Microsoft researchers' knowledge, according to the blog post, this system "represents the first overlapped speech recognition system that has been demonstrated to work well for actual meetings with no prior assumptions."
Microsoft has used work from its researchers in the automatic speech recognition area in a number of its products, including Cortana, Skype Translator, Office Dictation, HoloLens and Azure Cognitive Services.