Microsoft Teams: Get ready for clearer sound on your meetings thanks to this audio upgrade

Microsoft's Satin audio codec is coming to Teams meetings to bring clear audio even on really bad networks.

Microsoft is showing off improvements to its Satin audio codec that should improve the clarity of audio in Teams meetings on congested networks with high packet loss.

Microsoft brought the Satin audio codec to Teams last year, but its use has been limited to two-party calls. 

Today, many people are working from home on networks that are probably shared with kids using them for video, home schooling, gaming and so on, so Teams meetings could benefit greatly from the AI-powered codec, too. Any bitrate savings on audio can be used to improve other workloads such as video or content sharing, Microsoft said.

"Satin is already being used for all Teams and Skype two-party calls and will roll out for Teams meetings soon," Microsoft engineers Jigar Dani and Sriram Srinivasan say in a blog post.

The Satin codec currently operates in super-wideband voice mode within a bitrate range of just 6 kbps to 36 kbps. It will be extended to support fullband stereo music at a maximum sampling rate of 48 kHz in the near future, they said.

Satin is the evolution of Silk, the audio codec developed by Skype during the dial-up era of the internet to deliver wideband-quality (16 kHz) speech at bitrates from 14 kbps.

The engineers note that Silk was the default codec for Skype and Microsoft Teams and part of Opus, the audio-coding format for WebRTC – the project bringing real-time communications (RTC) to the browser.   

Despite all the improvements in broadband and mobile speeds, network quality and usage vary hugely.

"Satin can deliver super wide-band speech starting at a bitrate of 6 kbps, and full-band stereo music starting at a bitrate of 17 kbps, with progressively higher quality at higher bitrates," the Microsoft engineers explain. 

They also offer a rundown of the difference between narrowband, wideband, and super-wideband voice: 

  • Humans hear sounds at frequencies between 20 Hz and 20 kHz.
  • Early telephony systems used a sampling rate of 8 kHz and could reproduce frequencies up to 4 kHz (in practice up to 3.4 kHz). 
  • VoIP brought wideband speech (reproducing frequencies up to 8 kHz, sampled at 16 kHz), giving crisper, more natural and intelligible sound.
  • Silk and Opus took advantage of this with super-wideband voice, capturing frequencies up to 12 kHz, sampled at 24 kHz.
  • Satin redefines super-wideband to cover frequencies up to 16 kHz (sampled at 32 kHz) for greater clarity and sibilance, and its efficient compression enables super-wideband voice at 6 kbps.
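
The figures in this list follow the usual sampling rule that a signal sampled at a given rate can reproduce frequencies up to half that rate. The short Python sketch below checks that arithmetic for each band, and also estimates how few bytes of audio a codec has to work with per packet at Satin's quoted bitrates. The 20 ms packet duration is an assumption chosen for illustration only; the article does not state Satin's packet length, and the function names are hypothetical.

```python
# Sampling rates quoted in the list above, in Hz.
SAMPLING_RATES_HZ = {
    "narrowband (early telephony)": 8_000,
    "wideband (VoIP)": 16_000,
    "super-wideband (Silk/Opus)": 24_000,
    "super-wideband (Satin)": 32_000,
}


def max_frequency_hz(sampling_rate_hz: int) -> float:
    """Highest reproducible frequency: half the sampling rate."""
    return sampling_rate_hz / 2


def payload_bytes_per_packet(bitrate_bps: int, packet_ms: int = 20) -> float:
    """Audio payload per packet at a given bitrate.

    packet_ms=20 is an illustrative assumption, not a documented Satin value.
    """
    bits = bitrate_bps * packet_ms / 1000
    return bits / 8


if __name__ == "__main__":
    for band, rate in SAMPLING_RATES_HZ.items():
        print(f"{band}: sampled at {rate} Hz, up to {max_frequency_hz(rate):.0f} Hz")
    for bitrate in (6_000, 36_000):
        print(f"{bitrate} bps: {payload_bytes_per_packet(bitrate):.0f} bytes per 20 ms packet")
```

Under that assumed packet length, 6 kbps leaves only about 15 bytes per packet to describe audio content up to 16 kHz, which gives a sense of how aggressive the compression has to be.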

The Satin codec also addresses situations where networks are experiencing high packet loss, which degrades audio quality.

"Satin encodes each packet independently, so the effect of losing one packet does not affect the quality of subsequent packets. The codec is also designed to facilitate high quality packet loss concealment in an internal parametric domain," the engineers note.
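
To see why independent packets matter, consider a toy comparison in Python. It is not Satin's algorithm, which the article does not describe in detail; it simply contrasts a hypothetical scheme in which each packet depends on the previous one (delta coding) with one in which every packet stands alone. The sample values and function names are invented for illustration.

```python
def encode_delta(samples):
    """Each packet carries the difference from the previous sample."""
    packets, prev = [], 0
    for s in samples:
        packets.append(s - prev)
        prev = s
    return packets


def decode_delta(packets, lost):
    """Decoder accumulates differences; a lost packet corrupts its running state."""
    out, value = [], 0
    for i, delta in enumerate(packets):
        if i in lost:
            out.append(None)  # nothing received, running value not updated
        else:
            value += delta
            out.append(value)
    return out


def decode_independent(packets, lost):
    """Each packet carries its own value, so only lost packets are affected."""
    return [None if i in lost else p for i, p in enumerate(packets)]


def count_wrong(decoded, samples):
    """Number of positions where the decoder output differs from the original."""
    return sum(1 for d, s in zip(decoded, samples) if d != s)


if __name__ == "__main__":
    samples = [10, 12, 15, 11, 9, 14]  # made-up values
    lost = {2}                         # the third packet never arrives

    dependent = decode_delta(encode_delta(samples), lost)
    independent = decode_independent(list(samples), lost)

    print("dependent packets wrong:", count_wrong(dependent, samples))
    print("independent packets wrong:", count_wrong(independent, samples))
```

In the dependent scheme, one lost packet leaves every later packet wrong until the decoder resynchronizes; in the independent scheme the damage is confined to the lost packet itself, which is the property the engineers describe. A real codec would then conceal the gap rather than leave it empty.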