I recently interviewed Anthony Minessale, the author of FreeSWITCH, the open source class-5 switch. We got to speaking (actually, IMing) about high-definition telephony and how it fits into FreeSWITCH. He had some pretty smart things to say about the area (no surprise), so I thought you might enjoy hearing about it as well. Ignore some of the FreeSWITCH promotion. The guy just can't help himself :-) And when you're done, drop me a line and let me know what you think.
Take it away, Tony...
The term “High Definition” is traditionally applied to a technology when an innovation allows a noticeable improvement in quality or detail over an existing version; the superior picture quality of an HD TV is one example. The improvement in telephone audio quality that HD-Telephony offers is far less profound than that of HD-TV or HD-Radio, but it’s an improvement nonetheless.
The biggest issue with HD-Telephony is that the majority of phones and phone switches were designed on the assumption that every call they encounter would use the same audio format and sampling rate. As VoIP grows in popularity, we already face this issue with the various encoding formats used to reduce the size of the media stream. Encoded audio cannot be manipulated or analyzed directly; it must first be decoded before the switch can, for instance, alter the volume. Then it must be re-encoded to the proper format when passing the call to the public telephone network. Variances in audio sample rates add another dimension to this problem because the decoded audio may then need to be re-sampled before being re-encoded or processed. If a high-definition audio signal is re-sampled to a lower rate at any point, all benefits of using that format are lost, because the frequencies above half the lower sampling rate are discarded and cannot be recovered later. Therefore, it is important to ensure that any calls using an HD-audio format only pass through devices that use the same sampling rate as the original call. So at this time, using a higher quality audio format for calls destined for the public phone network is pointless. On the other hand, there are many benefits to supporting HD-audio in your application or device, with little or no cost to the bandwidth usage on your network, as long as you pay attention to when and why you are using it.
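
To make the re-sampling point concrete, here is a minimal Python sketch. It is not taken from FreeSWITCH; the function names and the example paths are hypothetical. It only assumes the Nyquist limit: a hop sampling at R Hz can carry audio up to R / 2 Hz, so the slowest hop bounds the whole call.

```python
def usable_audio_bandwidth_hz(hop_sample_rates_hz):
    """Return the highest audio frequency (Hz) that can survive the whole path.

    A hop sampling at R Hz carries content only up to R / 2 Hz (Nyquist),
    so the hop with the lowest sample rate limits the entire call.
    """
    if not hop_sample_rates_hz:
        raise ValueError("path must contain at least one hop")
    return min(hop_sample_rates_hz) / 2


def is_end_to_end_wideband(hop_sample_rates_hz, wideband_rate_hz=16000):
    """True only if every hop runs at the wideband rate or higher."""
    return min(hop_sample_rates_hz) >= wideband_rate_hz


# Hypothetical paths: each number is the sample rate used by one device.
all_hd_path = [16000, 16000, 16000]
pstn_path = [16000, 8000, 16000]  # one narrowband hop in the middle

print(usable_audio_bandwidth_hz(all_hd_path))  # 8000.0
print(usable_audio_bandwidth_hz(pstn_path))    # 4000.0
print(is_end_to_end_wideband(pstn_path))       # False
```

Re-sampling back up to 16 kHz after the narrowband hop restores the sample rate but not the lost frequencies, which is why the second path is no better than a plain 8 kHz call.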
I have done a great deal of work for the Asterisk open source PBX project. Asterisk supports voice-over-IP but it was really designed to support legacy telephone equipment such as analog telephones and digital circuits where higher quality audio is far from a reality and of little concern. That was among the many reasons I decided to start FreeSWITCH. Since I had the luxury of knowing about the importance of varying sample rates, I was able to design my application to not only translate audio between various encoding formats but also to mix and resample audio as well. I think this has paid off in the long run because I now see many opportunities to take advantage of HD-audio as newer phones are being developed.
One situation where higher quality audio shines is when you have several phones on the same network that support HD-audio. In this case it will be possible to experience a noticeable improvement in quality with every call. When calls are destined for the public telephone network or some other legacy device that will only support lower quality audio, the switch can negotiate the call with the calling phone at the lower quality in anticipation. This puts the burden of re-sampling on the phone, and since the phone was designed to operate at either format anyway, it has little impact on performance. Once you have the logic in place to determine when to use high definition audio, most of the disadvantages begin to melt away. In FreeSWITCH, we can take full advantage of high-definition conferencing, audio playback and speech generation / recognition when applicable. In the case of conferencing, we can even allow legacy devices to join a high definition conference by re-sampling the audio to the correct format. As long as the format is chosen wisely or corrected to match the format of the destination, there are no drawbacks to adding HD-Telephony to an existing architecture.
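
The decision logic described above might look something like the following Python sketch. This is an illustration only, not FreeSWITCH code: the `Codec` class, the `choose_codec` function and the codec lists are all assumptions made for the example. The idea is to pick the highest-rate codec that both the calling phone and the destination support, so a legacy destination pulls the call down to narrowband from the start.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Codec:
    name: str
    sample_rate_hz: int  # audio sampling rate, not the RTP clock rate


# Illustrative codec table (hypothetical names for this sketch).
G722 = Codec("G722", 16000)
PCMU = Codec("PCMU", 8000)
G729 = Codec("G729", 8000)


def choose_codec(phone_codecs, destination_codecs):
    """Pick the highest-sample-rate codec that both sides support.

    Ties are broken by the phone's own preference order, because max()
    returns the first maximal element it encounters.
    Returns None when the two sides share no codec (transcoding needed).
    """
    common = [c for c in phone_codecs if c in destination_codecs]
    if not common:
        return None
    return max(common, key=lambda c: c.sample_rate_hz)


phone = [G722, PCMU, G729]

print(choose_codec(phone, [G722, PCMU]).name)  # HD phone on the same network
print(choose_codec(phone, [PCMU]).name)        # legacy gateway
print(choose_codec(phone, [G729]).name)        # compressed trunk
```

A real switch would also weigh bandwidth, licensing and transcoding cost; the sketch only captures the "match the destination up front" rule.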
Another concern many have with the idea of HD-Telephony is that it appears to be at odds with the goal of efficiency. There is a misconception that because the quality of the audio is higher, the amount of bandwidth necessary to transport that audio must also increase by the same factor. For uncompressed audio, it does indeed take twice the bandwidth of an 8 kHz stream to send a 16 kHz media stream. However, once encoded, a 16 kHz audio stream can use the same or less bandwidth than a standard 8 kHz G.711 stream. It may be true that some encoding formats designed for 8 kHz can greatly reduce the network bandwidth required to send a high volume of calls, but again, in those cases, the telephony switch can request the optimal encoding format from the phone or transcode to the desired format by way of software or hardware encoding. The basic principle is this: if one phone calls another phone on the same network, or the exchange dialed is a known high definition resource, the switch chooses a high definition format; when the phone calls an exchange that will put the call over a highly compressed trunk using something like G.729, the switch negotiates G.729 with the phone from the beginning to get the best results for that media path.
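
The bandwidth arithmetic can be checked with a few lines of Python. This sketch assumes uncompressed audio means 16-bit mono linear PCM, and it uses the nominal payload bit rates of the codecs (G.722 at its 64 kbit/s mode); RTP, UDP and IP headers are left out, and the helper name is made up for the example.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample=16):
    """Bit rate of uncompressed mono PCM audio, in kbit/s."""
    return sample_rate_hz * bits_per_sample / 1000


# Nominal payload bit rates in kbit/s (packet headers not included).
CODEC_KBPS = {
    "G.711 (8 kHz)": 64,
    "G.722 (16 kHz)": 64,
    "G.729 (8 kHz)": 8,
}

narrow = pcm_bitrate_kbps(8000)   # 128.0
wide = pcm_bitrate_kbps(16000)    # 256.0

print(f"raw 8 kHz PCM:  {narrow:.0f} kbit/s")
print(f"raw 16 kHz PCM: {wide:.0f} kbit/s ({wide / narrow:.0f}x)")
for name, kbps in CODEC_KBPS.items():
    print(f"{name}: {kbps} kbit/s")
```

The raw streams differ by a factor of two, but the encoded 16 kHz G.722 stream costs no more than 8 kHz G.711, while G.729 shows how far a narrowband codec can shrink the stream when the trunk demands it.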
Many new SIP phones now support high quality audio using G.722 as well as low bandwidth codecs such as G.729. The open source Speex codec supports wideband 16 kHz as well as ultra-wideband 32 kHz. Many new wireless phones support both narrowband and wideband audio formats, making it possible to complete an HD-audio call from your mobile phone to an HD-ready conference or SIP phone without giving up any additional resources or functionality. There is simply another factor to consider when negotiating the call, and just because it’s more difficult to deal with, we should not ignore this technology, which has actually existed for a long time. It may not offer many benefits to Ma Bell, but HD-Telephony is emerging in the marketplace and, when used properly, has the potential to raise our standards to the next level.