Unified Communications (UC) and Voice-over-IP (VOIP) have become fixtures of the modern enterprise. These tools allow us to collaborate with our colleagues and to work from anywhere, whether at home or in a coffee shop.
Still, while UC/VOIP platforms like Microsoft Lync/Skype, Cisco UC/WebEx and Citrix GoToMeeting have solved numerous work issues, one problem they have not solved is overall voice and audio quality, particularly in situations where several people are attending a voice conference at the same time.
In fact, in some cases, audio quality on a conference call using a UC system can end up being worse than POTS (Plain Old Telephone Service) or even a mobile phone call.
Why does this happen? Most UC systems, like mobile phone networks, operate on the same basic principle: a single stream of compressed audio data for the call is shared among the participants. Each participant's voice is sampled digitally and then multiplexed into that stream.
Several things affect call quality over VOIP. First is the quality of each participant's microphone and sound reproduction hardware. Next is each participant's sampling rate, combined with the current health of the network on each end of the call, where several more variables come into play: client-side bandwidth, network round-trip time, jitter and packet loss.
An issue with one or more of these variables from any of the call participants can degrade an entire voice conference because there is only one shared data stream.
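To make two of those network variables concrete, here is a minimal Python sketch of how a client might estimate jitter and packet loss from packet timing. The jitter calculation follows the running estimate described in RFC 3550 (the RTP specification); the function names and sample numbers are my own illustrations, not anything taken from a particular UC product.

```python
def interarrival_jitter(send_times, recv_times):
    """Running jitter estimate in the style of RFC 3550: J += (|D| - J) / 16."""
    jitter = 0.0
    for i in range(1, len(send_times)):
        # D is how much the spacing between two packets changed in transit.
        d = (recv_times[i] - recv_times[i - 1]) - (send_times[i] - send_times[i - 1])
        jitter += (abs(d) - jitter) / 16.0
    return jitter


def packet_loss_ratio(received_seq, expected_count):
    """Fraction of the expected packets that never arrived."""
    if expected_count <= 0:
        return 0.0
    return 1.0 - len(set(received_seq)) / expected_count


if __name__ == "__main__":
    # Hypothetical timestamps in milliseconds: packets sent every 20 ms.
    sent = [0, 20, 40, 60]
    received = [50, 70, 96, 110]
    print("jitter (ms):", interarrival_jitter(sent, received))
    print("loss ratio:", packet_loss_ratio([0, 1, 3], expected_count=4))
```

A perfectly steady stream produces a jitter of zero; the estimate grows as packet spacing becomes irregular, which is what listeners hear as choppy audio.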
Voxeet, a start-up which has its engineering team in Bordeaux, France and corporate offices in San Francisco, seeks to solve these problems by introducing high-definition sampling, multiple data channels and 3D audio mixing to provide a unique, high-value experience in audio conferencing.
3D audio is certainly nothing new. It has been a widespread part of PC and console gaming technology since Intel introduced the AC'97 audio codec standard (since superseded by Intel HD Audio). However, it's never been used in audio conferencing before.
Voxeet version 3.5, which was released this week, is a client software program that runs on Windows, Mac, Android and iPhone devices. While the iPhone version works just fine on an iPad, a native iPad version is planned for release in the near future.
The software is built around the audio components of WebRTC, the real-time communications project that Google open sourced in May of 2011.
The Voxeet client connects with Voxeet's cloud service, which runs on Amazon Web Services. The AWS-based infrastructure primarily functions to coordinate calls, act as a chat server and perform scheduling. However, all call processing occurs on the client device, not on the server.
Call scheduling is done by adding email@example.com as a virtual participant in your calendaring program of choice, along with your other intended call participants. The service then calls each participant at the appropriate time.
Currently, up to eight people can call into a Voxeet conference, using either the client software or a PSTN dial-in bridge number. The only two authentication mechanisms Voxeet currently supports are Google+ and Facebook, and the software cannot yet do video calls, outgoing PSTN calls or whiteboarding/presentation sharing.
So it's not going to replace your corporate UC system in the immediate future.
But what Voxeet does, it does very, very well. Instead of being folded into a shared data stream, each participant's voice is sampled in HD audio (48,000 Hz) and stays an isolated stream. Much as a sound engineer in the music industry mixes separate vocal and instrumental tracks while producing a song, Voxeet mixes all the streams together to create one cohesive call experience.
Because each stream is a separate track, multiple people can talk simultaneously without talking on top of each other. This "mixing" approach also lets two participants have a private side conversation, or "whisper," without interrupting the entire conference call.
Additionally, if one or more streams degrade, the Voxeet client can automatically perform processing to try to improve the call without degrading the call for the other participants.
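The idea of keeping every voice on its own track can be sketched in a few lines of Python. This is only an illustration of per-listener mixing in general, not Voxeet's actual code: the function name, the gain table and the sample values are all hypothetical, and audio samples are assumed to be floats between -1.0 and 1.0.

```python
def mix_for_listener(streams, listener, gains=None):
    """Sum every stream except the listener's own, applying a per-speaker gain."""
    gains = gains or {}
    others = {name: s for name, s in streams.items() if name != listener}
    if not others:
        return []
    length = max(len(samples) for samples in others.values())
    mixed = [0.0] * length
    for name, samples in others.items():
        gain = gains.get(name, 1.0)  # 0.0 mutes a speaker for this listener only
        for i, sample in enumerate(samples):
            mixed[i] += gain * sample
    # Clamp to the valid sample range so loud overlaps do not wrap around.
    return [max(-1.0, min(1.0, value)) for value in mixed]


if __name__ == "__main__":
    # Hypothetical two-sample snippets from three participants.
    streams = {
        "alice": [0.5, 0.5],
        "bob": [0.25, -0.25],
        "carol": [0.5, 0.5],
    }
    print(mix_for_listener(streams, "alice"))                  # bob + carol
    print(mix_for_listener(streams, "alice", {"carol": 0.0}))  # carol muted
```

Because the mix is built separately for each listener, one person can turn a speaker down or mute them without changing what anyone else hears, something a single shared stream cannot offer.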
Each audio stream is encrypted, and because there are multiple streams, it is that much harder to intercept a Voxeet conference.
Also, because each person may be using different microphone hardware on their end with different input volume settings -- whether it is the cheap, ubiquitous Apple earbuds on an iPhone or something high-end like a USB-connected Jabra Evolve 80 on a PC -- each participant has the ability to move other participants around a virtual "room" that simulates a 10 meter square space.
If someone is talking too loudly, you can push them towards the back of the room or off to the side. If someone is talking too quietly, you can pull them closer to you. You can also mute individual people. Participants can also do instantaneous call hand-off between their devices by logging in with, say, their mobile device while their desktop client is running.
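For the curious, the effect of moving someone around the room can be approximated with two textbook techniques: inverse-distance attenuation for loudness and constant-power panning for left/right placement. The Python sketch below is my own rough illustration under those assumptions. It treats the room as 10 meters by 10 meters, and the function and parameter names are hypothetical; Voxeet has not published how its 3D mixing actually works.

```python
import math

ROOM_SIZE_M = 10.0  # assumed: a 10 m x 10 m virtual room


def spatial_gains(listener, speaker, ref_distance=1.0):
    """Return (left_gain, right_gain) for a speaker position relative to a listener."""
    dx = speaker[0] - listener[0]
    dy = speaker[1] - listener[1]
    # Never treat a speaker as closer than the reference distance.
    distance = max(math.hypot(dx, dy), ref_distance)
    attenuation = ref_distance / distance  # inverse-distance law
    # Map the sideways offset to a pan value in [-1, 1] (left to right).
    pan = max(-1.0, min(1.0, dx / (ROOM_SIZE_M / 2)))
    # Constant-power panning: the angle sweeps from 0 (left) to pi/2 (right).
    angle = (pan + 1.0) * math.pi / 4
    return attenuation * math.cos(angle), attenuation * math.sin(angle)


if __name__ == "__main__":
    me = (5.0, 5.0)  # listener in the middle of the room
    print("close, centered:", spatial_gains(me, (5.0, 6.0)))
    print("pushed to back: ", spatial_gains(me, (5.0, 9.0)))
    print("off to the right:", spatial_gains(me, (10.0, 5.0)))
```

Pushing a loud talker from one meter away to four meters away cuts their level to a quarter in this model, while dragging them sideways shifts their voice toward one ear.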
Voxeet has produced a demo video that replicates some of the overall effect of the call experience. To truly appreciate it, though, you need to download the software and use the service, which is currently free. The company is working on various plans for monetization, such as potentially licensing its technology to the existing UC players and offering premium services.
In our tests using the Windows and iPhone versions of the client software, we were extremely impressed with Voxeet's sound reproduction quality and the 3D positioning effect, but we did have a number of issues connecting calls, presumably because the service has recently been rolling out changes to its back-end cloud infrastructure.
Overall, I think Voxeet adds some interesting value to the voice conferencing experience, and it is definitely worth trying.
Have you been looking for a higher-quality voice conferencing experience? Talk Back and Let Me Know.