Artificial empathy: Call center employees are using voice analytics to predict how you feel

The technology can also help identify depression symptoms, and it may soon empower machines to act more human.
Written by Greg Nichols, Contributing Writer

Customer service calls can be ... infuriating. Part of the reason is that humans generally aren't great at reading subtle emotional cues, especially if we only have voice to go by.

At the same time, we often inadvertently broadcast unintended emotional signals, easily leading to miscommunication and discomfort over the phone.

But an MIT spinoff called Cogito is using voice analytics to help customer service reps better understand how customers are feeling. The technology behind Cogito's enterprise product, which can predict a customer's emotional state by analyzing tone and voice patterns, has also been used to identify signs of PTSD and depression in veterans.

It doesn't take a huge imaginative leap to envision the same technology giving computers and robots a simulated version of empathy.

The analytics Cogito developed arose out of conflict.

In 2001, MIT Media Lab professor Alex "Sandy" Pentland was in India to launch Media Lab Asia. "I noticed a lot of the meetings we had, particularly the board of directors, were awful," Pentland told MIT News, which has tracked the company it helped launch.

The problem, Pentland surmised, was the way people were communicating their ideas -- not necessarily the words they were using, but the tone and emphasis behind the words.

Out of that experience grew Pentland's fascination with quantifying how people speak, which often stands in contrast to what they are saying. That is, Pentland wanted to understand the subtle cues in speech, tone, and body language that have nothing to do with words themselves.


To aid his effort, MIT researchers developed what they call sociometers -- name badges with embedded sensors that track patterns in speech and body movement during conversation.

The researchers were able to predict the outcome of interactions like job interviews to an extraordinarily high degree without actually listening to the words being spoken. There is, as behavioral scientists have long held, a rich layer of communication in every interaction that happens independently of language.
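To give a sense of what "communication independent of language" means in practice, here is a minimal, hypothetical sketch of sociometer-style feature extraction. The real badges and the models built on them are proprietary and far more sophisticated; the feature names, thresholds, and toy data below are invented for illustration only.

```python
# Hypothetical sketch: extract nonverbal conversational features from
# per-frame audio energy, with no access to the words being spoken.
# Thresholds and feature names are illustrative assumptions.

def speaking_fraction(frame_energies, threshold=0.1):
    """Fraction of frames whose energy exceeds a voice-activity
    threshold (a crude voice-activity detector)."""
    voiced = [e > threshold for e in frame_energies]
    return sum(voiced) / len(voiced)

def energy_variance(frame_energies):
    """Variance of frame energy -- a rough proxy for vocal dynamism."""
    mean = sum(frame_energies) / len(frame_energies)
    return sum((e - mean) ** 2 for e in frame_energies) / len(frame_energies)

def conversation_features(speaker_a, speaker_b, threshold=0.1):
    """Summarize a two-person exchange using only nonverbal signals:
    who holds the floor, and how dynamic each voice is."""
    return {
        "a_talk_fraction": speaking_fraction(speaker_a, threshold),
        "b_talk_fraction": speaking_fraction(speaker_b, threshold),
        "a_dynamism": energy_variance(speaker_a),
        "b_dynamism": energy_variance(speaker_b),
    }

# Toy per-frame energy traces (one value per short audio frame).
a = [0.5, 0.6, 0.0, 0.7, 0.0, 0.8]
b = [0.0, 0.0, 0.3, 0.0, 0.4, 0.0]
features = conversation_features(a, b)
```

Features like these -- floor-holding, interruption, vocal energy -- are the kind of language-free signals a classifier could use to predict interaction outcomes such as interview success.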

Pentland's research quickly turned to healthcare, where he found voice analytics could help detect symptoms of depression or determine whether doctors and patients really understand each other during interactions.

More recently, DARPA and the US Department of Veterans Affairs have given Cogito, the company Pentland formed in 2007 with former MIT MBA student Joshua Feast, grant money to determine if the technology can be used to flag veterans likely suffering from PTSD, which could help the VA deliver services more effectively.

During a 2013 clinical trial supported by DARPA, Cogito noticed an increase in signs linked with PTSD among trial participants following the Boston Marathon bombing.

The money, of course, is in enterprise products.

About five million Americans work in call centers. Cogito created a product called Cogito Dialog, which analyzes voice signals to determine things like customer engagement and frustration. The software gives call center employees real-time feedback during calls, allowing them to adjust their approach.
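As a rough illustration of what real-time feedback during a call might look like, here is a small sketch in the spirit of that workflow. Cogito Dialog's actual signals, rules, and thresholds are proprietary and not public; the event format and coaching rules below are assumptions made up for demonstration.

```python
# Illustrative sketch: scan a stream of talk segments from a call and
# produce coaching nudges for the rep. The rules and thresholds here
# are hypothetical, not Cogito's.

def live_nudges(events, max_monologue=30.0, min_pause=0.5):
    """events is a list of (speaker, duration_seconds) segments,
    where speaker is "rep", "customer", or "pause". Returns a list
    of feedback messages triggered by simple conversational rules."""
    nudges = []
    for speaker, duration in events:
        if speaker == "rep" and duration > max_monologue:
            nudges.append("You've been speaking a while -- pause and listen.")
        if speaker == "pause" and duration < min_pause:
            nudges.append("Slow down -- leave room for the customer to respond.")
    return nudges

# A toy call: the rep rushes past a pause, then monologues for 45 s.
call = [("rep", 12.0), ("pause", 0.2), ("customer", 8.0), ("rep", 45.0)]
nudges = live_nudges(call)
```

A production system would of course work on streaming audio and learned models rather than hand-written rules, but the shape of the feedback loop -- detect a pattern mid-call, surface a prompt, let the rep adjust -- is the same.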


A case study with Humana, a large health insurance company, revealed a 28-percent increase in customer satisfaction and a 63-percent increase in employee engagement during calls when using voice analytics tracking.

"It's aiding that intuitive understanding we have when we listen to people," Pentland said, "helping people do that better."

Cogito's enterprise clients include MetLife and other large insurance providers.

Being able to foster communication and connection beyond language could have interesting implications in diplomacy and conflict resolution, and it's bound to improve our experience with AI assistants and robots.

I'd certainly love to have a real-time analytical readout during political conversations at large family gatherings.
