Keypoint detection unlocks secrets of body language

Machines are pretty good at deciphering voices, but your slouch may say more
Written by Greg Nichols, Contributing Writer

What do a bunch of roboticists know about reading body language? A lot, it turns out.

Robots already interact with humans on factory floors and in semi-structured environments like hospitals. A big hurdle to bots making the leap to unstructured environments is deciphering the complexities of non-verbal communication.

Researchers at Carnegie Mellon University's Robotics Institute just got a step closer by enabling a computer to understand the body poses and movements of multiple people from video in real time -- including, for the first time, the pose of each individual's fingers.

"We communicate almost as much with the movement of our bodies as we do with our voice," says Yaser Sheikh, associate professor of robotics at CMU, "but computers are more or less blind to it."

The research began with the Panoptic Studio, a two-story dome embedded with 500 video cameras. "A single shot gives you 500 views of a person's hand, plus it automatically annotates the hand position," said Hanbyul Joo, a CMU Ph.D. student.

The pose-tracking research is the latest validation that it pays to invest in pure science. When the Panoptic Studio was built a decade ago with support from the National Science Foundation, it was not clear what impact it would have, Sheikh said.

Experimenting with pose detection in a sensor-rich environment ultimately allowed the researchers to detect the poses of a group of people using a single camera and an off-the-shelf laptop, meaning this technology is now cheap and highly flexible.

That's interesting, because personal computing is an obvious application. Tracking human motion would allow people to interact with machines naturally, and it won't be long before you can control computers through simple hand gestures, à la Tony Stark.

The security industry is also interested in the technology. Right now, human security personnel are trained to detect unusual body behavior in crowds of people, such as at airports or concerts. That's a bit like using a canary to monitor air quality -- it sort of works, but it's not a great solution.

But the application I'm most excited about is in robotics, where machines will perform better by being able to read human body language. Since intention is often subtly signaled before an action is carried out, a self-driving car could get an early warning that a pedestrian is about to step into the street.

Personal care and assistive robots also need to read body language. Robot assistants will be able to change their behavior based on the mood of their human counterparts, and healthcare robots will be able to detect patient discomfort or fatigue by reading poses. The technology could lead to better behavioral diagnostics for conditions like dyslexia and depression.

The CMU researchers have released their computer code for both multi-person and hand-pose estimation, and they told me that about 20 commercial groups, including automotive companies, have expressed interest in licensing the technology.
