Humans have stereo vision. With two eyes, two perspectives, and some hefty brain processing power, we can perceive distance by sight, a nifty trick if you need to hunt your food or hope to avoid spilling coffee on your keyboard.
But giving machines the same advantage has remained surprisingly elusive. Earlier this month, a Bay Area startup called StereoLabs introduced the ZED, the first affordable high-definition stereo camera. When coupled with a drone, autonomous car, or some other robot, the device can effectively give machines something like human vision, allowing for deft indoor and outdoor navigation at a price thousands, even tens of thousands, of dollars less than the next cheapest technology. To create the ZED, the engineers had to devise a novel solution to a problem that has made all other stereo cameras finicky, fragile, and mechanically overcomplex. The device costs less than 500 dollars and is poised to be a game changer for hardware developers.
Today's robots that navigate their environments autonomously rely on lasers, radar, infrared, or some combination of these technologies to gauge distance, recognize objects, and avoid collisions. The problem is that refined versions of those sensing technologies are very expensive. Google uses laser-based LIDAR on its self-driving car to sense objects. Its system can accurately pick up a pedestrian crossing the street 100 meters away, which is a real feat, but it also reportedly costs 60,000 dollars. And that's just one of the sensing technologies Google employs on its cars.
Some drones use infrared, a thermal sensing technology also used in night vision, to perform collision avoidance. But there's a problem: IR sensors don't work well in daylight, which is a pretty significant limitation when you're talking about drones. Robots that do use electronic vision to sense their environment typically have only one functional eye in any given scenario, meaning depth and distance can't be judged from visual input alone.
At first glance, it seems like a high-definition stereo camera shouldn't be much more complex than a regular camera. Strap two of them together, do a little coding, and voila! The problem is one of alignment. "The most difficult part is making sure both cameras' sensors and optics are perfectly aligned," StereoLabs CEO Cecile Schmollgruber tells me. "This is practically impossible, even with hours of trying." Digital stereo cameras do exist, of course, but to date they all share a common Achilles' heel: they must be calibrated constantly to maintain alignment, and they rely on elaborate mechanical underpinnings to help users perform that task. If the optics drift even slightly out of whack (and subtle variations in the lenses themselves are common), the resulting 3D images won't faithfully represent the environment they purport to capture, and the camera becomes useless.
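To see why alignment matters so much, consider how stereo depth is actually recovered: from the horizontal shift, or disparity, of a feature between the two views. The sketch below uses the standard pinhole stereo formula with made-up focal length and baseline numbers purely for illustration; it is not StereoLabs' implementation.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Pinhole stereo model: depth Z = f * B / d.

    focal_px     -- focal length, in pixels
    baseline_m   -- distance between the two cameras, in meters
    disparity_px -- horizontal shift of a matched feature between views
    """
    if disparity_px <= 0:
        return float("inf")  # zero disparity: the point is effectively at infinity
    return focal_px * baseline_m / disparity_px

# Hypothetical setup: 700 px focal length, 12 cm baseline, 35 px disparity
print(depth_from_disparity(700.0, 0.12, 35.0))  # 2.4 (meters)
```

Because depth depends on a pixel-level horizontal measurement, even a slight rotation or vertical offset between the two sensors corrupts the disparity and, with it, every distance estimate.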
StereoLabs began looking at the problem in 2008. The company got its start in the movie industry. "Production companies used to come to us," says Schmollgruber, "and they'd say, 'We're shooting in 3D, I have two huge cameras that aren't aligned, do you have a solution?' They would have to manually change the position of their cameras and try to spot the alignment errors by eye. It took so much time, which is why so few movies are shot in native 3D, as opposed to conversion."
StereoLabs' elegant solution seems destined for tech startup lore. Instead of investing in better hardware, which is expensive and failure-prone, the company built software to recognize and compensate for subtle calibration variations. Schmollgruber says it wouldn't have been possible just a few years earlier. "Computer vision has evolved so much, and all the power of graphics cards and GPUs has exploded," she says, "so the convergence of these things made it possible today." StereoLabs was the first to pounce on this new reality, and in one step the company slashed the entry price for stereo video five-fold.
The earliest consumer use of the ZED will likely be in drones. Collision avoidance is an obvious application, but as developers dig deeper into the system's capabilities, exciting new uses will surface. The company is working with developers of self-driving cars, but confidentiality agreements prevent it from divulging which projects it's involved with or what role the stereo vision technology will play. For now, the ZED is available off-the-shelf for anyone with the technical know-how to create applications that utilize stereo vision. The camera plugs into Nvidia's Jetson TK1, an inexpensive development board, and can generate a 3D model in real time. This September, StereoLabs plans to release ready-to-go 3D scanning applications.
Expect to hear more about 3D vision in months to come. Machines just got one step closer to being able to interact with the world the way humans do.