The agency's "Mind's Eye" program (.pdf) is pursuing ways to give machines such as cameras the ability to interpret the scenes and objects they sense. The challenge: learn generally applicable, generative representations of actions between objects in a scene directly from visual inputs, and then reason over those representations.
Here's an excerpt from the official announcement:
Humans in particular perform a wide range of visual tasks with ease, which no current artificial intelligence can do in a robust way. Humans have inherently strong spatial judgment and are able to learn new spatiotemporal concepts directly from the visual experience. Humans can visualize scenes and objects, as well as the actions involving those objects. Humans possess a powerful ability to manipulate those imagined scenes mentally to solve problems. A machine‐based implementation of such abilities would be broadly applicable to a wide range of applications.
In other words: take existing advances in recognition and add perceptual and cognitive reasoning, so a machine can tell a "narrative" rather than simply report what is there.
That means it's primarily a software game -- visual intelligence software must work with hardware to be "smart."
DARPA says one military capability is indeed a "smart camera with sufficient visual intelligence that it can report on activity in an area of observation."
Applications include surveillance platforms with extreme limitations on payload size and available computing power, such as unmanned ground vehicles (UGVs).
Image: Quevaal/Wikimedia Commons
This post was originally published on Smartplanet.com