Advancing artificial intelligence: Microsoft deploys corgis to beat Google on imaging

Microsoft says Project Adam has given it the world's best image classifier.

The race to advance artificial intelligence using cheap networked PCs kitted out to mimic the human brain has a new challenger: Project Adam, Microsoft's attempt at using deep learning to improve natural language processing, computer vision, and speech recognition.

Microsoft says it's taken a big step in creating true artificial intelligence (AI) with Project Adam — a deep neural network built on commodity hardware and adept at categorising different breeds of corgi.


The groundwork for Project Adam was laid by a 2012 Google project, in which the search giant demonstrated that a network of 16,000 computers could teach itself to identify cat images drawn from YouTube.

"The machine-learning models we have trained in the past have been very tiny, especially in comparison to the size of our brain in terms of connections between neurons," Trishul Chilimbi, a Microsoft researcher behind Project Adam who's also been working on Bing, said.

"What the Google work had indicated is that if you train a larger model on more data, you do better on hard AI tasks like image classification."

Instead of Google's cats, Microsoft researchers put Adam to work identifying different breeds of Queen Elizabeth II's preferred dog, the corgi, using 14 million images from ImageNet, an image database divided into 22,000 categories.

According to Microsoft, Project Adam's network of two billion connections has produced the world's best image classifier, which it says is "50 times faster" than Google's effort and more than twice as accurate, while requiring 30 times fewer machines.

The project aims to capture the potential of "hierarchical representation learning using big data", according to Microsoft. As it explains in a video, the technology could allow users to photograph food to immediately discover its nutritional information, or be put to work helping to detect diseases earlier.

Chilimbi said the "sweet spot" for the number of layers in a deep neural network is six — which is close to the human visual cortex. After that, each additional layer delivers smaller returns.

So the project's approach to learning the difference between different corgi breeds would be broken down into layers — for example, the dog's shape, followed by another layer that learns textures and fur, then another focussed on body parts such as the shapes of ears and eyes. The fourth layer would learn complex body parts while the fifth would be dedicated to "high level recognisable concepts" like a dog's face.
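
As a rough illustration of that layering, the sketch below stacks six toy layers in plain Python, each one computing its output only from the layer beneath it. It is not Project Adam's code: the layer sizes, the random weights and the function names (`make_layer`, `forward`) are assumptions made up for the example, and nothing is trained here.

```python
import random


def make_layer(n_in, n_out, rng):
    """Return a weight matrix with n_out rows and n_in columns of small random values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def relu(x):
    """Pass positive values through and clamp negative values to zero."""
    return x if x > 0.0 else 0.0


def forward(layers, pixels):
    """Feed an input through each layer in turn, keeping every layer's output."""
    activations = [pixels]
    for weights in layers:
        below = activations[-1]  # output of the layer underneath
        above = [relu(sum(w * a for w, a in zip(row, below))) for row in weights]
        activations.append(above)
    return activations


rng = random.Random(0)  # fixed seed so the toy example is repeatable

# Hypothetical sizes: a 16-value "image" followed by six successively smaller layers.
sizes = [16, 12, 10, 8, 6, 4, 3]
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

pixels = [rng.random() for _ in range(sizes[0])]
activations = forward(layers, pixels)
```

Each entry in `activations` is one level of the stack, so `activations[1]` is computed directly from the raw pixels and `activations[6]` from five levels of intermediate features. In a real network the weights would be learned from labelled images rather than drawn at random.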

"The reason it's interesting is that each layer of this neural network learns automatically a higher-level feature based on the layer below it. The top-level layer learns high-level concepts like plants, written text, or shiny objects. It seems that you come to a point where there’s diminishing returns to going another level deep. Biologically, it seems the right thing, as well," said Chilimbi.

According to Chilimbi, it's still a mystery how deep neural networks figure out how to break an image down into levels of features when told only, for example, that the image shows a Pembroke Welsh corgi.

"There's no instruction that we provide for that. You just have training algorithms saying, 'This is the image, this is the label.' It automatically figures out these hierarchical features. That's still a deep, mysterious, not well understood process. But then, nature has had several million years to work her magic in shaping the brain, so it shouldn't be surprising that we will need time to slowly unravel the mysteries."
