Hacking neural networks

Summary: Anything that can be hacked, will be hacked — including neural networks. Here's what researchers have learned about surprising artificial intelligence behavior.

TOPICS: Storage, Software

Modern neural networks have achieved startling success in image and speech recognition — think Siri and Google Voice Search — using layers of simpler feature analyzers to break the problem down.

These loosely networked layers give the techniques their power, but also give rise to counter-intuitive behaviors.

In the recent paper Intriguing properties of neural networks, Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow and Rob Fergus found that artificial intelligence, like human intelligence, has some surprising behaviors. They looked at two areas, semantic analysis and image classification, but I'll focus on the latter.

Specifically, they found that a state-of-the-art image recognition neural network, one that should robustly handle slightly altered images, can be fooled by

"...applying an imperceptible non-random perturbation to a test image, it is possible to arbitrarily change the network’s prediction. These perturbations are found by optimizing the input to maximize the prediction error. We term the so perturbed examples 'adversarial examples'."

These adversarial examples work across differently configured neural networks, even those trained on different images, suggesting that even AI systems have blind spots analogous to those of human intelligence.
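The authors find these perturbations by optimizing the input to maximize the prediction error. Here's a minimal sketch of the general idea using a toy linear classifier and a single step along the sign of the gradient, which is a simplification, not the paper's box-constrained L-BFGS procedure; all weights and pixel values below are invented for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy "image": 100 pixels, plus a fixed linear classifier.
# (Hypothetical weights chosen for illustration; real networks are
# nonlinear, but the same gradient trick applies through the layers.)
d = 100
w = np.full(d, 0.1)          # classifier weights
x = np.full(d, 0.05)         # original image, predicted class 1

def predict(img):
    return sigmoid(w @ img)  # probability of class 1

# For this linear model, the gradient of the class-1 score with
# respect to the input is just w, so nudging every pixel a tiny
# amount *against* the gradient pushes the prediction toward class 0.
eps = 0.06                   # maximum per-pixel change
x_adv = x - eps * np.sign(w)

print(predict(x))            # above 0.5: class 1
print(predict(x_adv))        # below 0.5: class 0
print(np.max(np.abs(x_adv - x)))  # per-pixel change is only 0.06
```

The point of the sketch: a change too small to notice on any individual pixel, applied in the right direction everywhere, accumulates into enough of a shift to flip the classification.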

The paper's adversarial image examples justify the term "imperceptible". Here's an example showing (left) a correctly predicted image, (right) an adversarial, incorrectly classified image, and (center) the difference between the two, magnified 10x:

Adversarial image example.

Imperceptible indeed!

The authors conclude:

"...if the network can generalize well, how can it be confused by these adversarial negatives, which are indistinguishable from the regular examples? The explanation is that the set of adversarial negatives is of extremely low probability, and thus is never (or rarely) observed in the test set, yet it is dense (much like the rational numbers), and so it is found near virtually every test case."

The Storage Bits take

Images — and moving images — are a massive part of mankind's stored heritage. AI systems that scan and "recognize" their salient features vastly increase that heritage's searchability.

These systems also have applications in self-driving vehicles, where adversarial images could have dangerous effects. While a remote possibility today, we have to consider how criminals, corporations, and national security agencies might exploit these counter-intuitive results to hack our digital world.




Comments
  • Networks

    Based on what I've heard and read, the networked layers of neural networks give the techniques their power, but also give rise to counter-intuitive behaviors. Unlike most computers, which process data sequentially (at amazing speeds), evolution decided a parallel, distributed processing route was a bit better. No doubt, in my mind, because it's hard to completely disable a huge, decentralized network. Phineas Gage is the best example of this that comes to mind, but all it takes is one look at the RIAA's attempts to destroy such networks to gain a good understanding of why.
  • Neural Networks and Kekulé's benzene dream

    When neural networks begin having lucid dreams and can then correctly deduce a previously unknown solution to a problem, neural networks will have achieved "singularity".

    As Friedrich August Kekulé von Stradonitz once remarked in a lecture, “Let us learn to dream!”

    A nice blog article on this topic was posted by Ranjit Singh in 2009.