'Hallucinating' AI makes it harder than ever to hide from surveillance

Machine vision is one of the most important AI applications, and -- for surveillance -- one of the most controversial and underestimated. Now researchers have demonstrated that AI can construct a person's complete image from a partial photo.


Surveillance video is everywhere these days, and researchers are working on making it smarter and smarter. The latest advance is in the problem of constructing -- or "hallucinating," in machine learning (ML) parlance -- a complete image of a person from a partial or occluded photo.

Occlusion occurs when the object, or body, you want to see is partially covered by an intervening object or body. In a crowded public area, say Times Square in New York, surveillance cameras would rarely get an unobstructed view of a person of interest. 


That's where the paper "Can Adversarial Networks Hallucinate Occluded People With a Plausible Aspect?", by researchers from the University of Modena, comes in. Their goal was to construct a plausible representation of a person from a single obstructed image.

Results

You can judge their success from some reconstructions -- from a test dataset -- included in their paper. 

[Image: woman-walking.jpg. Occluded woman reconstruction. From the paper.]

I'm impressed.

The how

In the paper, the authors note that they took a new approach to the problem:

. . . by integrating the state-of-the-art of neural network architectures, namely U-nets and GANs, as well as discriminative attribute classification nets, with an architecture specifically designed to de-occlude people shapes.

Definitions are in order. U-nets are a convolutional network architecture originally designed for fast and precise segmentation of biomedical images. GANs (generative adversarial networks) are used in unsupervised machine learning, where two neural networks contest with each other in a zero-sum game: a generator produces images, and a discriminator acts as a quality inspector, rejecting generated images that it can determine are fake and leaving only images that fool it. The discriminative attribute classification nets, as the name suggests, classify attributes of the person in the image, which gives the system a further check on whether a reconstruction is plausible.
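For readers who want to see the zero-sum game in a GAN spelled out, here is a minimal sketch in Python. It uses the textbook GAN value function, not the loss from the Modena paper, which combines several terms. The function names and the sample scores are made up for illustration; each score stands for the probability that the inspecting network assigns to "this image is real."

```python
import math

EPS = 1e-12  # keeps log() finite if a score is exactly 0 or 1


def value(d_real, d_fake):
    """Textbook GAN value: mean(log D(x)) + mean(log(1 - D(G(z)))).

    d_real: scores the inspecting network gives to real images.
    d_fake: scores it gives to generated images.
    """
    real_term = sum(math.log(max(p, EPS)) for p in d_real) / len(d_real)
    fake_term = sum(math.log(max(1.0 - p, EPS)) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def zero_sum_losses(d_real, d_fake):
    """The inspector wants a high value, the generator a low one."""
    v = value(d_real, d_fake)
    return {"discriminator": -v, "generator": v}


# Hypothetical scores: first the fakes are spotted, then they pass as real.
spotted = zero_sum_losses(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
fooled = zero_sum_losses(d_real=[0.9, 0.8], d_fake=[0.9, 0.8])

print("fakes spotted:", spotted)
print("fakes pass as real:", fooled)
```

In both cases the two losses cancel out, which is what "zero-sum" means here: when the generated images start passing as real, the generator's loss falls by exactly the amount the inspector's loss rises.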

The combination is, as the pictures above show, amazingly effective.

[Image: man-walking.jpg]

The Storage Bits take

With remarkable speed, technologists have created the infrastructure for widespread surveillance, which is surely finding adoption in authoritarian countries. That's why I've written about the incredible advances in machine vision.

Here in America, where video is less widely deployed than in, say, the UK, we have a few more years of relative freedom from surveillance technology -- if you don't carry your smartphone. But as history shows, if a power can be abused, it will be.

I used to laugh at movie scenes where grainy surveillance video is repeatedly enhanced to make out the bad guy's face or license plate. But I'm not laughing now.

The combination of 4K surveillance video, the cheap storage that makes it possible to keep that video forever, and the rapid advances in machine vision means that in a few years surveilling several billion people will only take a few thousand people and an enormous cloud infrastructure. Giving this power to governments can't help but have a chilling effect on the masses of people who simply want to be left alone to live their lives.

For in our surveillance future, we'll never be truly alone. Comments welcome!
