Nvidia AI research points to an evolution of the chip business
Chip giant Nvidia’s head of applied machine learning research, Bryan Catanzaro, tells ZDNet how the entire process of graphics rendering is changing to one driven more by network models, less by hand-crafted features, and how it is already changing the company’s chip designs.
"We would love for model-based to be more of the workload," Catanzaro told ZDNet this week during an interview at Nvidia's booth at the NeurIPS machine learning conference in Montreal. Catanzaro was the first person doing neural network work at Nvidia when he took a job there in 2011 after receiving his PhD from the University of California at Berkeley in electrical engineering and computer science.
Model-based is a shorthand for replacing some of what used to be explicit programming with neural networks that infer the way to solve a computing problem. There were examples of that shift at Nvidia's booth that have broad implications for computing.
An example is a paper presented at the conference this week called Video to Video Synthesis, authored by Catanzaro and colleagues, along with a researcher from MIT's Computer Science and Artificial Intelligence Lab. The work seeks to synthesize videos of a street scene by "predicting future frames of video."
Traditionally, such tasks would be programmed by hand, the task of "rendering" being a laborious one. "Today video is triangle by triangle, and can cost a million dollars" or more, says Catanzaro.
Instead, the new approach takes videos of street scenes and feeds some of the frames to a generative adversarial network, or GAN, which then predicts frames of video. It builds upon prior work that synthesized images of things given a few samples.
The important part is that the GAN is figuring out the task of rendering, replacing laborious physics specifications in the traditional approach.
Nvidia has applied the approach to video game-style simulations, with some interesting results. In the Nvidia booth at NeurIPS, an arcade-style driver's seat was set up and visitors had the chance to drive through a simulated street scene. The street scene in this case showed some of the current limitations of the state of the art. Rather than looking photo-realistic, it had the feel of a watercolor painting, with colors and textures of buildings and cars shifting as one drove through the simulation.
Those artifacts, says Catanzaro, reflect problems in translating from the "Unreal Engine 4" 3D system that is used to train the GAN in the case of the driving simulation. "When Unreal gives us the rendered world to begin with, and it doesn't do the full rendering, it just creates a sketch, it's too precise and everything is perfect.
"But lines and sketches from real videos are actually better, they are a little wavy; we need the rendering to be more like real video."
Catanzaro describes what sounds like a kind of dialectical process in future, where the source 3D rendering and the resultant generated rendering somehow make each other better. "We like to think of it as bootstrapping," says Catanzaro.
All this has implications for Nvidia's chip business: Catanzaro's boss is Jonah Alben, the head of GPU architecture at Nvidia. There is a link between what is learned about neural nets' capabilities and how that knowledge finds its way into silicon.
Specifically, the replacement of hand-coding in video rendering points the way to a time when model-based approaches will need more and more dedicated neural net circuitry.
"Computations in AI are a better match for semiconductor physics than traditional render tasks," says Catanzaro, comparing the traditional GPU work of "shader" units to the multiplier-accumulator tasks used for many neural network applications.
"The MAC [multiply-accumulate operation] is compute-bound rather than communications-bound," he explains. And so, "as transistors and wires get smaller, wires don't get small as fast as transistors," which means the traditional rendering pipeline gets gummed up by the problem of moving data over wires.
The reason Catanzaro is just fine with that is that Nvidia's chips have been changing their stripes more than some realize. Both of Nvidia's main chip architectures, Turing and Volta, feature tensor cores, which can operate directly on the tensor structures that compose the multiply-accumulate operations of the neural nets.
More and more, such tensor capability is taking over transistor space on the silicon die from traditional shader circuitry of a GPU.
"Tensor cores are a better fit for the future of semiconductor manufacturing technology," says Catanzaro. "They are more energy efficient, and I believe they will scale better than traditional GPUs."
He notes that a Turing-based device can produce 16 trillion floating point operations per second in its traditional shader units, but the tensor cores beat that by a wide margin with 130 trillion per second, roughly eight times as many.
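To make the compute-bound point concrete, here is a minimal Python sketch of a matrix multiply written as explicit multiply-accumulate steps, plus a rough count of how many MACs are performed per value moved. The function names `matmul_mac` and `macs_per_value` are made up for illustration, and the count assumes an idealized case in which each input and output value is moved only once; it is not a model of Nvidia's hardware.

```python
def matmul_mac(a, b):
    """Multiply a (n x k) by b (k x m) using explicit multiply-accumulate steps.

    Returns the product and the number of MAC operations performed.
    """
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    macs = 0
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]  # one multiply-accumulate
                macs += 1
            out[i][j] = acc
    return out, macs


def macs_per_value(n):
    """MACs per value moved for an n x n by n x n product.

    Assumption: every value is read or written exactly once (ideal reuse).
    """
    macs = n ** 3
    values_moved = 3 * n * n  # read two input matrices, write one output
    return macs / values_moved
```

Under that assumption the ratio works out to n / 3, so `macs_per_value(3)` is 1.0 while `macs_per_value(300)` is 100.0: the larger the matrices, the more arithmetic is done for each value that has to travel over a wire.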
There is, of course, a raft of startups aiming to take business from Nvidia by arguing that the GPU is now less than ideal for AI, including Bristol, England-based Graphcore.
But the proliferation of tensor cores in Nvidia chips means, "We agree with them more than they agree with their own statements," Catanzaro says with a smile.
Nvidia is making an end run around such competition, he suggests. "It's fine to talk about these systolic arrays," assemblages of multiplier-accumulator units that the startups tout, he says. "But we have been shipping this in silicon for years."
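For readers unfamiliar with the term, a systolic array can be sketched as a grid of multiply-accumulate cells that pass operands to their neighbors on every clock cycle. The toy Python simulation below is only an illustration: the function name `systolic_matmul`, the choice of an "output-stationary" layout (each cell keeps its own running sum), and the input skewing are assumptions made for this sketch, not a description of Nvidia's tensor cores or of any startup's chip.

```python
def systolic_matmul(a, b):
    """Simulate an n x m grid of MAC cells computing a (n x k) times b (k x m).

    Values of `a` enter from the left edge and move right one cell per cycle;
    values of `b` enter from the top edge and move down one cell per cycle.
    Row i and column j are delayed by i and j cycles so matching operands meet.
    """
    n, k, m = len(a), len(b), len(b[0])
    acc = [[0] * m for _ in range(n)]    # running sum held in each cell
    a_reg = [[0] * m for _ in range(n)]  # operand of `a` currently in each cell
    b_reg = [[0] * m for _ in range(n)]  # operand of `b` currently in each cell
    cycles = n + m + k - 2
    for t in range(cycles):
        # Shift `a` operands one cell to the right and feed the left edge.
        for i in range(n):
            for j in range(m - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]
            p = t - i
            a_reg[i][0] = a[i][p] if 0 <= p < k else 0
        # Shift `b` operands one cell down and feed the top edge.
        for j in range(m):
            for i in range(n - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]
            p = t - j
            b_reg[0][j] = b[p][j] if 0 <= p < k else 0
        # Every cell performs one multiply-accumulate per cycle.
        for i in range(n):
            for j in range(m):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc, cycles
```

In this sketch each operand is fetched from outside the grid once and then handed from cell to cell, which is why designs of this kind are described as keeping data movement short relative to the amount of arithmetic done.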