Facebook’s Yann LeCun says ‘internal activity’ proceeds on AI chips
Yann LeCun, the leader of Facebook’s AI Research unit, said the company is proceeding with internal activity on machine learning chips, though he is confident industry will come up with many new solutions to push forward processing of deep learning in specialized silicon.
In an interview with ZDNet at the International Solid State Circuits Conference in San Francisco, Yann LeCun, head of Facebook's AI Research team, said the company has its own work on chips for machine learning underway, though he sounded hopeful and confident that many solutions will come from the chip industry.
"Certainly, it makes sense for companies like Google and Facebook that have high volume to work on their own engines if the industry doesn't provide it," said LeCun, referring to the need for new kinds of processing technology optimized for deep learning.
"What Facebook has done traditionally is to partner with hardware vendors to entice them to build the stuff we think is good for us," he said.
LeCun has alluded in prior interviews to the possibility of internal chip development. However, "the difference now is that there is internal activity on this, which was very nascent at the time," he said, referring to four years ago. Asked to expand upon what that internal activity includes at Facebook, LeCun demurred. "I probably wouldn't tell you," he said with a laugh.
LeCun cited Google's Tensor Processing Unit, or TPU, as an example of the kind of internal effort by big companies he is talking about.
More broadly, LeCun reiterated the main points of his plenary talk on the need for a broad industry effort to address deep learning silicon at many points throughout the training and inference process.
For example, there is a compelling need for low-power chips that can process the sensor data coming from mobile devices on the devices themselves, rather than sending that data to the cloud.
Then there are areas of compute "in the middle," such as traditional "offline" training of neural nets in the cloud and traditional inference in the cloud. Both tasks consume a lot of energy, so here, too, industry needs to supply more energy-efficient processing.
At the highest end of the deep learning food chain, in the R&D departments of Facebook and others, there is a need for more options beyond the dominant supplier Nvidia, whose GPUs are the de-facto solution for neural net training.
"At the very high end, what we need are competitors to the dominant supplier at the moment," said LeCun. "Not because they are not good at it, but because they make assumptions and it would be nice to have a different set of hardware that makes different assumptions that can be used for complementary things that the current crop of GPUs are good at."
However, how those alternative chips should be structured is an open question, said LeCun. It is clear that tomorrow's neural nets will be vastly bigger than today's, he said, because of tasks such as predicting motion from a video clip, which require taking in an entire video feed and its enormous number of pixels. But at the same time, such operations may have to be computed in a processing architecture that is different from today's matrix-multiply hardware. Matrices and tensors, the building blocks of today's AI hardware, will probably not be the ideal solution in the future, he said.
State-of-the-art chips are "basically optimized to do lots of four-by-four matrix multiplies," said LeCun. "So, if you can reduce all of your neural nets to four-by-four matrix multiplies, okay. But it may not be an optimal way to do lots of convolutions," he said.
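To make the reduction LeCun describes concrete, here is a minimal sketch of how a convolution can be rewritten as a matrix product. It is an illustration only, not a description of any particular chip or library: the function names (`conv2d_direct`, `im2col`, `conv2d_as_matmul`) and the tiny 4-by-4 input are assumptions chosen for this example, and the "convolution" is the cross-correlation form used in deep learning.

```python
def conv2d_direct(image, kernel):
    """Valid 2-D cross-correlation computed with explicit sliding windows."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def im2col(image, kh, kw):
    """Copy every kh-by-kw patch of the image into one row of a dense matrix."""
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [image[i + a][j + b] for a in range(kh) for b in range(kw)]
        for i in range(out_h) for j in range(out_w)
    ]


def conv2d_as_matmul(image, kernel):
    """Same result as conv2d_direct, expressed as (patch matrix) x (flat kernel)."""
    kh, kw = len(kernel), len(kernel[0])
    out_w = len(image[0]) - kw + 1
    patches = im2col(image, kh, kw)
    flat_kernel = [w for row in kernel for w in row]
    # One dot product per patch row; with several kernels this becomes
    # a full matrix-matrix multiply.
    flat_out = [sum(p * w for p, w in zip(patch, flat_kernel)) for patch in patches]
    return [flat_out[r:r + out_w] for r in range(0, len(flat_out), out_w)]


# Hypothetical toy input, chosen only for illustration.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 0],
          [0, -1]]

print(conv2d_direct(image, kernel))     # [[-5, -5, -5], [-5, -5, -5], [-5, -5, -5]]
print(conv2d_as_matmul(image, kernel))  # same values
print(len(im2col(image, 2, 2)) * 4)     # 36 stored values, up from 16 pixels
```

The two functions agree, but the matrix form has to copy overlapping patches (36 values for a 16-pixel input here). That duplication is one plausible reading of why forcing convolutions into matrix multiplies "may not be an optimal way" to compute them.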
What could replace matrix-multiply hardware? "I don't know. I think the real hardware geniuses will have to invent new ways to do those things," he said.
"To some extent you could think of this as a similar set of operations that we currently do in neural nets, except that the way you access the data is through interactions; instead of things coming to you in a neat array, what you have is an array of pointers to fetch the data," LeCun said, describing workloads such as processing graph-based data.
That could entail ways to optimize memory traffic, including "smart caching" or exploring parts of the graph ahead of time, said LeCun.
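The access pattern LeCun contrasts with "a neat array" can be sketched in a few lines. This is an illustrative assumption about what pointer-based access might look like for graph data, not anything LeCun or Facebook specified: the names `aggregate_neighbors` and `dense_sum`, the four-node graph, and the single shared weight are all hypothetical, and the sketch does not model caching or prefetching.

```python
def dense_sum(rows, weight):
    """Multiply-adds over operands that already sit in contiguous rows."""
    return [sum(weight * x for x in row) for row in rows]


def aggregate_neighbors(features, neighbors, weight):
    """The same multiply-adds, but each operand is fetched through an index.

    neighbors[n] is the list of node ids adjacent to node n, which plays the
    role of the "array of pointers" in the quote above.
    """
    out = []
    for node_neighbors in neighbors:
        acc = 0.0
        for idx in node_neighbors:
            acc += weight * features[idx]  # indirect load, then multiply-add
        out.append(acc)
    return out


# Hypothetical four-node graph with one scalar feature per node.
features = [1.0, 2.0, 3.0, 4.0]
neighbors = [[1, 2], [0], [0, 3], [2]]

print(aggregate_neighbors(features, neighbors, 0.5))  # [2.5, 0.5, 2.5, 1.5]

# Gathering the operands into dense rows first gives the same answer;
# the arithmetic is identical and only the memory traffic differs.
gathered = [[features[i] for i in ids] for ids in neighbors]
print(dense_sum(gathered, 0.5))                       # [2.5, 0.5, 2.5, 1.5]
```

The arithmetic is the same in both versions. What changes is that the gathered version reads memory in an irregular, data-dependent order, which is where techniques such as the caching and look-ahead LeCun mentions would come in.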
"In the end, it's going to be multiply-adds," said LeCun. "The question is, can you put it neatly in the form of either a bunch of dot-products, or a bunch of matrix-vector multiplies, or a bunch of matrix multiplies," he explained. "The current assumption is that you can reduce it to matrix multiplies. I don't think that is going to survive."
Whatever forms the new chip architectures take, LeCun believes AI-focused hardware may take over more and more of the overall computing workload.
"I wouldn't try to speculate in terms of dollars, but in terms of FLOPS, or operations, if you go five, ten years in the future, and you look at what do computers spend their time doing, mostly, I think they will be doing things like deep learning, most of them. In terms of computation, not in terms of revenue, profit, number of computers, number of devices, but in terms of how are we spending our milliwatts or our FLOPS, they will be spent on neural nets."