The march of specialized chips for artificial intelligence continues unabated, and reports from some luminaries of the semiconductor industry point to a broadening out of the movement of machine learning parts.
The well-regarded chip-industry newsletter Microprocessor Report this week writes that chips from cloud-computing operators such as Amazon and electronics giants such as Huawei are showing impressive results against the CPUs and graphics processing units (GPUs) that tend to dominate AI in the cloud. (Microprocessor Report articles are only available via subscription to the newsletter.)
And a think-piece this month in the Communications of the ACM, from two legends of chip design, John L. Hennessy and David A. Patterson, argues that circuits for machine learning represent something of a revolution in chip design broadly. Hennessy and Patterson last year received the prestigious A.M. Turing Award from the ACM for their decades of work on chip architecture.
In the Microprocessor Report editorial, the newsletter's principal analyst, Linley Gwennap, describes the rise of custom application-specific integrated circuits (ASICs) for the cloud with the phrase "when it rains, it pours." Among the rush of chips is Amazon's "Graviton," which is now available in Amazon's AWS cloud service. Another is the "Kunpeng 920" from Chinese telecom and networking giant Huawei, which intends to use the chip both in its line of server computers and as an offering in its own cloud-computing service.
Both companies intend to follow up with more parts: Amazon with a deep-learning accelerator called "Inferentia," and Huawei with the "Ascend 310," a part for neural-network inference, the phase of AI in which the chip answers questions on the fly. A further Huawei part, the "Ascend 910," is a "massive datacenter chip," as Gwennap describes it.
In the case of both Amazon and Huawei, the issue is Intel's Xeon processor's lock on inference, and Nvidia's GPUs' lock on cloud-based training.
"Cloud-service providers are concerned about Intel's near-100% share of server processors and Nvidia's dominance in AI accelerators," writes Gwennap. "ASICs offer a hedge against price increases or a product stumble from either vendor."
While the ASICs won't easily match the Xeon's performance, "The strong performance of the Ascend 910 shows Nvidia is more vulnerable," he writes.
The essay by Hennessy and Patterson takes a longer view. The problem for the chip industry, they write, is the breakdown of Moore's Law, the famous observation that transistor counts double roughly every two years, as well as the end of Dennard Scaling, which held that power density stays constant as transistors shrink, so chips grow steadily more energy-efficient. At the same time, Amdahl's Law, the rule of thumb that says the bottleneck in processor performance is the fraction of sequential, rather than parallel, operations that must be computed, is in full effect. All of that means chip design is in something of a crisis, but one that also presents opportunity.
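Amdahl's Law can be made concrete with a quick calculation. The standard formula caps overall speedup at 1 / ((1 − p) + p/n), where p is the parallelizable fraction of the work and n the number of processors; a minimal sketch (the function name is ours, for illustration):

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Overall speedup when only `parallel_fraction` of the work
    can be spread across `n_processors` (Amdahl's Law)."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_processors)

# Even with 95% of a workload parallelized, speedup is capped
# by the remaining 5% of sequential work:
print(round(amdahl_speedup(0.95, 16), 1))    # → 9.1
print(round(amdahl_speedup(0.95, 1024), 1))  # → 19.6
```

No matter how many processors are added, the speedup here can never exceed 20x, which is why the sequential bottleneck looms so large once Moore's Law and Dennard Scaling stop delivering free gains.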
Basically, chip design has to move away from general-purpose parts to specialization, they argue. The death of Moore's Law and Dennard Scaling "make it highly unlikely, in our view, that processor architects and designers can sustain significant rates of performance improvements in general-purpose processors."
Instead, they see a continued move to "domain-specific architectures," of which AI chips are one prominent example. The DSA chips can make use of lots of tricks that don't work for general-purpose processors, such as a compiler approach for code called "very-long instruction-word," or VLIW.
"VLIW processors are a poor match for general-purpose code but for limited domains can be much more efficient, since the control mechanisms are simpler," they write.
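The VLIW idea the authors invoke is that the compiler, not the hardware, decides which operations run in parallel, by statically packing independent operations into fixed-width instruction bundles. A toy sketch of that scheduling step, using a hypothetical three-address mini-ISA of our own invention:

```python
# Toy illustration of VLIW bundling: the compiler statically packs
# independent operations into fixed-width bundles, so the hardware
# needs no dynamic scheduling logic. (Hypothetical mini-ISA.)

ISSUE_WIDTH = 3  # functional units available per cycle

# (dest, op, src1, src2); string operands are registers written earlier
program = [
    ("a", "add", 1, 2),
    ("b", "mul", 3, 4),
    ("c", "add", 5, 6),
    ("d", "add", "a", "b"),  # depends on a and b
    ("e", "mul", "c", "d"),  # depends on c and d
]

def pack_bundles(ops, width):
    """Greedily bundle ops whose inputs have already been computed."""
    bundles, done, pending = [], set(), list(ops)
    while pending:
        bundle = []
        for op in pending:
            deps = {s for s in op[2:] if isinstance(s, str)}
            if len(bundle) < width and deps <= done:
                bundle.append(op)
        pending = [op for op in pending if op not in bundle]
        done |= {op[0] for op in bundle}
        bundles.append(bundle)
    return bundles

for cycle, bundle in enumerate(pack_bundles(program, ISSUE_WIDTH)):
    print(cycle, [op[0] for op in bundle])
# → 0 ['a', 'b', 'c']
#   1 ['d']
#   2 ['e']
```

The five operations fit into three bundles: the first three are independent and issue together, while the dependent ones wait. All of that reasoning happens at compile time, which is exactly why the authors say the control mechanisms of the chip itself can be so much simpler.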
Not only will DSAs serve AI well, but the authors predict they may be better than general-purpose processors at securing code, avoiding the recent chip exploits such as Spectre and Meltdown.
Remember that Patterson was a key player in designing Google's Tensor Processing Unit, or TPU, chip, a prime example of an AI-centric DSA. The authors cover the details of the TPU in the article.
Aside from the TPU, and Nvidia GPUs, and Intel's own field-programmable gate arrays, and other offerings from the tech giants, there are "dozens" of startup companies that are "interconnecting hundreds to thousands of such chips to form neural-network supercomputers," Hennessy and Patterson observe.
They see more and more designs coming from startups, given that it's relatively inexpensive to design and fabricate the narrower DSAs compared to a general-purpose part. They also make a pitch for the "RISC-V" standard for chip instructions, which lets many chip companies extend a common base set of instructions to tune parts for a given domain. Hennessy and Patterson write that the new era of chip design is akin to agile development in software, with lots of iterations that get new chips out the door fast and then improve from there.
The duo see a bright future for innovation. "This avalanche of [deep neural network] architectures makes for interesting times in computer architecture.
"It is difficult to predict in 2019 which (or even if any) of these many directions will win, but the marketplace will surely settle the competition just as it settled the architectural debates of the past."