A week ago, Silicon Valley startup Cerebras Systems announced the biggest chip the world has ever seen, the "wafer-scale engine," or "WSE." The chip is dedicated to solving machine learning problems in artificial intelligence and could lead to dramatic changes in the way deep learning networks are devised.
Competition is certain to follow, though it will be hard for others to equal what Cerebras has accomplished.
That's the view of a very seasoned venture capitalist who contributed to the roughly $200 million in startup capital Cerebras has received.
"Someone will bite the bullet and come and compete with a similar technology," says Pierre Lamond, who is a general partner with Eclipse Ventures of Palo Alto, California, a short trip from Cerebras's headquarters in Los Altos.
"It will take the competition at least two years to three years to come up with a product similar to ours," says Lamond in a conversation with ZDNet by phone.
Lamond has seen the whole sweep of the chip industry, having come to the US from France as a young man in the early 1960s to work at Fairchild Semiconductor, the precursor to Intel that kicked off the modern semiconductor business. Lamond was also a partner for 27 years at Sequoia Capital, the firm that started the venture capital industry in the valley.
Cerebras's product is not merely impressive, he says, it is the direction the industry must go to avoid the stagnation of current chip development.
"Moore's Law is not bad, but it's slowing down," observes Lamond, referring to the challenges that have limited progress in moving chips to finer and finer features. "The big question for a long time, and this is where Cerebras broke ground, is, OK, are we limited to a two-by-three, or three-by-five centimeter chip, or do we go to something much bigger?
"It became clear that unless you took a risk to make a very, very large single piece, and find a way to make it work as a system, there will be no progress."
Hence, making wafer-scale parts is not only a huge step for the chip industry, but it also "may be the last step" in the evolution of semiconductor process technology, he muses.
Nvidia, the graphics chip maker to which Cerebras constantly compares itself, is "a very good company," says Lamond, and is making "very good devices."
"But they are limited by the size they are working with."
Cerebras, founded three and a half years ago, has solved numerous technical challenges in that time.
Lamond has seen first-hand how prior efforts to create an enormous chip failed. "I can tell you that making a full wafer is not easy," says Lamond.
He recalls talking with Gene Amdahl, the creator of IBM's "System/360" mainframe, who tried and failed in the 1980s to create a wafer-scale part with a startup called Trilogy. "I told him with the yields at the time, it was not possible for this thing to work," recalls Lamond. Several things have changed in the industry in the intervening decades that have made Cerebras's approach more feasible.
Start with the problem of how one creates a coherent system functioning across a wafer 12 inches in diameter that is usually meant to be cut into multiple parts for individual chips. "The way the wafer is organized, there are 84 tiles or bricks, as you might call them, and if one of them is bad, you need to have software to isolate that so it doesn't affect the performance of the whole wafer," explains Lamond.
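The software isolation Lamond describes can be pictured with a minimal sketch. It is not Cerebras's actual mechanism, which the company has not made public. The tile count of 84 comes from Lamond's description. The function names, the round-robin scheduling, and the idea of a flat list of tile indices are all assumptions made purely for illustration.

```python
# Hypothetical sketch: NOT Cerebras's real mechanism, which is not public.
TILE_COUNT = 84  # number of tiles ("bricks") quoted by Lamond


def usable_tiles(defective):
    """Return the tile indices that software would keep in service."""
    bad = set(defective)
    unknown = sorted(t for t in bad if not 0 <= t < TILE_COUNT)
    if unknown:
        raise ValueError(f"tile index out of range: {unknown}")
    return [t for t in range(TILE_COUNT) if t not in bad]


def assign_work(jobs, defective):
    """Spread jobs round-robin over healthy tiles only (assumed policy)."""
    healthy = usable_tiles(defective)
    if not healthy:
        raise RuntimeError("no healthy tiles available")
    plan = {t: [] for t in healthy}
    for i, job in enumerate(jobs):
        plan[healthy[i % len(healthy)]].append(job)
    return plan


# One bad tile (index 17, chosen arbitrarily) is simply never scheduled.
plan = assign_work(jobs=range(166), defective=[17])
print(len(plan), "tiles in use; tile 17 scheduled:", 17 in plan)
```

The point of the sketch is only that the bad tile disappears from the scheduler's view, so the rest of the wafer keeps working as one system.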
The yield, meaning the share of usable surface area, that Cerebras's partner Taiwan Semiconductor Manufacturing is getting in producing the WSE is "amazing," says Lamond. Only about 1% to 2% of the wafer area is unusable in any wafer run, he says. "It's unbelievable, I have told TSMC I am extremely impressed with what they've managed to do with the yield."
Nevertheless, very good yields are not enough.
"The normal idea you have, that if we have a break, we will isolate it using a laser to cut the connection to the other bricks -- you couldn't do that because that would affect a lot more than just the single brick." It was "very tough, very, very difficult to find a way," says Lamond.
A problem in the past was that there were not enough metal wires to interconnect the various sections of the wafer, explains Lamond.
"Today, it is very easy to have multiple layers of metal," he observes. "There are at least a dozen layers of metal in a Cerebras wafer, all used for interconnection." Back when Trilogy was working on the problem, "you could not do that."
There are other innovations in hardware over and above the basic fabrication breakthrough, notes Lamond. Cerebras developed a way to detect, in hardware, zero-valued elements in a neural network to avoid computing them. "It was a very astute development to discover the zeros in hardware, so you don't use software to do it," he says. "It saves cycles. That was a very clever development beyond the silicon development."
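A rough software analogy shows why skipping zeros saves cycles. The WSE does this in hardware, and the article does not say which operands it inspects. The sketch below assumes, for illustration only, that the zeros are in the input activations. The function name `dot_skip_zeros` and the sample values are hypothetical.

```python
def dot_skip_zeros(activations, weights):
    """Dot product that skips multiply-adds whose activation is zero.

    Returns (result, multiply_adds_performed).
    """
    if len(activations) != len(weights):
        raise ValueError("vectors must have the same length")
    total = 0.0
    performed = 0
    for a, w in zip(activations, weights):
        if a == 0:
            continue  # a zero operand contributes nothing, so skip the work
        total += a * w
        performed += 1
    return total, performed


# Assumed example data: a mostly-zero activation vector.
acts = [0.0, 1.5, 0.0, 0.0, 2.0, 0.0]
wts = [0.3, -1.0, 0.8, 0.5, 0.25, 0.9]
result, performed = dot_skip_zeros(acts, wts)
print(result, performed, "of", len(acts))  # -1.0 2 of 6
```

Here only two of six multiply-adds are carried out. In software the zero test costs its own instruction, which is why Lamond stresses that the detection happens in hardware.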
Cooling the chip and connecting it to the outside world are additional challenges with such a large part. That is why Cerebras is a systems company, not a chip company, notes Lamond. Cerebras is building the WSE into a complete computing system for machine learning, rather than selling chips as Nvidia and Intel do as "merchant" semiconductor suppliers. "That chip is never going to be for sale," says Lamond of the WSE.
Consider the heat challenges of a chip that runs at 15 kilowatts. "It consumes a tremendous amount of power, and you have to cool it in a way that is very, very reliable," says Lamond. "You don't want the temperature to go up to 125 degrees Celsius, let's say, keep it less than 75 [degrees Celsius]. And that's not easy in itself, because you have a large surface area that you need to keep at a constant temperature during the whole operation of the system." Cerebras has developed an elaborate network of pipes to carry water to effectively irrigate the WSE.
All of that effort benefits from the fact that AI has come of age as a computing problem. The chip industry needed a product that could benefit from having a very large amount of silicon devoted to it. "Clearly, AI is a good choice," says Lamond. "It is massively parallel, with hundreds of thousands of cores that have to work in parallel, but from a logic point of view, it is not that complicated -- it's just adds followed by multiplies followed by adds," he says, referring to the vector-matrix multiplications that are the focus of machine learning.
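The "adds followed by multiplies" pattern is easy to see when a vector-matrix product is written out by hand. This is a plain-Python sketch for illustration, not code that runs on the WSE. The function name `vec_mat_mul` and the sample numbers are hypothetical.

```python
def vec_mat_mul(x, W):
    """Compute y = x @ W using only multiplies and adds.

    x is a list of length n; W is a list of n rows, each of length m.
    """
    n = len(x)
    if len(W) != n:
        raise ValueError("W must have one row per element of x")
    m = len(W[0]) if n else 0
    y = [0.0] * m
    for j in range(m):
        # Each output y[j] depends on no other output, so in principle
        # a separate core could compute it at the same time as the rest.
        acc = 0.0
        for i in range(n):
            acc += x[i] * W[i][j]  # one multiply followed by one add
        y[j] = acc
    return y


x = [1.0, 2.0]
W = [[1.0, 0.0, 2.0],
     [0.5, 1.0, -1.0]]
print(vec_mat_mul(x, W))  # [2.0, 2.0, 0.0]
```

The logic of each step is trivial. The difficulty is the sheer number of steps, which is why the workload rewards many simple cores working in parallel.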
Lamond sees a brilliant future ahead, with enormous leaps in performance from each new version of the WSE and the system that runs it.
"We can't sit down and say, we've done it, and twiddle our thumbs," he says. "We need to have the Gen Two, and we are working on that now; we have in the works Gen Three, and we are even thinking about Gen Four, because we need to get every year and a half, or two years, a major improvement in the performance."
"People won't buy another system if it's merely 20% better than a previous system," he observes.
Competition is inevitable, says Lamond. For any market to be viable, there have to be some additional suppliers. "Look, if there is no competition, generally there is no market, it is very rare you stay by yourself as a single player in a market."
Scaling the heights of engineering will prove challenging for Nvidia and anyone else who wants to compete with Cerebras, suggests Lamond. "There are no shortcuts," says Lamond. "People know it can be done, but to know how took a lot of work by a bunch of very, very talented engineers."
Competitors can't simply go to Taiwan Semi and order up another wafer-scale chip, because Cerebras, he notes, "ended up having some significant invention that will stay as trade secret."
As he puts it, "They'll have to start from scratch."