In a world dominated by cloud computing, when so much of technology seems diaphanous, fuzzy, opaque, and perhaps ephemeral, it's intriguing to encounter a career engineer who still speaks in the language of the concrete reality of building computers.
Such is the case of Radoslav "Rado" Danilak, a luminary in the semiconductor world, and to many, a very vivid character.
"This is the most crazy project of my life," said Danilak in a recent interview via Zoom with ZDNet. He was referring to trying to build a chip that will run just about every computing operation in the world. It will emulate the instructions of Intel's x86 chips. It will also run the instructions that operate the billions of ARM chips that populate devices large and small. And it will have its own unique instructions that will run a variety of artificial intelligence functions.
The Prodigy processor, as it's called, is that most mythical of Silicon Valley creations, an everything computer that can do all, with amazing cost benefits and energy savings.
To Danilak, who speaks in a rapid-fire mixture of technical brilliance, philosophical rumination, and outright boasting, the most ambitious project can sound like merely something to keep from being bored.
"Some people like fishing, that's their hobby, I love building stuff," he said. "I start getting depressed after a few months, so, let's pick up some insane challenge, let's pick up the problem where the biggest company like Intel is lagging and let's try to solve it."
The biggest challenge, in this case, is Moore's Law, or, rather, the breakdown of Moore's Law. It is widely recognized by anyone outside of Intel that the famous rule of thumb, which says that the number of transistors in a chip doubles every eighteen months while the cost halves, has broken down. Transistors still get smaller, in fact, but the performance increases they used to yield have shrunk, so that today's latest cutting-edge chips from Intel deliver only a fraction of the speed-up that earlier generations did.
"In the Eighties and Nineties, the performance of a single machine grew by about 100 times over ten years," observed Danilak, referring to Intel's x86 improvement. "Now, you get a factor of three to four over ten years, if you are lucky."
The reasons for the breakdown can sound a bit abstruse, but fortunately, Danilak has a habit of reassuring his interlocutor with prompts, using encouraging phrases like, "as you know from high school math."
"As you know from high school, the resistivity is proportional to the cross section," said Danilak. In case you actually don't know any such thing, Danilak is able to unload a data dump of incredible wealth and precision about just what is going on.
The problem is not the transistors Intel is making. It is the wires. Metal wires connect transistors in a chip. But the wires have been getting smaller and smaller along with the transistors. As the wires have gotten smaller, they are creating an obstruction to the flow of bits from transistor to transistor. Think about pulmonary obstruction and stents.
"The wire got smaller faster than the transistor sped up," explained Danilak. "So all the tens of thousand of engineers at Intel, they were just fighting the wires to get some incremental small performance improvement."
"That's the reality, it's device physics," said Danilak, with the elegant thud of a closing measure by Beethoven after a beautiful passage of soaring technical detail.
To address that problem, the Prodigy chip is doing something clever: instead of figuring out with each tick of the chip's clock what instructions to send where, it uses smart compiler techniques to decide in advance what is the best way to pack chip instructions into clumps, known as bundles. Packing the instructions in a smart way means fewer trips back and forth around the processor, reducing the amount of traffic on the wires.
"The problem is the distance" on the wires, explained Danilak, "So, we asked a question: Can we separate computation from communication? If you get the data in right place, then we don't need to move it."
To keep results of calculations in the chip near where they need to be at each moment, Tachyum's software "reorganizes the flow graph of execution," is how Danilak describes it.
"It turns out that 93% of the cases, our compiler succeeds in placing instructions in the same execution unit, so we don't move the data, and if we don't move the data, we don't charge and discharge wires, and we get lower power in the process."
To be fair, the chip doesn't even exist yet. It has been demonstrated in software. Next month, Danilak and team expect to have a working prototype built in an FPGA, or field-programmable gate array, a kind of chip that companies use to make a trial run of a chip design before building the real thing.
By December, Danilak told ZDNet, the company will "tape out," industry jargon for sending the plans for a chip to the factory. Three or four months later, samples are reviewed. If all looks good, sometime in the latter half of next year, Prodigy will go into mass production.
Danilak already has a plan for how manufacturing of the chip will be more cost-effective. Different versions of the chip will be made with different features on the same silicon wafer. That way, if there are any defects in manufacturing, which always happens with chips, they can be ameliorated by making some chips that have lower performance and selling them at a lower price.
"If you have a defect in one half of the chip, you don't throw them in the trash, this is half-performance, half-capacity chip, we pull that from the trash bin."
Such cheap chips can be a low-cost replacement for Intel's Xeon, Danilak insists. The Prodigy portfolio of chips will thus scale from chips costing hundreds of dollars to chips costing several thousands of dollars, depending on their performance capabilities.
Now, a chip that is all things to all people is a lofty promise, and will incur skepticism. Danilak's description of the production schedule is at least a year behind what industry observers had originally expected a couple of years ago. And the ability of the chip to emulate instructions of both x86 and ARM chips has raised eyebrows. Such backward compatibility has always come with trade-offs that make universality less appealing.
Nor has Danilak revealed any hard benchmarks for the performance of the chip. The Prodigy chip tends to be described as being at least an order of magnitude faster than Intel chips, which is way too vague for most chip industry folk.
All those aporias, those question marks, stand in contrast to recent progress in the marketplace by competitors such as Cerebras Systems and Graphcore, two promising startups that are already shipping novel AI computers with novel chips.
To Danilak, skepticism is to be expected in proportion to ambition.
"Very few people go against Intel, take on the problems everybody knows in device physics, and that many people tried and failed to do it," conceded Danilak.
The whole venture is perhaps somewhat less improbable given that Danilak is quite accomplished, and given that he has assembled an all-star cast, an A Team of the chip world.
Danilak has worked all over Silicon Valley, including a stint at Nvidia from 2003 to 2007, during which time he helped to develop what's called the "General-Purpose GPU," or GP-GPU, graphics chips that can handle heavy tasks such as data analytics and high-performance computing, not just video games.
Beyond that work experience, he is a serial entrepreneur who has founded and sold numerous companies, including SandForce, a flash storage company sold to chip maker LSI Logic in 2010, and Skyera, another flash company, sold to Western Digital in 2015.
The supporting cast is impressive. Co-founder Rodney Mullendore is also a seasoned chip veteran, with extensive work on government technology projects as well, such as the Department of Energy's telemetry systems. Co-founder Igor Shevlyakov is another veteran with special expertise in compiler technology thanks to a stint at software maker Wind River.
The person in the hot seat for making sure the Prodigy chip hits its deadlines, Krishna Thatipelli, has worked at almost every major chip company in the Valley over decades.
Advisors to Tachyum include AMD veteran Fred Weber and ARM veteran Steve Furber.
"We are the best of Nvidia, the best of ARM, and the best of Intel, thats the good ingredients you have," is how Danilak summarized it.
Danilak has targeted among his first customers the so-called hyper -scale data center operators, such as Facebook. They have remnant capacity in their data centers because their load on the chips they buy from Intel and Nvidia is less than optimal. A universal chip such as Prodigy can squeeze many more cycles out of those servers, contends Danilak, because it will be versatile enough to switch between kinds of computing tasks.
"Average utilization is less than 40%, and they are built for peak demand," explained Danilak, referring to data centers, based on publicly available data published by Facebook. "A $4 billion asset is used less than 50%."
"Now imagine," he said, "you had a chip where instead of turning off the server, you could use it for AI training" in the slow periods. Such a chip gives you something for nothing, he contends, because it recovers lost utility. "We are competing at zero cost." That's to say nothing of the savings on power, said Danilak, as his more-efficient chip reduces operating power budgets.
No insane challenge would be complete, of course, without AI.
Thousands of Prodigy chips, operating in concert, will be power-efficient enough to simulate the human brain, Danilak believes. Tachyum advisor Steve Furber, who is a professor at the University of Manchester in the U.K., helped build something called SpiNNaker, a parallel-processing technology that emulates neuronal signaling. He also is an advisor to the Human Brain Project, a European Union collaboration to simulate the 86 billion neurons in the brain.
"You can do it with one billion ARM [chips], but it's not practical," said Danilak of the brain simulations. "You'll need a way-faster machine to keep it reliable," such as a universal chip like Prodigy.
To realize even the lofty ambition of shipping in volume, Tachyum will certainly need more money. The four-year-old company has thus far raised only a modest amount, $25 million in total, from equity fund IPM Growth. The rule of thumb for chip companies is that they need at least $100 million to get through production and set themselves up for follow-on versions of their product.
All that is academic, in a sense, for Danilak. He has not had trouble raising money in the past. For the moment, he is balancing the expectations of investors with the engineering thrill of tackling difficult problems. "Our investors are kind of drooling that we should go IPO in ," he said. "Hopefully they are right."
For a seasoned entrepreneur, whose intellectual acumen sits beside his M&A savvy, there are many ways to be proven correct.
"Intel is like this big, sick puppy," he observed. "Who knows: they might need new blood for their lives."
"So, you know, they might be, down the road, sitting around the table when the company will go public or get acquired."