SambaNova claims AI performance rivaling Nvidia, unveils as-a-service offering

The computer maker has made its custom machine generally available for purchase, but also is offering it on a rental basis for $10,000 per month.
Written by Tiernan Ray, Senior Contributing Writer

SambaNova says just one quarter of a rack's worth of its DataScale computer can replace 64 separate Nvidia DGX-2 machines taking up multiple racks of equipment when crunching deep learning tasks, such as natural language processing on neural networks with billions of parameters, including Google's BERT-Large.

SambaNova Systems

The still very young market for artificial intelligence computers is spawning interesting business models. On Wednesday, SambaNova Systems, the Palo Alto-based startup that has received almost half a billion dollars in venture capital money, announced general availability of its dedicated AI computer, the DataScale. It also announced an as-a-service offering in which the machine is placed in your data center and you rent its capacity for $10,000 a month.

"What this is, is a way for people to gain quick and easy access at an entry price of $10,000 per month, and consume DataScale product as a service," said Marshall Choy, Vice President of product at SambaNova, in an interview with ZDNet via video. 

"I'll roll a rack, or many racks, into their data center, I'll own and manage and support the hardware for them, so they truly can just consume this product as a service offering." The managed service is called dataflow-as-a-service, a play on the company's pitch that its hardware and software reroute themselves based on the flow of AI models put into the system.

The DataScale computer goes up against graphics chips by Nvidia that dominate training of neural networks.

Like fellow startups Graphcore and Cerebras Systems, SambaNova has taken a systems approach to AI, building a complete, finished machine with custom chips, firmware, software, and data and memory I/O subsystems rather than simply competing with Nvidia by selling cards. Even Nvidia recently rolled out its own dedicated AI appliance computer.

The DataScale system is advertised as being comparable to sixty-four of Nvidia's DGX-2 rack-mounted systems running the A100 GPU, but in only one quarter of a standard telco rack. 

The computer uses a custom chip with reprogrammable logic, called the Reconfigurable Data Unit, or RDU. It has its own software system called SambaFlow to lay out convolution or other deep learning operations in a way that uses the multiple RDUs. It has a high-speed fabric to connect the RDUs.

And SambaNova even re-wrote some core applications, including natural language processing, to make them more efficient on benchmark tests.  

"We think of this as one of the largest transitions in the data center that we've seen in a very long time," said the company's co-founder and CEO, Rodrigo Liang, in the same video session. ZDNet spoke with Liang back in February, when details of the machine were still under wraps. Liang reiterated a claim made in February, that the focus of SambaNova is to effect a broad, deep change in computing overall.


"We are excited because we are uniquely positioned to be able to do something like this because we have ownership of all those layers of the stack," says Rodrigo Liang, center, co-founder and CEO of SambaNova Systems, with co-founders, left, Kunle Olukotun, and right, Christopher Ré. 


But the emphasis at the moment is making it "quick and easy" to get going with AI, as Choy puts it. 

"We are excited because we are uniquely positioned to be able to do something like this because we have ownership of all those layers of the stack," said Liang. "We aren't just building a chip that goes into somebody else's system, we are building everything all the way to the rack with the software integrated, and then application as well."

As an example of the out-of-the-box ease, SambaNova is claiming superior benchmark results compared to Nvidia. When training Google's BERT-Large natural language neural network, a version of the widely popular Transformer language model, SambaNova claims throughput of 28,800 samples per second, versus 20,086 samples per second on Nvidia's A100-based DGX.
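To put those two claimed figures in proportion, here is a quick back-of-the-envelope calculation. It is only a sketch: it uses nothing beyond the two throughput numbers quoted above, which are SambaNova's own claims, and the variable names are illustrative.

```python
# Throughput figures as quoted in the article (samples per second,
# BERT-Large training). These are vendor claims, not measured here.
sambanova_sps = 28_800
nvidia_sps = 20_086

# Ratio of the two claimed throughputs.
speedup = sambanova_sps / nvidia_sps

# The same ratio expressed as a percentage advantage.
percent_faster = (speedup - 1) * 100

print(f"Speedup: {speedup:.2f}x ({percent_faster:.0f}% higher throughput)")
```

By that arithmetic, the claimed advantage works out to roughly 1.43 times the Nvidia figure, or about 43% more samples per second.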

The company, which is hosting a developer event for its software tools this week, even takes pre-built models such as Hugging Face, one of the most popular chat bots, and provides them in a pre-trained version that can be downloaded and run on the SambaNova machine.

"SambaFlow lets you take these existing models and get state-of-the-art results in seconds," said Liang.

Stand-alone pricing to own the DataScale is comparable to Nvidia's DGX-2, said Choy. 

An early customer that has purchased the system outright is Argonne National Labs. The lab, part of the U.S. Department of Energy, has worked with SambaNova on the kinds of gigantic projects that are the mission of the National Labs of DoE, such as COVID-19 research. 

"SambaNova is designed in some ways to bracket the performance that people typically see from GPUs," said Rick Stevens, associate laboratory director at Argonne, in an interview with ZDNet via video. "You can think of it as scaling above GPUs and below GPUs in a very efficient way."


The DataScale consists of multiple custom, reconfigurable processors called RDUs, joined together over a special high-speed fabric.

SambaNova Systems.

"It also has a very large memory, so you can train models that won't fit in a GPU." Stevens said the internal architecture connecting the processors with one another, and interleaving memory, has "headroom" to expand over time.

Some deep learning neural networks may benefit more than others on the machine, a process Stevens said the lab is still figuring out. Argonne is running a variety of problems on cancer research, astronomy, and fusion reactors among others. They take advantage of different neural networks, including convolutional neural networks and something called "tomography GAN," a form of generative network. 

"It's certainly doing better than GPUs on these problems," said Stevens. A primary reason is that there isn't a memory plateau, as there is with GPUs, which hit the memory limit on the GPU card. "With the SambaNova, it's much smoother, you can explore a much wider range of number of model parameters." The larger memory also means you don't have to explicitly break up code into parallel operations, a particularly taxing dev task.

"We're in a process of learning, which may take a year, how we can evolve neural architectures to best take advantage of the hardware," he said. The large memory means very large language neural networks, such as large Transformers, benefit in particular, but also very large generative models. He mentioned vector-quantized auto-encoders as another potential beneficiary. 

Argonne is evaluating SambaNova, Cerebras and other AI accelerators, with the intention of eventually making the systems available to various collaborators throughout the world. Stevens foresees technologies such as SambaNova's, as well as accelerators from Cerebras, Graphcore, Groq, or Intel's Habana unit, being built together in an exascale system.

"Future large-scale machines might have AI complexes as part of those procurements," said Stevens. "We are actively working on what are we going to deploy over the next five years as large-scale computing resources for the DoE. And one of the questions is what will be the mix of architectures for that."

"They're just building blocks," said Stevens of the various accelerators, including SambaNova's. "You would challenge the integrator to think about how a system should be constructed that ties these things together, where you might want to drive the AI engine from an AI application that is running on a smaller part of the machine, in a tight loop of training or inference," said Stevens. "That is where things are going."

Stevens talked with ZDNet earlier this year about work Argonne has done speeding up COVID-19 research with the Cerebras computer, the CS-1. When asked how the two machines compare, Stevens declined to make comparisons between SambaNova and Cerebras.
