MLPerf benchmark results showcase Nvidia's top AI training times

For the first release of MLPerf, an objective AI benchmarking suite, Nvidia achieved top results in six categories.

The first MLPerf benchmark results are in, offering a new, objective measurement of the tools used to run AI workloads. The results show that Nvidia, up against solutions from Google and Intel, achieved the best performance in the six categories for which it submitted.

MLPerf is a broad benchmark suite for measuring the performance of machine learning (ML) software frameworks (such as TensorFlow, PyTorch and MXNet), ML hardware platforms (including Google TPUs, Intel CPUs and Nvidia GPUs) and ML cloud platforms. Several companies, as well as researchers from institutions like Harvard, Stanford and the University of California, Berkeley, first agreed to support the benchmarks in May. The goal is to give developers and enterprise IT teams information to help them evaluate existing offerings and focus future development.

The newly published results pertain to training ML models. The metric is the time required to train a model to a target level of quality. The benchmark suite consists of seven categories: image classification, object detection, speech recognition, translation, recommendation, sentiment analysis and reinforcement learning.
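To illustrate what a "time to target quality" metric means in practice, here is a minimal Python sketch. It is not MLPerf's reference harness: the function name time_to_quality, the ToyModel class, and the per-epoch evaluation schedule are all hypothetical, chosen only to show the idea of running the clock until a quality threshold is first reached.

```python
import time
from typing import Callable, Optional


def time_to_quality(
    train_one_epoch: Callable[[], None],
    evaluate: Callable[[], float],
    target: float,
    max_epochs: int = 100,
    clock: Callable[[], float] = time.perf_counter,
) -> Optional[float]:
    """Return seconds elapsed until evaluate() first reaches target.

    Returns None if the target is not reached within max_epochs.
    (Hypothetical helper; evaluating once per epoch is an assumption.)
    """
    start = clock()
    for _ in range(max_epochs):
        train_one_epoch()
        if evaluate() >= target:
            return clock() - start
    return None


class ToyModel:
    """Stand-in for a real model: quality rises by 0.25 per epoch."""

    def __init__(self) -> None:
        self.quality = 0.0

    def train_one_epoch(self) -> None:
        self.quality += 0.25

    def evaluate(self) -> float:
        return self.quality
```

With a real workload, train_one_epoch and evaluate would wrap the framework's own training and validation loops, and the target would be the quality threshold set for that benchmark category.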

Nvidia did not submit results for reinforcement learning, since that benchmark doesn't take full advantage of GPU acceleration, according to Ian Buck, Nvidia's VP of accelerated computing.

For image classification, training ResNet-50 v1.5 on the ImageNet data set took about 70 minutes with the DGX-2H. At scale, using a DGX-1 cluster, training time was reduced to 6.3 minutes.

For language translation, Nvidia trained the Transformer neural network in just 6.2 minutes with a DGX system at scale.

Nvidia was the only company that entered as many as six benchmarks, which Buck said demonstrated Nvidia's versatility. "People want to deploy one AI platform to do all these different kinds of workloads," he said, "to understand their data, whether it be images, speech, text or recommendation systems."

Google benchmarked accelerators available on Google Cloud infrastructure, with a focus on the latest Cloud TPUs (versions 2 and 3) and TPU v3 Pods. Google submitted results for image classification, object detection and translation.


Google noted that using 1/64th of a TPU v3 Pod, it achieved an image recognition training time of 60 minutes. Object detection with the TPU v3 Pod took 17.8 minutes and neural machine translation took 9.7 minutes.

Before now, it's been difficult to objectively compare different AI solutions, Nvidia's Buck said.

"AI is a rich topic that has many different kinds of networks, many different kinds of use cases," he said. "MLPerf provides a level playing ground for companies to show what their platforms can do and deliver, and we welcome that."
