How Red Hat Linux is helping reclaim the fastest supercomputer title for the US

It's not just the chips that make the Department of Energy's record-breaking Summit supercomputer so fast; it's also Red Hat Enterprise Linux.
Written by Steven Vaughan-Nichols, Senior Contributing Editor

All the world's fastest supercomputers now run Linux, so it's no surprise that the US Department of Energy's Summit supercomputer at Oak Ridge National Laboratory runs Linux. Specifically, it runs Red Hat Enterprise Linux (RHEL).

Of course, Summit's 200-petaflop speed -- that's 200 quadrillion (peta-) floating point operations per second (flops) -- comes largely from its hardware. How fast is that? By comparison, China's Sunway TaihuLight, the official fastest supercomputer in the world, according to November 2017's Top 500 list, has a speed of 93.01 petaflops.

To put these speeds in more ordinary terms, according to Oak Ridge, "if every person on Earth completed one calculation per second, it would take 305 days to do what Summit can do in 1 second." Now that's fast.
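Oak Ridge's comparison can be checked with back-of-the-envelope arithmetic. The sketch below assumes a world population of roughly 7.6 billion, a figure that is not in the quote itself, and uses the 200-petaflop and 93.01-petaflop numbers cited above.

```python
# Rough check of the "305 days" comparison.
# WORLD_POPULATION is an assumption (about 7.6 billion in 2018);
# the two speeds are the figures quoted in the article.
SUMMIT_FLOPS = 200e15          # 200 petaflops
TAIHULIGHT_FLOPS = 93.01e15    # 93.01 petaflops
WORLD_POPULATION = 7.6e9       # assumed, not from the quote
SECONDS_PER_DAY = 24 * 60 * 60


def days_for_humanity(flops: float, population: float) -> float:
    """Days needed for `population` people, each doing one calculation
    per second, to match one second of work at `flops`."""
    seconds = flops / population
    return seconds / SECONDS_PER_DAY


days = days_for_humanity(SUMMIT_FLOPS, WORLD_POPULATION)
speedup = SUMMIT_FLOPS / TAIHULIGHT_FLOPS

print(f"Humanity would need about {days:.0f} days")
print(f"Summit's quoted speed is about {speedup:.2f}x Sunway TaihuLight's")
```

With that population estimate, the result lands at roughly 305 days, which is consistent with Oak Ridge's figure.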

Summit, which is still in beta, has a hybrid architecture. Each node contains two IBM POWER9 processors and six NVIDIA Volta V100 accelerators. Each NVIDIA GPU has 640 tensor cores. Besides making Summit fast as all get-out for traditional supercomputer jobs, this design is also ideal for artificial intelligence (AI), machine learning, and neural networks. Within each node, the CPUs and GPUs are connected with NVIDIA's high-speed NVLink, which transfers 25GB/s in each direction per link.

Each node has over half a terabyte of coherent memory (high bandwidth memory + DDR4), which is addressable by all CPUs and GPUs -- plus 800GB of non-volatile RAM that can serve as a burst buffer or as extended memory. To provide a high rate of I/O throughput, the nodes are connected in a non-blocking fat-tree topology using a dual-rail Mellanox EDR InfiniBand interconnect. Summit's storage can hold up to 250 petabytes of data, or about 74 years' worth of high-definition video.

Altogether, Summit has 4,608 nodes. With all that power, it's no wonder that Oak Ridge claims that, for AI applications, Summit will be the first supercomputer to reach the heretofore unobtainable exascale speeds of more than a billion billion calculations per second.
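To get a feel for the scale, the per-node figures above can be multiplied out across the whole machine. This is a minimal sketch using only the numbers quoted in this article; the variable names are illustrative, and the exascale threshold of 10^18 operations per second is the standard meaning of the exa- prefix.

```python
# System-wide totals derived from the per-node figures in the article.
NODES = 4608
CPUS_PER_NODE = 2             # IBM POWER9
GPUS_PER_NODE = 6             # NVIDIA Volta V100
TENSOR_CORES_PER_GPU = 640

total_cpus = NODES * CPUS_PER_NODE
total_gpus = NODES * GPUS_PER_NODE
total_tensor_cores = total_gpus * TENSOR_CORES_PER_GPU

# Exascale means at least 10**18 operations per second.
EXA = 10**18
PETA = 10**15
summit_peak_flops = 200 * PETA              # the 200-petaflop figure
fraction_of_exaflop = summit_peak_flops / EXA

print(f"CPUs:         {total_cpus:,}")
print(f"GPUs:         {total_gpus:,}")
print(f"Tensor cores: {total_tensor_cores:,}")
print(f"200 petaflops is {fraction_of_exaflop:.1f} of an exaflop")
```

Since 200 petaflops is only a fifth of an exaflop, the exascale claim for AI applications presumably counts a different kind of calculation than the headline figure does, such as the lower-precision operations the tensor cores perform. The article does not spell this out, so treat that reading as an assumption.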

This speed is first being used to accelerate cancer research and take precision medicine from science fiction to reality. Precision medicine, armed with supercomputer horsepower and as much genomic, clinical, and laboratory data as is available for a single patient, will enable doctors to make the best choice between available courses of treatment when, with ordinary medical science, they can't tell which one will work best.

To make all of this hardware and medical theory work, Summit relies on RHEL. As Chris Wright, Red Hat's CTO, explained, RHEL "forms a common bridge at the operating system to effectively link all of Summit's resources together, making it easier for individual application stacks to take advantage of the specific resources that they need."

Wright added, "The open nature of Red Hat Enterprise Linux also allows Oak Ridge researchers to keep pace with the high-performance computing innovations in the Linux kernel while retaining a level of stability and support required for running mission-critical workloads."

Another Linux advantage is that Summit, unlike most supercomputers, doesn't use x86 processors, but POWER CPUs instead. And Linux, unlike most operating systems, can run on almost any hardware. According to Wright, "This highlights a new path emerging for not just supercomputing, but enterprise computing generally: The need for more seamless multi-architectural support." He makes a good point. For example, ARM processors are now being used in supercomputers.

This "broader range of architectural choices available," Wright continued, "enables organizations to choose the computing backbone that best meets their unique needs, whether it's a traditional datacenter environment or a high-powered supercomputer like Summit."

At the same time, Wright concluded, "Despite the scale, processing capability, and 'intelligence' of Summit's composition, end users interact with something they understand: Linux. ... Red Hat Enterprise Linux provides a common, stable basis that ties together all of this innovation."

So, from the supercomputer to the cloud to the server and, yes, even to the desktop (notice how Microsoft keeps adding Linux distros to Windows 10), Linux rules.
