Cray taps GPUs for 50-petaflop supercomputer

The hardware maker has combined Nvidia Tesla graphics processors with AMD Opteron CPUs to create the XK6 supercomputer, which it says can scale to 50 petaflops

Cray has married GPUs and CPUs to create the XK6 high-performance computer, which it says is capable of scaling up to 50 petaflops of computing power.


The XK6, announced on Tuesday, is made up of multiple supercomputer blade servers. Each blade includes up to four compute nodes containing AMD Opteron CPUs and Nvidia Tesla-architecture GPUs. It marks Cray's first attempt to blend dedicated GPUs and CPUs in a single high-performance computing (HPC) system.

"Cray has a long history of working with accelerators in our vector technologies," Barry Bolding, vice president of Cray's product division, said in a statement. "We are leveraging this expertise to create a scalable hybrid supercomputer — and the associated first-generation of a unified x86/GPU programming environment — that will allow the system to more productively meet the scientific challenges of today and tomorrow."

GPUs are increasingly being used in tandem with CPUs for HPC because they can add extra power for specific kinds of calculation. GPUs are better than CPUs at performing many small calculations simultaneously, and so are useful for tasks that require the simulation of numerous simple bodies. For instance, simulating how a gas cloud could expand or change according to temperature would be a prime task for partial-GPU processing, as the GPUs would be adept at modelling the actions of the individual particles of gas.
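To make the point about many small, independent calculations concrete, here is a minimal Python sketch of a toy particle update. It is illustrative only: the function names and the simple drift rule are assumptions made for this example, not anything published by Cray or Nvidia, and the loop runs sequentially on a CPU.

```python
# Toy model of independent gas particles (illustrative assumption,
# not a real physics model and not code from Cray or Nvidia).

def step_particle(position, velocity, dt):
    """Advance one particle by one time step; it depends on no other particle."""
    return position + velocity * dt

def step_all(positions, velocities, dt):
    # Every element is computed independently of the others, so on a GPU
    # each particle could be given its own thread instead of this loop.
    return [step_particle(p, v, dt) for p, v in zip(positions, velocities)]

positions = [0.0, 1.0, 2.0]
velocities = [1.0, -0.5, 0.25]
new_positions = step_all(positions, velocities, dt=2.0)
print(new_positions)  # [2.0, 0.0, 2.5]
```

Because no particle's update reads another particle's result, the work can be split across as many processing units as are available, which is the property that makes this kind of task a good fit for a GPU.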

Each XK6 node is made up of a 16-core Opteron 6200 CPU and an Nvidia Tesla X2090 GPU co-processor, an embeddable card, based on the GPU design, that is built to work in tandem with a CPU. The chips, which have not yet been officially launched, communicate via Peripheral Component Interconnect Express (PCIe). Pairs of nodes are connected to the others through a Gemini interconnect ASIC (application-specific integrated circuit), which in turn links them to the rest of the nodes within the cabinet.

Scaled up

The Cray system can be scaled from a single integrated cabinet, which can contain up to 96 CPUs and 96 GPUs, up to multiple cabinets linked together. Theoretically, it could deliver a maximum of 50 petaflops of raw computing capacity if it is upgraded with later generations of CPUs, GPUs and interconnect hardware, Cray said. A petaflop is equal to a thousand trillion floating-point operations per second.
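The per-cabinet figures can be combined with the node details given earlier to work out what a full cabinet contains. The following is a rough sketch in Python; it assumes a fully populated cabinet, one CPU and one GPU per node, and one Gemini ASIC shared by each pair of nodes, which is one reading of Cray's description rather than a confirmed specification.

```python
# Figures taken from the article; the derived counts assume a fully
# populated cabinet and are not official Cray numbers.
NODES_PER_BLADE = 4      # "up to four compute nodes" per blade
CPUS_PER_CABINET = 96    # one CPU (and one GPU) per node
CORES_PER_CPU = 16       # 16-core Opteron 6200
NODES_PER_GEMINI = 2     # assumed: two nodes share one Gemini ASIC

nodes_per_cabinet = CPUS_PER_CABINET
blades_per_cabinet = nodes_per_cabinet // NODES_PER_BLADE
cpu_cores_per_cabinet = nodes_per_cabinet * CORES_PER_CPU
gemini_asics_per_cabinet = nodes_per_cabinet // NODES_PER_GEMINI

print(blades_per_cabinet, cpu_cores_per_cabinet, gemini_asics_per_cabinet)
# 24 1536 48
```

On those assumptions, a full cabinet would hold 24 blades and 1,536 CPU cores alongside its 96 GPUs.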

By comparison, the world's most powerful supercomputer, going by publicly disclosed details, is the Chinese Tianhe-1A. As of October, the system was capable of a peak score of 2.5 petaflops.

The Partnership for Advanced Computing in Europe (Prace), meanwhile, is embarking on a scheme to deliver an exaflop (1,000 petaflops) of computing capacity to researchers by 2019.
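The scale of the figures quoted above is easier to compare once they are put in the same units. This short Python sketch uses only numbers from the article; the variable names are made up for the example.

```python
PETA = 10**15  # a petaflop is a thousand trillion operations per second

xk6_max_flops = 50 * PETA        # Cray's theoretical XK6 ceiling
tianhe_flops = 2.5 * PETA        # Tianhe score quoted in the article
exaflop_target = 1000 * PETA     # Prace's goal for 2019

print(xk6_max_flops / tianhe_flops)     # 20.0
print(exaflop_target // xk6_max_flops)  # 20
```

In other words, Cray's theoretical ceiling is about 20 times the Tianhe figure, and the exaflop target is a further 20 times beyond that.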

Cray's XK6 hardware runs a version of the Suse Linux operating system tailored for high-performance computing, and supports a range of HPC-specific variants of the Fortran, C and C++ programming languages.

The XK6 is expected to be available in the second half of 2011, Cray said, and it is possible to upgrade to it from Cray's existing XT4, XT5, XT6 and XE6 systems. Pricing was not disclosed.
