IBM details Blue Gene supercomputer

IBM is shedding light on a program to create the world's fastest supercomputer, illuminating a dual-pronged strategy, an unusual new processor design and a leaning toward the Linux operating system.
Written by Stephen Shankland, Contributor

"Blue Gene" is an ambitious project to expand the horizons of supercomputing, with the ultimate goal of creating a system that can perform one quadrillion calculations per second, or one petaflop. IBM expects a machine it calls Blue Gene/P to be the first to achieve the computational milestone. Today's fastest machine, NEC's Earth Simulator is comparatively slow--about one-thirtieth of a petaflop--but fast enough to worry the United States government that the country is losing its computing lead to Japan.

"Blue Gene is a completely oddball, you've-never-seen-anything-like-this-before design," said Illuminata analyst Jonathan Eunice. "It is not custom everything, (but) it is still very exotic compared to anything you can buy."

IBM has begun building the chips that will be used in the first Blue Gene, a machine dubbed Blue Gene/L that will run Linux and have more than 65,000 computing nodes, said Bill Pulleyblank, director of IBM's Deep Computing Institute and the executive overseeing the project. Each node has a small chip with an unusually large number of functions crammed onto the single slice of silicon: two processors, four accompanying mathematical engines, 4MB of memory and communication systems for five separate networks.

Joining Blue Gene/L is a second major experimental system called "Cyclops," which in comparison will have many more processors etched onto each slice of silicon--perhaps as many as 64, Pulleyblank said.

In addition, IBM probably will use the Linux operating system on all the members of the Blue Gene family, not just Blue Gene/L. "My belief is that's definitely where we're going to go," Pulleyblank said.

Blue Gene's original mission was to tackle the computationally onerous task of using the laws of physics to predict how chains of biochemical building blocks described by DNA fold into proteins--massive molecules such as hemoglobin. IBM has expanded its mission, though, to other subjects including global climate simulation and financial risk analysis.

"We're looking at broad suite of applications," Pulleyblank said, a move that will help IBM reach one of the goals of the Blue Gene project: to produce technology that customers ultimately will pay for.

IBM already has spent more than the original US$100 million budgeted for the project and won't meet its 2004 goal for the ultimate machine, but the company has made progress bringing its ideas to fruition.

IBM is building the processors for the first member of the Blue Gene family, Blue Gene/L, and expects to use them this year in a machine that will be a microcosm of the eventual full-fledged Blue Gene/L due by the end of 2004, Pulleyblank said. IBM also has begun designing the processors for Cyclops, which IBM internally calls Blue Gene/C.

The performance results of Blue Gene/L and Cyclops will determine the design IBM chooses for the eventual petaflop machine, Blue Gene/P, Pulleyblank said.

"The only thing that's sure is it will be an...architecture that will have massive amounts of parallelism in it. It will be a very power-efficient, space-efficient design," Pulleyblank said. How IBM reaches its petaflop-and-beyond goal is "going to depend in large part on what we find out when we start running on Blue Gene/L."

There are differences from what IBM originally envisioned. For one thing, the processors will be based on IBM's PowerPC 440GX processor instead of being designed from scratch. The system is cooled by air instead of water, uses a different network, and has less memory, though still a whopping 16 terabytes total.

Blue Gene/L will be large, but significantly smaller than current IBM supercomputers such as ASCI White, a nuclear weapons simulation machine at Lawrence Livermore National Laboratory, which will also be the home of Blue Gene/L. ASCI White takes up the area of two basketball courts, or 9,400 square feet, while Blue Gene/L should fit into half a tennis court, or about 1,400 square feet.

IBM's Blue Gene research has an academic flavor, but the company's ultimate goal is profit. IBM is second only to Hewlett-Packard in the US$4.7 billion market for high-performance technical computing machines. From 2001 to 2002, IBM's sales grew 28 percent from US$1.04 billion to US$1.33 billion, while HP's shrank 25 percent from US$2.1 billion to US$1.58 billion, according to research firm IDC.

Like an automaker sponsoring a winning race car, building cutting-edge computers can bring bragging rights that can help attract top engineers and convince customers that a company has sound long-term plans.

The design of Blue Gene/L
Blue Gene/L is an exercise in powers of two, starting with each of the 65,536 compute nodes. Each of the dual processors on the compute node has two "floating point units," engines for performing mathematical calculations.

Each node's chip is 121 square millimeters and built on a manufacturing process with 130-nanometer features, Pulleyblank said. That compares with 267 square millimeters for IBM's current flagship processor, the Power4+ used in its top-end Unix servers. The small size for Blue Gene's chips is crucial to ensure the chips don't emit too much waste heat, which would prevent engineers from packing them densely enough.

Two nodes are mounted onto a module; 16 modules fit into a chassis; and 32 chassis are mounted into a rack. A total of 64 racks will be installed at the Livermore lab by the end of 2004, with the first 512-node half-rack prototype to be built this fall at IBM's Thomas J. Watson Research Center.
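The packaging arithmetic above can be checked in a few lines. All of the figures come from the article; the sketch simply confirms that the hierarchy multiplies out to the machine's 65,536-node total.

```python
# Blue Gene/L packaging hierarchy, as described in the article.
nodes_per_module = 2
modules_per_chassis = 16
chassis_per_rack = 32
racks = 64

total_nodes = nodes_per_module * modules_per_chassis * chassis_per_rack * racks
print(total_nodes)  # 65536
```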

"We're going to have first hardware this year. We are actually fabricating chips for this machine," Pulleyblank said.

All nodes are created equal, but 1,024 of them will have a more important task than the rest, Pulleyblank said. These so-called input-output, or I/O, nodes will run an instance of Linux and assign calculations to a stable of 64 compute nodes.

These underling nodes won't run Linux, but instead a custom operating system stripped to its bare essentials, he said. When they have to perform a task they're not equipped to handle, they can pass the job up the pecking order to one of the I/O nodes.

"It will look like it has 1,024 I/O nodes, each of which manages a gang of 64 compute nodes," Pulleyblank said.

Running Linux, a move made possible by using the comparatively ordinary 440GX processor, was crucial to make the system useful. "It was absolutely clear by making it run Linux, we were opening it up to a broad range of applications we couldn't get otherwise," Pulleyblank said.

Of the two processors on each node, one will be devoted to number-crunching and the other to communicating with the rest of the system. In this configuration, the system should be able to perform at a rate of 180 teraflops, or 180 trillion calculations per second. In some cases where minimal communication between nodes is required, both processors of each node can concentrate on math, bringing the system performance to 360 teraflops, Pulleyblank said.
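The 180- and 360-teraflop figures follow from per-node arithmetic. The node and floating-point-unit counts are from the article; the clock rate (~700 MHz) and the fused multiply-add throughput (two flops per FPU per cycle) are assumptions not stated in the story, chosen because they reproduce the quoted numbers:

```python
# Rough check of the 180/360-teraflop figures.
NODES = 65536
FPUS_PER_PROCESSOR = 2
CLOCK_HZ = 700e6          # assumed clock rate, not in the article
FLOPS_PER_FPU_CYCLE = 2   # assumed fused multiply-add throughput

per_processor = FPUS_PER_PROCESSOR * FLOPS_PER_FPU_CYCLE * CLOCK_HZ  # 2.8 gigaflops

one_proc_mode = NODES * per_processor       # one processor per node crunching numbers
both_proc_mode = NODES * 2 * per_processor  # both processors crunching numbers

print(one_proc_mode / 1e12)   # ~183.5 teraflops, close to the quoted 180
print(both_proc_mode / 1e12)  # ~367 teraflops, close to the quoted 360
```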

Communication among the nodes is a challenge IBM tackled by employing two primary networks. The first network is a mesh through which each node can reach every other one, with a message traveling from one node to another having to hop across a maximum of 64 nodes in between. The second network is a branching tree structure that can quickly deliver messages to the entire collection of nodes or gather information from them.
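The 64-hop worst case is consistent with laying the 65,536 nodes out as a three-dimensional 64 x 32 x 32 grid whose mesh links wrap around at the edges (a torus). That shape is an assumption, not something the article states, but the arithmetic works out exactly:

```python
# On a wrap-around mesh (torus), the farthest node along a dimension
# of size d is d // 2 hops away, since traffic can travel in either
# direction. A 64 x 32 x 32 layout (assumed, not from the article)
# yields the 64-hop worst case the article quotes.
dims = (64, 32, 32)
assert dims[0] * dims[1] * dims[2] == 65536

max_hops = sum(d // 2 for d in dims)
print(max_hops)  # 64
```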

When a message needs to be sent, "we automatically decide the better way to route it," Pulleyblank said. "Also interesting is that if one network fails, we can still completely run with the other network, but slower."

In addition, a third network uses conventional 1-gigabit-per-second Ethernet technology. There are also two management networks: one to help boot nodes and one to monitor and control them.

Blue Gene has some unusual features, but IBM has tried as much as possible to anchor the system to more mainstream technology. Staying on the beaten path is the best way to take advantage of technology that's improving fastest, Pulleyblank said, and it also makes it easier to create products out of the Blue Gene research.

"Our direction has been as much as possible to exploit these standard components," he said.
