What the DoD's PlayStation-powered Condor Cluster means for the future of supercomputing

A chat with Mark Barnell, one of the men behind the DoD's powerful new Condor Cluster, a supercomputer built from 1,760 gaming consoles, about the PlayStation 3, distributed computing and the (possible) next big thing in supercomputers: smartphone processors.
Written by John Herrman, Contributor

Before the PlayStation 3 was released in 2006, there was already buzz that its unusual, IBM-designed processor might be valuable for supercomputing. "PS3s... could help cure Alzheimer's, Parkinson's or mad cow disease," wrote CNN in September of that year, referring to a plan to link idle consoles in people's homes, creating a massive, international, distributed supercomputing system. It worked.

A few years ago, the Department of Defense decided to take the PlayStation-as-a-supercomputer concept a bit more literally. This week, the Condor Cluster was unveiled. Built from 1,760 of Sony's gaming consoles--the older models, not the new PS3 Slims--the cluster is the most powerful interactive supercomputer available to the DoD, as well as one of the top 100 most powerful supercomputers in the world. It will initially be used, according to the AFRL, for "neuromorphic artificial intelligence research, synthetic aperture radar enhancement, image enhancement and pattern recognition research."

I spoke with Mark Barnell, Air Force Research Laboratory high performance computing director and Condor Cluster project engineer.

How did a project this unusual get its start?

When the project was inaugurated about four and a half years ago, before I was in the position I’m in today, Dr. Richard Linderman--one of our senior scientists here at the Information Directorate--had this idea that the architecture of the Sony PlayStation’s cell processor, the Cell Broadband Engine, would have amazing features for supercomputing. At the time it cost between seven and ten thousand dollars for an IBM cell-based workstation.

Meanwhile, PlayStations were around $400, so the question came up: could we use these consoles instead? At the time, Sony and others within the industry fully supported doing that kind of thing.

The PlayStation lets you install Linux as another OS option, and it turned out that some of the applications we were interested in work very well under the more restricted architecture of the Sony PlayStation.

The first question was, definitely, will this even work? That question was answered about four years ago.

When we tried tens of units together, it worked well. When we tried hundreds, it worked pretty well. And now we’re at 1,760 of them, and our applications scale very well.

As for how this cluster came about, I made a proposal to the High Performance Modernization office in November of 2008, and two years later, the system is up and running. We had our formal ribbon-cutting on December 1st.

You’ve been quoted as saying that the Condor Cluster is the DoD’s most powerful “interactive” supercomputer. What does that mean?

From the perspective of most of the supercomputing centers in the DoD, when millions or tens of millions of dollars are invested, you don’t want to waste cycles. So these computers are run in what is called “batch mode.” They keep these systems running at very high levels, all of the time, so the applications that use them are carefully managed and optimized.

On this computer, we’re not tied to these metrics, mainly because it was so inexpensive. We do a lot of research and development on this system, so we start with only a few nodes, make sure [the software] works, and scale up from there. We have a lot of users [in the DoD] who, when they’re actually developing code, have a tendency to hang a machine or two. When you do that, the computer becomes ineffective until we reboot it.

So it’s like a test bed?

The people running projects on this cluster are trying things that may not work, and that gives them the ability to basically try a lot of different software techniques, and we’re not going to penalize them. If you took down some of the nodes at another [batch-based] site because your code was written poorly, or something else happened in your communication process, that would be seen as a negative.

What happens if game consoles move away from cell architecture?

I think in general, there is this opportunity to take something that’s coming from the globalization of large products, whether it be gaming consoles or--what we believe has big potential right now--general-purpose graphics processing units, or high-end graphics cards for PCs. They’re becoming to us what the Sony PlayStation was before. If consoles fade away, and the cell processor has done its time, we’ll be able to replace it with hardware that’s extremely efficient, and very friendly to our budgets.

We see that a lot of resource centers have the same problems: cost, size, power consumption, heat. What will end up happening in the next few years is that with commodity products like this, you’ll be able to get a lot more power and computational throughput without increasing power demands. That’s something we’re paying a lot of attention to.

What’s next for these "commoditized" supercomputers?

The processors in our iPhones and our Androids have the potential to be the next game-changer a few years down the road, when a similar event could occur. As was the case with the PlayStation, when Sony was selling millions of its devices, they drove the price of hardware down. Because there are millions and millions of smartphones, we could potentially “spin” those same products for our uses.

One advantage of using smartphone processors would be extremely low power consumption. How important is that factor in determining what’s next in distributed supercomputing technology?

Very important. Size and power constraints on supercomputing systems are a challenge in the Air Force and elsewhere. We think [smartphone processors] are an exciting area, which we’re going to leverage highly when it’s a little more mature. We may be able to do things that we haven't even imagined yet.

This post was originally published on Smartplanet.com
