PS3 based super-computing cluster on Linux

Summary: The real bottom line on the trade-off between Cell's programming complexity and its performance potential is simply that we're just a language breakthrough away from everybody's supercomputer being a rack of Cell processors.

TOPICS: Hardware

As many people know, Yellow Dog Linux, from Terra Soft, now runs on the Cell engine in Sony's PS3. That's very cool, but what many people may not realize is that Terra Soft isn't so much in the yellow dog business as it is in the supercomputing and life sciences software businesses.

Thus Terra Soft's recent announcement, Terra Soft to Build World's First Cell-Based Supercomputer, is focused on the use of the PS3 hardware to run bioscience software, not on its ability to run Linux - an orientation that should be of interest to the folks at the University of Cincinnati Genome Research Institute, whose IT people have embarked on a lonely quest to port and maintain similar software for Windows.

Here's the summary:


Glen Otero, Director of Life Sciences Research for Terra Soft Solutions explains, "This cluster represents a two-fold opportunity: to optimize a suite of open-source life science applications for the Cell processor; to develop a hands-on community around this world-first cluster whereby researchers and life science studies at all levels may benefit. Once up and running with our first labs engaged, we will expand the community through invitations and referrals, supporting a growing knowledge base and library of Cell optimized code, open and available to life science researchers everywhere."

Lawrence Berkeley National Lab is working with Terra Soft to optimize a suite of life science applications. Los Alamos and Oak Ridge National Labs are also engaged, with select universities coming on-board early in 2007. Terra Soft is working to optimize the entire Y-Bio bioinformatics suite.

Thomas Swidler, Sr. Director of Research & Development at SCEI states, "This cluster is for Sony a means of demonstrating the diversity of the PS3, taking it well beyond the traditional role of a game box. While we are not in the business of competing for, nor building, cluster components, this creative use of the PS3 beta systems enables Sony to support a level of real world research that may produce very positive, beneficial results."

Regarding Terra Soft's contribution to the project, Swidler continued, "In working with Terra Soft, we found a single source for the operating system, cluster construction tools, and bioinformatics software suite. Again, their dedication to detail and professional results has surpassed our expectations. We are very eager for the completion of this initial phase in order that the research may begin."

The thumbnail on Cell, incidentally, is simple: it's IBM's current implementation of a communication and synchronization method bringing ordinary OpenGrid technology down to the chip level. Thus the Cell patent is mainly about managing inter-processor communication both on and off the grid; the name derives from both the design idea itself and the ability to plug Cell hardware together to form arbitrary processing grids; and the current implementation is a PPC-based, eight-way grid with an embedded 3.2GHz master controller derived from the G5+.
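As a rough illustration of that master-plus-workers layout - and only an illustration, since real SPE code is written against IBM's Cell SDK with explicit DMA into each core's local store - here's a toy Python sketch of a controller core splitting a job across eight workers and combining the results:

```python
from concurrent.futures import ThreadPoolExecutor

NUM_SPES = 8  # Cell pairs one PPC master core with eight synergistic cores

def spe_kernel(chunk):
    # stand-in for the work an SPE would do on data staged into its local store
    return sum(x * x for x in chunk)

def ppe_dispatch(data):
    # the master core carves the data up and hands one chunk to each worker
    size = max(1, len(data) // NUM_SPES)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=NUM_SPES) as pool:
        partials = list(pool.map(spe_kernel, chunks))
    return sum(partials)  # the master combines the partial results

print(ppe_dispatch(list(range(1000))))
```

The names here are invented for the sketch; the point is only the shape of the model - one general-purpose core orchestrating, eight workers crunching in parallel.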

Cell is fast enough that there's a serious payoff for facing the programming complexity that goes with it, but there's a problem: much of supercomputing relies on double precision arithmetic, and the current Cell hardware is largely geared to single precision. How effective can it be, therefore, for typical supercomputing tasks?

That's the question addressed in a recent Lawrence Berkeley research paper by Drs. Williams, Shalf, Oliker, and others. Here's their complete abstract:


The slowing pace of commodity microprocessor performance improvements combined with ever-increasing chip power demands has become of utmost concern to computational scientists. As a result, the high performance computing community is examining alternative architectures that address the limitations of modern cache-based designs. In this work, we examine the potential of using the forthcoming STI Cell processor as a building block for future high-end computing systems.

Our work contains several novel contributions. First, we introduce a performance model for Cell and apply it to several key scientific computing kernels: dense matrix multiply, sparse matrix vector multiply, stencil computations, and 1D/2D FFTs. The difficulty of programming Cell, which requires assembly level intrinsics for the best performance, makes this model useful as an initial step in algorithm design and evaluation. Next, we validate the accuracy of our model by comparing results against published hardware results, as well as our own implementations on the Cell full system simulator. Additionally, we compare Cell performance to benchmarks run on leading super-scalar (AMD Opteron), VLIW (Intel Itanium2), and vector (Cray X1E) architectures. Our work also explores several different mappings of the kernels and demonstrates a simple and effective programming model for Cell's unique architecture. Finally, we propose modest microarchitectural modifications that could significantly increase the efficiency of double precision arithmetic, with Cell achieving over 200x the power efficiency of the Itanium2 for SGEMM.
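A performance model of the kind the authors describe can be caricatured in a few lines of Python: each kernel's runtime is bounded below by its arithmetic at peak rate and by its memory traffic at peak bandwidth, and the larger bound wins. The peak numbers below are illustrative placeholders, not measured Cell figures:

```python
def kernel_time(flops, bytes_moved, peak_gflops, bandwidth_gbs):
    """Crude bound: a kernel takes at least as long as its math at peak
    rate, and at least as long as its memory traffic at peak bandwidth;
    whichever bound is larger dominates."""
    compute_s = flops / (peak_gflops * 1e9)
    memory_s = bytes_moved / (bandwidth_gbs * 1e9)
    return max(compute_s, memory_s)

# Dense matrix multiply on n x n matrices: ~2n^3 flops, ~3n^2 * 8 bytes moved.
# Its high arithmetic intensity puts it in the compute-bound regime.
n = 1024
t_gemm = kernel_time(2 * n**3, 3 * n**2 * 8, peak_gflops=200.0, bandwidth_gbs=25.0)

# Sparse matrix-vector multiply: ~2 flops but ~12 bytes per nonzero,
# so it lands in the memory-bound regime instead.
t_spmv = kernel_time(2e6, 12e6, peak_gflops=200.0, bandwidth_gbs=25.0)
```

This is why the single-versus-double precision question matters so much: the compute ceiling drops sharply for double precision on the first-generation hardware while the bandwidth term stays put.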

Overall, they conclude that the next generation Cell product needs only minor hardware changes to scale efficiently for double precision work, but that the first generation is already between 3 and 60 times faster, and between 10 and 200 times more power efficient, than its competitors - numbers to keep in mind when you think about Apple's triumph in arranging to get dual core Xeon CPUs from Intel for only slightly more than four times the $89 Sony is estimated to pay for an 8+1 Cell at 3.2GHz.

They're also numbers to keep in mind when thinking about next generation supercomputing. Terra Soft is mainly focused on biosciences applications and Yellow Dog Linux works now, but the real bottom line on the trade-off between Cell's programming complexity and its performance potential is simply that we're just a language breakthrough away from everybody's supercomputer being a rack of Cell processors.

That may sound overblown, but consider this: you can buy Mercury's dual Cell compute server from IBM (as the QS20 blade) at a list price of $18,995 - meaning that you could put 16 of these in a rack for less than $350,000 exclusive of disk and connectivity. In theory, that rack could sustain around 500 Teraflops - making it significantly faster than the IBM ASCI Purple and Bluegene/L combination for which the Lawrence Livermore labs paid an estimated $290 million (including disk and connectivity) in 2005.
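The rack arithmetic is easy to check (the blade list price is the one quoted above; the sustained-throughput figure is the article's claim and not something this sketch can verify):

```python
BLADE_PRICE = 18_995       # IBM QS20 dual-Cell blade, list price as quoted
BLADES_PER_RACK = 16

rack_cost = BLADE_PRICE * BLADES_PER_RACK
print(f"Rack of {BLADES_PER_RACK} blades: ${rack_cost:,}")

# Compare against the quoted LLNL spend (which did include disk and connectivity)
llnl_cost = 290_000_000
print(f"Cost ratio: roughly 1:{llnl_cost // rack_cost}")
```

That works out to $303,920 per rack - comfortably under the $350,000 figure, and nearly three orders of magnitude below the 2005 Livermore budget.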


  • Not yet

    The PS3 is NOT made to be a computer. Important things like power supplies are not up to datacenter specs. Six hundred bucks is a bit pricey (you're paying for the Blu-ray) for a Cell - you could get 2 Xbox 360s for that price (2 x 3-core Xenon).

    Other chips such as the 24-core MIPS and Niagara 2 (now with floating point on ALL cores!) could do the trick for supercomputing. But neither of these is available in a low-cost package (like an Xbox or PS3).

    I'm more afraid of Iran buying a truckload of PS3s and then running nuclear simulations. Wasn't the PS/2 chip used in nuclear warheads? Maybe that was just a movie . . .
    Roger Ramjet
    • Technology leaks

      On 11/29/06 Roger Ramjet spoke and said:

      > I'm more afraid of Iran buying a truckload of PS3s
      > and then running nuclear simulations. Wasn't the
      > PS/2 chip used in nuclear warheads? Maybe that was
      > just a movie . . .

      Get used to it. Iran buying a truckload of PS3s right now seems far-fetched, unless someone starts writing good Muslim games for it. They have banned satellite TV and broadband Internet access, and I have no doubt they can make the ban stick for now.

      The fact is that one of the consequences of our globalized society is that we don't have control of how technology is used. Whatever Mr. Bush and the people in Washington think right now, that's slipped beyond our purview, and it is certainly credible that Al Qaeda or some other terrorist group could show the world how to build a supercomputer from networked PS3s or Xbox 360s.
      • Well

        We've been doing nuclear simulations for quite a while now. Remember that supercomputers buy you 3-5 years of conventional CPU development (that's one reason why the comparison of the $350k Mercury system to the $290M LLNL system wasn't quite fair).

        So there's no reason to think that the bad guys couldn't use secondhand computers, outdated by a couple of years, to do sims that we were doing in the 90's. Why they would want to is another question - the bad guys aren't so much into bomb development as they are into bomb-making, which, I believe, entails another set of problems not really addressed by computation.
      • Computer Prices...

        The cost of computers is only an issue to people with a small amount of money. Now that problem is going away.

        We are all aware that computing speeds have grown exponentially since the first desktop computers of the early 80's; prices have also dropped tenfold. The bottom-of-the-barrel personal computer of today is a blindingly fast supercomputer of yesterday. Money only buys time in this business. The supercomputer of today will cost hundreds of dollars soon.

        What I am curious about is the actual need for faster computers at home. Music and movie viewing, creation, and copying were the drivers for faster speeds up until a few years ago. There really isn't a new need I can think of that will once again captivate a huge percentage of the public. The movie and music apps could reduce people's costs and expand their abilities to entertain themselves, although illegally in most cases. Another OS or word processor does not justify a new computer. Only a use that people cannot resist will.

        Any thoughts?
        • Since you asked, think about Slater Mill in Pawtucket Rhode Island

          On 11/30/06 information_z spoke and said:

          > Any thoughts?

          Since you want thoughts on these ideas, I genuinely believe that OLPC is the future of computing, although I am somewhat cynical about these machines reaching the intended users.

          Of course the need today is for better use of technology (as always). For most uses, the technologies we have are fast enough. Where they are not, it may well be that we are doing the wrong things. While I do like the benefits that copyright and patent protection have given us, the way these two fields have evolved in the last quarter-century strikes me as corrupting, extremist, and unhelpful. I honestly expect that Vista's anti-piracy technology will cause some serious problems for most of its intended audience.

          Slater Mill tells an interesting story about what we today call the theft of intellectual property. Without it, we would not be developed enough for people such as Microsoft or the Recording Industry to worry so much about "their" technology.

          I first became aware of the use of computing in the developing world in the 1970's when my late sister married a Nigerian and, during her time there, did some programming on an Apple ][. I became aware of what powerful machines these little microcomputers were and how much they could accomplish. I also, through my contacts and reading, became aware of how much more important accreditation of the people you trust was there. It makes sense that when a computer costs so much you would want someone from the best possible school to run it. At the same time this becomes a real problem when who gets into that best possible school is determined as much by who you know as by proven ability. One way people deal with this is the vibrant black market.

          The Internet began as a simple and frankly elegant solution to the problem of how to safeguard copies of records in a nuclear war: you create so many copies in so many places that SOME will survive. As I listen to all the complaints about how unsafe the Internet has become, I find myself thinking about how far it has strayed from that original model: the backbone has become prohibitively expensive for Mom and Pop stores or volunteer BBSes, yet it is so easy to hack that more and more people are motivated to do what they can to survive on the streets - which means it is increasing the supply and sophistication of people who work on the black market.

          I believe the future of computing lies in asking hard questions about which records we want to keep, and which we should be getting rid of, and about how we can best maximize the supply of people available to keep the records we want to keep safe, rather than how to minimize the number of people who are available to gangsters and terrorists. The period when Slater was "stealing" British technology was the period we were first hearing that of course everything had been patented.

          I do believe that the technology we have does what we want it to do well enough so more powerful machines are, in many ways, a decadent approach. I also believe that old fashioned "progressive" and "woolly-headed liberal" ideas deserve serious reconsideration in the current situation: Phone Phreaks should have taught us that since things as simple as whistling can impact acoustic signallers on the Telcos' systems, computers as complex as the Apple ][ or even the Altair are all that is needed to affect the web in most countries. We are doing nothing to impact the flow of information with our own "certifications", either to prevent it reaching the "wrong" hands or to ameliorate the results when it does (which is arguably the original purpose of the internet) because we have come up with too many holey systems for the former and too few for the latter.

          OLPC is the future not necessarily because it puts connectivity in the hands of children, but simply because it puts a powerful connectivity into the hands of a lot of people who are very motivated to improve their own lives. I don't see that motivation as being as strong in the developed world, and I wish we did devote more time in our deliberations to accommodating it.
    • The new NVIDIA card totally crushes Cells...

      The new NVIDIA card totally crushes the Cell processor in terms of processing power.

      The Cell is stillborn as far as supercomputing goes - too little, too late.

      Same goes for the PS3 graphics. It's only just been released but PC graphics power is already ahead of it.

      ATI will soon have a competitor on the market, and six months from now supercomputer-level floating point math will be a commodity item.
      • Only dweebs play games on their computer.

        Consoles rule and Wii will win, followed closely by PS3. XBox will be left behind.
        • You're an idiot.

          Why? Because you think the PS3 will outdo the Xbox 360.

          Egh... anyone who buys a PS3 is either a Sony fanboy, or an idiot.
    • Market Value

      Has anyone thought about market value? This cloud of computing power only exists if individuals allow their hardware to communicate. When this commodity has a value and a bandwidth cost users are willing to pay, where do the PS3 (or any available) processors go to sign up? They go to the highest bidder. The work will no longer be in the public interest, but doing corporate bidding.

      In the process, science will pave the way on design and development, but corporate and private users will reap the benefit.
  • Just a language breakthrough...

    Yup. And I'm just a parallelizing compiler away from being a very rich man.

    We've been working on this problem for almost 50 years! It's hard, and isn't going to happen in "breakthrough" fashion.

    Processors are going multi-core because of a combination of heat and packaging constraints and the fact that designers, other than throwing more cache around, don't really know what to do with the increased area of chips.

    Certainly not because the language, software, and HW/SW infrastructure are coming closer to the "Holy Grail" of serial-language-to-parallel-implementation technology. It's the tail wagging the dog, and I'd predict we're going to go through some pain in generation-over-generation performance improvements very, very soon.
    • Agreed

      and stay tuned - you'll like tomorrow's and Friday's blogs too.
    • like true perfection

      it is unattainable... it's like the right tail of a bell-curve... the closer it gets to 0, the smaller the step it takes towards 0. It will never reach 0, but it will keep getting closer.

      that "holy grail" will never be found, but we will find a way to more than exceed our needs before too long. after all, I doubt it'll be too long before you could make a computing cloud that thinks it is only a single supercomputer, by using a few routers, cables, and *any* game console. or your average (bottom-end) Dells...

      computing power is excessive in most cases at the moment... besides the heavy multimedia users, the heavy gamers, and the constant surveillance people... and when we break through to the next level of computing, we'll find that we can use our video game consoles, connected together, to run what previously was considered "heavy use".
      • Still missing

        a couple of basic points--
        First of all, for anyone in a computational field, there actually is no such thing as "excessive". For a run-of-the-mill PC user, sure, but then you're into marketing.

        "Cloud" is both a better and worse analogy than it seems. The cables and routers aren't just something you can handwave away, and as CPU computing power increases, the mismatch actually gets worse. So in order for the individual components to actually have useful work to do, whatever algorithm you're running has to be able to feed them. With such relatively slow feeding tubes, the "cloud" is just that, a whole bunch of tenuously connected individual dust motes. As I mentioned above, the work to bring this type of arrangement closer to reality (for all but a handful of applications) hasn't been done yet.
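To put a number on that feeding-tube mismatch, here's a back-of-the-envelope model: a node can only sustain the fraction of its peak that the network can feed it, given the algorithm's flops-per-byte. The figures in the example are illustrative assumptions, not measurements of any real console cluster:

```python
def node_efficiency(gflops_per_node, net_gbs, flops_per_byte):
    """Fraction of a node's peak it can sustain when every operand has to
    arrive over a network link at net_gbs gigabytes/second; flops_per_byte
    is the algorithm's arithmetic intensity."""
    sustainable = net_gbs * 1e9 * flops_per_byte   # flops/s the link can feed
    return min(1.0, sustainable / (gflops_per_node * 1e9))

# A hypothetical 150 GFLOPS node on gigabit Ethernet (~0.125 GB/s) running a
# low-intensity kernel (say 1 flop per byte) sits almost entirely idle:
print(node_efficiency(150.0, 0.125, 1.0))
```

The faster the individual node gets, the smaller that fraction becomes at fixed link speed - which is exactly why the mismatch "actually gets worse" as CPU power increases.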
  • Only a matter of time

    Let's put aside the 'PlayStation' phenom.

    I'll wager that there will be a 'torrent' of new 'cell-based' PC offerings reaching the market much sooner than anyone could have predicted.

    It's not a matter of 'if' but 'when'.

    Like the weather, technology will change, just wait a few hours. ;)

    Good story Murph!
    D T Schmitz
    • Thanks!

      I understand that Toshiba has a Cell-based PC in user test right now - unfortunately I have no knowledge of Japanese and so know nothing beyond third party speculation, but it's pretty clear something's in trial.
      • Cell Skeptic

        I'll admit I'm a Cell-skeptic. I'd think any Cell-based PC (and definitely any laptop) is sort of a red herring or products following buzz.

        What's the market? The OS is going to run on the run-of-the-mill PowerPC, and getting the other processors to do what you want is a major effort. (E.g. no cache coherency, that's up to the programmer) Not to mention that most PC-type tasks don't have that much inherent parallelism to exploit.

        That's not to say multi-cores in desktops aren't around the bend - Intel and AMD hath spoken - but where they have the chance to shine is in scientific computing and graphics. I'd hazard a guess that with the falling price of LCDs, and with these processors sitting on chips with nothing to do, significantly more immersive displays and UIs will become common.

        Okay, off the stump, back to work
        • Ah, but it's a different story for the OEM

          Look at today's PC and you see the OEM has to include a disk controller, a video card, a sound card, basic I/O for the input devices, etc.

          (They do this either with adapter/cards or with on board chip sets.)

          Now consider the savings if you have a Cell CPU and each of these tasks is simply assigned to a different core. The cost savings is huge, and in a business where 0.5% makes the difference between profit and bankruptcy I can see OEMs jumping all over it.
          • Wouldn't work

            Aside from the economics (we're going to replace our $0.50-to-a-couple-bucks microcontroller with 1/8 or 1/9 of a $90 chip), the hierarchical nature of PC architecture is there for a reason. You wouldn't want the sound card hogging the CPU memory bus. Not to mention you then sacrifice the whole point of the CELL, which is parallel horsepower for the CPU.

            From another viewpoint, why aren't they doing this with the multi-cores already on the market? You're swatting a fly with a sledgehammer that's attached to your pants. (and how's that for a mixed metaphor)

            In the embedded world, you're a bit closer, (designs tend to be flatter, and Size/Power constraints are more active) but even there you probably wouldn't be using a CELL in the first place, and the above arguments about hogging the main pipes still hold.
          • I think you really need to read

            how the Cell works.

            "Not to mention you then sacrifice the whole point of the CELL which is parallel horsepower for the CPU."

            Hmm, says who? That isn't meant to be cruel, it's just that you are thinking in the existing box.

            Tell you what, look at the latest GPUs by NVidia or ATI, they indeed are expensive and could easily be replaced by using the Cell and handing video off to it. Also keep in mind the Cell is designed from the ground up to be used in multiples.

            No reason you can't have one $90 Cell doing the heavy number crunching and another $90 Cell (or even a cut down version that's cheaper) to handle everything else. Or, you could have a Cell with 16 cores for a fraction more money that handles everything.

            That's why the design is so very different and takes an entirely new view on how to build code.
          • Says...

            The people at IBM, and all the CELL promo literature you see.

            I'm trying to figure out what you're getting at. The PS3 currently passes the graphics processing to a GPU core, not a CELL, because that's what GPUs are good at.

            Why would you want to pass data from the main CPU to a graphics processing chip that has a useless (for graphics processing) PowerPC as an intermediary? Why would you tie yourself to only 8 processors, when modern GPUs often have much larger arrays of more specialized elements?

            That's leaving aside the fact that the memory architecture would be abysmal for the tasks you envision.

            Don't be snowed by the hype - we're still in the box. Multicomputers have been around for a while, and CELL isn't the only one. Sun's got something cooking, too, Niagara, isn't it? And I've heard Intel folks talk about their 3 year plans, and it looks like 16-core and 64-core are on their way (truth be told, the block diagrams look like FPGAs).

            However, you are almost absolutely spot on that the design takes "an entirely new view on how to build code".

            And it hasn't been given much thought in the consumer/enterprise domains. The "new" view is probably much older, as the first computers were used for scientific computation. There'll be lots of jobs for physics grads who get sick of the lab, I'll tell you that.

            The view isn't really that new, (it's probably older than the