PPC vs. Intel: Top 500 shows who's right

Summary: These numbers lead to an interesting question. Since "everyone knows" that Pentiums are faster and cheaper than PowerPC chips, how come they seem capable of only about two thirds as much work even when run at nearly a third more cycles per second?

TOPICS: Processors

The latest listing of the top five hundred supercomputers was released today.

Number five, behind two other PowerPC-based machines, an Itanium, and NEC's custom Earth Simulator, is the MareNostrum machine at the Barcelona Supercomputing Center. It runs Linux on a cluster of 4,800 PPC 970 CPUs running at 2.2GHz and gets scores of 27910 for R/Max (maximal LINPACK performance achieved, in gigaflops) and 42144 for R/peak (theoretical peak performance, in gigaflops).

Number 14 on the list is Virginia Tech's System X. It has 1,100 dual-processor Xserve machines with the same PPC970 chip used in Barcelona. Virginia's Mac cluster runs at 2.3GHz and received scores of 12250 and 20240 for R/Max and R/peak respectively.

In Barcelona each PPC contributes 5.8 gigaflops to R/Max and 8.8 gigaflops to R/peak. In Virginia, each PPC contributes 5.6 gigaflops to R/Max and 9.2 gigaflops to R/peak. The peak difference favouring the Mac reflects the slightly higher cycle rate; the lower actual figure probably reflects factors such as cheaper networking and storage, but I don't know that for sure.

The tenth and eleventh systems, from Cray, use AMD Opterons. Xeon enters the list at number 20: the "Tungsten" system at the NCSA in Urbana-Champaign. That machine has 1,250 dual-CPU Dell servers running the Intel Pentium 4 Xeon at 3060 MHz with Linux, plus another 126 machines handling I/O and storage. Tungsten gets an R/Max score of 9,819 and an R/peak of 15,300, or 3.9 and 6.1 gigaflops per processor.
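
For anyone who wants to check the per-processor figures, here is a minimal sketch of the arithmetic. The scores and clock rates are the ones quoted above; the processor counts for System X and Tungsten are my assumption of two CPUs per server, with Tungsten's 126 I/O machines left out, and the names `SYSTEMS` and `per_cpu` are just labels for this illustration.

```python
# Scores (gigaflops) and clock rates as quoted in the article.
# CPU counts assume two processors per dual-CPU server; Tungsten's
# 126 I/O and storage machines are assumed not to count.
SYSTEMS = {
    "MareNostrum": {"cpus": 4800, "ghz": 2.2, "rmax": 27910, "rpeak": 42144},
    "System X": {"cpus": 1100 * 2, "ghz": 2.3, "rmax": 12250, "rpeak": 20240},
    "Tungsten": {"cpus": 1250 * 2, "ghz": 3.06, "rmax": 9819, "rpeak": 15300},
}


def per_cpu(system):
    """Return (R/Max, R/peak) in gigaflops per processor."""
    return system["rmax"] / system["cpus"], system["rpeak"] / system["cpus"]


for name, system in SYSTEMS.items():
    rmax, rpeak = per_cpu(system)
    print(f"{name}: {rmax:.1f} R/Max, {rpeak:.1f} R/peak gigaflops per CPU")
```

Rounded to one decimal place, this gives the same per-processor numbers as the text.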

These numbers lead to an interesting question. Since "everyone knows" that Pentiums are faster and cheaper than PowerPC chips, how come they seem capable of only about two thirds as much work even when run at nearly a third more cycles per second?
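
The "two thirds" and "a third more" can be worked out from the quoted scores. This is only a rough sketch: it assumes 2,200 CPUs for System X and 2,500 for Tungsten (two per server), and the per-cycle ratio is my own derived figure, simply the work ratio divided by the clock ratio.

```python
# (per-processor R/Max in gigaflops, clock in GHz), from the article's figures.
# CPU counts of 2200 and 2500 assume two processors per server.
PPC_MARENOSTRUM = (27910 / 4800, 2.2)
PPC_SYSTEM_X = (12250 / 2200, 2.3)
XEON_TUNGSTEN = (9819 / 2500, 3.06)


def compare(xeon, ppc):
    """Return (work ratio, clock ratio, per-cycle ratio) of Xeon relative to PPC."""
    work = xeon[0] / ppc[0]
    clock = xeon[1] / ppc[1]
    return work, clock, work / clock


for label, ppc in (("MareNostrum", PPC_MARENOSTRUM), ("System X", PPC_SYSTEM_X)):
    work, clock, per_cycle = compare(XEON_TUNGSTEN, ppc)
    print(f"Tungsten vs {label}: {work:.2f}x the work at {clock:.2f}x the clock, "
          f"{per_cycle:.2f}x per cycle")
```

Against either PowerPC machine the Xeon does roughly 0.7 times the work per processor at a bit over 1.3 times the clock, which works out to about half the work per cycle.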

Overall, however, the new list is so dominated by PowerPC products and Itanium that long time Intel apologists are now questioning the appropriateness of the benchmark - basically Linpack shows Xeon struggling and these guys therefore want to throw out Linpack.

In fact, however, Linpack is more applicable to the loads that count for Mac users than you might expect. My six-year-old 300MHz "Tuxedo" PowerBook runs vi just fine, but dies horribly when confronted with a big image transform -- exactly the kind of operation Linpack measures.

On the other hand, some of these critics may be right about this being the last or next-to-last top 500 list - not because Linpack's becoming less applicable, but because IBM's Cell (itself PowerPC based, of course) could own all the top spots by this time next year.


Talkback

17 comments
  • Some rambling from me

    I would answer your article with 3 comments:

    The first is that the performance of massively parallel "super computers" is not necessarily a good indication of the performance that a desktop user will experience. It is possible that no one has put the same time and money into creating the best possible x86 grid as they have with PPC and Itanium chips.

    The second thing I would say is that I didn't think pure speed was the reason for Apple making the switch. I was under the impression that this move was more about improving the performance of Apple laptops where you want a good balance of power consumption, heat, and speed. Whether the PPC is faster in a super computer is unimportant to me if my laptop only lasts 1 hour.

    Finally, even if the PPC is architecturally capable of running fast and cool, it is obvious that IBM has not been interested in making the investment to do so, at least not to Apple's satisfaction. IBM probably doesn't care that much about the laptop market while Apple absolutely cannot ignore it. It is a good thing for Apple to sacrifice the speed of a "super computer" in order to capture the much more lucrative laptop market.
    NonZealot
    • Yes and no

      Agreed: Apple didn't switch for performance or cost reasons; they switched because IBM gave them a choice between a rock and a hard place - sign up for Cell and become an IBM subsidiary, or change and lose on cost and performance.

      Jobs went to plan B; good for him. I believe it was the wrong plan B, but he had to change to survive.

      Unfortunately he chose not to tell it like it is; instead, we now have pundits and Apple people telling us Intel either is or will be faster - that's the MSN view, and it's wrong.

      Disagreed: the comments about laptops and power are simply wrong. Intel's Yonah looks OK, if they can get it out, and if you don't count the power cost of what's not on the chip but should be. AMD, in contrast, has been putting stuff like memory management in hardware on the chip - meaning that their nominal power use per chip is higher, but the total in the box is lower.

      In any case, Apple's G4 laptops now run longer on a charge (provided the battery doesn't explode -:)!) than do those from the PC industry. I can fly from Edmonton to Dallas and rely on the Titanium to run the whole way; Intel users cannot.
      murph_z
  • Super computers can mean anything

    The super computer list is meaningless to good notebook or desktop computer design. You can always gang more PPC boxes together to get higher aggregate throughput.

    If you're talking about a single CPU socket design that most laptops and desktops use, then the Pentium M architecture holds the lead in the race. Sun's product is non production right now so you can't count them, and the chip won't do single threaded applications well. The vast majority of applications for the desktop and server market are optimized for fewer cores.
    george_ou
  • GFLOPS not the only factor

    How many GFLOPS a CPU puts out is meaningless to most PC users. It's only good for some applications such as scientific simulation or 3D rendering applications.

    When you're talking about 3D rendering with Maya, the dual G5 2.5 GHz machine is less than 1/2 the speed of the dual Pentium XEON 3.4 GHz machine.

    http://www.barefeats.com/macvpc.html

    You need to be very careful when you're talking about CPU speed since it really depends on the application you're trying to run. A Maya rendering test is a lot closer to the real world than some theoretical GFLOP number.
    george_ou
    • Right premise, wrong conclusions

      You're right, flops aren't everything - but they are critical to the workstation/desktop tasks that most often cause users to wait.

      Your Maya example illustrates an important problem. Maya is much slower on PPC than on Intel, but that's not because Intel chips are faster; it's because Maya's programmers focus on Intel. I believe that both their SPARC and their PPC/Mac ports consist largely of #ifdefs. That's fine for their vision of their market, but it says nothing about the CPUs.

      That's also, by the way, why some games programmers expect Intel-based Macs to be faster - for them they will be, because that's what they code for.

      The big benefit of the super computer example, besides the applicability of Linpack to the tasks causing waits, is that these machines use essentially the same OS and application suites with similar levels of customization. That levels the playing field, and shows that the G5 outperforms the P4/Xeon by nearly two to one per cycle and by about 3:2 at production cycle rates.
      murph_z
      • Implementation doesn't matter, just show me the numbers

        No one cares why the Mac sucks at Maya 3D, just that it does. No one cares why the Mac lost in 4 of those 6 tests, just that it did. No one cares why the Mac lost in all the games, just that it did.

        Frankly, it just sounds like a lot of excuses for the PPC. The fact of the matter is, theoretical numbers mean nothing if you can't execute. Pricing is also a huge consideration because I can build those dual XEONs for well under $2K. You can attribute it all to economy of scale if you want, but it doesn't make a difference to my wallet nor the market place. The personal computer market has spoken loud and clear.
        george_ou
        • Funny

          After murph's eloquent post comes george's reply.

          "No one cares why the Mac sucks at Maya 3D, just that it does."

          Actually, when putting down an architecture like the PPC, or
          drawing conclusions about its performance, the WHY is very
          important. The published list shows that the architecture can
          perform extremely well for certain applications; that developers
          of packaged software haven't taken advantage of it is
          disappointing.

          "Frankly, it just sounds like a lot of excuses for the PPC."

          Perhaps

          "The fact of the matter is, theoretical numbers mean nothing if
          you can't execute."

          These are not just "theoretical numbers", these architectures are
          chosen to build real-world supercomputers as the list shows.

          "Pricing is also a huge consideration because I can build those
          dual XEONs for well under $2K."

          Good for you.

          "The personal computer market has spoken loud and clear."

          I know, and we're still recovering from the last time they spoke
          loud and clear: we're stuck with the biggest malware-infested OS,
          one that hasn't had a release in over 4 years. Great point;-)
          Richard Flude
          • Typical elitism

            Everyone has their preferences and requirements. My priorities (desktop applications) may be very different from yours (clustered simulations). We mere mortals only care about our Mayas, Photoshop, silly little games, media encoding, and other desktop applications. We're just not smart enough to know why we need more theoretical GFLOPS on applications that we don't use.

            You can insult me or the modern desktop platform all you like. It just shows your elitism for what it really is. In the mean time, the rest of us go on with our lives.
            george_ou
          • Expected response from you

            Why is it that people like you always need to pull a topic back to a certain OS when it has nothing to do with the topic? It pretty much discounts your credibility.
            balsover
      • What really counts

        "That levels the playing field, and shows that the G5 outperforms the P4/Xeon by nearly two to one per cycle and by about 3:2 at production cycle rates."

        What really counts is the bottom line on the applications that users want to use. It doesn't matter if the G5 is 50% faster than the P4 at the same clock cycle, I don't care about the clock rate. What matters is efficiency X clock rate. What matters is performance divided by total system cost.

        If I were building a super computer with just a bunch of cheap $100 PC motherboards and cheap $250 dual-core 3.2 GHz Pentium D or AMD 64 CPUs, all I would need is a $5 CF adapter and a $40 compact flash card to load embedded Linux on it with some clustering software. I could pile the boards up in a custom made chassis/power-supply and I'd have a pretty mean supercomputer for less than $50,000.

        The point is that GFLOPS/dollar is all that matters when it comes to super computers.

        If you're talking about the desktop or notebook game, the Pentium M Yonah dual-core is where the action's at within the next couple of months.
        george_ou
      • Unrealistic

        Comparing processors purely on clock cycle performance is a waste of time. There is more to performance than simply clock speed; the supporting hardware is just as important, if not more so. Does anyone really think that these million dollar dream machines were made using generic motherboards from your local consumer electronics store? While these super computers were constructed with only performance in mind, consumer PCs are designed to be cost competitive. If Maya on Intel outperforms Maya on PPC by such a large margin, it is more likely due to the supporting hardware than the software. Optimizing compilers do a pretty good job these days, so #ifdefs in code do not explain a 2x performance difference; on both processors the algorithm is exactly the same. If I could get a dual processor Apple with the same supporting hardware that one of these supercomputers uses, with an equally impressive video card, then Maya would probably run faster on my hypothetical Apple than on the standard Intel dual processor Xeon board.

        What matters is the hardware that the consumer can purchase, not magic numbers from blue sky dream machines.
        balsover
        • Quite right -and a small but

          Yes the supporting hardware is critical - particularly when assembling grids because most of the delay occurs during signal transmission and handling.

          At the CPU level AMD derives its advantage over Intel from precisely this: more and faster throughput hardware on the chip - and, of course, it's getting rid of those distances that powers IBM's Cell too.

          On the other hand, most games makers (and I assume Maya as well, although I don't know this) use Intel's SIMD instructions and fail to translate these to Apple's AltiVec (or SPARC's four-way SIMD set) when they port to those environments. That's why they appear so much slower when not on their native x86 processors - they're only using part of the chip and may even be forcing the Apple (or SPARC) host to do extensive workaround processing to accommodate x86-specific optimizations.
          murph_z
  • supercomputers of today....desktops of tomorrow

    well in ten years anyway.
    pesky_z
    • We still don't have a 10 year old Super computer on our desks

      I think that you are stretching it a bit.
      balsover
  • I don't think it was about speed...

    The decision was based on supply. It had nothing to do with the speed of the chip. The Mac OS X makes lots of folks fairly happy at its present speed.

    Jobs was being dissed by IBM because Apple represented just 2% of IBM's output. He wasn't worth the hassle to IBM, and cut the cord on his own terms.

    Intel is going to treat Apple as an important customer. More important, I think, is they will introduce Apple to their network of OEMs who can supply as many Macs as Jobs can sell.

    Most people feel that a Mac priced like a PC is a better value. If that's still true next year, and if Intel's OEMs can supply as many Macs as Jobs can sell, imagine what that will do to the market share equation vs. Windows.

    Imagine Dell Macs and HP Macs.

    If Dell and HP don't go to Mac after the market share rises, then you'll see everyone running Lenovo Macs, which they won't like. They will have to support the architecture.

    I think this kind of thinking -- whether right or wrong -- says more about what's happening here than a speed comparison of the chips.
    DanaBlankenhorn
    • Apple's decision wasn't about speed, the gloating is

      Right, Apple's decision was made for its own survival, not for speed - a case of chewing off a limb to let the body survive.

      On the other hand an overwhelming majority of "pundits" have gloated over this as recognition of Intel's (and by identification their own) superiority. What these numbers show is that the gloaters are wrong - x86 would be a long step down for Apple.
      murph_z
  • Supercomputers and desktops

    A few comments ...

    The article is relevant to the small number of people who are interested in supercomputer performance.

    GFLOPS is an interesting measurement, and it's important in certain types of processing tasks. Nobody's pretending that it can be used for every type of comparison.

    The article won't teach you much about the performance of desktops equipped with the CPUs used in the supercomputers. So if that's what you're interested in, and you didn't learn what you wanted to know, well, the article wasn't written for you in the first place.

    I would like to see some more practical tests of various types of processing tasks, and compare the performance top 20 on each. It would be interesting to compare those to the GFLOPS top 20.
    netminder