NASA gets SGI 2048-core Itanium 2 supercomputer

I had a chance to speak with NASA and SGI at the SC07 supercomputing convention in Reno this week, where I saw one of the biggest supercomputers in the world. Pictured left is a 1024-core version of the Altix 4700, and NASA just bought one with twice as many cores (1024 dual-core Itanium 2 processors, for 2048 cores in total). It is based on the Montecito variant of Intel's Itanium 2 processor and has 4 terabytes of RAM.

This massive supercomputer is the most powerful single-node computer in the world (based on the SPECint_rate2006 and SPECfp_rate2006 results databases), and it has one of the largest single-system memory pools in the world. Some applications simply can't be effectively broken down into smaller tasks for a cluster of smaller nodes because of excessive communications overhead. For those, this is really the only system that can crunch the hard problems.

To give you some idea of how powerful this system is, a 256-core version of the SGI Altix 4700 has a SPECfp_rate2006 score of 3507 and a SPECint_rate2006 score of 2970. The biggest 16-core Intel X7350 2.93 GHz server scores 119 on SPECfp_rate2006 and 214 on SPECint_rate2006. The biggest 16-core AMD Barcelona server has a SPECfp_rate2006 score of 136 and a SPECint_rate2006 score of 160. A 16-core IBM Power6 has a SPECfp_rate2006 score of 428 and a SPECint_rate2006 score of 478, though the latest 32-core version probably has double that performance. But even the Power6 is dwarfed by the 256-core SGI machine, let alone what a 2048-core version can do.
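To put those numbers side by side, here is a small Python sketch that works out per-core scores and how many times larger the 256-core Altix totals are. The scores are the ones quoted above; the names `SCORES`, `per_core`, and `altix_advantage` are mine, purely for illustration, and dividing a rate score by core count is only a rough comparison, not an official SPEC metric.

```python
# SPEC CPU2006 rate scores as quoted in the article:
# name -> (cores, SPECfp_rate2006, SPECint_rate2006)
SCORES = {
    "SGI Altix 4700": (256, 3507, 2970),
    "Intel X7350": (16, 119, 214),
    "AMD Barcelona": (16, 136, 160),
    "IBM Power6": (16, 428, 478),
}


def per_core(cores, fp_rate, int_rate):
    """Return the (fp, int) rate scores divided by the core count."""
    return fp_rate / cores, int_rate / cores


def altix_advantage(name):
    """Return how many times larger the 256-core Altix totals are than `name`'s."""
    _, altix_fp, altix_int = SCORES["SGI Altix 4700"]
    _, fp_rate, int_rate = SCORES[name]
    return altix_fp / fp_rate, altix_int / int_rate


for name, (cores, fp_rate, int_rate) in SCORES.items():
    fp_pc, int_pc = per_core(cores, fp_rate, int_rate)
    fp_x, int_x = altix_advantage(name)
    print(f"{name:15s} fp/core={fp_pc:5.1f} int/core={int_pc:5.1f} "
          f"Altix total is {fp_x:4.1f}x fp, {int_x:4.1f}x int")
```

On these quoted figures the Power6 comes out ahead per core, while the Altix wins on total throughput inside one system image, which is the comparison that matters for jobs that can't be split across a cluster.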

Of course, there are plenty of jobs that do break down nicely for clusters and plenty of jobs that don't need that much single-node memory. That's why NASA also purchased an Altix ICE 8200 cluster using 16 of the racks pictured left. Each one of these racks contains 64 dual-processor Intel Xeon x86/x64 servers, and 16 of these racks make a 1024-server cluster with 4096 Xeon CPU cores.
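As a quick sanity check on those counts, here is a short Python tally. It assumes "dual-processor" means two sockets per server and infers the cores per socket from the 4096-core total; the constant names are mine and not anything from SGI.

```python
# Figures quoted in the article
RACKS = 16
SERVERS_PER_RACK = 64
TOTAL_CORES = 4096

# Assumption: "dual-processor" means two CPU sockets per server
SOCKETS_PER_SERVER = 2

servers = RACKS * SERVERS_PER_RACK          # servers in the whole cluster
sockets = servers * SOCKETS_PER_SERVER      # physical processors
cores_per_socket = TOTAL_CORES // sockets   # implied by the quoted core total

print(f"{servers} servers, {sockets} sockets, "
      f"{cores_per_socket} cores per socket, {TOTAL_CORES} cores total")
```

Under that assumption the 4096-core total works out to dual-core chips in every socket.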

The Altix ICE 8200 rack includes 20 Gbps InfiniBand switches on the sides for the cluster interconnect, and the racks can be chained together with InfiniBand. NASA has for the most part used very large shared-memory systems like the Altix 4700 above, but it has just started buying clustered systems.

Talkback

51 comments
  • So that's who's keeping SGI alive

    Granted, I know that the Itanium 2 scales well, but so does the POWER6. In fact, if the POWER6 scales as well as it should, they would require half of the cores that the Itanium 2 would need. I am also sure that if they wanted to build a 1024-core system, IBM would gladly oblige them and create a custom rig. They do it for the government all the time. "Blue Gene/L" anyone?

    Still, I would love to be there in the middle of all that supercomputing power. Much like a teenager cut loose in a garage full of muscle cars. I would probably be drooling.
    nucrash
    • Biggest Power6 is only 32 cores

      Biggest Power6 is only 32 cores which means it's only a little better than a 64-core 1.6 GHz Itanium 2 system. Power6 gets smashed by a 128-core Itanium 2 system let alone what a 2048-core system can do.
      georgeou
      • So IBM needs to scale it up.

        Well, Itanium 2 is an older processor that has had some time to be developed into more platforms as opposed to the POWER6.

        A POWERPC400 would probably be more apt to scale up to the amount of Itanium 2, or whatever they use in the BlueGene series of systems. Plus the Blue Gene Solutions do pride themselves on being a greener solution.
        nucrash
        • I'm talking about single system image and shared RAM

          I'm talking about single system image and shared RAM and nothing beats these *NEW* Itanium 2 based systems.
          georgeou
  • Does it run Windows, George?

    Or a decent server OS (presumably the SGI IRIX or Linux)?
    BanjoPaterson
    • I know the answer

      http://www.wxwidgets.org/images/screens/win64_controls.jpg

      Duh!!!
      nucrash
    • The real question is...

      ...would this be enough to run Crysis?
      itpro_z
      • That is a bit harder

        I doubt they have a video card powerful enough, although they might have PCI Express on a node or two. But hardware is only part of the issue. Though you have Windows for the Itanium, your application is still designed for the x86 instruction set. Even though Crysis may support a 64-bit (well, 48-bit) design, the way the Itanium executes instructions is vastly different from that of an x86 chip. To get around this, the Itanium can emulate x86 with the help of emulation software. But as almost always, this comes at a cost. Emulation of x86 on the Itanium is slow, and I don't mean slow compared to the bleeding edge. I mean gigahertz-become-megahertz slow. Also, the more I/O you tax the system with (a video card, for example), the slower the application gets.
        nucrash
    • It runs...

      When I used to work on their Altix 3000 series machines, they were using a custom kernel version of Red Hat. Since the original Red Hat kernel couldn't support enough processors, they had to get one modified to support the massive increase in processors (I believe at the time Red Hat only supported 32 processors max) and the interconnect scheme they used to unify the whole system.

      SGI has been historically reluctant to change OSes, so I would assume that they are still committed to their Red Hat kernel. IRIX was used on the NEC RISC processor based architecture of the Origin 3000 / 2000 and prior systems.
      Zorched
      • Thank you for the answer

        It surprised me.
        BanjoPaterson
    • NASA runs Linux, but Windows Server 2000 or 2003 does run on Itanium

      nt
      georgeou
      • True - but will it cluster to the same degree?

        nt
        BanjoPaterson
        • Windows is a minority in the HPC space, but they're growing very fast

          Windows is a minority in the HPC space by far, but they're growing very fast in this space. Microsoft is still what you would call a "noob" in the HPC space but they are getting better at it and they are growing very fast. I wouldn't count them out.
          georgeou
          • Fair Nuff (nt)

            Have a good weekend :-)
            BanjoPaterson
          • Backing up these Claims....

            In the last 6 months, Windows has tripled its number of supercomputers in the Top 500 list.

            It doubled in the 6 months before that. So that is 500% growth in a matter of a year.

            In other words, Linux is starting to lose the battle in the HPC market unless it can make some advancement that Windows can't.

            Now, this isn't to start a flame war, this is just an observation, but if Windows is starting to make a difference in the HPC market, how long before they begin to take the server room back from Linux?
            nucrash
          • I don't think Windows will...

            reason being the ability to fine-tune and customize the kernel AND having direct access to the source code. In these types of systems this is imperative and, as noted by another poster, the NASA kernel is a custom in-house modification of the Red Hat kernel. It's the one area that will keep Microsoft relegated to 3rd, maybe second place in this arena. ]:)
            Linux User 147560
          • You never really know

            After all, they do have the Shared Source Initiative that they use with governments. Being that NASA is a government organization, there is that possibility that NASA could utilize this to their advantage.

            But then again, who really knows.
            nucrash
          • Make no mistake, they're still noobs but they're learning fast

            Make no mistake, they're still noobs but they're learning fast in the HPC space. The EU is already complaining that Microsoft is dominating 80% of the server market (probably comparing to paid Linux and not roll your own Linux shops).
            georgeou
    • Linux (Red Hat or SuSE)

      Depending on the node count, you can have either Red Hat or SuSE.
      jashley
  • What a joke

    We sent men to the moon with the computing power of a modern PDA (or less) and NASA is screwing around with supercomputers.

    That's probably why the shuttle doesn't have a replacement ready to go or why there was no evolution in the shuttle program over the years (upgraded rockets, or different types of shuttles, etc.).

    Maybe if they concentrated on core duties, NASA wouldn't be a giant waste of taxpayer money.
    croberts