IBM hopes to upend industry standard server ROI equation

Summary: IBM will introduce a new class of industry standard servers that it hopes will widen its market share lead and put rivals like HP and Dell on defense.

IBM on Tuesday will introduce a new class of industry standard servers that it hopes will widen its market share lead and put rivals like HP and Dell on defense.

Big Blue's new family of servers, dubbed the eX5 portfolio, features architecture tweaks that allow the customer to add more memory without buying an entirely new server. IBM spent three years engineering the systems, which will be previewed at CeBIT in Germany.

IBM plans to alter the industry-standard server return equation via memory expansion options and an extra chip that's designed to coach the system to better performance. IBM said it will announce three eX5 systems through 2010: a four-processor version, a new blade design and an entry-priced two-processor server.

The big advantage here appears to be IBM's memory pitch. The eX5 line is engineered to support more DIMMs (Dual In-Line Memory Modules). A DIMM is a printed circuit board that holds memory chips and plugs into a socket on the motherboard.

Industry standard x86 blade servers generally come with 12 to 16 DIMMs and if that's maxed out you need to buy another server. The real trick would be to add more memory without buying a new server and all the hardware that goes with it.

IBM's plan with eX5? Offer blade and rack servers that have 16 DIMMs standard and then the ability to add an additional 24.
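
To put the slot counts in perspective, here is a minimal sketch of the capacity arithmetic. The 16 standard and 24 additional DIMM slots come from IBM's description above; the 8GB module size is purely an assumption for illustration, not an IBM specification.

```python
# Slot counts are taken from the article; the module size is an assumed figure.
STANDARD_SLOTS = 16
EXPANSION_SLOTS = 24
ASSUMED_DIMM_GB = 8  # hypothetical DIMM capacity, not from IBM


def max_memory_gb(slots: int, dimm_gb: int) -> int:
    """Return the total memory when every slot holds a module of dimm_gb."""
    return slots * dimm_gb


base_gb = max_memory_gb(STANDARD_SLOTS, ASSUMED_DIMM_GB)
expanded_gb = max_memory_gb(STANDARD_SLOTS + EXPANSION_SLOTS, ASSUMED_DIMM_GB)

print(f"Standard slots only: {base_gb} GB")
print(f"With expansion:      {expanded_gb} GB")
print(f"Capacity multiple:   {expanded_gb / base_gb:.1f}x")
```

Whatever module size is assumed, going from 16 to 40 slots raises the memory ceiling by a factor of 2.5 without adding a second server.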

According to IBM, the win is that customers don't have to buy a new server when they max out memory. They can simply buy more memory and can do it in smaller increments for overall savings. Tom Bradicich, IBM fellow and vice president of IBM x86 servers, said expansion options are a big plus because customers were buying full systems when they only really needed more memory. Those additional servers led to higher maintenance and license costs. "It was like buying a full Happy Meal when all you really wanted was the prize," he explained.

Bradicich said this approach can help customers buy less equipment, cut energy costs and prevent server sprawl. He argues that x86 servers are based on a PC architecture that is three decades old and locks memory and processing power together. "PC architecture shouldn't masquerade as enterprise server," said Bradicich.

Aside from the memory advantages, IBM is also adding a chip of its own to the eX5 systems. The chip, based on IBM's X-Architecture, will ride shotgun along with standard Intel server chips and memory. This IBM chip will cut the latency between memory and the processor. With the additional chip, IBM claims that its eX5 portfolio will deliver 30 times better database performance compared to the current generation of systems, 99 percent better performance per watt and the ability to run 78 percent more virtual servers for the same license cost.

Bradicich added that the eX5 chip pulls together I/O, chip, memory, storage and networking to coax more performance out of industry-standard memory and chips.

Big Blue said pricing of these new servers will be competitive with the broader market, but specifics would wait until Intel launches its latest server chips at the end of the month.

Talkback

18 comments
  • How is this new tech?

    I recall this kind of tech being sold in the 80's
    and 90's in the form of PCI add-on cards where you
    could add another proc and/or more RAM. Or for
    the PowerPC Mac, the ability to run PC apps via a
    celeron chip, ram, etc on a PCI card. Seems like
    it was just re-engineered for current times if
    anything LOL
    unredeemed
    • SIMM Stackers!

      http://www.maxcom.com.tw/html/ss.htm
      JetJaguar
  • RE: IBM hopes to upend industry standard server ROI equation

    Yep, they reinvented the daughterboard...in a new form factor. Good for you IBM, here's your cookie.
    MadLyb
  • RE: IBM hopes to upend industry standard server ROI equation

    Hmm. Am I the only one that thinks that the premise of "you don't need to upgrade the rest of the system - just the memory" is a bad one?
    I find that typically if the memory requirements grow, so have the overall system requirements. Even if you could swap the CPUs as well, would you not want new power supplies, faster bus, etc...

    Plus, it's called "commodity" hardware for a reason - it has a finite life-span.
    rossdav
    • You missed the point.

      This isn't a new version of "commodity" hardware. It's a completely new design that is aimed at server functionality. Re-read the article because you missed the part about the old designs not being up to the task...
      Timpraetor
  • One ECC failure can ruin your whole day

    The problem with adding large numbers of DIMMs to a single system is that the risk of an uncorrectable memory error rises to an uncomfortable level. One ECC failure and the entire server crashes. Many of these memory-heavy systems are being used for virtualization, and an error in any DIMM takes down all the VMs. There is currently no PC operating system capable of isolating the error and recovering.
    johndoe445566
    • And you are sure that IBM has not accounted for this?

      If you read the article, IBM's design discussion states that the existing design is not up to the task. This is a completely new design paradigm. If anyone can alleviate your concerns, IBM is at the top of the list for potentials.
      Timpraetor
      • Are you sure that they have?

        Or are you just taking that for granted? With zero facts to base a conclusion on, your response seems to just be background noise at best.
        jasonp9
        • Yes, as a matter of fact, I am.

          And you could be sure as well if you did a little digging. This article isn't the only source of info on the new designs.

          Do a little digging yourself before you introduce //more\\ noise into this thread.
          Timpraetor
    • But it has more than just ECC

      It has Chipkill and Memory ProteXion to withstand even two whole chips failing on a DIMM.
      halj78727
      • Uncorrectable errors happen. Get used to it.

        Memory errors are more common than many people think, especially on heavily loaded servers:

        http://arstechnica.com/business/news/2009/10/dram-study-turns-assumptions-about-errors-upside-down.ars

        Chipkill is certainly better than traditional ECC, but no error correction system is perfect. As you increase the total size of your RAM array, you increase the risk of an uncorrectable error. The fact remains that one such error takes down the entire server. If the server is running many virtual machines, that creates a lot of pain.

        My take is that there are very few workloads where a single "super sized server" is the best solution. I'm not saying that nobody needs a server with several TB of RAM, but very few people do. As such, IBM's announcement is mostly marketing BS, not some "breakthrough".
        johndoe445566
        • More common, yes, but with Chipkill

          (oops--posted as a reply to story rather than message!)

          you're in pretty good shape.

          I disagree that "very few" people need this. For a VM host, one of the critical factors is RAM. You don't even necessarily need these kind of ultra-reliable measures in that situation--just buy multiple boxes and set up an HA solution. If the memory fails in one server, you can failover to another.

          I would worry much more about unrecoverable error rates for disks (affecting rebuilds) more than RAM:

          http://blogs.zdnet.com/storage/?p=805
          blu_vg9
          • HA/clustering vs. mainframes

            "just buy multiple boxes and set up an HA solution."

            My point exactly! The best way for most people to achieve HA is with multiple boxes, not with a single "high reliability mega server".

            "I would worry much more about unrecoverable error rates for disks (affecting rebuilds) more than RAM"

            I'm worried about both.
            johndoe445566
        • Chipkill and DIMMkill

          Valid points about UEs in an ECC-only world. The probability for bit-flips increases with more memory bits in a system (more memory bits from higher density DRAM chips, the main culprit, as well as more of those memory chips via more DIMMs).

          However, a double bit unrecoverable error is dependent on the ECC domain.

          So a server with twice the memory will experience twice the CEs and UEs as one with 1X memory, just like two separate servers with 1X memory would. But it is not just the number of DIMMs, an 8GB DIMM will have twice the potential for errors as a 4GB DIMM.

          Chipkill helps most of this. I do not have data, but I would guess Chipkill provides similar reliability at 100GB of RAM as ECC only provides at 1GB RAM. IBM studies in the late 1990s showed Chipkill reduced UEs by 150 times compared to ECC-only.
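
The scaling argument above can be sketched in a few lines. This assumes, as the comment does, that uncorrectable errors (UEs) grow linearly with memory size, and it takes the 150-times reduction from the IBM studies mentioned above at face value. The function name and the 1GB baseline are illustrative choices, not measured data.

```python
# Reduction factor quoted in the comment above (IBM studies, late 1990s).
CHIPKILL_REDUCTION = 150.0


def relative_ue_rate(memory_gb: float, chipkill: bool = False,
                     baseline_gb: float = 1.0) -> float:
    """UE rate relative to an ECC-only system with baseline_gb of RAM.

    Assumes errors scale linearly with the amount of memory installed.
    """
    rate = memory_gb / baseline_gb
    return rate / CHIPKILL_REDUCTION if chipkill else rate


# Doubling memory doubles the expected UEs on an ECC-only system.
print(relative_ue_rate(2.0))

# Chipkill at 100GB compared with ECC-only at 1GB.
print(round(relative_ue_rate(100.0, chipkill=True), 2))
```

Under these assumptions a Chipkill system with 100GB lands at roughly two thirds of the UE rate of an ECC-only system with 1GB, which is in line with the guess above that the two are similar.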

          As for the comment there is no PC operating system which can withstand a memory UE, this is simply not correct. It depends if the UE occurs in kernel space or user space of the host operating system, the error reporting capabilities of the processor, and the architecture of the OS.

          Nehalem-EX includes Intel's Machine Check Architecture (MCA) Recovery, formerly only found on Itanium. MCA can report to the OS the memory location of a UE. The operating system, if properly architected with a service management facility (rather than the legacy UNIX init system), can then deal with the error. If the error is in kernel space, it will panic and reboot the kernel, to prevent data corruption. If it is in user space, the OS can kill the affected process and restart it.

          VMware plans to support this in a future release of vSphere. What this means is if vSphere is running on Nehalem-EX, and a UE affects only a running VM, and not the ESX VMkernel, ESX will kill and restart the VM, and keep the other VMs running.
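
The recovery policy described above amounts to a small decision table. The sketch below is only an illustration of that logic: the Region enum and recovery_action function are hypothetical names, not part of any real operating system or VMware interface.

```python
from enum import Enum


class Region(Enum):
    """Where the uncorrectable error (UE) landed. Hypothetical categories."""
    KERNEL = "kernel"
    USER_PROCESS = "user_process"
    GUEST_VM = "guest_vm"


def recovery_action(region: Region) -> str:
    """Map the location of a reported UE to the action described above."""
    if region is Region.KERNEL:
        # Kernel state can no longer be trusted, so stop to avoid corruption.
        return "panic and reboot the kernel"
    if region is Region.USER_PROCESS:
        return "kill the affected process and restart it"
    # A UE confined to one guest leaves the hypervisor and other guests intact.
    return "kill and restart the affected VM, keep other VMs running"


for region in Region:
    print(f"{region.value}: {recovery_action(region)}")
```

The point of the design is containment: only an error in the kernel (or hypervisor kernel) takes the whole machine down, while an error in a single process or guest is handled locally.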

          One last thought on this. IBM knows what it is doing when it comes to DRAM memory. They have an excellent track record on reducing and tolerating UEs on the mainframe and pSeries. These machines have built-in hypervisors and are like giant ESX hosts with hundreds of gigabytes to terabytes of DRAM. They rarely crash.
          meh1309
          • Future plans vs. current reality

            "VMware plans to support this in a future release ..."

            As I said, no PC operating system available today can recover from an uncorrectable memory error. Let me know when VMware has something that (1) is proven to work, and (2) I can buy. Until then, it's just marketing hype.
            johndoe445566
  • Innovation

    Good to see IBM is innovating again. Innovation can also mean creating new pricing structures and the way people purchase products.
    Maxfli82