Between the Lines

Larry Dignan, Andrew Nusca and Rachel King

IBM hopes to upend industry standard server ROI equation

By Larry Dignan | March 2, 2010, 2:01am PST

Summary: IBM will introduce a new class of industry standard servers that it hopes will widen its market share lead and put rivals like HP and Dell on defense.

IBM on Tuesday will introduce a new class of industry standard servers that it hopes will widen its market share lead and put rivals like HP and Dell on defense.

Big Blue’s new family of servers, dubbed the eX5 portfolio, features architecture tweaks that allow the customer to add more memory without buying an entirely new server. IBM spent three years engineering the systems, which will be previewed at CeBIT in Germany.

IBM plans to alter the industry standard server return equation via memory expansion options and an extra chip that’s designed to coach the system to better performance. IBM said it will announce three eX5 systems through 2010: a four-processor version, a new blade design and an entry-priced two-processor server.

The big advantage here appears to be IBM’s memory pitch. The eX5 line is engineered to support more DIMMs (Dual In-Line Memory Modules). A DIMM is a printed circuit board that holds memory chips and plugs into a socket on the motherboard.

Industry standard x86 blade servers generally come with 12 to 16 DIMMs, and if that’s maxed out, you need to buy another server. The real trick would be to add more memory without buying a new server and all the hardware that goes with it.

IBM’s plan with eX5? Offer blade and rack servers that have 16 DIMMs standard and then the ability to add an additional 24.
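The slot counts above lend themselves to a quick back-of-the-envelope check. The sketch below is illustrative only: the 16 standard and 24 additional DIMM slots come from IBM's description, while the workload size (40 DIMMs) and the helper name `servers_needed` are assumptions made up for the example. It also ignores everything except slot count (CPU, I/O, and the fact that memory in separate servers is not pooled).

```python
import math

STANDARD_DIMMS = 16   # from the article: DIMM slots that come standard
EXPANSION_DIMMS = 24  # from the article: additional slots that can be added


def servers_needed(dimms_required: int, dimms_per_server: int) -> int:
    """Smallest number of servers whose combined slots hold dimms_required DIMMs."""
    if dimms_required <= 0:
        return 0
    return math.ceil(dimms_required / dimms_per_server)


# Hypothetical workload that needs 40 DIMMs' worth of memory.
required = 40
fixed = servers_needed(required, STANDARD_DIMMS)
expandable = servers_needed(required, STANDARD_DIMMS + EXPANSION_DIMMS)
print(fixed, expandable)  # prints "3 1"
```

Under these assumptions, a memory-bound workload that would spill onto three fixed 16-slot servers fits on one expandable system, which is the core of IBM's savings pitch.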

According to IBM, the win is that customers don’t have to buy a new server when they max out memory. They can simply buy more memory, in smaller increments, for overall savings. Tom Bradicich, IBM fellow and vice president of IBM x86 servers, said expansion options are a big plus because customers were buying full systems when they only really needed more memory. Those additional servers led to higher maintenance and license costs. “It was like buying a full Happy Meal when all you really wanted was the prize,” he explained.

Bradicich said this approach can help customers buy less equipment, cut energy costs and prevent server sprawl. He argues that x86 servers are based on a three-decade-old PC architecture that locks memory and processing power together. “PC architecture shouldn’t masquerade as enterprise server,” said Bradicich.

Aside from the memory advantages, IBM is also adding a chip of its own to its eX5 systems. The chip, based on IBM’s X-Architecture, will ride shotgun with standard Intel server chips and memory and cut the latency between memory and the processor. With the additional chip, IBM claims that its eX5 portfolio will deliver 30 times better database performance than the current generation of systems, 99 percent better performance per watt and the ability to run 78 percent more virtual servers for the same license cost.

Bradicich added that the eX5 chip pulls together I/O, chip, memory, storage and networking to coax more performance out of industry standard memory and chips.

Big Blue said pricing of these new servers will be competitive with the broader market, but specifics would wait until Intel launches its latest server chips at the end of the month.

Larry Dignan is Editor in Chief of ZDNet and SmartPlanet as well as Editorial Director of ZDNet's sister site TechRepublic.

Disclosure

Larry Dignan

Larry Dignan has nothing to disclose. He doesn’t hold investments in the technology companies he covers.

Biography

Larry Dignan

Larry Dignan is Editor in Chief of ZDNet and SmartPlanet as well as Editorial Director of ZDNet's sister site TechRepublic. He was most recently Executive Editor of News and Blogs at ZDNet. Prior to that he was executive news editor at eWeek and news editor at Baseline. He also served as the East Coast news editor and finance editor at CNET News.com. Larry has covered the technology and financial services industry since 1995, publishing articles in WallStreetWeek.com, Inter@ctive Week, The New York Times, and Financial Planning magazine. He's a graduate of the Columbia School of Journalism and the University of Delaware.

Talkback (most recent of 18)

  • How is this new tech?
    I recall this kind of tech being sold in the '80s and '90s in the form of PCI add-on cards where you could add another proc and/or more RAM. Or, for the PowerPC Mac, the ability to run PC apps via a Celeron chip, RAM, etc. on a PCI card. Seems like it was just re-engineered for current times, if anything LOL
    unredeemed
    1st Mar 2010
  • RE: IBM hopes to upend industry standard server ROI equation
    Yep, they reinvented the daughterboard...in a new form factor. Good for you IBM, here's your cookie.
    MadLyb
    2nd Mar 2010
  • RE: IBM hopes to upend industry standard server ROI equation
    Hmm. Am I the only one that thinks that the premise of "you don't need to upgrade the rest of the system - just the memory" is a bad one?
    I find that typically if the memory requirements grow, so have the overall system requirements. Even if you could swap the CPUs as well, would you not want new power supplies, faster bus, etc...

    Plus, it's called "commodity" hardware for a reason - it has a finite life-span.
    rossdav@...
    2nd Mar 2010
  • You missed the point.
    This isn't a new version of "commodity" hardware. It's a completely new design that is aimed at server functionality. Re-read the article because you missed the part about the old designs not being up to the task...
    Timpraetor
    2nd Mar 2010
  • One ECC failure can ruin your whole day
    The problem with adding large numbers of DIMMs to a single system is that the risk of an uncorrectable memory error rises to an uncomfortable level. One ECC failure and the entire server crashes. Many of these memory-heavy systems are being used for virtualization, and an error in any DIMM takes down all the VMs. There is currently no PC operating system capable of isolating the error and recovering.
    johndoe445566
    2nd Mar 2010
  • And you are sure that IBM has not accounted for this?
    If you read the article, IBM's design discussion states that the existing design is not up to the task. This is a completely new design paradigm. If anyone can alleviate your concerns, IBM is at the top of the list for potentials.
    Timpraetor
    2nd Mar 2010
  • Are you sure that they have?
    Or are you just taking that for granted? With zero facts to base a conclusion on, your response seems to just be background noise at best.
    jasonp@...
    2nd Mar 2010
  • Yes, as a matter of fact, I am.
    And you could be sure as well if you did a little digging. This article isn't the only source of info on the new designs.

    Do a little digging yourself before you introduce //more\\ noise into this thread.
    Timpraetor
    2nd Mar 2010
  • But it has more than just ECC
    It has Chipkill and Memory ProteXion to withstand even two whole chips crashing on a DIMM.
    halj78727@...
    2nd Mar 2010
  • Uncorrectable errors happen. Get used to it.
    Memory errors are more common than many people think, especially on heavily loaded servers:

    http://arstechnica.com/business/news/2009/10/dram-study-turns-assumptions-about-errors-upside-down.ars

    Chipkill is certainly better than traditional ECC, but no error correction system is perfect. As you increase the total size of your RAM array, you increase the risk of an uncorrectable error. The fact remains that one such error takes down the entire server. If the server is running many virtual machines, that creates a lot of pain.

    My take is that there are very few workloads where a single "super sized server" is the best solution. I'm not saying that nobody needs a server with several TB of RAM, but very few people do. As such, IBM's announcement is mostly marketing BS, not some "breakthrough".
    johndoe445566
    2nd Mar 2010
  • More common, yes, but with Chipkill
    (oops--posted as a reply to story rather than message!)

    you're in pretty good shape.

    I disagree that "very few" people need this. For a VM host, one of the critical factors is RAM. You don't even necessarily need these kinds of ultra-reliable measures in that situation--just buy multiple boxes and set up an HA solution. If the memory fails in one server, you can fail over to another.

    I would worry much more about unrecoverable error rates for disks (affecting rebuilds) more than RAM:

    http://blogs.zdnet.com/storage/?p=805
    blu_vg@...
    2nd Mar 2010
  • HA/clustering vs. mainframes
    "just buy multiple boxes and set up an HA solution."

    My point exactly! The best way for most people to achieve HA is with multiple boxes, not with a single "high reliability mega server".

    "I would worry much more about unrecoverable error rates for disks (affecting rebuilds) more than RAM"

    I'm worried about both.
    johndoe445566
    3rd Mar 2010
  • Chipkill and DIMMkill
    Valid points about UEs in an ECC-only world. The probability for bit-flips increases with more memory bits in a system (more memory bits from higher density DRAM chips, the main culprit, as well as more of those memory chips via more DIMMs).

    However, a double bit unrecoverable error is dependent on the ECC domain.

    So a server with twice the memory will experience twice the CEs and UEs as one with 1X memory, just like two separate servers with 1X memory would. But it is not just the number of DIMMs: an 8GB DIMM will have twice the potential for errors as a 4GB DIMM.

    Chipkill helps most of this. I do not have data, but I would guess Chipkill provides similar reliability at 100GB of RAM as ECC only provides at 1GB RAM. IBM studies in the late 1990s showed Chipkill reduced UEs by 150 times compared to ECC-only.

    As for the comment that there is no PC operating system which can withstand a memory UE, this is simply not correct. It depends on whether the UE occurs in kernel space or user space of the host operating system, the error reporting capabilities of the processor, and the architecture of the OS.

    Nehalem-EX includes Intel's Machine Check Architecture (MCA) Recovery, formerly only found on Itanium. MCA can report to the OS the memory location of a UE. The operating system, if properly architected with a service management facility (rather than the legacy UNIX init system), can then deal with the error. If the error is in kernel space, it will panic and reboot the kernel, to prevent data corruption. If it is in user space, the OS can kill the affected process and restart it.

    VMware plans to support this in a future release of vSphere. What this means is if vSphere is running on Nehalem-EX, and a UE affects only a running VM, and not the ESX VMkernel, ESX will kill and restart the VM, and keep the other VMs running.

    One last thought on this. IBM knows what it is doing when it comes to DRAM memory. They have an excellent track record on reducing and tolerating UEs on the mainframe and pSeries. These machines have built-in hypervisors and are like giant ESX hosts with hundreds of gigabytes to terabytes of DRAM. They rarely crash.
    meh130@...
    3rd Mar 2010
  • Future plans vs. current reality
    "VMware plans to support this in a future release ..."

    As I said, no PC operating system available today can recover from an uncorrectable memory error. Let me know when VMware has something that (1) is proven to work, and (2) I can buy. Until then, it's just marketing hype.
    johndoe445566
    3rd Mar 2010
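The scaling argument in the thread (errors grow in proportion to memory, and Chipkill cuts uncorrectable errors by a large factor) can be put into a rough model. This is only a sketch under loud assumptions: UEs are treated as a Poisson process whose rate is linear in installed gigabytes, the base rate `HYPOTHETICAL_UE_RATE` is a made-up number and not a measurement, and the 150x Chipkill reduction is simply the figure one commenter attributes to IBM studies from the late 1990s.

```python
import math

CHIPKILL_REDUCTION = 150.0    # figure quoted in the thread, not verified here
HYPOTHETICAL_UE_RATE = 0.001  # UEs per GB per year; invented for illustration


def expected_ues(memory_gb: float, rate_per_gb: float, chipkill: bool = False) -> float:
    """Expected uncorrectable errors per year, assuming the rate is linear in memory."""
    rate = rate_per_gb / CHIPKILL_REDUCTION if chipkill else rate_per_gb
    return memory_gb * rate


def prob_any_ue(memory_gb: float, rate_per_gb: float, chipkill: bool = False) -> float:
    """Chance of at least one UE in a year under a Poisson assumption."""
    return 1.0 - math.exp(-expected_ues(memory_gb, rate_per_gb, chipkill))


for gb in (64, 128, 256):
    ecc_only = prob_any_ue(gb, HYPOTHETICAL_UE_RATE)
    with_chipkill = prob_any_ue(gb, HYPOTHETICAL_UE_RATE, chipkill=True)
    print(f"{gb:>4} GB  ECC-only: {ecc_only:.4f}  Chipkill: {with_chipkill:.6f}")
```

In this model one 128 GB server expects the same number of UEs as two 64 GB servers combined, which matches the "twice the memory, twice the errors" point. What the model does not capture is the disagreement at the heart of the thread: how much of the workload a single UE takes down with it.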
