Analysis: Server Side Java energy efficiency versus load

With the arrival of the latest standardized energy efficiency benchmark from SPEC, we have a good way to measure server efficiency.  In light of the recent controversy over flawed energy efficiency studies that have unfortunately been touted by so many in the press instead of SPEC, I thought I'd offer some more in-depth analysis on energy efficiency.

The new SPECpower_ssj2008 benchmark gives us a standardized way of measuring energy efficiency for Server Side Java.  It reports efficiency data at target workloads from 0% to 100% in increments of 10%, then provides a Performance to Power Ratio curve along with an average efficiency across those 11 workload measurements.  The two graphs below are compiled from the SPEC database.  They represent the fastest Intel quad-core system (below left) versus the only AMD CPU submitted to the SPECpower_ssj2008 database to date, which is a special energy-efficient Opteron 2216HE (below right).
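
To make the arithmetic behind a Performance to Power Ratio curve concrete, here is a minimal Python sketch.  The measurements are made up for illustration and are not taken from any SPEC submission, and the function names are hypothetical.  It also assumes the overall figure is the sum of ssj_ops across all load levels divided by the sum of average power across those levels (active idle included), which is my understanding of how SPEC reports it; check the SPEC run rules before relying on that.

```python
# Hypothetical data: (target load in percent, ssj_ops, average watts).
# Assumed shape: throughput scales linearly with load, power rises
# linearly from 180 W at active idle to 280 W at 100% load.
measurements = [(pct, 3000 * pct, 180.0 + pct) for pct in range(100, -1, -10)]


def per_level_ratio(levels):
    """Return (load percent, ssj_ops per watt) for every load level."""
    return [(pct, ops / watts) for pct, ops, watts in levels]


def overall_ops_per_watt(levels):
    """Sum of ssj_ops over sum of average power (assumed SPEC-style overall score)."""
    total_ops = sum(ops for _, ops, _ in levels)
    total_watts = sum(watts for _, _, watts in levels)
    return total_ops / total_watts


ratios = per_level_ratio(measurements)
overall = overall_ops_per_watt(measurements)

for pct, ratio in ratios:
    print(f"{pct:3d}% load: {ratio:8.1f} ssj_ops/watt")
print(f"overall: {overall:.1f} ssj_ops/watt")
```

Because the idle power is charged against zero work, the overall figure in this sketch comes out well below the ratio at 100% load, which is why a system with low idle power can look better than its peak number suggests.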

The two graphs above show more than a 3 to 1 advantage for the fastest Intel system when we look at it in terms of percent workload.  This is a perfectly valid way of analyzing the data, but the tradeoff is that you're not seeing the efficiency of each processor at absolute workloads which might be valuable if you need a system with lighter workloads.  So to offer an alternative method of interpreting the efficiency data, I plotted out the following Efficiency versus CPU capacity graph with published data from SPEC (and some MS Excel help from analyst David Kanter).

  • DP = Dual Processor
  • UP = Single Processor (Uni-Processor)
  • QC = Quad Core
  • DC = Dual Core
  • FB = Fully Buffered
  • "Operations per joule" is identical to ssj_ops/watt unit used by SPEC.
  • "Operations per second" refers to Server Side Java performance.

The blue curve represents the Intel E5450 server shown in the SPEC "Performance to Power" chart above left, while the cyan curve represents the AMD 2216HE system.  You'll notice that the curves are somewhat close together at the lower workloads, which means the AMD system is almost as efficient as Intel at lighter workloads.  But at peak performance levels, the Intel system is three times faster than the AMD 2216HE system and has more than three times the energy efficiency.  So if you had to buy three of the AMD 2216HE systems to get the same Server Side Java capacity as the Intel E5450, it would cost you three times the power.
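
The "three systems, three times the power" reasoning can be sketched in a few lines of Python.  The throughput and wattage figures below are placeholders chosen only to mirror a 3 to 1 performance gap; they are not the published SPEC numbers, and the helper names are hypothetical.  The sketch also assumes each box runs flat out and draws its peak power, which is a simplification.

```python
import math


def systems_needed(target_ops, peak_ops_per_system):
    """Number of identical systems required to reach a target throughput."""
    return math.ceil(target_ops / peak_ops_per_system)


def total_power_watts(target_ops, peak_ops_per_system, watts_at_peak):
    """Total draw if every system runs at its peak (simplifying assumption)."""
    return systems_needed(target_ops, peak_ops_per_system) * watts_at_peak


# Placeholder figures, not measured data.
TARGET_OPS = 300_000                   # capacity the business needs
FAST_PEAK_OPS, FAST_WATTS = 300_000, 280.0
SLOW_PEAK_OPS, SLOW_WATTS = 100_000, 280.0

fast_power = total_power_watts(TARGET_OPS, FAST_PEAK_OPS, FAST_WATTS)
slow_power = total_power_watts(TARGET_OPS, SLOW_PEAK_OPS, SLOW_WATTS)

print(f"fast system: {systems_needed(TARGET_OPS, FAST_PEAK_OPS)} box(es), {fast_power:.0f} W")
print(f"slow system: {systems_needed(TARGET_OPS, SLOW_PEAK_OPS)} box(es), {slow_power:.0f} W")
print(f"power ratio: {slow_power / fast_power:.1f}x")
```

If the slower system drew less power per box than the faster one, the ratio would shrink accordingly, so the real answer depends on the measured wattage at peak for each system.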

You'll also notice the pink curve spiking upwards in efficiency just shy of the absolute peak efficiency level of Intel's latest 45nm E5450 3.0 GHz quad-core CPU.  This single-socket single-processor 2.4 GHz Xeon X3220 Intel server is by far the most efficient system at lighter workloads.  Had a newer single-socket CPU like the 45nm QX9650 3.0 GHz quad-core processor been used, the efficiency curve would probably fly off this chart.  Intel's 5100 series "San Clemente" chipset will also get much better efficiency than anything on this graph because it uses lower-power registered DDR2-667 memory like AMD.

How to spot a flawed CPU energy efficiency study

Now that we've gone through some thorough analysis on energy efficiency, let's look back to the flawed CPU study from Neal Nelson and associates.  Upon further investigation, I found that not only is Nelson's test flawed in the sense that Intel's best players aren't included, but the test is fundamentally flawed.

In a bastardized manner, Nelson's efficiency results actually look like my efficiency versus capacity graph but he chops off the right side of the graph with arbitrary performance caps.  Since it's possible for a less efficient chip to get better efficiency at lower workloads, you can manipulate the graph to favor one vendor over another simply by playing with the arbitrary cap on performance.

Updated 1/25/2008 - To illustrate this problem with Nelson's efficiency study, we can look at Neal Nelson’s “published paper”.  As you can see, Nelson is capping the performance of every machine to 2407 TPM (transactions per minute) for 100-user loads and 12036 TPM for 500-user tests.  Nelson's TPM numbers are identical to within 0.1% deviation whether you're using a single 2GHz processor server with 1GB RAM or a dual 2.33 GHz server with 16GB RAM.  That means the Intel CPU, no matter how fast it can go, will never be allowed to perform to its full potential.  Nelson called me this week to explain that he doesn't cap the performance portion of his study, which appears to be true, but that has no relevance to the efficiency portion of his study, where the results are capped and are used to draw conclusions about energy efficiency.
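
A small Python sketch shows how a throughput cap can flip the apparent winner.  Everything in it is an assumption made for illustration: the two servers and their numbers are invented, the class and method names are hypothetical, and power is modeled as rising linearly from idle to peak, which real hardware only approximates.  None of it reproduces Nelson's data or method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    peak_ops: float     # maximum throughput the box can sustain
    idle_watts: float   # power draw at zero load
    peak_watts: float   # power draw at peak throughput

    def watts_at(self, ops):
        """Assumed linear power model between idle and peak."""
        ops = min(ops, self.peak_ops)
        span = self.peak_watts - self.idle_watts
        return self.idle_watts + span * ops / self.peak_ops

    def efficiency_at(self, ops):
        """Operations per watt at a requested load, clipped to what the box can do."""
        ops = min(ops, self.peak_ops)
        return ops / self.watts_at(ops)


# Invented example systems.
small = Server("small", peak_ops=100_000, idle_watts=100.0, peak_watts=150.0)
big = Server("big", peak_ops=300_000, idle_watts=200.0, peak_watts=300.0)

CAP = 50_000  # arbitrary cap applied to both machines

for server in (small, big):
    capped = server.efficiency_at(CAP)
    uncapped = server.efficiency_at(server.peak_ops)
    print(f"{server.name}: {capped:.0f} ops/W at the cap, {uncapped:.0f} ops/W at its own peak")
```

With these invented numbers the small server looks better at the cap (400 versus roughly 231 ops per watt) while the big server is better when each runs at its own peak (1000 versus roughly 667), so the choice of cap alone decides which one "wins".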

This is like saying that a Boeing 737 is more efficient than an Airbus A380 at carrying 130 people and then declaring to the entire world that Boeing is more efficient than Airbus.  But if you measured under the premise that you need to carry 525 people, the A380 will always be more efficient than four Boeing 737s in terms of passenger*mile/gallon.  This critical detail is omitted in Nelson's report.

Takeaway

You might conclude from this analysis that it's always better to buy the smaller computer system to run your business since it's more energy efficient and probably cheaper to acquire the hardware, but there are other factors to consider.  For one thing, software licensing often dwarfs the hardware costs so you want to maximize your software licenses in terms of performance.  Another problem is that if you're talking about a transaction system and you occasionally need to go beyond the peak performance you've allocated for, are you prepared to turn away those transactions?

Lastly, it is usually cheaper to have idle processors than idle people.  Having workers that sit around twiddling their thumbs while your server cranks isn't the best use of resources.  If this were a customer facing system (directly or indirectly), you risk losing customers.  You might only need X number of transactions per minute and you even provisioned a server with 25% overhead capacity, but that may still not be good enough.  A server operating at or near capacity has very slow response times which may violate your SLA (Service Level Agreement) with your users whereas a server operating with 50% overhead can respond much quicker.
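
The link between spare capacity and response time can be illustrated with a textbook queueing formula.  The Python sketch below assumes a single-server M/M/1 queue, where mean response time is the service time divided by (1 - utilization).  That model, the 10 ms service time, and the function names are all assumptions picked for illustration; a real server will behave differently in detail, but the shape of the curve is the point.

```python
def utilization_from_headroom(headroom):
    """Utilization when capacity exceeds demand by `headroom` (0.25 means 25% spare)."""
    if headroom < 0:
        raise ValueError("headroom must be non-negative")
    return 1.0 / (1.0 + headroom)


def mm1_response_time(service_time_ms, utilization):
    """Mean response time of an M/M/1 queue: service time / (1 - utilization)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_ms / (1.0 - utilization)


SERVICE_TIME_MS = 10.0  # assumed time to process one transaction on an idle server

for headroom in (0.05, 0.25, 0.50, 1.00):
    rho = utilization_from_headroom(headroom)
    response = mm1_response_time(SERVICE_TIME_MS, rho)
    print(f"{headroom:4.0%} headroom -> {rho:5.1%} busy -> {response:6.1f} ms mean response")
```

Under these assumptions, 25% headroom means the server is 80% busy and the mean response time is about five times the bare service time, while 50% headroom brings it down to about three times.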

So at the end of the day when you factor in the need for responsive computer systems, IT departments will always buy the system that meets the worst-case workloads.  Ideally those servers will be energy efficient at all workloads but your mileage will vary depending on how close you get to peak performance and it will depend on what you can tolerate in response times.

Talkback

48 comments
  • That was educational

    Apparently there is a reason that I continue to buy Intel for my servers.

    Licensing is always a battle though. I would love to put the latest quad core in a system, but the downside to that is that IBM and other companies charge by the core. Though this isn't a problem due to the fact that Intel's Dual core still out performs AMD's dual core. This could be a struggle later on if AMD gets their game in gear. Or if Intel slows up just enough to let AMD catch up again. I actually see the latter of the two happening just because of the recession we are in the midst of.
    nucrash
    • Since you complained for more IT content, that was for you

      Since you complained for more IT content, that was for you :).
      georgeou
      • Have to justify spending work time on ZDnet

        As I said before. I appreciate the system builds as well.

        Since I have two server builds coming up, I will be using this information.
        nucrash
        • No problem - nt

          nt
          georgeou
  • No question, AMD is disappointing ...

    everyone. If it weren't for them offering a "fairly" good performing desktop chip for under $60, I would be thinking more Intel. AMD motherboards are usually cheaper for desktops as well.

    How long does it take to recoup the price difference in chips by the amount of money saved in energy costs?

    Are we gonna' have to do this all over again when B3 Barcelona is ready?
    bjbrock
    • The AMD systems have higher energy consumption on the desktop

      "How long does it take to recoup the price difference in chips by the amount of money saved in energy costs?"

      The AMD systems have higher energy consumption on the desktop at every load level except below 4% CPU utilization, and the advantage there is minimal. At peak the AMD systems run much hotter.

      See data here.
      http://blogs.zdnet.com/Ou/?p=934
      georgeou
  • You're doing exactly the same thing you complained about.

    Except that you're comparing newer Intel processor to an old AMD processor.


    E5450 release date - 11Nov2007
    X3220 release date - 27Jul2007 (G0 stepping)
    5160 release date - 26Jun2006

    2216HE release date 15Aug2006

    I note that the 2216HE and 5160 are similar in terms of efficiency. That is not too surprising. They were released within one month of each other. Yes, the 5160 scales further, but I'm not debating the actual performance that can be achieved.

    Comparing the two newer Intel CPUs to a much older AMD CPU is EXACTLY the behavior you complained about in your other article.
    Letophoro
    • I think this article is fair.

      George already mentioned in the article that the only AMD CPU with numbers available on SPEC is the 2216HE. It is not the fastest from AMD, but it is HE. I don't think George needs to hand-pick some CPUs to prove his point.
      Victor2008
    • 2216HE is the best chip available from AMD

      Actually, I'm comparing the BEST chips available. The 2216HE is the best chip available from AMD for testing efficiency. The HE stands for "High Efficiency" and it uses less power than normal AMD chips. Had I excluded a more efficient chip from AMD that you can actually buy, then you have something to complain about.
      georgeou
      • 2216 certainly isn't the fastest chip available

        Seems pretty weak there George. The 2216 is only 2.4 GHz with a 68 W rating. At a minimum, AMD's site list the 2218 HE if you want to stick to the 68 W rating so you're at least clipping out 200 MHz in speed there alone.

        If you go up in wattage rating you can get to a 2222 at 95 W and be up at 3 GHz which is a 25% speed bump versus a 40% increase in wattage so that would probably also be worth looking at as well.

        I'd also note that the max wattage rating on AMD's third generation Opterons are 75 W.
        Robert Crocker
        • This is a brand new test and there are only one dozen entries in it so far.

          SPECpower is a brand new test and there are only one dozen entries in it so far and it tests a lot more parts and doesn't throttle performance like Nelson's test. This Sunday there will be a SPECpower workshop which will train a lot of people how to run the test and I'm attending it. The information that is there now is fairly representative of AMD's most efficient options available but it will no doubt be updated whenever AMD's partners get to it. The 2218HE may not have been available at the time of the testing and it would provide a small 8% bump in performance if you assume near perfect scaling.

          You should also note that even though processors state that they're in the same TDP envelope, faster processors even at the same TDP will tend to use slightly more power in real-world measurements.

          Lastly, I'm not making any broad proclamations or press releases based on a very narrow set of data like Neal Nelson. I present the most up to date information with all the qualifiers and disclosures. In my SPECpower analysis, I go out of my way to point out that the AMD system needs to test with 4 DIMMs instead of 8, I point out that AMD would do a lot better clock-for-clock core-for-core on a SPECweb derivative of SPECpower, and I point out that Barcelona will do much better. I present the data objectively to the best of my ability with the most up to date data that's available to me at the time of writing. If you think that's "weak", that's your opinion but I stand by my work.
          georgeou
          • It's weak

            When you say 2216 HE is the best AMD has when in fact you're only saying it's the result that you could find rather than the best chip they have to offer.

            You castigated Neal because he didn't use Intel's latest and greatest chips. He replied saying that he didn't have them available but would be more than happy to test them if someone were to provide them to him. You then tried to throw in some sinister angle by repeatedly and unnecessarily demanding to know who supplied him the chips (insinuating a grand conspiracy). And of course to top it all off you provided a wonderfully dispassionate analysis by labeling something that you didn't agree with "rigged".

            Well now here we have a set of tests/analysis that you've created under the rubric of how to correctly do something and lo and behold we find that you're using a 2-year-old product on the other side and saying "well that's the best I can get my hands on/there are stats on". Doesn't that sound even slightly familiar?

            Should we now go and demand who supplied you with your benchmarks?
            Robert Crocker
          • Neal Nelson is off by 80% whereas the current SPECpower data is off by 8%

            "You castigated Neal because he didn't use Intel's latest and greatest chips"

            Neal Nelson is off by 80% whereas the current SPECpower data is off by 8%. Neal has consistently avoided using Intel products that were leaps and bounds ahead in performance and efficiency. First he avoided using the Intel quad-cores last year, now he avoided using the 45nm chips from Intel AND he capped the performance which prevented Intel from winning.

            Your only legitimate complaint about SPECpower is that the 2216HE was used instead of the 2218HE. You can't tell me that results in more than an 8% improvement in performance and that has to assume perfect scaling for the 2218HE. Then as I explained that even if the 2218HE is the same TDP as the 2216HE, TDP is like boxing weight classes where it's a bracket and not an exact measurement. Actual power measurements will vary slightly and the 2218HE will actually consume slightly more power than the 2216HE which means the efficiency won't go up as much as you would expect.

            More importantly, Neal Nelson actually flipped the winner and loser whereas the current SPECpower data only shows a slightly bigger lead for Intel than it actually should. If you think there's any resemblance between these two cases, then you ought to get your head examined.

            "Should we now go and demand who supplied you with your benchmarks?"

            I've already disclosed everything on my data, go read the SPEC disclosures and the vendor is listed. Neal Nelson to this day still refuses to verify if AMD gave him the hardware to test. I think it's likely that AMD did since they're one of three vendors listed under Nelson's disclosure, but the other two vendors are software people which points to AMD.
            georgeou
          • It's not off by 8%

            You don't know what it's off by.

            You've at best conjectured that the 2218HE would be an 8% improvement without looking at any other AMD CPUs. You admit that the benchmark is new and very few people have run it, let alone understand how it works and thus how to run it optimally.

            You don't have numbers so don't speculate.

            Oh but wait, you DO speculate:
            "You'll also notice the pink curve spiking upwards in efficiency just shy of the absolute peak efficiency level of Intel's latest 45nm E5450 3.0 GHz quad-core CPU. This single-socket single-processor 2.4 GHz XEON X3220 Intel server is by far the most efficient system at lighter workloads. Had a newer single-socket CPU like the 45nm QX9650 3.0 GHz 45nm quad-core processor been used, the efficiency curve would probably fly off this chart." (Real dispassionate "analysis" there George.)

            And you continue to allege malfeasance and conspiracy with your breathless: "Neal Nelson to this day still refuses to verify if AMD gave him the hardware to test."

            Tell me, what difference would it make if AMD gave him the processors to test?
            Robert Crocker
          • 8% assumes perfect scaling. No CPU scales perfectly.

            8% assumes perfect scaling. No CPU scales perfectly, it's always worse than perfect so it's less than 8%. For you to deny this is ludicrous.

            You also neglect the fact that there are much faster and more efficient Intel systems that weren't tested either so the present data actually short changes Intel more than it does AMD.
            georgeou
          • Furthermore, the current SPECpower data does more injustice to Intel

            Furthermore, the current SPECpower data does more injustice to Intel than AMD. AMD's performance would probably be about 8% better if you changed out the CPU to the 2218HE.

            Intel on the other hand has the 3.2 GHz part on the Stoakley 5400 series platform with DDR2-800 memory which wasn't tested. This would have given Intel a much higher score (more than 8%) on performance because of the faster chipset. Intel also has the 5100 series San Clemente chipset which uses standard registered DDR2-667 memory like AMD and that would have allowed Intel to get a much better efficiency score.

            So when you really look at it, Intel actually gets shortchanged on performance and efficiency more than AMD in the currently available SPECpower data.
            georgeou
          • Please send your complaint to AMD

            SPECpower is open to everyone. If you think it doesn't have the best result for AMD, that is AMD's problem. It's not George's fault. Plus, as long as you can meet SPECpower's standard, you can play whatever trick you want to boost your numbers. You look like a loser who keeps complaining that the other player's racquet is better.
            Victor2008
          • Then should it be the basis for an "analysis"?

            It's a new test so it needs to be vetted more before you run out and draw conclusions from it.

            There are only one dozen entries so far so you lack sufficient data to perform an analysis.

            Yet you feel free to hold this up as the "correct" way to do an analysis.
            Robert Crocker
          • Please...

            ...this is a standard benchmark.

            AMD is part of the committee.

            If they can't publish results that are favorable, it's not George's problem.

            Take it up with AMD.
            thetruthhurts
      • Wrong in so many ways.

        1. "Actually, I'm comparing the BEST chips available. The 2216HE is the best chip available from AMD for testing efficiency."
        2216HE is not the BEST chip available. According to you, it is the ONLY one from AMD for which you have data from SPEC. Those are not the same thing.

        2. "The HE stands for "High Efficiency" and it uses less power than normal AMD chips."
        Add "that are two or more years old and 90nm or more" and you might have a valid statement.

        3. "Had I excluded a more efficient chip from AMD that you can actually buy, then you have something to complain about."
        Barcelona/Phenom? Oh, wait. You're going to use the "but I don't have data" argument. Piffle. You're simply doing an analysis based on flawed data. The lack of better data does not make the resultant "analysis" any more meaningful to the real world.



        You are still comparing a two month old Intel CPU with an AMD CPU that is a year and a half old. That is EXACTLY the behavior you wrote an entire article to bash.
        Letophoro