This week, sandwiched between the annual Structure Big Data conference and the International Supercomputing show in Hamburg, Germany, ARM startup and HP partner Calxeda also found time to release the first well-documented x86 versus ARM benchmarks. The results, shown below, are very positive — while there are some caveats that we need to note, the first generation ARM SOCs seem to deliver on their basic promise of much better performance per Watt.
The benchmark compares a new ARM SOC from Calxeda to a low-end Sandy Bridge (not Ivy Bridge) Xeon server with the same number of cores. It shows that the Xeon CPU, while delivering more performance, has a very large deficit in workload per Watt, which is one of the key value propositions of the ARM community. Benchmark details:
Interpreting The Benchmark
First of all, this is a single benchmark, and its relevance is limited to its domain: lightweight web serving on a small web server with a 1 Gb network. We cannot extrapolate the results to a faster network configuration (although my guess is that this configuration is bottlenecked by the network, and a faster Xeon would not make much difference), nor can we extend the interpretation to other workloads. But within the benchmark domain, this early comparison tells us some important things:
- Even with the current 32-bit ARMv7 architecture, the ARM CPU does indeed deliver impressive power efficiency.
- Absolute performance, especially considering the huge difference in clock speed, is higher than most of us expected.
- This benchmark succeeds as a basic proof of concept: ARM servers are indeed in the ballpark of their initial promises.
Even if we apply some “Kentucky windage” to the benchmark results to guess the impact of substituting a newer, more power-efficient Ivy Bridge CPU such as the Intel Xeon E3-1200 v2, and assume a 50% improvement in power efficiency, the ARM CPU workload/Watt advantage might drop to the neighborhood of 10X. Further deflation for a modest improvement in Ivy Bridge performance (assuming that network bandwidth is not a bottleneck) would bring it down a bit further, possibly to the region of 8X. That is still very compelling from an overall efficiency perspective.
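The back-of-the-envelope adjustment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Calxeda's or Intel's methodology. The 15X baseline is back-solved from the estimate that a 50% efficiency gain leaves roughly a 10X advantage, and the 25% performance uplift is a hypothetical value chosen because it lands near the 8X figure. The function name `adjusted_advantage` is my own.

```python
def adjusted_advantage(baseline_advantage,
                       x86_efficiency_gain=0.0,
                       x86_performance_gain=0.0):
    """Estimate ARM's workload-per-Watt advantage after crediting x86 gains.

    Gains are fractions (0.50 means 50%). Assumption: the performance gain
    comes at unchanged power draw, so both gains simply multiply x86's
    workload per Watt and therefore divide ARM's relative advantage.
    """
    x86_factor = (1.0 + x86_efficiency_gain) * (1.0 + x86_performance_gain)
    return baseline_advantage / x86_factor


# Assumption: baseline implied by "50% better efficiency -> about 10X".
BASELINE = 15.0

# Step 1: credit Ivy Bridge with a 50% power-efficiency improvement.
after_efficiency = adjusted_advantage(BASELINE, x86_efficiency_gain=0.50)

# Step 2: additionally credit a hypothetical 25% performance uplift.
after_both = adjusted_advantage(BASELINE,
                                x86_efficiency_gain=0.50,
                                x86_performance_gain=0.25)

print(f"After efficiency gain only: {after_efficiency:.1f}X")
print(f"After efficiency and performance gains: {after_both:.1f}X")
```

Plugging in different guesses for the two gains shows how sensitive the conclusion is: under this simple model, x86 would need a combined improvement of 15 times before the assumed advantage disappeared.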
But Intel is not lying down on the job. This week HP announced the “Gemini” server, an extension of its Project Moonshot scalable fabric-based server family powered by the latest generation of Intel’s Atom CPU, and it will probably move x86 performance per Watt closer to ARM when it becomes available later this year. We look forward to the continued competition in server technology and what it means for users.