Benchmarks don't measure up for new platforms

Modern server designs are powerful and complex. Old benchmarking ideas are not up to the task of measuring what matters.

Some time next month — the smart money is on 11 April — the latest and by any measure greatest Airbus will depart the tarmac on its first test flight. There is great excitement in the aviation community about this: the A380 is an aircraft of superlatives. Bigger than a 747, it will set new standards for range, carrying capacity, efficiency and sheer size. What nobody is talking about, because nobody cares, is how fast it flies.

With Sun preparing an eight-way Opteron server and Intel gearing up for next week’s launch of its next-generation multi-processor Xeon platforms, life at the affordable end of performance computing has rarely looked so rewarding. Commodity computing is pushing its way up towards the sort of performance once reserved for specialist supercomputers; single racks serving thousands of users doing serious transactions are no longer the stuff of fiction. Meanwhile, everyone is talking about manageability, reliability and transactional throughput. A modern server processor design must have features that address all these. What fewer people are talking about is how fast it runs.

With aircraft, the most efficient top speed is set by aerodynamics. The A380 flies at much the same speed as a 707 of 50 years ago; the new Xeons, shackled by thermodynamics, will have much the same clock rate as the previous models. In both cases, the things that matter now are not easily distilled into neat lists of integers: the aviation industry has realised this, but IT is still coming to terms with the idea.

It’s always been difficult to model real-world computing in a fashion that is repeatable, transferable and relevant, but the latest multi-processor platforms break the idea completely. Unlike high-performance computers, which typically run a small set of very well characterised tasks, a typical eight-way server will be dealing with lots of different requests to use heterogeneous data in a wide variety of ways. Who cares how quickly it can run floating-point instructions in loops? Who can simulate a thousand users hammering on a terabyte of real-world data?

Benchmarks are falling out of fashion, and they should fall further. With modern computing environments nearly impossible to simulate and vendor figures casting a light on affairs that is artificial at best, data centre denizens should set their own rules. Before investing in any of the new breed of n-way servers, decide on the performance you need when running your own mix of tasks, measured in ways that are meaningful to you. Then demand that the vendors prove they can deliver it.
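
For readers who want something more concrete, here is a minimal sketch in Python of what "setting your own rules" might look like. Everything in it is an assumption made for illustration: the task names, the weights in the mix, the latency targets and the helper functions `run_mix`, `p95` and `meets_targets` are hypothetical, and the stand-in tasks would be replaced by calls into your own applications. It draws requests at random according to your mix, times each one, and checks the 95th-percentile latency of each task against the figure you decided you need.

```python
import random
import time
from statistics import quantiles


def run_mix(tasks, weights, n_requests, seed=0):
    """Run n_requests tasks drawn at random according to weights.

    tasks maps a task name to a zero-argument callable; weights maps the
    same names to relative frequencies. Returns {name: [latency_seconds]}.
    """
    rng = random.Random(seed)  # fixed seed so the request sequence is repeatable
    names = list(tasks)
    mix = [weights[name] for name in names]
    latencies = {name: [] for name in names}
    for _ in range(n_requests):
        name = rng.choices(names, weights=mix)[0]
        start = time.perf_counter()
        tasks[name]()
        latencies[name].append(time.perf_counter() - start)
    return latencies


def p95(samples):
    """95th-percentile of the samples (requires at least two samples)."""
    return quantiles(samples, n=100)[94]


def meets_targets(latencies, targets):
    """Return {name: bool}: True where the measured p95 is within the target."""
    return {name: p95(latencies[name]) <= limit for name, limit in targets.items()}


if __name__ == "__main__":
    # Hypothetical stand-ins: swap in calls that exercise your real workload.
    tasks = {
        "lookup": lambda: sum(range(1_000)),
        "report": lambda: sorted(range(50_000), reverse=True),
    }
    weights = {"lookup": 9, "report": 1}         # assumed mix: 90% lookups
    targets = {"lookup": 0.005, "report": 0.050}  # assumed p95 limits, in seconds

    results = meets_targets(run_mix(tasks, weights, n_requests=2_000), targets)
    for name, ok in results.items():
        print(f"{name}: {'meets target' if ok else 'misses target'}")
```

The point of the sketch is the shape rather than the numbers: the mix and the targets come from you, not from the vendor, and the pass or fail answer is stated in your own terms.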

This might sound like more work than watching graphs on PowerPoint. It is. But from now on — and especially as we move towards virtualisation and multicore chips — reality is the only benchmark worth having.