Is workflow analysis better than relying on benchmarks?

IBM's Dave Turek discussed Exascale computing and why the benchmarks the industry is relying on today simply aren't good enough.
Written by Dan Kusnetzky, Contributor

In the commentary "IBM's Intelligent Clusters - an old idea done well," I discussed one of the many excellent sessions IBM presented during a technical computing analyst day in late May 2012. This time, I'd like to examine a very interesting discussion, presented by Dave Turek, IBM's VP of Exascale Computing, about Exascale computing and why most of today's benchmarks are no longer useful tools for evaluating system performance.

First of all, Dave spent a few moments defining the concept of Exascale computing. A simple definition is that Exascale computing offers levels of performance that are orders of magnitude beyond even the largest clusters working today. The performance of an Exascale system is measured in Exaflops. What does this mean? An Exaflop is 1 quintillion (or 1 million trillion) floating point operations per second. Although the primary goal of these future systems is to execute technical computing workloads faster than is possible using today's technology, it is clear to me that systems of this scale will eventually be used to support the largest commercial workloads as well.
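To put that number in perspective, here is a quick back-of-the-envelope sketch. The workload size is purely illustrative and not drawn from the session:

```python
# Illustrative scale comparison; the operation count is a made-up workload.
EXAFLOPS = 10**18   # one Exaflop: a quintillion floating point ops per second
PETAFLOPS = 10**15  # a petaflop, roughly the class of 2012's top clusters

operations = 10**21  # hypothetical workload: a thousand Exaflop-seconds of work

seconds_at_exascale = operations / EXAFLOPS
seconds_at_petascale = operations / PETAFLOPS

# An Exascale machine finishes in about 17 minutes; a petascale machine
# would need roughly 11.6 days for the same (hypothetical) workload.
print(seconds_at_exascale)   # → 1000.0 seconds
print(seconds_at_petascale)  # → 1000000.0 seconds
```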

Since this session was a discussion of future technology, Dave then went on to consider how we can measure such powerful systems. Today's benchmarks, such as Linpack or those offered by the Transaction Processing Performance Council (TPC), don't really measure all of the important performance characteristics of such vast systems. To measure systems at this scale, the workflows they're supporting should be examined. He used the following figure to discuss what is really needed.

Today's technical benchmarks examine either:

  • The flow of work from the processor through the cache to the system's memory, or
  • The flow of work from the processor through the cache to the system's memory and then on to the system's storage

These types of benchmarks simply can't capture what is really happening in distributed and/or cloud computing systems. The cost and time involved in sending applications and data over a network to remote data centers need to be included as well.

Dave suggested that workflow analysis, that is, examining each of the resources used by a complete workflow, is a much better approach. It would be helpful in considering the systems required to support applications that make intense use of processors, cache, main memory, storage systems, and public and private networks.
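The idea can be sketched in a few lines. The stage functions below are hypothetical stand-ins of my own invention, not anything IBM described; the point is simply that the measurement wraps the whole workflow, network transfer included, rather than one subsystem:

```python
import time

def timed(label, fn, *args):
    """Run one stage of a workflow and record its wall-clock cost."""
    start = time.perf_counter()
    result = fn(*args)
    return label, time.perf_counter() - start, result

# Illustrative stages only: real workflow analysis would also track
# power, cooling, and data-movement costs, not just elapsed time.
def load_from_storage():
    return list(range(100_000))          # stand-in for reading input data

def compute(data):
    return sum(x * x for x in data)      # stand-in for the numeric kernel

def send_over_network(result):
    time.sleep(0.01)                     # stand-in for transfer to a remote site
    return result

profile = []
label, cost, data = timed("storage", load_from_storage)
profile.append((label, cost))
label, cost, result = timed("compute", compute, data)
profile.append((label, cost))
label, cost, _ = timed("network", send_over_network, result)
profile.append((label, cost))

for label, cost in profile:
    print(f"{label}: {cost:.4f}s")
```

A per-subsystem benchmark would report only the "compute" line; the workflow view makes the network stage's cost visible alongside it.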

He went on to point out that many of today's performance measurement tools don't take into account the cost (processing time, power, cooling and the like) of heavily networked systems. This means, he argued, that most benchmarks don't offer a useful way to evaluate cloud computing environments.

Dave then went on to discuss what IBM is doing both in the area of Exascale computing and in promoting workflow analysis as a measurement. Although he didn't go into detail, it is clear that IBM is using this type of thinking to reconsider system, memory, storage and networking designs to achieve the performance levels Exascale computing requires.

It was a fascinating hour and I hope to learn more from the good Mr. Turek in the future.