Are supercomputers just better liars?

Supercomputers may be far more powerful than ordinary machines, but that does not make their predictions infallible, says Andrew Jones
Written by Andrew Jones, Contributor

Supercomputers might be better at providing the right answers — or they could just be providing the wrong answers in far greater detail, says Andrew Jones.

Supercomputers enable simulations to use much higher resolutions, more detailed physics, and a greater amount of input data. Yet does all that extra simulation power ensure that their predictions are more accurate, or more likely to be right than those of a simpler model? In other words, how do you know your supercomputer is telling the truth?

It is a fact of modern life that many things we deal with daily are influenced by computer modelling in some way. Phones, soft-drink cans, vehicles, computers, weather forecasts, food containers and healthcare products are all designed with input from computer modelling. Those products are shipped to us with logistics supported by computer modelling. Everyday life is powered by energy that is found or generated with the help of computer modelling.

Increasingly, computer simulations are replacing physical testing for most design work. Jet-engine failures, for example, are now simulated many times before a single final physical test is conducted. That approach saves designers of jet engines huge amounts of time and money.

The computer says...
In fact, these simulations are so heavily relied on that the physical test is carried out only when the computer predicts it will be completely successful. Similar stories can be found elsewhere: the Airbus A380 was famously designed almost entirely through computer simulations.

In other situations, the computer model is the only way to get a prediction. We can't pop into tomorrow to check the weather, so we have to ask a computer for its predictions, and the same goes for climate-change assessments. With lightning-strike protection for aircraft, even if we could find lightning repeatably, persuading pilots to be the first to try a design would be problematic. Equally, with high-speed crash testing, you would be hard pushed to assure a test driver that the safety system will work.

Real lives or business decisions with large economic impact will often rest on computer predictions. So how do we know they are right?

Model comparisons
Thankfully, most developers of computer models test them before releasing them for real design work. They may compare the predictions with those from other models — perhaps using different principles or algorithms — or with physical testing, historical data, or other known data points.
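
As a minimal sketch of this kind of cross-check, not taken from any particular modelling code, the Python below estimates the same quantity with two different algorithms and compares both with a known answer. The integrand, the function names and the number of intervals are illustrative assumptions.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoid rule with n equal intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def simpson(f, a, b, n):
    """Composite Simpson's rule; n must be even."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd interior points get weight 4, even interior points weight 2.
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# Known data point: the integral of sin(x) from 0 to pi is exactly 2.
EXACT = 2.0
trap_result = trapezoid(math.sin, 0.0, math.pi, 100)
simp_result = simpson(math.sin, 0.0, math.pi, 100)

print(f"trapezoid error: {abs(trap_result - EXACT):.2e}")
print(f"simpson error:   {abs(simp_result - EXACT):.2e}")
print(f"disagreement between methods: {abs(trap_result - simp_result):.2e}")
```

Agreement between the two methods, and with the known value, raises confidence in both. A large disagreement would be a signal to investigate before trusting either.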

Supercomputers are often called into use for this purpose. In fact, one of the biggest roles of supercomputer simulations in science and engineering is exploring the validity of the models. Models are pushed to extreme scales, data sets and boundary conditions, to help establish the confidence that they will be safe with less extreme parameters.

Many users of models are rigorous about validating their predictions, especially those users with a strong link to the advancement of the model or its underpinning science. But, unfortunately, not all users of models are so scrupulous.

They think the model must be right — after all, it is running at a higher resolution than before, or with physics algorithm v2.0, or some other enhancement, so the answers must be more accurate. Or they assume it is the model supplier's job to make sure it is correct. And yes, it is — but how often do users check that their prediction relies on a certified part of parameter space?
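
One way a user might make that check routine is sketched below. It assumes the supplier publishes the ranges over which the model was tested. The parameter names, the ranges and the `out_of_envelope` helper are all hypothetical and do not come from any real model.

```python
# Hypothetical validated envelope: parameter name -> (min, max) tested by the supplier.
VALIDATED_RANGES = {
    "mach_number": (0.2, 0.9),
    "temperature_k": (220.0, 320.0),
}


def out_of_envelope(inputs, validated=VALIDATED_RANGES):
    """Return the names of inputs that are unvalidated or outside their tested range."""
    problems = []
    for name, value in inputs.items():
        if name not in validated:
            problems.append(name)
            continue
        low, high = validated[name]
        if not low <= value <= high:
            problems.append(name)
    return problems


inside = out_of_envelope({"mach_number": 0.8, "temperature_k": 250.0})
outside = out_of_envelope({"mach_number": 1.4, "temperature_k": 250.0})
print("inside the envelope, flagged inputs:", inside)
print("outside the envelope, flagged inputs:", outside)
```

A flagged input does not mean the prediction is wrong, only that nobody has shown it to be right in that regime.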

Misleading assumptions
Even for developers, assumptions about increasing scale can be misleading. Higher resolutions do not guarantee more accurate results. For example, is the code numerically capable of handling the smaller or larger floating-point numbers involved? Do the algorithms remain stable over the larger number of iterations? Does the code use a subroutine or library call that may not have been designed or certified for this regime?
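
A small, self-contained illustration of the floating-point point, offered as a sketch and not as a description of any production code: a forward-difference derivative gets better as the step shrinks, until rounding error takes over and it gets much worse. The function, the evaluation point and the step sizes are chosen purely for illustration.

```python
import math


def forward_difference(f, x, h):
    """Approximate f'(x) with a one-sided finite difference of step h."""
    return (f(x + h) - f(x)) / h


# The exact derivative of sin at x = 1 is cos(1).
exact = math.cos(1.0)
steps = [1e-2, 1e-8, 1e-16]
errors = {
    h: abs(forward_difference(math.sin, 1.0, h) - exact)
    for h in steps
}

for h, err in errors.items():
    print(f"h = {h:.0e}  error = {err:.2e}")

# At h = 1e-16 the step is below double-precision resolution near 1.0,
# so x + h rounds back to x and the estimated derivative collapses to zero.
```

The finest "resolution" here gives the worst answer, which is exactly the kind of behaviour that needs checking when a model is pushed to a new scale.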

What about correctness errors unique to parallel processing, such as race conditions? At the higher end of supercomputing, fault tolerance is becoming critical — not just in node failures, but in softer errors such as data corruption in memory or in the interconnect.
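
To make the race-condition risk concrete, here is a minimal sketch using Python threads; real supercomputing codes would more typically use MPI or OpenMP, so treat this only as an analogy. The `run_counter` function and its parameters are hypothetical. Several threads increment a shared counter, with and without a lock.

```python
import threading


def run_counter(n_threads, n_increments, use_lock):
    """Increment a shared counter from several threads and return the final value."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(n_increments):
            if use_lock:
                # The lock makes the read-modify-write a single atomic step.
                with lock:
                    counter += 1
            else:
                # Unprotected read-modify-write: updates may be lost
                # if another thread writes between the read and the write.
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


expected = 4 * 10_000
safe_total = run_counter(4, 10_000, use_lock=True)
unsafe_total = run_counter(4, 10_000, use_lock=False)
print(f"expected {expected}, with lock {safe_total}, without lock {unsafe_total}")
```

The unprotected version may or may not lose updates on a given run, depending on the interpreter and timing. That intermittency is what makes such errors hard to catch by testing alone.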

What to do? As the use of high-performance computing becomes more widespread, we need to be wary of the assumption that more powerful simulations are more accurate. In most cases they will be. In many cases they will explore parts of a design that cannot be explored any other way. But we need to check.

Building in safety
I don't have the space here to describe how to build safety into model development and validate its use, but I'll leave you with two thoughts.

First, more computational power, through high-performance computing, will make better modelling possible — but be sure to use some of that computational power to validate the modelling.

Second, don't make the dangerous assumption that physical testing is always better than computer predictions. Physical testing has its own sources of errors, assumptions, and regions of validity. Disagreement between computer and measurement does not mean the computer is wrong. You should explore the accuracy of both.

As vice president of HPC at the Numerical Algorithms Group, Andrew Jones leads the company's HPC services and consulting business, providing expertise in parallel, scalable and robust software development. Jones is well known in the supercomputing community. He is a former head of HPC at the University of Manchester and has more than 10 years' experience in HPC as an end user.
