# Creating a new data type to better model risky outcomes

Sam Savage says that by using "dists," a new data type, in financial risk models, companies can reach better decisions and avoid serious mistakes.

Sam Savage is on a mission. He wants people to stop using single numbers in the spreadsheets used to model financial risk and instead use a "distribution" - a range of numbers. He says that by using a distribution, or "dist," we would not only produce better models of uncertainty but also avoid fundamental mistakes in modeling financial and operational performance.

Mr Savage recently published a book, "The Flaw of Averages: Why We Underestimate Risk in the Face of Uncertainty," which explains his evangelism for the use of dists within financial models of risk.

Currently, the most widely used method of modeling uncertainty is to use single numbers, usually an average of the expected outcomes.

However, models based on average assumptions are wrong on average. This is a paradox that has been known to mathematicians for nearly 100 years, called Jensen's Inequality: when a model is not a straight-line function of its inputs, the result computed from average inputs is generally not the average result. Although business schools teach Jensen's Inequality, business managers continue to use average numbers to try to model things like demand, production, and project completion time. And they are constantly surprised by real-world outcomes that can be very costly.

From the book: "A classic example of the Flaw of Averages involves a statistician who drowned crossing a river that was on average 3ft. deep."

Using a dist produces a model that shows the range of risk in a simulation more clearly. It also gives insight into the range of potential outcomes, which makes for better decisions.
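
To make the Flaw of Averages concrete, here is a minimal sketch in Python. It is not taken from the book or from any product; the profit model, the capacity of 100 units, the margin of 10 per unit, and the normally distributed demand are all assumptions chosen for illustration. It compares the profit calculated from average demand with the average of the profits calculated across 10,000 demand trials.

```python
import random
import statistics


def profit(demand, capacity=100, margin=10.0):
    """Hypothetical model: units beyond capacity cannot be sold."""
    return margin * min(demand, capacity)


rng = random.Random(42)  # fixed seed so the run is repeatable

# An assumed "dist" of demand: 10,000 trials around an average of 100 units.
demand_dist = [max(0.0, rng.gauss(100, 30)) for _ in range(10_000)]

# The single-number approach: plug the average demand into the model.
avg_demand = statistics.fmean(demand_dist)
profit_at_avg_demand = profit(avg_demand)

# The dist approach: run the model on every trial, then average the results.
profit_dist = [profit(d) for d in demand_dist]
avg_profit = statistics.fmean(profit_dist)

print(f"Profit at average demand: {profit_at_avg_demand:.0f}")
print(f"Average profit:           {avg_profit:.0f}")
```

Because high-demand trials are capped by capacity while low-demand trials are not cushioned, the average profit comes out below the profit at average demand. The single-number spreadsheet would overstate what the business should expect.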

I recently spoke with Mr Savage, whose day job is Professor in the Department of Management Science and Engineering at Stanford University. Here are some notes from our conversation:

- The widespread use of spreadsheets has made it much easier for people to plug averages into their models. Those spreadsheets are then combined with other spreadsheets, and are easily shared, which compounds the errors.

- Even though we've known about Jensen's Inequality for a hundred years, it wasn't until the spreadsheet that we started to suffer from this problem, because the spreadsheet makes it easy to misuse averages. The spreadsheet is the contagion vector for this fundamental mistake. It is easy to email a spreadsheet to 1,000 people.

- Dists can be plugged into spreadsheets. A single cell can contain a dist representing 10,000 numbers. This represents a range of outcomes. A dist captures relationships between outcomes.

- Think of climbing a ladder. Before you climb a ladder, you will typically shake it to see how stable it is. This is similar to using a dist in a financial model - you are shaking it to see how it performs through many different scenarios at once.

- How do you know you are using the right dist? You don't. But any dist is better than no dist. When you use a wrong number you put garbage in and you get garbage out. With a dist you get insight out.

- I've done a lot of work at Shell where we've developed 50,000 dists to model a variety of scenarios including the likelihood of geopolitical turmoil. Dists can be used to model pandemics. Dists can also be used to evaluate investment decisions in startups.

- Businesses would be run much better if they used dists in their spreadsheets. Companies should appoint a Chief Probability Officer to select and use dists in their planning. Dists are very transparent; you can't hide information.

- I have a company (Probilitech) that sells dists (XLSim Software). And there are other companies that also sell dists.

- If public companies used dists in their forecasts to Wall Street, they would be better able to communicate the risks within their business. And Wall Street analysts would be better able to judge performance by seeing that a company performed within a predicted range rather than punishing the company if it missed projected earnings.
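
Several of the notes above (one cell holding 10,000 numbers, a dist capturing relationships between outcomes, reporting a predicted range) can be illustrated with a small sketch. This is only one possible way to represent the idea in Python, not how XLSim or any other product works. The `Dist` class, the oil-price scenario, and every figure in it are hypothetical. The key assumption is that trial number *i* means the same scenario in every dist, so arithmetic is done trial by trial.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Dist:
    """Hypothetical 'dist': one value per simulated trial, in a fixed order."""
    trials: tuple

    def _combine(self, other, op):
        if len(self.trials) != len(other.trials):
            raise ValueError("dists must have the same number of trials")
        # Trial i is paired with trial i, which keeps related outcomes together.
        return Dist(tuple(op(a, b) for a, b in zip(self.trials, other.trials)))

    def __add__(self, other):
        return self._combine(other, lambda a, b: a + b)

    def __sub__(self, other):
        return self._combine(other, lambda a, b: a - b)

    def mean(self):
        return sum(self.trials) / len(self.trials)

    def percentile(self, p):
        """Nearest-rank percentile, for p between 0 and 100."""
        ordered = sorted(self.trials)
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]


rng = random.Random(7)  # fixed seed so the run is repeatable
n = 10_000

# Made-up scenario: revenue and cost both depend on the same oil price trial.
oil_price = [rng.gauss(70, 15) for _ in range(n)]
revenue = Dist(tuple(1_000 * p for p in oil_price))
cost = Dist(tuple(40_000 + 300 * p + rng.gauss(0, 2_000) for p in oil_price))

profit = revenue - cost
low, high = profit.percentile(5), profit.percentile(95)

print(f"Average profit: {profit.mean():,.0f}")
print(f"90% of trials fall between {low:,.0f} and {high:,.0f}")
```

Because revenue and cost are subtracted trial by trial, a high-price scenario in one is matched with the same high-price scenario in the other. Averaging each separately and subtracting would give the same mean here, but it would lose the range, which is the part a forecast reader would want to see.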

- - -