Any interpretation of relationships in the data is only as good as the model it is based on. Both under- and overfitting can lead to poor models and, in turn, to misleading interpretations.
=> Use proper resampling techniques to assess model performance.
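As a minimal sketch (with a synthetic dataset and a placeholder model, both purely illustrative), repeated cross-validation gives a more honest performance estimate than the training error:

```python
# Estimate generalization performance with repeated cross-validation
# instead of trusting the training error. Data and model are placeholders.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score, RepeatedKFold

X, y = make_regression(n_samples=500, n_features=10, noise=1.0, random_state=0)
model = RandomForestRegressor(random_state=0)

cv = RepeatedKFold(n_splits=5, n_repeats=3, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
print(f"CV R^2: {scores.mean():.3f} +/- {scores.std():.3f}")
```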
Don't use a complex ML model when a simple model achieves the same (or better) performance, or when the gain in performance would be negligible.
=> Check the performance of simple models first, then gradually increase complexity.
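A sketch of this workflow, again with illustrative data and two placeholder models, benchmarks a simple baseline against a more complex candidate under the same cross-validation scheme:

```python
# Compare a simple baseline with a more complex model before committing
# to the complex one. Data and models are placeholders.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=500, n_features=10, noise=1.0, random_state=0)

for name, model in [("linear", LinearRegression()),
                    ("random forest", RandomForestRegressor(random_state=0))]:
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name:13s} R^2: {scores.mean():.3f} +/- {scores.std():.3f}")
```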
When features depend on each other (as they usually do), interpretation becomes tricky because their effects cannot be separated easily.
=> Analyze feature dependence. Be careful with the interpretation of dependent features. Use appropriate methods.
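A simple starting point, sketched here with hypothetical features x1 to x3, is to screen for pairwise dependence with rank correlations before interpreting per-feature effects:

```python
# Screen for pairwise feature dependence before trusting per-feature effects.
# Feature names and the data-generating process are purely illustrative.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = x1 + 0.1 * rng.normal(size=500)   # strongly dependent on x1
x3 = rng.normal(size=500)              # independent of the others
df = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3})

# Spearman rank correlation also captures monotone nonlinear dependence
print(df.corr(method="spearman").round(2))
```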
Correlation is only a special case of dependence; features can depend on each other in much more complex (e.g., nonlinear) ways.
=> In addition to correlation, analyze the data with alternative association measures such as the Hilbert-Schmidt Independence Criterion (HSIC).
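Since no particular HSIC implementation is assumed here, the sketch below hand-rolls the (biased) empirical HSIC estimator with Gaussian kernels and a simple bandwidth choice; dedicated packages exist and may be preferable in practice:

```python
# Empirical HSIC (biased V-statistic estimator, Gaussian kernels) to detect
# nonlinear dependence that Pearson correlation misses. Data are simulated.
import numpy as np

def gaussian_kernel(x, bandwidth):
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / (2 * bandwidth ** 2))

def hsic(x, y):
    n = len(x)
    # Simple bandwidth choice; the median heuristic is more common in practice
    K = gaussian_kernel(x, bandwidth=np.std(x) + 1e-12)
    L = gaussian_kernel(y, bandwidth=np.std(y) + 1e-12)
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    return np.trace(K @ H @ L @ H) / n ** 2

rng = np.random.default_rng(0)
x = rng.normal(size=300)
y_dep = x ** 2 + 0.1 * rng.normal(size=300)   # nonlinear dependence, near-zero correlation
y_ind = rng.normal(size=300)                  # independent of x

print(f"Pearson corr (dependent pair): {np.corrcoef(x, y_dep)[0, 1]:.3f}")
print(f"HSIC (dependent pair):   {hsic(x, y_dep):.4f}")
print(f"HSIC (independent pair): {hsic(x, y_ind):.4f}")
```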
Interactions between features can "mask" feature effects.
=> Analyze interactions with, e.g., two-dimensional partial dependence plots (2D-PDPs) and interaction measures such as Friedman's H-statistic.
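A sketch using scikit-learn's partial dependence tools on the Friedman benchmark, where the first two features interact by construction; the model and feature indices are only illustrative:

```python
# Two-dimensional partial dependence plot for a pair of features suspected
# to interact. Model, data, and feature indices are placeholders.
import matplotlib.pyplot as plt
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay

X, y = make_friedman1(n_samples=500, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Joint PDP of features 0 and 1 (which interact in this benchmark)
PartialDependenceDisplay.from_estimator(model, X, features=[(0, 1)])
plt.show()
```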
There are many sources of uncertainty: model bias, model variance, and the estimation variance of the interpretation method.
=> In addition to point estimates (e.g., of feature importance), quantify their variance. Be aware of what is treated as 'fixed' (e.g., the fitted model).
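One way to quantify the estimation variance of the interpretation method, sketched with placeholder data and model, is to repeat permutation feature importance and report its spread; note that the fitted model is treated as fixed here:

```python
# Report the variability of permutation feature importance across repetitions
# instead of a single point estimate. Data and model are placeholders.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=5, noise=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# The fitted model is held fixed; refitting on resampled data would add
# model variance on top of this estimation variance.
result = permutation_importance(model, X_test, y_test, n_repeats=30, random_state=0)
for i in range(X.shape[1]):
    print(f"feature {i}: {result.importances_mean[i]:.3f} "
          f"+/- {result.importances_std[i]:.3f}")
```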
If you test many features and don't adjust for multiple comparisons, some features will be falsely discovered as relevant for your model.
=> Use p-value correction methods (e.g., Bonferroni or Benjamini-Hochberg).
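A sketch with simulated, noise-only p-values, using the Benjamini-Hochberg correction as one possible method:

```python
# Adjust per-feature p-values for multiple comparisons before declaring
# features "relevant". The p-values here are simulated placeholders.
import numpy as np
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
p_values = rng.uniform(size=50)   # pure-noise features: uniform p-values

reject_raw = p_values < 0.05
reject_bh, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="fdr_bh")

print("significant without correction:", reject_raw.sum())
print("significant with Benjamini-Hochberg correction:", reject_bh.sum())
```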
By default, the relationships modeled by your ML model should not be interpreted as causal effects.
=> Check whether the assumptions required for a causal interpretation hold.