Pitfalls to Avoid when Interpreting Machine Learning Models

1. Bad model generalization

    Any interpretation of relationships in the data is only as good as the model it is based on. Both under- and overfitting can lead to bad models with a misleading interpretation.

    => Use proper resampling techniques to assess model performance.

    Published: August 20, 2020 -- 15:30 GMT (08:30 PDT)

    Photo by: Christoph Molnar

    Caption by: George Anadiotis
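A minimal sketch of this advice, assuming scikit-learn and a made-up synthetic dataset: training-set accuracy rewards overfitting, while k-fold cross-validation gives an honest performance estimate to base any interpretation on.

```python
# Hypothetical sketch: assess generalization with k-fold cross-validation
# instead of trusting training-set performance (which rewards overfitting).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic data purely for illustration.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
model = RandomForestClassifier(random_state=0)

# Training-set accuracy is optimistically biased.
train_acc = model.fit(X, y).score(X, y)

# 5-fold CV resamples held-out data for an honest estimate.
cv_scores = cross_val_score(model, X, y, cv=5)
print(f"train accuracy:     {train_acc:.3f}")
print(f"5-fold CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
```

The gap between the two numbers is a quick first check for the over- or underfitting the slide warns about.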

2. Unnecessary use of ML

    Don't use a complex ML model when a simple model has the same (or better) performance or when the gain in performance would be irrelevant. 

    => Check the performance of simple models first, gradually increase complexity.

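One way to follow this advice in practice (a sketch, assuming scikit-learn and synthetic data): cross-validate a simple interpretable baseline before a complex model, and only accept the complex model if the gap is material.

```python
# Hypothetical sketch: benchmark a simple model before reaching for a complex one.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic data purely for illustration.
X, y = make_classification(n_samples=500, n_features=10, n_informative=5,
                           random_state=0)

simple = LogisticRegression(max_iter=1000)
complex_model = GradientBoostingClassifier(random_state=0)

simple_score = cross_val_score(simple, X, y, cv=5).mean()
complex_score = cross_val_score(complex_model, X, y, cv=5).mean()
print(f"logistic regression: {simple_score:.3f}")
print(f"gradient boosting:   {complex_score:.3f}")
# If the gap is negligible, prefer the interpretable simple model.
```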

3.1 Ignoring feature dependence

    When features depend on each other (as they usually do), interpretation becomes tricky, since effects can't be separated easily. 

    => Analyze feature dependence. Be careful with the interpretation of dependent features. Use appropriate methods.

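A simple first pass at analyzing feature dependence (a sketch using NumPy; the 0.8 flagging threshold is an arbitrary illustrative choice): compute pairwise correlations and flag feature pairs that cannot be interpreted independently.

```python
# Hypothetical sketch: check pairwise dependence before interpreting per-feature
# effects; strongly dependent features cannot be varied independently.
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=1000)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=1000)   # x2 depends strongly on x1
x3 = rng.normal(size=1000)                     # independent feature
X = np.column_stack([x1, x2, x3])

corr = np.corrcoef(X, rowvar=False)
# Flag pairs whose |correlation| exceeds an (arbitrary) 0.8 threshold.
flagged = [(i, j) for i in range(corr.shape[0])
           for j in range(i + 1, corr.shape[1])
           if abs(corr[i, j]) > 0.8]
print("highly dependent pairs:", flagged)
```

Flagged pairs deserve joint interpretation methods rather than per-feature effect plots.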

3.2 Confusing dependence with correlation

    Correlation is a special case of dependence. The data can be dependent in much more complex ways.

    => In addition to correlation, analyze data with alternative association measures such as HSIC (the Hilbert-Schmidt Independence Criterion).

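To make the distinction concrete, here is a sketch of a biased empirical HSIC estimator (the RBF bandwidth of 1.0 is an arbitrary illustrative choice) applied to data where y depends on x quadratically: Pearson correlation is near zero, yet HSIC detects the dependence.

```python
# Hypothetical sketch: HSIC detects nonlinear dependence that Pearson
# correlation misses.
import numpy as np

def rbf_gram(v, sigma=1.0):
    """RBF kernel Gram matrix for a 1-D sample."""
    d = v[:, None] - v[None, :]
    return np.exp(-d ** 2 / (2 * sigma ** 2))

def hsic(a, b, sigma=1.0):
    """Biased empirical HSIC estimate (larger = more dependence)."""
    n = len(a)
    K, L = rbf_gram(a, sigma), rbf_gram(b, sigma)
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = x ** 2 + 0.1 * rng.normal(size=1000)  # strongly dependent, yet uncorrelated

pearson = np.corrcoef(x, y)[0, 1]
score = hsic(x, y)
print(f"Pearson correlation: {pearson:.3f}")  # typically near zero here
print(f"HSIC: {score:.4f}")
```

Comparing the HSIC score against the same statistic on shuffled data gives a rough independence baseline.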

4. Misleading effects due to interaction

    Interactions between features can "mask" feature effects. 

    => Analyze interactions with, e.g., 2D partial dependence plots (2D-PDP) and interaction measures.


5. Ignoring estimation uncertainty

    There are many sources of uncertainty: model bias, model variance, estimation variance of the interpretation method. 

    => In addition to point estimates (e.g., of feature importance), quantify the variance. Be aware of what is treated as 'fixed.'

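A sketch of quantifying one of these variance sources (assuming scikit-learn and synthetic data): `permutation_importance` repeats the shuffling, so the spread across repeats can be reported alongside the point estimate.

```python
# Hypothetical sketch: report permutation importance with its spread across
# repeats, not just a point estimate.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data purely for illustration.
X, y = make_regression(n_samples=500, n_features=5, n_informative=2,
                       random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)
result = permutation_importance(model, X_te, y_te, n_repeats=30, random_state=0)

for i in range(X.shape[1]):
    print(f"feature {i}: {result.importances_mean[i]:.3f} "
          f"+/- {result.importances_std[i]:.3f}")
```

Note that this only captures the estimation variance of the interpretation method; model variance would require refitting on resampled data.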

6. Ignoring multiple comparisons

    If you have many features and don't adjust for multiple comparisons, many features will be falsely discovered as relevant for your model. 

    => Use p-value correction methods.

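The false-discovery problem is easy to demonstrate (a sketch using NumPy/SciPy; the dataset is pure noise by construction, and the Benjamini-Hochberg implementation below is one common p-value correction method): about 5% of 200 irrelevant features pass an uncorrected p < 0.05 test.

```python
# Hypothetical sketch: with many features, uncorrected p-values produce false
# discoveries even when every feature is pure noise.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.normal(size=(n, p))          # 200 noise features
y = rng.normal(size=n)               # target unrelated to every feature

# Per-feature Pearson-correlation test p-values.
pvals = np.array([stats.pearsonr(X[:, j], y)[1] for j in range(p)])
raw_hits = np.sum(pvals < 0.05)      # expect roughly 5% false discoveries

def benjamini_hochberg(pvals, alpha=0.05):
    """Boolean mask of discoveries under Benjamini-Hochberg FDR control."""
    m = len(pvals)
    order = np.argsort(pvals)
    ranked = pvals[order]
    below = ranked <= alpha * np.arange(1, m + 1) / m
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    mask = np.zeros(m, dtype=bool)
    mask[order[:k]] = True
    return mask

bh_hits = np.sum(benjamini_hochberg(pvals))
print(f"uncorrected discoveries:  {raw_hits}")
print(f"BH-corrected discoveries: {bh_hits}")
```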

7. Unjustified causal interpretation

    By default, the relationships modeled by your ML model may not be interpreted as causal effects.

    => Check whether the assumptions required for a causal interpretation hold.

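A classic illustration of why (a sketch with simulated data, where a hypothetical confounder Z drives both X and Y): a model of Y given X finds a strong association, yet X has no causal effect on Y at all.

```python
# Hypothetical sketch: a confounder Z causes both X and Y. A model of Y given X
# assigns X a strong effect, but intervening on X would change nothing.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
z = rng.normal(size=n)               # unobserved confounder
x = z + 0.5 * rng.normal(size=n)     # X caused by Z
y = z + 0.5 * rng.normal(size=n)     # Y caused by Z, *not* by X

# OLS slope of Y on X alone: large, despite no causal X -> Y link.
slope_naive = np.cov(x, y)[0, 1] / np.var(x)

# Adjusting for Z (regress Y on X and Z jointly) recovers the null effect.
design = np.column_stack([np.ones(n), x, z])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
slope_adjusted = coef[1]

print(f"naive slope:    {slope_naive:.3f}")    # clearly nonzero
print(f"adjusted slope: {slope_adjusted:.3f}")  # near zero
```

Only when such confounders are measured and adjusted for (among other assumptions) can model effects be read causally.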


By George Anadiotis for Big on Data | August 20, 2020 -- 15:30 GMT (08:30 PDT) | Topic: Artificial Intelligence


Modern requirements for machine learning models include both high predictive performance and model interpretability. A team of experts in explainable AI highlights pitfalls to avoid when addressing model interpretation, and discusses open issues for further research. Images created by Christoph Molnar: https://twitter.com/ChristophMolnar/status/1281272026192326656



