Not Always a Black Box: Machine Learning Approaches For Model Explainability
Imagine that you have built a very precise machine learning model by using clever tricks and non-standard features. You are beyond happy and proud. However, when you present your results to your stakeholders, they are less thrilled. They don’t understand what you did and why. They don’t comprehend how your clever model makes a prediction. Your model’s output even contradicts some of their intuition. Don’t you hate it when that happens?
After being in that scenario a few times myself, I came to appreciate the importance of model explainability.
So what is explainable ML? Christoph Molnar, in his book “Interpretable Machine Learning”, defines it as “methods and models that make the behavior and predictions of a machine learning model understandable to humans.”
Why model explainability?
The 2017 edition of NeurIPS hosted an “Interpretable ML Symposium”, which included the “Interpretability is necessary for machine learning” debate.
During the debate, a very interesting use case was discussed. A deep neural network model was trained to predict which patients with pneumonia were high-risk and thus should be treated first, and which were low-risk. The model was very accurate, as deep learning models often are, but one problem arose: it predicted that patients with asthma were low-risk. In reality, patients with asthma are considered more complicated cases, and one would want to treat them first.
After additional analysis and help from experts, the model developers found that patients with asthma, on average, tend to seek help much earlier than patients without asthma. Thus, asthma acted as a proxy for how quickly a patient sought treatment, and the model associated it with low risk for that reason rather than because asthma itself is protective.
I personally find this a good example of how an accurate model is not necessarily a well-performing model.
In certain disciplines, explainability is not optional: it is required. Think about credit risk scoring, insurance, and healthcare. However, these industries could benefit from having “challenger” models, which run alongside the current simple algorithms but are more precise. For that to happen, we need good explainability methods in place.
Explainability is necessary for building trust. I have found that implementing different explainability techniques when working with business partners who are not versed in the AI lingo has increased their understanding and faith in the models. Moreover, as machine learning models are becoming more and more prevalent, interpretability methods are likely to somewhat ease fears and increase social acceptance.
Having transparency in the models is also very important. Being able to detect potential bias against a certain group of the population before the model is released into the wild is vital. Otherwise, biased predictions may go unnoticed until they cause harm. And for applications in healthcare and safety, we cannot afford that scenario.
Figure 1: Why do we need interpretability methods?
Types of explainability approaches
There are multiple approaches to model interpretability. Some are model-specific (such as tree interpreters and various techniques applied to deep learning models), while others are model-agnostic (they can be applied to any type of model). We can also split them into intrinsic methods (such as restricting the complexity of the model or its features a priori) and post-hoc methods, which are applied after the model has been trained. Finally, methods can be local (explaining individual predictions) or global (explaining the behavior of the whole model).
Figure 2: Categories of explainability methods
Explainability approaches in practice
During my upcoming talk at ODSC Europe in November 2019, I will demonstrate a number of explainability approaches applied to predicting the sale price of a house.
I will discuss a number of methods, including their advantages and pitfalls. The approaches I will talk about include visual methods (such as partial dependence plots, individual conditional expectation, and accumulated local effects), feature importance, global surrogate models, LIME, and Shapley values.
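To give a flavor of the visual methods, here is a minimal sketch of individual conditional expectation (ICE) curves and a partial dependence curve for a single feature. Everything in it is an assumption made for illustration: `toy_price_model` is a hypothetical stand-in for a trained black-box model, the three features (area, rooms, age) and the synthetic data are invented, and none of it is the house price dataset or model from the talk.

```python
import numpy as np


def toy_price_model(X):
    """Hypothetical stand-in for a trained black-box model (an assumption).

    Columns: 0 = living area (m^2), 1 = number of rooms, 2 = age (years).
    """
    area, rooms, age = X[:, 0], X[:, 1], X[:, 2]
    return 50_000 + 2_000 * area + 10_000 * rooms - 500 * age + 20 * area * rooms


def ice_curves(predict, X, feature, grid):
    """One curve per row: predictions as `feature` is swept over `grid`."""
    curves = np.empty((X.shape[0], len(grid)))
    for j, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value  # overwrite the feature for every row
        curves[:, j] = predict(X_mod)
    return curves


def partial_dependence(predict, X, feature, grid):
    """The partial dependence curve is the average of the ICE curves."""
    return ice_curves(predict, X, feature, grid).mean(axis=0)


# Synthetic data, invented purely for this sketch.
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.uniform(40, 200, size=100),  # area
    rng.integers(1, 7, size=100),    # rooms
    rng.uniform(0, 80, size=100),    # age
])

grid = np.linspace(40, 200, 5)
ice = ice_curves(toy_price_model, X, feature=0, grid=grid)
pdp = partial_dependence(toy_price_model, X, feature=0, grid=grid)
```

Each row of `ice` is a local explanation (how one house's predicted price would change with area), while `pdp` is a global one (the average effect of area). Because the toy model contains an area-rooms interaction, the ICE curves have different slopes, which is exactly the kind of detail the averaged curve hides.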
Though none of the methods is one-size-fits-all, one of the more promising is Shapley values, due to their solid theoretical foundations, intuitive interpretation, and local explainability.
Figure 3 shows the Shapley values plotted for the first row in the dataset. It neatly demonstrates how each feature contributes to pushing the model output away from the base value, which is typically the average model output over a reference dataset.
Figure 3: Shapley values for the first case in the dataset
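The idea of features pushing the output away from a base value can be sketched with an exact, brute-force Shapley computation on a tiny example. This is an illustration under stated assumptions, not necessarily how the values in Figure 3 were produced: `toy_price_model`, the three features, and the `background` rows are all hypothetical, and "missing" features are simulated by averaging over the background rows, which is one common choice among several.

```python
from itertools import combinations
from math import factorial


def toy_price_model(x):
    """Hypothetical black-box model (an assumption): x = (area, rooms, age)."""
    area, rooms, age = x
    return 50_000 + 2_000 * area + 10_000 * rooms - 500 * age + 20 * area * rooms


def coalition_value(model, x, background, coalition):
    """Average prediction with the features in `coalition` fixed to x's values
    and the remaining features taken from each background row in turn."""
    total = 0.0
    for row in background:
        mixed = [x[i] if i in coalition else row[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values by enumerating every coalition (feasible only
    for a handful of features, since the cost grows exponentially)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = (coalition_value(model, x, background, s | {i})
                        - coalition_value(model, x, background, s))
                phi[i] += weight * gain
    return phi


# Invented reference rows and one invented house to explain.
background = [(80, 3, 30), (120, 4, 10), (60, 2, 50), (150, 5, 5)]
x = (100, 3, 20)

base_value = coalition_value(toy_price_model, x, background, set())
phi = shapley_values(toy_price_model, x, background)
prediction = toy_price_model(x)

print("base value:", base_value)
for name, value in zip(("area", "rooms", "age"), phi):
    print(f"{name}: {value:+.2f}")
print("base value + contributions:", base_value + sum(phi))
print("model prediction:", prediction)
```

The last two printed numbers agree (up to floating-point rounding): the contributions always sum to the difference between the prediction and the base value. That additivity is what makes a plot like Figure 3 possible. In practice, enumerating all coalitions is far too expensive for real feature counts, so libraries rely on approximations or model-specific shortcuts.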
For more details on each of the methods, join me at my session at ODSC London in November!