Learn Interpretability for Data Science

ODSC - Open Data Science
3 min read · Mar 18, 2019


Editor’s note: Rajiv Shah will be a speaker at ODSC East 2019 this May! Be sure to check out his talk, “Deciphering the Black Box: Latest Tools and Techniques for Interpretability,” there.

The impact of machine learning has been tremendous, whether it’s measured in dollars (trillions) or human impact (YouTube algorithms driving conspiracy theories). The engines of machine learning are a myriad of algorithms, from logistic regression to the deft gradient boosted trees, the preening neural networks, and the cooperative ensembles. In their wake is the realization that machine learning has been, and will continue to be, driven by a diverse mix of algorithms.

[Related: Machine Learning Guide: 20 Free ODSC Resources to Learn Machine Learning]

Even when using the simplest of these algorithms, people can’t fully grasp how a model’s inputs produce its prediction. While it might be possible to get a calculator out and use the coefficients of a multivariate regression with 8 variables to compute a prediction, research has shown that, in terms of human interpretability, even such models are effectively impenetrable. However, there are tools that allow us to understand or explain these black box models.
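To make this concrete, here is a minimal sketch (with entirely made-up coefficients) of what “getting the calculator out” means for an eight-variable linear model:

```python
# A minimal sketch: one prediction from an 8-variable linear model.
# The coefficients and the observation below are hypothetical.
import numpy as np

intercept = 3.0
coefficients = np.array([0.4, -1.2, 0.03, 2.1, -0.7, 0.09, 1.5, -0.2])
x = np.array([12.0, 0.5, 88.0, 1.0, 7.3, 40.0, 0.0, 3.2])  # one observation

prediction = intercept + coefficients @ x
print(prediction)  # easy for a machine to tally, hard for a human to reason about
```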

https://bit.ly/2F7UEgW

Spurious correlation is another common issue in data science. Consider this toy example from Strobl’s paper. The target is a child’s score on a reading skills test. The predictors include the child’s age and how much milk the child drinks. Obviously, milk consumption does not directly affect reading scores. (If it did, we could improve reading simply by changing how much milk kids drink.) However, when I plot variable importance, it shows milk consumption as the most important feature! Clearly, something is amiss.
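To illustrate the effect (a hedged sketch, not the exact setup from Strobl’s paper), the snippet below simulates data where milk consumption merely tracks age while only age drives the reading score. Impurity-based importance from a random forest can still give milk substantial credit:

```python
# Sketch: a spurious predictor can pick up importance from a correlated cause.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 2000
age = rng.uniform(5, 12, n)                    # the real driver of reading skill
milk = 0.9 * age + rng.normal(0, 0.5, n)       # correlated with age, no causal effect
reading = 10 * age + rng.normal(0, 3, n)       # score depends on age only

X = np.column_stack([age, milk])
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, reading)

for name, importance in zip(["age", "milk"], model.feature_importances_):
    print(f"{name}: {importance:.2f}")
# Impurity-based importance splits credit between age and milk, so milk can
# look predictive even though changing it would do nothing for reading scores.
```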

A second common concern in machine learning is multicollinearity. Anyone who has taken a traditional statistics course has learned to treat multicollinearity as undesirable, and many data scientists will therefore build correlation matrices and look for ways to prune their features. In doing so, however, they can end up hurting the performance of their model.
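The usual workflow looks something like the rough sketch below (hypothetical data): build a correlation matrix and drop one feature from each highly correlated pair. The catch is that the dropped feature may still have carried signal the model could have used.

```python
# Sketch: flag features to drop based on pairwise correlation (hypothetical data).
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({"age": rng.uniform(5, 12, 500), "hours_tv": rng.uniform(0, 4, 500)})
df["milk"] = 0.9 * df["age"] + rng.normal(0, 0.5, 500)   # strongly correlated with age

corr = df.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))  # upper triangle only
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]

print("Dropping:", to_drop)  # removing a correlated feature simplifies the model,
                             # but it can also throw away useful signal
```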

An emerging best practice is providing prediction explanations. Explanations complement the prediction score by identifying the features most responsible for the prediction score. This is extremely useful, but data scientists need to understand how these explanations work.
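One popular, model-agnostic way to produce per-prediction explanations is SHAP. The sketch below (a toy regression, not any particular vendor’s implementation) prints the signed contribution of each feature to a single prediction:

```python
# Sketch: per-prediction explanations with SHAP on a toy regression model.
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.1, 500)   # features 2 and 3 are noise

model = GradientBoostingRegressor(random_state=0).fit(X, y)
explainer = shap.TreeExplainer(model)
contributions = explainer.shap_values(X[:1])[0]           # one observation

for i, value in enumerate(contributions):
    print(f"feature_{i}: {value:+.3f}")  # signed push above/below the average prediction
```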

One simple example is the inconsistency of LIME. It’s possible for two data scientists, using the same data and the same model, to generate two different sets of explanations for the same observation, as the sketch below shows.
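Here is a minimal sketch of that behavior, assuming the lime package and a scikit-learn classifier. LIME explains a prediction by fitting a simple local model to randomly sampled perturbations of the observation, so two runs without a fixed seed can weight features differently:

```python
# Sketch: two LIME explanations for the same row, model, and data can differ.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)

row = data.data[0]
exp1 = explainer.explain_instance(row, model.predict_proba, num_features=5)
exp2 = explainer.explain_instance(row, model.predict_proba, num_features=5)

print(exp1.as_list())  # same observation, same model, same data ...
print(exp2.as_list())  # ... different sampled perturbations, possibly different weights
```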

[Related article: Innovators and Regulators Collaborate on Book Tackling AI’s Black Box Problem]

To learn about these issues and to understand some best practices around model interpretability, please consider attending my workshop at ODSC East 2019.

Bio of the writer: Rajiv Shah is a data scientist at DataRobot, where his primary focus is helping customers improve their ability to make and implement predictions. Previously, Rajiv was part of data science teams at Caterpillar and State Farm. He has worked on a wide range of projects in areas including supply chain, sensor data, actuarial ratings, and security. He has a PhD from the University of Illinois at Urbana-Champaign.
