Member-only story

Cracking the Box: Interpreting Black Box Machine Learning Models

14 min readSep 6, 2019

To kick off this article, I’d like to explain the interpretability of a machine learning (ML) model.

According to Merriam-Webster, interpretability describes the process of making something plain or understandable. In the context of ML, interpretability provides us with an understandable explanation of how a model behaves. Basically, it helps us figure out what’s behind model predictions and how these models work. Miller and Tim’s “Explanation in Artificial Intelligence: Insights from the Social Sciences” states that “Interpretability is the degree to which a human can understand the cause of a decision.” By utilizing ML interpretability methods, we increase this degree and allow humans to consistently predict the model’s behavior.

Moreover, fairness and unbiasedness have recently become important auxiliary criteria for model optimization. ML interpretability is an essential tool to check these properties for ML systems.

In this work, I’ll let you in on major methods of tackling the interpretability of ML models using Python and explain how to build a resilient machine learning infrastructure that serves as a foundation for artificial intelligence, accelerating the time to market for AI-powered projects.

Classes of Interpretability Methods

Cracking the Box: Interpreting Black Box Machine Learning Models

Classes of Interpretability Methods

Written by ODSC - Open Data Science

No responses yet