Why Causal Machine Learning is the Next Revolution in AI
Causal modeling and inference are perhaps at the core of the most interesting questions in data science. A common task for a data scientist at a FAANG company is to query users who had exposure to a feature and calculate the correlation between usage of that feature and engagement on the platform.
However, the data scientist does not care about that correlation; they care about whether that correlation indicates that the feature drives engagement. In other words, they care about the causal effect of the feature on engagement. For this reason, experience with causal inference is a highly sought-after skill in marketing and digital experimentation teams at top companies, particularly in tech. I cover FAANG-flavored use cases like these with my online students in a course called Causal Generative Machine Learning Minicourse on Altdeep.ai.
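To make the gap between correlation and causal effect concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the hidden `activity` trait, the effect sizes, and the `simulate` function are hypothetical and not drawn from any real platform. The idea is that highly active users both adopt the feature more often and engage more anyway, so the naive comparison overstates what the feature does, while a randomized rollout recovers the true effect.

```python
import numpy as np

def simulate(n=100_000, true_effect=1.0, randomized=False, seed=0):
    """Return the naive difference in mean engagement between users and non-users."""
    rng = np.random.default_rng(seed)
    # Hypothetical hidden confounder: how active each user is in general.
    activity = rng.normal(size=n)
    if randomized:
        # A/B test: feature exposure is a coin flip, independent of activity.
        used_feature = rng.random(n) < 0.5
    else:
        # Observational data: more active users are more likely to use the feature.
        used_feature = rng.random(n) < 1.0 / (1.0 + np.exp(-2.0 * activity))
    # Engagement depends on the feature (the causal effect) and on activity.
    engagement = true_effect * used_feature + 3.0 * activity + rng.normal(size=n)
    return engagement[used_feature].mean() - engagement[~used_feature].mean()

observational = simulate(randomized=False)
experimental = simulate(randomized=True)
print(f"naive estimate from observational data: {observational:.2f}")
print(f"estimate from randomized experiment:    {experimental:.2f}")
```

In this toy setup the observational estimate should land well above the true effect of 1.0, while the randomized estimate should sit close to it. The correlation is real in both cases; only one of them answers the causal question.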
Not even deep correlation implies causation.
Deep learning has yielded awe-inspiring results in terms of predictive performance on a variety of benchmarks. Deep learning is essentially the state-of-the-art in correlation detection; it can model the complex nonlinear correlation between high dimensional sets of variables and use the model to generate accurate predictions. Yet the old maxim still holds; correlation does not imply causality, even if that correlation is fancy.
Like the FAANG data scientist, machine learning engineers find themselves in need of going beyond correlation to answer causal questions. Some of these questions emerge when dealing with failure modes of predictive modeling. Robustness is one example; if you could gain insight into the causal relationship between features and labels, you could construct models that stay performant across different datasets, since causal relationships tend to hold across datasets in a way that incidental correlations do not. Similarly, the core questions of algorithmic fairness are causal: is the credit algorithm or hiring algorithm giving a bad result to an individual because of their race or gender?
We’ll get better reinforcement learning with counterfactual regret.
However, for me, the most exciting element of causal machine learning is causal reinforcement learning, or more generally, causal agent modeling. Cognitive scientists argue that causal inference is native to human reasoning — the human mind generates causal explanations for how the data came to be. Consider the fundamentally human emotion of regret.
Regret is that feeling you get when you reason, “I married Jack, the man my mother approved of, and now I’m not happy. Had I married my true love Phil, I would be happy.” This counterfactual statement is odd from a predictive modeling standpoint because the prediction “I would be happy” is based on data “I married my true love Phil” that doesn’t exist. You didn’t marry Phil; you married Jack. Not even deep learning works on empty training data. Yet you can perform this counterfactual prediction because your mind can generate probable cause and effect scenarios of what life would have been like with Phil. Further, that generation incorporates data about your experience with Jack.
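The reasoning in the marriage example can be sketched with a toy structural causal model. This is only an illustration under strong assumptions: the additive form `happiness = partner_effect + disposition`, the numeric effects, and the observed happiness score are all hypothetical. The sketch follows the three-step recipe commonly used for counterfactuals: infer the unobserved background factors from what actually happened (abduction), change the choice (action), and recompute the outcome (prediction).

```python
# Hypothetical causal effects of each partner on happiness (made-up numbers).
PARTNER_EFFECT = {"Jack": 2.0, "Phil": 5.0}

def happiness(partner, disposition):
    """Toy structural equation: outcome = effect of partner + personal disposition."""
    return PARTNER_EFFECT[partner] + disposition

# What actually happened: married Jack and observed this happiness level.
observed_partner = "Jack"
observed_happiness = 3.0

# 1. Abduction: use the experience with Jack to infer the unobserved disposition.
disposition = observed_happiness - PARTNER_EFFECT[observed_partner]

# 2. Action: imagine the world where the choice had been Phil instead.
counterfactual_partner = "Phil"

# 3. Prediction: rerun the same structural equation with the inferred disposition.
counterfactual_happiness = happiness(counterfactual_partner, disposition)
print(counterfactual_happiness)
```

The counterfactual prediction is not trained on any data about life with Phil. It comes from the causal model plus what the data about Jack reveals about the person making the choice.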
It turns out that we can indeed encode that reasoning process into an algorithm. For example, we can build agents that minimize counterfactual regret by reasoning, “I performed an action and got this reward; what reward would I have gotten had I performed a different action?” The agent can use the answer to that query to improve its action selection policy.
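One simple way to see that loop in code is regret matching, a classic procedure from game theory. The sketch below is not a causal reinforcement learning method and makes a strong simplifying assumption: the agent can read the counterfactual reward for every alternative action directly off a known payoff table, whereas in general that answer would have to be estimated from a causal model. The rock-paper-scissors setup, the opponent's fixed strategy, and the function name are hypothetical choices for illustration.

```python
import numpy as np

# Payoff to the agent. Rows: agent plays rock, paper, scissors.
# Columns: opponent plays rock, paper, scissors.
PAYOFF = np.array([[0.0, -1.0, 1.0],
                   [1.0, 0.0, -1.0],
                   [-1.0, 1.0, 0.0]])

def regret_matching(opponent_probs, rounds=5000, seed=0):
    """Play repeatedly, choosing actions in proportion to accumulated positive regret."""
    rng = np.random.default_rng(seed)
    n_actions = PAYOFF.shape[0]
    cumulative_regret = np.zeros(n_actions)
    action_counts = np.zeros(n_actions)
    for _ in range(rounds):
        positive = np.maximum(cumulative_regret, 0.0)
        total = positive.sum()
        # With no positive regret yet, fall back to a uniform policy.
        policy = positive / total if total > 0 else np.full(n_actions, 1.0 / n_actions)
        action = rng.choice(n_actions, p=policy)
        opponent_action = rng.choice(n_actions, p=opponent_probs)
        # "What reward would I have gotten had I performed a different action?"
        counterfactual_rewards = PAYOFF[:, opponent_action]
        cumulative_regret += counterfactual_rewards - counterfactual_rewards[action]
        action_counts[action] += 1
    return action_counts / rounds

# Hypothetical opponent that plays rock 60% of the time.
empirical_policy = regret_matching(opponent_probs=[0.6, 0.2, 0.2])
print(empirical_policy)
```

Against an opponent that favors rock, the agent's play should drift toward paper, because paper is the action it most regrets not having chosen.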
Algorithmic causal reasoning and inference will power a new set of breakthroughs in machine learning and artificial intelligence. Remember how those folks who got in early on the deep learning train enjoyed a massive career boost? Well, all aboard! The next revolution in AI is leaving the station. If you are interested in learning more, check out our Causal Generative Machine Learning Minicourse at Altdeep.ai.
About the author/ODSC East 2021 Speaker on Causal Machine Learning
Robert Ness didn’t start in machine learning. He started his career by becoming fluent in Mandarin Chinese and moving to Tibet to do developmental economics fieldwork. He later obtained a graduate degree from Johns Hopkins School of Advanced International Studies.
After switching to the tech industry, Robert’s interests shifted to modeling data. He attained his Ph.D. in mathematical statistics from Purdue University, and then he worked as a research engineer in various AI startups (he is currently an ML research scientist at Gamalon). He has published in journals and venues across these spaces, including RECOMB and NeurIPS, on topics including causal inference, probabilistic modeling, sequential decision processes, and dynamic models of complex systems. In addition to startup work, he is a machine learning professor at Northeastern University.