Dealing with the Incompleteness of Machine Learning

ODSC - Open Data Science
4 min read · Nov 18, 2020

The prospect of automating every aspect of human life is exciting. Imagine humans permanently living a life of leisure while machine learning and robot labor pick up the slack! Even though this sounds like a recipe for lazy and depressed humans, we could still be useful to each other by building communities around shared interests and companionship. We could find the fulfillment we used to get from employment in collaboration, learning, and innovation instead.

It also seems like this imagined future is around the corner. In the last century and a half, we have automated most manual labor. Automating cognitive tasks is what naturally comes next. Essentially, this means having machines make decisions for us. We trust A.I. and machine learning to take care of this, at least piecemeal, until we reach Artificial General Intelligence. However, it’s not as simple as it seems.

Let’s examine how a human makes decisions. When a decision is deliberate, it follows a logical sequence of seven steps. Perhaps you don’t follow these steps for trivial choices, but you would expect your doctor, lawyer, or portfolio manager to follow them when making decisions on your behalf.

We know that however carefully a decision is considered, a doctor can misdiagnose, a lawyer can pick a faulty defense strategy, and a portfolio manager can make the wrong investments. Human decision making is flawed, no doubt, but a human is accountable for any mistakes, can second-guess a decision or deliver it with hesitation, and can explain the reasoning behind it.

Can you expect the same from a machine learning algorithm?

Not really, but with all the hubris surrounding A.I., you would think so.

The Accountability Gap

When we discuss a machine learning model’s decisions, we generally focus on the very last step: inference. Inference happens once the model has been deployed and is being used in real-world applications. However, decisions were made before that. Not by the model but by a human! And because of this, there’s potential for bias in every previous step:

  • What part of the problem to focus on?
  • What data to use?
  • How to prepare the data?
  • What model classes to use?
  • What evaluation metrics to use?
  • And how to deploy the model?

These earlier decisions are often untraceable: they were never communicated, and sometimes never even understood. Together they create an accountability gap!
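One way to narrow this gap is to write the upstream decisions down and ship them with the model. The sketch below is only an illustration of that idea: `DecisionRecord`, `DecisionLog`, the step names, and the example entries are all hypothetical and do not come from any particular library or from the original post.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class DecisionRecord:
    """One human decision made before the model reached inference."""
    step: str        # which stage of the project, e.g. "data selection"
    choice: str      # what was decided
    rationale: str   # why it was decided
    owner: str       # who is accountable for it


@dataclass
class DecisionLog:
    """Hypothetical log stored alongside a model artifact."""
    model_name: str
    records: list = field(default_factory=list)

    def add(self, step, choice, rationale, owner):
        self.records.append(DecisionRecord(step, choice, rationale, owner))

    def undocumented(self, required_steps):
        """Return the required steps that nobody has written down yet."""
        logged = {record.step for record in self.records}
        return [step for step in required_steps if step not in logged]

    def to_json(self):
        return json.dumps(asdict(self), indent=2)


# Assumed step names, mirroring the questions listed above.
REQUIRED_STEPS = [
    "problem framing",
    "data selection",
    "data preparation",
    "model class",
    "evaluation metric",
    "deployment",
]

log = DecisionLog(model_name="example-classifier")
log.add("data selection", "records from 2015 onward",
        "older records use a different schema", "data team")
log.add("evaluation metric", "recall",
        "missed positives are costlier than false alarms", "project lead")

# Steps that still have no recorded decision or owner.
missing = log.undocumented(REQUIRED_STEPS)
print(missing)
```

A log like this does not remove bias from any step, but it makes each choice traceable to a person and a reason, which is the part a deployed model cannot supply on its own.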

The Confidence Trap

Also, the machine learning model doesn’t second-guess any of its decisions. They are final. Even when a prediction comes with a probability or a confidence band, that uncertainty is often ignored and not communicated to the end user, which makes the prediction look deterministic.
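As a minimal sketch of the alternative, the function below returns the model’s probability together with its label and flags low-confidence predictions for human review. The raw scores, the labels, and the 0.8 threshold are assumptions chosen for illustration, and a softmax probability is not necessarily well calibrated, so treat this as a pattern rather than a recipe.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_confidence(scores, labels, threshold=0.8):
    """Return the top label together with its probability.

    If the probability falls below `threshold` (an assumed, illustrative
    cutoff), the prediction is flagged for review instead of being
    presented as final.
    """
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    return {
        "label": labels[best],
        "confidence": round(confidence, 3),
        "needs_review": confidence < threshold,
    }


labels = ["benign", "malignant"]
confident = predict_with_confidence([4.0, 0.5], labels)  # scores far apart
hesitant = predict_with_confidence([1.1, 0.9], labels)   # scores nearly tied
print(confident)
print(hesitant)
```

Both calls pick the same label, but only the second is flagged. Dropping the `confidence` and `needs_review` fields before showing the result to a user is exactly how a hesitant prediction ends up looking final.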

Explain Your Reasoning

The model cannot do this. But as machine learning practitioners, we can do this for the model.
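For the simplest case, a linear model, a practitioner can spell out the reasoning directly, because the score is just a sum of per-feature terms. The sketch below assumes a logistic-regression-style model; the feature names, weights, and input values are invented for illustration. More complex models need dedicated interpretation methods, which this example does not attempt to cover.

```python
import math


def explain_linear_prediction(weights, bias, features):
    """Break a logistic-regression-style score into per-feature parts.

    For a linear model, score = bias + sum(weight * value), so each
    feature's contribution to the score is simply weight * value.
    """
    contributions = {
        name: weights[name] * value for name, value in features.items()
    }
    score = bias + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-score))
    # Sort by absolute size so the most influential features come first.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]),
                    reverse=True)
    return probability, ranked


# Hypothetical weights and inputs, invented purely for illustration.
weights = {"age": 0.03, "lesion_size_mm": 0.25, "smoker": 0.9}
bias = -4.0
patient = {"age": 60, "lesion_size_mm": 8, "smoker": 1}

probability, ranked = explain_linear_prediction(weights, bias, patient)
for name, contribution in ranked:
    print(f"{name}: {contribution:+.2f}")
print(f"predicted probability: {probability:.2f}")
```

The ranked list is the explanation: it shows which inputs pushed the score up or down and by how much, so a person can check whether the model is leaning on something sensible.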

Machine learning interpretation is needed because machine learning by itself is incomplete as a solution. Think about it. Simple problems can usually be solved with a flowchart or procedural programming, and in those cases the solution covers the whole problem.

The complex problems we optimize with machine learning require linear algebra, calculus, and statistics precisely because we don’t understand the whole problem. For instance, what does cancer look like on an x-ray? How could we even begin to describe all the many ways you can detect cancerous growth in an x-ray?

By explaining a model’s decisions, we can cover gaps in our understanding of the problem, that is, its incompleteness. One of the most significant issues is that, given the high accuracy of our machine learning solutions, our confidence tends to grow to the point where we believe we fully understand the problem. Then, we are misled into thinking our solution covers ALL OF IT!

