A Study of Real-world AI Model Failures and Their Impact

ODSC - Open Data Science
4 min read · May 4, 2023

Editor’s note: Ayush Patel is a speaker for ODSC East 2023. Be sure to check out his talk, “Why do AI Models go Rogue? A Guide to Detect and Fix Silent Model Failures,” to learn more about AI model failures.

Rapid advancements in AI have made it possible to achieve what was once unimaginable. Not long ago, training a large language model like GPT-3 on a single Nvidia Tesla V100 GPU was inconceivable. Now, with more capable hardware and better architectures, AI keeps leaping forward and redefining what is and isn’t possible.

However, alongside such enormous leaps in AI evolution, we must also discuss its potential to fail. As complexity grows, AI models become more prone to errors that go undetected. Data sits at the root of many of these issues, which trickle down to the revenue-generating models a business depends on. From frequent downtime to decisions driven by bad data, a model plagued with issues can produce all sorts of counterproductive effects.

How models prone to defects can lead to costly mistakes

According to a report by PYMNTS, businesses globally lose about $118 billion annually to false positives alone, and that is just one corner of the AI ecosystem. Cases where model failures have caused irreversible damage are real. So let’s look at a few real-world instances where businesses have paid a hefty price for AI gone wrong.
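To make the false-positive problem concrete, here is a minimal sketch, in Python with entirely hypothetical numbers (not from the PYMNTS report), of how a team might estimate the revenue a fraud model’s false declines cost at different decision thresholds:

```python
import numpy as np

def false_positive_cost(y_true, scores, threshold, cost_per_fp):
    """Estimate revenue lost to false positives (legitimate
    transactions wrongly flagged as fraud) at a score threshold."""
    flagged = scores >= threshold
    false_positives = np.sum(flagged & (y_true == 0))
    return false_positives * cost_per_fp

# Hypothetical labels and scores, for illustration only.
rng = np.random.default_rng(0)
y_true = rng.binomial(1, 0.05, size=10_000)            # 1 = actual fraud
scores = np.clip(rng.normal(0.3 + 0.4 * y_true, 0.15), 0.0, 1.0)

for t in (0.5, 0.7, 0.9):
    cost = false_positive_cost(y_true, scores, t, cost_per_fp=35.0)
    print(f"threshold={t:.1f}  estimated FP cost=${cost:,.2f}")
```

Raising the threshold cuts false declines but lets more fraud through; the point of a sketch like this is to put a dollar figure on that trade-off instead of guessing.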

IBM’s failed attempt at fighting cancer

IBM’s AI system, Watson Health, sought to develop solutions to help healthcare providers diagnose diseases more accurately, identify personalized treatment options, improve clinical workflows, and streamline administrative tasks.

What went wrong?

IBM’s marketing campaigns for Watson Health made bold claims about the system’s ability to revolutionize healthcare, but in reality, the AI technology was not yet advanced enough to deliver on those promises. Hospitals were often reluctant to share data, so Watson’s models had to rely on sparse and erroneous data, which ultimately produced misdiagnoses and implausible treatment recommendations. The result was a gap between expectations and reality, leading to customer disappointment and disillusionment. IBM has since sold Watson Health’s assets to a private equity firm for an undisclosed sum.
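Data problems like Watson’s are usually detectable before they reach a model. As a hedged, minimal sketch (the fields, records, and thresholds below are hypothetical, not IBM’s actual pipeline), a basic data-quality report over incoming records can flag the gaps early:

```python
import pandas as pd

def data_quality_report(df, required):
    """Flag the basics that quietly poison a model: duplicate
    records, missing required fields, and mostly-empty columns."""
    return {
        "rows": len(df),
        "duplicate_rows": int(df.duplicated().sum()),
        "missing_required": {c: int(df[c].isna().sum()) for c in required},
        "mostly_empty_cols": [c for c in df.columns
                              if df[c].isna().mean() > 0.5],
    }

# Hypothetical patient records, for illustration only.
records = pd.DataFrame({
    "patient_id": [1, 2, 2, 4],
    "diagnosis":  ["A", None, None, "B"],
    "biopsy":     [None, None, None, "positive"],
})
print(data_quality_report(records, required=["patient_id", "diagnosis"]))
```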

Amazon’s AI Resume Screening

In 2018, Amazon abandoned an AI-powered resume screening tool after it was found to be biased against women.

What went wrong?

The system was trained on resumes submitted to Amazon over a ten-year period, most of which came from men, baking a systemic gender bias into the training data. The model learned to favor male candidates over female ones, reportedly penalizing resumes that contained the word “women’s,” and the company concluded the tool was not reliable or fair enough to use for screening job applicants.
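Bias like this can be surfaced with a simple audit before a screening model ever ships. Below is a minimal sketch of a “four-fifths rule” check; the data, column names, and 0.8 threshold are illustrative assumptions, not Amazon’s actual process:

```python
import pandas as pd

def selection_rates(df, group_col, selected_col):
    """Selection rate per group, plus the disparate impact ratio
    (lowest rate / highest rate); values below ~0.8 are a common
    red flag under the 'four-fifths rule'."""
    rates = df.groupby(group_col)[selected_col].mean()
    return rates, rates.min() / rates.max()

# Hypothetical screening outcomes, for illustration only.
df = pd.DataFrame({
    "gender":   ["M"] * 800 + ["F"] * 200,
    "selected": [1] * 240 + [0] * 560 + [1] * 30 + [0] * 170,
})
rates, ratio = selection_rates(df, "gender", "selected")
print(rates)                                    # M: 0.30, F: 0.15
print(f"disparate impact ratio: {ratio:.2f}")   # 0.50 -> red flag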

Microsoft’s chatbot Tay

Microsoft’s chatbot Tay, which was designed to learn from Twitter conversations, quickly became a racist and sexist troll after being exposed to harmful content from Twitter users. Microsoft shut down the chatbot within 24 hours of its launch.

What went wrong?

This failure traces back to bias in the training data and, above all, to the users who interacted with the bot. Tay learned from live conversations without meaningful content safeguards, so it ended up parroting much of the offensive material fed to it across different sources and conversations.
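One common safeguard, which Tay notably lacked, is to gate user-generated content before it ever enters the learning loop. The sketch below uses a crude keyword blocklist purely for illustration; a production system would rely on a trained toxicity classifier plus human review, not a regex:

```python
import re

# Illustrative blocklist only; real systems use trained classifiers.
BLOCKLIST = re.compile(r"\b(hate|kill|nazi)\b", re.IGNORECASE)

def safe_to_learn_from(message: str) -> bool:
    """Gate incoming messages before they enter the training buffer."""
    return BLOCKLIST.search(message) is None

training_buffer = [m for m in ("hello there!", "i hate everyone")
                   if safe_to_learn_from(m)]
print(training_buffer)   # only the clean message survives
```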

Addressing the barriers to AI evolution, one model at a time

Fixing an AI model doesn’t have to be complicated. In fact, the root cause of nearly every AI failure can be traced by answering the questions below:

  1. Is something going wrong?
  2. What is going wrong?
  3. Why did it go wrong?

Asking these questions is a deliberate effort to gauge the impact of failures before they hamper the business. Lost revenue opportunities, increased resource costs, reputational damage, and even legal trouble are just a few by-products of these failures. On the bright side, no model failure is too big to fix, provided you are proactive about it.
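As a sketch of question 1 in practice (my own illustration of a common technique, not a prescription from the talk), many teams answer “Is something going wrong?” by monitoring drift between training-time and live feature distributions, for instance with a population stability index (PSI):

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline sample (e.g.,
    training data) and live production data. Common rules of thumb:
    < 0.1 stable, 0.1-0.25 drifting, > 0.25 investigate now."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    e_pct = np.clip(e_pct, 1e-6, None)   # avoid log(0) on empty bins
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

# Hypothetical feature distributions, for illustration only.
rng = np.random.default_rng(42)
baseline = rng.normal(0.0, 1.0, 50_000)   # what the model trained on
live = rng.normal(0.4, 1.2, 5_000)        # what production now sees
print(f"PSI = {psi(baseline, live):.3f}")  # well above 0.25 -> drift
```

A drift alarm like this answers “is something wrong?”; per-feature PSI breakdowns and slice-level error analysis are then what narrow down the “what” and the “why.”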

If you are curious how the AI/ML models you have in production might be failing right under your nose, and how to go about fixing them, be sure to catch my session, “Why do AI Models go Rogue? A Guide to Detect and Fix Silent Model Failures” at ODSC East 2023.

About the Author

Ayush Patel is the co-founder of TwelveFold, an AI start-up studio where he partners with entrepreneurs to build a portfolio of MLOps and Generative AI companies. He also serves as the CEO of Censius, an AI observability platform that helps teams optimize the real-world performance of their models.

As a seasoned ML professional, he has worked with customers across industry verticals alongside AI and research teams and is a strong advocate for building transparent, reliable, and compliant AI solutions.

Originally posted on OpenDataScience.com

Read more data science articles on OpenDataScience.com, including tutorials and guides from beginner to advanced levels! Subscribe to our weekly newsletter here and receive the latest news every Thursday. You can also get data science training on-demand wherever you are with our Ai+ Training platform. Subscribe to our fast-growing Medium Publication too, the ODSC Journal, and inquire about becoming a writer.

