Ensemble Models Demystified

ODSC - Open Data Science
5 min read · Jun 3, 2019

Ensemble models deliver excellent performance and work across a wide variety of problems. They're also easier to train than many other techniques, often producing better results from less data. In machine learning, ensemble models are the norm: even if you aren't using them, your competitors are.

In his talk, Ensemble Methods Demystified, Kevin Lemagnen introduces ensemble models and untangles how they work by looking inside the black box.

[Related article: 25 Excellent Machine Learning Open Datasets]

What Can We Do With Ensembles?

Ensemble models are composed of several simple models, each unremarkable on its own, known as weak learners. When you combine them, they compensate for one another's weaknesses and deliver better performance than any of them would alone.
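To make that concrete, here's a minimal sketch, assuming scikit-learn and its built-in breast-cancer toy dataset (neither of which comes from the talk itself). It combines three deliberately simple models with a majority vote and compares each against the ensemble:

```python
# Three weak learners combined by majority vote (illustrative settings only).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Each base model is deliberately simple ("weak") on its own.
weak_learners = [
    ("stump", DecisionTreeClassifier(max_depth=1, random_state=0)),
    ("nb", GaussianNB()),
    ("logreg", LogisticRegression(max_iter=1000)),
]

# Hard voting: the ensemble predicts whichever class most learners pick.
ensemble = VotingClassifier(estimators=weak_learners, voting="hard")

for name, model in weak_learners + [("ensemble", ensemble)]:
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {score:.3f}")
```

On a run like this, the vote typically scores at least as well as the best individual learner, which is the point: the combination papers over each model's individual blind spots.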

Two popular techniques for building ensembles are bagging and boosting. The first, bagging, has its roots in bootstrapping: repeatedly drawing random samples from the data with replacement, which helps us estimate the bias and variance hiding in a dataset. Bagging applies that method to a group of models, typically decision trees: each model trains independently on its own bootstrap sample, and the results are aggregated at the end. Boosting, on the other hand, trains its models sequentially, each one concentrating on the examples the previous models got wrong, and combines their predictions with a weighted vote.
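A rough sketch of the contrast, again assuming scikit-learn; the estimator counts and dataset are illustrative choices, not from the talk:

```python
# Bagging vs. boosting on the same toy data (illustrative settings only).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Bagging: 100 trees, each fit independently on its own bootstrap sample
# (drawn with replacement); predictions are aggregated by majority vote.
bagging = BaggingClassifier(n_estimators=100, random_state=0)

# Boosting: 100 shallow trees fit one after another, each reweighting the
# examples its predecessors misclassified; combined by weighted vote.
boosting = AdaBoostClassifier(n_estimators=100, random_state=0)

for name, model in [("bagging", bagging), ("boosting", boosting)]:
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {score:.3f}")
```

The practical consequence of that sequential dependence is that the bagged trees could all be trained in parallel, while the boosted trees must be fit one after another.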
