Description
Building on the discussion of individual decision trees in the prior episode, Shea and Anders shift to one of today's most popular ensemble models, the Random Forest. At first glance, the algorithm may seem like a brute-force approach of simply running hundreds or thousands of decision trees, but it leverages the concept of "bagging" to avoid overfitting and to learn as much as possible from the entire dataset, not just a few key features. We close by covering the strengths and weaknesses of this model and providing some real-life examples.
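
For listeners who want to see the idea in code, here is a minimal scikit-learn sketch (not from the episode; the dataset and parameter choices are illustrative) comparing a single decision tree to a bagged forest of many trees:

```python
# A minimal sketch (illustrative, not from the episode) comparing a single
# decision tree to a Random Forest on a synthetic dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# A single deep tree tends to overfit its training sample.
tree = DecisionTreeClassifier(random_state=0)

# A Random Forest fits each tree on a bootstrap sample of the rows ("bagging")
# and considers only a random subset of features at each split, then averages
# the trees' votes, reducing variance relative to any one tree.
forest = RandomForestClassifier(n_estimators=500, random_state=0)

for name, model in [("single tree", tree), ("random forest", forest)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")
```

In practice, the forest's cross-validated accuracy typically exceeds the single tree's, which is the variance-reduction effect of bagging discussed in the episode.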