Machine Learning Times
Machine Learning Times
EXCLUSIVE HIGHLIGHTS
Survey: Machine Learning Projects Still Routinely Fail to Deploy
 Originally published in KDnuggets. Eric Siegel highlights the chronic...
Three Best Practices for Unilever’s Global Analytics Initiatives
    This article from Morgan Vawter, Global Vice...
Getting Machine Learning Projects from Idea to Execution
 Originally published in Harvard Business Review Machine learning might...
Eric Siegel on Bloomberg Businessweek
  Listen to Eric Siegel, former Columbia University Professor,...
SHARE THIS:

This excerpt is from Stitchfix.com. To view the whole article click here.  

9 years ago
GAM: The Predictive Modeling Silver Bullet

 

Imagine that you step into a room of data scientists; the dress code is casual and the scent of strong coffee is hanging in the air. You ask the data scientists if they regularly use generalized additive models (GAM) to do their work. Very few will say yes, if any at all.

Now let’s replay the scenario, only this time we replace GAM with, say, random forest or support vector machines (SVM). Everyone will say yes, and you might even spark a passionate debate.

Despite its lack of popularity in the data science community, GAM is a powerful and yet simple technique. Hence, the purpose of this post is to convince more data scientists to use GAM. Of course, GAM is no silver bullet, but it is a technique you should add to your arsenal. Here are three key reasons:

  • Easy to interpret.
  • Flexible predictor functions can uncover hidden patterns in the data.
  • Regularization of predictor functions helps avoid overfitting.

In general, GAM has the interpretability advantages of GLMs where the contribution of each independent variable to the prediction is clearly encoded. However, it has substantially more flexibility because the relationships between independent and dependent variable are not assumed to be linear. In fact, we don’t have to know a priori what type of predictive functions we will eventually need. From an estimation standpoint, the use of regularized, nonparametric functions avoids the pitfalls of dealing with higher order polynomial terms in linear models. From an accuracy standpoint, GAMs are competitive with popular learning techniques.

In this post, we will lay out the principles of GAM and show how to quickly get up and running in R. We have also put together an ebook that gets into more detail around smoothing, model selection and estimation.

By: Kim Larsen, Director of Client Algorithms, Stitch Fix
Originally published at http://multithreaded.stitchfix.com

Leave a Reply