By: Dean Abbott, Abbott Analytics and SmarterHQ, Inc.
Portions excerpted from Chapter 2 of his book Applied Predictive Analytics (Wiley 2014, http://amzn.com/1118727967)
Successful predictive modeling is more than identifying the right algorithms. And, even though 60-90% of our time is spent on data preparation before deploying the first predictive model built from a new data set, successful predictive modeling goes well beyond effective...
By: Dean Abbott, Abbott Analytics and SmarterHQ, Inc.
In my last post, “Coefficients are not the same as variable influence”, I argued that coefficients in a linear regression model are useful but limited in answering the question, “which variables are most influential in model predictions?”...
By: Dean Abbott, Co-Founder & Chief Data Scientist, SmarterHQ
President, Abbott Analytics
When we build predictive models, we often want to understand why the model behaves the way it does, or in other words, which variables are the most influential in the predictions. But how can we tell which...
By: Dean Abbott, Co-Founder & Chief Data Scientist, SmarterHQ
President, Abbott Analytics
Excerpted and modified from Chapters 3 and 4 of Mr. Abbott’s book Applied Predictive Analytics, Wiley 2014
The Data Understanding stage of a predictive analytics project is intended to uncover the characteristics of the data available for...
By: Dean Abbott, SmarterHQ and Abbott Analytics
In my last two posts I described why overfitting predictive models is dangerous beyond the most obvious problem, namely that accuracy on new data is lower than expected. In the next few posts, I’ll describe how to...
Editor’s note: This article compares measures of model performance. “Accuracy” is one specific such measure, but this article uses the word generically to refer to performance measures in general. In data mining, data scientists...
Arguably, the most important safeguard in building predictive models is complexity regularization to avoid overfitting the data. When models are overfit, their accuracy is lower on new data that wasn’t seen during training, and therefore when these...
This speaker session is from Predictive Analytics World, September 30-October 1, 2013 in Boston, MA.
Predictive modeling competitions, once the arena for a few data mining conferences, have now become big business. Kaggle (kaggle.com) is perhaps the most well-known forum for modeling competitions, using a crowd-sourcing mentality: if more people try to...
The Machine Learning Times © 2020 • 1221 State Street • Suite 12, 91940 •
Santa Barbara, CA 93190
Produced by: Rising Media & Prediction Impact