Machine Learning Times

Abbott Analytics

Three Critical Definitions You Need Before Building Your First Predictive Model

 Portions excerpted from Chapter 2 of his book Applied Predictive Analytics (Wiley 2014). Successful predictive modeling is more than identifying the right algorithms. And, even though 60-90% of our time is spent on data preparation before deploying the first predictive model built from a new data set, successful predictive modeling goes well beyond effective...

In Predictive Analytics, Coefficients are Not the Same as Variable Influence, Part II

 In my last post, “Coefficients are not the same as variable influence”, I argued that coefficients in a linear regression model are useful but limited in answering the question, “which variables are most influential in model predictions?”...

In Predictive Analytics, Coefficients are Not the Same as Variable Influence

 When we build predictive models, we often want to understand why the model behaves the way it does, or in other words, which variables are the most influential in the predictions. But how can we tell which...

Predictive Modeling Forensics: Identifying Data Problems

 Excerpted and modified from Chapters 3 and 4 of Mr. Abbott’s book Applied Predictive Analytics (Wiley 2014). The Data Understanding stage of a predictive analytics project is intended to uncover the characteristics of the data available for...

Defining Measures of Success for Cluster Models


Recognizing and Avoiding Overfitting, Part 1

 In my last two posts I described why overfitting predictive models is dangerous beyond the most obvious problem, namely that accuracy on new data is lower than expected. In the next few posts, I’ll describe how to...

3 Ways to Test the Accuracy of Your Predictive Models

 Editor’s note: This article compares measures for model performance. Note that “accuracy” is a specific such measure, but that this article uses the word “accuracy” to generically refer to measures in general. In data mining, data scientists...

Why Overfitting is More Dangerous than Just Poor Accuracy, Part I

 Arguably, the most important safeguard in building predictive models is complexity regularization to avoid overfitting the data. When models are overfit, their accuracy is lower on new data that wasn’t seen during training, and therefore when these...

Video: My Five Predictive Analytics Pet Peeves

 This speaker session is from Predictive Analytics World, September 30-October 1, 2013 in Boston, MA.

A Good Business Objective Beats a Good Algorithm

 Predictive modeling competitions, once the arena for a few data mining conferences, have now become big business. Kaggle is perhaps the most well-known forum for modeling competitions, using a crowd-sourcing mentality: if more people try to...
