In part one, I described one problem with overfitting the data: estimates of the target variable in regions without any training data can be unstable, whether those regions require the model to interpolate or extrapolate. Accuracy is a problem, but more precisely, the problems in interpolation and extrapolation are not revealed using any...
Over the past 5 years there have been several trends that have changed the way retailers operate their businesses. Many of them have to do with how consumers use technology to make a purchase. Pure e-commerce retailers...
Much has been written about customer churn – predicting who, when, and why customers will stop buying, and how (or whether) to intervene. Employee churn is similar – we want to predict who, when, and why employees...
Arguably, the most important safeguard in building predictive models is complexity regularization to avoid overfitting the data. When models are overfit, their accuracy is lower on new data that wasn’t seen during training, and therefore when these...
The need to adopt sophisticated data analytics has become widely apparent to businesses recently, and the necessity of adopting “Big Data” analytics approaches is only becoming more evident. Gartner’s report on Big Data Adoption in 2013 found...
(Part 4 (of 11) of the Top 10 Data Mining Mistakes, drawn from the Handbook of Statistical Analysis and Data Mining Applications) It is very important to have the right project goal; that is, to aim at...
Network analysis is an emerging Business Intelligence technique that’s increasingly used in risk management, social network analytics, banking, telecommunication analytics, bioinformatics, criminal intelligence, and human resources planning. Sometimes the term Network Analysis (or Network Analytics) is mixed...
Kaggle, an online platform that hosts data analytics competitions, allows companies to tap into the expertise of data gurus to tackle specific company issues (and possibly reap prizes and job offers, if successful). With more than 100,000...
This is my final article for this year. It’s hard to imagine that it’s almost 2014, and yet I can’t tell you how many times I’ve found myself in the following situation: I meet someone at a...
Predictive Modeling competitions, once the arena for a few data mining conferences, have now become big business. Kaggle (kaggle.com) is perhaps the most well-known forum for modeling competitions, using a crowd-sourcing mentality: if more people try to...
The Machine Learning Times © 2025 • 1221 State Street • Suite 12, 91940 • Santa Barbara, CA 93190
Produced by: Rising Media & Prediction Impact