The Single Best Predictive Modeling Technique. Seriously.

Nov 13, 2014

By: Dr. Justin Washtell, Technical Director, ForecastThis
Originally published at www.analyticbridge.com

I read two strangely similar articles last week. One was an article by Vincent Granville, entitled “The 8 worst predictive modeling techniques”. The other was an article on Forbes entitled “America’s 10 Best-Paying Jobs”.

What on Earth do these two articles have in common (other than both being lists)?

A brief flick through the Forbes article reveals that it could almost as accurately have been entitled “America’s Single Best Paying Job”, because 9 out of the 10 jobs listed were in healthcare.

Likewise Granville’s article, which is packed with excellent and accurate detail, really has one central point: that the biggest enemy of good predictive modeling is human error.

Human error in data science comes in a host of forms, but they can almost all be distilled down into a handful of categories:

A lack of knowledge of the appropriate algorithms or techniques
An unreasonable or non-empirical bias towards particular algorithms or techniques (in spite of available knowledge)
Simple systematic mistakes in application of techniques or in interpretation of results (however well informed or chosen)

These pitfalls are nothing new to the world; tales abound of “bad statistics” by experienced practitioners. What is relatively new is the variety and complexity of the statistical techniques that are available to and demanded of data scientists, putting the level of risk into an entirely different order.

Michael Jordan, in an interview also published last week, likens the way data science is currently being applied to big data to the proverbial billion typing monkeys, and warns of an impending disaster when the many less-than-rigorously validated models currently in production fail. There is no question as to whether he is right in principle. The question is only how significant the repercussions will be, and to what extent they will be offset by any gains arising from good modeling. Perhaps if data science plays things right, there will be no “big data winter” to speak of.

So, what is the single best predictive modeling technique available, imho?

Simple. Take human judgement out of the equation wherever it is not required.

Tools already exist that combine automated search methods with rigorous cross-validation, making it much easier to make optimal selections from a host of algorithms and parameters (the caret package in R is a good example). However, these methods still rely on manual specification of algorithms and search parameters, can be extremely computationally expensive, and remain prone to under- or overfitting if misused.
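To make the idea concrete, here is a minimal sketch of automated selection by cross-validation, written in plain Python with the standard library only. It is not caret's API or any vendor's method: the toy data, the candidate models (a mean baseline and k-nearest-neighbours with a few values of k) and all function names are assumptions made purely for illustration.

```python
import random
from statistics import mean

def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k nearly equal, disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def knn_predict(train, x, k):
    """Predict y at x as the mean y of the k nearest training points (1-D)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return mean(y for _, y in nearest)

def mean_predict(train, x):
    """Baseline: ignore x and predict the training mean."""
    return mean(y for _, y in train)

def cv_mse(data, predict, folds):
    """Average held-out mean squared error across the folds."""
    fold_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [p for i, p in enumerate(data) if i not in held]
        test = [data[i] for i in held_out]
        fold_errors.append(mean((predict(train, x) - y) ** 2 for x, y in test))
    return mean(fold_errors)

def select_model(data, candidates, k=5, seed=0):
    """Score every candidate on the same folds; return (best_name, scores)."""
    folds = k_fold_indices(len(data), k, seed)
    scores = {name: cv_mse(data, fn, folds) for name, fn in candidates.items()}
    return min(scores, key=scores.get), scores

# Hypothetical toy data: y = x^2 plus a little Gaussian noise.
rng = random.Random(42)
data = [(x / 10, (x / 10) ** 2 + rng.gauss(0, 0.05)) for x in range(-30, 31)]

# The candidate list is still specified by hand, as the text above notes.
candidates = {"mean baseline": mean_predict}
for k in (1, 3, 5, 15):
    candidates[f"knn k={k}"] = lambda train, x, k=k: knn_predict(train, x, k)

best, scores = select_model(data, candidates)
for name, mse in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name:15s} cv_mse={mse:.4f}")
print("selected:", best)
```

Every candidate is scored on the same held-out folds, so the choice rests on measured error rather than on the analyst's preference. Note that the winning score is itself optimistically biased, because it was used to make the selection; a separate held-out set or nested cross-validation gives a more honest estimate of how the chosen model will perform.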

Companies like DataRobot and ForecastThis (disclaimer: for whom I work) are taking this idea to the next level. These services combine up-to-date algorithm libraries with robust parallelized search and cross-validation on the cloud, along with Python and R integration.

DataRobot are still in beta and are playing the details of their platform quite close to their chest, but in the case of ForecastThis the library includes not just classification and regression algorithms, but algorithms addressing the gamut of the predictive modeling pipeline, including data cleansing, feature transformation, NLP, and so on.

For data scientists, technologies like this mean that it is now practical to confidently identify the most appropriate algorithm configurations in a way that is fast, thorough, and presents fewer opportunities for human error.

Let us be clear. There is currently no replacement for first-hand data science expertise: despite the buzz around advances in “Deep” Neural Networks and so on, there is presently no such thing as a one-size-fits-all black box for predictive modeling.

That said, the arrival of new data, algorithms and applications shows no sign of slowing down. As the field of data science slowly matures and knowledge of best practices struggles to disseminate, the professional landscape is only becoming more competitive. In this climate, undertaking predictive modeling without appropriate use of intelligent automation is like playing a high-stakes round of golf on an unfamiliar course without an experienced caddy.

The Machine Learning Times © 2020 • 1221 State Street • Suite 12, 91940 • Santa Barbara, CA 93190
Produced by: Rising Media & Prediction Impact
