By Dean Abbott, Abbott Analytics and SmarterHQ, Inc.
In my last post, “Coefficients are not the same as variable influence,” I argued that coefficients in a linear regression model are useful but limited for answering the question, “Which variables are most influential in model predictions?” One manifestation of the difference is that a variable with a relatively small coefficient, meaning it has relatively little influence on predictions on average, may still have significant influence within sub-ranges of its values, sometimes even becoming the most important variable within those sub-ranges. This effect can occur when the input variables do not comply with the assumptions of the algorithm, most notably with linear regression models.
In this post, I’ll take this one step further and show another way to estimate the influence of a variable on a predictive model without decomposing the predictions into terms as I did last week, an approach that assumed a linear regression model. The method described this week is based on randomization experiments, which estimate influence without requiring any assumptions about, or knowledge of, the distributions the input variables follow.
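The randomization idea can be illustrated with permutation-style influence estimation: shuffle one input column at a time and measure how much the model’s error degrades relative to a baseline. The article itself is behind a paywall, so the following is only a minimal sketch of that general technique in NumPy; the model (ordinary least squares), the synthetic data, and all variable names are illustrative assumptions, not taken from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (illustrative assumption): y depends strongly on x1, weakly on x2
n = 500
X = rng.normal(size=(n, 2))
y = 3.0 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.1, size=n)

# Fit an ordinary least-squares model (intercept plus two coefficients)
coef, *_ = np.linalg.lstsq(np.c_[np.ones(n), X], y, rcond=None)

def predict(X):
    return coef[0] + X @ coef[1:]

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

baseline = mse(y, predict(X))

# Randomization experiment: permute one column at a time and record
# how much the error increases; larger increases suggest more influence.
importance = {}
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])
    importance[f"x{j + 1}"] = mse(y, predict(Xp)) - baseline

print(importance)
```

Because the permutation breaks only the association between one variable and the target, no assumption about that variable’s distribution is needed; the same loop works unchanged for any model that exposes a `predict` function.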
The Machine Learning Times © 2020