This post is largely excerpted from Dean Abbott’s book Applied Predictive Analytics (Wiley, 2014).

Many predictive modeling projects include hundreds of candidate input variables as part of the analysis, including original variables and new features created to improve the predictive models. Including hundreds of variables as candidates for predictive models can cause problems, however:

1) Some algorithms cannot reliably use hundreds or thousands of input variables.
2) Algorithms that can reliably incorporate hundreds or thousands of variables as candidate inputs or actual inputs to models may take considerable time to train, slowing the iterative process