It is a Mistake to… Listen Only to the Data

(Part 5 of 11 of the Top 10 Data Mining Mistakes, drawn from the Handbook of Statistical Analysis and Data Mining Applications)

Inducing models from data has the virtue of looking at the data afresh, not constrained by old hypotheses. But, while “letting the data speak”, don’t tune out received wisdom. Experience has taught this once brash analyst that those familiar with the domain are usually more vital to the solution of the problem than the technology we bring to bear. Often, nothing inside the data will protect one from significant, but wrong, conclusions. Table 1 contains two

