Experts believe our collection of Big Data will double every two years until 2020.
Many of those digital artifacts come from people like you and me as we “Like” things on Facebook, buy books over the web, post blog entries, and share smartphone photos on Instagram. Yet only a fraction of this data is actually being used.
So what should we do with it?
Eric Siegel says that the most valuable thing we can do with data is to “learn from it how to predict.”
The founder of the Predictive Analytics World conference, Dr. Siegel is also the author of the bestselling book, “Predictive Analytics,” with the catchy subtitle of “The Power to Predict Who Will Click, Buy, Lie, or Die.”
I read his work right on the heels of taking a Coursera MOOC on Data Analysis and was pleased to get Siegel’s common-sense clarifications of the same academic topics.
Throughout the book, Siegel provides real-life examples of how organizations use data and software to infer something unknown, perhaps imperfectly but often with surprising accuracy.
For example, Siegel covers how the retail giant Target Corporation uses predictive analytics to decide which of its shoppers might be pregnant and how financial services giant Chase predicts which customers might pay off their mortgages early (good for the homeowner but bad for Chase, since it loses the interest payments).
Siegel points out that after a predictive application provides insight, somebody still has to do something about it. Target needs to provide pregnancy-related coupons to pregnant customers. Chase needs to convince mortgage holders to stay.
Siegel’s book focuses on five different “effects” of using data to infer some unknown situation:
The Prediction Effect
“A little prediction goes a long way.”
The Data Effect
“Data is always predictive.”
The Induction Effect
“Art drives machine learning; when followed by computer programs, strategies designed in part by informal human creativity succeed in developing predictive models that perform well on new cases.”
The Ensemble Effect
“When joined in an ensemble, predictive models compensate for one another’s limitations, so the ensemble as a whole is more likely to predict correctly than its component models are.”
The Persuasion Effect
“Although imperceivable, the persuasion of an individual can be predicted by uplift modeling, predictively modeling across two distinct training data sets that record, respectively, the outcomes of two competing treatments.”
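To make the Ensemble Effect quoted above a bit more concrete, here is a minimal Python sketch of my own; it is not taken from Siegel’s book. It assumes the models in the ensemble make their mistakes independently of one another and that the ensemble simply goes with the majority vote. The function name majority_vote_accuracy and the 70% accuracy figures are hypothetical, chosen only for illustration.

```python
from itertools import product


def majority_vote_accuracy(accuracies):
    """Chance that a strict majority of models is right on a given case.

    Assumption (not from the book): each model errs independently, and
    accuracies[i] is the probability that model i is correct.
    """
    n = len(accuracies)
    total = 0.0
    # Walk through every pattern of right (True) and wrong (False) models.
    for outcome in product([True, False], repeat=n):
        if sum(outcome) * 2 > n:  # more than half of the models are right
            p = 1.0
            for correct, acc in zip(outcome, accuracies):
                p *= acc if correct else 1.0 - acc
            total += p
    return total


# Three hypothetical models, each right 70% of the time on its own.
print(round(majority_vote_accuracy([0.7, 0.7, 0.7]), 3))  # prints 0.784
```

Under those assumptions, three models that are each right 70% of the time are right about 78% of the time when they vote together. Real models tend to share some of their blind spots, so the gain in practice is usually smaller than this idealized figure.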
In addition to these five effects, Siegel covers the important Big Data topic of ethics.
Imagine that your company could predict which of its customers were likely to die soon. What actions should it take? Who owns that powerful piece of information? Are there any obligations and responsibilities related to holding that insight?
I found Siegel’s book to be not only educational but also enjoyable; it was like a “Moneyball” for the business world. And it was not without a game; Siegel devoted an entire chapter to how IBM’s Watson computer used predictive analytics to beat humans on Jeopardy!
If you want a free copy of the book, just attend one of the upcoming PAWCon events. In September, there is one in Boston where you can hear Dr. Siegel during the keynote presentation.