Data Scientist: The Sexiest Job of the Twenty-first Century
—Title of a Harvard Business Review article by Thomas Davenport and DJ Patil, who in 2015 became the first U.S. Chief Data Scientist
Prediction is booming. It reinvents industries and runs the world. More and more, predictive analytics (PA) drives commerce, manufacturing, healthcare, government, and law enforcement. In these spheres, organizations operate more effectively by way of predicting behavior—i.e., the outcome for each individual customer, employee, patient, voter, and suspect.
Everyone’s doing it. Accenture and Forrester both report that PA’s adoption has more than doubled in recent years. Transparency Market Research projects the PA market will reach $6.5 billion within a few years. A Gartner survey ranked business intelligence and analytics as the current number one investment priority of chief information officers. And in a Salesforce.com study, PA showed the highest growth rate of all sales tech trends, with adoption set to more than double over the next 18 months. High-performance sales teams are four times more likely to already be using PA than underperformers.
I am a witness to PA’s expanding deployment across industries. Predictive Analytics World (PAW), the conference series I founded, has hosted over 10,000 attendees since its launch in 2009 and is expanding well beyond its original PAW Business events. With the expert assistance of industry partners, we’ve launched the industry-focused events PAW Government, PAW Healthcare, PAW Financial, PAW Workforce, and PAW Manufacturing, events for senior executives, and the news site The Predictive Analytics Times.
Since the publication of this book’s first edition in 2013, I have been commissioned to deliver keynote addresses in each of these industries: marketing, market research, e-commerce, financial services, insurance, news media, healthcare, pharmaceuticals, government, human resources, travel, real estate, construction, and law, plus executive summits and university conferences.
Want a future career in futurology? The demand is blowing up. McKinsey forecasts a near-term U.S. shortage of 140,000 analytics experts and 1.5 million managers “with the skills to understand and make decisions based on analysis of big data.” LinkedIn’s number one “Hottest Skills That Got People Hired” is “statistical analysis and data mining.”
PA is like Moneyball for . . . money.
Frequently Asked Questions about Predictive Analytics
Who is this book for?
Everyone. It’s easily understood by all readers. Rather than a how-to for hands-on techies, the book serves lay readers, technology enthusiasts, executives, and analytics experts alike by covering new case studies and the latest state-of-the-art techniques.
Is the idea of predictive analytics hard to understand?
Not at all. The heady, sophisticated notion of learning from data to predict may sound beyond reach, but breeze through the short Introduction chapter and you’ll see: The basic idea is clear, accessible, and undeniably far-reaching.
Is this book a how-to?
No, it is a conceptually complete, substantive introduction and industry overview.
Not a how-to? Then why should techies read it?
Although this mathless introduction is understandable by any reader, including those with no technical background, here’s why it also affords value for would-be and established hands-on practitioners:
That said, burgeoning practitioners who wish to jump directly to a more traditional, technically in-depth or hands-on treatment of this topic should consider themselves warned: This is not the book you are seeking (but it makes a good gift; any of your relatives would be able to understand it and learn about your field of interest).
As with introductions to other fields of science and engineering, if you are pursuing a career in the field, this book will set the foundation, yet only whet your appetite for more. At the end of this book, the Hands-On Guide points you to where to go next for the technical how-to and the advanced underlying theory and math.
What is the purpose of this book?
I wrote this book to demonstrate why PA is intuitive, powerful, and awe-inspiring. It’s a book about the most influential and valuable achievements of computerized prediction and the two things that make it possible: the people behind it and the fascinating science that powers it.
While there are a number of books that approach the how-to side of PA, this book serves a different purpose (which turned out to be a rewarding challenge for its author): sharing with a wider audience a complete picture of the field, from the way in which it empowers organizations, down to the inner workings of predictive modeling.
With its impact on the world growing so quickly, it’s high time the predictive power of data—and how to scientifically tap it—be demystified. Learning from data to predict human behavior is no longer arcane.
How technical does this book get?
While accessible and friendly to newcomers of any background, this book explores “under the hood” far enough to reveal the inner workings of decision trees (Chapter 4), an exemplary form of predictive model that serves well as a place to start learning about PA, and often as a strong first option when executing a PA project.
I strove to go as deep as possible—substantive across the gamut of fascinating topics related to PA—while still sustaining interest and accessibility not only for neophyte users, but even for those interested in the field avocationally, curious about science and how it is changing the world.
Is this a university textbook?
This book has served as a textbook at more than 30 colleges and universities. A former computer science professor, I wrote this introduction to be conceptually complete. In the table of contents, the words in parentheses beside each chapter’s “catchy” title reveal an outline that covers the fundamentals: (1) model deployment, (2) ethics, (3) data, (4) predictive modeling, (5) ensemble models, (6) question answering, and (7) uplift modeling. To guide reading assignments, see the diagram under the next question below.
However, this is not written in the formal style of a textbook; rather, I sought to deliver an entertaining, engaging, relevant work that illustrates the concepts largely via anecdotes.
For instructors considering this book for course material, additional resources and information may be found at www.teachPA.com.
How should I read this book?
The chapters of this book build upon one another. Some depend only on first reading the Introduction, but others build cumulatively. The figure below depicts these dependencies—read a chapter only after first reading the one it points up to. For example, Chapter 3 assumes you’ve already read Chapter 1, which assumes you’ve read the Introduction.
[Figure: Dependencies between chapters. An arrow pointing up means “Read the chapter above first.”]
What’s new in the “Revised and Updated” edition of Predictive Analytics?
Where can I learn more after this book, such as a how-to for hands-on practice?
Excerpted with permission of the publisher from Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, Revised and Updated Edition (Wiley, January 2016) by Eric Siegel, Ph.D. Siegel is the founder of the Predictive Analytics World conference series—which covers both business and government deployment—executive editor of The Predictive Analytics Times, and a former computer science professor at Columbia University. For more information about predictive analytics, see the Predictive Analytics Guide.