By: Eric Siegel, Predictive Analytics World
Watch the latest episode of The Dr. Data Show, which answers the question, “Why do computers predict when you’ll die?” – with five example reasons!
About the Dr. Data Show. This new web series breaks the mold for data science infotainment, captivating the planet with short webisodes that cover the very best of machine learning and predictive analytics.
Full Transcript of This Episode
Please note that viewing the video (above) is recommended, since it includes complementary visuals. Also, certain vocal inflections and gesticulations hold meaning. Some of the intended meaning is lost by reading this transcript rather than watching the video.
Welcome to “The Dr. Data Show”! I’m Eric Siegel.
Various organizations predict when you’ll die. Insurance companies, police departments, doctors — and in some cases, this actually helps you. The fact is, there are computers out there calculating the probability you will die in the relatively near future. These predictions serve a variety of purposes, such as trying to medically allay your demise, or at least administratively manage it.
Even Facebook has applied for a patent that covers predicting if one of your friends has died, even if their profile hasn’t been updated to reflect that yet.
Death prediction is a very morbid topic indeed — but a very practical one. So how does it work? What technological sorcery is at play? And how accurate is it? And, gee whiz, what’s it used for? In the next several minutes, I shall answer these questions and cover five reasons computers predict when you will die.
It’s a defining characteristic of life that we’re all gonna die. We’re gonna kick the bucket, buy the farm, bite the dust, and croak.
People do try to gloss over this little detail. For example, the publisher of my book, “Predictive Analytics,” was initially quite resistant to authorizing the book’s subtitle ’cause it has the word “die” in it. The book title’s an informal definition of the field, by the way. The full title is, “Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die.” After the book was published, IBM offices in Australia even disallowed the book’s full title in their printed materials because of that last word.
But we shouldn’t gloss over death — not only for the obvious reason of its unavoidability, but also ’cause death matters. I mean, every important thing a person does can be valuable to predict — for all kinds of business, government, and healthcare operations. And this definitely applies for the final thing we will each do. To put it gently, I’m talking about, you know, keeling over.
How does it work? Well, the same machine learning methods apply for this use of predictive analytics as for predicting any other kind of behavior or outcome. What makes the difference for what it’s predicting is just the training data. For predicting death, we supply data that includes examples of A) people who died — say, within a time window of 12 months — and examples of B) people who did not die within that time frame, so that our trusty machine can derive patterns that discern group A from group B. Practically speaking, for the machine to learn is for it to find trends in how those two groups differ.
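As a rough illustration of learning from group A versus group B, here is a minimal Python sketch. It is not the method of any organization mentioned in this episode. The two inputs (a scaled age and a smoker flag), the synthetic data, and the helper names `predict_proba` and `train_logistic` are all assumptions made up for this example, and logistic regression is just one common choice of learning method.

```python
import math
import random

def predict_proba(weights, bias, x):
    """Logistic model: turn a weighted sum of inputs into a probability."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, lr=0.5, epochs=500):
    """Fit the weights by plain batch gradient descent on the log loss."""
    n, n_features = len(rows), len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            err = predict_proba(weights, bias, x) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Synthetic, made-up training data (NOT real mortality statistics).
random.seed(0)
rows, labels = [], []
for _ in range(1000):
    age_scaled = (random.uniform(30, 90) - 60) / 20      # roughly -1.5 to 1.5
    smoker = 1.0 if random.random() < 0.3 else 0.0
    # Hypothetical "true" risk used only to generate the labels.
    true_risk = predict_proba([1.5, 1.0], -2.0, [age_scaled, smoker])
    died = 1 if random.random() < true_risk else 0       # group A = 1, group B = 0
    rows.append([age_scaled, smoker])
    labels.append(died)

weights, bias = train_logistic(rows, labels)
risk_old_smoker = predict_proba(weights, bias, [1.0, 1.0])        # age 80, smoker
risk_young_nonsmoker = predict_proba(weights, bias, [-1.0, 0.0])  # age 40, non-smoker
print("learned weights:", weights, "bias:", bias)
print("risk, 80-year-old smoker:", risk_old_smoker)
print("risk, 40-year-old non-smoker:", risk_young_nonsmoker)
```

The only thing that makes this a mortality model is the label column. Swap in a different outcome, such as who clicked or who bought, and the same code learns to predict that instead.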
The basis for differentiating these two groups will be any and all things recorded about individuals, including their demographics and their previous behavior. Whichever such attributes are included in the training data end up serving as inputs to the predictive model. For example, in healthcare this can be all kinds of clinical features, test results, and even MRIs. But then it goes beyond all that to some pretty surprising stuff. All kinds of unexpected factors you wouldn’t immediately think of might help predict. For example, early retirement is a risk factor. For a group of Austrian men in one study, early retirement decreased life expectancy. And, on the boat the Titanic in 1912, women were almost four times more likely to survive than men. Rock stars die younger, and solo rock stars die even younger than those in bands, according to public health offices in the U.K.
Now, if you’d like to quickly calculate your own fate, death-clock.org will estimate the Grim Reaper’s ETA by way of statistics from the World Health Organization — it just asks for a handful of elements like age, gender, body-mass index, and cigarette and alcohol consumption. For me, it says I’ve got until December 20th, 2052.
But that’s far from certain. As with most outcomes we may wish to predict, death is impossible for anyone or any technology to predict with high accuracy in general. But it’s still useful. I should be quick to point out that, like many other outcomes, it’s definitely possible for machine learning to predict death considerably better than guessing — and often better than human experts. I’ll explain why that limited level of prediction is still extremely useful — but first a little more on what it means to estimate these probabilities.
The trends learned from data are together known as a predictive model. For death prediction, this is often called a mortality model. Or, the more genteel among us might call it a croakometer.
Just kidding. Now, when the model predicts for an individual person today whether they’ll live or die, only time will tell for sure if it turns out to be correct. To be more specific, some models calculate, for example, the probability you will die within, say, one year. So, among those assigned around a, say, 30% probability, about 30% of those will indeed die within a year. But we don’t have certainty about any one of them as an individual — on a case-by-case basis, we don’t know for sure one way or the other. You can think of probabilities as just a level of predictive confidence that death will happen — a risk level — which is basically never full on 100%. There’s never total certainty — always some shade of gray.
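The 30% example can be checked with a small simulation. This Python sketch assumes a model whose probabilities are perfectly calibrated by construction, which real models only approximate. The data is simulated, and the function name `observed_rate_near` and the tolerance of 0.05 are choices made for this illustration.

```python
import random

def observed_rate_near(probabilities, outcomes, target, tolerance=0.05):
    """Observed death rate among people whose predicted risk is near `target`."""
    selected = [y for p, y in zip(probabilities, outcomes)
                if abs(p - target) <= tolerance]
    if not selected:
        return None  # nobody was assigned a risk in that band
    return sum(selected) / len(selected)

# Simulated population: each person gets a predicted risk, and the outcome
# is drawn with exactly that probability (an idealized, calibrated model).
random.seed(1)
probabilities = [random.random() for _ in range(100_000)]
outcomes = [1 if random.random() < p else 0 for p in probabilities]

rate = observed_rate_near(probabilities, outcomes, target=0.30)
print("observed death rate among those assigned about 30%:", rate)
```

The group-level rate lands close to 0.30, yet nothing in the output says which individuals in that group are the ones who die.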
Now, just to clarify, even though the model just gives us probabilities, the word “prediction” does still apply. If you say, “Let’s put our money on everyone assigned a 90% or greater risk and bet they’ll die,” then we’re using the model to predict, and in fact we may be quite accurate for that particular smaller segment of high-risk individuals, but predictions for the overall population won’t be accurate per se. So beware any claims of “high accuracy” in machine learning. It’s not a magic crystal ball.
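Betting on the 90%-and-up segment can be sketched in a few lines of Python. The ten risk scores and outcomes below are invented purely for illustration, and `bet_on_high_risk` is a hypothetical helper name. The sketch reports how often the bet is right within the flagged segment, and what share of all deaths that segment actually covers.

```python
def bet_on_high_risk(probabilities, outcomes, threshold=0.9):
    """Flag everyone at or above `threshold` and score that bet.

    Returns (precision, recall):
      precision = share of flagged people who did die
      recall    = share of all deaths that were flagged
    """
    flagged = [y for p, y in zip(probabilities, outcomes) if p >= threshold]
    total_deaths = sum(outcomes)
    precision = sum(flagged) / len(flagged) if flagged else None
    recall = sum(flagged) / total_deaths if total_deaths else None
    return precision, recall

# Invented example: 10 people, their predicted risks, and what happened (1 = died).
probabilities = [0.95, 0.92, 0.97, 0.91, 0.60, 0.40, 0.30, 0.10, 0.05, 0.55]
outcomes      = [1,    1,    1,    0,    1,    0,    0,    0,    0,    1]

precision, recall = bet_on_high_risk(probabilities, outcomes)
print("precision in the high-risk segment:", precision)
print("share of all deaths covered by that segment:", recall)
```

In this made-up case the bet is right for three of the four people flagged, but two of the five deaths fall outside the flagged segment, which is why a strong result on a small high-risk group does not amount to high accuracy across the whole population.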
So by the way, all these points also apply for all predictive objectives, for any other outcome or behavior you’re trying to predict, be it predicting who will click, who will buy, who will lie, or who will eat pie.
Anyway, predicting death — even just better than guessing — can be super valuable. Here are five ways machine learning for death prediction is put to good use.
Number one: Healthcare providers predict death to help prevent it. Patients with a high mortality risk are flagged accordingly, and this guides clinical treatment decisions. For example, Riskprediction.org.uk — you can go check it out — predicts the risk of death in surgery, based on characteristics of the patient and the type of surgery.
Number two: End-of-life counseling, palliative care, and hospice care can be more effectively targeted when we know which patients are the most likely to pass away. Many who are nearing the end of life will greatly benefit from these offerings, especially if well-timed. For this reason, there’s an active trend among healthcare institutions, insurance companies, and research labs to apply machine learning for this purpose.
Number three: Suicide prevention. The VA, the Crisis Text Line, and psychiatric research at large work to flag which individuals — be they veterans or those who’ve contacted a crisis hotline — are at the greatest risk in order to improve triage and more precisely target intervention.
Number four: Law enforcement and the military predict kill victims in order to protect. For example, researchers assess the risk to individual soldiers, such as when they’re parachuting, and law enforcement in, for example, Maryland applies predictive models to detect which inmates are more at risk to be either perpetrators or victims of murder.
And finally, number five: Life insurance. Now, the central thing life insurance companies do is predict life expectancy — that is, they predict when you’re going to die. That’s how they set premiums and manage coverage. To improve their predictions, life insurance companies more and more these days go beyond conventional actuarial look-up tables and integrate machine learning methods in order to improve predictive performance.
And, hold on, we’re not quite finished yet — as a special bonus, there’s actually one more thing to predict even after death. The Chicago Police Department predicts whether a murder can be solved, based on the characteristics of a homicide and its victim. The objective is to better manage and triage investigations, which improves overall enforcement, which, one hopes, could help reduce the number of future murders.
And that wraps up another deadly episode of the Dr. Data Show. I’m Eric Siegel; thanks for watching. Hit “like” and share this video if you think your friends would also be interested in the very morbid but practical topic of death prediction. And for access to the entire web series, go to TheDoctorDataShow.com.
Excerpt of “Predict This!” rap during closing credits:
Who’s your data?
Provide me the data to improve
and I’ll apply the computation.
Predictive analytics can help you with decisions;
you can call, mail, credit, or hire with precision.
On law, love, and life, you can prognosticate
whom to investigate, incarcerate, set up on a date, or medicate.
Charlie Brown never gets his kicks;
that’s why every old dog needs a brand new trick.
If you get sick of chasing sticks or clicks with just a quick fix,
you need to learn to predict.
I can predict your every move;
just gimme all your information.
Who’s your data?
Provide me the data to improve
and I’ll apply the computation.
I love it when you call me big data.