Originally published by Scientific American on June 14, 2017
Let’s not lose our scientific minds. As the White House’s travel ban maneuvers toward the U.S. Supreme Court for possible approval, let’s take a scientific look at whether it makes sense for security. Does data support the ban’s underlying supposition that disallowing Muslims—at least those from certain countries—would decrease the risk of terrorism?
This question begs to be addressed, but who’s best suited to do so? The field of data science holds that jurisdiction. It is the discipline of improving operational decisions, which are often made on hunches and intuition, with empirical, fact-based insights. Data science has assumed the expansive role of running our increasingly data-driven world. It drives security screening for both government and commerce, including predictive policing, assessing convicts for parole decisions and detecting fraudulent transactions. (And the field extends beyond security to optimize marketing, online ads, financial credit risk, health care and more.) Data science can also be applied to improve screening for terrorists. So, if Muslims as a group posed a security risk that warranted their prohibition, then data science would bear that out.
From the perspective of data science, however, a Muslim ban would weaken security, not strengthen it, even if applied only to a limited set of countries. Indeed, the majority of data scientists oppose such a ban. Unfortunately, the tricky pitfalls of quantitative analysis lead some to the opposite conclusion. As a result, for all the value that today’s growing use of data contributes to organizational decision-making, data misinterpreted as supporting a ban would only contribute uncertainty about the status of religious equality. Here are three analytical traps that cause people to misinterpret data as supporting a Muslim ban:
First, people excessively “slice and dice” data. Depending on how data are selected, certain portions of a terrorism database can wrongly appear to justify religion-based security screening. If a cherry-picked data sample designates Muslims as being most likely to commit an act of terror, then another sample may designate them as least likely. It depends on which records you include for analysis.
For example, a sample of terror incidents could be selected for analysis based on the country attacked, suspect’s country of origin, basis for entry (refugee, student, fiancé, etcetera) and era (for example, before or after 9/11). Different samples will suggest different conclusions and, indeed, published reports on the proportion of terrorism enacted by Muslims have disagreed with one another for this very reason. By poignant coincidence, data scientists affectionately describe such manipulation as “torturing the data until it confesses.”
Second, people misjudge the risk individual immigrants present. Even if the odds of violence differed greatly between religions, the odds of any one individual being a terrorist would remain tiny. This is because across any major religion the vast majority of people would never engage in terrorism. Imagine hypothetically that, after slicing and dicing, data show immigrants of a certain religion from a certain country were five times more likely to engage in terrorism than average over the past 10 years. (Such trends exist, because major religions see the frequency of terrorism carried out in their name rise and fall over time.) Even within such a group the odds for any one individual would be infinitesimally small—below 0.01 percent, in most cases. For no major religion are individual members especially dangerous.
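The arithmetic behind this second trap can be sketched in a few lines of Python. Everything numeric here is an assumption made for illustration: the baseline rate of 1 in 100,000 is a hypothetical placeholder rather than a measured figure, and the function name `subgroup_risk` is invented for this sketch. Only the fivefold multiplier comes from the hypothetical in the text.

```python
# Illustrative only: the baseline rate is a hypothetical placeholder,
# not a statistic drawn from any terrorism database.

def subgroup_risk(baseline_rate: float, relative_risk: float) -> float:
    """Absolute risk for a subgroup: the baseline rate scaled by the relative risk."""
    return baseline_rate * relative_risk

baseline_rate = 1 / 100_000  # assumed: 1 in 100,000 people (0.001 percent)
relative_risk = 5.0          # the "five times more likely" hypothetical from the text

risk = subgroup_risk(baseline_rate, relative_risk)
share_not_involved = 1 - risk

print(f"Subgroup risk: {risk * 100:.4f} percent")
print(f"Share never involved: {share_not_involved * 100:.4f} percent")
```

Under these assumed inputs, a fivefold relative risk still leaves the absolute risk for any one individual below 0.01 percent. A large multiplier applied to a tiny base rate remains tiny, which is why relative comparisons between groups say little about any one person.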
Third, screening by religion only impairs the ability to predict who the terrorists are. It’s more than just philosophical to say people are defined by their behavior—by what they do rather than in which category they fit. In practical application, data science repeatedly shows that people’s prior actions predict future behavior more accurately than demographic profile data do.
Accordingly, any immigration-screening process can be continually improved by amassing more elements from what is an open-ended range of behavioral data, including personal and professional activities as well as financial transactions. Myriad as-yet undiscovered behavioral patterns would catch more terrorists than indiscriminately screening by demographic category. Because a category such as religion provides less information about future behavior, including it as a factor would only distract analytical number crunching (that is, predictive modeling) from pinpointing the best way to screen by prior behavior. Doing so would also reinforce existing biases among government workers and lessen the perceived importance of behavioral data.
The civil rights perspective further bolsters the position that behavior is more predictive than religion. A Muslim ban epitomizes prejudice in the most literal sense: It would be the very act of prejudging individuals based on a protected class, religion. Religion carries the status of a protected class because we respect it as a defining attribute, intrinsic to one’s identity. As with other defining attributes like race, gender and country of origin, an individual’s religion holds at most an indirect relationship with whether the person would engage in terrorism. Although terrorism can be shaped by religion, religious scholars argue that it doesn’t stem from religion per se. Terrorism’s causes include socioeconomic and geopolitical factors; an act of terror “in the name of” a religion does not mean it was “because of” that religion. Consequently, it’s an individual’s past behavior—not his or her religion—that’s pertinent to detecting any possible malicious intent.
Get the target right. Rather than screening by religion we must focus squarely on the threat we’re compelled to predict: terrorism. The very act of treating an immigrant as an individual, evaluating by way of the person’s unique backstory of behavior, will better predict risk and thereby improve security. In contrast, a ban that targets Muslims would be an epic national mistake—not only in terms of social justice but also in terms of security.
About the Author
Eric Siegel, Ph.D., founder of the Predictive Analytics World conference series and executive editor of The Predictive Analytics Times, makes the how and why of predictive analytics understandable and captivating. He is the author of the award-winning Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, a former Columbia University professor who used to sing to his students, and a renowned speaker, educator, and leader in the field.
Eric has appeared on Al Jazeera America, Bloomberg TV and Radio, Business News Network (Canada), Fox News, Israel National Radio, NPR Marketplace, Radio National (Australia), and TheStreet. He and his book have been featured in Businessweek, CBS MoneyWatch, Contagious Magazine, The European Business Review, The Financial Times, Forbes, Forrester, Fortune, Harvard Business Review, The Huffington Post, The New York Review of Books, Newsweek, Quartz, Salon, Scientific American, The Seattle Post-Intelligencer, The Wall Street Journal, The Washington Post, and WSJ MarketWatch. Follow him at @predictanalytic