Predictive policing: Data can be used to prevent crime, but is that data racially tinged?
As predictive analytics advances decision making across the public and private sectors, nowhere could this prove more important – or more risky – than in law enforcement. If the rule of law is the cornerstone of society, getting it right is literally foundational. But policing by data, without perpetuating or even magnifying the human biases captured within that data, turns out to be a very tricky art indeed.
Predictive policing introduces a scientific element to law enforcement decisions, such as whether to investigate or detain, how long to sentence, and whether to parole. In making such decisions, judges and officers take into consideration the calculated probability that a suspect or defendant will be convicted of a crime in the future. Calculating predictive probabilities from data is the job of predictive modeling (aka machine learning) software. It automatically establishes patterns by combing historical conviction records, and in turn these patterns – which together form a predictive model – serve to calculate the probability for an individual whose future is as yet unknown. Such predictive models base their calculations on the defendant's demographic and behavioral factors, which may include prior convictions, income level, employment status, family background, neighborhood, education level, and the behavior of family and friends.
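To make that mechanics concrete, here is a minimal sketch of the modeling step. The records and factors below are synthetic and purely illustrative – no deployed system is this simple, and real models train on far larger datasets with far more careful validation – but the shape of the computation is the same: fit a model to historical outcomes, then score a new individual.

```python
# Minimal sketch: learn patterns from historical records, then score a new
# individual. All data below is synthetic and illustrative only.
from sklearn.linear_model import LogisticRegression

# Historical records: [prior convictions, age, employed (1/0)] -> re-offended?
X = [[0, 34, 1], [3, 22, 0], [1, 45, 1], [5, 19, 0], [0, 52, 1], [2, 27, 0]]
y = [0, 1, 0, 1, 0, 1]

model = LogisticRegression().fit(X, y)  # the learned patterns ARE the model

# Applying the model to a defendant whose future is as yet unknown yields
# the calculated probability that judges and officers consult.
new_defendant = [[1, 30, 0]]
print(f"P(re-offense) = {model.predict_proba(new_defendant)[0, 1]:.2f}")
```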
Ironically, predictive policing arose in part to address the very social justice infringements for which it's now criticized. With stop and frisk and other procedures reported to be discriminatory and often ineffective, there emerged a movement to turn to data as a potentially objective, unbiased means to optimize police work. Averting prejudice was part of the impetus. But the devil's in the details. In the process of deploying predictive policing and analyzing its use, complications involving racial bias and due process revealed themselves.
The first-ever comprehensive overview, The Rise of Big Data Policing: Surveillance, Race, and the Future of Law Enforcement, strikes an adept balance in covering both the promise and the peril of predictive policing. No one knows better than the book's author, law professor Andrew Guthrie Ferguson, what a high-wire act it is to deploy this technology justly. The book's mission is to highlight the risks and set a cautionary tone – however, Ferguson avoids the common misstep of writing off predictive policing as an endeavor that will always intrinsically stand in opposition to racial justice. The book duly covers the technical capabilities, underlying technology, historical developments, and numerical evidence that support both its deployed value and its further potential (on a closely related topic, I covered the analogous value of applying predictive analytics for homeland security).
The book then balances this out by turning to the pitfalls, inadvertent yet dire threats to civil liberties and racial justice. Here are some of the main topics the book covers in that arena.
Racial Bias
As Ferguson puts it, “The question arises about how to disentangle legacy police practices that have resulted in disproportionate numbers of African American men being arrested or involved in the criminal justice system… if input data is infected with racial bias, how can the resulting algorithmic output be trusted?” It turns out that predictive models consulted for sentencing decisions falsely flag black defendants more often than white defendants. That is, among those who will not re-offend, the predictive system inaccurately labels black defendants as higher risk more often than it does white defendants. In the most widely cited piece on bias in predictive policing, ProPublica reports that the nationally used COMPAS model (Correctional Offender Management Profiling for Alternative Sanctions) falsely flags black defendants at almost twice the rate of white defendants (44.9% and 23.5%, respectively). However, this is only part of a mathematical conundrum that, to some, blurs the meaning of “fairness.” Despite the inequity in false flags, each individual flag is itself racially equitable: Among those flagged as higher risk, the portion falsely flagged is similar for black and white defendants. Ferguson's book doesn't explore this hairy conundrum in detail, but you can learn more in an article I published about it.
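The arithmetic behind that conundrum is easy to demonstrate. The sketch below uses hypothetical counts chosen only to reproduce the pattern ProPublica reported – they are not the actual Broward County data – and shows how both statements can be true at once whenever the two groups' underlying re-offense rates differ.

```python
# A minimal sketch of the two competing "fairness" metrics discussed above.
# Counts are hypothetical, chosen only to mimic the ProPublica COMPAS pattern.

groups = {
    # (non-reoffenders flagged, non-reoffenders total,
    #  reoffenders flagged,     reoffenders total)
    "black defendants": (449, 1000, 720, 1000),
    "white defendants": (235, 1000, 360, 600),
}

for group, (fp, neg, tp, pos) in groups.items():
    false_positive_rate = fp / neg          # among those who won't re-offend,
                                            # how many get flagged anyway?
    false_discovery_rate = fp / (fp + tp)   # among those flagged, how many
                                            # were flagged wrongly?
    print(f"{group}: FPR = {false_positive_rate:.1%}, "
          f"FDR = {false_discovery_rate:.1%}")

# Output:
# black defendants: FPR = 44.9%, FDR = 38.4%
# white defendants: FPR = 23.5%, FDR = 39.5%
#
# Each flag looks racially equitable (similar FDR), yet innocent black
# defendants are flagged nearly twice as often (FPR). Both statements hold
# simultaneously because the groups' underlying re-offense rates differ.
```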
Ground Truth: One Source of Data Bias
The data analyzed to develop crime-predicting models includes proportionately more prosecutions of black criminals than white ones and, conversely, proportionately fewer cases of black criminals getting away with crime (false negatives) than of white criminals. Starting with a quote from the ACLU’s Ezekiel Edwards, Ferguson spells out why this is so:
"Time and again, analysis of stops, frisks, searches, arrests, pretrial detentions, convictions, and sentencing reveal differential treatment of people of color.” If predictive policing results in more targeted police presence, the system runs the risk of creating its own self-fulfilling prediction. Predict a hot spot. Send police to arrest people at the hot spot. Input the data memorializing that the area is hot. Use that data for your next prediction. Repeat.

Since crime that never results in a conviction is, by definition, absent from the data, measures of model performance cannot reveal the extent to which black defendants are unjustly flagged more often. After all, the model doesn't predict crime per se; it predicts convictions – you don't know what you don't know. Although Ferguson doesn't refer to this as a lack of ground truth, that is the widely used term for this issue, one that is frequently covered, e.g., by The Washington Post and by data scientists.
Constitutional Issues: Generalized Suspicion
A particularly thorny dispute about fairness – one that's actually an open constitutional question – arises when predictive flags bring about searches and seizures. The Fourth Amendment dictates that any search or seizure be “reasonable,” but this requirement is vulnerable to corruption when predictive flags lead to generalized suspicion, i.e., suspicion based on bias (such as the individual's race) or factors that are not specific to the individual (such as the location in which the individual finds him- or herself). For example, Ferguson tells of a black driver in a location flagged for additional patrolling due to a higher calculated probability of crime. The flag has placed a patrol nearby, and the officer pulls the driver over partly on subjective “gut” suspicion and partly because of a minor vehicle violation that may serve to explain the stop's “reasonableness”: the vehicle's windows are more heavily tinted than the law permits. It's this scenario's ambiguity that illustrates the dilemma. Do such predictive flags lead to false stops that are rationalized retroactively rather than meeting an established standard of reasonableness? “The shift to generalized suspicion also encourages stereotyping and guilt by association. This, in turn, weakens Fourth Amendment protections by distorting the individualized suspicion standard on the street,” Ferguson adds. This could also magnify the cycle perpetuating racial bias, further corrupting ground truth in the data.
Transparency: Opening Up Otherwise-Secret Models that Help Determine Incarceration
Crime-predicting models must be nakedly visible, not amorphous black boxes. To keep their creators, proponents, and users accountable, predictive models must be open and transparent so they’re inspectable for bias. A model’s inner workings matter when assessing its design, intent, and behavior. For example, race may hold some influence on a model’s output by way of proxies. Although such models almost never input race directly, they may incorporate unchosen, involuntary factors that approximate race, such as family background, neighborhood, education level, and the behavior of family and friends. For example, FICO credit scores have been criticized for incorporating factors such as the “number of bank accounts kept, [which] could interact with culture – and hence race – in unfair ways.”
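A toy example makes the proxy effect visible. In the sketch below, race is never an input to the scoring rule; a single synthetic "neighborhood" flag correlated with race (all numbers invented for illustration) is enough to make the scores diverge by race anyway.

```python
import random

random.seed(1)

# Synthetic population: race is never given to the "model", but neighborhood
# is correlated with race (a legacy of segregated housing, say).
people = []
for _ in range(10_000):
    race = random.choice(["black", "white"])
    high_patrol_hood = random.random() < (0.7 if race == "black" else 0.3)
    people.append((race, high_patrol_hood))

# A "model" that scores risk on neighborhood alone -- no race input at all.
def risk_score(high_patrol_hood: bool) -> float:
    return 0.8 if high_patrol_hood else 0.2

by_race = {"black": [], "white": []}
for race, hood in people:
    by_race[race].append(risk_score(hood))

for race, scores in by_race.items():
    print(f"{race}: mean risk score = {sum(scores) / len(scores):.2f}")

# Typical output: black ~0.62, white ~0.38 -- the scores diverge by race
# even though race was never an input to the scoring rule.
```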

Despite this, model transparency is not yet standard. For example, the popular COMPAS model, which informs sentencing and parole decisions, is sealed tight. The ways in which it incorporates such factors are unknown – to law enforcement, the defendant, and the public. In fact, the model's creators recently revealed that it incorporates only six of the 137 factors collected – but which six remains a proprietary secret. However, the founder of the company behind the model has stated that, if factors correlated with race, such as poverty and joblessness, “…are omitted from your risk assessment, accuracy goes down” (so we are left to infer the model may incorporate such factors).
In his book, Ferguson calls for accountability but stops short of demanding transparency, largely giving the vendors of predictive models a pass, in part to protect “private companies whose business models depend on keeping proprietary technology secret.” I view this allowance as inherently contradictory, since a lack of transparency necessarily compromises accountability. Ferguson also argues that most lay consumers of model output, such as patrolling police officers, would not be equipped to comprehend a model's inner workings anyway. However, that is no counterargument to the benefit of transparency for third-party analytics experts who could serve to audit a predictive model. A couple of years before the book came out, Ferguson had influenced my thinking in the opposite direction with a quote he gave me for my writing. He told me, “Predictive analytics is clearly the future of law enforcement. The problem is that the forecast for transparency and accountability is less than clear.”
I disagree with Ferguson's position that model transparency may in some cases be optional (a position he also covers in an otherwise-valuable presentation accessible online). This opacity infringes on liberty. Keeping the inner workings of crime-predictive models proprietary is like calling an expert witness without allowing the defense to cross-examine – or like enforcing a public policy whose details are confidential. There's a movement to make such algorithms transparent in the name of accountability and due process, advanced in part by pertinent legislation in Wisconsin and in New York City, although the U.S. Supreme Court declined to take on a pertinent case last year.
Deployment: It’s How You Use It that Matters
In conclusion, Ferguson lands on the most pertinent point: It’s how you use it. “This book ends with a prediction: Big data technologies will improve the risk-identification capacities of police but will not offer clarity about appropriate remedies.” By “remedy,” this lawyer is referring to the way police respond, the actions taken. When it comes to fairness in predictive policing, it is less the underlying number crunching and more the manner in which it’s acted upon that makes the difference.
Should judges use big data tools for sentencing decisions? The designer of the popular COMPAS crime-predicting model did not originally intend it to be used this way. However, he “gradually softened on whether this could be used in the courts or not.” The Wisconsin Supreme Court, for its part, has set limits on the use of proprietary scores in sentencing decisions: risk scores “may not be considered as the determinative factor in deciding whether the offender can be supervised safely and effectively in the community.”
To address the question of how model predictions should be acted upon, I urge law enforcement to educate and guide decision makers on how big data tools inevitably encode racial inequity. Train judges, parole boards, and officers to understand the pertinent caveats when they’re given the calculated probability a suspect, defendant, or convict will offend or reoffend. In so doing, empower these decision makers to incorporate such considerations in whatever manner they deem fit – just as they already do with the predictive probabilities in the first place. See my recent article for more on the considerations upon which officers of the law should reflect.
Ferguson’s legal expertise serves well as he addresses the dilemma of translating predictions based on data into police remedies – and it serves well throughout the other varied topics of this multi-faceted, well-researched book. The Amazon description calls the book “a must read for anyone concerned with how technology will revolutionize law enforcement and its potential threat to the security, privacy, and constitutional rights of citizens.” I couldn’t have put it better myself.
--
Eric Siegel, Ph.D., founder of the Predictive Analytics World and Deep Learning World conference series – which include the annual PAW Government – and executive editor of The Predictive Analytics Times, makes the how and why of predictive analytics (aka machine learning) understandable and captivating. He is the author of the award-winning Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, a former Columbia University professor, and a renowned speaker, educator, and leader in the field.
No, the Yellowstone supervolcano is not ‘overdue’
Why mega-eruptions like the ones that covered North America in ash are the least of your worries.
Ash deposits of some of North America's largest volcanic eruptions.
- The supervolcano under Yellowstone produced three massive eruptions over the past few million years.
- Each eruption covered much of what is now the western United States in an ash layer several feet deep.
- The last eruption was 640,000 years ago, but that doesn't mean the next eruption is overdue.
The end of the world as we know it
Panoramic view of Yellowstone National Park
Image: Heinrich Berann for the National Park Service – public domain
Of the many freak ways to shuffle off this mortal coil – lightning strikes, shark bites, falling pianos – here's one you can safely scratch off your worry list: an outbreak of the Yellowstone supervolcano.
As the map below shows, previous eruptions at Yellowstone were so massive that the ash fall covered most of what is now the western United States. A similar event today would not only claim countless lives directly, but also create enough subsidiary disruption to kill off global civilisation as we know it. A relatively recent eruption of the Toba supervolcano in Indonesia may have come close to killing off the human species (see further below).
However, just because a scenario is grim does not mean that it is likely (insert topical political joke here). In this case, the doom mongers claiming an eruption is 'overdue' are wrong. Yellowstone is not a library book or an oil change. Just because the previous mega-eruption happened long ago doesn't mean the next one is imminent.
Ash beds of North America
Ash beds deposited by major volcanic eruptions in North America.
Image: USGS – public domain
This map shows the location of the Yellowstone plateau and the ash beds deposited by its three most recent major outbreaks, plus two other eruptions – one similarly massive, the other the most recent one in North America.
Huckleberry Ridge
The Huckleberry Ridge eruption occurred 2.1 million years ago. It ejected 2,450 km3 (588 cubic miles) of material, making it the largest known eruption in Yellowstone's history and in fact the largest eruption in North America in the past few million years.
This is the oldest of the three most recent caldera-forming eruptions of the Yellowstone hotspot. It created the Island Park Caldera, which lies partly within Yellowstone National Park in Wyoming and extends westward into Idaho. Ash from this eruption covered an area from southern California to North Dakota, and from southern Idaho to northern Texas.
Mesa Falls
About 1.3 million years ago, the Mesa Falls eruption ejected 280 km3 (67 cubic miles) of material and created the Henry's Fork Caldera, located in Idaho, west of Yellowstone.
It was the smallest of the three major Yellowstone eruptions, both in terms of material ejected and area covered: 'only' most of present-day Wyoming, Colorado, Kansas and Nebraska, and about half of South Dakota.
Lava Creek
The Lava Creek eruption was the most recent major eruption of Yellowstone: about 640,000 years ago. It was the second-largest eruption in North America in the past few million years, creating the Yellowstone Caldera.
It ejected only about 1,000 km3 (240 cubic miles) of material, i.e. less than half of the Huckleberry Ridge eruption. However, its debris is spread out over a significantly wider area: basically, Huckleberry Ridge plus larger slices of both Canada and Mexico, plus most of Texas, Louisiana, Arkansas, and Missouri.
Long Valley
This eruption occurred about 760,000 years ago. It was centered on eastern California, where it created the Long Valley Caldera, and spewed out 580 km3 (139 cubic miles) of material. This makes it North America's third-largest eruption of the past few million years.
The material ejected by this eruption is known as the Bishop ash bed, and covers the central and western parts of the Lava Creek ash bed.
Mount St Helens
The eruption of Mount St Helens in 1980 was the deadliest and most destructive volcanic event in U.S. history: it created a mile-wide crater, killed 57 people and created economic damage in the neighborhood of $1 billion.
Yet by Yellowstone standards, it was tiny: Mount St Helens only ejected 0.25 km3 (0.06 cubic miles) of material, most of the ash settling in a relatively narrow band across Washington State and Idaho. By comparison, the Lava Creek eruption left a large swathe of North America in up to two metres of debris.
The difference between quakes and volcanoes
The volume of dense rock equivalent (DRE) ejected by the Huckleberry Ridge event dwarfs all other North American eruptions. It is itself overshadowed by the DRE ejected at the most recent eruption at Toba (present-day Indonesia). This was one of the largest known eruptions ever and a relatively recent one: only 75,000 years ago. It is thought to have caused a global volcanic winter which lasted up to a decade and may be responsible for the bottleneck in human evolution: around that time, the total human population suddenly and drastically plummeted to between 1,000 and 10,000 breeding pairs.
Image: USGS – public domain
So, what are the chances of something that massive happening anytime soon? The aforementioned mongers of doom often claim that major eruptions occur at intervals of 600,000 years and point out that the last one was 640,000 years ago. Except that (a) the first of those two intervals was actually about 200,000 years longer than the claimed 600,000, (b) two intervals is not a lot to base a prediction on, and (c) those intervals don't really mean anything anyway. Not in the case of volcanic eruptions, at least.
Earthquakes can be 'overdue' because the stress on fault lines builds up consistently over long periods, which means quakes can be forecast with some degree of accuracy. But this is not how volcanoes behave. They do not accumulate magma at constant rates, and the subterranean pressure that causes the magma to erupt does not follow a schedule.
What's more, previous super-eruptions do not necessarily imply future ones. Scientists are not convinced that there ever will be another big eruption at Yellowstone. Smaller eruptions, however, are much likelier. Since the Lava Creek eruption, there have been about 30 smaller outbreaks at Yellowstone, the last lava flow being about 70,000 years ago.
As for the immediate future (give or take a century): the magma chamber beneath Yellowstone is only 5 percent to 15 percent molten. Most scientists agree that is as un-alarming as it sounds – and that it's statistically more relevant to worry about death by lightning, shark, or piano.
Strange Maps #1041
Got a strange map? Let me know at strangemaps@gmail.com.
Smartly dressed: Researchers develop clothes that sense movement via touch
By measuring a person's movements and poses, smart clothes could be used for athletic training, rehabilitation, or health monitoring.
In recent years there have been exciting breakthroughs in wearable technologies, like smartwatches that can monitor your breathing and blood oxygen levels.
But what about a wearable that can detect how you move as you do a physical activity or play a sport, and could potentially even offer feedback on how to improve your technique?
And, as a major bonus, what if the wearable were something you'd actually already be wearing, like a shirt or a pair of socks?
That's the idea behind a new set of MIT-designed clothes that use special fibers to sense a person's movement via touch. Among other things, the researchers showed that their clothes can determine whether someone is sitting, walking, or doing particular poses.
The group from MIT's Computer Science and Artificial Intelligence Lab (CSAIL) says that their clothes could be used for athletic training and rehabilitation. With patients' permission, they could even help passively monitor the health of residents in assisted-care facilities and determine if, for example, someone has fallen or is unconscious.
The researchers have developed a range of prototypes, from socks and gloves to a full vest. The team's "tactile electronics" use a mix of more typical textile fibers alongside a small amount of custom-made functional fibers that sense pressure from the person wearing the garment.
According to CSAIL graduate student Yiyue Luo, a key advantage of the team's design is that, unlike many existing wearable electronics, theirs can be incorporated into traditional large-scale clothing production. The machine-knitted tactile textiles are soft, stretchable, breathable, and can take a wide range of forms.
"Traditionally it's been hard to develop a mass-production wearable that provides high-accuracy data across a large number of sensors," says Luo, lead author on a new paper about the project that is appearing in this month's edition of Nature Electronics. "When you manufacture lots of sensor arrays, some of them will not work and some of them will work worse than others, so we developed a self-correcting mechanism that uses a self-supervised machine learning algorithm to recognize and adjust when certain sensors in the design are off-base."
The team's clothes have a range of capabilities. Their socks predict motion by looking at how different sequences of tactile footprints correlate to different poses as the user transitions from one pose to another. The full-sized vest can also detect the wearer's pose, activity, and the texture of the contacted surfaces.
The authors imagine a coach using the sensors to analyze people's postures and give suggestions on improvement. They could also be used by an experienced athlete to record their posture so that beginners can learn from it. In the long term, the team even imagines that robots could be trained to learn how to do different activities using data from the wearables.
"Imagine robots that are no longer tactilely blind, and that have 'skins' that can provide tactile sensing just like we have as humans," says corresponding author Wan Shou, a postdoc at CSAIL. "Clothing with high-resolution tactile sensing opens up a lot of exciting new application areas for researchers to explore in the years to come."
The paper was co-written by MIT professors Antonio Torralba, Wojciech Matusik, and Tomás Palacios, alongside PhD students Yunzhu Li, Pratyusha Sharma, and Beichen Li; postdoc Kui Wu; and research engineer Michael Foshey.
The work was partially funded by Toyota Research Institute.
Reprinted with permission of MIT News. Read the original article.
Do you worry too much? Stoicism can help
How imagining the worst case scenario can help calm anxiety.
Stoicism can help overcome anxiety
- Stoicism is the philosophy that nothing about the world is good or bad in itself, and that we have control over both our judgments and our reactions to things.
- It is hardest to control our reactions to the things that come unexpectedly.
- By meditating every day on the "worst case scenario," we can take the sting out of the worst that life can throw our way.
Are you a worrier? Do you imagine nightmare scenarios and then get worked up and anxious about them? Does your mind get caught in a horrible spiral of catastrophizing over even the smallest of things? Worrying, particularly imagining the worst case scenario, seems to be a natural part of being human and comes easily to a lot of us. It's awful, perhaps even dangerous, when we do it.
But, there might just be an ancient wisdom that can help. It involves reframing this attitude for the better, and it comes from Stoicism. It's called "premeditation," and it could be the most useful trick we can learn.
Practical Stoicism
Broadly speaking, Stoicism is the philosophy of choosing your judgments. Stoics believe that there is nothing about the universe that can be called good or bad, valuable or valueless, in itself. It's we who add these values to things. As Shakespeare's Hamlet says, "There is nothing either good or bad, but thinking makes it so." Our minds color the things we encounter as being "good" or "bad," and given that we control our minds, we therefore have control over all of our negative feelings.
Put another way, Stoicism maintains that there's a gap between our experience of an event and our judgment of it. For instance, if someone calls you a smelly goat, you have an opportunity, however small and hard it might be, to pause and ask yourself, "How will I judge this?" What's more, you can even ask, "How will I respond?" We have power over which thoughts we entertain and the final say on our actions. Today, Stoicism has influenced and finds modern expression in the hugely effective "cognitive behavioral therapy."
Helping you practice Stoicism.
Credit: Robyn Beck via Getty Images
One of the leading figures of Roman Stoicism was the statesman Seneca, who argued that the unexpected and unforeseen blows of life are the hardest to take control over. The shock of a misfortune can strip away the power we have to choose our reaction. For instance, being burglarized feels so horrible because we had felt so safe at home. A stomach ache, out of the blue, is harder than a stitch thirty minutes into a run. A sudden bang makes us jump, but a firework makes us smile. Blows that come in one fell swoop hurt more than known hardships.
What could possibly go wrong?
So, how can we resolve this? Seneca suggests a Stoic technique called "premeditatio malorum" or "premeditation." At the start of every day, we ought to take time to indulge our anxious and catastrophizing mind. We should "rehearse in the mind: exile, torture, war, shipwreck." We should meditate on the worst things that could happen: your partner will leave you, your boss will fire you, your house will burn down. Maybe, even, you'll die.
This might sound depressing, but the important thing is that we do not stop there.
The Stoic also rehearses how they will react to these things as they come up. For instance, another Stoic (and Roman Emperor) Marcus Aurelius asks us to imagine all the mean, rude, selfish, and boorish people we'll come across today. Then, in our heads, we script how we'll respond when we meet them. We can shrug off their meanness, smile at their rudeness, and refuse to be "implicated in what is degrading." Thus prepared, we take control again of our reactions and behavior.
The Stoics cast themselves into the darkest and most desperate of conditions but then realize that they can and will endure. With premeditation, the Stoic is prepared and has the mental vigor necessary to take the blow on the chin and say, "Yep, I can deal with this."
Catastrophizing as a method of mental inoculation
Seneca wrote: "In times of peace, the soldier carries out maneuvers." This is also true of premeditation, which acts as the war room or training ground. The agonizing cut of the unexpected is blunted by preparedness. We can prepare the mind for whatever trials may come, in just the same way we can prepare the body for some endurance activity. The world can throw nothing as bad as that which our minds have already imagined.
Stoicism teaches us to embrace our worrying mind but to embrace it as a kind of inoculation. With a frown over breakfast, try to spend five minutes of your day deliberately catastrophizing. Get your anti-anxiety battle plan ready and then face the world.
Jonny Thomson teaches philosophy in Oxford. He runs a popular Instagram account called Mini Philosophy (@philosophyminis). His first book is Mini Philosophy: A Small Book of Big Ideas.
Study: People will donate more to charity if they think something’s in it for them
A study on charity finds that reminding people how nice it feels to give yields better results than appealing to altruism.
