Six ways machine learning threatens social justice

Machine learning is a powerful and imperfect tool that should not go unmonitored.

Credit: Monopoly919 on Adobe Stock
  • When you harness the power and potential of machine learning, there are also some drastic downsides that you've got to manage.
  • Deploying machine learning, you face the risk that it will be discriminatory, biased, inequitable, exploitative, or opaque.
  • In this article, I cover six ways that machine learning threatens social justice and reach an incisive conclusion: The remedy is to take on machine learning standardization as a form of social activism.


When you use machine learning, you aren't just optimizing models and streamlining business. You're governing. In essence, the models embody policies that control access to opportunities and resources for many people. They drive consequential decisions as to whom to investigate, incarcerate, set up on a date, or medicate – or to whom to grant a loan, insurance coverage, housing, or a job.

For the same reason that machine learning is valuable—that it drives operational decisions more effectively—it also wields power in the impact it has on millions of individuals' lives. Threats to social justice arise when that impact is detrimental, when models systematically limit the opportunities of underprivileged or protected groups.

Here are six ways machine learning threatens social justice:

Credit: metamorworks via Shutterstock

1) Blatantly discriminatory models are predictive models that base decisions partly or entirely on a protected class. Protected classes include race, religion, national origin, gender, gender identity, sexual orientation, pregnancy, and disability status. By taking one of these characteristics as an input, the model's outputs – and the decisions driven by the model – are based at least in part on membership in a protected class. Although models rarely do so directly, there is precedent and support for doing so.

This would mean that a model could explicitly hinder, for example, black defendants for being black. So, imagine sitting across from a person being evaluated for a job, a loan, or even parole. When they ask you how the decision process works, you inform them, "For one thing, our algorithm penalized your score by seven points because you're black." This may sound shocking and sensationalistic, but I'm only literally describing what the model would do, mechanically, if race were permitted as a model input.
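To see how mechanical that penalty would be, here is a minimal sketch of a hypothetical linear risk score in Python. The features, the weights, and the seven-point racial penalty are all invented for illustration; they do not come from any real model.

```python
# Illustrative only: a hypothetical linear "risk score" showing how a
# protected-class input mechanically shifts the output. The features,
# weights, and seven-point penalty are invented for demonstration.

def risk_score(prior_arrests: int, age: int, is_black: int) -> float:
    """Toy linear score; higher means the model treats the person as riskier."""
    return 2.0 * prior_arrests - 0.5 * (age - 18) + 7.0 * is_black

# Two defendants identical in every respect except race:
print(risk_score(prior_arrests=1, age=30, is_black=0))  # -4.0
print(risk_score(prior_arrests=1, age=30, is_black=1))  #  3.0
# The seven-point gap comes entirely from the protected attribute.
```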

2) Machine bias. Even when protected classes are not provided as a direct model input, we find, in some cases, that model predictions are still inequitable. This is because other variables end up serving as proxies to protected classes. This is a bit complicated, since it turns out that models that are fair in one sense are unfair in another.

For example, some crime risk models succeed in flagging both black and white defendants with equal precision – each flag tells the same probabilistic story, regardless of race – and yet the models falsely flag black defendants more often than white ones. COMPAS, a crime-risk model sold to law enforcement across the US, falsely flags white defendants at a rate of 23.5% and black defendants at a rate of 44.9%. In other words, black defendants who don't deserve to be flagged are erroneously flagged almost twice as often as white defendants who don't deserve it.
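For a concrete sense of how both things can be true at once, here is a minimal sketch in Python, using invented toy data rather than the actual COMPAS records, that computes precision and false positive rate separately for each group.

```python
# A minimal sketch, on invented toy data rather than actual COMPAS records,
# of the fairness tension described above: a model can be equally precise
# for two groups while producing very different false positive rates.
import numpy as np

def group_metrics(y_true, y_flag):
    """Precision and false positive rate for one group."""
    y_true, y_flag = np.asarray(y_true), np.asarray(y_flag)
    tp = np.sum((y_flag == 1) & (y_true == 1))
    fp = np.sum((y_flag == 1) & (y_true == 0))
    tn = np.sum((y_flag == 0) & (y_true == 0))
    return tp / (tp + fp), fp / (fp + tn)

# y_true: 1 = re-offended, 0 = did not; y_flag: 1 = flagged as high risk
groups = {
    "white": ([1, 1, 0, 1, 0, 0, 0, 0, 0, 0],
              [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]),
    "black": ([1, 1, 1, 1, 0, 0, 1, 1, 0, 0],
              [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]),
}
for name, (y_true, y_flag) in groups.items():
    precision, fpr = group_metrics(y_true, y_flag)
    print(f"{name}: precision={precision:.2f}, false positive rate={fpr:.2f}")
# Both groups have precision 0.67, yet false positive rates of 0.14 vs. 0.50.
```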

3) Inferring sensitive attributes—predicting pregnancy and beyond. Machine learning predicts sensitive information about individuals, such as sexual orientation, whether they're pregnant, whether they'll quit their job, and whether they're going to die. Researchers have shown that it is possible to predict race based on Facebook likes. These predictive models deliver dynamite.
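One way to gauge how readily such attributes leak through is a proxy audit: try to predict the protected attribute from the inputs a model is allowed to use. The sketch below runs that check on synthetic data; the feature names and correlations are assumptions made purely for illustration.

```python
# A rough sketch of a "proxy audit" on synthetic data: if a protected
# attribute can be predicted from the other model inputs, those inputs act
# as proxies even when the attribute itself is excluded. The feature names
# and correlations below are invented purely to illustrate the check.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 2000
protected = rng.integers(0, 2, size=n)                 # attribute under audit
neighborhood = protected * 2.0 + rng.normal(size=n)    # correlated "zip code" feature
income = rng.normal(50, 15, size=n)                    # unrelated feature
X = np.column_stack([neighborhood, income])

auditor = LogisticRegression(max_iter=1000)
auc = cross_val_score(auditor, X, protected, cv=5, scoring="roc_auc").mean()
print(f"Protected attribute recoverable from 'allowed' features: AUC = {auc:.2f}")
# An AUC well above 0.5 means the allowed features leak the protected
# attribute, so simply dropping its column does not make a model blind to it.
```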

In a particularly extraordinary case, officials in China use facial recognition to identify and track the Uighurs, a minority ethnic group systematically oppressed by the government. This is the first known case of a government using machine learning to profile by ethnicity. One Chinese start-up valued at more than $1 billion said its software could recognize "sensitive groups of people." Its website said, "If originally one Uighur lives in a neighborhood, and within 20 days six Uighurs appear, it immediately sends alarms" to law enforcement.

4) A lack of transparency. A computer can keep you in jail, or deny you a job, a loan, insurance coverage, or housing – and yet you cannot face your accuser. The predictive models generated by machine learning to drive these weighty decisions are generally kept locked up as a secret, unavailable for audit, inspection, or interrogation. Such models, inaccessible to the public, perpetrate a lack of due process and a lack of accountability.

Two ethical standards oppose this shrouding of electronically assisted decisions: 1) model transparency, the standard that predictive models be accessible, inspectable, and understandable; and 2) the right to explanation, the standard that consequential decisions driven or informed by a predictive model always be held up to that standard of transparency. Meeting those standards would mean, for example, that a defendant be told which factors contributed to their crime risk score – which aspects of their background, circumstances, or past behavior caused them to be penalized. This would give the defendant the opportunity to respond with context, explanations, or perspective on those factors.
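As a rough sketch of what such an explanation could look like, the following decomposes a single defendant's score into per-factor contributions. The linear model and its weights are hypothetical, not any deployed risk tool.

```python
# A minimal sketch of the kind of disclosure the "right to explanation"
# standard calls for: decomposing one defendant's score into per-factor
# contributions. The model is a hypothetical linear score with invented
# weights, not any deployed risk tool.
weights = {"prior_arrests": 2.0, "age_under_25": 3.0, "currently_employed": -1.5}
intercept = 1.0

defendant = {"prior_arrests": 3, "age_under_25": 1, "currently_employed": 0}

contributions = {factor: weights[factor] * value for factor, value in defendant.items()}
score = intercept + sum(contributions.values())

print(f"risk score: {score:.1f}")
for factor, contribution in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"  {factor}: {contribution:+.1f}")
# The defendant sees which factors raised or lowered the score, and so has
# something concrete to respond to, contextualize, or contest.
```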

5) Predatory micro-targeting. Powerlessness begets powerlessness – and that cycle can magnify for consumers when machine learning increases the efficiency of activities designed to maximize profit for companies. Improving the micro-targeting of marketing and the predictive pricing of insurance and credit can magnify the cycle of poverty. For example, highly-targeted ads are more adept than ever at exploiting vulnerable consumers and separating them from their money.

And insurance pricing can lead to the same result. With insurance, the name of the game is to charge more for those at higher risk. Left unchecked, this process can quickly slip into predatory pricing. For example, a churn model may find that elderly policyholders don't tend to shop around and defect to better offers, so there's less of an incentive to keep their policy premiums in check. And pricing premiums based on other life factors also contributes to a cycle of poverty. For example, individuals with poor credit ratings are charged more for car insurance. In fact, a low credit score can increase your premium more than an at-fault car accident.
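The following deliberately crude sketch shows how that incentive can play out: renewal premiums rise fastest for the customers a churn model predicts are least likely to leave. The uplift rule and the numbers are invented assumptions, not any insurer's actual pricing formula.

```python
# An invented illustration of how churn-model output can slide into
# predatory pricing: renewal premiums drift upward fastest for the
# customers predicted least likely to shop around. The uplift rule and
# numbers are assumptions, not any insurer's actual formula.
def renewal_premium(base_premium: float, churn_probability: float) -> float:
    """Raise the price more aggressively the less likely the customer is to leave."""
    max_uplift = 0.15                               # at most a 15% increase
    uplift = max_uplift * (1.0 - churn_probability)
    return base_premium * (1.0 + uplift)

print(renewal_premium(1000.0, churn_probability=0.40))  # frequent shopper: 1090.0
print(renewal_premium(1000.0, churn_probability=0.05))  # loyal senior: 1142.5
```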

6) The coded gaze. If a group of people is underrepresented in the data from which the machine learns, the resulting model won't work as well for members of that group. This results in exclusionary experiences and discriminatory practices. This phenomenon can occur for both facial image processing and speech recognition.
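A basic safeguard is to report performance per demographic group rather than as one overall number, as in the sketch below; the labels, predictions, and group names are toy placeholders.

```python
# A small sketch of the evaluation the "coded gaze" problem demands: report
# the model's error rate per demographic group rather than one overall
# number. The labels, predictions, and group names are toy placeholders.
from collections import defaultdict

def error_rate_by_group(y_true, y_pred, groups):
    totals, errors = defaultdict(int), defaultdict(int)
    for truth, prediction, group in zip(y_true, y_pred, groups):
        totals[group] += 1
        errors[group] += int(truth != prediction)
    return {group: errors[group] / totals[group] for group in totals}

# Toy example: the model errs far more often on the underrepresented group.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
groups = ["majority"] * 4 + ["minority"] * 4
print(error_rate_by_group(y_true, y_pred, groups))
# {'majority': 0.0, 'minority': 0.5}
```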

Recourse: Establish machine learning standards as a form of social activism

To address these problems, take on machine learning standardization as a form of social activism. We must establish standards that go beyond nice-sounding yet vague platitudes such as "be fair", "avoid bias", and "ensure accountability". Unless they are precisely defined, these catchphrases are subjective and do little to guide concrete action. Unfortunately, such broad language is fairly common among the principles released by many companies. In so doing, companies protect their public image more than they protect the public.

People involved in initiatives to deploy machine learning have a powerful, influential voice. These relatively small numbers of people mold and set the trajectory for systems that automatically dictate the rights and resources that great numbers of consumers and citizens gain access to.

Famed machine learning leader and educator Andrew Ng drove it home: "AI is a superpower that enables a small team to affect a huge number of people's lives... Make sure the work you do leaves society better off."

And Allan Sammy, Director, Data Science and Audit Analytics at Canada Post, clarified the level of responsibility: "A decision made by an organization's analytic model is a decision made by that entity's senior management team."

Implementing ethical data science is as important as ensuring a self-driving car knows when to put on the brakes.

Establishing well-formed ethical standards for machine learning will be an intensive, ongoing process. For more, watch this short video, in which I provide some specifics meant to kick-start the process.

Eric Siegel, Ph.D., is a leading consultant and former Columbia University professor who makes machine learning understandable and captivating. He is the founder of the long-running Predictive Analytics World and the Deep Learning World conference series and the instructor of the end-to-end, business-oriented Coursera specialization Machine Learning for Everyone. Stay in touch with Eric on Twitter @predictanalytic.
