Meltdown: Why our systems fail and what we can do about it

Today, we are in the golden age of meltdowns. More and more of our systems are in the danger zone, but our ability to manage them hasn’t quite caught up.

Ceasar Medina died because of a computer glitch.

Though he was shot in a botched robbery attempt, his killer—a convicted felon named Jeremiah Smith—should have been behind bars at the time. But Smith was one of thousands of inmates that the Washington State Department of Corrections accidentally released because of a software problem: a bug in the DOC’s computer code that, for over a decade, miscalculated prisoner sentences.

Surprising meltdowns like the one at the DOC happen all the time. At UCSF—one of the world’s best hospitals—a sophisticated pharmacy robot and a high-tech prescription system confused a doctor, lulled a pharmacist into approving a massive overdose of a routine antibiotic, and automatically packaged 38 pills, instead of the single pill the doctor intended. A nurse, comforted by the barcode scanner that confirmed the dosage, gave the pills one by one to her patient, a 16-year-old boy, who nearly died as a result.

In 2012, Wall Street giant Knight Capital unintentionally traded billions of dollars of stock and lost nearly $500 million in just half an hour because of a software glitch. It was a stunning meltdown that couldn’t have happened a decade earlier, when humans still controlled trading.

And at the airlines, technological glitches, combined with ordinary human mistakes, have caused outages in reservation and ticketing systems, grounded thousands of flights, and accidentally given pilots vacation during the busy holiday season. These issues cost the airlines hundreds of millions of dollars and delayed nearly a million passengers.

To understand why these kinds of failures keep happening, we turn to an unexpected source: a 93-year-old sociologist named Charles Perrow. After the Three Mile Island nuclear meltdown in 1979, Perrow became interested in how simple human errors spiral out of control in complex technological systems. For Perrow, Three Mile Island was a wake-up call. The meltdown wasn’t caused by a massive external shock like an earthquake or a terrorist attack. Instead, it emerged from the interaction of small failures—a plumbing glitch, a maintenance crew’s oversight, a stuck-open valve, and a series of confusing indicators in the control room.

The official investigation blamed the plant’s staff. But Perrow thought that was a cheap shot since the accident could only be understood in retrospect. That was a scary conclusion. Here was one of the worst nuclear accidents in history, but it wasn’t due to obvious human errors or a big external shock. It somehow just emerged from small mishaps that came together in a weird way.

Over the next four years, Perrow trudged through the details of hundreds of accidents. He discovered that a combination of two things causes systems to exhibit the kind of wild, unexpected behaviors that occurred at Three Mile Island.

The first element is complexity. For Perrow, complexity wasn’t a buzzword; it had a specific definition. A complex system is more like an elaborate web than an assembly line; many of its parts are intricately linked and can easily affect one another. Complexity also means that we need to rely on indirect indicators to assess most situations. We can’t go in to take a look at what’s happening in the belly of the beast. In a nuclear power plant, for example, we can’t just send someone to see what’s happening in the core. We need to piece together a full picture from small slivers—pressure indications, water flow measurements, and the like.

The second part of Perrow’s theory has to do with how much slack there is in a system. He borrowed a term from engineering: tight coupling. When a system is tightly coupled, there is little buffer among its parts. The margin for error is thin, and the failure of one part can easily affect the others. Everything happens quickly, and we can’t just turn off the system while we deal with a problem.

In Perrow’s analysis, it’s the combination of complexity and tight coupling that pushes systems into the danger zone. Small errors are inevitable in complex systems, and once things begin to go south, such systems produce baffling symptoms. No matter how hard we try, we struggle to make a diagnosis and might even make things worse by solving the wrong problem. And if the system is also tightly coupled, we can’t stop the falling dominoes. Failures spread quickly and uncontrollably.

When Perrow came up with his framework in the early 1980s, the danger zone he described was sparse: it included exotic systems like nuclear facilities and space missions. But in the intervening years, we’ve steadily added complexity and tight coupling to many mundane systems. These days, computers—often connected to the internet—run everything from cars to cash registers and from pharmacies to prisons. And as we add new features to existing technologies—such as mobile apps to airline reservation systems—we continue to increase complexity. Tight coupling, too, is on the rise, as the drive for lean operations removes slack and leaves little margin for error.

This doesn’t necessarily imply that things are worse than they used to be. What it does suggest, though, is that we are facing a different kind of challenge, one where massive failures come not from external shocks or bad apples, but from combinations of technological glitches and ordinary human mistakes.

We can’t turn back the clock and return to a simpler world. Airlines shouldn’t switch back to paper tickets and traders shouldn’t abandon computers. Instead, we need to figure out how to manage these new systems. Fortunately, an emerging body of research reveals how we can overcome these challenges.  

The first step is to recognize that the world has changed. But that’s a surprisingly hard thing to do, even in an era where businesses seem to celebrate new technologies like blockchain and AI. When we interviewed the former CEO of Knight Capital years after the firm’s technological meltdown, he said, “We weren’t a technology company—we were a broker that used technology.” Thinking of technology as a support function, rather than the core of a company, has worked for years. But it doesn’t anymore.

We need to assess our projects or businesses through the lens of complexity and tight coupling. If we are operating in the danger zone, we can try to simplify our systems, increase transparency, or introduce more slack. But even when we can’t change our systems, we can change how we manage them.

Consider a climbing expedition to Mount Everest. There are many hidden risks, from crevasses and falling rocks to avalanches and sudden weather changes. Altitude sickness causes blurred vision, and overexposure to UV rays leads to snow blindness. And when a blizzard hits, nothing is visible at all. The mountain is a complex and tightly coupled system, and there isn’t much we can do about that.         

But we can still take steps to make climbing Everest safer. In the past, for example, logistical problems plagued several Everest expeditions: delayed flights, customs issues, problems with supply deliveries, and digestive ailments.

In combination, these small issues caused delays, put stress on team leaders, took time away from planning, and prevented climbers from acclimating themselves to high altitudes. And then, during the final push to the summit, these failures interacted with other problems. Distracted team leaders and exhausted climbers missed obvious warning signs and made mistakes they wouldn’t normally make. And when the weather turns bad on Everest, a worn-out team that’s running behind schedule stands little chance.

Once we realize that the real killer isn’t the mountain but the interaction of many small failures, we can see a solution: rooting out as many logistical problems as possible. And that’s what the best mountaineering companies do. They treat the boring logistical issues as critical safety concerns. They pay a lot of attention to some of the most mundane aspects of an expedition, from hiring logistical staff who take the burden off team leaders to setting up well-equipped base camp facilities. Even cooking is a big deal. As one company’s brochure put it, “Our attention to food and its preparation on Everest and mountains around the world has led to very few gastrointestinal issues for our team members.”

You don’t need to be a mountain climber to appreciate this lesson. After a quality control crisis, for example, managers at pharmaceutical giant Novo Nordisk realized that the firm’s manufacturing had become too complex and unforgiving to manage in traditional ways. In response, they came up with a new approach to finding and addressing small issues that might become big problems.

First, the company created a department of about twenty people who scan for new challenges that managers might ignore or simply not have the time to think about. They talk with non-profits, environmental groups, and government officials about emerging technologies and changing regulations. The goal is to make sure that the company doesn’t ignore small signs of brewing trouble.

Novo Nordisk also uses facilitators to make sure important issues don’t get stuck at the bottom of the hierarchy (as they did before the quality control crisis). The facilitators—around two dozen people recruited from among the company’s most respected managers—work with every unit at least once every few years, evaluating whether there are concerns unit managers may be ignoring. “We go around and find a number of small issues,” a facilitator explained. “We don’t know if they would develop into something bigger if we ignored them. But we don’t run the risk. We follow up on the small stuff.”  

Other organizations use a different approach to manage this kind of complexity. NASA’s Jet Propulsion Laboratory (JPL) does some of the most complex engineering work in the world. Its mission statement is “Dare Mighty Things” or, less formally, “If it’s not impossible, we’re not interested.”

Over the years, JPL engineers have had their share of failures. In 1999, for example, they lost two spacecraft destined for Mars—one because of a software problem onboard the Mars Polar Lander and the other because of confusion about whether a calculation used the English or the metric system.

After these failures, JPL managers began to use outsiders to help them manage the risk of missions. They created risk review boards made up of scientists and engineers who worked at JPL, NASA, or contractors—but who weren’t associated with the missions they reviewed and didn’t buy into the same assumptions as mission insiders.

But JPL’s leaders wanted to go even further. Every mission that JPL runs has a project manager responsible for pursuing ground-breaking science while staying within a tight budget and meeting an ambitious schedule. Project managers walk a delicate line. When under pressure, they might be tempted to take shortcuts when designing and testing critical components. So senior leaders created the Engineering Technical Authority (ETA), a cadre of outsiders within JPL. Every project is assigned an ETA engineer, who makes sure that the project manager doesn’t make decisions that put the mission at risk. 

If an ETA engineer and a project manager can’t agree, they take their issue to Bharat Chudasama, the manager who runs the ETA program. When an issue lands on his desk, Chudasama tries to broker a technical solution. He can also try to get project managers more money, time, or people. And if he can’t resolve the issue, he brings it to his boss, JPL’s chief engineer. Such channels for skepticism are indispensable in the danger zone because the ability of any one individual to know what’s going on is limited, and the cost of being wrong is just too high.

This approach isn’t rocket science. In fact, the creation of outsiders within an organization has a long history. For centuries, when the Roman Catholic Church was considering whether to declare a person a saint, it was the job of the Promoter of the Faith, popularly known as the Devil’s Advocate, to make a case against the candidate and prevent any rash decisions. The Promoter of the Faith wasn’t involved in the decision-making process until he presented his objections, so he was an outsider free from the biases of those who had made the case for a candidate in the first place.

The sports writer Bill Simmons proposed something similar for sports teams. “I’m becoming more and more convinced that every professional sports team needs to hire a Vice President of Common Sense,” Simmons wrote. “One catch: the VP of CS doesn’t attend meetings, scout prospects, watch any film or listen to any inside information or opinions; he lives the life of a common fan. They just bring him in when they’re ready to make a big decision, lay everything out and wait for his unbiased reaction.”  

These solutions might sound obvious, and yet we rarely use them in practice. We don’t realize that many of our decisions contribute to complexity and coupling, resulting in increasingly vulnerable systems. We tend to focus on big, external shocks while ignoring small problems that can combine into surprising meltdowns. And we often marginalize skeptics instead of creating roles for them.

Today, we are in the golden age of meltdowns. More and more of our systems are in the danger zone, but our ability to manage them hasn’t quite caught up. And we can see the results all around us. The good news is that smart organizations are finding ways to navigate this new world, and we can all learn from them.


Excerpted from MELTDOWN by Chris Clearfield and András Tilcsik. Reprinted by arrangement with Penguin Press, a member of Penguin Group (USA) LLC, A Penguin Random House Company. Copyright © Christopher Clearfield and András Tilcsik, 2018.


Political division is nothing new. Throughout American history there have been numerous flare-ups in which the political arena was not just tense but incendiary. In a letter addressed to William Hamilton in 1800, Thomas Jefferson lamented how an emotional fervor had swept over the populace with regard to a certain political issue of the time. It disturbed him greatly to see how these political issues seemed to seep into every area of life and even affect people's interpersonal relationships. At one point in the letter he states:

"I never considered a difference of opinion in politics, in religion, in philosophy, as cause for withdrawing from a friend."

Today, we Americans find ourselves in a similar situation, with our political environment even more splintered due to a number of factors. The advent of mass digital media, siloed identity-driven political groups, and a societal lack of understanding of basic discursive fundamentals all contribute to the problem.

Civil discourse has fallen to an all-time low.

The question that the American populace needs to ask itself now is: how do we fix it?

Discursive fundamentals need to be taught to preserve free expression

A 2017 Free Speech and Tolerance Survey by Cato found that 71% of Americans believe that political correctness has silenced important discussions our society needs to have. Many have pointed to draconian university policies regarding political correctness as a contributing factor to this phenomenon.

It's a great irony that colleges, once true bastions of free speech, counterculture and progressiveness, have now devolved into reactionary tribal politics.

Many years ago, one could count on the fact that universities would be the first places where you could espouse and debate any controversial idea without consequence. The decline of staple subjects that deal with the wisdom of the ancients, historical reference points, and civic discourse could be to blame for this exaggerated partisanship boiling on campuses.

Young people seeking an education are done a disservice when fed biased ideology, even if such ideology is presented with the best of intentions. Politics is but one small sliver of society and the human condition at large. Universities would do well to instead teach the principles of healthy discourse and engagement across the ideological spectrum.

The fundamentals of logic, debate and the rich artistic heritage of western civilization need to be the central focus of an education. They help to create a well-rounded citizen who can deal with controversial political issues.

In the abstract, college students generally support and endorse the First Amendment, but there's a catch when it comes to actually practicing it. This was explored in a Gallup survey titled Free Expression on Campus: What College Students Think About First Amendment Issues.

In their findings the authors state:

"The vast majority say free speech is important to democracy and favor an open learning environment that promotes the airing of a wide variety of ideas. However, the actions of some students in recent years — from milder actions such as claiming to be threatened by messages written in chalk promoting Trump's candidacy to the most extreme acts of engaging in violence to stop attempted speeches — raise issues of just how committed college students are to upholding First Amendment ideals.

"Most college students do not condone more aggressive actions to squelch speech, like violence and shouting down speakers, although there are some who do. However, students do support many policies or actions that place limits on speech, including free speech zones, speech codes and campus prohibitions on hate speech, suggesting that their commitment to free speech has limits. As one example, barely a majority think handing out literature on controversial issues is 'always acceptable.'"

With this in mind, the problems seen on college campuses are also appearing across society as a whole and in regular everyday civic discourse. Look no further than the dreaded and clichéd prospect of political discussion at Thanksgiving dinner.

Talking politics at Thanksgiving dinner

As a result of this increased tribalization of views, it's becoming increasingly difficult to engage in polite conversation with people who hold opposing viewpoints. The authors of a recent Hidden Tribes study broke down the political "tribes" in which many find themselves:

  • Progressive Activists: younger, highly engaged, secular, cosmopolitan, angry.
  • Traditional Liberals: older, retired, open to compromise, rational, cautious.
  • Passive Liberals: unhappy, insecure, distrustful, disillusioned.
  • Politically Disengaged: young, low income, distrustful, detached, patriotic, conspiratorial.
  • Moderates: engaged, civic-minded, middle-of-the-road, pessimistic, Protestant.
  • Traditional Conservatives: religious, middle class, patriotic, moralistic.
  • Devoted Conservatives: white, retired, highly engaged, uncompromising,

Understanding these different viewpoints and the hidden tribes we may belong to will be essential to having conversations with those we disagree with. This might just come to a head at Thanksgiving, when you have a mix of many different personalities, ages, and viewpoints.

It's interesting to note the authors found that:

"Tribe membership shows strong reliability in predicting views across different political topics."

In other words, the group you identify with is a strong predictor of what you believe: most of the time, your views will line up with those of the rest of your group.

Here are some statistics on differing viewpoints according to political affiliation:

  • 51% of staunch liberals say it's "morally acceptable" to punch Nazis.
  • 53% of Republicans favor stripping U.S. citizenship from people who burn the American flag.
  • 51% of Democrats support a law that requires Americans use transgender people's preferred gender pronouns.
  • 65% of Republicans say NFL players should be fired if they refuse to stand for the anthem.
  • 58% of Democrats say employers should punish employees for offensive Facebook posts.
  • 47% of Republicans favor bans on building new mosques.

Understanding that tribal membership predicts what you believe can help you return to the fundamentals of proper political engagement.

Here are some guidelines for civic discourse that might come in handy:

  • Avoid logical fallacies. A logical fallacy is a flaw in reasoning that undermines an argument; common examples are attacking the person rather than the idea and straying from the topic at hand.
  • Practice inclusion and listen to whoever you're speaking with.
  • Treat nothing as out of bounds for inquiry or conversation; the aim is to come away with a stronger or new perspective on whatever you were discussing.
  • Keep in mind the maxim: do not listen with the intent to reply, but with the intent to understand.
  • We're not trying to proselytize or shout others down with our rhetoric, but to come to understand one another again.
  • If we're tied too closely to some in-group, we cease to be individuals and become clones of someone else's ideology.

Civic discourse in the divisive age

Debate and civic discourse are inherently messy. Add to the mix an ignorance of history, rabid politicization and debased political discourse, and you can see that it will be very difficult to mend this discursive staple of a functional civilization.

There is still hope that this great divide can be mended, because it has to be. The Hidden Tribes authors at one point state:

"In the era of social media and partisan news outlets, America's differences have become dangerously tribal, fueled by a culture of outrage and taking offense. For the combatants, the other side can no longer be tolerated, and no price is too high to defeat them. These tensions are poisoning personal relationships, consuming our politics and putting our democracy in peril.

"Once a country has become tribalized, debates about contested issues from immigration and trade to economic management, climate change and national security, become shaped by larger tribal identities. Policy debate gives way to tribal conflicts. Polarization and tribalism are self-reinforcing and will likely continue to accelerate. The work of rebuilding our fragmented society needs to start now. It extends from re-connecting people across the lines of division in local communities all the way to building a renewed sense of national identity: a bigger story of us."

We need to start teaching people how to approach subjects with less emotional bias and less reliance on group identity, especially when the subject matter could be seen as controversial or uncomfortable.

This will be the beginning of a new era of understanding, inclusion and the defeat of regressive philosophies that threaten the core of our nation and civilization.