Starts With A Bang

New evidence strongly suggests COVID is natural, not from a lab leak

We don’t know with 100% certainty where SARS-CoV-2 first came from or how it first infected humans. But not all options are equally likely.
The SARS virus (orange) has a crown-like structure, meaning that it's part of the coronavirus family of diseases. The novel coronavirus SARS-CoV-2, also known as the virus which causes COVID-19 in humans, is the largest, most lethal and long-term detrimental new pandemic to hit planet Earth since the dawn of the 21st century. Despite having a genetic sequence of only ~30,000 base pairs in it, this virus has killed over 7 million people since 2020, with many estimates for the true number of deaths rising into the tens of millions.
Credit: NIH
Key Takeaways
  • Although most scientists have compelling reasons to favor a natural origin for SARS-CoV-2, a few prominent biologists have suggested a lab-leak origin for the pandemic.
  • No definitive origin has yet been found, but the discovery of three new strains of coronavirus in bats in Laos strongly suggests a natural origin for the disease, not a lab leak.
  • While it’s not yet possible to rule out the idea of a lab leak entirely, the latest clues strongly point to a natural origin as the most likely scenario.

For approximately the past two years, humanity has suffered beneath the heel of a simple but world-changing organism: the novel coronavirus SARS-CoV-2. Since its emergence in humans late in 2019, this virus, and the disease it causes in humans, COVID-19, has ravaged the globe, leading to hundreds of millions of infections, tens of millions of those suffering long-term effects, and millions of deaths.

In addition, human society itself has seen dramatic and polarizing changes. Scientifically, we know that the most effective public health interventions include:

  • avoiding large, crowded, indoor gatherings
  • wearing a face mask that covers your nose and mouth
  • maintaining at least 6 feet (2 meters) of distance between yourself and those who aren’t members of your household
  • minimizing your contact time with those outside your household
  • giving people the resources they need to stay safely at home when infection rates soar
  • fully vaccinating your body against the virus

However, one question has preoccupied the minds of many: Where did SARS-CoV-2 come from? This coronavirus is unlike any other, and this one question has led to two main ideas. One is that the virus occurred naturally and spilled over into humans from human-animal contact. The other is that the virus first emerged in humans from a lab leak, originating from the Wuhan Institute of Virology. As 2021 reaches its end, here’s what we know so far about COVID-19’s origins.

A typical example of a scene at a fur farm, showing human-animal contact. Animals are often killed en masse prior to them being skinned by hand at a pelt or fur farm. This industry is a billion per year enterprise in China alone, and is a prime candidate for the zoonotic spillover of SARS-CoV-2 into humans. (Credit: Viktor Drachev/AFP)


The natural origin hypothesis

For decades, humans have been studying how pandemics occur, with an eye toward prevention and countermeasures. Many different fields come together in this endeavor, including virology, immunology, epidemiology, disease ecology, and evolutionary biology, as each group of experts brings its own unique knowledge set. Although it’s oversimplifying matters quite a bit, the basic recipe for pandemic origins goes as follows:

  1. Human civilization, particularly over the past century, expands into previously wild territories.
  2. Habitat loss due to climate change and deforestation increases the potential for novel human-animal and animal-animal contact.
  3. In addition, humans regularly contact animals through industrialized animal agriculture, animal markets (both legal and illegal), and the fur trade.
  4. As a result, diseases that were previously only circulating in certain animals now have the potential to jump hosts: from animal to animal, from animal to human, or even from human to animal.
  5. With each new infection and each new host, the disease has a chance to further mutate and adapt, leading to novel strains, new infections, and — in the worst case — a highly infectious, deadly disease in humans.

That pathway has been the basic recipe for all prior pandemics over the last century, from swine flus to bird flus to SARS to MERS to HIV/AIDS. When it comes to SARS-CoV-2 and the disease it causes in humans, COVID-19, this is the default hypothesis of the overwhelming majority of experts in the field.

Chinese virologist Shi Zhengli (L) is seen inside the P4 laboratory in Wuhan in this 2017 photo. The P4 epidemiological laboratory, part of the Wuhan Institute of Virology, is one of the world’s leading research centers on coronaviruses. (Credit: Johannes Eisele/AFP)


The lab-leak hypothesis

On the other hand, there’s another option that a few scientists (along with a great many non-scientists) have been entertaining: that the disease spilled over into humans not from a natural event, but rather from a lab leak originating from the Wuhan Institute of Virology (WIV). Wuhan, where COVID-19 in humans is believed to have originated in late 2019, is a hub of industrial and commercial activity and the largest city not only in its province, Hubei, but in all of central China. It is surrounded by large rural and borderline-wild areas, and many novel viruses have been found in the animals in those regions. As a result, the Wuhan Institute of Virology, one of the top virological institutes in the world, was built in precisely this location.

A few pieces of evidence — circumstantial, sure, but compelling to many — didn’t sit right with a number of people.

  • Dr. Zhengli Shi, the chief scientist for emerging disease at WIV, performs research on bat viruses, some of which are incredibly similar to SARS-CoV-2.
  • Many of the animals, and the viruses that they carry, are known to exist in that lab, and not all of the viruses collected have even been genetically sequenced.
  • It’s possible to genetically engineer or modify viruses, and there are some peculiar features present in SARS-CoV-2 that it may be possible to create in a lab, not only via random evolution in the wild.
  • China is well-known for covering up any information that could make it look imperfect in any way.

Markets, such as this one in Hong Kong, often contain fruits, vegetables, animals, and other derivative products available for purchase. Produce and animals are brought in from up to thousands of kilometers away, including adjacent provinces and even foreign or offshore sources, for sale at such markets. Wuhan is the hub city for all of central/southern China. (Credit: Philip Fong/AFP)


Taking the preceding facts together, you too might consider that perhaps the current pandemic owes its origins not to a natural disease spillover from human-animal contact, but to a mistake, or even a deliberate action, arising from the Wuhan Institute of Virology.

There’s one particular virus of interest that has led to the most popular and prolific hypothesis for how SARS-CoV-2 could have emerged from a non-natural origin: RaTG13. RaTG13 and SARS-CoV-2 are ~96% identical to one another, meaning that of the ~30,000 base pairs encoded in their genomes, only ~1,200 differ. In particular, though, these differences are clustered in two extremely important places:

  1. the furin-cleavage site, which is the part of the “spike” protein on the crown-like tips of the virus
  2. the specific protein that allows the virus to infect human cells: the receptor-binding domain (RBD), which binds to the protein found in the cells of many hosts: angiotensin-converting enzyme 2, known colloquially as the ACE2 receptor

If you look at an image of the coronavirus itself, both of these sites are suggestively close to one another: on the little “spike” parts of the coronavirus.
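To make the "~96% identical" figure concrete, here is a minimal Python sketch of the arithmetic behind it. The function names and the ten-letter toy sequences are illustrative assumptions, not real viral data, and real genome comparisons first require an alignment step that this sketch skips.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions at which two already-aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def expected_differences(genome_length: int, identity_percent: float) -> int:
    """Number of differing positions implied by a given percent identity."""
    return round(genome_length * (100.0 - identity_percent) / 100.0)


# Toy sequences (hypothetical, NOT real viral RNA): one mismatch in ten bases.
toy_a = "ACGUACGUAC"
toy_b = "ACGUACGAAC"
print(percent_identity(toy_a, toy_b))       # 90.0

# The article's figures: a ~30,000-base genome at ~96% identity.
print(expected_differences(30_000, 96.0))   # 1200
```

A single identity percentage says nothing about where the differences sit, which is why the clustering of those ~1,200 differences in the spike matters more than the headline number.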

The host cell, shown in yellowish-brown, contains ACE2 (in green) on its outer cell membrane. SARS-CoV-2 is excellent at binding to this molecule, allowing it to infect the cell. ACE2 behaves as a cellular doorway, and the virus’s receptor-binding domain (RBD) is what allows it entry. (Credit: The Conversation)

The way a coronavirus infects a host is through the combination of these two parts. The part of the spike that connects the bottom of the protein to the top is what allows the tip of the spike to move around, “hook in” to cells, and bind to them; that’s where the furin-cleavage site is located. Meanwhile, the RBD is located on the top of the spike, which binds to cells in the human body that contain the ACE2 protein.

Both of these factors, importantly, are what separate RaTG13, which does not infect humans, from SARS-CoV-2, which does.

What is ACE2?

ACE2 is a protein on the surface of many types of cells, including cells in the human body. Specifically, this protein is an enzyme, and its function is to cut the peptide hormone angiotensin II into a smaller peptide, which then goes on to help regulate functions such as blood pressure. ACE2, on the surface of a human cell, acts like a lock, and what we call the receptor-binding domain (RBD) is like a key that allows entry into the cell. ACE2, in many ways, acts like a gateway, and the RBD on the tip of SARS-CoV-2 is what allows the virus to open the gate. It’s how the virus gets into the cell.

Horseshoe bats, as seen here in their natural environment, are abundant and diverse all across southern and central Asia, and carry a wide range of coronaviruses. The ancestral strain, RaTG13, is a 96% match for SARS-CoV-2, but is missing some important genetic details. (Credit: orientalizing/flickr)


The big idea of the most common incarnation of the lab-leak hypothesis, then, is that:

  • a naturally occurring virus that could not infect humans, like RaTG13, was taken as the starting point,
  • it was then experimented on in secret,
  • by being either passed through animal hosts, genetically modified, or used as a template for a printed virus,
  • where the end result was a viable virus that could infect humans, like SARS-CoV-2,
  • and then, either accidentally or willfully (and then successfully covered up in either case), actually did infect one or more humans.

That, according to this hypothesis, is where and how the current pandemic started.

It’s important to note that lab leaks do, in fact, occur; although they’re rare, an accidental needle stick or animal bite is more than capable of transmitting a disease into humans, and events like this have occurred multiple times in the past.

But this specific scenario, even laid out like this, is difficult for most virology experts to imagine. Viruses are complex, highly evolved organisms, and most changes you could make would make the virus less efficient at doing what it does than natural evolution would admit. Most alterations to a virus’s genetic sequence would lead to it not replicating, not encoding proteins successfully, or not functioning as efficiently or effectively as a natural virus would. It would not necessarily be successful in any particular host with this specific genetic sequence, and it would not necessarily be good at evading the host’s immune system, and in particular at evading the pattern recognition receptors that recognize pathogens: TLRs (Toll-like receptors), which act as booby traps for pathogens.

It would require, in short, a knowledge of how to create and manipulate viruses that far exceeds the limits of our modern scientific capabilities.

The genetic sequence of the coronavirus SARS-CoV-2 from patients, as compared to other known coronaviruses. RaTG13 shares some key similarities with SARS-CoV-2, but that simply indicates that they likely had a common ancestor some time ago. Genomic analysis indicates that they are likely cousins, originating from a common ancestor at some point more than ~10 years ago. (Credit: Peng Zhou et al., Nature, 2020)

How can we know the origins of SARS-CoV-2 for certain?

But maybe, nevertheless, someone was researching a predecessor virus. Maybe someone through sheer luck, through an accident, or through a novel series of techniques, managed to hack-and-slash their way to creating SARS-CoV-2 in a lab. It’s not that the evidence supports this scenario (it doesn’t), but that the evidence we have is insufficient to rule that unlikely scenario out.

The only way to truly falsify the notion of a lab leak would be to find a reservoir of viruses in an animal population — one that could have reasonably made it to, say, the wet market in Wuhan where the coronavirus appears to have first jumped to humans — that matches SARS-CoV-2 far more successfully than a virus like RaTG13.

It’s important to note that, after two years of searching, we haven’t found the smoking gun we’re so desperately seeking: an animal reservoir of a SARS-CoV-2-like virus that possesses both a furin-cleavage site on the spike protein and an ACE2-binding receptor-binding domain on the tip. This isn’t exactly surprising; it’s exceedingly rare to find the animal origin of a novel disease quickly, and in many cases, it doesn’t happen at all. While the animal hosts for SARS-1 and for MERS were found in less than a year each, those are the exceptions. It took decades to find the animal intermediary for HIV, while the animal intermediaries for measles, smallpox, and the most recent Ebola outbreak have never been found.

This figure shows the structure of the spike protein in SARS-CoV-2. Panel A shows the spike homotrimer in its open configuration, while panel B shows the cleavage sites on the spike protein. Note how the configuration of a protein, and how it folds in its environment, controls many aspects of its functioning. Even an identically structured protein isn’t going to perform the same in different environments. (Credit: Walls et al., Cell, 2020)

So, what have we found?

We’ve found other betacoronaviruses that have furin-cleavage sites in them. When we map their genetic family trees, we find that these sites evolved multiple times, independently of one another, across the coronavirus family. When we look at the genetic details of the furin-cleavage site in SARS-CoV-2, we find that those details are completely consistent with natural evolution.

Moreover, the detail often pointed to as evidence the virus was created in a lab (the CGG-CGG sequence, a pair of codons for arginine at the furin-cleavage site) typically sheds light on what sort of host organism the pathogen is well-adapted to. Seeing this sequence tells you almost nothing about the likelihood of the virus being engineered; instead, it gives you information about the adaptations the virus has made to a particular host, so that it can avoid the host’s TLRs.
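For readers unfamiliar with codons, the following minimal Python sketch shows why CGG-CGG is only one of many ways to "spell" two arginines. The set of six arginine codons comes from the standard genetic code; the helper names and the short example strings are hypothetical illustrations, not sequences taken from any real virus.

```python
# The six RNA codons that encode arginine in the standard genetic code.
ARGININE_CODONS = {"CGU", "CGC", "CGA", "CGG", "AGA", "AGG"}


def to_rna(seq: str) -> str:
    """Normalize a DNA- or RNA-style string to uppercase RNA letters."""
    return seq.upper().replace("T", "U")


def split_codons(seq: str) -> list[str]:
    """Split a sequence into consecutive three-letter codons."""
    rna = to_rna(seq)
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [rna[i:i + 3] for i in range(0, len(rna), 3)]


def encodes_only_arginine(seq: str) -> bool:
    """True if every codon in the sequence encodes arginine."""
    return all(codon in ARGININE_CODONS for codon in split_codons(seq))


print(split_codons("CGGCGG"))            # ['CGG', 'CGG']
print(encodes_only_arginine("CGGCGG"))   # True
print(encodes_only_arginine("AGACGT"))   # True: a different spelling, same amino acids
print(encodes_only_arginine("CGGGCU"))   # False: GCU encodes alanine
```

Because six codons produce the same amino acid, which one appears at a given site reflects codon usage rather than protein function, and that is the sense in which the choice carries information about host adaptation.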

But when it comes to the RBD, a recent discovery has completely undermined the central theme of the lab-leak hypothesis: three newly discovered wild bat viruses, found in Laos, not China, are more genetically similar to SARS-CoV-2 than RaTG13 is and are able to enter human cells through the ACE2 pathway.

The lesser woolly horseshoe bat, shown here with a pup, is one of numerous horseshoe bat species found abundantly in Asia, including in Laos. A recent study of 645 bats from caves in Laos uncovered many new strains of bat coronavirus that had never been seen before, including three with receptor-binding domains that differ from the one found in SARS-CoV-2 by only one or two nucleotides. (Credit: Rajesh Puttaswamaiah/CCA-SA-4.0)

The new study, published online in September of 2021, identifies three horseshoe bat species (among the 645 bats sampled) that each contain a virus more similar to SARS-CoV-2 than RaTG13 is: BANAL-20-52, BANAL-20-103, and BANAL-20-236. These viruses, whose genomes are approximately 97% identical to that of SARS-CoV-2, importantly contain the receptor-binding domain (RBD) that had been unobserved before the emergence of COVID-19, but that matches the RBDs previously sequenced in pangolins. The part of the novel coronavirus that allows it to infect human hosts, the RBD, has now been found in nature, in viruses more similar to SARS-CoV-2 than any ever seen before, and in multiple geographic locations.

According to Smriti Mallapaty, reporting in Nature:

“[A]ll three newly discovered viruses have individual sections that are more similar to sections of SARS-CoV-2 than seen in any other viruses. Viruses swap chunks of RNA with one another through a process called recombination, and one section in BANAL-20-103 and BANAL-20-52 could have shared an ancestor with sections of SARS-CoV-2 less than a decade ago…”

which likely means that we’re closing in on where the pandemic originated, and it’s farther away from the Wuhan Institute of Virology than almost anyone had imagined.

The central idea of the lab leak hypothesis, that the virus spilled over from the Wuhan Institute of Virology, is only possible if the virus from which SARS-CoV-2 originated was actually ever inside the institute itself. If the virus originated in animals from Laos, that’s no longer a possibility. (Credit: S. Temmam et al., Research Square, 2021)

Credit: S. Temmam et al., Nature, 2022

What does this mean for future pandemics?

This new discovery confirms what many researchers focused on pandemic prevention have feared: that the part of SARS-CoV-2 that allows it to infect humans is widely circulating, naturally, in the wild. As humans continue to encroach on previously wild habitats, as climate change and deforestation continue to alter animal habitats, and as animal agriculture, the fur trade, and live animal markets continue to be common, we are going to see more and more of these novel coronaviruses begin to infect humans.

In other words, we can fully expect that COVID-19 isn’t going to be some outlier, but rather the start of a “new normal” heralded by the previous arrivals of SARS-1 and MERS: that novel respiratory syndromes are going to evolve in the wild and will infect humans, with the potential to create future outbreaks, epidemics, and in a worst-case scenario, future pandemics as well.

Now, more than ever, it’s important to invest in pandemic prevention and the mitigation of disease outbreaks. With hundreds of millions of infections and millions of deaths worldwide, it’s humbling to realize that we may not be in the middle of a battle with one disease, but at the start of a string of many. If we focus our efforts in the wrong place — on how we fear pandemics could emerge, rather than on how they’re actively emerging in nature — all of human civilization will suffer the consequences.
