The original marshmallow test was flawed, researchers now say
One of the most famous experiments in psychology might be completely wrong.
- A team of psychologists has repeated the famous marshmallow experiment and found the original test to be flawed.
- It joins the ranks of many psychology experiments that cannot be repeated, which presents a considerable problem for its findings.
- The finding that children with similar demographics had similar success as teenagers no matter what they did as toddlers raises questions about how flexible self-control is as a trait and how much it actually helps us get ahead.
Almost everybody has heard of the Stanford marshmallow experiment. For those of you who haven't, the idea is simple; a child is placed in front of a marshmallow and told they can have one now or two if they don't eat the one in front of them for fifteen minutes. Their ability to delay gratification is recorded, and the child is checked in on as they grow up to see how they turned out.
It is one of the most famous studies in modern psychology, and it is often used to argue that self-control as a child is a predictor of success later in life. However, an attempt to repeat the experiment suggests there were hidden variables that throw the findings into doubt.
Marshmallows for Everybody!
The new marshmallow experiment, published in Psychological Science in the spring of 2018, repeated the original experiment with only a few variations. More than 10 times as many children were tested, raising the number to over 900, and children of various races, income brackets, and ethnicities were included. The maximum time the children would have to wait for the marshmallow was cut in half.
This study discovered that the ability of the children to wait for the second marshmallow had only a minor positive effect on their achievements at age 15, at best being half as substantial as the original test found the behavior to be. More interestingly, this effect was nearly obliterated when the children's backgrounds, home environment, and cognitive ability at age four were accounted for. The behavior of the children 11 years after the test was found to be unrelated to whether they could wait for a marshmallow at age 4.
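To see how an apparent effect can shrink once background is accounted for, here is a minimal sketch on simulated data. It is not the study's data or its statistical method; the variable names, the sample size, and the assumption that a single "background" score drives both waiting time and later achievement are all hypothetical, chosen only to illustrate the idea of a confounder.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def partial_corr(xs, ys, zs):
    """Correlation between xs and ys after controlling for zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate(n=5000, seed=0):
    """Hypothetical world: background drives both variables.

    Waiting time has NO direct effect on achievement here (an assumption
    made for illustration, not a claim about real children).
    """
    rng = random.Random(seed)
    background = [rng.gauss(0, 1) for _ in range(n)]
    delay = [b + rng.gauss(0, 1) for b in background]
    achievement = [b + rng.gauss(0, 1) for b in background]
    return background, delay, achievement


background, delay, achievement = simulate()
raw = pearson(delay, achievement)
adjusted = partial_corr(delay, achievement, background)
print(f"raw correlation:      {raw:.2f}")
print(f"adjusted correlation: {adjusted:.2f}")
```

In this toy setup the raw correlation between waiting and achievement is sizable, while the adjusted one sits near zero, which is the general pattern the researchers describe when they control for home environment and early cognitive ability.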
It was also found that most of the benefits to the children who could wait the whole seven minutes for the marshmallow were shared by the kids who ate the marshmallow seconds after receiving it. This, in the researchers' eyes, cast further doubt on the value of the "self-control" shown by the kids who did wait.
Lead author Tyler W. Watts of New York University explained the results by saying, "Our results show that once background characteristics of the child and their environment are taken into account, differences in the ability to delay gratification do not necessarily translate into meaningful differences later in life." Watts added, "We found virtually no correlation between performance on the marshmallow test and a host of adolescent behavioral outcomes. I thought that this was the most surprising finding of the paper."
What does this mean for self-control as a virtue?
The test doesn't prove that self-control isn't useful in life; it is still a nice trait to have. It does show, however, that there is more at play than researchers previously thought.
The key finding of the study is that the ability of the children to delay gratification didn't put them at an advantage over their peers with similar backgrounds. Students whose mothers had college degrees were all doing similarly well 11 years after they decided whether to eat the first marshmallow. The same was true for children whose mothers lacked a college education.
This opens the doors to other explanations for why children who turn out worse later might not wait for that second marshmallow.
Many thinkers, such as Sendhil Mullainathan and Eldar Shafir, are now turning to the idea that the effects of living in poverty can lead to the tendency to set short-term goals, which would help explain why a child might not wait for the second marshmallow. If true, then this tendency may give rise to lots of problems for at-risk children. After all, if your life experiences tell you that you have no assurances that there will be another marshmallow tomorrow, why wouldn't you eat the one in front of you right now?
They often point to another variation of the experiment which explored how kids reacted when an adult lied to them about the availability of an item. When a child was told they could have a second marshmallow by an adult who had just lied to them, all but one of them ate the first one. In the cases where the adult had come through for them before, most of the kids were able to wait for the second marshmallow.
Were the kids who ate the first marshmallow in the first study bad at self-control or just acting rationally given their life experiences? The same question might be asked for the kids in the newer study.
Another interpretation is that the test subjects saw comparative improvements or declines in their ability for self-control in the decade after the experiment until everybody in a given demographic had a similar amount of it. If this is true, it opens up new questions on how to positively influence young people's ability to delay gratification and how severely our home lives can affect how we turn out.
What does this mean for experimental psychology?
The refutation of the findings of the original study is part of a more significant problem in experimental psychology where the results of old experiments can't be replicated. Some tests had a poor methodology, like the Stanford prison experiment, some didn't factor for all of their variables, and others relied on atypical test subjects and were shocked to find their findings didn't apply to the population at large, like the marshmallow test.
That last issue is so prevalent that the favored guinea pigs of psychology departments, Western, educated, industrialized, rich, democratic students, have gained the acronym WEIRD. This is a bigger problem than you might think because lots of ideas in psychology are based around the findings of studies which might not be generalizable. The original marshmallow test has been quoted endlessly and used in arguments for the value of character in determining life outcomes despite only having students at a pre-school on Stanford's campus involved, hardly a typical group of kids.
The Stanford marshmallow test is a famous, flawed experiment. While it remains true that self-control is a good thing, the amount you have at age four is largely irrelevant to how you turn out. So, relax if your kindergartener is a bit impulsive. They still have plenty of time to learn self-control.
Some mysteries take generations to unfold.
- In 1959, a group of nine Russian hikers was killed in an overnight incident in the Ural Mountains.
- Conspiracies about their deaths have flourished ever since, including alien invasion, an irate Yeti, and angry tribesmen.
- Researchers have finally confirmed that their deaths were due to a slab avalanche caused by intense winds.
In February 1959, a group of nine hikers crossed through Russia's Ural Mountains as part of a skiing expedition. The experienced trekkers, all employed at the Ural Polytechnical Institute, were led by Igor Dyatlov. On the evening of February 1, all nine appear to have fled their tents into the Arctic temperatures, for which they were unprepared. None survived.
Six of the members died of hypothermia; three suffered from physical trauma. Some members were missing body parts—a tongue here, a few eyes there, a pair of eyebrows for good measure. According to reports, no hiker appears to have struggled or panicked. They were likely too quickly overtaken by the hostile environment in Western Russia.
All the members were young, mostly in their early twenties; one member, Semyon Zolotaryov, was 38. Good health didn't matter. Given the uncertain circumstances—what made them flee into the bitter cold?—the incident known as Dyatlov Pass has long been the type of Area 51-conspiracy theory that some people love to speculate about. A vicious animal attack? Infrasound-induced panic? Was the Soviet military involved? Maybe it was the katabatic winds that did them in. Local tribesmen might not have liked the intrusion.
Or perhaps it was aliens. Or a Yeti. Have we talked about Yeti aliens yet?
These theories and more have been floated for decades.
a: Last picture of the Dyatlov group taken before sunset, while making a cut in the slope to install the tent. b: Broken tent covered with snow as it was found during the search 26 days after the event.
Photographs courtesy of the Dyatlov Memorial Foundation.
Finally, a new study, published in the Nature journal Communications Earth & Environment, has put the case to rest: it was a slab avalanche.
This theory isn't exactly new. Researchers have long been skeptical of the avalanche notion, however, because of the gentle grade of the hill. But slab avalanches don't need a steep slope to get started. Crown or flank fractures can quickly release as little as a few centimeters of earth (or snow) sliding down a hill (or mountain).
As researchers Johan Gaume (Switzerland's WSL Institute for Snow and Avalanche Research SLF) and Alexander Puzrin (Switzerland's Institute for Geotechnical Engineering) write, it was "a combination of irregular topography, a cut made in the slope to install the tent and the subsequent deposition of snow induced by strong katabatic winds contributed after a suitable time to the slab release, which caused severe non-fatal injuries, in agreement with the autopsy results."
Conspiracy theories abound when evidence is lacking. Twenty-six days after the incident, a team showed up to investigate. They didn't find any obvious signs of an avalanche; the slope angle was below 30 degrees, ruling out (to them) the possibility of a snow slide. Plus, the head injuries suffered were not typical of avalanche victims. Inject doubt and crazy theories will flourish.
Configuration of the Dyatlov tent installed on a flat surface after making a cut in the slope below a small shoulder. Snow deposition above the tent is due to wind transport of snow (with deposition flux Q).
Photo courtesy of Communications Earth & Environment.
Add to this the Russian leadership's longstanding battle with (or against) the truth. In 2015 the Investigative Committee of the Russian Federation decided to reopen the case. Four years later the agency concluded it was indeed a snow avalanche, an assertion immediately challenged within the Russian Federation. The opposing agency eventually agreed as well. The problem was that neither provided conclusive scientific evidence.
Gaume and Puzrin went to work. They provided four critical factors that confirmed the avalanche:
- The location of the tent under a shoulder in a locally steeper slope to protect them from the wind
- A buried weak snow layer parallel to the locally steeper terrain, which resulted in an upward-thinning snow slab
- The cut in the snow slab made by the group to install the tent
- Strong katabatic winds that led to progressive snow accumulation due to the local topography (shoulder above the tent) causing a delayed failure
Case closed? It appears so, though don't expect conspiracy theories to abate. Good research takes time—sometimes generations. We're constantly learning about our environment and then applying those lessons to the past. While we can't expect every skeptic to accept the findings, from the looks of this study, a 62-year-old case is now closed.
Stay in touch with Derek on Twitter and Facebook. His most recent book is "Hero's Dose: The Case For Psychedelics in Ritual and Therapy."
Are "humanized" pigs the future of medical research?
The U.S. Food and Drug Administration requires all new medicines to be tested in animals before use in people. Pigs make better medical research subjects than mice, because they are closer to humans in size, physiology and genetic makeup.
In recent years, our team at Iowa State University has found a way to make pigs an even closer stand-in for humans. We have successfully transferred components of the human immune system into pigs that lack a functional immune system. This breakthrough has the potential to accelerate medical research in many areas, including virus and vaccine research, as well as cancer and stem cell therapeutics.
Existing biomedical models
Severe Combined Immunodeficiency, or SCID, is a genetic condition that causes impaired development of the immune system. People can develop SCID, as dramatized in the 1976 movie "The Boy in the Plastic Bubble." Other animals can develop SCID, too, including mice.
Researchers in the 1980s recognized that SCID mice could be implanted with human immune cells for further study. Such mice are called "humanized" mice and have been optimized over the past 30 years to study many questions relevant to human health.
Mice are the most commonly used animal in biomedical research, but results from mice often do not translate well to human responses, thanks to differences in metabolism, size and divergent cell functions compared with people.
Nonhuman primates are also used for medical research and are certainly closer stand-ins for humans. But using them for this purpose raises numerous ethical considerations. With these concerns in mind, the National Institutes of Health retired most of its chimpanzees from biomedical research in 2013.
Alternative animal models are in demand.
Swine are a viable option for medical research because of their similarities to humans. And with their widespread commercial use, pigs are met with fewer ethical dilemmas than primates. Upwards of 100 million hogs are slaughtered each year for food in the U.S.
In 2012, groups at Iowa State University and Kansas State University, including Jack Dekkers, an expert in animal breeding and genetics, and Raymond Rowland, a specialist in animal diseases, serendipitously discovered a naturally occurring genetic mutation in pigs that caused SCID. We wondered if we could develop these pigs to create a new biomedical model.
Our group has worked for nearly a decade developing and optimizing SCID pigs for applications in biomedical research. In 2018, we achieved a twofold milestone when working with animal physiologist Jason Ross and his lab. Together we developed a more immunocompromised pig than the original SCID pig – and successfully humanized it, by transferring cultured human immune stem cells into the livers of developing piglets.
During early fetal development, immune cells develop within the liver, providing an opportunity to introduce human cells. We inject human immune stem cells into fetal pig livers using ultrasound imaging as a guide. As the pig fetus develops, the injected human immune stem cells begin to differentiate – or change into other kinds of cells – and spread through the pig's body. Once SCID piglets are born, we can detect human immune cells in their blood, liver, spleen and thymus gland. This humanization is what makes them so valuable for testing new medical treatments.
We have found that human ovarian tumors survive and grow in SCID pigs, giving us an opportunity to study ovarian cancer in a new way. Similarly, because human skin survives on SCID pigs, scientists may be able to develop new treatments for skin burns. Other research possibilities are numerous.
The ultraclean SCID pig biocontainment facility in Ames, Iowa. Adeline Boettcher, CC BY-SA
Pigs in a bubble
Since our pigs lack essential components of their immune system, they are extremely susceptible to infection and require special housing to help reduce exposure to pathogens.
SCID pigs are raised in bubble biocontainment facilities. Positive pressure rooms, which maintain a higher air pressure than the surrounding environment to keep pathogens out, are coupled with highly filtered air and water. All personnel are required to wear full personal protective equipment. We typically have anywhere from two to 15 SCID pigs and breeding animals at a given time. (Our breeding animals do not have SCID, but they are genetic carriers of the mutation, so their offspring may have SCID.)
As with any animal research, ethical considerations are always front and center. All our protocols are approved by Iowa State University's Institutional Animal Care and Use Committee and are in accordance with The National Institutes of Health's Guide for the Care and Use of Laboratory Animals.
Every day, twice a day, our pigs are checked by expert caretakers who monitor their health status and provide engagement. We have veterinarians on call. If any pigs fall ill, and drug or antibiotic intervention does not improve their condition, the animals are humanely euthanized.
Our goal is to continue optimizing our humanized SCID pigs so they can be more readily available for stem cell therapy testing, as well as research in other areas, including cancer. We hope the development of the SCID pig model will pave the way for advancements in therapeutic testing, with the long-term goal of improving human patient outcomes.
Adeline Boettcher earned her research-based Ph.D. working on the SCID project in 2019.
Satellite imagery can help better predict volcanic eruptions by monitoring changes in surface temperature near volcanoes.
- A recent study used data collected by NASA satellites to conduct a statistical analysis of surface temperatures near volcanoes that erupted from 2002 to 2019.
- The results showed that surface temperatures near volcanoes gradually increased in the months and years prior to eruptions.
- The method was able to detect potential eruptions that were not anticipated by other volcano monitoring methods, such as eruptions in Japan in 2014 and Chile in 2015.
How can modern technology help warn us of impending volcanic eruptions?
One promising answer may lie in satellite imagery. In a recent study published in Nature Geoscience, researchers used infrared data collected by NASA satellites to study the conditions near volcanoes in the months and years before they erupted.
The results revealed a pattern: Prior to eruptions, an unusually large amount of heat had been escaping through soil near volcanoes. This diffusion of subterranean heat — which is a byproduct of "large-scale thermal unrest" — could potentially represent a warning sign of future eruptions.
Conceptual model of large-scale thermal unrest. Credit: Girona et al.
For the study, the researchers conducted a statistical analysis of changes in surface temperature near volcanoes, using data collected over 16.5 years by NASA's Terra and Aqua satellites. The results showed that eruptions tended to occur around the time when surface temperatures near the volcanoes peaked.
Eruptions were preceded by "subtle but significant long-term (years), large-scale (tens of square kilometres) increases in their radiant heat flux (up to ~1 °C in median radiant temperature)," the researchers wrote. After eruptions, surface temperatures reliably decreased, though the cool-down period took longer for bigger eruptions.
"Volcanoes can experience thermal unrest for several years before eruption," the researchers wrote. "This thermal unrest is dominated by a large-scale phenomenon operating over extensive areas of volcanic edifices, can be an early indicator of volcanic reactivation, can increase prior to different types of eruption and can be tracked through a statistical analysis of little-processed (that is, radiance or radiant temperature) satellite-based remote sensing data with high temporal resolution."
Temporal variations of target volcanoes. Credit: Girona et al.
Although using satellites to monitor thermal unrest wouldn't enable scientists to make hyper-specific eruption predictions (like predicting the exact day), it could significantly improve prediction efforts. Seismologists and volcanologists currently use a range of techniques to forecast eruptions, including monitoring for gas emissions, ground deformation, and changes to nearby water channels, to name a few.
Still, none of these techniques have proven completely reliable, both because of the science and the practical barriers (e.g. funding) standing in the way of large-scale monitoring. In 2014, for example, Japan's Mount Ontake suddenly erupted, killing 63 people. It was the nation's deadliest eruption in nearly a century.
In the study, the researchers found that surface temperatures near Mount Ontake had been increasing in the two years prior to the eruption. To date, no other monitoring method has detected "well-defined" warning signs for the 2014 disaster, the researchers noted.
The researchers hope satellite-based infrared monitoring techniques, combined with existing methods, can improve prediction efforts for volcanic eruptions. Volcanic eruptions have killed about 2,000 people since 2000.
"Our findings can open new horizons to better constrain magma–hydrothermal interaction processes, especially when integrated with other datasets, allowing us to explore the thermal budget of volcanoes and anticipate eruptions that are very difficult to forecast through other geophysical/geochemical methods."