Google's AI Learns Betrayal and "Aggressive" Actions Pay Off
Google's DeepMind artificial intelligence learns what it takes to win, making human-like choices in competitive situations.
As the development of artificial intelligence continues at breakneck speed, questions about whether we understand what we are getting ourselves into persist. One fear is that increasingly intelligent robots will take all our jobs. Another fear is that we will create a world where a superintelligence will one day decide that it has no need for humans. This fear is well-explored in popular culture, through books and films like the Terminator series.
Another possibility is perhaps the most plausible one: since humans are the ones creating them, machines and machine intelligences are likely to behave just like humans, for better or worse. DeepMind, Google’s cutting-edge AI company, has shown just that.
Recently, the DeepMind team ran a series of tests to investigate how the AI would respond when faced with certain social dilemmas. In particular, they wanted to find out whether the AI is more likely to cooperate or compete.
One of the tests involved 40 million instances of playing the computer game Gathering, during which DeepMind’s AI showed how far it’s willing to go to get what it wants. The game was chosen because it encapsulates aspects of the classic “Prisoner’s Dilemma” from game theory.
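For readers unfamiliar with the Prisoner’s Dilemma, here is a minimal sketch of the textbook version. The payoff numbers (years in prison) and the names `PAYOFFS`, `best_response` and `total_sentence` are illustrative assumptions, not anything taken from DeepMind’s experiments or from Gathering itself.

```python
# Textbook Prisoner's Dilemma. Payoffs are years in prison, so lower is better.
# The numbers are a common illustrative convention, not DeepMind's values.
COOPERATE, DEFECT = "cooperate", "defect"

# PAYOFFS[(my_move, other_move)] -> (my_sentence, other_sentence)
PAYOFFS = {
    (COOPERATE, COOPERATE): (1, 1),
    (COOPERATE, DEFECT): (3, 0),
    (DEFECT, COOPERATE): (0, 3),
    (DEFECT, DEFECT): (2, 2),
}


def best_response(other_move):
    """Return the move that minimizes my own sentence against other_move."""
    return min((COOPERATE, DEFECT), key=lambda mine: PAYOFFS[(mine, other_move)][0])


def total_sentence(move_a, move_b):
    """Combined sentence for both players."""
    return sum(PAYOFFS[(move_a, move_b)])


if __name__ == "__main__":
    for other in (COOPERATE, DEFECT):
        print(f"If the other player chooses to {other}, my best move is to {best_response(other)}")
    print("Both cooperate, total years:", total_sentence(COOPERATE, COOPERATE))
    print("Both defect, total years:", total_sentence(DEFECT, DEFECT))
```

In this sketch, defecting is the best individual response whatever the other player does, yet mutual defection leaves both players worse off than mutual cooperation. That tension between individual and shared benefit is the aspect the game is said to capture.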
Pitting AI-controlled characters (called “agents”) against each other, DeepMind had them compete to gather the most virtual apples. Once the number of available apples got low, the AI agents started to display "highly aggressive" tactics, employing laser beams to knock each other out. They would also steal the opponent’s apples.
Here’s how one of those games played out:
The DeepMind AI agents are in blue and red. The apples are green, while the laser beams are yellow.
The DeepMind team described their test in a blog post this way:
“We let the agents play this game many thousands of times and let them learn how to behave rationally using deep multi-agent reinforcement learning. Rather naturally, when there are enough apples in the environment, the agents learn to peacefully coexist and collect as many apples as they can. However, as the number of apples is reduced, the agents learn that it may be better for them to tag the other agent to give themselves time on their own to collect the scarce apples.”
Interestingly, the AI systems appear to have developed some forms of human-like behavior.
“This model... shows that some aspects of human-like behaviour emerge as a product of the environment and learning. Less aggressive policies emerge from learning in relatively abundant environments with less possibility for costly action. The greed motivation reflects the temptation to take out a rival and collect all the apples oneself,” said Joel Z. Leibo from the DeepMind team to Wired.
Besides the fruit gathering, the AI was also tested via a Wolfpack hunting game. In it, two AI characters in the form of wolves chased a third AI agent - the prey. Here the researchers wanted to see whether the AI characters would choose to cooperate to get the prey, since they were rewarded for being near the prey together when it was captured.
“The idea is that the prey is dangerous - a lone wolf can overcome it, but is at risk of losing the carcass to scavengers. However, when the two wolves capture the prey together, they can better protect the carcass from scavengers, and hence receive a higher reward,” wrote the researchers in their paper.
Indeed, the incentivized cooperation strategy won out in this instance, with the AI choosing to work together.
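The Wolfpack incentive described above can be sketched roughly as a reward rule. This is only an illustration under stated assumptions: the capture radius, the reward values and the function name `capture_rewards` are hypothetical and are not taken from the researchers’ paper.

```python
import math

# All values below are illustrative assumptions, not figures from the paper.
CAPTURE_RADIUS = 3.0  # how close a wolf must be to count as "near" the prey
LONE_REWARD = 1.0     # reward when only one wolf is near at capture
TEAM_REWARD = 5.0     # reward per wolf when both are near at capture


def capture_rewards(wolf_positions, prey_position):
    """Return each wolf's reward at the moment the prey is captured."""
    near = [math.dist(wolf, prey_position) <= CAPTURE_RADIUS for wolf in wolf_positions]
    per_wolf = TEAM_REWARD if sum(near) >= 2 else LONE_REWARD
    # Wolves that are too far away to help protect the carcass get nothing.
    return [per_wolf if is_near else 0.0 for is_near in near]


if __name__ == "__main__":
    print(capture_rewards([(0, 0), (1, 1)], (0, 1)))    # both wolves are near
    print(capture_rewards([(0, 0), (10, 10)], (0, 1)))  # only one wolf is near
```

Under these assumed numbers, each wolf earns more when both are close to the prey at capture than a lone hunter does, which is the kind of incentive that would make staying together worthwhile.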
This is how that test panned out:
The wolves are red, chasing the blue dot (prey), while avoiding grey obstacles.
If you are thinking “Skynet is here”, perhaps the silver lining is that the second test shows how AI’s self-interest can include cooperation rather than the all-out competitiveness of the first test. Unless, of course, it’s cooperation to hunt down humans.
Here's a chart of the results from the game tests, showing a clear increase in aggression during "Gathering":
Movies aside, the researchers are working to figure out how AI can eventually “control complex multi-agent systems such as the economy, traffic systems, or the ecological health of our planet – all of which depend on our continued cooperation”.
One near-term AI application where this could be relevant is self-driving cars, which will have to choose the safest routes while taking the objectives of all the parties involved into consideration.
The warning from the tests is that if the objectives are not balanced in the programming, the AI might act selfishly, and probably not to everyone’s benefit.
What’s next for the DeepMind team? Joel Leibo wants the AI to go deeper into the motivations behind decision-making:
“Going forward it would be interesting to equip agents with the ability to reason about other agent’s beliefs and goals,” said Leibo to Bloomberg.