A.I. is translating messages of long-lost languages

MIT and Google researchers use deep learning to decipher ancient languages.

A.I. is translating messages of long-lost languages
  • Researchers from MIT and Google Brain discover how to use deep learning to decipher ancient languages.
  • The technique can be used to read languages that died long ago.
  • The method builds on the ability of machines to quickly complete monotonous tasks.


There are about 6,500-7,000 languages currently spoken in the world. But that's less than a quarter of all the languages people spoke over the course of human history. That total number is around 31,000 languages, according to some linguistic estimates. Every time a language is lost, so goes that way of thinking, of relating to the world. The relationships, the poetry of life uniquely described through that language are lost too. But what if you could figure out how to read the dead languages? Researchers from MIT and Google Brain created an AI-based system that can accomplish just that.

While languages change, many of the symbols and how the words and characters are distributed stay relatively constant over time. Because of that, you could attempt to decode a long-lost language if you understood its relationship to a known progenitor language. This insight is what allowed the team which included Jiaming Luo and Regina Barzilay from MIT and Yuan Cao from Google's AI lab to use machine learning to decipher the early Greek language Linear B (from 1400 BC) and a cuneiform Ugaritic (early Hebrew) language that's also over 3,000 years old.

Linear B was previously cracked by a human – in 1953, it was deciphered by Michael Ventris. But this was the first time the language was figured out by a machine.

The approach by the researchers focused on 4 key properties related to the context and alignment of the characters to be deciphered – distributional similarity, monotonic character mapping, structural sparsity and significant cognate overlap.

They trained the AI network to look for these traits, achieving the correct translation of 67.3% of Linear B cognates (word of common origin) into their Greek equivalents.

What AI can potentially do better in such tasks, according to MIT Technology Review, is that it can simply take a brute force approach that would be too exhausting for humans. They can attempt to translate symbols of an unknown alphabet by quickly testing it against symbols from one language after another, running them through everything that is already known.

Next for the scientists? Perhaps the translation of Linear A - the Ancient Greek language that no one has succeeded in deciphering so far.

You can check out their paper "Neural Decipherment via Minimum-Cost Flow: from Ugaritic to Linear B" here.

Noam Chomsky on Language’s Great Mysteries

Noam Chomsky contemplates the basic, yet still unanswerable, questions of linguistics.


How New York's largest hospital system is predicting COVID-19 spikes

Northwell Health is using insights from website traffic to forecast COVID-19 hospitalizations two weeks in the future.

Credit: Getty Images
Sponsored by Northwell Health
  • The machine-learning algorithm works by analyzing the online behavior of visitors to the Northwell Health website and comparing that data to future COVID-19 hospitalizations.
  • The tool, which uses anonymized data, has so far predicted hospitalizations with an accuracy rate of 80 percent.
  • Machine-learning tools are helping health-care professionals worldwide better constrain and treat COVID-19.
Keep reading Show less

3,000-pound Triceratops skull unearthed in South Dakota

"You dream about these kinds of moments when you're a kid," said lead paleontologist David Schmidt.

Excavation of a triceratops skull in South Dakota.

Credit: David Schmidt / Westminster College
Surprising Science
  • The triceratops skull was first discovered in 2019, but was excavated over the summer of 2020.
  • It was discovered in the South Dakota Badlands, an area where the Triceratops roamed some 66 million years ago.
  • Studying dinosaurs helps scientists better understand the evolution of all life on Earth.
Keep reading Show less

World's oldest work of art found in a hidden Indonesian valley

Archaeologists discover a cave painting of a wild pig that is now the world's oldest dated work of representational art.

Pig painting at Leang Tedongnge in Indonesia, made at 45,500 years ago.

Credit: Maxime Aubert
Surprising Science
  • Archaeologists find a cave painting of a wild pig that is at least 45,500 years old.
  • The painting is the earliest known work of representational art.
  • The discovery was made in a remote valley on the Indonesian island of Sulawesi.
Keep reading Show less

What can Avicenna teach us about the mind-body problem?

The Persian polymath and philosopher of the Islamic Golden Age teaches us about self-awareness.

Photo by Andrew Spencer on Unsplash
Mind & Brain
Philosophers of the Islamic world enjoyed thought experiments.
Keep reading Show less
Videos

The incredible physics behind quantum computing

Can computers do calculations in multiple universes? Scientists are working on it. Step into the world of quantum computing.

Scroll down to load more…
Quantcast