Listen to New Google AI Program Talk Like a Human and Write Music

Google's DeepMind creates AI that blows away existing speech synthesizers. 

Google-owned artificial intelligence company DeepMind presented a deep neural network that generates amazingly human-like speech. Called WaveNet, this AI makes a significant advancement over existing speech synthesizers. What’s more, it can write pretty good classical music. 


DeepMind is a British company, previously known for creating machine-learning AI software that beat the world champion of the notoriously intricate game Go. Machine learning allows computer systems to teach themselves and make predictions based on gathered data.

The company claims that its WaveNet creates speech that can mimic any human voice and closes the gap with human speech performance by more than 50%. Google’s 500-person blind test study found people rating WaveNet’s English speech at 4.21 (5 being realistic human speech), while concatenative speech got a 3.86 and parametric an even worse 3.67.

WaveNet also generated speech in Mandarin, which got similar results.
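
To see where a figure like "more than 50%" can come from, here is a rough sketch of the arithmetic using the English scores quoted above. The score for natural human speech (4.55) is not given in this article; it is an assumption taken from the figure reported in DeepMind's paper, and the function names are made up for illustration.

```python
# Mean opinion scores for English speech, as quoted in the article.
SCORES = {"wavenet": 4.21, "concatenative": 3.86, "parametric": 3.67}

# ASSUMPTION: this value does not appear in the article. It is the score
# reported for natural human speech in DeepMind's WaveNet paper.
HUMAN_SCORE = 4.55


def gap_closed(new, baseline, human):
    """Fraction of the baseline-to-human gap that the new system closes."""
    return (new - baseline) / (human - baseline)


def report():
    """Compare WaveNet against the better of the two older approaches."""
    best_baseline = max(SCORES["concatenative"], SCORES["parametric"])
    return gap_closed(SCORES["wavenet"], best_baseline, HUMAN_SCORE)


if __name__ == "__main__":
    print(f"Gap closed vs. best prior system: {report():.0%}")
```

Under that assumption, the result lands just above one half, which is consistent with the company's claim.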

 

They did this by re-imagining the text-to-speech (TTS) processes currently in use. The two most common are concatenative TTS, used by Apple’s Siri, which stitches together pre-recorded fragments of speech, and parametric TTS, which generates speech through computer algorithms and sounds even less natural.

What’s different about WaveNet is that it can directly model the raw waveform of an audio signal, an extremely complicated task that required a novel neural network. WaveNet learns from voice recordings, then creates speech on its own. This independence also allows the program to generate other kinds of audio, like music.

 

To bolster their claim, DeepMind released some samples, comparing WaveNet’s output with samples made by concatenative and parametric TTS. You be the judge.

Parametric:

parametric-1.wav

parametric-2.wav

And now, this is what WaveNet generated:

wavenet-1.wav

wavenet-2.wav

After it was trained on a dataset of classical piano music, WaveNet produced these intriguing musical creations of its own:

sample_1.wav

sample_2.wav

sample_3.wav

What are the implications of this new tech? While it also means our eventual robotic overlords should be easier to talk to, virtual AI assistants like Siri or Cortana could benefit sooner. Google isn’t promising this is headed straight to such applications, however, as WaveNet requires serious computing power.

This achievement shows again the potential of DeepMind's neural networks, which can be and are being used for fraud and spam detection, handwriting recognition, image search, translation and other tasks.

DeepMind also made a number of Google's data centers use energy more efficiently, slashing its electricity bill. Previously, DeepMind trained its AI to beat dozens of video games.

In a very Google move, the paper on WaveNet is available on Google Drive here.

