Listen to New Google AI Program Talk Like a Human and Write Music

Google's DeepMind creates AI that blows away existing speech synthesizers. 

Google-owned artificial intelligence company DeepMind presented a deep neural network that generates amazingly human-like speech. Called WaveNet, this AI makes a significant advancement over existing speech synthesizers. What’s more, it can write pretty good classical music. 


DeepMind is a British company, previously known for creating machine-learning AI software that beat the world champion of the notoriously intricate game Go. Machine learning allows computer systems to teach themselves and make predictions based on gathered data.

The company claims that WaveNet creates speech that can mimic any human voice and closes the gap between existing systems and human speech performance by more than 50%. Google’s 500-person blind test study found people rating WaveNet’s English speech at 4.21 on a 5-point scale (5 being realistic human speech), while concatenative speech got a 3.86 and parametric an even worse 3.67.

WaveNet also generated speech in Mandarin, which got similar results.

 

DeepMind’s researchers did this by rethinking the text-to-speech (TTS) processes currently in use. The two most common are concatenative TTS, used by Apple’s Siri, which stitches together pre-recorded fragments of speech, and parametric TTS, which generates speech through computer algorithms and sounds even less natural.

What’s different about WaveNet is that it can directly model the raw waveform of an audio signal, an extremely complicated task that required a novel neural network. WaveNet learns from voice recordings, then creates speech on its own. This independence also allows the program to generate other kinds of audio, like music.
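For readers who want a feel for what "modeling the raw waveform" can look like, here is a minimal Python sketch of generating audio one sample at a time, with each new sample fed back in as input for the next. It is an illustration only, not DeepMind's code: the `toy_model` function is a hypothetical stand-in for the trained neural network, and the choice of 256 amplitude levels is an assumption made for this example.

```python
import math
import random

LEVELS = 256  # number of quantized amplitude values (an assumption for this sketch)


def toy_model(history):
    """Hypothetical stand-in for a trained network.

    Returns one weight per amplitude level for the next sample,
    peaked near the most recent sample so the output varies smoothly.
    """
    center = history[-1] if history else LEVELS // 2
    return [math.exp(-((level - center) ** 2) / 50.0) for level in range(LEVELS)]


def generate(num_samples, seed=0):
    """Generate audio one sample at a time, feeding each output back in."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        weights = toy_model(samples)  # weights over the next amplitude
        next_sample = rng.choices(range(LEVELS), weights=weights, k=1)[0]
        samples.append(next_sample)  # the new sample becomes part of the input
    return samples


waveform = generate(1000)
print(len(waveform), min(waveform), max(waveform))
```

The toy model only produces a slow random drift, not speech. The point is the loop: nothing is stitched together from recordings, and every amplitude value is drawn from a prediction based on the samples that came before it.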

 

To bolster their claim, DeepMind released some samples comparing WaveNet with samples made by concatenative and parametric TTS. You be the judge.

Parametric:

parametric-1.wav

parametric-2.wav

And now, this is what WaveNet generated:

wavenet-1.wav

wavenet-2.wav

After it was trained on a dataset of classical piano music, WaveNet produced these intriguing musical creations of its own:

sample_1.wav

sample_2.wav

sample_3.wav

What are the implications of this new tech? It may mean our eventual robotic overlords will be easier to talk to, but virtual AI assistants like Siri or Cortana could benefit sooner. Google isn’t promising this is headed straight to such applications, however, as WaveNet requires serious computing power.

This achievement shows again the potential of DeepMind's neural networks, which can be, and are being, used for fraud and spam detection, handwriting recognition, image search, translation and other tasks.

DeepMind also made a number of Google's data centers use energy more efficiently, slashing its electricity bill. Previously, DeepMind trained its AI to beat dozens of video games.

In a very Google move, the paper on WaveNet is available on Google Drive here.
