Google DeepMind WaveNet Mimics Human Speech

Google’s artificial intelligence (AI) division DeepMind has figured out how to make computers generate speech which sounds natural by mimicking human speech.

To do this, DeepMind created WaveNet, a deep generative model of raw audio waveforms. Put simply, WaveNet uses an artificial neural network, a system loosely modelled on how the human brain processes information, to work directly on raw sound waveforms and model the patterns likely to produce natural speech.

Through this process WaveNet learns how to synthesise speech as well as other audio signals such as music. It differs from common text-to-speech systems, which assemble sentences from large databases of short speech fragments, resulting in speech that sounds classically robotic and stilted.

“[Text-to-speech] makes it difficult to modify the voice (for example switching to a different speaker, or altering the emphasis or emotion of their speech) without recording a whole new database,” said DeepMind.
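The fragment-stitching approach described above can be sketched in a few lines of Python. This is a toy illustration only: the fragment names and the numbers standing in for audio samples are invented, and real text-to-speech systems are far more elaborate. It does show why the voice is tied to the recorded database.

```python
# Hypothetical database of recorded speech fragments for ONE speaker.
# Each value is a made-up list of audio samples, not real recordings.
FRAGMENT_DB = {
    "hel": [0.1, 0.3, 0.2],
    "lo": [0.0, -0.2],
    "wor": [0.4, 0.1],
    "ld": [-0.1, -0.3, 0.0],
}


def concatenative_synthesis(units, database):
    """Build a waveform by joining pre-recorded fragments end to end."""
    waveform = []
    for unit in units:
        # Every sample comes straight from the database, so the voice
        # can only change if the database itself is re-recorded.
        waveform.extend(database[unit])
    return waveform


hello = concatenative_synthesis(["hel", "lo"], FRAGMENT_DB)
print(hello)
```

Because the output is nothing more than stored recordings placed one after another, switching speaker or emphasis means swapping in a different set of fragments.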

WaveNet

Training WaveNet to recognise speech patterns and then allowing it to learn how to produce natural speech takes an enormous amount of computing power.

So it is unlikely Google will be taking the technology and adding it into the next version of Android or Google Now.

However, Google has used similar deep learning neural networks to create the smart image recognition features found in some of its software, including Google Photos.

A lot of the complexity of WaveNet stems from the need for it to take at least 16,000 samples of waveforms a second, which means it has to process a vast amount of data.
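To give a sense of scale, the short Python sketch below works out how many samples that rate implies for clips of different lengths. The 16,000 figure comes from the article; the function name and the example durations are chosen purely for illustration.

```python
SAMPLE_RATE_HZ = 16_000  # samples per second, the minimum quoted above


def samples_needed(duration_seconds, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return how many waveform samples cover the given duration."""
    return round(duration_seconds * sample_rate_hz)


# Example durations (illustrative, not from the article).
for seconds in (1, 10, 60):
    print(f"{seconds:>2} s of audio -> {samples_needed(seconds):,} samples")
```

At this rate, a single minute of speech already amounts to 960,000 samples for the model to handle.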

But as DeepMind’s research and AI development continue to progress, it would not be surprising to see refined, slimmed-down versions of WaveNet appear in smart Google services.

DeepMind has been making waves this year with its deep learning systems, having produced AlphaGo, an AI that can beat top human players of the famously complicated Chinese board game Go.


Roland Moore-Colyer

As News Editor of Silicon UK, Roland keeps a keen eye on the daily tech news coverage for the site, while also focusing on stories around cyber security, public sector IT, innovation, AI, and gadgets.
