The Deep Voice programme is built by technology giant Baidu, which is described as the Asian counterpart to Google.
It uses an AI technique called a deep neural network to mimic British and American voices from only a handful of audio clips.
The development has raised concerns about a Machiavellian new generation of fake news, in which anyone with the right software on their bedroom computer could manipulate the voices of politicians or celebrities.
Dr Matthew Aylett, a former research fellow at the University of Edinburgh and co-founder of speech technology company CereProc, said: “The Baidu work is really interesting.
“They are a very credible research team. Their samples sound really good. The question is whether they cherry-picked some of the samples to make it sound better.”
The Times published a reconstruction of a speech by US President John F Kennedy, which was the work of CereProc.
The author of the bestseller An Unfinished Life, a biography of the late President, said: “The recording was understandably troubling to people.
“It’s a bit like bringing someone back from the dead and JFK remains a greatly admired voice and president. In fact, the Trump presidency adds to the nostalgia for Kennedy.”
The Kennedy reconstruction was built over eight weeks from 831 recordings.
But recent advances mean that in a few years it may take only a few hours to produce a thoroughly convincing imitation of Theresa May or Boris Johnson, according to Dr Aylett.
Baidu claims it can do the same trick with a handful of samples adding up to less than a minute.
Rather than building the cloned voice from scratch, Deep Voice takes a model from a library of 2,400 voices and alters it until it matches the speaker.
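The idea of starting from an existing voice and adjusting it can be illustrated with a toy sketch. This is not Baidu's method: here each voice is reduced to a short list of made-up numbers, whereas a real system adapts a neural network. The function names, the eight-number voice description and the random data are all assumptions made purely for illustration.

```python
import math
import random


def mean_vector(vectors):
    """Average a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def adapt_voice(library, clips, step=0.5, tolerance=1e-3, max_iters=100):
    """Pick the closest library voice, then nudge it toward the speaker."""
    # Summarise the speaker's handful of clips as one target vector.
    target = mean_vector(clips)
    # Start from the library voice that already sounds most similar.
    voice = list(min(library, key=lambda v: distance(v, target)))
    for _ in range(max_iters):
        if distance(voice, target) < tolerance:
            break
        # Move each feature part of the way toward the target.
        voice = [x + step * (t - x) for x, t in zip(voice, target)]
    return voice


random.seed(0)
# Hypothetical data: 2,400 library voices and five short clips of a speaker.
library = [[random.uniform(-1, 1) for _ in range(8)] for _ in range(2400)]
clips = [[random.uniform(-1, 1) for _ in range(8)] for _ in range(5)]
adapted = adapt_voice(library, clips)
```

Starting from the nearest library voice means the adjustment loop has less distance to cover than it would from an arbitrary starting point, which is the intuition behind needing so little audio.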
CereProc has also rebuilt the voice of film critic Roger Ebert and edited the Queen’s speeches to make it sound as though she had composed a rap for her diamond jubilee.
The company usually asks clients to record themselves reading more than 600 sentences, which equates to around 40 minutes of speech.
But Adobe, the company behind photo-editing software Photoshop, claims it has made a voice-cloning counterpart to its image-editing software that needs only 20 minutes of audio.
Dr Aylett added that during the 2012 US election his company declined a West Coast company’s offer to buy its synthesised Barack Obama voice.
The voice cloning market is mostly made up of people who fear they will lose their power of speech to chronic diseases, such as amyotrophic lateral sclerosis, which afflicted the late Stephen Hawking.