Creepy Speaking Neural Networks

March 22, 2017

Tech artist [Alexander Reben] has shared some work in progress with us. It’s a neural network trained on various famous people’s speech (YouTube, embedded below). [Alexander]’s artistic goal is to capture the “soul” of a person’s voice, in much the same way as death masks of centuries past. Of course, listening to [Alexander]’s Rob Boss is no substitute for actually watching an old Bob Ross tape (indeed, it never even manages to say “happy little trees”), but it is certainly recognizable as the man himself, and now we can generate an infinite amount of his patter.

Behind the scenes, he’s using WaveNet, a generative neural network for raw audio, as the model behind each voice. Basically, the algorithm treats an audio stream as a sequence of small chunks (individual samples, in WaveNet’s case) and tries to predict the next one based on those that came before. Some pre-editing of the training audio data was necessary (removing the laughter and applause from the Colbert track, for instance), but otherwise it was basically just plugged right in.
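
To make the “predict the next chunk from the previous ones” idea concrete, here is a minimal Python sketch. It is not [Alexander]’s code and it is not WaveNet: a simple lookup table stands in for the neural network, a 220 Hz test tone stands in for the training audio, and all the function names are made up for illustration. The one detail borrowed from WaveNet is squeezing each sample into 256 classes with mu-law companding, so that “predict the next sample” becomes “pick one of 256 classes”.

```python
import math
import random
from collections import Counter, defaultdict

LEVELS = 256      # 8-bit mu-law classes, as WaveNet uses
MU = LEVELS - 1


def mu_law_encode(x):
    """Map a sample in [-1, 1] to an integer class in [0, 255]."""
    compressed = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((compressed + 1) / 2 * MU))


def mu_law_decode(q):
    """Approximate inverse of mu_law_encode (up to quantization error)."""
    compressed = 2 * q / MU - 1
    return math.copysign(math.expm1(abs(compressed) * math.log1p(MU)) / MU,
                         compressed)


def train(samples, context=2):
    """Count which class follows each run of `context` previous classes.

    Assumption: this table is a stand-in for the neural network.
    """
    table = defaultdict(Counter)
    for i in range(context, len(samples)):
        table[tuple(samples[i - context:i])][samples[i]] += 1
    return table


def generate(table, seed, n, rng):
    """Autoregressive loop: sample the next class, append it, repeat."""
    out = list(seed)
    context = len(seed)
    for _ in range(n):
        counts = table.get(tuple(out[-context:]))
        if not counts:  # context never seen in training: stop, don't guess
            break
        classes, weights = zip(*counts.items())
        out.append(rng.choices(classes, weights=weights)[0])
    return out


# Toy "training audio": one second of a 220 Hz tone sampled at 8 kHz.
RATE = 8000
audio = [0.5 * math.sin(2 * math.pi * 220 * t / RATE) for t in range(RATE)]
quantized = [mu_law_encode(x) for x in audio]

model = train(quantized, context=2)
generated = generate(model, quantized[:2], 400, random.Random(0))
waveform = [mu_law_decode(q) for q in generated]
```

The part that carries over to the real thing is the loop in `generate`: every new sample is drawn from a distribution conditioned on what has already been produced, then fed back in as context for the next one. A real WaveNet replaces the table with a deep stack of dilated convolutions that looks much further back than two samples.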

The network seems to over-emphasize sibilants; we’ve never heard Barack Obama hiss quite like that in real life. Feeding noise into machines that are set up as pattern-recognizers tends to push them to the limits. But in keeping with the name of this series of projects, the “unreasonable humanity of algorithms”, it does pretty well.

He’s also done the same thing with multiple speakers (also YouTube), in this case 110 people with different genders and accents. The variation across people leads to a smoother, more human sound, but it’s also not clearly anyone in particular. It’s meant to be continuously running out of a speaker inside a sculpture’s mouth. We’re a bit creeped out, in a good way.

We’ve covered some of [Alexander]’s work before, from the wince-inducing “Robot Bites Man” to the intellectual-conceptual “All Prior Art”. Keep it coming, [Alexander]!

37 thoughts on “Creepy Speaking Neural Networks”

--A says:

March 22, 2017 at 4:42 am

Elliot, seems a [Name] failure occurred the last PP.
–A

1. Mike Szczys says:
  
  March 22, 2017 at 7:16 am
  
  Quite right, should be Alexander. I’ve made the changes, thanks!
  
wartoaster says:

March 22, 2017 at 5:00 am

Colbear

Also, I’m glad they upgraded the voice effects for the new season of Twin Peaks

1. Mike Szczys says:
  
  March 22, 2017 at 7:24 am
  
  Ha! That reference takes me back. I’m surprised I didn’t hear it the first time I watched the demo.
  
2. dizot says:
  
  March 22, 2017 at 9:48 pm
  
  I knew that I heard “this is a Formica table” during the Rob Boss whisper segment.
  
mime says:

March 22, 2017 at 5:16 am

Recently they came up with a way to adjust video of people speaking so that a third person could ‘move their mouth’ (look at the video and you’ll understand: https://youtu.be/ohmajJTcpNk).

The output of this article of voice emulation may now sound creepy and artificial, but once the neural networks get more training data, one day they will produce voices that are indistinguishable from reality.

I bet that in “certain countries” suddenly foreign heads of state will start saying really interesting things..

1. Truth says:
  
  March 22, 2017 at 5:48 am
  
  I wonder whether royalties will go to the living families of dead actors who are recycled by a merger of the two technologies. I’m sure SAG-AFTRA (Screen Actors Guild-American Federation of Television and Radio Artists) will block it until the royalties are sorted out.
  
  1. Echo_Hotel (@Echo_Hotel) says:
    
    March 22, 2017 at 6:03 pm
    
    Presumably this is going to be something that the actor owns rather than the studio, with actors who want the cutting-room-floor scraps for training their doppelganger taking home slightly less in the short term for the promise of eternal CG youth.
    
2. Sheldon says:
  
  March 22, 2017 at 6:24 am
  
  Now that was impressive – I can see it being really good on films that need a re-dub for voices (say for foreign language, script changes or even, er, ‘PG-13’ classification reasons). That way one wouldn’t get quite so distracted by the visuals not matching the audio and forcing them to pick bad re-dub words (see all the oddball variants of Bruce Willis’ infamous “yippee-ki-yay..” line in Die Hard).
  
  1. Mike says:
    
    March 22, 2017 at 9:51 pm
    
    And…
    dubbing in different languages in the actor’s “real” voice for foreign distribution
    
CRJEEA says:

March 22, 2017 at 6:08 am

Next stop, Max Headroom.

Mike says:

March 22, 2017 at 6:20 am

who is Obamer?

1. guest says:
  
  March 22, 2017 at 7:16 am
  
  The Russian sounding one.
  
RW ver 0.0.2 says:

March 22, 2017 at 6:21 am

Pretty much sounds like sweeping the shortwave dial during high magnetosphere activity, hmmm that might be voice of america, that might be BBC world service, that might be something french…

rasz_pl says:

March 22, 2017 at 6:31 am

garbage in, garbage out

markscudder says:

March 22, 2017 at 6:43 am

You’ve never heard Barack Obama hiss like that, huh?

Besides the obvious joke, that bigot has one of the most pronounced sibilance problems of any modern American celebrity. Only Paul Harvey had it worse. Perhaps we rationalize away the faults of those we adore. It was unsurprising to me the neural net picked it up.

1. sneftel says:
  
  March 22, 2017 at 8:29 am
  
  yaaaaay let’s bring your political views into a blog post on neural networks
  
  1. Dave Davidson says:
    
    March 22, 2017 at 9:34 am
    
    Train a Neural net with Political decisions and outcomes world wide for the last 2000 years and see what it produces.
    
    1. sneftel says:
      
      March 22, 2017 at 10:23 am
      
      cerberarchy
      
  2. Valentin says:
    
    March 22, 2017 at 1:44 pm
    
    Attempting to derail a point with an emotionally triggered response. That is all.
    
    This is how religious figures justify their heinous teachings, i.e. “omgg think of the CHILDREN and their precious little Sunday school.”
    
    1. Quin says:
      
      March 23, 2017 at 6:54 pm
      
      Look, a pure example of conservative virtue signaling. I bet the precious snowflake thinks that’s something only “those nasty lie-beral sjw” do.
      
2. Dan says:
  
  March 22, 2017 at 4:06 pm
  
  http://arcturi.com/sitebuilder/images/Obama_Reptilian-270×270.jpg
  
drew says:

March 22, 2017 at 8:09 am

Rob Boss whisper is what you hear coming from the shadowy corners of your cabin in the woods when you think you are alone and slowly going insane. This is both creepy and hilarious. I’m now imagining that one coming continuously out of a bust of Bob Ross and it’s freaking me the hell out.

1. thatfatninja says:
  
  March 22, 2017 at 9:19 am
  
  The random whispers was one thing, the whispering along with the dog mountain google dream interpretation put it over the top.
  
  1. thatfatninja says:
    
    March 22, 2017 at 9:20 am
    
    were^^
    
2. chango says:
  
  March 22, 2017 at 9:33 am
  
  Make it animatronic and you’re guaranteed to win the HaD prize.
  
notarealemail says:

March 22, 2017 at 9:53 am

Rob Boss is reminiscent of garbled cell phone noise.

Joe says:

March 22, 2017 at 10:03 am

Just sounds like an old cassette played backwards. If I wasn’t too lazy I’d reverse it just to make sure.

Keith says:

March 22, 2017 at 10:12 am

A new weapon in our war against automated robocallers!

localhost says:

March 22, 2017 at 11:09 am

Train this with Rick Astley and Justin Bieber. But don’t listen to the output if you do that.

echodelta says:

March 22, 2017 at 12:28 pm

A few decades ago I saw it coming, Howard Cosell or Walter Cronkite reading to you the news. Sadly like TV-movies-CGI it will be the reason for turning it all off.
Like laptop Data Jockeys making “music” of someone’s work into just worthless noise, end run. Power off.

mabarnett0 says:

March 22, 2017 at 12:36 pm

Not quite unsettling enough. I know! Let’s train it on anime characters.
https://youtu.be/FsVSZpoUdSU

1. Dan says:
  
  March 22, 2017 at 4:08 pm
  
  Ah that is the video I wanted to point out, much better, and informative.
  
  And without creepy subliminal “lizard people” references.
  
2. Crazy Cat Man says:
  
  March 23, 2017 at 11:39 am
  
  9000 is pretty funny: “AAAAAAAAH! AAAAAAAAAAAAAAH! AAAH! AAAAAAAAAAAAAAAAAAAAAH!”
  
gregkennedy says:

March 22, 2017 at 6:42 pm

Is this really an NN? It seems more like a Markov chain using sound instead of text.

Tinker Duck says:

March 23, 2017 at 1:58 pm

Never thought about applying the term uncanny valley to audio, until now…

Aprilia says:

April 3, 2018 at 1:29 am

That’s good, how can you do it?


Hackaday
