Hallucinating Machines Generate Tiny Video Clips

September 29, 2016

Hallucination is the erroneous perception of something that’s actually absent, or in other words: a possible interpretation of training data. Researchers from MIT and UMBC have developed and trained a generative machine-learning model that learns to generate tiny videos at random. The hallucination-like 64×64-pixel clips are somewhat plausible, but also a bit spooky.

The machine-learning model behind these artificial clips learns from unlabeled “in-the-wild” training videos and relies mostly on the temporal coherence of successive frames as well as the presence of a static background. It learns to disentangle foreground objects from the background and extracts the overall dynamics of the scenes. The trained model can then be used to generate new clips at random (as shown above), or from a static input image (as shown in pairs below).
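To make the foreground/background split more concrete, here is a minimal sketch of how a clip could be assembled from a moving foreground, a per-pixel mask, and a single still background. This is not the team’s Torch7 code: the mask-blend formula, the function name `compose_clip`, and the sliding-bar test data are all assumptions chosen for illustration, and random arrays stand in for anything a trained network would produce.

```python
import numpy as np

def compose_clip(foreground, mask, background):
    """Blend a moving foreground over a static background (illustrative only).

    foreground: (T, H, W, 3) floats in [0, 1], one image per frame
    mask:       (T, H, W, 1) floats in [0, 1], 1 = foreground pixel
    background: (H, W, 3) floats in [0, 1], a single still image
    """
    # Broadcasting repeats the one still background across all T frames.
    return mask * foreground + (1.0 - mask) * background[np.newaxis]

rng = np.random.default_rng(0)
T, H, W = 32, 64, 64  # clip length and resolution quoted in the article

# Random stand-ins for what a trained generator would output (assumption).
foreground = rng.random((T, H, W, 3))
background = rng.random((H, W, 3))

# Hypothetical motion: an 8-pixel-wide bar sliding right one pixel per frame.
mask = np.zeros((T, H, W, 1))
for t in range(T):
    mask[t, :, t:t + 8, :] = 1.0

clip = compose_clip(foreground, mask, background)
print(clip.shape)
```

Wherever the mask is zero, every frame shows the same background pixel, which is one simple way to encode the “static background” assumption; only the masked region changes over time.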

Currently, the team limits the clips to a resolution of 64×64 pixels and a length of 32 frames to reduce the amount of training data required, which still comes to 7 TB. Despite obvious deficiencies in photorealism, the little clips were judged “more realistic” than real clips by about 20 percent of the participants in a psychophysical study the team conducted. The code for the project (Torch7/LuaJIT) is already available on GitHub, together with a pre-trained model. The project will also be presented in December at the 2016 NIPS conference.

21 thoughts on “Hallucinating Machines Generate Tiny Video Clips”

John Spencer says:

September 29, 2016 at 1:41 pm

The thing is hallucinating because it isn’t sleeping properly. Put a few electric sheep in the training data and the spookiness will go.

Dan#942164212 says:

September 29, 2016 at 2:06 pm

That hospital category is like something out of a post-Chernobyl nightmare.

http://reason.csail.mit.edu/~vondrick/vgan/hospital/727.gif

http://reason.csail.mit.edu/~vondrick/vgan/hospital/996.gif

http://reason.csail.mit.edu/~vondrick/vgan/hospital/998.gif

Moryc says:

September 29, 2016 at 2:15 pm

They could get more training data by replacing clips with links to YouTube videos. Or better yet: make it watch every YouTube video in order of generated addresses. Learning capability would be limited only by bandwidth and number of videos watched simultaneously…

1. JWhitten says:
  
  September 30, 2016 at 6:11 am
  
  Of course there’s one really obvious application… right? ;-)
  
2. geekmaster says:
  
  September 30, 2016 at 8:10 am
  
  Oh, so it can generate synthetic cat videos? ;-)
  
3. Giligain says:
  
  September 30, 2016 at 7:42 pm
  
  Oops, my thumb reported this comment [by accident] & I can’t take it back. If this site doesn’t want false reports, then design a button to cancel previous input.
  
  1. Elliot Williams says:
    
    September 30, 2016 at 11:09 pm
    
    Meh. We don’t mind. If ten people do it by mistake, they just end up in moderation, and we approve them again!
    
  2. The Skipper says:
    
    September 30, 2016 at 11:44 pm
    
    Giligain!
    
RW says:

September 29, 2016 at 2:55 pm

This one scares me a little about the future of machine perception and prediction…

https://hackaday.com/wp-content/uploads/2016/09/10.gif

“Your honor, after exhaustive examination of the black box of my client’s self-driving car, it appears the software expected the pedestrian to evaporate by the time she reached the middle of his lane.”

Thomas Wrobel says:

September 29, 2016 at 3:20 pm

A lot of these seem very slit-scan-esque.
I guess because footage is heavily biased towards things moving along a horizontal axis rather than a vertical one, the “time out of sync” effect ends up similar.

richfiles says:

September 29, 2016 at 3:57 pm

Part of me wants there to be a “Hallucinate your pic” drop box using this software… But I fear they’d need to create an “anatomy” category… O_o
I WOULD love to be able to play with this, but I have not got the first idea how to even try to make any of the downloadable stuff actually work.

1. RoGeorge says:
  
  September 29, 2016 at 4:08 pm
  
  For “Hallucinate your pic” you may want to try Google Deep Dream: http://deepdreamgenerator.com/
  
  1. richfiles says:
    
    September 29, 2016 at 7:15 pm
    
    Sure, the psychedelic Google ones… But I mean the little GIFs. Those are pretty cool, and I bet they’d look sweet as a forum icon or something of the sort.
    
Rodney McKay says:

September 29, 2016 at 5:33 pm

Needs porn clips.

Simon Leese says:

September 29, 2016 at 7:52 pm

This is really interesting from a VR point of view.

Imagine for a moment if these videos were generated in real time, based on a few set images of actions or things that were to be in the VR world, with the hallucinating machine generating context and visuals that are, as far as we’re aware, real and lifelike.

While the premise here is generating what’s not there, having it fill in what VR does want us to see may be a step toward truly lifelike graphics, even if they come as generated streamed video rather than being rendered from polygons.

Super interesting to see where they take this, and what ‘trippy’ VR experiences that are outside of our normal reality we’ll be able to conceive.

Elliot Williams says:

September 30, 2016 at 12:06 am

‘the little clips have been judged “more realistic” than real clips by about 20 percent of the participants’ who were all on acid at the time.

1. Moritz Walter says:
  
  September 30, 2016 at 5:05 am
  
  Hahahahahahahaha
  
Tore Lund says:

September 30, 2016 at 3:36 am

So this will be the next thing in television production? When auto-colorized B&W photos and movies started appearing in documentaries in the ’90s, every single celluloid stump got rerun, repackaged as “WW1 Never before seen footage”, “Nazis in color”, etc. Now we will be able to include still photos going back to the dawn of photography. Old French postcards will be uploaded as GIFs everywhere!!!

Another field would be VR. The algorithms might be suitable to remove the “clinical” feel of computer generated environments?

Turing Complete Machine Machine Machine Mach.... says:

September 30, 2016 at 7:14 am

How did they feed the mushrooms to the computer?
Or was a test subject brain-scanned while smoking pot, and this is how the system interpreted the data?

This makes me remember a joke about a dude smoking his first joint on the balcony, seeing nice changing orange shades moving up and down, from left to right, then again after taking another sip. After the trip ends, he goes back in the house and his mother asks: where have you been? on the balcony. for two days???

1. geekmaster says:
  
  September 30, 2016 at 8:14 am
  
  Smoking a sip of shrooms? Your story confuzzles me…
  
  1. RW says:
    
    September 30, 2016 at 8:29 am
    
    But only after he’d mainlined some glue and sniffed some acid.
    