Attempting To Generate Photorealistic Video With Neural Networks

Over the past decade, we’ve seen great strides made in the area of AI and neural networks. When trained appropriately, they can be coaxed into generating impressive output, whether in text, images, or simply classifying objects. There’s also much fun to be had in pushing them outside their prescribed operating region, as [Jon Warlick] attempted recently.

[Jon]’s work began using NVIDIA’s GauGAN tool. It’s capable of generating pseudo-photorealistic images of landscapes from segmentation maps, where different colors of a 2D image represent things such as trees, dirt, mountains, or water. After spending much time toying with the software, [Jon] decided to see if it could be pressed into service to generate video instead.
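To make the idea concrete, here is a minimal sketch of what a segmentation map is: a 2D grid where every pixel holds a class label, conventionally visualized by mapping each label to a flat color. The labels and palette below are illustrative only, not GauGAN’s actual classes or colors.

```python
# Hypothetical label -> RGB color mapping (illustrative, not GauGAN's palette)
PALETTE = {
    "sky":      (135, 206, 235),
    "mountain": (139, 137, 137),
    "water":    (0, 105, 148),
}

WIDTH, HEIGHT = 8, 6

def make_segmentation_map(width, height):
    """Build a toy map: sky on top, mountains in the middle, water below."""
    rows = []
    for y in range(height):
        if y < height // 3:
            label = "sky"
        elif y < 2 * height // 3:
            label = "mountain"
        else:
            label = "water"
        rows.append([label] * width)
    return rows

def to_rgb(seg_map):
    """Render the label grid as RGB pixels, one flat color per class."""
    return [[PALETTE[label] for label in row] for row in seg_map]

seg = make_segmentation_map(WIDTH, HEIGHT)
pixels = to_rgb(seg)
```

A tool like GauGAN takes a map in this spirit as input and synthesizes a photographic-looking landscape matching the labeled regions.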

The GauGAN tool only takes in a single segmentation map and outputs a single image, so [Jon] had to get creative. He generated a video and exported it as individual frames, then fed those frames to GauGAN one by one as segmentation maps. The output frames from GauGAN were then reassembled into a video again.
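The frame-by-frame workaround can be sketched as a simple pipeline. Note that `gaugan_generate`, `split_video`, and `reassemble` below are all hypothetical stand-ins: the real GauGAN tool is driven interactively, and real frame extraction would use something like ffmpeg.

```python
def gaugan_generate(segmentation_frame):
    """Placeholder for GauGAN: turns one segmentation map into one image.
    Here it just tags the frame so the pipeline can be traced."""
    return ("rendered", segmentation_frame)

def split_video(video):
    """Stand-in for exporting a video as individual frames."""
    return list(video)

def reassemble(frames):
    """Stand-in for stitching rendered frames back into a video."""
    return tuple(frames)

# Each exported frame is treated as an independent segmentation map;
# GauGAN never sees neighboring frames, which is why the final result
# flickers between only loosely related images.
source_video = ("frame0", "frame1", "frame2")
rendered = reassemble(gaugan_generate(f) for f in split_video(source_video))
```

The key property this sketch captures is that every frame passes through the generator in isolation, with no state shared between calls.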

The results are somewhat psychedelic, as one would expect. GauGAN’s single-image workflow means there is only coincidental coherence between consecutive frames, creating a wild, ever-shifting spectacle. While it’s not a technique we expect to see used for serious purposes anytime soon, it’s a great experiment in seeing how far the technology can be pushed. It’s not the first time we’ve seen such technology used to create full motion video, either. Video after the break.

16 thoughts on “Attempting To Generate Photorealistic Video With Neural Networks”

    1. Yea … photorealistic, I think one does not know what that word means… I can’t even call the results nice; it’s just a high-res tilemap on a camera system, and it looks the part.

  1. This is impressive, but it’s neither photorealistic nor video. It’s a sequence of somewhat realistic still frames. Turning a method that works well for single frames into one that produces a coherent video is much harder than a “for loop”.

  2. Awesome, in the same way that a car with square wheels is. I’ll just be over in the corner having an epileptic fit; let me know when the technology really does work.

  3. Actually they aren’t.

    NNs are an ABSTRACTION, of ONE aspect, of the way neurons work.

    It’s a VERY important distinction. When you optimize an abstraction, you really REALLY have to stop yourself from generalizing any ideas based off it (or its results).

    It’s not about whether or not we understand the brain.
    It’s about NNs NOT being LIKE brains.

    A car and a skateboard are both “4-wheeled vehicles that can be used for transportation”. But there are things a car can do that a skateboard can’t, and the reverse is true.
