Light Fields: Missing Ingredient For Immersive 3D Video Gets Improved

46 time-synchronized action cameras make up the guts of the capture device.

3D video content has a significant limitation, one that is not trivial to solve. Video captured by a camera — even one with high resolution and a very wide field of view — still records a scene as a flat plane, from a fixed point of view. The limitation this brings will be familiar to anyone who has watched a 3D video (or “360 video”) in VR and moved their head the wrong way. In these videos one is free to look around, but may not change the position of their head in the process. Put another way, pivoting one’s head to look up, down, left, or right is fine. Moving one’s head higher, lower, closer, further, or to the side? None of that works. Natural movements like trying to peek over an object, or moving slightly to the side for a better view simply do not work.

Light field video changes that. It is captured using a device like the one in the image above, and Google has a resource page giving an excellent overview of what light field video is, what it can look like, and how they are doing it. That page covers recent improvements to their camera apparatus as well as to video encoding and rendering, and it also serves as a great show-and-tell of what light fields are and what they can do.

Light field image, with viewer’s point of view moving in a figure eight pattern. Colors show depth layers interpolated by software.

The meta-camera is a hemisphere just under one meter in diameter that contains an array of 46 time-synchronized action cameras pointed outwards. On the software end, camera input is used to reconstruct the scene and create a 6 DoF volumetric video. In other words, the perspective of the video changes correctly as the user moves their point of view (within an area very roughly corresponding to the size of the camera device, anyway).
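To get a feel for why head translation matters more for nearby objects, here is a rough back-of-the-envelope sketch. It is not taken from Google's work; the function name and the numbers are purely illustrative, and it assumes an object that starts straight ahead of a viewer who then steps sideways.

```python
import math


def parallax_shift_degrees(head_offset_m: float, object_distance_m: float) -> float:
    """Angle by which an object that was straight ahead appears to shift
    when the viewer moves sideways by head_offset_m (illustrative helper)."""
    return math.degrees(math.atan2(head_offset_m, object_distance_m))


# Hypothetical example: a 10 cm sideways head movement.
near = parallax_shift_degrees(0.10, 0.5)   # object half a meter away
far = parallax_shift_degrees(0.10, 50.0)   # object fifty meters away
print(f"near object shifts {near:.2f} degrees, far object shifts {far:.2f} degrees")
```

The nearby object shifts by several degrees while the distant one barely moves, which is the kind of depth-dependent change a fixed-viewpoint 360 video cannot reproduce.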

The other significant improvement is in compression and rendering of the resulting video. By reducing the video down to a small, fixed number of depth layers to represent the light field content, conventional video encoding and compression can be leveraged to deliver lightweight representations that can render easily on just about any platform. A picture is worth a thousand words, so here is a short animation showing a light field image. The point of view moves in a figure eight, and the perspectives and sight lines all change exactly as one would expect them to. The animation also briefly peeks behind the curtain, showing the color-coded depth layers that the software uses to decide what belongs where.
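As a minimal sketch of how a stack of depth layers can be turned back into a picture, the snippet below blends one pixel's layers from back to front with the standard "over" operator. This is a common way to render layered scene representations in general, not necessarily the exact pipeline Google uses; the RGBA layout, the function name, and the sample values are assumptions made for illustration.

```python
from typing import List, Tuple

# Straight (non-premultiplied) color plus alpha, every channel in [0, 1].
RGBA = Tuple[float, float, float, float]
RGB = Tuple[float, float, float]


def composite_layers(layers_back_to_front: List[RGBA]) -> RGB:
    """Blend one pixel's depth-layer samples with the "over" operator."""
    r = g = b = 0.0  # start from a black background
    for lr, lg, lb, la in layers_back_to_front:
        # Each nearer layer covers what is behind it in proportion to its alpha.
        r = lr * la + r * (1.0 - la)
        g = lg * la + g * (1.0 - la)
        b = lb * la + b * (1.0 - la)
    return (r, g, b)


# Hypothetical pixel: opaque blue far layer, half-transparent red middle
# layer, and a fully transparent (empty) near layer.
pixel = composite_layers([
    (0.0, 0.0, 1.0, 1.0),
    (1.0, 0.0, 0.0, 0.5),
    (0.0, 1.0, 0.0, 0.0),
])
print(pixel)  # (0.5, 0.0, 0.5)
```

Because each layer is just an image with transparency, a stack like this can in principle be packed into ordinary video frames, which fits the article's point about reusing conventional video compression.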

You can download the PDF of the SIGGRAPH 2020 technical paper, or browse the DeepView resource page hosted on GitHub for plenty of in-browser demos and a downloadable VR headset demo. The team’s video presentation is also embedded below, and gives an excellent overview.

Light fields don’t have to be complex affairs, and there is plenty of room for curious hackers to explore. Interested? [Alex Hornstein] shared a fascinating presentation on working with light fields in his 2018 Hackaday Superconference talk, so check it out.

13 thoughts on “Light Fields: Missing Ingredient For Immersive 3D Video Gets Improved”

  1. For still images this can be done with a single camera fixed to a rod, one end of which is in some kind of mechanical socket. So that one could scan a half sphere surface. Maybe even a string or two would do. Then glue all with this fancy software.

  2. OMG, those vids are full of artefacts. Maybe it’s me, but none of the presented videos are correct, they are all disturbing. The girl running in the cavern blends with it when she leaves. The car’s rear is blurred and deformed and so is the man’s back with the horse. I’m not even speaking of the nausea it’s inducing.

  3. As a friend of mine put it, it is not real 3D unless you can look up a Scots kilt and check if they are going commando.
    These days, with so much metadata and telemetry harvesting, you would probably not want the fact that you even attempted that stored forever.

    The above appears to be a good stepping stone away from the current stereoscopic systems, in the right direction, but it will not

  4. This is absolutely amazing. I love this. Now put 3 of these in a triangular pattern with a distance of say 15 meters each. Then you could literally walk around in a scene. Imagine this. You could walk straight up to the actors and look them in the eyes if you want. You could watch a movie countless times while having another experience through the change of perspective. Mass adoption would come through porn of course ;-)
