Light fields are a subtle but critical element of making 3D video look “real”, and they have little to do with either resolution or field of view. Meta (formerly Facebook) offers a look at a prototype VR headset that delivers light field passthrough video to the user for a more realistic view of their surroundings, and it uses a nifty lens and aperture combination to make it happen.
As humans move our eyes (or our heads, for that matter) to take in a scene, we see things from slightly different perspectives in the process. These differences are important cues for our brains to interpret our world. But when cameras capture a scene, they capture it as a flat plane, which is different in a number of important ways from the manner in which our eyes work. A big reason stereoscopic 3D video doesn’t actually look particularly real is because the information it presents lacks these subtleties.
How is this connected to VR headsets? The video passthrough feature of VR – where one sees the real world via external cameras – is increasingly understood to be an important feature, but it has limitations. Visual distortions from software processing are one, but video passthrough also suffers from the same issues that 3D videos have: it just doesn’t look real, and it doesn’t actually look 3D. This is more than just a cosmetic problem; it gets in the way of interacting with the world. That includes not just handling items but also things like walking around without bumping into corners, or going down stairs rather faster than one intended.
Light fields are the missing link to making 3D video captured by cameras look more real, and one way to capture light fields is to glue up a whole bunch of cameras. Each camera captures a scene from a slightly different perspective, and software can process the resulting data into a light field video that manages to confer all (or at least most) of the little details our brains are expecting to see.
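To make that a little more concrete, here’s a minimal sketch (our own illustration with made-up grid sizes and a single assumed focal plane, not anyone’s production pipeline) of how footage from a grid of cameras can be treated as a 4D light field and resampled into a view from an eye position that falls between the physical cameras:

```python
import numpy as np

def render_novel_view(light_field, eye_uv, baseline_m=0.02, focal_px=500.0, plane_m=2.0):
    """light_field: (U, V, H, W, 3) array from a U x V grid of cameras.
    eye_uv: desired (fractional) eye position in camera-grid coordinates.
    Crude approximation: take the nearest physical camera's frame and apply the
    disparity shift appropriate for a single assumed focal plane at plane_m."""
    U, V, H, W, _ = light_field.shape
    u = int(np.clip(round(eye_uv[0]), 0, U - 1))
    v = int(np.clip(round(eye_uv[1]), 0, V - 1))
    # Residual offset between the requested eye position and the chosen camera,
    # converted to a pixel disparity: shift = offset * baseline * f / depth.
    du, dv = eye_uv[0] - u, eye_uv[1] - v
    shift_x = int(round(du * baseline_m * focal_px / plane_m))
    shift_y = int(round(dv * baseline_m * focal_px / plane_m))
    return np.roll(light_field[u, v], shift=(shift_y, shift_x), axis=(0, 1))

# Example: a 5x5 rig of 480x640 frames, asking for a view between cameras.
rig = np.zeros((5, 5, 480, 640, 3), dtype=np.uint8)
frame = render_novel_view(rig, eye_uv=(2.3, 1.7))
```

A real light field renderer blends many per-camera rays for every output pixel instead of rolling a single frame, and that per-ray bookkeeping is exactly where those little perspective details come from.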
Meta’s light field passthrough prototype (the “Flamera” headset, about halfway down that page) takes the clever approach of using a lens array combined with apertures to create an optic that modifies how a camera sees the world, instead of using an array of cameras and processing the results. The optic looks like a sort of compound eye, allowing the headset to deliver light field passthrough video that is of remarkably higher quality than the usual options.
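The magic is in where those apertures sit. As a rough way to picture it (our own simplified geometry with invented numbers, not the published Flamera optical design): imagine a virtual eye pupil some distance behind the lenslet array, and give each lenslet a small aperture offset so that it only admits the bundle of rays that was headed for that virtual pupil.

```python
def aperture_offsets_mm(lenslet_xs_mm, f_lenslet_mm=2.0, eye_x_mm=0.0, depth_mm=30.0):
    """For a lenslet centred at x, the ray we want to keep is the one headed for
    the virtual pupil at (eye_x, -depth). A pinhole displaced by roughly
    f * tan(theta) in the lenslet's focal plane selects that arrival angle."""
    return [f_lenslet_mm * (eye_x_mm - x) / depth_mm for x in lenslet_xs_mm]

# Lenslets every 4 mm across a 40 mm strip: the required offsets grow toward the
# edges, which is part of why the assembled optic ends up looking like a compound eye.
print(aperture_offsets_mm(list(range(-20, 21, 4))))
```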
Want to know more about light fields? We’ve seen fascinating work from Google on light field video as well as a past Hackaday Superconference talk that does a great job of explaining why it’s so important, and how light fields can be approached even as a hobbyist.
I’ve never understood why everyone seems to want active video-camera-to-screen passthrough as the point of research. The best result for “pass through” would be to just let us use our eyes directly – which really wouldn’t be that hard to do. Take the AR-style projection so the VR image generation isn’t directly in the path of the eyes, and give the AR set an outer lens made of the same magic polarizing-on-demand screen as the auto-darkening welding helmets – they will go plenty dark enough to let the VR experience be all you can see, and when switched to ‘pass through’ mode will only add some tint and minor distortions to the outside world, like wearing a pair of shades…
This is a fun idea though, and I can see it being really useful for lots of things.
I think Mixed Reality is one of the big reasons for wanting this to work, since pretty much all the other requirements for a great experience are present: sub-millimeter precision motion tracking, high resolution cameras/displays, human interaction devices with sensory feedback, etc. Mixed reality has very high potential in business for things like remote meetings with a proper person instead of a flat screen, product demonstrations without disconnecting the wearer completely from the current environment, or even as a manual labour aid – a builder having a level or measuring tape “built into his eyes” instead of setting up a laser level, bricklayer’s cord, etc. and having to adjust it all the time.
It’s not quite at the sub-millimeter level yet, and the processing delays and buffer lag mean that objects shift out of place as you move. Mixed reality stuff is going to be pretty weird and motion-sickness-inducing for a long time still.
Projecting opaque images onto transparent glasses is much harder than just putting a blackout LCD into the optical path, though.
If you paste an LCD onto the surface of regular glasses, the mask image will be very blurry (try holding a finger 2cm from your eye while focusing on the other side of the room). You could put a vague shadow behind an AR image, and that’s not useless, but you couldn’t get sharp outlines.
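To put rough numbers on that (a back-of-envelope geometric estimate with an assumed 4 mm pupil, not a proper eye model): an occluder at distance d blurs over roughly pupil_diameter × |1/d − 1/d_focus| radians, so a mask 2 cm from the eye while you focus across the room smears over more than ten degrees.

```python
import math

def occluder_blur_deg(pupil_mm=4.0, d_occluder_mm=20.0, d_focus_mm=3000.0):
    # Geometric defocus: angular blur ~ pupil diameter * |1/d_occluder - 1/d_focus|.
    blur_rad = (pupil_mm / 1000) * abs(1 / (d_occluder_mm / 1000) - 1 / (d_focus_mm / 1000))
    return math.degrees(blur_rad)

print(f"{occluder_blur_deg():.1f} degrees of blur")  # roughly 11 degrees: hopeless for sharp masking
```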
If you focus the mask LCD through the same optics that project the main displays at a comfortable distance, then it won’t do anything at all; the projected displays add to the light from the real world, they can’t block it.
Speaking of focus, that’s a problem too. When you look at distant objects IRL, your eyes change their focal length, but in VR they don’t (which is part of why stereo displays are less convincing, especially with objects close to your face). In a “passive” AR headset, your eyes would have to work both ways at once, and the likely result would be discomfort, unless all the AR content is at a specific, fixed distance from your head.
And then there’s dynamic range. If there was an AR display technology that could match the brightness of a sunny day, that would mean pointing a device powerful enough to light fires directly into your eyes, so maybe don’t open image attachments from anyone you don’t trust.
The pass-through camera approach is complicated and unsatisfying, but for now it’s the only way “real” AR is feasible.
(If you could make LCD panels with VERY high resolution, though, then true digital holography becomes an option, and that could open up lots of possibilities)
You would focus on the AR or the room selectively, in the same way you do when walking around reading a book (or, for the more smartphone-addicted, social media) – focused close most of the time, with snaps to a longer focus when you hear or see something in your peripheral vision that might demand attention.
And I’m not talking about pasting an LCD onto the surface of regular glasses to create the image at all! In most AR systems you project onto a semi-reflective-coated and carefully shaped lens – a lens you can see through pretty much normally – and the optics required to bring that reflected AR display to a sensible focal distance for your eye are not in the eye’s path.
So yes, the AR still has to add brightness for you to see it over the world behind, but that is fine for many AR uses. BUT sometimes you really want to block out the world. So if you coat the outside of those lenses with what is effectively a single pixel of LCD that just blocks out the outside world – as done in the auto-darkening welding helmets – suddenly you are not adding light on top of the real world; you’ve blocked the real world out so thoroughly that the semi-reflective AR lens passes so little light from outside that all you see is the ‘VR’ display.
Oh I see what you mean. Yeah, you could certainly have a Google Glass-type headset that switches to full VR.
And if it’s just a “flat” HUD then the focus isn’t a problem, because you can choose a (fixed) focal distance that matches the stereo separation. But if the virtual image plane is 2m from your head, and you look at a Pokemon on the other side of the street, then its surroundings will be blurry, because your eyes are still focused at 2m. This is one reason why you can’t just turn any movie into 3D by closing one eye.
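The same sort of geometric estimate puts numbers on that too (again with an assumed 4 mm pupil, purely illustrative): with the eyes accommodated at the 2 m image plane, something 20 m away is defocused by about a tenth of a degree – small, but several times the eye’s roughly one-arcminute acuity, so the street really does go soft.

```python
import math

pupil_m = 0.004            # assumed pupil diameter
focus_m, object_m = 2.0, 20.0
# Geometric defocus: angular blur ~ pupil diameter * |1/d_object - 1/d_focus|.
blur_rad = pupil_m * abs(1 / object_m - 1 / focus_m)
print(math.degrees(blur_rad) * 60, "arcminutes of defocus blur")  # ~6 arcmin
```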
I guess if you had optics that could rapidly change the focal distance of the image plane, combined with eye tracking, this might be fixable in software. That would probably make plain VR feel better, too. But I assume the optics would be bulky and expensive.
You’re missing the forest for the trees
“The optic looks like a sort of compound eye, allowing the headset to deliver light field passthrough video that is of remarkably higher quality than the usual options.”
Sort of.
https://engineeringcommunity.nature.com/posts/43439-achromatic-metalens-array-for-full-colour-light-field-imaging
Hmm… looks like something LYTRO did a few years ago, already. Their approach was to put the lens array directly onto the imaging sensor.
Both designs suffer from vastly reduced resolution, requiring sensors with an even finer pixel pitch, which in turn hurts low-light performance and increases image noise. The whole contraption sets image sensors back at least a couple of generations, I suppose.
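The resolution hit is easy to put rough numbers on (illustrative figures, not the specs of either design): with an N×N pixel patch behind every lenslet, spatial resolution drops by N².

```python
def effective_megapixels(sensor_mp, pixels_per_lenslet_side):
    # Each lenslet trades an N x N patch of sensor pixels for angular samples,
    # so the per-view spatial resolution falls by that factor.
    return sensor_mp / pixels_per_lenslet_side ** 2

print(effective_megapixels(48, 5))  # a 48 MP sensor yields under 2 MP per reconstructed view
```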
I am not saying the idea is bad, it’s pretty cool in fact, but the drawbacks should be pointed out. Hopefully the idea does not again go down the LYTRO drain.
I am sure that AI-based methods will be picking up pretty fast in this field, since pattern recognition is AI’s strong suit.
They’re both light-field capture systems, so they have similarities… but then that’s much the same as saying every phone looks like something Apple did a few years ago, already.
Total capture resolution here is honestly irrelevant, because we know that is just going to keep improving anyway. The point here is to get the projection correct, which just isn’t possible with current approaches.
Low-light performance isn’t seriously worse at higher densities nowadays, because there isn’t that much area being wasted on the sensor anymore, even at the stupidly high pixel densities of cell phone sensors. Not that low light is amazing now, but it’s better, and pitch isn’t the biggest limiting factor.
I really think we’re overdue for some light-field tech being used in cameras, even if it’s mostly just for aiding focusing or the like. More than Canon’s dual-pixel stuff.
In fairness, the typical passthrough camera system isn’t particularly high resolution, so there’s plenty of scope for choosing a higher-res, slightly higher quality sensor to compensate for the loss of effective resolution from the lens array.
In lens design parlance, the apparent location of the lens aperture is the ‘entrance pupil’ (what panorama shooters often loosely call the ‘nodal point’). This jumble of optics moves the entrance pupil of the optical system to some distance behind the physical lens, ideally to coincide with the entrance pupil of the eyeball. It does this at the cost of assembling and aligning many fiddly little lenses, going through a complicated post-assembly calibration step, requiring fairly sophisticated processing to reassemble a complete image at the end, and losing most of the resolution of the camera array in the process.
The irony is that many very decent lenses currently available for conventional cameras *already* have an entrance pupil behind the lens, and some even have it outside (behind) the camera body itself! No special calibration needed, and the full resolution of the sensor is preserved. Folks doing multi-image panoramas have a very strong interest in knowing where the entrance pupil is (since it is the apparent location of the camera), so there are databases and listings of entrance pupil locations for various lenses.
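For anyone who wants to see where that pupil lands, here’s the textbook thin-lens version (a simplification with made-up focal lengths – a real multi-element lens needs the full prescription): the entrance pupil is the image of the aperture stop formed by the glass in front of it, so for a single thin lens of focal length f with the stop a distance d behind it, the pupil sits at s where 1/s = 1/f − 1/d.

```python
def entrance_pupil_mm(f_mm, stop_behind_lens_mm):
    """Image the aperture stop back through a single thin lens: 1/s = 1/f - 1/d.
    Positive s is in front of the lens (toward the scene); negative means the
    apparent pupil is virtual and sits behind the lens, as described above."""
    inv = 1 / f_mm - 1 / stop_behind_lens_mm
    return float("inf") if inv == 0 else 1 / inv

print(entrance_pupil_mm(50, 25))   # -50.0: pupil 50 mm behind the lens
print(entrance_pupil_mm(50, 50))   # inf: stop at the focal plane, telecentric
print(entrance_pupil_mm(50, 100))  # 100.0: pupil out in front of the lens
```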
It took me a moment to realize that all they built is a low profile light field camera lens assembly.
There is no “light field pass-through”, just a stereo view reproduced from that light field.
What also sticks out is that the array is mounted in portrait orientation – unlikely to support a wide field of view. In the limit, the lens array should eventually wrap around the corner with multiple sensors, or become something insect-eye-like.
Here’s a fun project where they printed lenses and baffles directly onto image sensors, in case you were wondering how the lens array optics could be further miniaturized.
https://www.ito.uni-stuttgart.de/en/research/group-ods/printoptics/