ZeroNVS is one of those research projects that is rather more impressive than it may look at first glance. On one hand, the 3D reconstructions (we urge you to click that first link to see them) look a bit grainy and imperfect. On the other hand, each one was reconstructed from nothing more than a single still image.
How is this done? It’s NeRF (neural radiance fields) technology, which leverages machine learning, but with yet another new twist. Existing methods mainly focus on single objects against masked backgrounds, but this new approach works on a variety of complex, in-the-wild images without the need to train new models for each scene.
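To give a rough idea of the NeRF half of the equation, here is a minimal sketch of the classic volume-rendering step in Python. Everything in it is illustrative: the toy_radiance_field function is a made-up stand-in for a trained network, and none of ZeroNVS’s actual code or its single-image diffusion prior is shown.

```python
# Minimal sketch of NeRF-style volume rendering: a field maps a 3D point to a
# density and a color, and an image is formed by alpha-compositing those
# values along each camera ray. The field below is a toy analytic function,
# NOT ZeroNVS's trained model.
import numpy as np

def toy_radiance_field(points):
    """Stand-in for the learned network: returns (density, rgb) per 3D point."""
    # A fuzzy sphere of radius 1 around the origin, colored by position.
    r = np.linalg.norm(points, axis=-1, keepdims=True)
    density = 10.0 * np.exp(-4.0 * (r - 1.0) ** 2).squeeze(-1)
    rgb = 0.5 + 0.5 * np.tanh(points)            # arbitrary smooth coloring
    return density, rgb

def render_ray(origin, direction, near=0.5, far=3.0, n_samples=64):
    """Alpha-composite samples along one camera ray (the standard NeRF recipe)."""
    t = np.linspace(near, far, n_samples)         # sample depths along the ray
    points = origin + t[:, None] * direction      # 3D sample positions
    sigma, rgb = toy_radiance_field(points)       # query the field
    delta = np.append(np.diff(t), 1e10)           # spacing between samples
    alpha = 1.0 - np.exp(-sigma * delta)          # opacity of each sample
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # transmittance
    weights = alpha * trans
    return (weights[:, None] * rgb).sum(axis=0)   # composited RGB for this ray

# One ray looking toward the origin from z = -2.5:
color = render_ray(np.array([0.0, 0.0, -2.5]), np.array([0.0, 0.0, 1.0]))
print(color)
```

A real system runs something like this for every pixel, over millions of rays, during both training and rendering, which is part of why the hardware requirements are so steep.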
There are a ton of sample outputs on the project summary page that are worth a browse if you find this sort of thing at all interesting. Some of the 360-degree reconstructions look rough, some are impressive, and some are a bit amusing. For example, indoor shots tend to reconstruct rooms that look good but lack doorways.
There is a research paper for those seeking additional details and a GitHub repository for the code, but the implementation requires some significant hardware.
Awesome, looks like the braindance scenes from CP2077
Just watched about half of “CYBERPUNK 2077 V First Time BD BRAINDANCE Scene HD” over on YouTube.
I hadn’t stumbled across this yet.
This looks to have a number of sci-fi bits that I like for late-night watching.
Spiritplumber, thanks for the mention of it!
Nice to see how the bicycle looks submerged inside a block of grass.
It also thinks a single motorcycle front fork leg ought to be enough.
Pretty darn impressive, though.
It works for Vespa,
B^)
That’s really impressive. If they can get a dataset small enough to run standalone, it’d be invaluable for machine vision systems — it’d let the machine apply some degree of common sense to what it’s seeing. For example, answering questions like ‘I can’t see behind this object. Is it likely to be something I can drive around?’ (It’d obviously have to _check_ once it can see behind the object, but it’d still be very useful to have.)
I imagine humans do better because we have a more in-depth knowledge of “bicycles” to apply to an incomplete image.
Holy crap, they made the camera-rotation scene from “Enemy of the State” real.
Exactly what I was thinking.
The only ones that could ever have a use for this are the news (US news), where the journalists can act like idiots who think it’s real, to be picked up by a particularly dumb segment of the audience.
Looks cool, but sadly you need a $6600 GPU to play with it.
With lots of vegetation, the bicycle wheels become four “vines” from the back side instead of getting clearer.
Didn’t Hackaday cover this before?
Yes! I thought so too, but I couldn’t find it.
I remember 360° images were a thing in the 90s, especially on Macs with QuickTime.
Our school also had a “360°” photo tour made of still images.
Oh well, so many cool things in the 90s (to early 2000s).
Then social media came along and everything got so soulless, commercial and bleak.
And CSS and JavaScript replaced all the funny, hand-made HTML sites.
But I’m getting off-topic…
Pictures/images… 3D (anaglyph) images were a thing in the 90s, too, especially the cheap red/cyan ones. Stereoscopic images hidden in dot patterns (autostereograms) were a thing, too.
Great! Now I can convert all my old “classic” 2D movies to modern holos!
The bicycle is a great example of how “AI” still doesn’t actually *understand* anything: any human looking at that image would know pretty well what a bicycle should look like and how the reconstruction should come out.