Rendering a 3D environment from Kinect video

posted Nov 15th 2010 12:00pm by
filed under: Kinect hacks, video hacks

[Oliver Kreylos] is using an Xbox Kinect to render 3D environments from real-time video. In other words, he takes the video feed from the Kinect and runs it through some C++ software he wrote, which places each pixel in a 3D space that can be manipulated as it plays back. The image above is the result of the Kinect recording [Oliver] from his right side, with the viewer’s playback perspective moved to be above and in front of him. Part of his body is missing and there is a black shadow because the camera cannot see these areas from where it sits. This is very similar to the real-time 3D scanning we’ve seen in the past, but the hardware and software combination makes this a snap to reproduce. Get the source code from his page linked at the top and don’t miss his demo video after the break.

[Thanks Peter]
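
For readers wondering what placing depth pixels in a 3D space might involve, here is a minimal Python sketch. It is not [Oliver]'s C++ code. It assumes a simple pinhole camera model with depth already converted to metres; the intrinsics FX, FY, CX, CY are made-up placeholder values rather than calibrated Kinect parameters, and both function names are hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]

# Hypothetical pinhole intrinsics (focal lengths and image centre, in pixels).
# Placeholder values for illustration only, not a real Kinect calibration.
FX = 580.0
FY = 580.0
CX = 320.0
CY = 240.0


def backproject(u: int, v: int, depth_m: float) -> Optional[Point]:
    """Turn pixel (u, v) with a depth in metres into a 3D point.

    Returns None when the sensor reported no depth, which is what
    leaves the black "shadow" areas in the rendered scene.
    """
    if depth_m <= 0.0:
        return None
    x = (u - CX) * depth_m / FX
    y = (v - CY) * depth_m / FY
    return (x, y, depth_m)


def depth_image_to_points(depth: List[List[float]]) -> List[Point]:
    """Back-project every valid pixel of a row-major depth image."""
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            p = backproject(u, v, d)
            if p is not None:
                points.append(p)
    return points
```

Once every pixel has an (x, y, z) position, the scene can be drawn from any virtual viewpoint, which is presumably why areas the camera never saw show up as holes.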



76 Responses to Rendering a 3D environment from Kinect video

  • Michael says:

    I wonder if you took 3-4 of these and synchronized them, putting one on each wall of a room, you could get a higher quality environment.

  • Hmmmm, this would be interesting to see with TWO Kinects, that’d fill all the blanks, right?

  • I love it!!!! Can you take 3 of these and point them at the center of the room so as to build a complete 3D image without shadows???

  • mixadj says:

    That’s awesome…

  • this is probably one of the cooler Kinect hacks I’ve seen… I was apprehensive about the device at first, but now I want one just to fool around with.

  • spyder_21 says:

    It’s a nice start in the right direction. I might buy one soon if cool stuff like this comes out. I would not buy it to play stupid Kinect games.

  • xeracy says:

    @Michael Bradley – 3x Kinect == Cheap Mocap? I’m looking at getting one for some form of live production visuals.

  • This is incredible. The Kinect is going to open up a lot of avenues of research.

  • Garak says:

    Really really cool…

    Will using more than one kinect sensors work? My understanding is that it projects a “grid” of IR dots over its field of view and uses them for the measurement. Will intersecting grids confuse the sensors?

  • Roon says:

    I really want to see someone do something with 4 of these, you could do so much…

  • xeracy says:

    @Garak – I imagine this could be done by quickly turning the dots on/off and sampling them in a continuous cycle.

  • turn.self.off says:

    ok, how long until someone builds a esper machine out of all this?!

  • @xeracy, I think so, and this guy did a great job. I am impressed with how when he rotates it, how much information is available around the corners, ie: the front of his face, when the cam is to the side.

    When he rotated, I had a flashback to The Matrix, the first scene when the girl is in the air, and all stops, the camera rotates, and she continues. Just imagine, that was done with several still cameras all positioned, etc…. With this, just freeze, rotate, and continue!!!

  • Drake says:

    Next stop 3D PORN!

    Anywho …

    Would interfacing the kinect with a wii be HaD worthy?

    I may have a crack at it later this week if so …

  • xeracy says:

    @Michael Bradley – and the kicker? ITS ALL REAL TIME! I really wish i was skilled enough to do this on my own.

  • macpod says:

    Wow, I did not realize the system was that precise. I thought in situations as this that the depth stepping increments would be closer to a foot or so in distance if not more.

    Judging from the coffee mug and torso shots however, it seems the distance granularity is much smaller! Now I want one

  • IssacBinary says:

    Just use real time photoshop content aware fill to fill in the gaps ;)

  • Oren Beck says:

    Hopefully, someone will write a “Parser” environment for Kinect data. To make files that could somehow be rendered into Skeinforge Etc parameter/object details.

    Enlisting the commercial solid print bureaus like oh-Shapeways and their peers in a scheme of “printed object credits for prize funding” might kickstart the ideas.

    It would be way cool to have 3D busts of my Grandkids..

  • Filespace says:

    i would like to see this guy manipulate a virtual object in real-time if even a ball perhaps

  • jc says:

    I’m wondering… what would it look like if you added a mirror in the Kinect’s field of vision?

  • Whatnot says:

    Sorry guys but you likely can’t do more than one at the same time, it projects a IR pattern as part of gathering data, that would interfere and fail with more than one in a room.

  • Sci says:

    Now someone just needs to hook this up to a cheap 3D display for live holographic video calls.
    My bets are on something like the DLP projector + spinning mirror combo. Or possibly a spinning LCD if someone can get all the power & signal connections to it.

  • rizla says:

    In response to the question of whether it can use more than one Kinect: if you changed the frequency of the IR, would you be able to incorporate more Kinects without having them step on each other?

  • TheZ says:

    Quick Thought: You can use different IR wavelengths. You would have to replace the IR LEDs. And code how to detect them.

    >Sorry guys but you likely can’t do more than one at the same time, it projects a IR pattern as part of gathering data, that would interfere and fail with more than one in a room.

  • chbab says:

    It seems that Your Shape: Fitness Evolved did a huge amount of work in this direction for its player projection, as its silhouette is very clean compared to what you see here … It opens the door to augmented reality stuff with a cheap device :)

  • Removed says:

    kind of reminds me of that software from movie Deja Vu

  • rasz says:

    IR projector in Kinect means you cant use more than one at the same time. You can sync them like ToF cameras.

    But you could use few more normal cameras and use Kinect depth info to reconstruct/simulate/cheat the whole scene.
    There are algorithms that reconstruct 3D scene from ONE video feed http://www.avntk.com/3Dfromfmv.htm
    Having few at different angles + one with 3D data should speed things up.

  • da13ro says:

    Not sure if the kinect requires any reference points for calibration – but instead of different wavelengths (which I imagine would be difficult/impractical) couldn’t you setup a shutter system, solid state. Block the IR of other units, sample data on one and cycle. Would slow down your available refresh rate.

    Sweet hack mate, very impressed.

  • rasz says:

    ^^^CANT sync them like ToF cameras.

    I like the idea about different IR wavelengths, but I think the Kinect uses a laser instead of an LED.

    I guess you could use two Kinects directly in front of each other, just making sure the IR dots don’t end up in each other’s cameras – that would give you almost 90% of the 3D and texture data.

  • cornelius785 says:

    @TheZ

    I don’t know about different wavelengths for multiple Kinects. A lot of it depends on whether the Kinect can differentiate between different wavelengths AND on being able to hack the firmware to do stuff appropriately. Hasn’t all the hacking been on the computer side, just controlling it and getting useful information back? I’d either go with very narrow filters or synchronize all the Kinects together with some multiplexing. If it is possible, I’m sure someone will figure it out.

    Wasn’t it in Hitch Hiker’s Guide to the galaxy that they mention the progression of user interfaces as: physical button -> touch interface -> wave hand and hope it works? Isn’t the third stage upon now? I’m wondering how long I have to wait before I can control my Mythtv box with hand gestures in the air.

  • aarong11 says:

    Hmm, instead of replacing the IR LEDs, wouldn’t it be possible to place IR filters in front of both the LED and Detector of different Kinects? That way, each Kinect should only detect the wavelength of IR light it was emitting.

  • sarrin32 says:

    you could put on a gimp suit where each joint is a different colour (forearms, hands, thighs etc). The computer could use the colour coding to identify each joint, do measurements etc…then it could do motion capture….add the motion capture to a real time or post calculated 3d scene with digital actors….hurray. How long till we get kinect to bvh converters?

  • Eamon says:

    The trick would be to calculate SIFT points on some frames, and use those to track objects as they move. This is the basic mechanism behind current reconstruction techniques, whether they use one or two cameras. The depth map would improve the fidelity of the representation, and should provide shortcuts that would let this run faster.

  • blue carbuncle says:

    Keep up the good work everyone! Kinect is coming along nicely :)

  • Mike says:

    Outstanding work…

  • jim says:

    I swear I’ve imagined doing this for years, and how cool the glitches and shadows in some set ups would look.

  • Colecago says:

    That is pretty amazing. Picture does not do it justice, video is awesome.

  • qn4 says:

    In regards to those arguing against using multiple Kinects at once, one could consider putting something along the lines of the ‘shutter glasses’ (used for many of the current ’3D’ displays) over the IR projectors, and dropping the (depth) frames not associated with the ‘currently projecting’ Kinect. I’m sure that a bit of crafty software design could interpolate the two 15Hz (normally 30Hz IIRC) streams fairly well, too.
    Better yet if the exposure time is less than 1/60th of a second (30Hz/2) and the sync can be intentionally offset…

  • NatureTM says:

    After seeing this, I’m absolutely getting one. Very cool.

  • qn4 says:

    da13ro beat me to it… I really have to refresh the page sometimes before posting things. Still, this thing is full of awesome capabilities, and it’s great to see that so many skilled people are making use of it.

  • Martin says:

    Could you polarise the IR coming from two kinects at 90 degrees to each other. Then use filters on the cameras to block the other set.

  • macw says:

    polarization would work just fine provided that there was enough light remaining after the fact for the camera to work properly.

    So would strobing them on and off alternately — it’s a very common procedure when you have multiple sensors operating on the same band (ultrasonic distance sensors are a notable case, since there’s not much ability to reject returns).

    Filtering for wavelength might work, provided that you had physical bandpass filters on the camera. The depth sensor is monochrome and would react basically the same to any frequency it’s responsive to (different brightness but because it’s an uncontrolled environment you can’t rely on that to differentiate two sources).

    I would guess that polarization would be the cheapest and quickest to implement, with bandpass filters being not that much more complex. Time-division multiplexing would only be a good idea where you absolutely cannot modify the kinect hardware in any way for some reason…otherwise it’s just a waste of effort.

    I do really want to see what happens if you put a mirror in the path, though. I’m imagining a “window” in the feed through which you can look and see the other side of your room, just as if the mirror were actually a window into an alternate dimension :P

  • Torwag says:

    About the IR grid stuff: I would assume they modulate the IR at some frequency to avoid the influence of other IR emitters. Either that, or one has to modulate the IR oneself. After that it would be relatively easy to use several units simultaneously by giving each of them a different modulation frequency and using an electrical filter or an FFT algorithm to isolate the individual frequencies from each other.

    No need for different wavelength, optical filters, etc.

  • TR says:

    It’s unlikely that they modulate the IR output (as in pulse the laser/LED, whichever it is) because the camera would have to be able to capture at least that fast. >120Hz for US incandescent lights. So, a camera that captures video at greater than say 300-400 FPS to adequately figure out what’s noise and what’s signal. Doubt they used anything like that. I think polarization would be the best bet without opening up the Kinect. Would try it if I had another Kinect and time…. Maybe someone can try using two pairs of the free 3D theater glasses. One glasses lens for each projector and depth camera.

  • curious says:

    Could you hook up multiple kinects to capture the other angles of the room and have a full 3d map?

  • I just looked at this guy’s YouTube page, OMG, this guy is on it! I thought I was fast with code (only uControllers), this guy rocks! He did some augmented VR, addressed the mirror question, etc..

  • qwed88 says:

    I’m not wanting live video like that, but if I could fix a Kinect to a rotating base and scan environments in 3D! That would be functional; I could use this.

    Or software written to take slices so an object could be rotated in front of it and scanned. This would make a relatively cheap 3d scanner.

    Seems as if one could use this with a program like Zbrush to sculpt with your hands.

  • qwed88 says:

    As a follow up to my comment I just saw another video of his, with a cg model of a creature sitting on his desk moving and all in real time. He’s really not that far from the Zbrush idea.

    Imagine a Kinect above your monitor as you’re sculpting the model on the monitor with your hands.

  • EquinoXe says:

    re 2 kinect: in theory should be easy.
    get 4 linear polarization filters (2 for each kinect).
    Polarize kinect 1 @ 135º and kinect 2 @ 45º
    (place a polarizing filter on depth cam as well as upon IR source)
    now both kinects can’t see each other but they can see their own beam.

  • yeah says:

    @EquinoXe

    Polarization is normally not sustained under diffuse reflection, so even though the light would be polarized, the light coming back from the scene wouldn’t. I guess a shutter system as suggested above could do the trick, but then you’d have to hope that the kinect doesn’t use any type of temporal coherence.

  • Mr Hacker says:

    @EquinoXe
    They could see each other; in fact, the beam is not blocked from confusing both of them. Instead, I suggest using the driver: try aligning both of them so their fields of view do not have intersecting IR planes.

  • Mr Hacker says:

    Wait a second, take the Kinect’s depth sensor and fit it on cars, voilà – instant crash-proof car from the future.

  • Mr Hacker says:

    @garak
    http://www.wired.com/images_blogs/gadgetlab/2010/11/Canesta-howitworks1.jpg
    Read it. Kinect doesn’t use dots, and I think my previous statement is incorrect, as it uses a plane of infrared, so this intersection might not be a big problem.

  • yeah says:

    @Mr Hacker

    That image turned out to be a bit of a lie/simplification. It actually does use infrared dots, and it does not use ‘time of flight’, as that image suggests. To get an idea what the dots look like, have a look at http://www.dailyvsvidz.com/2010/11/kinect-infrared-projection-dots-vs.html

  • Tom says:

    @All
    - Bandpass filters on the diode and detector would theoretically work, but would SEVERELY cut down on the signal-to-noise ratio. You have to select the filters very carefully (the set with the widest non-overlapping bandwidth possible, above visible, but less than the cutoff of the (presumably) silicon detector), then crunch the math on the power loss and possibly find a substantially more powerful diode.

    - Gating the signal with a shutter might work, provided the SNR was high enough and the software didn’t tweak out with the dots disappearing at the duty cycle of the shutter. (e.g. some automatic camera exposure algorithm, or the location processing algorithm, etc.)

    - Polarization would NOT work. The majority of reflected light would not maintain polarization.

    @yeah
    - There are no coherence issues here that I can think of (temporal or spatial). AFAIK this system works with incoherent light. Furthermore, gating the signal does not change the temporal coherence properties of the source.

    IMHO, the bandpass filters are your best bet for multiplexing multiple units. However, it requires careful selection of the filters, and possibly a more powerful source.

  • Necromant says:

    Woah… Looks like I’m gonna get one… as soon as the rush clears and prices become reasonable…

  • anon says:

    the next step i would take is just get it to remember the sides of objects it can’t see anymore.

  • yorak says:

    Could one simple, super-cheap option for more complete 3D scanning be using a mirror? You would probably need some matrix algebra magic to do the necessary transformations for the 3D scene seen through the mirror, but it could be possible if you can take the hit from decreased accuracy. Also, the reflected dots from the mirror may interfere with the front-side scanning.

  • Whatnot says:

    So about that kinect IR, that’s a laser right? You could not possibly get that dot pattern over a large area from a LED could you? I’m not sure but it just makes no sense to me for it to be a LED, even though there are some pretty powerful IR LED, but with that pattern and the surface it covers and the seemingly uniformity of intensity (in the youtubes I saw) and all.

  • Whatnot says:

    Oh and the dots remain the same size as I recall? A non-laser source would have them be larger farther back would it not? And out of focus.

  • fdsfdsf says:

    If more kinects will be used, actor’s body may be projected into 3d environment (into video game, for example). If that’s combined with a 3d helmet…

  • Whatnot says:

    @fdsfdsf This is about hacking it, this is hackaday, this isn’t about microsoft’s plans, although if they made a multi-kinect setup people could hack that too I guess.

    If you want to talk about the kinect as used on the xbox, here’s a fun link:
    http://www.engadget.com/2010/11/15/microsoft-exec-caught-in-privacy-snafu-says-kinect-might-tailor/

  • Cpt.Soda says:

    I imagine some geometry could be extrapolated from earlier frames. Especially for things that don’t move, like the background, it should be doable. That way you wouldn’t have to solve the problem of using multiple Kinects and could at least reduce the shadowing a bit, depending on the footage. (3D scanning a room by moving the Kinect?)

    Also different colors for the dots (multiple kinect setup) and a color filter for the sensors would be nice. I don’t know how well that works in non visible light (are there even filters for 3 different colors in non visible light? Are these in the used spectrum?)

    The possibilities of this technology seem to be endless.

  • Einomies says:

    The most obvious and immediate application of this technology would be to fix webcams.

    Because the camera usually sits on top of the screen, and you are presumably looking at the middle of the screen where the picture is.

    This has the unintended consequence of making people look down all the time when they’re actually trying to look into your eyes. It’s like they’re peeking at your boobs all the time.

    If you’d automatically shift the camera to the point where the user appears to be looking, video conferencing would feel much more natural.

  • Willrandship says:

    Well, the more kinects the better. You’d probably want at least four (in a pyramid shape) but 6 would be easier to set up (like walls of a cube) and would give better picture output.

    I want a kinect now :)

  • DarkFader says:

    I’ve read somewhere that the IR cameras only work @ 30 fps. A bit low imho. But perhaps easier to time divide if you can hack it a bit and synchronize. And only if the chip doesn’t freak out.

  • Joshua says:

    I would love to see how accurate this is. A 3d scanner with code to interface into solidworks and make solid models would be AMAZING. Maybe one camera waved around at different angles and compared to previous versions of the same item.

    Faro arms are expensive :P

  • justme says:

    I don’t think the resolution of the ir camera is good enough for a detailed real time 3d scanner that can compare to what david laser scanner can do as far as fine detail goes. You are going to need to mod a high res camera to see the kinects ir light pattern to get better detail in the scans I think.

  • kf says:

    What if you simply placed two Kinects on either side of the room facing directly at each other? The pattern from one would be in the shadow of the pattern from the other. A dot from one might land directly on the other’s sensor, but if the pattern stays consistent you’d only need to adjust the positions slightly. You’d have a black vertical band around everything, but it would still be twice as good as with one.

  • Nathan says:

    You know this could be really good for video games.
    (I know it is but they have been doing so many cool thing it makes you forget)

  • Ozzy_Coff says:

    This might be a great brainwave or could be flawed, so please criticise and share your view of this concept… Anyway, the device projects a “grid” of IR dots over its field of view and uses them for the measurement. Therefore we are thinking that having more than one will cause the individual intersecting grids to confuse the sensors, right? Could we sync both grids into one uniform grid? From this uniform grid, each camera could take its individual perception of the grid, and these could then be pieced together to render the 3D environment.

  • CutThroughStuffGuy says:

    Solving the IR interference problem is easy. Just set up 4 of these and modify 3 of them to put off UV, x-ray and gamma radiation instead of IR.

    Oh wait… you want to scan people? Hmm.

    Red, Blue, Yellow dots + IR?

  • th0mas says:

    This + 3d printer + work = printable action figures of yourself

  • CJ says:

    Has anyone made an object that can be imported into 3d software such as 3ds max?

    Thanks!

    CJ

  • Robot says:

    Has anyone noted that Mr Kreylos has a web page of interesting projects?

    Anyway, I noted that he got two Kinect units working together. Scroll to the bottom of the linked page to see more: http://idav.ucdavis.edu/~okreylos/ResDev/Kinect/index.html

    - Robot
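
Several commenters above (jc, macw, yorak) ask what a mirror in the Kinect's field of view would do, and yorak guesses that some matrix algebra would be needed. As a rough illustration of that step only, here is a Python sketch that reflects a point across a mirror plane, so that geometry seen "through" the mirror can be moved back to where it really is. It assumes the mirror plane is already known in camera coordinates as n · x = d; the function name and the example numbers are hypothetical, and nothing here says whether the depth sensor actually returns usable data off a mirror.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def reflect_point(p: Vec3, normal: Vec3, d: float) -> Vec3:
    """Reflect point p across the plane normal . x = d.

    The normal does not need to be unit length; it is normalised here,
    and d is rescaled so that the plane itself stays the same.
    """
    length = math.sqrt(normal[0] ** 2 + normal[1] ** 2 + normal[2] ** 2)
    if length == 0.0:
        raise ValueError("mirror normal must be non-zero")
    nx, ny, nz = normal[0] / length, normal[1] / length, normal[2] / length
    d_unit = d / length

    # Signed distance from the point to the mirror plane.
    dist = p[0] * nx + p[1] * ny + p[2] * nz - d_unit

    # Move the point twice that distance back through the plane.
    return (p[0] - 2.0 * dist * nx,
            p[1] - 2.0 * dist * ny,
            p[2] - 2.0 * dist * nz)


# Hypothetical example: a mirror lying in the plane z = 2 (metres).
# A point that appears 1 m behind the mirror is really 1 m in front of it.
virtual_point = (1.0, 0.5, 3.0)
real_point = reflect_point(virtual_point, (0.0, 0.0, 1.0), 2.0)
```

Applying the same function twice returns the original point, which is a quick sanity check that the plane parameters are being used consistently.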
