Web-enabled Kinect

There are Kinect hacks out there for robot vision, 3D scanners, and even pseudo-LIDAR setups. Until now, one limiting factor in these builds has been the need for a full-blown computer on the device to handle the depth maps and do all the necessary processing and computation. That's much less of a problem now that [wizgrav] has published Intrael, an HTTP interface for the Kinect.

[Eleftherios] caught up with [wizgrav] at his local hackerspace, where he gave a short tutorial on Intrael. [wizgrav]'s project provides each frame from the Kinect over HTTP, wrapped up in JSON arrays. Everything a Kinect outputs aside from sound is now easily available over the Internet.
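As a rough illustration of what consuming such a feed might look like, here is a minimal Python sketch. The payload shape and field names (x, y, w, h, z) are assumptions made for illustration, not Intrael's documented output format:

```python
import json

# Hypothetical payload: one JSON array per frame, one object per tracked
# blob. The field names here are illustrative assumptions, not what
# Intrael actually emits.
sample_frame = ('[{"x": 120, "y": 80, "w": 64, "h": 96, "z": 1450},'
                ' {"x": 300, "y": 150, "w": 40, "h": 60, "z": 2100}]')

blobs = json.loads(sample_frame)
nearest = min(blobs, key=lambda b: b["z"])  # smallest depth value = closest blob
print(len(blobs), nearest["x"], nearest["z"])  # → 2 120 1450
```

In a real setup the JSON string would come from an HTTP GET against the Intrael server rather than a literal.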

The project is meant to put computer vision outside the realm of desktops and robotic laptops and into the web. [wizgrav] has a few ideas on what his project can be used for, such as smart security cameras and all kinds of interactive surfaces.

After the break, check out the Intrael primer [wizgrav] demonstrated (it’s Greek to us, but there are subtitles), and a few demos of what Intrael ‘sees.’

Comments

  1. Matt says:

    So now we’re wrapping video in JSON in HTTP?
    What is the world coming to…

    Seriously, give the JSON a rest, folks!
    I don’t think your HTTP server is going to be processing video in real time, so there’s no reason not to dump raw frames into a TCP stream.

    • Matt says:

      Correction: Shouldn’t have trusted the hackaday summary.
      They’re shipping analyzed data from the frame as JSON over HTTP.

      So it’s maybe even slightly useful.
      But will most certainly require a computer to do the heavy lifting on the Kinect side. Not that that’s a bad thing.

      Just a bad summary here.

      • wizgrav says:

        The computer needed for the heavy lifting should be no more than an ARM Cortex-A8 @ 1 GHz. The client-server approach also has the advantage of decoupling the box that handles the Kinect from the one handling the output. It’s been tested and works great over WiFi, which comes in very handy since the Kinect can track up to 10 m away; USB cable length is no longer an issue.

  2. Josh says:

    What’s the difference between this and the work already done with the 6th sense computer? Not much, IMO.

  3. Chris Allick says:

    Here is the same thing done with processing and regular video:

    http://badankles.com/?p=209

    • wizgrav says:

      Not quite; I think Matt was right when he said that the summary is misleading. The images are not base64 encoded. They’re encoded as JPEGs and presented as a stream through an image tag, using a technique called MJPEG over HTTP. You can read about it here:

      http://en.wikipedia.org/wiki/Motion_JPEG

      The MJPEG stream and the one that delivers the data from the blob tracking are served from separate paths on the server. I like what you did though; I opted for lossy transmission for practical reasons. I also experimented with WebSockets for the JSON data delivery but got fed up with the protocol changing all the time. Intrael (optionally) supports Server-Sent Events, which is basically a one-way WebSocket; maybe that would be of interest to you as well.

      http://en.wikipedia.org/wiki/Server-sent_events
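For reference, the Server-Sent Events wire format [wizgrav] mentions is simple enough to parse by hand. This is a minimal, illustrative Python sketch of the text/event-stream framing (not Intrael code), covering only the `event:` and `data:` fields:

```python
def parse_sse(stream_text):
    """Parse a text/event-stream string into (event, data) pairs.

    Each event is a run of "field: value" lines terminated by a blank
    line; consecutive data: lines are joined with newlines. This handles
    only the event: and data: fields, which is enough for illustration.
    """
    events, data_lines, event_type = [], [], "message"
    for line in stream_text.splitlines():
        if line == "":
            if data_lines:  # a blank line dispatches the accumulated event
                events.append((event_type, "\n".join(data_lines)))
            data_lines, event_type = [], "message"
        elif line.startswith("data:"):
            data_lines.append(line[5:].lstrip())
        elif line.startswith("event:"):
            event_type = line[6:].lstrip()
    return events

raw = "event: frame\ndata: hello\n\n"
print(parse_sse(raw))  # → [('frame', 'hello')]
```

A real client would read this incrementally off a long-lived HTTP response instead of from a string.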

      • Chris Allick says:

        Sorry, I posted too quickly. I should have explained that it achieves similar objectives in that it streams video over WebSockets from a device like a PS3 Eye camera.

        As a side note, though, my understanding is that MJPEG is rather lossy and not a good format for this data. If you only cared about the Kinect, an Ogg stream would be much better.

        For my purposes, I just wanted to stream video to an iPad, and the same technique can be used from an iPad app back to the web.

      • wizgrav says:

        Yeah, a vorbis stream would achieve a better compression ratio, even though it would be lossy as well. Another reason MJPEG was chosen was its ease of implementation compared to regular video streaming: you just stream regular JPEGs with a text boundary between them. It’s much lighter on resources than normal video compression, and the results are still usable as crop material. But if you want to further analyze the pixel data in the browser, the best solution would be a lossless format like PNG, which I also tested, but the file sizes got pretty big. I’m still thinking about it though.
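The framing [wizgrav] describes can be sketched in a few lines of Python. The boundary string and header layout below follow the general multipart/x-mixed-replace convention and are illustrative, not necessarily what Intrael actually sends:

```python
# Minimal sketch of MJPEG-over-HTTP framing: each JPEG is one part of a
# multipart/x-mixed-replace body, separated by a text boundary. The
# boundary string is made up for illustration.
BOUNDARY = b"--intrael-frame"

def mjpeg_part(jpeg_bytes):
    """Wrap one JPEG frame as a single multipart part."""
    headers = (b"Content-Type: image/jpeg\r\n"
               b"Content-Length: " + str(len(jpeg_bytes)).encode() + b"\r\n\r\n")
    return BOUNDARY + b"\r\n" + headers + jpeg_bytes + b"\r\n"

# Stand-in bytes; a real frame would be an actual JPEG (starting \xff\xd8
# and ending \xff\xd9).
fake_jpeg = b"\xff\xd8fakejpegdata\xff\xd9"
part = mjpeg_part(fake_jpeg)
print(part.startswith(BOUNDARY), fake_jpeg in part)  # → True True
```

A server would first send a response header of Content-Type: multipart/x-mixed-replace with the matching boundary, then emit one such part per captured frame.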

      • wizgrav says:

        Sorry, I meant Theora, not Vorbis.

  4. Chris Allick says:

    Sorry, I should say using WebSockets.
