If a picture is worth a thousand words, a video must be worth millions. However, computers still aren’t very good at analyzing video. Machine vision software like OpenCV can do certain tasks like facial recognition quite well. But current software isn’t good at determining the physical nature of the objects being filmed. [Abe Davis, Justin G. Chen, and Fredo Durand] are members of the MIT Computer Science and Artificial Intelligence Laboratory. They’re working toward a method of determining the structure of an object based upon the object’s motion in a video.
The technique relies on vibrations which can be captured by a typical 30 or 60 frames per second (fps) camera. Here’s how it works: A locked-down camera is used to image an object. The object is moved by wind, by someone banging on it, or by any other mechanical means. This movement is captured on video. The team’s software then analyzes the video to see exactly where the object moved, and how much it moved. Complex objects can have many vibration modes. The wireframe figure used in the video is a great example. The hands of the figure will vibrate more than the figure’s feet. The software uses this information to construct a rudimentary model of the object being filmed. It then allows the user to interact with the object by clicking and dragging with a mouse. Dragging the hands will produce more movement than dragging the feet.
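To make the idea of vibration modes a little more concrete, here is a minimal Python sketch. It is not the team’s algorithm. It assumes we already have per-frame displacements for a few tracked points (the `tracks` dictionary, the point names, and the synthetic 3 Hz signal are all made up for illustration), finds the frequency where the points share the most energy using a plain discrete Fourier transform, and reports how strongly each point moves at that frequency.

```python
import cmath
import math


def dft_bin(signal, k):
    """Complex DFT coefficient of `signal` at frequency bin k."""
    n = len(signal)
    return sum(x * cmath.exp(-2j * math.pi * k * t / n)
               for t, x in enumerate(signal))


def dominant_mode(tracks, fps):
    """Find the strongest shared vibration frequency and per-point amplitude.

    tracks: dict mapping a (hypothetical) point name to its per-frame
            displacement in pixels; all series must have the same length.
    fps:    camera frame rate.
    Returns (frequency_hz, {point_name: amplitude_in_pixels}).
    """
    n = len(next(iter(tracks.values())))
    best_k, best_energy = 1, -1.0
    for k in range(1, n // 2):  # skip DC (k = 0), stay below Nyquist
        energy = sum(abs(dft_bin(sig, k)) ** 2 for sig in tracks.values())
        if energy > best_energy:
            best_k, best_energy = k, energy
    # For a real sinusoid of amplitude A, |X[k]| is A * n / 2 at its bin.
    shape = {name: 2 * abs(dft_bin(sig, best_k)) / n
             for name, sig in tracks.items()}
    return best_k * fps / n, shape


def synthetic_track(amplitude, freq_hz, fps, frames):
    """Made-up test signal: a pure sine standing in for tracked motion."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / fps)
            for t in range(frames)]


# Two seconds of 30 fps "video": hands swing a lot, feet barely move.
tracks = {
    "hands": synthetic_track(4.0, 3.0, 30, 60),
    "feet": synthetic_track(0.5, 3.0, 30, 60),
}
freq, shape = dominant_mode(tracks, fps=30)
```

With amplitudes like these in hand, a simulated drag could scale each point’s response by its entry in `shape`, which is why pulling the hands would move the model more than pulling the feet. A real system has to recover many modes from noisy video, which this toy example ignores.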
The results aren’t perfect – they remind us of computer animated objects from just a few years ago. However, this is very promising. These aren’t textured wire frames created in 3D modeling software. The models and skeletons were created automatically using software analysis. The team’s research paper (PDF link) contains all the details of their research. Check it out, and check out the video after the break.
I’ve developed or have been involved with a number of imaging technologies, everything from DIY synthetic aperture radar and the MIT through-wall radar to the next generation of ultrasound imaging devices. Imagery is cool, but what the end-user often wants is an answer, not a reconstruction to look at. So let’s figure that out.
We’re kicking off a discussion on how to apply deep learning to more than just beating Jeopardy champions at their own game. We’d like to apply deep learning to hard data, to imagery. Is it possible to get the computer to accurately provide the diagnosis?
I helped to organize a seminar series/discussion panel in New York City on November 13th (you know, for those readers who are closer to New York than to Munich). This discussion panel includes David Ferrucci (the guy who led the IBM Watson program), MIT astrophysicist Max Tegmark, and the person who created genetic sequencing on a chip: Jonathan Rothberg. We’d like the Hackaday community, the vanguard of creativity and enthusiasm in everything technical, to join the conversation.
[Gustaf] has been playing around with machine vision for a while and sent his latest project in on our tip line. It’s a video-based car radar system that can detect cars in a camera’s field of vision while cruising down the highway.
Like [Gustaf]’s previous experiments with machine vision, where he got a computer to recognize and count yellow cylinders and green rectangles, the radar build uses AdaBoost and the AForge.NET AI/machine vision C# framework. [Gustaf] used an evolutionary algorithm to detect the presence of a car in a video frame. He first selected 150 images of cars from a pre-recorded video; another 1,850 images were then selected by a computer and confirmed as cars by a human eye.
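For readers who haven’t met AdaBoost before, here is a rough Python sketch of the general idea: combine several weak classifiers, each barely better than guessing, into one strong vote. This is not [Gustaf]’s code and not the AForge API. The single numeric feature, the toy data, and the function names are all assumptions made for illustration, with +1 standing in for “car” and -1 for “not a car”.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak classifier: vote `polarity` at or above the threshold, else the opposite."""
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Pick the threshold and polarity with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Train `rounds` stumps, reweighting samples the previous stumps got wrong."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # this stump's say in the final vote
        ensemble.append((alpha, threshold, polarity))
        # Misclassified samples get heavier, correct ones lighter.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, threshold, polarity)
                for alpha, threshold, polarity in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single threshold can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, rounds=3)
```

A real detector would boost over many image features computed from thousands of labeled frames, but the reweighting loop is the same in spirit.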
With 2,000 images of cars in its database, [Gustaf]’s machine vision algorithm is able to detect a car in real time as he drives down a beautiful Swedish highway. In addition to overlaying a rectangle underneath each car in a video frame and an awesome Terminator-style HUD in the upper right corner, [Gustaf] also added a distance display above the hood of his car.
It’s an awesome build that makes us wonder if [Gustaf] is building an autonomous car. Even if he’s not, it really makes us want to install a video HUD in our whip, just to see this in action.
It’s neat how a project from 2004 can still be relevant if it’s done really well. This is the case with AVRcam. It uses an Atmel AVR mega8 and can do some pretty impressive things, like track up to eight objects at 30fps. The hardware and software are also open source, so it should be possible to build one yourself. There are many projects like it on the internet, though often they require much beefier hardware. Although, these days you can fit a computer inside a match box, so we see more and more projects just throwing a full USB camera on a robot to do simple things like line following. It’s debatable which solution is more elegant, but maybe not which one is more impressive.
Here’s yet another robot hoping to dominate the human race through the power of ROCK. Cythbot was built to demonstrate Cyth Systems’ machine vision systems. The device uses a camera to watch the Guitar Hero monitor and identify notes for button presses. The strum bar is then triggered after a delay. The notes are identified solely by pixel intensity since star power can cause them to change shape and color. All button presses are done using pneumatics. The whole system is self-contained and doesn’t require a separate computer for processing. Our favorite parts are the completely unmodified controller and the industrial light tree used to indicate notes. The team says that the pneumatics aren’t quite fast enough to hit 100%, unlike some humans. Video of the bot in action after the break.
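The detect-then-delay approach can be sketched in a few lines of Python. This is only a guess at the general shape of such a system, not Cyth’s implementation. The grayscale frame layout, the lane windows, the threshold, and the delay are all hypothetical values chosen for illustration.

```python
def detect_notes(frame, lanes, threshold):
    """Report which lanes currently show a bright note.

    frame:     2-D list of grayscale pixel rows (0-255).
    lanes:     one (row, col_start, col_end) sample window per fret button.
    threshold: mean intensity above which a window counts as a note.
    """
    hits = []
    for row, col_start, col_end in lanes:
        window = frame[row][col_start:col_end]
        hits.append(sum(window) / len(window) > threshold)
    return hits


def schedule_strums(frames, lanes, threshold, delay_frames):
    """Return (strum_frame_index, lanes_to_press) for each newly seen note.

    A note is only counted on the frame where its lane goes from dark to
    bright, so a note that lingers in the window is not strummed twice.
    """
    events = []
    previous = [False] * len(lanes)
    for index, frame in enumerate(frames):
        hits = detect_notes(frame, lanes, threshold)
        fresh = [hit and not before for hit, before in zip(hits, previous)]
        if any(fresh):
            events.append((index + delay_frames, fresh))
        previous = hits
    return events


def make_frame(bright_lanes, lane_count=5, lane_width=2):
    """Made-up one-row test frame with the given lanes lit to full intensity."""
    row = []
    for lane in range(lane_count):
        row.extend([255 if lane in bright_lanes else 10] * lane_width)
    return [row]


lanes = [(0, i * 2, i * 2 + 2) for i in range(5)]
frames = [make_frame(set()), make_frame({0}), make_frame({0}), make_frame({1, 4})]
events = schedule_strums(frames, lanes, threshold=128, delay_frames=3)
```

Sampling intensity alone, rather than color or shape, matches the reasoning in the post: star power changes how the notes look, but they stay bright.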