Training machines to complete tasks effectively is an ongoing area of research. This can be done in a variety of ways, from complex programming interfaces to systems that understand commands in natural language. A team from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) wanted to see if it was possible for humans to communicate more directly when training a robot. Their system allows a user to correct a robot’s actions using only their brain.
The concept is simple: using an EEG cap to detect brainwaves, the system measures a special type of brain signal called an “error-related potential”. When the user simply notices the robot making a mistake, the robot can correct itself and, for a nice extra touch, blush in embarrassment.
This interface allows for a very intuitive way of working with a robot: when the human notices a mistake, the robot automatically stops or corrects its behaviour. Currently the system can only handle very simple tasks; the video shows the robot sorting objects of two types into corresponding bins. The robot knows that if the human has detected an error, it must simply place the object in the other bin. Further research seeks to expand the possibilities of using this automatic brainwave feedback to train robots for more complex tasks. You can read the research paper here.
MIT’s CSAIL works on lots of exciting projects; their video microphone technology is truly astounding.
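The two-bin correction rule described above can be sketched in a few lines of Python. This is only an illustration, not the team's code: the function name `choose_bin` and the boolean `error_detected` (standing in for the output of an EEG classifier) are assumptions made for this example.

```python
def choose_bin(initial_guess: int, error_detected: bool) -> int:
    """Return the bin (0 or 1) the robot should use.

    `initial_guess` is the robot's own choice. `error_detected` is a
    hypothetical flag standing in for an error-related potential picked
    up from the human observer.
    """
    if initial_guess not in (0, 1):
        raise ValueError("this sketch only handles two bins, 0 and 1")
    # With only two bins, a detected error means the other bin is right.
    return 1 - initial_guess if error_detected else initial_guess


if __name__ == "__main__":
    print(choose_bin(0, error_detected=True))   # robot switches bins
    print(choose_bin(0, error_detected=False))  # robot keeps its choice
```

The trick only works because the task is binary: with three or more bins, a single "that was wrong" signal no longer says which alternative is right.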
[Thanks to Adam Connor-Simmons for the tip!]
MIT’s Computer Science and Artificial Intelligence Laboratory, CSAIL, put out a paper recently about an interesting advance in 3D printing. Naturally, coming from the computer science and AI lab, the paper has a robotic bent to it. In summary, they can 3D print a robot with a rubber skin of arbitrarily varying stiffness. The end goal? Shock-absorbing skin!
They modified an Objet printer to print simultaneously using three materials. One is a UV-curing solid, one is a UV-curing rubber, and the third is an unreactive liquid. By carefully depositing these in a pattern they can print a material with any property they like. In doing so they have been able to print single-body robots that, simply put, crash into the ground better. There are other uses of course, from joints to sensor housings. There’s more in the paper.
We’re not sure how this compares to the Objet’s existing ability to mix flexible resins together to produce different Shore ratings. Likely this offers more seamless transitions and a wider range of material properties. From the paper it also appears to dampen better than the alternatives. Either way, it’s an interesting advance and approach. We wonder if it’s possible to reproduce on a larger scale with FDM.
If a picture is worth a thousand words, a video must be worth millions. However, computers still aren’t very good at analyzing video. Machine vision software like OpenCV can do certain tasks like facial recognition quite well. But current software isn’t good at determining the physical nature of the objects being filmed. [Abe Davis, Justin G. Chen, and Fredo Durand] are members of the MIT Computer Science and Artificial Intelligence Laboratory. They’re working toward a method of determining the structure of an object based upon the object’s motion in a video.
The technique relies on vibrations which can be captured by a typical 30 or 60 frames per second (fps) camera. Here’s how it works: a locked-down camera is used to image an object. The object is moved by wind, by someone banging on it, or by any other mechanical means. This movement is captured on video. The team’s software then analyzes the video to see exactly where the object moved, and how much it moved. Complex objects can have many vibration modes. The wire frame figure used in the video is a great example. The hands of the figure will vibrate more than the figure’s feet. The software uses this information to construct a rudimentary model of the object being filmed. It then allows the user to interact with the object by clicking and dragging with a mouse. Dragging the hands will produce more movement than dragging the feet.
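To make the "hands vibrate more than feet" idea concrete, here is a rough Python sketch that measures how strongly one tracked point moves at a given vibration frequency. It is not the team's algorithm. It assumes the per-frame displacements of a point have already been extracted from the video, and the function name `mode_amplitude` is made up for this example.

```python
import cmath
import math


def mode_amplitude(displacements, freq_hz, fps):
    """Estimate the amplitude of one frequency component in a point's motion.

    `displacements` is a list of per-frame displacements (in pixels) for a
    single tracked point; `fps` is the camera frame rate. This is a
    single-bin discrete Fourier transform, which is exact only when
    `freq_hz` fits a whole number of cycles into the recording.
    """
    n = len(displacements)
    acc = sum(
        d * cmath.exp(-2j * math.pi * freq_hz * k / fps)
        for k, d in enumerate(displacements)
    )
    return 2 * abs(acc) / n


if __name__ == "__main__":
    fps, freq = 30, 3.0
    # Synthetic one-second clips: the "hand" swings 2 px, the "foot" 0.5 px.
    hand = [2.0 * math.sin(2 * math.pi * freq * k / fps) for k in range(fps)]
    foot = [0.5 * math.sin(2 * math.pi * freq * k / fps) for k in range(fps)]
    print(mode_amplitude(hand, freq, fps) > mode_amplitude(foot, freq, fps))
```

Comparing these amplitudes across many points gives a crude picture of which parts of an object move most in a given mode, which is the kind of information the article says the software uses to build its model.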
The results aren’t perfect – they remind us of computer animated objects from just a few years ago. However, this is very promising. These aren’t textured wire frames created in 3D modeling software. The models and skeletons were created automatically using software analysis. The team’s research paper (PDF link) contains all the details of their research. Check it out, and check out the video after the break.
A group of MIT, Microsoft, and Adobe researchers have managed to reproduce sound using video alone. The sounds we make bounce off every object in the room, causing microscopic vibrations. The Visual Microphone utilizes a high-speed video camera and some clever signal processing to extract an audio signal from these vibrations. Using video of everyday objects such as snack bags, plants, Styrofoam cups, and water, the team was able to reproduce tones, music and speech. Capturing audio from light isn’t exactly new. Laser microphones have been around for years. The difference here is the fact that the visual microphone is a completely passive device. No laser or special illumination is required.
The secret is in the signal processing, which the team explains in their SIGGRAPH paper (pdf link). They used a complex steerable pyramid along with wavelet filters to obtain local pixel motion values. These local values are averaged into a global motion value. From this global motion value the team is able to measure movement down to 1/1000 of a pixel. Plenty of resolution to decode audio data.
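The averaging step is where the sub-pixel resolution comes from: many noisy local estimates are combined into one much steadier number. A minimal Python sketch of that step follows. It is not the paper's pipeline: it assumes the local motion values have already been computed, and the function name `global_motion` and the optional `weights` argument are illustrative assumptions.

```python
def global_motion(local_motions, weights=None):
    """Combine per-location motion estimates into one global value.

    `local_motions` holds one motion estimate (in pixels) per image
    location for a single frame. `weights` is an optional, hypothetical
    confidence per location; with no weights this is a plain mean.
    """
    if weights is None:
        weights = [1.0] * len(local_motions)
    if len(weights) != len(local_motions):
        raise ValueError("need exactly one weight per local motion value")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(m * w for m, w in zip(local_motions, weights)) / total


if __name__ == "__main__":
    # Two locations that moved 0.001 px and 0.003 px in this frame.
    print(global_motion([0.001, 0.003]))
```

Running this once per frame yields one global motion value per frame, and that sequence is the raw material for the recovered audio signal.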
Most of the research is performed with high-speed video cameras, which are well outside the budget of the average hacker. Don’t despair though: the team did prove that the same magic can be performed with consumer cameras, albeit with lower quality results. The team took advantage of the rolling shutter found in most of today’s CMOS-imager-based consumer cameras. Rolling shutter CMOS sensors capture images one row at a time. Each row can be processed in a similar fashion to the frames of the high-speed camera. There are some gaps between frames when the camera isn’t recording anything though. Even with the reduced resolution, it’s easy to pick out “Mary had a little lamb” in the video below.
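The rolling shutter trick can be pictured as turning every sensor row into its own time sample. The Python sketch below illustrates the idea only and is not the team's method. It assumes each row has already been reduced to a single motion value, and it models the dead time between frames as a whole number of missing rows (`gap_rows`), which is a simplification chosen for this example.

```python
def rows_to_samples(frames, gap_rows=0):
    """Flatten per-row values into one time series.

    `frames` is a list of frames, each a list with one value per sensor
    row in readout order. Missing samples between frames are marked
    with None.
    """
    samples = []
    for frame in frames:
        samples.extend(frame)
        samples.extend([None] * gap_rows)
    return samples


def effective_sample_rate(fps, rows_per_frame, gap_rows=0):
    """Row readout rate in samples per second under the model above."""
    return fps * (rows_per_frame + gap_rows)


if __name__ == "__main__":
    print(rows_to_samples([[1, 2], [3, 4]], gap_rows=1))
    print(effective_sample_rate(fps=60, rows_per_frame=1080))
```

Under this model, an ordinary 60 fps camera with 1080 rows yields tens of thousands of row samples per second, which is why a consumer camera can stand in for a high-speed one, with holes in the signal wherever the gaps fall.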
We’re blown away by this research, and we’re sure certain organizations will be looking into it for their own use. Don’t pull out your tin foil hats yet though. Foil containers proved to be one of the best sound reflectors.