Quad-copter Controlled With Voice Commands


In the video above you’ll see two of our favorite things combined: a voice-controlled quad-copter. The robot responds to natural language, so you can tell it to “take off and fly forward six feet” rather than relying on a cryptic command set. The demonstration shows both an iPhone and a headset used as the input microphone. The language is parsed by a computer and the resulting commands are sent to the four-rotor UAV.
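To make the idea concrete, here is a minimal sketch of how a phrase like “take off and fly forward six feet” could be split into discrete commands. This is purely illustrative — the command names and grammar are assumptions, not the actual parser used in the demo, which handles far richer language.

```python
import re

# Hypothetical mapping of number words; only what's needed for the sketch.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4,
                "five": 5, "six": 6, "seven": 7, "eight": 8}

def parse_utterance(text):
    """Split a natural-language utterance into (command, argument) pairs."""
    commands = []
    # Break the utterance into clauses on "and" or commas.
    for clause in re.split(r"\band\b|,", text.lower()):
        clause = clause.strip()
        if "take off" in clause:
            commands.append(("takeoff", None))
        elif "land" in clause:
            commands.append(("land", None))
        else:
            m = re.search(r"fly (forward|back|left|right) (\w+) feet", clause)
            if m:
                direction, word = m.groups()
                commands.append(("move", (direction, NUMBER_WORDS.get(word))))
    return commands

print(parse_utterance("take off and fly forward six feet"))
# [('takeoff', None), ('move', ('forward', 6))]
```

A real system would use a speech recognizer and a probabilistic language model rather than regular expressions, but the output is the same in spirit: a small sequence of flight commands.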

This makes us think of Y.T.’s robot-aided assault in Snow Crash. Perhaps our inventions strive to achieve the fiction that came before them.

[Via Bot Junkie]

25 thoughts on “Quad-copter Controlled With Voice Commands”

  1. Well it looks like we are well on the way to having another hobby ruined. This will remove all fine motor skill and practice from flying, the way software has ruined the club DJ. It used to be an art; not everyone could do it. Now any two-bit hack with a bit of cha-ching can jump right in.

    …I love it!

  2. Awesome! I voice-enabled an RC car once (with controller hooked up to the computer), and I never even considered moving beyond one-word commands (“forward” “left” “stop” etc). It looks to me like the bot has a pre-defined map of its environment, combined with some real-time scanning. I wonder how much semantic information is in the map? e.g. is “the windows” a single point, or are there many areas marked as “the windows” and the bot selects the nearest one?

    I suspect it’s not looking around and working out for itself what “the windows” are!
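The commenter’s guess — that the map stores several regions all labeled “the windows” and the bot picks the nearest one — could be sketched like this. The coordinates and labels are made up for illustration; the real map format is unknown.

```python
import math

# Hypothetical annotated map: each label maps to several (x, y) points.
LANDMARKS = {
    "the windows": [(0.0, 5.0), (4.0, 5.0), (8.0, 5.0)],
    "the door":    [(2.0, 0.0), (9.0, 1.0)],
}

def nearest_landmark(label, position):
    """Return the closest map point carrying the given label."""
    return min(LANDMARKS[label], key=lambda p: math.dist(p, position))

print(nearest_landmark("the windows", (7.0, 4.0)))  # (8.0, 5.0)
```

As the MIT answer further down the thread explains, the actual system weights all candidate landmarks rather than hard-selecting the nearest one.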

  3. I’m curious what they are using for scanning the environment that provides such fine resolution, yet is light enough. Some sort of laser rangefinder? Anyone have any information?

  4. @gx,
    You bet it’s Ubuntu. They aren’t about to let a BSOD crash their voiceCopter!

    Wow, if it could just parse and follow the commands faster…. Still amazingly cool.

  5. WOW! Natural voice commands in a noisy environment!

    After they get the lag down, they should get it to follow someone, or the operator. That would be really useful for search and rescue, law enforcement, and military applications. The operator could say “follow me” or “follow suspect” and forget about the copter until needed.

  6. How does it know which door to go to or which window to face?

    If I asked you to go to the door, your most likely response would be to go to the nearest door; in the clip, however, the copter goes to a distant door.

    More clarification on its understanding of commands is needed.

  7. I did a little poking around and found that these are the guys at the MIT CSAIL. I don’t expect they’ll be GPL’ing anything anytime soon, but a couple posters above noted that they’ve accomplished some of the features this thing has so the info *is* available. It’s just a matter of putting it together.

    As someone stated above, it’s amazing that this thing has that much processing on-board and is still light enough to fly.

  8. It’s nice to see that the people in white coats have been working on that thing I was thinking about like 5 years ago but that required too many braincells: closing the gap between computer language and human language. The possibilities are endless, and go way beyond flying a drone. Voice recognition is out of my range of ability, but if you could make it recognize individual voices and transcribe the speech, it could pick out the instructor of a class and type what they say, working math/chemistry/physics problems and freeing the human from writing/typing class notes.

  9. Simply amazing!!! Can anyone else say “secret government contract”? LOL.

    This is just sheer brilliance, and I can see something like this (without such noisy propulsion) being used in movies, you know, where the guy has some kind of autonomous robotic thing flying near his head… or even as military drones as seen in Dark Angel… mmmmmmm Jessica Alba *drools*

  10. Unless I’m mistaken, the actual processing is off-board, with the cameras and sensors sending data to the computer and the computer sending back where to go (stabilization and actual flight control are probably on-board, however).

  11. “am i the only one that feels this is going to decapitate a lot of people?” — dmcbeing

    You’re supposed to tell it, “PLEASE DON’T KILL ME!” before it gets to that point.

  12. Hi,

    To answer some of the questions,

    * It uses a map that’s annotated with the locations of all the windows, doors, etc. We have a prototype system that uses automatic object detection with the camera, but we didn’t use it in this demo. It figures out which landmark to use based on the bot’s orientation and the spatial relation. (If you say “go away from the windows” it will use a different set than “go to the windows.” Although actually it’s marginalizing so it uses all the windows with different weights.)

    * The delay in the second part of the video is almost all the speech recognizer. We think the speech recognizer was slow because the buzz from the quadcopter was activating the speech detector, but we haven’t had a chance to look into it.

    * It is using one onboard Ubuntu machine doing controls and one offboard laptop doing almost everything else.

    More information is here:
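The “marginalizing” idea described above — using all the windows with different weights rather than committing to one — could be sketched as a softmax over distances. The weighting scheme and coordinates here are assumptions for illustration, not the MIT team’s actual model.

```python
import math

def landmark_weights(candidates, position, scale=1.0):
    """Softmax over negative distances: nearer landmarks get more weight."""
    scores = [-math.dist(c, position) / scale for c in candidates]
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Three hypothetical window locations; the bot sits at (7, 4).
windows = [(0.0, 5.0), (4.0, 5.0), (8.0, 5.0)]
weights = landmark_weights(windows, (7.0, 4.0))
print([round(w, 3) for w in weights])  # weights sum to 1; nearest window dominates
```

A relation like “away from the windows” would simply assign the weights differently (e.g. by negating the scores), which matches the answer’s point that different spatial relations select different landmark sets.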

  13. Looks like it is using a Hokuyo UTM-30LX Rapid-URG scanning laser range finder.

    So a bit pricey for a ‘toy’.

    If you actually look into the project, the voice control is just about one of the easiest and least impressive parts.
