Quad-copter controlled with voice commands

In the video above you’ll see two of our favorite things combined: a quad-copter that is voice controlled. The robot responds to natural language, so you can tell it to “take off and fly forward six feet” rather than relying on a cryptic command set. The demonstration shows both an iPhone and a headset used as the input microphone. Language is parsed by a computer and the resulting commands are sent to the four-rotor UAV.
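
To make the idea of turning a spoken sentence into flight commands concrete, here is a toy sketch in Python. It is not the parser used in the video, whose internals are not described in this post. The vocabulary, the `parse_utterance` function, and the command tuples are all hypothetical, and a real system would sit behind a speech recognizer and cope with far more varied phrasing.

```python
import re

# Hypothetical vocabulary for illustration only; not taken from the project in the video.
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}
DIRECTIONS = {"forward", "backward", "left", "right", "up", "down"}
METERS_PER_UNIT = {"foot": 0.3048, "feet": 0.3048, "meter": 1.0, "meters": 1.0}


def parse_utterance(text):
    """Split an utterance into clauses and map each clause to a command tuple."""
    commands = []
    # Clauses are separated by "and", "then", or a comma.
    for clause in re.split(r"\b(?:and|then)\b|,", text.lower()):
        words = clause.split()
        if not words:
            continue
        if words[:2] == ["take", "off"]:
            commands.append(("takeoff",))
        elif words == ["land"]:
            commands.append(("land",))
        elif (len(words) == 4
              and words[0] in ("fly", "go", "move")
              and words[1] in DIRECTIONS
              and words[2] in NUMBER_WORDS
              and words[3] in METERS_PER_UNIT):
            # "<verb> <direction> <number> <unit>" becomes a move in meters.
            meters = NUMBER_WORDS[words[2]] * METERS_PER_UNIT[words[3]]
            commands.append(("move", words[1], round(meters, 3)))
        else:
            # Anything outside the toy grammar is passed through unparsed.
            commands.append(("unknown", " ".join(words)))
    return commands


if __name__ == "__main__":
    print(parse_utterance("take off and fly forward six feet"))
```

Each clause maps to one tuple, so the example sentence yields a takeoff command followed by a forward move expressed in meters. Clauses that fall outside this tiny grammar are returned as `unknown` rather than guessed at, which seems like the safer default for anything with spinning rotors.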

This makes us think of Y.T.’s robot-aided assault in Snow Crash. Perhaps our inventions strive to achieve the fiction that came before them.

[Via Bot Junkie]

Comments

  1. Greg says:

    Well it looks like we are well on the way to having another hobby ruined. This will remove all fine motor skill and practice from flying, the way that software has ruined the club DJ. It used to be an art, not everyone could do it. Now any two bit hack with a bit of cha ching can jump right in.

    …I love it!

  2. strider_mt2k says:

    Incredibly cool!

    I wanna try!

  3. FTWinston says:

    Awesome! I voice-enabled an RC car once (with controller hooked up to the computer), and I never even considered moving beyond one-word commands (“forward” “left” “stop” etc). It looks to me like the bot has a pre-defined map of its environment, combined with some real-time scanning. I wonder how much semantic information is in the map? e.g. is “the windows” a single point, or are there many areas marked as “the windows” and the bot selects the nearest one?

    I suspect it’s not looking around and working out for itself what “the windows” are!

  4. The R says:

    I’m curious what they are using for scanning the environment that provides such fine resolution, yet is light enough. Some sort of laser rangefinder? Anyone have any information?

  5. gz says:

    Looked like Ubuntu on the pc.

  6. Squirrel says:

    That looks like the MIT Quadrocopter from the IARC Competition, which used a 360 degree Laser Range Finder. I can’t seem to find the video right now.

  7. hunternet93 says:

    Wow. Just wow. I want one!

  8. BigBubbaX says:

    @gz,
    You bet it’s Ubuntu. They aren’t about to let a BSOD crash their voiceCopter!

    Wow, if it could just parse and follow the commands faster…. Still amazingly cool.

  9. monkeyslayer56 says:

    once the speed gets faster i see a lot of potential in this and not just for a quad copter (which is insanely cool btw)

  10. feeleuphoria says:

    WOW! Natural voice commands in a noisy environment

    After they get the lag down, they should get it able to follow someone or the operator. that would be real useful for search and rescue, law and order and military applications. That way the operator could say follow me or follow suspect and forget about the copter until needed.

  11. grovenstien says:

    How does it know which door to go to or which window to face?

    If I asked you to go to the door, your most likely response would be to go to the nearest door. In the clip, however, the copter goes to a distant door?

    More clarification on its understanding of commands is needed.

  12. Daley says:

    I did a little poking around and found that these are the guys at the MIT CSAIL. I don’t expect they’ll be GPL’ing anything anytime soon, but a couple posters above noted that they’ve accomplished some of the features this thing has so the info *is* available. It’s just a matter of putting it together.

    As someone stated above, it’s amazing that this thing has that much processing on-board and is still light enough to fly.

  13. M4CGYV3R says:

    Why is it so slow to respond? It doesn’t take nearly that long to parse the speech.

  14. dmcbeing says:

    am i the only one that feels this is going to decapitate a lot of people?

  15. jeditalian says:

    its nice to see that the people in white coats have been working on that thing that i was thinking about like 5 years ago but required too much braincell. closing the gap between computer language and human language. the possibilities are endless, and go way beyond flying a drone. voice recognition is out of my range of ability, but if you could make it recognize individual voices, and transcribe the speech, it could pick the instructor of a class, and type what they say, working math/chemistry/physics problems, and also freeing up the human from writing/typing class notes.

  16. Swarvey says:

    Simply amazing!!! can anyone else say “secret government contract” LOL.

    this is just sheer brilliance and I can just see something like this (without such noisy propulsion) being used in movies, u know, where the guy has some kind of autonomous robotic thing flying near his head….. or even as military drones as seen in Dark Angel… mmmmmmm Jessica Alba *drools*

  17. Squirrel says:

    Unless I’m mistaken, the actual processing is off-board, with the cameras and sensors sending data to the computer and the computer sending back where to go (stabilization and actual flight control are probably on board, however)

  18. Dustin says:

    Next step? Weebo from flubber haha. awesome project.

  19. ChalkBored says:

    “am i the only one that feels this is going to decapitate a lot of people?” — dmcbeing

    You’re supposed to tell it, “PLEASE DON’T KILL ME!” before it gets to that point.

  20. dana says:

    Who cares about stupid toy robots like this, grow up…

    If i want to play around with things like this I just start unreal tournament and use manta.

  21. Iv says:

    One order : “KILL ALL HUMANS”

  22. junkhacker says:

    @dana the toys of today are the tools of tomorrow
    and comparing someone’s real world accomplishments with a video game? please tell me you’re trolling

  23. Stefanie Tellex says:

    Hi,

    To answer some of the questions,

    * It uses a map that’s annotated with the locations of all the windows, doors, etc. We have a prototype system that uses automatic object detection with the camera, but we didn’t use it in this demo. It figures out which landmark to use based on the bot’s orientation and the spatial relation. (If you say “go away from the windows” it will use a different set than “go to the windows.” Although actually it’s marginalizing so it uses all the windows with different weights.)

    * The delay in the second part of the video is almost all the speech recognizer. We think the speech recognizer was slow because the buzz from the quadcopter was activating the speech detector, but we haven’t had a chance to look into it.

    * It is using one onboard Ubuntu machine doing controls and one offboard laptop doing almost everything else.

    More information is here:
    du.tkollar.com.

  24. gz says:

    Thanks Stefanie! Great project!

  25. bob says:

    Looks like it is using a HOKUYO UTM-30LX Rapid-URG Scanning Laser Range Finder

    So a bit pricy for a ‘toy’

    If you actually look into the project, the voice control is just about one of the easiest and least impressive parts.
