Speech recognition on an Arduino

Speech recognition is usually the purview of fairly high-powered computers chugging along at hundreds of megahertz with megabytes of RAM. Bringing speech recognition to the low-power microcontroller you’d find in an Arduino sounds like the work of a mad scientist or Ph.D. candidate, but that’s exactly what [Arjo Chakravarty] did. He developed the μSpeech library for the Arduino to allow speech recognition for a limited set of voice commands.

Where most speech recognition systems use FFT and very fancy math to determine what phonemes a user is saying, [Arjo]’s system does away with this unnecessary complexity in favor of using very, very basic integral and differential calculus.
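To give a feel for what “basic integral and differential calculus” can look like on a microcontroller, here is a minimal sketch. It is not taken from the μSpeech source; the function name `complexityCoefficient`, the scaling by 100, and the assumption that samples are already centered on zero (microphone bias removed) are all ours. It divides the summed absolute sample-to-sample change (a discrete derivative) by the summed absolute amplitude (a discrete integral), which gives a rough measure of how “hissy” a sound is without any FFT.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper, not part of the μSpeech API.
// Assumes the samples are already centered on zero.
// Returns 100 * sum(|x[k] - x[k-1]|) / sum(|x[k]|) using integer math only.
long complexityCoefficient(const int *samples, std::size_t count) {
    long sumDiff = 0;  // discrete stand-in for the integral of |dx/dt|
    long sumAbs = 0;   // discrete stand-in for the integral of |x|
    for (std::size_t k = 1; k < count; ++k) {
        sumDiff += std::labs(static_cast<long>(samples[k]) - samples[k - 1]);
        sumAbs += std::labs(static_cast<long>(samples[k]));
    }
    if (sumAbs == 0) {
        return 0;  // silence (or too few samples): avoid dividing by zero
    }
    return (100 * sumDiff) / sumAbs;
}
```

A steady signal scores near zero, while one that flips sign on every sample scores high, so comparing the result against a few thresholds could, under these assumptions, separate fricative-like sounds such as “s” from vowel-like ones.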

From [Arjo]’s user guide for μSpeech (PDF warning) we can see it’s possible to connect a small microphone to the analog input of an Arduino and accept voice commands such as ‘left’, ‘right’, and ‘stop’. The accuracy is pretty good, as well – 80% if μSpeech is trying to recognize words, and 30-40% if μSpeech is programmed to recognize single phonemes.

Sadly we couldn’t find a demo video of μSpeech in action, but you’re more than welcome to grab it from GitHub for your own project. Send us a video of μSpeech in action and we’ll put it up.

32 thoughts on “Speech recognition on an Arduino”

  1. the dspic has a nice speech recognition library for free from microchip. i used it in a project and it is very accurate. the library supports 100 words if i recall correctly

    1. Hi Nafix,

      Could you please send me your project file for dspic? I just started learning microcontroller stuff, and I would really appreciate it if you could help me with speech recognition on dspic. I bought the easyvr, but it does not have custom speaker-independent words… and its accuracy was pretty bad….

      I just bought the pickit3, still trying to figure out / trying to learn mplab, any help / input would be great.

      thanks in advance.
      khan.

  2. really? voice recognition on an AVR ?

    anybody going to back this up?
    AKA try it and post a video of it.

    voice recognition wouldnt even run (well?) on a 386 @ 20MHz
    how does it run on an 8-BIT @ 16MHz ???

    if this is real, he really IS a code guru!

    1. It is limited domain (ie. predetermined set of words and phrases). You can also do decent speech synthesis on the AVR provided it’s limited domain.

    2. Way back in the day I had a Speech Recognition system (expansion cartridge) for a Commodore 64.

      Limited “trained” domain naturally, but still you could issue trained voice commands and have it recognise which command you wanted.

      Doing something significantly better than that on a modern microcontroller seems reasonable.

  3. I just tried it on a Leonardo and didn’t get anything on the microphone test. I’m using sparkfun’s microphone with built in preamp. Must be due to Leonardo.

  4. If this works it really is an amazing feat.

    It’s not that easy just to get an AVR to reliably detect simple audio tones…at least to do it flawlessly while leaving enough cycles free to do much else.

    I just abandoned the code method of tone detection and went with an LM567 in a recent project using a tiny88.

      1. Do you have an implementation of the Goertzel algorithm running on an Arduino? pjrc.com/teensy had one but it was never released, and didn’t run with the Arduino library at all.

  5. Why Waste an arduino on something like this? Radio shack had a limited voice recognition chip available ten or twenty years ago. I have one or two still in the original packaging. They could understand Stop, Forward, Reverse, Left, Right, Yes, No, And a few other small words. it was distributed by Archer… I would have more info to give if I had the part number…

    1. Well sunny boy. Half of the things created on Arduino are for reverse engineering and building things yourself simply to learn. You can buy an ESR meter at the shop, yet people build one with these; you can buy ready-made CNC routers, but people choose to make their own, etc. People build on these ideas to develop great things, so this is likely to be the stepping stone to something greater.

      Without further ado, next time don’t waste this precious time of yours writing such a stupid comment. Hackaday has hundreds of posts where people literally ‘recreate’ something.

      1. Wow. What Did I do to be Snapped at? I Wasn’t Dissing it. I was just asking why. Last I checked, That was still Legal and preferred to being left in the dark and living like a mushroom. As Far as I was aware, I was asking a valid question. But If You really need to go and treat me like an idiot, Then that is your policy.

      2. Well, you came across as very much dissing. Pointing out an alternative prefab solution is ok and useful. But your tone was redundant. As was the “why waste…” question. People like to tinker and experiment making things themselves, hacking things apart and putting them together in novel ways, learning along the way, that is why.

    2. I had one of those too. It required way too many support components from what I remember. Of course… that’s too many support components for me to source as a kid on a one day per week paper route and pre-internet.

      But.. you asked why ‘waste an Arduino’. Depending on what you consider an Arduino it can be the simpler and less expensive path. I’m thinking homemade boarduino, just an AVR chip, crystal and two caps (or a resonator) mounted on something vs a full Arduino.

      On the other hand… if you mean a real, full Arduino. I don’t know. I would imagine the world’s stock of VCP200s has to be dwindling by now though!

  6. I second the cellphone comment, the Galaxy S and S2 speech recognition sucks flaming monkey balls.

    “What would you like to do?”… (inserts phone in microwave) “Microwave, 30 seconds, High.” “I don’t understand your co(click)mmenAAAUGGGH I AM ON FIRE OH THE HUMANITY!! ….. ”

    :-)
    Someone please do a PIC port of this PDQ, even 10 words is impressive on say a greetings card.

  7. (adds to projects list)

    I call it the “Mystery Card”.
    You have to talk to it and find the string of secret words to get it to reveal the recorded message.

  8. Can someone explain to me how the example code below enters the while loop?


    int i[3],j,min,x;
    i[0] = umatch(collvoice,"sop"); //stop
    i[1] = umatch(collvoice,"ez"); //left
    i[2] = umatch(collvoice,"i"); //right
    //find the lowest number
    while(j<0){
        if(i[j]<min){
            x = j;
            min = i[j];
        }
        j++;
    }
    if(x == 0){
        stop();
    }
    if(x == 1){
        left();
    }
    if(x == 2){
        right();
    }
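On the question above: as quoted, the snippet never initializes `j`, `min`, or `x`, and the condition `j<0` is false for any non-negative `j`, so if `j` happened to start at 0 the loop body would not run at all. The comment “find the lowest number” suggests the loop was meant to scan all three scores. Here is a minimal sketch of that intent, assuming `umatch` returns a distance where lower means a closer match; the helper name `indexOfSmallest` is hypothetical and not part of μSpeech.

```cpp
#include <cstddef>

// Hypothetical helper: returns the index of the smallest score,
// or -1 when the array is empty.
int indexOfSmallest(const int *scores, std::size_t count) {
    if (count == 0) {
        return -1;
    }
    std::size_t best = 0;  // start by assuming the first entry is the lowest
    for (std::size_t j = 1; j < count; ++j) {
        if (scores[j] < scores[best]) {
            best = j;  // found a strictly lower score
        }
    }
    return static_cast<int>(best);
}
```

With the three `umatch` results stored in `i`, the dispatch would then read `int x = indexOfSmallest(i, 3);` followed by the same `if(x == 0)`, `if(x == 1)`, and `if(x == 2)` checks as in the quoted code.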
