Speech Recognition On An Arduino

September 22, 2012

Speech recognition is usually the purview of fairly high-powered computers chugging along at hundreds of megahertz with megabytes of RAM. Bringing speech recognition to the low-power microcontroller you’d find in an Arduino sounds like the work of a mad scientist or Ph.D. candidate, but that’s exactly what [Arjo Chakravarty] did. He developed the μSpeech library for the Arduino, which recognizes a limited set of voice commands.

Where most speech recognition systems use FFT and very fancy math to determine what phonemes a user is saying, [Arjo]’s system does away with this unnecessary complexity in favor of using very, very basic integral and differential calculus.
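To make “basic integral and differential calculus” a little more concrete, here is a minimal sketch of one FFT-free feature: the ratio of accumulated sample-to-sample change (a crude derivative) to accumulated amplitude (a crude integral). This is an illustration only, not code from μSpeech; the function name `complexity` and the scaling by 100 are assumptions made for the example. Hissy sounds such as ‘s’ change sign rapidly and score high, while smoother vowel-like sounds score low.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper (not taken from the uSpeech source).
// Returns 100 * sum(|x[k] - x[k-1]|) / sum(|x[k]|) for a block of
// zero-centered audio samples, or 0 if the block is silent or empty.
int complexity(const std::int16_t* samples, std::size_t n) {
    long sumAbs = 0;   // crude "integral": accumulated |x[k]|
    long sumDiff = 0;  // crude "derivative": accumulated |x[k] - x[k-1]|
    for (std::size_t k = 0; k < n; ++k) {
        sumAbs += std::labs(static_cast<long>(samples[k]));
        if (k > 0) {
            sumDiff += std::labs(static_cast<long>(samples[k]) -
                                 static_cast<long>(samples[k - 1]));
        }
    }
    if (sumAbs == 0) {
        return 0;  // avoid dividing by zero on silence
    }
    return static_cast<int>((sumDiff * 100) / sumAbs);
}
```

Everything here is integer addition plus one division per block, which is the kind of workload an 8-bit AVR can keep up with.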

From [Arjo]’s user guide for μSpeech (PDF warning) we can see it’s possible to connect a small microphone to the analog input of an Arduino and accept voice commands such as ‘left’, ‘right’, and ‘stop’. The accuracy is pretty good as well: 80% when μSpeech is trying to recognize words, and 30-40% when it is programmed to recognize single phonemes.

Sadly we couldn’t find a demo video of μSpeech in action, but you’re more than welcome to grab it via GitHub for your own project. Send us a video of μSpeech in action and we’ll put it up.

33 thoughts on “Speech Recognition On An Arduino”

nafix says:

September 22, 2012 at 10:53 am

the dspic has a nice speech recognition library for free from microchip. i used it in a project and it is very accurate. the library supports 100 words if i recall correctly

1. Khan says:
  
  April 25, 2013 at 10:38 pm
  
  Hi Nafix,
  
  Could you please send me your project file for dspic, i just started learning microcontroller stuff, i would really appreciate if you could help me with speech recognition on dspic, i bought the easyvr, but it does not have custom speaker independent words… and its accuracy was pretty bad….
  
  I just bought the pickit3, still trying to figure out / trying to learn mplab, any help / input would be great.
  
  thanks in advance.
  khan.
  
NewCommentor1283 says:

September 22, 2012 at 11:24 am

really? voice recognition on an AVR ?

anybody going to back this up?
AKA try it and post a video of it.

voice recognition wouldnt even run (well?) on a 386 @ 20MHz
how does it run on an 8-BIT @ 16MHz ???

if this is real, he really IS a code guru!

1. RicoElectrico says:
  
  September 22, 2012 at 11:29 am
  
  It is limited domain (ie. predetermined set of words and phrases). You can also do decent speech synthesis on the AVR provided it’s limited domain.
  
2. James says:
  
  September 22, 2012 at 6:14 pm
  
  Way back in the day I had a Speech Recognition system (expansion cartridge) for a Commodore 64.
  
  Limited “trained” domain naturally, but still you could issue trained voice commands and have it recognise which command you wanted.
  
  Doing something significantly better than that on a modern microcontroller seems reasonable.
  
3. sudobash1 says:
  
  September 26, 2012 at 10:11 pm
  
  Ah… Another example of why a fast system is not a solution to junk code. Give me good stable code any day.
  
Thanatox says:

September 22, 2012 at 11:41 am

So do you say “micro Speech” or “mu Speech”?

1. woin says:
  
  September 22, 2012 at 9:12 pm
  
  mi Speech
  
GTech says:

September 22, 2012 at 1:25 pm

I just tried it on a Leonardo and didn’t get anything on the microphone test. I’m using sparkfun’s microphone with built in preamp. Must be due to Leonardo.

BobFeg says:

September 22, 2012 at 1:26 pm

If this works it really is an amazing feat.

It’s not that easy just to get an AVR to reliably detect simple audio tones…at least to do it flawlessly while leaving enough cycles free to do much else.

I just abandoned the code method of tone detection and went with an LM567 in a recent project using a tiny88.

1. Kuy says:
  
  September 22, 2012 at 3:33 pm
  
  Were you using the Goertzel algorithm? It’s computationally very lightweight and quite accurate if you sample at the right frequency.
  
  1. wa5znu says:
    
    September 23, 2012 at 4:15 pm
    
    Do you have an implementation of the Goertzel algorithm running on an Arduino? pjrc.com/teensy had one but it was never released, and didn’t run with the Arduino library at all.
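
For readers who have not met it, the Goertzel algorithm mentioned above measures the energy at a single target frequency with one multiply and two adds per sample, instead of computing a full FFT. The following is a minimal, portable C++ sketch of the textbook recurrence, not the unreleased Teensy code referred to in this thread. The function name and parameters are chosen for illustration, and it uses floating point for clarity; a production AVR version would more likely use fixed-point arithmetic.

```cpp
#include <cmath>
#include <cstddef>

// Returns the squared magnitude of the targetHz component in a block of
// n samples taken at sampleRateHz (standard Goertzel power formula).
double goertzelPower(const double* samples, std::size_t n,
                     double targetHz, double sampleRateHz) {
    const double pi = 3.14159265358979323846;
    const double omega = 2.0 * pi * targetHz / sampleRateHz;
    const double coeff = 2.0 * std::cos(omega);
    double s1 = 0.0;  // state from one sample ago
    double s2 = 0.0;  // state from two samples ago
    for (std::size_t k = 0; k < n; ++k) {
        const double s0 = samples[k] + coeff * s1 - s2;
        s2 = s1;
        s1 = s0;
    }
    return s1 * s1 + s2 * s2 - coeff * s1 * s2;
}
```

The detector is sharpest when the block holds a whole number of cycles of the target tone, which is one reading of the advice to “sample at the right frequency”.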
    
Jonathan Moyer says:

September 22, 2012 at 1:58 pm

Why waste an Arduino on something like this? Radio Shack had a limited voice recognition chip available ten or twenty years ago. I have one or two still in the original packaging. They could understand Stop, Forward, Reverse, Left, Right, Yes, No, and a few other small words. It was distributed by Archer… I would have more info to give if I had the part number…

1. DC says:
  
  September 22, 2012 at 4:02 pm
  
  It was the Motorola VCP200 Speaker-Independent Word Recognizer.
  
  http://support.radioshack.com/support_supplies/15365.htm
  
2. Aqib says:
  
  September 22, 2012 at 4:25 pm
  
  Well sunny boy. Half of the things created on Arduino are for reverse engineering and creating yourself to simply learn. You can buy an ESR meter at the shop, people build one with these, you can buy ready made CNC routers but people choose to make their own etc etc. People build on these ideas to develop great things, so this is likely to be the stepping stone of something greater.
  
  Without further ado, next time don’t waste this precious time of yours writing such a stupid comment. Hackaday has hundreds of posts where people literally ‘recreate’ something.
  
  1. Brian Benchoff says:
    
    September 22, 2012 at 4:56 pm
    
    Well said.
    
  2. Jonathan Moyer says:
    
    September 22, 2012 at 5:16 pm
    
    Wow. What Did I do to be Snapped at? I Wasn’t Dissing it. I was just asking why. Last I checked, That was still Legal and preferred to being left in the dark and living like a mushroom. As Far as I was aware, I was asking a valid question. But If You really need to go and treat me like an idiot, Then that is your policy.
    
  3. roni says:
    
    September 23, 2012 at 1:27 am
    
    Well, you came across as very much dissing. Pointing out an alternative prefab solution is ok and useful. But your tone was redundant. As was the “why waste…” question. People like to tinker and experiment making things themselves, hacking things apart and putting them together in novel ways, learning along the way, that is why.
    
3. Leif - KC8RWR says:
  
  September 22, 2012 at 8:07 pm
  
  I had one of those too. It required way too many support components from what I remember. Of course… that’s too many support components for me to source as a kid on a one day per week paper route and pre-internet.
  
  But.. you asked why ‘waste an Arduino’. Depending on what you consider an Arduino it can be the simpler and less expensive path. I’m thinking homemade boarduino, just an AVR chip, crystal and two caps (or a resonator) mounted on something vs a full Arduino.
  
  On the other hand… if you mean a real, full Arduino. I don’t know. I would imagine the world’s stock of VCP200s has to be dwindling by now though!
  
duskwuff says:

September 22, 2012 at 2:32 pm

80% really isn’t very good. That’d mean it recognizes the wrong word about once every five tries.

1. Leif - KC8RWR says:
  
  September 22, 2012 at 8:09 pm
  
  I don’t know… it might beat a lot of speech recognition I’ve seen on cellphones, OnStar, etc…
  
  1. Squirrel says:
    
    September 23, 2012 at 9:33 am
    
    Siri apparently has about 60% accuracy.
    
Dave says:

September 22, 2012 at 8:50 pm

“Arduino” and “Ph.D.” in the same sentence.
Walter Sobchak: “Has the whole world gone crazy?”

bothersaidpooh says:

September 23, 2012 at 12:29 am

I second the cellphone comment, the Galaxy S and S2 speech recognition sucks flaming monkey balls.

“What would you like to do?”… (inserts phone in microwave) “Microwave, 30 seconds, High.” “I don’t understand your co(click)mmenAAAUGGGH I AM ON FIRE OH THE HUMANITY!! ….. ”

:-)
Someone please do a PIC port of this PDQ, even 10 words is impressive on say a greetings card.

1. Jay says:
  
  September 23, 2012 at 2:06 am
  
  Heh, imagine talking to a greetings card.
  
  “Play the birthday song!”
  
  And then it plays the imperial march.
  
bothersaidpooh says:

September 23, 2012 at 3:13 am

(adds to projects list)

I call it the “Mystery Card”.
You have to talk to it and find the string of secret words to get it to reveal the recorded message.

Sebastian says:

September 24, 2012 at 4:04 am

I also implemented speech recognition on an Arduino with the help of the EasyVR Shield:
http://www.zipfelmaus.com/blog/arduino-speech-control-easyvr-shield/

hecatomber says:

September 25, 2012 at 2:11 am

Can someone explain to me how the example code below ever enters the while loop?

    int i[3],j,min,x;
    i[0] = umatch(collvoice,"sop"); //stop
    i[1] = umatch(collvoice,"ez"); //left
    i[2] = umatch(collvoice,"i"); //right
    //find the lowest number
    while(j<0){
      if(i[j]<min){
        x = j;
        min = i[j];
      }
      j++;
    }
    if(x == 0){ stop(); }
    if(x == 1){ left(); }
    if(x == 2){ right(); }
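
The question is a fair one. In the quoted snippet `j` and `min` are never initialized, and the condition `j<0` is false for any non-negative `j`, so the loop body would normally never run. The apparent intent, going by the “find the lowest number” comment, is to pick the index of the smallest score. Below is a minimal sketch of that selection step on its own, assuming lower scores mean a closer match. The helper name `bestMatch` is hypothetical, and the scores are passed in as a plain array rather than coming from `umatch`, whose exact behavior is not shown here.

```cpp
#include <cstddef>

// Hypothetical helper: returns the index of the smallest score,
// or -1 if the array is empty. Ties go to the earliest index.
int bestMatch(const int* scores, std::size_t count) {
    if (count == 0) {
        return -1;
    }
    std::size_t best = 0;  // start with the first candidate
    for (std::size_t j = 1; j < count; ++j) {
        if (scores[j] < scores[best]) {
            best = j;
        }
    }
    return static_cast<int>(best);
}
```

With three scores stored in the order stop, left, right, a result of 0, 1, or 2 would then select `stop()`, `left()`, or `right()` as in the original snippet.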

Dorian says:

September 30, 2012 at 4:31 pm

This doesn’t appear to work. I also got no read for the calibration and my preamp circuit is working fine.

Leandro says:

November 12, 2012 at 9:12 pm

I’ve found a good work around for my home automation project. I used the BitVoicer (http://www.bitsophia.com/BitVoicer.aspx) which is a great deal for those who are dealing with the low processing power of microcontrollers.

1. adirockzz says:
  
  January 27, 2014 at 6:21 am
  
  @leandro,
  is BitVoicer PC independent? I mean, can I use it for robots, or do I have to keep it connected to a PC all the time?
  
BotherSaidPooh says:

November 23, 2014 at 11:27 am

Could this recognize dog barks?

Jonatas Loureiro says:

May 18, 2015 at 3:59 pm

Hey, I’m not sure if this is the place to ask, but does anyone here know if these speech recognition modules (EasyVR, LD3320) are able to recognise other sounds like claps, knocking on doors, dogs barking, etc? I really need one that can do this.
