An Arduino With Better Speech Recognition Than Siri

The lowly Arduino, built around an 8-bit AVR microcontroller with a pitiful amount of RAM, terribly small Flash storage space, and effectively no peripherals to speak of, has better speech recognition capabilities than your Android or iDevice. Eighty percent accuracy, compared to Siri’s sixty. Here’s the video to prove it.

The uSpeech library, created by [Arjo Chakravarty], uses the Goertzel algorithm to turn input from a microphone connected to one of the Arduino’s analog pins into phonemes. From there, it’s relatively easy to turn these captured phonemes into function calls for lighting an LED, turning a servo, or even replicating Siri, the modern-day version of the Microsoft paperclip.
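
For the curious, the Goertzel algorithm is essentially a one-bin DFT: a single second-order filter that measures signal power at one frequency of interest, which is why it fits on an AVR where a full FFT would be painful. Here is a minimal Python sketch of the idea, illustrative only and not [Arjo]’s implementation:

```python
import math

def goertzel_power(samples, sample_rate, target_freq):
    """Signal power at one frequency, computed with a single
    second-order recurrence instead of a full FFT."""
    n = len(samples)
    k = round(n * target_freq / sample_rate)  # nearest DFT bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1  # the Goertzel recurrence
    return s1 * s1 + s2 * s2 - coeff * s1 * s2  # |X[k]|^2

# Quick check: a 440 Hz tone shows far more power in the 440 Hz
# bin than in a 1200 Hz bin.
rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / rate) for t in range(256)]
print(goertzel_power(tone, rate, 440) > goertzel_power(tone, rate, 1200))  # True
```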

There is one caveat for the uSpeech library: it will only respond to predefined phrases and not normal speech. Still, that’s an extremely impressive accomplishment for a simple microcontroller.

This isn’t the first time we’ve seen [Arjo]’s uSpeech library, but it is the first time we’ve seen it in action. When this was posted months and months ago, [Arjo] was behind the Great Firewall of China and couldn’t post a proper demo. Since the uSpeech library is a spectacular achievement, we asked for a few videos showing off a few applications. No one stepped up, so [Arjo] decided to make use of his new VPN and show off his work to the world.

Video below.

Continue reading “An Arduino With Better Speech Recognition Than Siri”

Raspberry Pi Becomes A Universal Translator

We’re still about 150 years away from the invention of the universal translator by [Lt Cdr Sato] of the Enterprise NX-01, but [Dave] has something that’s almost as good: a speech-recognition, translation, and text-to-speech setup for the Raspberry Pi that theoretically allows anyone to speak in sixty different languages.

After setting up all the Linux audio cruft, [Dave] digs in and starts converting the guttural vocalizations of a meat speaker into something Google’s speech-to-text service can understand. From there, it’s off to Google again, this time converting text in one language into the writings of another.
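
The real thing is a shell script chaining web services together, but the overall shape of the pipeline is simple enough to sketch. Here is a rough Python outline with the web-service calls stubbed out; the stubs return canned strings and are illustrative stand-ins, not [Dave]’s code:

```python
import subprocess

def record(path="utterance.wav", seconds=5):
    """Grab a few seconds from the default mic (assumes ALSA's arecord)."""
    subprocess.run(["arecord", "-d", str(seconds), "-f", "cd", path], check=True)
    return path

def speech_to_text(wav_path):
    # Stand-in for posting the audio to a speech-to-text service
    # and parsing the transcript out of the response.
    return "where is the train station"

def translate(text, source="en", target="es"):
    # Stand-in for the translation-service call.
    return "donde esta la estacion de tren"

def speak(text, lang="es"):
    # Stand-in for fetching TTS audio and playing it back.
    print("[%s] %s" % (lang, text))

speak(translate(speech_to_text(record())))
```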

[Dave]’s end result is a shell script that works reasonably well for something that won’t be invented for another 150 years. The video below shows the script successfully translating English to Spanish, but it should work equally well with other languages such as Dutch and Latin, as well as less popular languages such as Esperanto and French.

Continue reading “Raspberry Pi Becomes A Universal Translator”

Speech Recognition On An Arduino

Speech recognition is usually the purview of fairly high-powered computers chugging along at hundreds of megahertz with megabytes of RAM. Bringing speech recognition to the low-power microcontroller you’d find in an Arduino sounds like the work of a mad scientist or Ph.D. candidate, but that’s exactly what [Arjo Chakravarty] did. He developed the μSpeech library for the Arduino to allow speech recognition of a limited set of voice commands.

Where most speech recognition systems use an FFT and very fancy math to determine what phonemes a user is saying, [Arjo]’s system does away with this unnecessary complexity in favor of very, very basic integral and differential calculus.
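
In practice that means working with running sums and sample-to-sample differences instead of frequency bins. Here is a toy Python sketch of the idea: the ratio of a window’s summed absolute derivative to its summed absolute amplitude comes out higher for hissy fricatives like /s/ and /f/ than for vowels. The thresholds below are made up for illustration and are not pulled from μSpeech:

```python
def phoneme_feature(window):
    """Ratio of a crude derivative (sample-to-sample change) to a
    crude integral (total absolute amplitude) over one window."""
    deriv = sum(abs(b - a) for a, b in zip(window, window[1:]))
    integ = sum(abs(x) for x in window)
    return deriv / integ if integ else 0.0

def classify(window):
    f = phoneme_feature(window)
    if f > 0.4:        # fast-changing: fricative-like (s, f, sh)
        return "fricative"
    if f > 0.15:       # in between: plosive-like
        return "plosive"
    return "vowel"     # slow-changing: vowel-like
```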

From [Arjo]’s user guide for μSpeech (PDF warning) we can see it’s possible to connect a small microphone to the analog input of an Arduino and accept voice commands such as ‘left’, ‘right’, and ‘stop’. The accuracy is pretty good, as well – 80% if μSpeech is trying to recognize words, and 30-40% if μSpeech is programmed to recognize single phonemes.

Sadly we couldn’t find a demo video of μSpeech in action, but you’re more than welcome to grab it via GitHub for your own project. Send us a video of μSpeech in action and we’ll put it up.

Sorting Resistors With Speech Recognition

If you’ve ever had to organize a bunch of resistors, you’ll know why [Anthony] created EESpeak. It’s a voice-controlled component look-up tool that calculates a component’s value by listening to you read out its color-code bands.

In his demo video of EESpeak, [Anthony] reads off the color bands of several resistors whilst the program dutifully calculates and displays the value. [Anthony] also included support for calculating the value of capacitors and inductors by speaking the color bands, as well as EIA-96 codes for SMD parts.
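
The decoding itself is the easy part once the words have been recognized. A minimal Python sketch of the standard four-band resistor lookup (illustrative, not [Anthony]’s code):

```python
COLORS = ["black", "brown", "red", "orange", "yellow",
          "green", "blue", "violet", "grey", "white"]

def resistor_value(bands):
    """Decode e.g. ["yellow", "violet", "red"]: two significant
    digits followed by a power-of-ten multiplier."""
    d1, d2 = (COLORS.index(b) for b in bands[:2])
    return (d1 * 10 + d2) * 10 ** COLORS.index(bands[2])

print(resistor_value(["yellow", "violet", "red"]))  # 4700 (ohms)
```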

In addition to taking speech input and flashing a component value on the screen, EESpeak also has a text-to-speech function that will tell you a component’s value without you ever having to look at your monitor.

Even though the text-to-speech function seems a little cumbersome – it takes much longer for a computer to speak a value than to display it on the screen – using voice recognition to calculate component values is an awesome idea. Since there’s only an extremely limited vocabulary the computer has to understand, the error rate of EESpeak is probably very low.

You can check out [Anthony]’s demo video after the break, and of course download the app on his blog.

Continue reading “Sorting Resistors With Speech Recognition”

Adding Speech Control To An Old Robotic Arm

[Joris Laurenssen] has been hanging onto this robotic arm for about twenty years. His most recent project uses some familiar tools to add voice control for each of the arm’s joints.

The arm has its own controller which connects via a DB-25 port. [Joris’] first task was to figure out what type of commands were being sent through the connection. He did some testing to establish the levels of the signals, then hooked up his Arduino and had it read out the values coming through the standard parallel connection. This let him quickly establish the simple ASCII character syntax used to command movement from the device. There are only eight command sets, and it didn’t take much work to whip up a sketch that can now drive the device.

The second portion of the project is to use voice commands to push these parallel signals to the arm. Instead of reinventing the wheel, he decided to use the speech recognition feature of his Android phone. He used the Scripting Layer for Android (SL4A) and a Python script to interpret commands, push them to his computer via Telnet, and finally drive the arm. We’ve embedded the video demo after the break. He gives the commands in Dutch, but he overlaid comments in English so you can tell what’s going on.
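
To give a rough idea of what that glue looks like: SL4A’s Python facade exposes the phone’s speech recognizer as a single call, and the standard library handles the Telnet side. The host, port, and command letters below are made up for illustration; only the overall shape follows [Joris]’s setup:

```python
# Runs on the phone under SL4A; 'android' is SL4A's Python facade.
import android
import telnetlib

droid = android.Android()

# Hypothetical address of the PC that relays commands to the arm.
HOST, PORT = "192.168.1.10", 2323

result = droid.recognizeSpeech("Say a command", None, None)
spoken = (result.result or "").lower()

# Map spoken words to single-character ASCII commands; the letters
# here are illustrative, not the arm's actual syntax.
COMMANDS = {"left": "L", "right": "R", "up": "U", "down": "D"}

if spoken in COMMANDS:
    tn = telnetlib.Telnet(HOST, PORT)
    tn.write(COMMANDS[spoken].encode("ascii") + b"\n")
    tn.close()
```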

Continue reading “Adding Speech Control To An Old Robotic Arm”

Forget Siri – Make Wolfram Alpha Your Personal Assistant

Sure, you can spend a bundle on a new phone that comes with a voice-activated digital assistant, but let’s be honest: it’s much more satisfying if you code up this feature yourself. Here’s a guide on doing just that by combining an Asterisk server with the Wolfram Alpha API.

Asterisk is a package we’re already familiar with. It’s an open source Private Branch Exchange suite that lets you build your own telephone network. Chances are you’re not going to build one just for this project, but if you do, make sure to document the process and let us know about it. With the Asterisk server in place, you just need to give the assistant script an extension (in this case it’s 4747).

But then there’s the problem of turning your speech into text that can be submitted as a Wolfram query. There’s an API for that too, one that uses Google to do the speech-to-text conversion. From there you can tweak abbreviations and other parameters, but all in all your new assistant is ready to go. Call it up and ask what to do when you have a flat tire (yeah, that commercial drives us crazy too).
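
The Wolfram Alpha half is the easy part: the v2 API takes a plain-text question and an app ID, and hands back XML pods. A minimal Python sketch of that half (you’d need your own app ID; the Asterisk and speech-to-text glue isn’t shown):

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

APPID = "YOUR-APP-ID"  # free developer key from Wolfram

def ask_wolfram(question):
    """Return the first primary plain-text pod for a query."""
    url = "http://api.wolframalpha.com/v2/query?" + urllib.parse.urlencode(
        {"input": question, "appid": APPID, "format": "plaintext"})
    tree = ET.parse(urllib.request.urlopen(url))
    for pod in tree.findall(".//pod"):
        if pod.get("primary") == "true":
            text = pod.findtext(".//plaintext")
            if text:
                return text
    return "No result."

print(ask_wolfram("How do I fix a flat tire?"))
```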

[Thanks M]

Robot Takes Voice Commands Via Open Source CSR

This is Chippu, a robot that [Achu] has been working on for some time. His most recent addition was to give the robot the ability to respond to voice commands. This is accomplished using a variation of the open source Continuous Speech Recognition package called Julius.

The package depends on two main parts: a set of acoustic models that let it match incoming sounds, and a reference grammar, built up from those sound units, that defines the words and phrases to recognize. [Achu] published another post which goes into detail about using Julius on a Linux box. It seems like this is possible on less robust hardware (i.e., on an embedded system) if you narrow down the number of acoustic and grammar models that need to be matched.
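
For a sense of what such a grammar looks like, Julius takes a pair of plain-text files, a .grammar file defining sentence structure and a .voca file mapping each word to its phoneme string, which get compiled with Julius’s mkdfa.pl tool. The snippet below is a generic example in that format, not Chippu’s actual command set:

```
# robot.grammar -- sentence structure
S        : NS_B COMMAND NS_E
COMMAND  : ACTION

# robot.voca -- word categories and phoneme strings
% NS_B
<s>       sil
% NS_E
</s>      sil
% ACTION
FORWARD   f ao r w er d
STOP      s t aa p
```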

For now, Chippu is getting commands from a computer that runs the CSR. But this was only used as a proof-of-concept and [Achu] plans to transition the bot over to smaller hardware like the BeagleBoard.

Check out the demonstration of Chippu responding to voice commands in the video after the break.

Continue reading “Robot Takes Voice Commands Via Open Source CSR”