Speech generation and recognition have come a long way. It wasn’t that long ago that we were in a breakfast place and endured 30 minutes of a teenaged girl screaming “CALL JUSTIN TAYLOR!” into her phone repeatedly, with no results. Now speech on phones is good enough you might never use the keyboard unless you want privacy. Every time we ask Google or Siri a question and get an answer it makes us feel like we are living in Star Trek.
[Smcameron] probably feels the same way. He’s been working on a Star Trek-inspired bridge simulator called “Space Nerds in Space” for some time. He decided to test out the current state of Linux speech support by adding speech commands and response to it. You can see the results in the video below.
Continue reading “Talking Star Trek”
Speech recognition coupled with AI is the new hotness. Amazon’s Echo is a pretty compelling device, for a largish chunk of change. But if you’re interested in building something similar yourself, it’s just gotten a lot easier. Amazon has opened up a GitHub with instructions and code that will get you up and running with their Alexa Voice Service in short order.
If you read Hackaday as avidly as we do, you’ve already read that Amazon opened up their SDK (confusingly called a “Skills Kit”) and that folks have started working with it already. This newest development is Amazon’s “official” hello-world demo, for what that’s worth.
There are also open source alternatives, so if you just want to get something up and running without jumping through registration and licensing hoops, you’ve got that option as well.
Whichever way you slice it, there seems to be a real interest in having our machines listen to us. It’s probably time for an in-depth comparison of the various options. If you know of a voice recognition system that runs on something embeddable — a single-board computer or even a microcontroller — and you’d like to see us look into it, post up in the comments. We’ll see what we can do.
Thanks to [vvenesect] for the tip!
We’ve been scratching our heads about the various voice-recognition solutions out there. What would you really want to use one for? Turning off the lights in your bedroom without getting up? Sure, it has some 2001: A Space Odyssey flair, but frankly we’ve already got a remote control for that. The best justification for voice control, in our mind, is controlling something while your hands or eyes are already busy.
[Patrick Sébastien Coulombe] clearly has both of his hands on his oscilloscope probes. That’s why he developed Speech2SCPI, a quick mash-up of voice recognition and an oscilloscope control protocol. It combines the Julius open-source speech recognizer project with the Standard Commands for Programmable Instruments (SCPI) syntax to make his scope obey his every command. You’ve got to watch the video below the break to believe how well it works. It even handles his French accent.
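To make the idea concrete, here is a minimal Python sketch of the last step: turning a recognized phrase into an SCPI command string. It is not taken from Speech2SCPI. The phrase table, the `phrase_to_scpi` helper, and the channel-scale pattern are all hypothetical, and apart from the standard `*IDN?` query the command spellings are assumptions that vary between scope vendors.

```python
import re

# Hypothetical phrase table; not the grammar Speech2SCPI actually uses.
FIXED_PHRASES = {
    "identify": "*IDN?",  # IEEE 488.2 common identification query
    "run": ":RUN",        # assumed vendor-style acquisition commands
    "stop": ":STOP",
}

# Assumed spoken form: "channel 2 scale 0.5 volts"
CHANNEL_SCALE = re.compile(r"channel (\d) scale (\d+(?:\.\d+)?) volts?")


def phrase_to_scpi(phrase):
    """Return an SCPI command string for a recognized phrase, or None."""
    text = phrase.strip().lower()
    if text in FIXED_PHRASES:
        return FIXED_PHRASES[text]
    match = CHANNEL_SCALE.fullmatch(text)
    if match:
        channel, volts = match.groups()
        # Command spelling is an assumption; check the scope's programming guide.
        return f":CHANnel{channel}:SCALe {volts}"
    return None  # unknown phrase: send nothing to the instrument
```

Returning `None` for anything outside the table keeps a misheard phrase from reaching the instrument, which matters when the recognizer is working against bench noise.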
Continue reading “You Speak, Your Scope Obeys”
What if you could give yourself a standard eye exam at home? That’s the idea behind [Joel, Margot, and Yuchen]’s final project for [Bruce Land]’s ECE 4760—simulating the standard Snellen eye chart that tests visual acuity from an actual or simulated distance of 20 feet.
This test is a bit different, though. Letters are presented one by one on a TFT display, and the user must identify each letter by speaking into a microphone. As long as the user guesses correctly, the system shows smaller and smaller letters until the size equivalent to the 20/20 line of the Snellen chart is reached.
Since the project relies on speech recognition, the group had to consider things like background noise and the differences in human voices. They use a bandpass filter to screen out frequencies that fall outside the human vocal range. To determine the letter spoken, the PIC32 collects the first 256 and last 256 samples, stores them in two arrays, and performs an FFT on the first set. The second set of samples undergoes a Mel transformation, which helps the PIC assess the sample logarithmically. Finally, the system determines whether it should show a new letter at the same size, a new letter at a smaller size, or end the exam.
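For readers who want to see the two signal-processing steps in isolation, the following Python sketch runs a radix-2 FFT over a 256-sample frame and converts a frequency to the Mel scale. It is not the students' PIC32 code. The function names are made up for illustration, and the Mel formula is the commonly used 2595·log10(1 + f/700) variant, which is an assumption about what the project does.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two (e.g. 256)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_bin(samples):
    """Index of the strongest bin, ignoring DC and the mirrored upper half."""
    spectrum = fft(samples)
    half = len(samples) // 2
    return max(range(1, half), key=lambda k: abs(spectrum[k]))


def hz_to_mel(freq_hz):
    """Map frequency to the Mel scale (assumed common formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)
```

The Mel mapping is close to linear at low frequencies and compresses high ones, which is one way to read the "logarithmic" assessment described above.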
While this is not meant to replace eye exams done by certified professionals, it is an interesting project that is true to the principles of the Snellen eye chart. The only thing that might make this better is an e-ink display to make the letters crisp. We’d like to see Snellen’s tumbling E chart implemented as well for children who don’t yet know the alphabet, although that would probably require a vastly different input method. Be sure to check out the demonstration video after the break.
Don’t know who [Bruce Land] is? Of course he’s an esteemed Senior Lecturer at Cornell University. But he’s also extremely active on Hackaday.io, has many great embedded engineering lectures you can watch free-of-charge, and every year we look forward to seeing the projects — like this one — dreamed and realized by his students. Do you have final projects of your own to show off? Don’t be shy about sending in a tip!
Continue reading “Students Set Sights on DIY Eye Exams”
[Naran] was intrigued with the Amazon Echo’s ability to control home electronics, but decided to roll his own. By using a Raspberry Pi with the beta Prota OS, he managed to control some Philips Hue bulbs and a homebrew smart outlet.
Prota has a speech application, which made the job simpler. He does point out, though, that his project doesn’t replace the Echo’s ability to answer questions by searching the Internet. The advantage is that it is easily tailored to your specific application. Also, if you have a Raspberry Pi hanging around, you can’t beat the price.
Continue reading “Voice Command with No Echo”
A gyroscope is a device made for measuring orientation and can typically be found in modern smartphones or tablet PCs to enable a rich user experience. A team from Stanford managed to recognize simple words by analyzing only gyroscope signals (PDF warning). The complex inner workings of MEMS-based gyroscopes (which use the Coriolis effect) and Android software limitations allowed the team to sniff only frequencies under 200Hz. This may explain the average 12% word recognition rate that was achieved with custom recognition algorithms. It may, however, still be enough to make you reconsider installing an app that doesn’t necessarily need access to the on-board sensors to work. Interestingly, the paper also states that STMicroelectronics currently has an 80% market share for smartphone and tablet PC gyroscopes.
On the same topic, you may be interested to check out a gyroscope-based smartphone keylogging attack we featured a couple of years ago.
The lowly Arduino, an 8-bit AVR microcontroller with a pitiful amount of RAM, terribly small Flash storage space, and effectively no peripherals to speak of, has better speech recognition capabilities than your Android or iDevice. Eighty percent accuracy, compared to Siri’s sixty. Here’s the video to prove it.
This uSpeech library created by [Arjo Chakravarty] uses a Goertzel algorithm to turn input from a microphone connected to one of the Arduino’s analog pins into phonemes. From there, it’s relatively easy to turn these captured phonemes into function calls for lighting an LED, turning a servo, or even replicating Siri, the modern-day version of the Microsoft paperclip.
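The Goertzel algorithm is attractive on a small microcontroller because it measures the energy at one chosen frequency without computing a full FFT. Here is a minimal Python sketch of the textbook form. It is not the uSpeech source, and the function name, sample rate, and target frequencies in the usage note are illustrative assumptions.

```python
import math


def goertzel_power(samples, sample_rate, target_hz):
    """Squared magnitude of the DFT bin nearest target_hz."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)  # nearest integer bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = 0.0
    s_prev2 = 0.0
    for x in samples:
        # Second-order recurrence: one multiply and two adds per sample
        s = x + coeff * s_prev - s_prev2
        s_prev2 = s_prev
        s_prev = s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2
```

As a usage example, feeding it 200 samples of a 1000 Hz sine sampled at 8000 Hz should give a large value for `target_hz=1000` and a value near zero for `target_hz=2000`. Comparing the power in a handful of such bands is one plausible way to tell coarse phoneme classes apart.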
There is one caveat for the uSpeech library: it will only respond to predefined phrases and not normal speech. Still, that’s an extremely impressive accomplishment for a simple microcontroller.
This isn’t the first time we’ve seen [Arjo]’s uSpeech library, but it is the first time we’ve seen it in action. When this was posted months and months ago, [Arjo] was behind the Great Firewall of China and couldn’t post a proper demo. Since the uSpeech library is a spectacular achievement, we asked for a few videos showing off a few applications. No one made the effort, so [Arjo] decided to make use of his new VPN and show off his work to the world.
Continue reading “An Arduino With Better Speech Recognition Than Siri”