Speech commands are all the rage on everything from digital assistants to cars. Adding them to your own projects is a lot of work, right? Maybe not. [Electronoobs] shows a speech board that lets you easily integrate 255 voice commands via serial communications with a host computer. You can see the review in the video below.
He had actually used a similar board before, but that was a few years ago, and the new module has, of course, many new features. As of version 3.1, the board can handle 255 commands in a more flexible way than the older versions.
For just about any task you care to name, a Linux-based desktop computer can get the job done using applications that rival or exceed those found on other platforms. However, that doesn’t mean it’s always easy to get it working, and speech recognition is just one of those difficult setups.
A project called Voice2JSON is trying to simplify the use of voice workflows. While it doesn’t provide the actual voice recognition, it does make it easier to get things going and then use speech in a natural way.
The software can integrate with several backends to do offline speech recognition including CMU’s pocketsphinx, Dan Povey’s Kaldi, Mozilla’s DeepSpeech 0.9, and Kyoto University’s Julius. However, the code is more than just a thin wrapper around these tools. The fast training process produces both a speech recognizer and an intent recognizer. So not only do you know there is a garage door, but you gain an understanding of the opening and closing of the garage door.
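To make the split between recognizing speech and recognizing intent concrete, here is a minimal Python sketch of consuming one intent event. It assumes the event arrives as a JSON object with a `text` field, an `intent` object carrying a `name`, and a `slots` mapping, which is the general shape Voice2JSON documents for its output. The intent name `ChangeGarageDoor`, the `state` slot, and the `handle_event` helper are hypothetical and exist only for this example.

```python
import json

# Hypothetical sample event; the intent and slot names are invented for illustration.
SAMPLE_EVENT = (
    '{"text": "open the garage door", '
    '"intent": {"name": "ChangeGarageDoor"}, '
    '"slots": {"state": "open"}}'
)


def handle_event(line):
    """Turn one JSON intent event into a short action string."""
    event = json.loads(line)
    intent = event.get("intent", {}).get("name", "")
    slots = event.get("slots", {})
    if intent == "ChangeGarageDoor":
        state = slots.get("state")
        # Only act on the two states this sketch knows about.
        if state in ("open", "close"):
            return f"garage door: {state}"
    return "unrecognized"


if __name__ == "__main__":
    print(handle_event(SAMPLE_EVENT))
```

The point of the sketch is that the application code never touches the transcript directly: it branches on the intent name and reads the slot values, which is what the intent recognizer adds on top of plain speech-to-text.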
Before smartphones and Internet of Things devices were widely distributed, the Automatic Packet Reporting System (APRS) was the way to send digital information out wirelessly from remote locations. In use since the 80s, it now has an almost hipster “wireless data before it was cool” vibe, complete with plenty of people who use it because it’s interesting, and plenty of others who still need the unique functionality it offers even when compared to more modern wireless data transmission methods. One of those is [Tyler] who shows us how to build an APRS system for a minimum of cost and size.
[Tyler]’s build is called Arrow and operates on the popular 2 metre ham radio band. It’s a Terminal Node Controller (TNC), a sort of ham radio modem, built around an ESP32. The ESP32 handles the signal processing for the data and uses its Bluetooth capability to pair with an Android app called APRSDroid. The entire module is only slightly larger than the 18650 battery that powers it, and it can be paired with a computer to send and receive any digital data that you wish using this module as a plug-and-play transceiver.
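For readers wondering what a TNC and its host actually exchange, here is a rough Python sketch of KISS framing, a byte-stuffing scheme commonly used between TNCs and host software. The write-up does not say which host protocol Arrow speaks, so treating it as KISS is an assumption, and the function names `kiss_encode` and `kiss_decode` are made up for this example.

```python
# KISS special bytes: frame end, frame escape, and their transposed forms.
FEND, FESC, TFEND, TFESC = 0xC0, 0xDB, 0xDC, 0xDD


def kiss_encode(payload: bytes, port: int = 0) -> bytes:
    """Wrap a payload in a KISS data frame for the given TNC port."""
    # High nibble is the port number, low nibble 0 means "data frame".
    out = bytearray([FEND, (port << 4) & 0xF0])
    for b in payload:
        if b == FEND:
            out += bytes([FESC, TFEND])
        elif b == FESC:
            out += bytes([FESC, TFESC])
        else:
            out.append(b)
    out.append(FEND)
    return bytes(out)


def kiss_decode(frame: bytes) -> bytes:
    """Recover the payload from a single, complete KISS data frame."""
    if len(frame) < 3 or frame[0] != FEND or frame[-1] != FEND:
        raise ValueError("not a complete KISS frame")
    out = bytearray()
    it = iter(frame[2:-1])  # skip the leading FEND and the command byte
    for b in it:
        if b == FESC:
            nxt = next(it, None)
            if nxt == TFEND:
                out.append(FEND)
            elif nxt == TFESC:
                out.append(FESC)
            else:
                raise ValueError("invalid escape sequence")
        else:
            out.append(b)
    return bytes(out)
```

The escaping exists so that the frame delimiter never appears inside the payload, which lets the receiver resynchronize on a noisy serial or Bluetooth link simply by waiting for the next delimiter byte.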
While the build still has a few limitations that [Tyler] notes, he hopes that the project will be a way to modernize the APRS protocol using methods for radio transmission that have been improved upon since APRS was first implemented. It should be able to interface easily into any existing ham radio setup, although even small balloon-lofted radio stations can make excellent use of APRS without any extra equipment. Don’t forget that you need a license to operate these in most places, though!
[Udi] lives in an apartment with a pleasant balcony. He also has three kids who are home most of the time now, so he finds himself spending a little more time out on the balcony than he used to. To upgrade his experience, he installed a completely custom shade controller to automatically open and close his sunshade as the day progresses.
Automatic motors for blinds and other shades are available for purchase, but [Udi]’s shade is too big for any of these small motors to work. Finding a large servo with a 2:1 gear ratio was the first step, as well as creating a custom mount for it to attach to the sunshade. Once the mechanical situation was solved, he programmed an ESP32 to control the servo. The ESP32 originally had control buttons wired to it, but [Udi] eventually transitioned to NFC for limit switch capabilities and also implemented voice control for the build.
While not the first shade controller we’ve ever seen, this build does make excellent use of appropriate hardware and its built-in features. Although we suppose it’s possible this could have been done with a 555 timer, the project came together very well, especially for [Udi]’s first Arduino-compatible build. If you decide to replicate this build, though, make sure that your shade controller is rental-friendly if it needs to be.
Is it just me or did January seem to last for about three months this year? A lot has happened since the turn of the decade 31 days ago, both in the normie world and in our space. But one of the biggest pieces of news in the hacker community is something that won’t even happen for four more months: Hackaday Belgrade. The annual conference in Hackaday’s home-away-from-home in Serbia was announced, and as usual, one had to be a very early bird to score discount tickets. Regular tickets are still on sale, but I suspect that won’t last long. The call for proposals for talks went out earlier in the month, and you should really consider standing up and telling the world what you know. Or tell them what you don’t know and want to find out – there’s no better way to make connections in this community, and no better place to do it.
Someone dropped a tip this week about the possible closing of Tanner Electronics, the venerable surplus dealer located in Carrollton, Texas, outside of Dallas and right around the corner from Dallas Makerspace. The report from someone visiting the store is that the owner has to either move the store or close it down. I spoke to someone at the store who didn’t identify herself, but she confirmed that they need to either downsize or close. She said they’re actively working with a realtor and are optimistic that they’ll find a space that fits their needs, but the clock is ticking – they only have until May to make the change. We covered Tanner’s in a 2015 article on “The Death of Surplus”. It would be sad to lose yet another surplus store; as much as we appreciate being able to buy anything and everything online, nothing beats the serendipity that can strike walking up and down aisles filled with old stuff. We wish them the best of luck.
Are you finding that the smartphone in your pocket is more soul-crushing than empowering? You’re not alone, and more and more people are trying a “digital detox” to free themselves from the constant stimulation. And there’s no better way to go about this than by turning your smartphone into a not-so-smart phone. Envelope, a paper cocoon for your phone, completely masks the screen, replacing it with a simple printed keypad. A companion app allows you to take and make phone calls or use the camera, plus provides a rudimentary clock, but that’s it. The app keeps track of how long you can go before unwrapping your phone and starting those sweet, sweet dopamine hits again. It reminds us a bit of the story we also saw this week about phone separation anxiety in school kids, and the steps schools are taking to mitigate that problem.
We saw a lot of articles this week on a LoRaWAN security vulnerability. The popular IoT network protocol has been billed as “secure by default”, but a white paper released by cybersecurity firm IOActive found a host of potential attack vectors. Their main beef seems to be that client devices which are physically accessible can be reverse engineered to reveal their encryption keys. They also point out the obvious step of taking the QR code off of client devices so an attacker can’t generate session keys for the device.
And finally, the mummy speaks! If you ever wondered what the voice of someone who lived 3,000 years ago sounded like, wonder no more. Using computed tomography (CT) data, scientists in the UK and Germany have recreated the vocal tract of Nesyamun, an Egyptian scribe and priest from the time of pharaoh Rameses XI. He died in his mid-50s, and his mummified remains have been studied since the 1800s. CT data was used to 3D-print Nesyamun’s larynx and nasopharynx, which were then placed atop a “Vocal Tract Organ”, possibly the strangest musical instrument in existence. The resulting vowel-like utterance is brief, to say the least, but it’s clear and strong, and it’s pretty impressive that we can recreate the voice of someone who lived and died three millennia ago.
It’s a system that initially sounds cumbersome, but through smart design, is actually quite streamlined. Users can talk to the system, which uses an Amazon Alexa device for natural language voice recognition. This enables HeyTeddy to respond to questions like “how do I use a flex sensor?” as well as direct commands, such as “Set pin 10 to 250”.
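As a rough illustration of what handling a direct command involves once the speech has been transcribed, here is a small Python sketch that parses an utterance like the one quoted above. It is not HeyTeddy's implementation: the `parse_set_pin` helper and its regular expression are hypothetical, and the 0 to 255 limit is an assumption based on the usual 8-bit range of Arduino's `analogWrite`.

```python
import re

# Hypothetical pattern for commands of the form "set pin <number> to <number>".
_SET_PIN = re.compile(r"^\s*set\s+pin\s+(\d+)\s+to\s+(\d+)\s*$", re.IGNORECASE)


def parse_set_pin(utterance):
    """Return (pin, value) for a 'set pin N to V' command, or None otherwise."""
    match = _SET_PIN.match(utterance)
    if match is None:
        return None
    pin, value = int(match.group(1)), int(match.group(2))
    # Assumed limit: reject values outside the 8-bit PWM range.
    if not 0 <= value <= 255:
        return None
    return pin, value


if __name__ == "__main__":
    print(parse_set_pin("Set pin 10 to 250"))
```

Anything that does not match, such as a "how do I" question, returns `None` and would be routed to a different handler in a fuller system.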
The demo video does a great job of demonstrating the system. While the system is not suited to professional development tasks, it has value as an educational tool for beginners. The system is able to guide users through hardware setup on a breadboard, as well as through tests when things don’t work. Once their experience level builds, code can be exported to the Arduino IDE for direct editing.
It’s a great tool that has plenty of promise to bring many more users into the hardware hacking fold. It’s out of the workshop of [MAKInteract], whose work we’ve seen before. Video after the break.
The device is built around Google’s AIY Voice Kit, which consists of a Raspberry Pi with some additional hardware and software to enable it to process voice queries. [Liz] combined this with a Raspberry Pi camera and the Google Cloud Vision API. This allows WhatIsThat to respond to users asking questions by taking a photo, and then identifying what it sees in the frame.
It may seem like a frivolous project to those with working vision, but there is serious potential for this technology in the accessibility space. The device can not only describe things like animals or other objects, it can also read text aloud and even identify logos. The ability of the software to go beyond simple object recognition is impressive: a video demonstration shows the AI correctly identifying a Boston Terrier, and attributing a quote to Albert Einstein.
Artificial intelligence has made a huge difference to the viability of voice recognition – because it’s one thing to understand the words, and another to understand what they mean when strung together. Video after the break.