There are a bunch of newly minted millionaires this week, after it was announced that Stack OverFlow would be acquired for $1.8 billion by European tech investment firm Prosus. While not exactly a household name, Prosus is a big player in the Chinese tech scene, where it has about a 30% stake in Chinese internet company Tencent. They trimmed their holdings in the company a bit recently, raising $15 billion in cash, which we assume will be used to fund the SO purchase. As with all such changes, there’s considerable angst out in the community about how this could impact everyone’s favorite coding help site. The SO leadership are all adamant that nothing will change, but only time will tell.
Like most of us, [Peter] had a bit of extra time on his hands during quarantine and decided to take a look back at speech recognition technology in the 1970s. Quickly, he started thinking to himself, “Hmm…I wonder if I could do this with an Arduino Nano?” We’ve all probably had similar thoughts, but [Peter] really put his theory to the test.
The hardware itself is pretty straightforward. There is an Arduino Nano to run the speech recognition algorithm and a MAX9814 microphone amplifier to capture the voice commands. However, the beauty of [Peter’s] approach, lies in his software implementation. [Peter] has a bit of an interplay between a custom PC program he wrote and the Arduino Nano. The learning aspect of his algorithm is done on a PC, but the implementation is done in real-time on the Arduino Nano, a typical approach for really any machine learning algorithm deployed on a microcontroller. To capture sample audio commands, or utterances, [Peter] first had to optimize the Nano’s ADC so he could get sufficient sample rates for speech processing. Doing a bit of low-level programming, he achieved a sample rate of 9ksps, which is plenty fast for audio processing.
To analyze the utterances, he first divided each sample utterance into 50 ms segments. Think of dividing a single spoken word into its different syllables. Like analyzing the “se-” in “seven” separate from the “-ven.” 50 ms might be too long or too short to capture each syllable cleanly, but hopefully, that gives you a good mental picture of what [Peter’s] program is doing. He then calculated the energy of 5 different frequency bands, for every segment of every utterance. Normally that’s done using a Fourier transform, but the Nano doesn’t have enough processing power to compute the Fourier transform in real-time, so Peter tried a different approach. Instead, he implemented 5 sets of digital bandpass filters, allowing him to more easily compute the energy of the signal in each frequency band.
The energy of each frequency band for every segment is then sent to a PC where a custom-written program creates “templates” based on the sample utterances he generates. The crux of his algorithm is comparing how closely the energy of each frequency band for each utterance (and for each segment) is to the template. The PC program produces a .h file that can be compiled directly on the Nano. He uses the example of being able to recognize the numbers 0-9, but you could change those commands to “start” or “stop,” for example, if you would like to.
[Peter] admits that you can’t implement the type of speech recognition on an Arduino Nano that we’ve come to expect from those covert listening devices, but he mentions small, hands-free devices like a head-mounted multimeter could benefit from a single word or single phrase voice command. And maybe it could put your mind at ease knowing everything you say isn’t immediately getting beamed into the cloud and given to our AI overlords. Or maybe we’re all starting to get used to this. Whatever your position is on the current state of AI, hopefully, you’ve gained some inspiration for your next project.
There’s little debate that Amazon’s Alexa ecosystem makes it easy to add voice control to your smart home, but not everyone is thrilled with how it works. The fact that all of your commands are bounced off of Amazon’s servers instead of staying internal to the network is an absolute no-go for the more privacy minded among us, and honestly, it’s hard to blame them. The whole thing is pretty creepy when you think about it.
Which is precisely why [André Hentschel] decided to look into replacing the firmware on his Amazon Echo with an open source alternative. The Linux-powered first generation Echo had been rooted years before thanks to the diagnostic port on the bottom of the device, and there were even a few firmware images floating around out there that he could poke around in. In theory, all he had to do was remove anything that called back to the Amazon servers and replace the proprietary bits with comparable free software libraries and tools.
Of course, it ended up being a little trickier than that. The original Echo is running on a 2.6.x series Linux kernel, which even for a device released in 2014, is painfully outdated. With its similarly archaic version of glibc, newer Linux software would refuse to run. [André] found that building an up-to-date filesystem image for the Echo wasn’t a problem, but getting the niche device’s hardware working on a more modern kernel was another story.
He eventually got the microphone array working, but not the onboard digital signal processor (DSP). Without the DSP, the age of the Echo’s hardware really started to show, and it was clear the seven year old smart speaker would need some help to get the job done.
The solution [André] came up with is not unlike how the device worked originally: the Echo performs wake word detection locally, but then offloads the actual speech processing to a more powerful computer. Except in this case, the other computer is on the same network and not hidden away in Amazon’s cloud. The Porcupine project provides the wake word detection, speech samples are broken down into actionable intents with voice2json, and the responses are delivered by the venerable eSpeak speech synthesizer.
As you can see in the video below the overall experience is pretty similar to stock, complete with fancy LED ring action. In fact, since Porcupine allows for multiple wake words, you could even argue that the usability has been improved. While [André] says adding support for Mycroft would be a logical expansion, his immediate goal is to get everything documented and available on the project’s GitLab repository so others can start experimenting for themselves.
They say the best things in life are free, but we would loudly argue that a dollar can go a long way, too. It all depends on what you do with it. When [lonesoulsurfer] saw this busted-up handheld racing game at the junk store, he fell in love with the lines of the case and gladly forked over a buck in order to give it a new life as a wicked little sound-bending machine with dancing LEDs.
Here’s how it works: [lonesoulsurfer] records a few seconds of whatever into the mic with the looping function switched off, then turns it back on to start the fun. He can vary the pitch with the speed controller pot, or add in some echo and reverb. Once the sound is dialed in, he works the pause button on the left to make melodies by stopping and restarting the loop, or just pausing it momentarily depending on the switch setting.
The electronics are a mashup of modules mixed with a custom PCB that combines the recording module with an LM386 amplifier and holds the coolest part of this build — those LEDs that dance to the music behind the toy’s original lenticular screen. Like most of [lonesoulsurfer]’s builds, it’s powered by an old cell phone battery that’s buck-boosted to 5 V. Check out the build and bleep-bloop video after the break.
Lenticular lenses are all kinds of fun. Get one that’s big enough, and you can use it to disappear for a while.
Since we’ve started inviting them into our homes, many of us have began casting a wary eye at our smart speakers. What exactly are they doing with the constant stream of audio we generate, some of it coming from the most intimate and private of moments? Sure, the big companies behind these devices claim they’re being good, but do any of us actually buy that?
It seems like the most prudent path is to not have one of these devices, but they are pretty useful tools. So this hardware mute switch for an Amazon Echo represents a middle ground between digital Luddism and ignoring the possible privacy risks of smart speakers. Yes, these devices all have software options for disabling their microphone arrays, but as [Andrew Peters] relates it, his concern is mainly to thwart exotic attacks on smart speakers, some of which, like laser-induced photoacoustic attacks, we’ve previously discussed. And for that job, only a hardware-level disconnect of the microphones will do.
To achieve this, [Andrew] embedded a Seeeduino Xiao inside his Echo Dot Gen 2. The tiny microcontroller grounds the common I²S data line shared by the seven (!) microphones in the smart speaker, effective disabling them. Enabling and disabling the mics is done via the existing Dot keys, with feedback provided by tones sent through the Dot speaker. It’s a really slick mod, and the amount of documentation [Andrew] did while researching this is impressive. The video below and the accompanying GitHub repo should prove invaluable to other smart speaker hackers.
We are always surprised that Amazon or Google doesn’t employ Kelsey Grammer — TV’s Frasier — as a spokesman for their smart home devices. After all, his catchphrase was, “I’m listening…” Maybe they don’t want to remind you that the device could, theoretically, be sending everything you say to them or a nefarious hacker or government agency. Sure, there’s a mute button and it lights up a red LED.
But if you are truly paranoid, that’s not enough. After all, the same people want to eavesdrop on you would be happy to fake a red light. [Electronupdate] had the same thought and decided to answer the question: does the mute button really mute your microphone? The answer required not only some case opening and analysis, but there was even some IC decapsulation.
We were impressed with the depth of the analysis. The tiny SMD parts are marked confusingly, and if you are really paranoid you don’t believe them anyway. But looking at the actual circuit die is pretty unambiguous. The parts in question turned out to be a Schmitt trigger, a flip flop, and a NAND gate.
Now that November of 2019 has passed, it’s a shame that some of the predictions made in Blade Runner for this future haven’t yet come true. Oh sure, 109 million people living in Los Angeles would be fun and all, but until we get our flying cars, we’ll just have to console ourselves with the ability to “Enhance!” photographs. While the new service, AI Image Enlarger, can’t tease out three-dimensional information, the app is intended to sharpen enlargements of low-resolution images, improving the focus and bringing up details in the darker parts of the image. The marketing material claims that the app uses machine learning, and is looking for volunteers to upload high-resolution images to improve its training set.
We’ve been on a bit of a nano-satellite bender around here lately, with last week’s Hack Chat discussing simulators for CubeSats, and next week’s focusing on open-source thrusters for PocketQube satellites. So we appreciated the timing of a video announcing the launch of the first public LoRa relay satellite. The PocketCube-format satellite, dubbed FossaSat-1, went for a ride to space along with six other small payloads on a Rocket Lab Electron rocket launched from New Zealand. Andreas Spiess has a short video preview of the FossaSat-1 mission, which was designed to test the capabilities of a space-based IoT link that almost anyone can access with cheap and readily available parts; a ground station should only cost a couple of bucks, but you will need an amateur radio license to uplink.
We know GitHub has become the de facto standard for source control and has morphed into a collaboration and project management platform used by everybody who’s anybody in the hacking community. But have you ever wished for a collaboration platform that was a little more in tune with the needs of hardware designers? Then InventHub might be of interest to you. Currently in a limited beta – we tried to sign up for the early access program but seem to have been put on a waiting list – it seems like this will be a platform that brings versioning directly to the ECAD package of your choice. Through plugins to KiCad, Eagle, and all the major ECAD players you’ll be able to collaborate with other designers and see their changes marked up on the schematic — sort of a visual diff. It seems interesting, and we’ll be keeping an eye on developments.
Amazon is now offering a stripped-down version of their Echo smart speaker called Input, which teams up with speakers that you already own to satisfy all your privacy invasion needs on the super cheap — only $10. At that price, it’s hard to resist buying one just to pop it open, which is what Brian Dorey did with his. The teardown is pretty standard, and the innards are pretty much what you’d expect from a modern piece of surveillance apparatus, but the neat trick here involved the flash memory chip on the main board. Brian accidentally overheated it while trying to free up the metal shield over it, and the BGA chip came loose. So naturally, he looked up the pinout and soldered it to a micro-SD card adapter with fine magnet wire. He was able to slip it into a USB SD card reader and see the whole file system for the Input. It was a nice hack, and a good teardown.