There’s little debate that Amazon’s Alexa ecosystem makes it easy to add voice control to your smart home, but not everyone is thrilled with how it works. The fact that all of your commands are bounced off of Amazon’s servers instead of staying internal to the network is an absolute no-go for the more privacy-minded among us, and honestly, it’s hard to blame them. The whole thing is pretty creepy when you think about it.
Which is precisely why [André Hentschel] decided to look into replacing the firmware on his Amazon Echo with an open source alternative. The Linux-powered first generation Echo had been rooted years before thanks to the diagnostic port on the bottom of the device, and there were even a few firmware images floating around out there that he could poke around in. In theory, all he had to do was remove anything that called back to the Amazon servers and replace the proprietary bits with comparable free software libraries and tools.
Of course, it ended up being a little trickier than that. The original Echo is running on a 2.6.x series Linux kernel, which, even for a device released in 2014, is painfully outdated. With its similarly archaic version of glibc, newer Linux software would refuse to run. [André] found that building an up-to-date filesystem image for the Echo wasn’t a problem, but getting the niche device’s hardware working on a more modern kernel was another story.
He eventually got the microphone array working, but not the onboard digital signal processor (DSP). Without the DSP, the age of the Echo’s hardware really started to show, and it was clear the seven year old smart speaker would need some help to get the job done.
The solution [André] came up with is not unlike how the device worked originally: the Echo performs wake word detection locally, but then offloads the actual speech processing to a more powerful computer. Except in this case, the other computer is on the same network and not hidden away in Amazon’s cloud. The Porcupine project provides the wake word detection, speech samples are broken down into actionable intents with voice2json, and the responses are delivered by the venerable eSpeak speech synthesizer.
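To make the handoff concrete, here’s a minimal sketch of the “other computer” side of such a pipeline. This is not code from the project; the intent name `ChangeLightState` and its slots are hypothetical examples, though the JSON shape (an `intent` object with a `name`, plus a `slots` map) follows the line-delimited output voice2json’s intent recognition produces:

```python
import json
import subprocess

def parse_intent(voice2json_line):
    """Pull the intent name and slots out of one line of
    voice2json-style JSON output."""
    event = json.loads(voice2json_line)
    return event.get("intent", {}).get("name", ""), event.get("slots", {})

def speak(text):
    # Hand the reply text to eSpeak for synthesis on the speaker.
    # (Never called in this demo, so machines without espeak are fine.)
    subprocess.run(["espeak", text], check=False)

def handle_transcription(line):
    """Map a recognized intent to a spoken reply. The intent name
    and slot keys here are made-up examples, not from the project."""
    name, slots = parse_intent(line)
    if name == "ChangeLightState":
        return "Turning the %s %s" % (
            slots.get("name", "light"), slots.get("state", "on"))
    return "Sorry, I didn't catch that"

# What voice2json might emit for "turn on the living room lamp"
sample = ('{"intent": {"name": "ChangeLightState"},'
          ' "slots": {"name": "living room lamp", "state": "on"}}')
print(handle_transcription(sample))  # → Turning the living room lamp on
```

The key design point is the same as the stock Echo’s: the resource-starved speaker only listens for the wake word, while the heavy speech-to-intent lifting happens on a box that can afford it, just one that lives on your own LAN.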
As you can see in the video below, the overall experience is pretty similar to stock, complete with fancy LED ring action. In fact, since Porcupine allows for multiple wake words, you could even argue that the usability has been improved. While [André] says adding support for Mycroft would be a logical expansion, his immediate goal is to get everything documented and available on the project’s GitLab repository so others can start experimenting for themselves.
Mad respect and I can’t wait to try this with my Echo.
+1.314159…
I wonder if this can be done with more recent Echos… As it looks like those MediaTek ones may have locked boot loaders.
As for the DSP, isn’t it part of the GPL kernel dump?
There’s apparently a trick to force the Mediatek chips into bootloader mode by shorting a pin to ground.
If this can be done? Also, you know you can set your keyboard to Dutch AND English on Android, right?
Finally working toward a workable product. Nice!
As it is said in the article, the Echo as-is ‘is a no-go’ for me and probably a lot of security/privacy conscious people. You should be able to run your status/control ‘network’ without the Internet, i.e. isolate the home/building and still have voice and manual control.
+1 on rclark’s comment.
While it is not stellar in its response time (yet), it is really a great start!! Congratulations. I will be most intrigued to see where this goes, and eager to replace my Echo firmware with a more secure local system.
Yet there is always Mycroft AI.
Good job reading until the end.
Haha! That wasn’t very nice.
Does it work on the echo silver by SNL? My kids have been looking to get one for me.
ALEXIS!!
I now regret buying a GEN3 device. I would love to have Alexa’s hardware and skills, but have the freedom to use Google’s speech to text platform, use it to see apps built on Stanford’s Almond, and have the freedom to have the responses directly compared to Alexa’s. I’m probably wrong, but I would like to think that it would be a win win to see advanced commercial hardware operate through an academic environment.
I’m not pro/con toward privacy in the home (I have 3 indoor cameras generally live most of the time and drapes are almost always open) but found this an interesting project. I suspect subsequent versions will greatly improve the response time. I’ll keep my current Gen 4 stuff, it does “most” of what I want. Likely, my next change to the system will be upgrading the router… I have about 40 IP addresses used by the WiFi/hardwired devices (very small compared to some home automation enthusiasts) and I’m beginning to note a small lag. I’m slowly offloading WiFi devices to Z-Wave/Zigbee.
You may well come to regret the move to Zigbee/Z-Wave. It is a mess, rife with incompatibility and basic flaws such as the need to re-register all endpoints if you switch controllers – which in turn requires physical access to each node. Got badly burnt, never again – BLE or WiFi and high quality access points is the way forward IMO.
It needs a better skill for telling whether a man or a woman is talking to it.
While this article seems to be geared toward the Echo, has anyone had success with the Echo Dots? I am trying to get a feel before I take an old one apart and start hacking on it. Any comments will be helpful!
Here’s some reverse engineering work:
https://darkspirit510.de/2019/10/alexa-amazon-free/
So, if all this does is detect the wake word and hand off processing to another device, is it even as good as a much-easier-to-deal-with ReSpeaker?
https://www.seeedstudio.com/category/Speech-Recognition-c-44.html
Need to do this for the recently lobotomized Harman Kardon Invoke Cortana speakers. It’d only help the 9 of us that bought them, though.