Homebrewed Voice Assistant Keeps An Eye On Air Quality

Voice assistants are now available from a wide variety of companies, however, [7402] didn’t like the idea of these devices sending data off to the cloud for potentially-nefarious purposes. Thus, the goal became to build a home voice assistant that worked entirely offline, and that’s precisely what [7402] achieved.

The system had limited goals compared to commercial competitors. [7402] was more than happy to deal with a limited vocabulary of understanding as a trade off for privacy. It’s all built around a Raspberry Pi Zero, which runs the Julius speech recognition library. Ultrasonic sensors are used to only activate the device when a person leans in and directly addresses the system.

Capabilities include reporting on the weather, switching light on and off, and advising users of air quality readings from the local authorities.  Feedback to the user is via text-to-speech as well as flashing LEDs. The latter are used to create a quirky, retro “thinking” animation to indicate the system is processing, and has indeed heard a spoken command.

It’s a neat build, and one that covers most of the good things that commercial cloud devices are capable of anyway. As a bonus, no smartphone apps are required, nor will private companies impact the system’s functionality as it relies on no external servers to operate.

We’ve seen similar builds before too, such as this GlaDOS-themed voice assistant. Video after the break.

Continue reading “Homebrewed Voice Assistant Keeps An Eye On Air Quality”

Amazon Echo Gets Open Source Brain Transplant

There’s little debate that Amazon’s Alexa ecosystem makes it easy to add voice control to your smart home, but not everyone is thrilled with how it works. The fact that all of your commands are bounced off of Amazon’s servers instead of staying internal to the network is an absolute no-go for the more privacy minded among us, and honestly, it’s hard to blame them. The whole thing is pretty creepy when you think about it.

Which is precisely why [André Hentschel] decided to look into replacing the firmware on his Amazon Echo with an open source alternative. The Linux-powered first generation Echo had been rooted years before thanks to the diagnostic port on the bottom of the device, and there were even a few firmware images floating around out there that he could poke around in. In theory, all he had to do was remove anything that called back to the Amazon servers and replace the proprietary bits with comparable free software libraries and tools.

Taping into the Echo’s debug port.

Of course, it ended up being a little trickier than that. The original Echo is running on a 2.6.x series Linux kernel, which even for a device released in 2014, is painfully outdated. With its similarly archaic version of glibc, newer Linux software would refuse to run. [André] found that building an up-to-date filesystem image for the Echo wasn’t a problem, but getting the niche device’s hardware working on a more modern kernel was another story.

He eventually got the microphone array working, but not the onboard digital signal processor (DSP). Without the DSP, the age of the Echo’s hardware really started to show, and it was clear the seven year old smart speaker would need some help to get the job done.

The solution [André] came up with is not unlike how the device worked originally: the Echo performs wake word detection locally, but then offloads the actual speech processing to a more powerful computer. Except in this case, the other computer is on the same network and not hidden away in Amazon’s cloud. The Porcupine project provides the wake word detection, speech samples are broken down into actionable intents with voice2json, and the responses are delivered by the venerable eSpeak speech synthesizer.

As you can see in the video below the overall experience is pretty similar to stock, complete with fancy LED ring action. In fact, since Porcupine allows for multiple wake words, you could even argue that the usability has been improved. While [André] says adding support for Mycroft would be a logical expansion, his immediate goal is to get everything documented and available on the project’s GitLab repository so others can start experimenting for themselves.

Continue reading “Amazon Echo Gets Open Source Brain Transplant”

Kinect Gave Us A Preview Of The Future, Though Not The One It Intended

This holiday season, the video game industry hype machine is focused on building excitement for new PlayStation and Xbox consoles. Ten years ago, a similar chorus of hype reached a crescendo with the release of Xbox Kinect, promising to revolutionize how we play. That vision never panned out, but as [Daniel Cooper] of Engadget pointed out in a Kinect retrospective, it premiered consumer technologies that impacted fields far beyond gaming.

Kinect has since withdrawn from the gaming market, because as it turns out gamers are quite content with handheld controllers. This year’s new controllers for a PlayStation or Xbox would be immediately familiar to gamers from ten years ago. Even Nintendo, whose Wii is frequently credited as motivation for Microsoft to develop the Kinect, have arguably taken a step back with Joy-cons of their Switch.

But the Kinect’s success at bringing a depth camera to consumer price levels paved the way to explore many ideas that were previously impossible. The flurry of enthusiastic Kinect hacking proved there is a market for depth camera peripherals, leading to plug-and-play devices like Intel RealSense to make depth-sensing projects easier. The original PrimeSense technology has since been simplified and miniaturized into Face ID unlocking Apple phones. Kinect itself found another job with Microsoft’s HoloLens AR headset. And let’s not forget the upcoming wave of autonomous cars and drones, many of which will see their worlds via depth sensors of some kind. Some might even be equipped with the latest sensor to wear the Kinect name.

Inside the Kinect was also one of the earliest microphone arrays sold to consumers. Enabling the Kinect to figure out which direction a voice is coming from, and isolate it from other noises in the room. Such technology were previously the exclusive domain of expensive corporate conference room speakerphones, but now it forms the core of inexpensive home assistants like an Amazon Echo Dot. Raising the bar so much that hacks needed many more microphones just to stand out.

With the technology available more easily elsewhere, attrition of a discontinued device is reflected in the dwindling number of recent Kinect hacks on these pages. We still see a cool project every now and then, though. As the classic sensor bar itself recedes into history, others will take its place to give us depth sensing and smart audio. But for many of us, Kinect was the ambitious videogame peripheral that gave us our first experience.

Robots Can Finally Answer, Are You Talking To Me?

Voice Assistants, love them, or hate them, are becoming more and more commonplace. One problem for voice assistants is the situation of multiple devices listening in the same place. When a command is given, which device should answer? Researchers at CMU’s Future Interfaces Group [Karan Ahuja], [Andy Kong], [Mayank Goel], and [Chris Harrison] have an answer; smart assistants should try to infer if the user is facing the device they want to talk to. They call it direction-of-voice or DoV.

Currently, smart assistants use a simple race to see who heard it first. The reasoning is that the device you are closest to will likely hear it first. However, in situations with echos or when you’re equidistant from multiple devices, the outcome can seem arbitrary to a user.

The implementation of DoV uses an Extra-Trees Classifier from the python sklearn toolkit. Several other machine learning algorithms were considered, but ultimately efficiency won out and Extra-Trees was selected. Another interesting facet of the research was determining what facing really means. The team had humans ‘listeners’ stand in for smart assistants.  A ‘talker’ would speak the key phrase while the ‘listener’ determined if the talker was facing them or not. Based on their definition of facing, the system can determine if someone is facing the device with 90% accuracy that rises to 93% with per-room calibration.

Their algorithm as well as the data they collected has been open-sourced on GitHub. Perhaps when you’re building your own voice assistant, you can incorporate DoV to improve wake-word accuracy.

Continue reading “Robots Can Finally Answer, Are You Talking To Me?”

Hackaday Prize China Finalists Announced

In the time since the Hackaday Prize was first run it has nurtured an astonishing array of projects from around the world, and brought to the fore some truly exceptional winners that have demonstrated world-changing possibilities. This year it has been extended to a new frontier with the launch of the Hackaday Prize China (Chinese language, here’s a Google Translate link), allowing engineers, makers, and inventors from that country to join the fun. We’re pleased to announce the finalists, from which a winner will be announced in Shenzhen, China on November 23rd. If you’re in Shenzen area, you’re invited to attend the award ceremony!

All six of these final project entries have been translated into English to help share information about projects across the language barrier. On the left sidebar of each project page you can find a link back to the original Chinese language project entry. Each presents a fascinating look into what people in our global community can produce when they live at the source of the component supply chain. Among them are a healthy cross-section of projects which we’ll visit in no particular order. Let’s dig in and see what these are all about!

Continue reading “Hackaday Prize China Finalists Announced”

Taking A Peek Inside Amazon’s Latest Dot

Like a million or so other people, [Brian Dorey] picked up a third generation Echo Dot during Amazon’s big sale a couple weeks ago. Going for less than half its normal retail price, he figured it was the perfect time to explore Amazon’s voice assistant offerings. But the low price also meant that he didn’t feel so bad tearing into the thing for our viewing pleasure.

By pretty much all accounts, the Echo Dot line has been a pretty solid performer as far as corporate subsidized home espionage devices go. They’re small, fairly cheap, and offer the baseline functionality that most people expect. While there was nothing precisely wrong with the earlier versions of the Dot, Amazon has used this latest revision of the device to give the gadget a more “premium” look and feel. They’ve also tried to squeeze a bit better audio out of the roughly hockey puck sized device. But of course, some undocumented changes managed to sneak in there as well.

For one thing, the latest version of the Dot deletes the USB port. Hackers had used the USB port on earlier versions of the hardware to try and gain access to the Android (or at least, Amazon’s flavor of Android) operating system hiding inside, so that’s an unfortunate development. On the flip side, [Brian] reports there’s some type of debug header on the bottom of the device. A similar feature allowed hackers to gain access to some of Amazon’s other voice assistants, so we’d recommend hopeful optimism until told otherwise.

The Echo Dot is powered by a quad-core Mediatek MT8516BAAA 64-bit ARM Cortex-A35 processor and the OS lives on an 8GB Samsung KMFN60012M-B214 eMMC. A pair of Texas Instruments LV320ADC3101 ADCs are used to process the incoming audio from the four microphones arranged around the edge of the PCB, and [Brian] says there appears to be a Fairchild 74LCX74 flip-flop in place to cut the audio feed when the user wants a bit of privacy.

Of course, the biggest change is on the outside. The new Dot is much larger than the previous versions, which means all the awesome enclosures we’ve seen for its predecessor will need to be reworked if they want to be compatible with Amazon’s latest and greatest.

Chatterbox Voice Assistant Knows To Keep Quiet For Privacy

Cruising through the children’s hands-on activity zone at Maker Faire Bay Area, we see kids building a cardboard enclosure for the Chatterbox smart speaker kit. It would be tempting to dismiss the little smiling box as “just for kids” but doing so would overlook something more interesting: an alternative to data-mining corporations who dominate the smart speaker market. People are rightly concerned about Amazon Echo and Google Home, always-listening devices for online retail sending data back to their corporate data centers. In order to be appropriate for children, Chatterbox is none of those things. It only listens when a button is pressed, and its online model is designed to support the mission of CCFC (Campaign for a Commercial-Free Childhood.)

Getting started with a Chatterbox is much like other products designed to encourage young makers. The hardware — Raspberry Pi, custom HAT, speaker and button inside a cardboard enclosure — is conceptually similar to a Google AIY Voice kit but paired with an entirely different software experience. Instead of signing in to a Google developer account, children create their own voice interaction behavior with a block-based programming environment resembling MIT Scratch. Moving online, Chatterbox interactions draw upon resources of similarly privacy-minded entities like DuckDuckGo web search. Voice interaction foundation is built upon a fork of Mycroft with changes focused on education and child-friendliness. If a Chatterbox is unsure whether a query was for “Moana” or “Marijuana”, it will decide in favor of the Disney movie.

Many of these privacy-conscious pieces are open source or freely available, but Chatterbox pulls them all together into a single package that’s an appealing alternative to the big brand options. Based on conversations during Hackaday’s Maker Faire meetup, there’s a market beyond parents of young children. From technically aware adults who lack web API coding skills, to senior citizens unaware of dark corners of the web. Chatterbox Kickstarter campaign has a few more weeks to run but has already reached funding goals. We look forward to having a privacy-minded option in voice assistants.