This holiday season, the video game industry hype machine is focused on building excitement for new PlayStation and Xbox consoles. Ten years ago, a similar chorus of hype reached a crescendo with the release of Xbox Kinect, promising to revolutionize how we play. That vision never panned out, but as [Daniel Cooper] of Engadget pointed out in a Kinect retrospective, it premiered consumer technologies that impacted fields far beyond gaming.
Kinect has since withdrawn from the gaming market because, as it turns out, gamers are quite content with handheld controllers. This year’s new controllers for a PlayStation or Xbox would be immediately familiar to gamers from ten years ago. Even Nintendo, whose Wii is frequently credited as motivation for Microsoft to develop the Kinect, has arguably taken a step back with the Joy-Cons of its Switch.
But the Kinect’s success at bringing a depth camera to consumer price levels paved the way to explore many ideas that were previously impossible. The flurry of enthusiastic Kinect hacking proved there is a market for depth camera peripherals, leading to plug-and-play devices like Intel RealSense to make depth-sensing projects easier. The original PrimeSense technology has since been simplified and miniaturized into Face ID unlocking Apple phones. Kinect itself found another job with Microsoft’s HoloLens AR headset. And let’s not forget the upcoming wave of autonomous cars and drones, many of which will see their worlds via depth sensors of some kind. Some might even be equipped with the latest sensor to wear the Kinect name.
Inside the Kinect was also one of the earliest microphone arrays sold to consumers, enabling the Kinect to figure out which direction a voice is coming from and isolate it from other noises in the room. Such technology was previously the exclusive domain of expensive corporate conference room speakerphones, but now it forms the core of inexpensive home assistants like an Amazon Echo Dot, raising the bar so much that hacks needed many more microphones just to stand out.
With the technology available more easily elsewhere, attrition of a discontinued device is reflected in the dwindling number of recent Kinect hacks on these pages. We still see a cool project every now and then, though. As the classic sensor bar itself recedes into history, others will take its place to give us depth sensing and smart audio. But for many of us, Kinect was the ambitious videogame peripheral that gave us our first experience.
Voice assistants, love them or hate them, are becoming more and more commonplace. One problem for voice assistants is the situation of multiple devices listening in the same place. When a command is given, which device should answer? Researchers at CMU’s Future Interfaces Group, [Karan Ahuja], [Andy Kong], [Mayank Goel], and [Chris Harrison], have an answer: smart assistants should try to infer if the user is facing the device they want to talk to. They call it direction-of-voice or DoV.
Currently, smart assistants use a simple race to see who heard it first. The reasoning is that the device you are closest to will likely hear it first. However, in situations with echoes or when you’re equidistant from multiple devices, the outcome can seem arbitrary to a user.
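To make the race concrete, here is a minimal sketch, assuming each device reports a timestamp for when it detected the wake word. The `Detection` class, the `pick_responder` function, and the device names are hypothetical, made up for illustration; they are not part of any real assistant's API.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    device: str      # hypothetical device name
    heard_at: float  # time (in seconds) at which the wake word was detected


def pick_responder(detections):
    """Return the device that heard the wake word first (the 'race')."""
    if not detections:
        return None
    return min(detections, key=lambda d: d.heard_at).device


# Two devices at nearly the same distance from the talker:
# the winner is decided by a tenth of a millisecond.
detections = [
    Detection("kitchen", 0.0132),
    Detection("living_room", 0.0131),
]
winner = pick_responder(detections)
print(winner)
```

With arrival times this close, a small echo or timing jitter could flip the result, which is why the outcome can feel arbitrary.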
The implementation of DoV uses an Extra-Trees Classifier from the Python sklearn toolkit. Several other machine learning algorithms were considered, but ultimately efficiency won out and Extra-Trees was selected. Another interesting facet of the research was determining what facing really means. The team had human ‘listeners’ stand in for smart assistants. A ‘talker’ would speak the key phrase while the ‘listener’ determined if the talker was facing them or not. Based on their definition of facing, the system can determine if someone is facing the device with 90% accuracy, which rises to 93% with per-room calibration.
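For a feel of what an Extra-Trees classifier looks like in sklearn, here is a minimal sketch. It is not the researchers' code: the two input features are synthetic stand-ins generated at random, and the labels (1 for facing, 0 for not facing) are assigned by construction, so the accuracy it prints says nothing about real DoV performance.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in features (NOT the features used in the research):
# two well-separated clusters, one per class.
facing = rng.normal(loc=[1.0, 1.0], scale=0.5, size=(n, 2))
not_facing = rng.normal(loc=[-1.0, -1.0], scale=0.5, size=(n, 2))

X = np.vstack([facing, not_facing])
y = np.array([1] * n + [0] * n)  # 1 = facing, 0 = not facing

# Hold out a quarter of the samples to check the fitted model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

clf = ExtraTreesClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

In a real system, the rows of `X` would be audio features extracted from each utterance, and the labels would come from ground-truth recordings like the ones the team collected.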
Their algorithm as well as the data they collected has been open-sourced on GitHub. Perhaps when you’re building your own voice assistant, you can incorporate DoV to improve wake-word accuracy.
Continue reading “Robots Can Finally Answer, Are You Talking To Me?”
In the time since the Hackaday Prize was first run it has nurtured an astonishing array of projects from around the world, and brought to the fore some truly exceptional winners that have demonstrated world-changing possibilities. This year it has been extended to a new frontier with the launch of the Hackaday Prize China (Chinese language, here’s a Google Translate link), allowing engineers, makers, and inventors from that country to join the fun. We’re pleased to announce the finalists, from which a winner will be announced in Shenzhen, China on November 23rd. If you’re in the Shenzhen area, you’re invited to attend the award ceremony!
All six of these final project entries have been translated into English to help share information about projects across the language barrier. On the left sidebar of each project page you can find a link back to the original Chinese language project entry. Each presents a fascinating look into what people in our global community can produce when they live at the source of the component supply chain. Among them is a healthy cross-section of projects which we’ll visit in no particular order. Let’s dig in and see what these are all about!
Continue reading “Hackaday Prize China Finalists Announced”
Like a million or so other people, [Brian Dorey] picked up a third generation Echo Dot during Amazon’s big sale a couple weeks ago. Going for less than half its normal retail price, he figured it was the perfect time to explore Amazon’s voice assistant offerings. But the low price also meant that he didn’t feel so bad tearing into the thing for our viewing pleasure.
By pretty much all accounts, the Echo Dot line has been a pretty solid performer as far as corporate subsidized home espionage devices go. They’re small, fairly cheap, and offer the baseline functionality that most people expect. While there was nothing precisely wrong with the earlier versions of the Dot, Amazon has used this latest revision of the device to give the gadget a more “premium” look and feel. They’ve also tried to squeeze a bit better audio out of the roughly hockey puck sized device. But of course, some undocumented changes managed to sneak in there as well.
For one thing, the latest version of the Dot deletes the USB port. Hackers had used the USB port on earlier versions of the hardware to try and gain access to the Android (or at least, Amazon’s flavor of Android) operating system hiding inside, so that’s an unfortunate development. On the flip side, [Brian] reports there’s some type of debug header on the bottom of the device. A similar feature allowed hackers to gain access to some of Amazon’s other voice assistants, so we’d recommend hopeful optimism until told otherwise.
The Echo Dot is powered by a quad-core Mediatek MT8516BAAA 64-bit ARM Cortex-A35 processor and the OS lives on an 8GB Samsung KMFN60012M-B214 eMMC. A pair of Texas Instruments TLV320ADC3101 ADCs are used to process the incoming audio from the four microphones arranged around the edge of the PCB, and [Brian] says there appears to be a Fairchild 74LCX74 flip-flop in place to cut the audio feed when the user wants a bit of privacy.
Of course, the biggest change is on the outside. The new Dot is much larger than the previous versions, which means all the awesome enclosures we’ve seen for its predecessor will need to be reworked if they want to be compatible with Amazon’s latest and greatest.
Cruising through the children’s hands-on activity zone at Maker Faire Bay Area, we see kids building a cardboard enclosure for the Chatterbox smart speaker kit. It would be tempting to dismiss the little smiling box as “just for kids” but doing so would overlook something more interesting: an alternative to data-mining corporations who dominate the smart speaker market. People are rightly concerned about Amazon Echo and Google Home, always-listening devices for online retail sending data back to their corporate data centers. In order to be appropriate for children, Chatterbox is none of those things. It only listens when a button is pressed, and its online model is designed to support the mission of CCFC (Campaign for a Commercial-Free Childhood.)
Getting started with a Chatterbox is much like other products designed to encourage young makers. The hardware — Raspberry Pi, custom HAT, speaker and button inside a cardboard enclosure — is conceptually similar to a Google AIY Voice kit but paired with an entirely different software experience. Instead of signing in to a Google developer account, children create their own voice interaction behavior with a block-based programming environment resembling MIT Scratch. Moving online, Chatterbox interactions draw upon resources of similarly privacy-minded entities like DuckDuckGo web search. Voice interaction foundation is built upon a fork of Mycroft with changes focused on education and child-friendliness. If a Chatterbox is unsure whether a query was for “Moana” or “Marijuana”, it will decide in favor of the Disney movie.
Many of these privacy-conscious pieces are open source or freely available, but Chatterbox pulls them all together into a single package that’s an appealing alternative to the big brand options. Based on conversations during Hackaday’s Maker Faire meetup, there’s a market beyond parents of young children, from technically aware adults who lack web API coding skills to senior citizens unaware of dark corners of the web. The Chatterbox Kickstarter campaign has a few more weeks to run but has already reached funding goals. We look forward to having a privacy-minded option in voice assistants.
The future of consumer electronics is electronic voice assistants, or at least that’s what the manufacturers are telling us. Everything from Alexas to Google Homes to Siris is invading our lives, and if predictions hold, your next new car might just have a voice assistant in it. It’s just a good thing we have enough samples of Majel Barrett’s voice for a quality virtual assistant.
For this week’s Hack Chat, we’re going to be talking all about voice interfaces. There are hundreds of Alexa and Google Home hacks around, but this is just the tip of the iceberg. What else can we do with these neat pieces of computer hardware, and how do we get it to do that?
Our guest for this week’s Hack Chat will be Nadine Lessio, a designer and technologist out of Toronto with a background in visual design and DIY peripherals. Nadine holds an MDes from OCADU where she spent her time investigating the Internet of Things through personal assistants. Currently, she’s working at OCADU’s Adaptive Context Environments Lab where she’s researching how humans and devices work together.
During this Hack Chat, Nadine will be talking about voice assistants and answering questions like:
- What languages can be used to program voice assistants?
- How do you use voice and hardware together?
- What goes into the UX of a voice assistant?
- How do these assistants interface with microcontrollers, Pis, and other electronics platforms?
You are, of course, encouraged to add your own questions to the discussion. You can do that by leaving a comment on the Hack Chat Event Page and we’ll put that in the queue for the Hack Chat discussion.
Our Hack Chats are live community events on the Hackaday.io Hack Chat group messaging. This week is just like any other, and we’ll be gathering ’round our video terminals at noon, Pacific, on Friday, July 13th. Need a countdown timer? Yes you do.
Click that speech bubble to the right, and you’ll be taken directly to the Hack Chat group on Hackaday.io.
You don’t have to wait until Friday; join whenever you want and you can see what the community is talking about.
I don’t think we’ll call virtual assistants done until we can say, “Make me a sandwich” (without adding “sudo”) and have a sandwich made and delivered to us while sitting in front of our televisions. However, they are not completely without use as they are currently – they can let you know the time, weather and traffic, schedule or remind you of meetings and they can also be used to order things from Amazon. [Pat AI] was interested in building an open source, extensible, virtual assistant, so he built P-Brain.
Think of P-Brain as the base for a more complex virtual assistant. It is designed from the beginning to have more skills added on in order to grow its complexity, that is, the number of things it can do. P-Brain is written in Node.js; using a Node package called Natural, it parses your request and matches it to a ‘skill.’ At the moment, P-Brain can get the time, date and weather, get facts from the internet, find and play music, and flip a virtual coin for you. Currently, P-Brain only runs in Chrome, but [Pat AI] has plans to remove that as a dependency. After the break, [Pat AI] goes into some detail about P-Brain and shows off its capabilities. In an upcoming video, [Pat AI]’s going to go over more details about how to add new skills.
Continue reading “Build Your Own P-Brain”
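The idea of matching a request to a skill can be sketched in a few lines. This is not P-Brain's code and it does not use the Natural package: it is a Python stand-in that scores word overlap (Jaccard similarity) between the request and a few example phrases per skill. The `SKILLS` table, the example phrases, and the function names are all hypothetical.

```python
import re

# Hypothetical skills, each with a few example phrases that should trigger it.
SKILLS = {
    "time": ["what time is it", "tell me the time"],
    "weather": ["what is the weather", "is it going to rain today"],
    "coin": ["flip a coin", "heads or tails"],
}


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def match_skill(request):
    """Return the skill whose example phrase best overlaps the request, or None."""
    words = tokenize(request)
    best_skill, best_score = None, 0.0
    for skill, examples in SKILLS.items():
        for example in examples:
            example_words = tokenize(example)
            union = words | example_words
            # Jaccard similarity: shared words divided by all distinct words.
            score = len(words & example_words) / len(union) if union else 0.0
            if score > best_score:
                best_skill, best_score = skill, score
    return best_skill


print(match_skill("Please flip a coin for me"))
```

Adding a skill in this sketch means adding one more entry to the table, which mirrors the extensibility described above. A real matcher would also need a minimum score so that weak matches are rejected rather than dispatched.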