Hackaday Prize 2023: A DIY Voice-Control Module

If science fiction taught us anything, it’s that voice control was going to be the human-machine interface of the future. [Dennis] has now whipped up a tutorial that lets you add a voice control module to any of your own projects.

The voice control module uses a Raspberry Pi 4 as the brains of the operation, paired with a Seeed Studio ReSpeaker 4-microphone array. The Pi provides a good amount of processing power to crunch through the audio, while the mic array captures high-quality audio from any direction, which is key to reliable performance. Rhasspy is used as the software element, which is responsible for processing audio in a variety of languages to determine what the user is asking for. Based on the voice commands received, Rhasspy can then run just about anything you could possibly require, from sending MQTT smart home commands to running external programs.
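As a rough illustration of that last step, here is a minimal sketch of turning a recognised intent into an MQTT smart home command. It is not taken from [Dennis]'s tutorial: the message shape is modelled loosely on the Hermes-style JSON that Rhasspy can publish and should be checked against the Rhasspy documentation. The intent name `ChangeLightState`, the slot names and the `home/<name>/set` topic are all hypothetical. Actually publishing the result would need an MQTT client, which is left out so the sketch runs on its own.

```python
import json

# Hypothetical mapping from intent names to MQTT topic templates (an assumption,
# not something defined by the tutorial or by Rhasspy itself).
INTENT_TOPICS = {
    "ChangeLightState": "home/{name}/set",
}


def intent_to_command(payload: str):
    """Turn a Hermes-style intent message into an (MQTT topic, payload) pair.

    Returns None when the intent is not one we know how to handle.
    """
    message = json.loads(payload)
    intent_name = message["intent"]["intentName"]
    template = INTENT_TOPICS.get(intent_name)
    if template is None:
        return None
    # Flatten the slot list into a simple name -> value dictionary.
    slots = {s["slotName"]: s["value"]["value"] for s in message.get("slots", [])}
    topic = template.format(name=slots.get("name", "unknown"))
    return topic, slots.get("state", "").upper()


# Assumed example message for the spoken command "turn on the lamp".
example = json.dumps({
    "input": "turn on the lamp",
    "intent": {"intentName": "ChangeLightState", "confidenceScore": 1.0},
    "slots": [
        {"slotName": "name", "value": {"kind": "Unknown", "value": "lamp"}},
        {"slotName": "state", "value": {"kind": "Unknown", "value": "on"}},
    ],
})
print(intent_to_command(example))
```

Keeping the translation in a pure function like this makes it easy to test the voice commands without a broker or a microphone attached.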

If you’ve always dreamed of whipping up your own version of Jarvis from Iron Man, or you just want a non-cloud solution to turn your lights on and off, [Dennis’s] tutorial is a great place to start. Video after the break.

A small speaker with an LCD showing chatbot responses

AI-Powered Speaker Is A Chatbot You Can Actually Chat With

AI-powered chatbots are pretty cool, but most still require you to type your question on a keyboard and read an answer from a screen. It doesn’t have to be like that, of course: with a few standard tools, you can turn a chatbot into a machine that literally chats, as [Hoani Bryson] did. He decided to make a standalone voice-operated ChatGPT client that you can actually sit next to and have a conversation with.

The base of the project is a USB speaker, to which [Hoani] added a Raspberry Pi, a Teensy, a two-line LCD and a big red button. When you press the button, the Pi listens to your speech and converts it to text using the OpenAI voice transcription feature. It then sends the resulting text to ChatGPT through its API and waits for its response, which it turns into sound again through the eSpeak speech synthesizer. The LCD, driven by the Teensy, shows the current status of the machine and also provides live subtitles while the machine is talking.
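The flow described above can be sketched as a simple chain of stages. This is a Python illustration under stated assumptions, not [Hoani]'s code (which is mostly Go): the function name `run_turn` and the five stage callables are hypothetical, and the stand-in lambdas replace the real microphone, transcription service, ChatGPT API call, eSpeak and LCD so the sketch runs without hardware or network access.

```python
from typing import Callable


def run_turn(
    record: Callable[[], bytes],
    transcribe: Callable[[bytes], str],
    ask: Callable[[str], str],
    speak: Callable[[str], None],
    show: Callable[[str], None],
) -> str:
    """Run one button press worth of conversation and return the reply."""
    show("Listening")
    audio = record()
    show("Transcribing")
    question = transcribe(audio)
    show("Thinking")
    answer = ask(question)
    show(answer)   # subtitles on the LCD while the reply is spoken
    speak(answer)
    return answer


# Stand-in stages (assumptions) so the pipeline can be exercised on any machine.
log = []
reply = run_turn(
    record=lambda: b"\x00\x01",
    transcribe=lambda audio: "what is a teensy",
    ask=lambda question: "A small microcontroller board.",
    speak=lambda text: log.append(("speak", text)),
    show=lambda text: log.append(("lcd", text)),
)
print(reply)
```

Passing the stages in as callables keeps the control flow separate from the services behind it, so any one of them can be swapped without touching the rest.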

To spice up the AI box’s appearance, [Hoani] also added an LED ring which shows a spectrogram of the audio being generated. This small addition really makes the thing come alive, turning it into what looks like a classic Sci-Fi movie prop. Except that this one’s real, of course – we are actually living in the future, with human-like AI all around us.
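For a feel of how audio can end up on a ring of LEDs, here is a minimal sketch that maps one frame of samples to per-LED brightness values. The source does not say how the ring is actually driven, so everything here is an assumption: the function name `ring_levels`, the choice of one frequency band per LED, and the use of a naive DFT in place of a proper FFT library.

```python
import cmath
import math


def ring_levels(samples, num_leds, max_level=255):
    """Map the magnitude spectrum of one audio frame onto num_leds brightness values."""
    n = len(samples)
    half = n // 2
    # Naive DFT magnitudes for the positive-frequency bins (skipping DC).
    mags = []
    for k in range(1, half + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    # Group neighbouring bins into one band per LED and keep the peak of each band.
    per_led = len(mags) / num_leds
    bands = []
    for i in range(num_leds):
        start = int(i * per_led)
        end = max(start + 1, int((i + 1) * per_led))
        bands.append(max(mags[start:end]))
    # Normalise so the loudest band is fully lit; avoid dividing by zero on silence.
    peak = max(bands) or 1.0
    return [round(max_level * b / peak) for b in bands]


# A pure tone with 10 cycles per 64-sample frame should light a single LED.
frame = [math.sin(2 * math.pi * 10 * t / 64) for t in range(64)]
levels = ring_levels(frame, 8)
print(levels)
```

A real build would use an FFT and probably logarithmic bands and smoothing between frames, but the band-per-LED idea stays the same.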

All code, mostly written in Go, is freely available on [Hoani]’s GitHub page. It also includes a separate audio processing library called toot that [Hoani] wrote to help him interface with the microphone and do spectral analysis. Anyone with basic electronics skills can now build their own AI companion and talk to it – something that ham radio operators have been doing for a while.

My Glasses Hear Everything I’m Not Saying!

There was a time when, if you saw someone walking down the street talking to no one, they were probably crazy. Now you have to look for a Bluetooth headset. But soon they may just be quietly talking to their glasses. Cornell University researchers have developed EchoSpeech, which uses sonar-like sensors in a pair of glasses to watch your lips and mouth move. From that data, they can figure out what you are saying, even if you don’t really say it out loud. You can see a video of the glasses below.

There are a few advantages to a method like this. For one thing, you can speak commands even in places where you can’t talk out loud to a microphone. There have been HAL 9000-like attempts to read lips with cameras, but this is power-hungry and video tends to be data intensive.

Retrotechtacular: Voice Controlled Typewriter Science Project In 1958

Hackaday readers might know [Victor Scheinman] as the pioneer who built some of the first practical robot arms. But what was a kid like that doing in high school? Thanks to a film about the 1958 New York City Science Fair, we know he was building a voice-activated typewriter. Don’t believe it? Watch it yourself below, thanks to [David Hoffman].

Ok, we know. Voice typing is no big deal today, and, frankly, [Victor’s] attempt isn’t going to amaze anyone today. But think about it. It was 1958! All those boat anchor ham radios behind him aren’t antiques. That’s what radios looked like in 1958. Plus, the kid is 16 years old. We’d say he did pretty darn good!

The Voice Of ChatGPT Is Now On The Air

AIs can now apparently carry on a passable conversation, depending on what you classify as passable conversation. The quality of your local pub’s banter aside, an AI stuck in a text box doesn’t have much of a living, human quality. An AI that holds a conversation aloud, though, is another thing entirely. [William Franzin] has whipped up just that on amateur radio. (Video, embedded below.)

The concept is straightforward, if convoluted. A DSTAR digital voice transmission is received, which is then transcoded to regular digital audio. The audio then goes through a voice recognition engine, and that is used as a question for a ChatGPT AI. The AI’s output is then fed to a text-to-speech engine, and it speaks back with its own voice over the airwaves.

[William] demonstrates the system, keying up a transmitter to ask the AI how to get an amateur radio licence. He gets a pretty comprehensive reply in return.

The result is that radio amateurs can call in to ChatGPT with questions, and can receive actual spoken responses from the AI. We can imagine within the next month, AIs will be chatting it up all over the airwaves with similar setups. After all, a few robots could only add more diversity to the already rich and varied ham radio community. Video after the break.

An Impressively Functional Tacobot

We’re big fans of useless machines here at Hackaday; there’s something undeniably entertaining about watching a gadget flail about dramatically without actually making any progress towards a defined goal. But what happens when one of these meme machines ends up working too well? We think that’s just what we might be witnessing here with the Tacobot from [Vije Miller].

On the surface, building an elaborate robotic contraption to (slowly) produce tacos is patently ridiculous. Doubly so when you tack on the need to give it voice commands like it’s some kind of one-dish version of the Star Trek food replicator. The whole thing sounds like the setup for a joke, an assumption that’s only reinforced after watching the dramatized video at the break. But in the end, we still can’t get over how well the thing appears to work.

After [Vije] gives it a list of ingredients to dispense, a robotic arm drops a tortilla on a fantastically articulated rotating platform that can not only spin and move in two dimensions, but can form the soft shell into the appropriate taco configuration. The empty shell is then brought under a rotating dispenser that doles out (or at least attempts to) the requested ingredients such as beef, onions, cheese, and lettuce. With a final flourish, it squirts out a few pumps of the selected sauce, and then presents the completed taco to the user.

The only failing appears to be the machine’s ability to dispense some of the ingredients. The ground beef seems to drop into place without issue, but it visibly struggles with the wetter foodstuffs such as the tomatoes and onions. All we know is that if a robot handed us a taco with that little lettuce on it, we’d have a problem. On the project page [Vije] acknowledges the issue, and says that a redesigned dispenser could help alleviate some of the problem.

The issue immediately brought to mind the fascinating series of posts dedicated to handling bulk material penned by our very own [Anne Ogborn]. While the application here might be a bit tongue-in-cheek, it’s still a perfect example of the interesting phenomena that you run into when trying to meter out different types of materials.

The octagonal wooden box described in the project. On the left, the outer surface of the box is shown, with the "Say Friend And Come In" inscription, as well as a few drawings (presumably from The Lord of the Rings) and two metallic-colored stars that serve as capacitive sensor electrodes. On the right, the underside of the lid is shown, with all the electronics glued into CNC-machined channels.

Say Friend And Have This Box Open For You

Handcrafted gifts are special, and this one’s no exception. [John Pender] made a Tolkien-inspired box for his son and shared the details with us on Hackaday.io. This one-of-a-kind handcrafted box fulfills one role and does it perfectly – just like with the Doors of Durin, you have to say ‘friend’ in Elvish, and the box shall unlock for you.

This box, carefully engraved and with attention paid to its surface finish, stands on its own as a gift. However, with the voice recognition function, it’s a project complicated enough to cover quite a few fields at once – woodworking, electronics, and software. The electronics are laid out in CNC-machined channels, and LED strips illuminate the “Say Friend And Come In” inscriptions once the box is ready to listen. If you’re wondering how the unlocking process works, the video embedded below shows it all.

Two solenoids keep the lid locked, and in its center is a Pi Zero, the brains of the operation. With small batteries and a power-hungry board, power management is a bit intricate. Two capacitive sensors and a small power management device are always powered up. When both of the sensors are touched, a power switch module from Pololu wakes the Pi up. It, in turn, takes its sweet time, as fully-fledged Linux boards do, and lights up the LED strip once it’s listening.
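The wake-up sequence above can be sketched as a small state machine. This is an illustration only, not [John]'s code: the class and method names are hypothetical, and the password `mellon` is simply the Elvish word for "friend" from the book rather than a detail confirmed by the project write-up.

```python
PASSWORD = "mellon"  # 'friend' in Elvish, per the book; assumed, not from the project


class Box:
    """Toy model of the box: touch both pads, wait for boot, say the word."""

    def __init__(self):
        self.pi_powered = False
        self.leds_on = False
        self.locked = True

    def touch(self, left: bool, right: bool) -> None:
        # The always-on stage only switches the Pi on when both pads are touched.
        if left and right:
            self.pi_powered = True

    def boot_finished(self) -> None:
        # The LED strip signals that the Pi is up and listening.
        if self.pi_powered:
            self.leds_on = True

    def hear(self, word: str) -> None:
        # Words are ignored until the Pi is listening.
        if self.leds_on and word.strip().lower() == PASSWORD:
            self.locked = False  # in hardware, this is where the solenoids would release


box = Box()
box.hear("mellon")        # too early: the Pi is still off
box.touch(True, False)    # one pad alone does nothing
box.touch(True, True)     # both pads wake the Pi
box.boot_finished()
box.hear("Mellon")
print(box.locked)
```

Modelling the sequence this way makes the ordering explicit: nothing downstream of the two touch pads draws power or reacts until the earlier step has happened.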

