Hackaday Prize 2023: A DIY Voice-Control Module

If science fiction taught us anything, it’s that voice control was going to be the human-machine interface of the future. [Dennis] has now whipped up a tutorial that lets you add a voice control module to any of your own projects.

The voice control module uses a Raspberry Pi 4 as the brains of the operation, paired with a Seeed Studio ReSpeaker 4-microphone array. The Pi provides a good amount of processing power to crunch through the audio, while the mic array captures high-quality audio from any direction, which is key to reliable performance. Rhasspy is used as the software element, which is responsible for processing audio in a variety of languages to determine what the user is asking for. Based on the voice commands received, Rhasspy can then run just about anything you could possibly require, from sending MQTT smart home commands to running external programs.
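As a rough sketch of that last step, here is how a script might decode a recognized intent before acting on it. It assumes Rhasspy is publishing intents as JSON over MQTT in the Hermes-style layout (an `intent.intentName` field plus a `slots` list). The `ChangeLightState` intent, the slot names, and the sample payload are made up for illustration, and the MQTT client itself is left out.

```python
import json


def parse_intent(payload: str):
    """Return (intent_name, slots) from a Hermes-style intent message."""
    message = json.loads(payload)
    name = message["intent"]["intentName"]
    # Each slot carries its name and a nested value object.
    slots = {s["slotName"]: s["value"]["value"] for s in message.get("slots", [])}
    return name, slots


def handle(payload: str) -> str:
    """Map a parsed intent to an action string (hypothetical actions)."""
    name, slots = parse_intent(payload)
    if name == "ChangeLightState":
        return f"turn {slots.get('state', 'on')} the {slots.get('name', 'light')}"
    return "ignored"


# Hypothetical payload, shaped like an assumed Hermes intent message.
SAMPLE = json.dumps({
    "input": "turn on the kitchen light",
    "intent": {"intentName": "ChangeLightState", "confidenceScore": 1.0},
    "slots": [
        {"slotName": "state", "value": {"kind": "Unknown", "value": "on"}},
        {"slotName": "name", "value": {"kind": "Unknown", "value": "kitchen light"}},
    ],
})

if __name__ == "__main__":
    print(handle(SAMPLE))
```

In a real setup, `handle` would be called from an MQTT subscriber's message callback, and the returned string would be replaced by whatever actually switches the light.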

If you’ve always dreamed of whipping up your own version of Jarvis from Iron Man, or you just want a non-cloud solution to turn your lights on and off, [Dennis’s] tutorial is a great place to start. Video after the break.


Voice Without Sound

Voice recognition is becoming more and more common, but anyone who’s ever used a smart device can attest that they aren’t exactly fool-proof. They can activate seemingly at random, don’t activate when called or, most annoyingly, completely fail to understand the voice commands. Thankfully, researchers from the University of Tokyo are looking to improve the performance of devices like these by attempting to use them without any spoken voice at all.

The project is called SottoVoce and uses an ultrasound imaging probe placed under the user’s jaw to detect internal movements in the speaker’s larynx. The imaging generated from the probe is fed into a series of neural networks, trained with hundreds of speech patterns from the researchers themselves. The neural networks then piece together the likely sounds being made and generate an audio waveform which is played to an unmodified Alexa device. Obviously a few improvements would need to be made to the ultrasonic imaging device to make this usable in real-world situations, but it is interesting from a research perspective nonetheless.

The research paper with all the details is also available (PDF warning). It’s an intriguing approach to improving the performance of voice recognition, especially in situations where the voice may be muffled, non-existent, or drowned out by background noise. Machine learning like this seems to be one of the more powerful tools for improving speech recognition, as we saw with this robot that can walk across town and order food for you using voice commands only.


The Voice Of ChatGPT Is Now On The Air

AIs can now apparently carry on a passable conversation, depending on what you classify as passable conversation. The quality of your local pub’s banter aside, an AI stuck in a text box doesn’t have much of a living, human quality. An AI that holds a conversation aloud, though, is another thing entirely. [William Franzin] has whipped up just that on amateur radio. (Video, embedded below.)

The concept is straightforward, if convoluted. A D-STAR digital voice transmission is received, which is then transcoded to regular digital audio. The audio then goes through a voice recognition engine, and the resulting text is used as a question for a ChatGPT AI. The AI’s output is then fed to a text-to-speech engine, and it speaks back with its own voice over the airwaves.
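The chain of stages is easy to picture as a small function. This is only a sketch of the data flow, not [William]’s code: the radio decoding is left out, and the three stages are passed in as plain callables, with made-up stand-ins so the example runs on its own.

```python
from typing import Callable


def answer_over_air(
    audio_in: bytes,
    speech_to_text: Callable[[bytes], str],
    ask_model: Callable[[str], str],
    text_to_speech: Callable[[str], bytes],
) -> bytes:
    """Run one received transmission through the recognize/ask/speak chain."""
    question = speech_to_text(audio_in).strip()
    if not question:
        # Nothing intelligible was heard, so there is nothing to transmit.
        return b""
    reply = ask_model(question)
    return text_to_speech(reply)


# Stand-in stages (hypothetical); real ones would wrap a speech recognizer,
# a chat model API, and a speech synthesizer.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_model(question: str) -> str:
    return f"You asked: {question}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    print(answer_over_air(b"how do I get a licence?", fake_stt, fake_model, fake_tts))
```

Keeping the stages as injected callables means any one of them can be swapped out without touching the rest of the chain.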

[William] demonstrates the system, keying up a transmitter to ask the AI how to get an amateur radio licence. He gets a pretty comprehensive reply in return.

The result is that radio amateurs can call in to ChatGPT with questions, and can receive actual spoken responses from the AI. We can imagine within the next month, AIs will be chatting it up all over the airwaves with similar setups. After all, a few robots could only add more diversity to the already rich and varied ham radio community. Video after the break.


Translating And Broadcasting Spoken Morse Code

When the first radios and telegraph lines were put into service, essentially the only way to communicate was to use Morse code. The first transmitters had extremely inefficient designs by today’s standards, so this was more a practical limitation than a choice. As the technology evolved, there was less and less reason to use Morse to communicate, but plenty of amateur radio operators still use this mode, including [Kevin] aka [KB9RLW], who has built a circuit that translates spoken Morse code into a broadcast Morse radio signal.

The circuit works by feeding the signal from a microphone into an Arduino. The Arduino watches for the audio level to cross a certain threshold and keys the radio when it detects a word being spoken. Radio operators use the words “dit” and “dah” for dots and dashes respectively, and the Arduino isn’t really translating the words so much as it is sending a signal for the duration of however long each word takes to say. The software for the Arduino is provided on the project’s GitHub page as well, and uses a number of approaches to make sure the keyed signal is as clean as possible.
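To illustrate the general idea of threshold keying, here is a minimal sketch in Python rather than Arduino code. It is not [Kevin]’s firmware: the two thresholds (hysteresis) and the `hold` count are assumptions, chosen as one plausible way to keep the key from chattering when the level hovers near the trigger point.

```python
def key_states(levels, on_threshold=600, off_threshold=400, hold=3):
    """Turn a stream of loudness readings into key-down/key-up states.

    levels: iterable of non-negative loudness readings (assumed units).
    on_threshold / off_threshold: hysteresis pair so noise near a single
        level does not make the key chatter.
    hold: consecutive quiet readings required before releasing the key.
    """
    keyed = False
    quiet = 0
    states = []
    for level in levels:
        if not keyed:
            if level >= on_threshold:
                keyed = True
                quiet = 0
        elif level < off_threshold:
            quiet += 1
            if quiet >= hold:
                keyed = False
        else:
            quiet = 0
        states.append(keyed)
    return states


def key_down_runs(states):
    """Length of each consecutive key-down run, in samples."""
    runs, current = [], 0
    for state in states:
        if state:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


if __name__ == "__main__":
    # A short "dit" followed by a longer "dah", separated by near-silence.
    levels = [0] * 2 + [700] * 2 + [50] * 5 + [700] * 6 + [50] * 5
    print(key_down_runs(key_states(levels)))
```

Because the key is simply held for as long as the word is loud, a longer spoken “dah” naturally produces a longer key-down run than a “dit”, with no word recognition involved.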

[Kevin] mentions that this device could be used by anyone who wishes to operate a radio in this mode who might have difficulty using a traditional Morse key and who doesn’t want to retrain their brain to use other available equipment like a puff straw or a foot key. The circuit is remarkably straightforward for what it does, and in the video below it seems [Kevin] is having a blast using it. If you’re still looking to learn to “speak” Morse code, though, take a look at this guide which goes into detail about it.

Thanks to [Dragan] for the tip!


Voice Command Made Mostly Easy

Speech commands are all the rage on everything from digital assistants to cars. Adding them to your own projects is a lot of work, right? Maybe not. [Electronoobs] shows a speech board that lets you easily integrate 255 voice commands via serial communications with a host computer. You can see the review in the video below.

He had actually used a similar board a few years ago, and the new module has, of course, many new features. As of version 3.1, the board can handle 255 commands in a more flexible way than the older versions.


Making Linux Offline Voice Recognition Easier

For just about any task you care to name, a Linux-based desktop computer can get the job done using applications that rival or exceed those found on other platforms. However, that doesn’t mean it’s always easy to get it working, and speech recognition is just one of those difficult setups.

A project called Voice2JSON is trying to simplify the use of voice workflows. While it doesn’t provide the actual voice recognition, it does make it easier to get things going and then use speech in a natural way.

The software can integrate with several backends to do offline speech recognition including CMU’s pocketsphinx, Dan Povey’s Kaldi, Mozilla’s DeepSpeech 0.9, and Kyoto University’s Julius. However, the code is more than just a thin wrapper around these tools. The fast training process produces both a speech recognizer and an intent recognizer. So not only does it recognize that you mentioned the garage door, it also understands whether you want that door opened or closed.
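The difference between recognizing words and recognizing intent can be shown with a toy example. This is not Voice2JSON’s code or its template format: the regular-expression templates, the intent names, and the shape of the returned dictionary are all assumptions made for illustration.

```python
import re

# Hypothetical intent templates; named groups become slots.
INTENTS = {
    "GarageDoor": re.compile(r"^(?P<action>open|close) the garage door$"),
    "GetTime": re.compile(r"^what time is it$"),
}


def recognize(text: str) -> dict:
    """Match a transcript against the templates and return intent plus slots."""
    normalized = " ".join(text.lower().split())
    for name, pattern in INTENTS.items():
        match = pattern.match(normalized)
        if match:
            return {
                "text": normalized,
                "intent": {"name": name},
                "slots": match.groupdict(),
            }
    # No template matched: report an empty intent rather than guessing.
    return {"text": normalized, "intent": {"name": ""}, "slots": {}}


if __name__ == "__main__":
    print(recognize("Open the  garage door"))
```

A program consuming this result only needs to look at the intent name and the `action` slot, rather than picking apart the raw transcript itself.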


APRS Implemented At Low Cost And Small Size

Before smartphones and Internet of Things devices were widely distributed, the Automatic Packet Reporting System (APRS) was the way to send digital information out wirelessly from remote locations. In use since the 80s, it now has an almost hipster “wireless data before it was cool” vibe, complete with plenty of people who use it because it’s interesting, and plenty of others who still need the unique functionality it offers even when compared to more modern wireless data transmission methods. One of those is [Tyler] who shows us how to build an APRS system for a minimum of cost and size.

[Tyler]’s build is called Arrow and operates on the popular 2 metre ham radio band. It’s a Terminal Node Controller (TNC), a sort of ham radio modem, built around an ESP32. The ESP32 handles both the signal processing for the data and also uses its Bluetooth capability to pair to an Android app called APRSDroid. The entire module is only slightly larger than the 18650 battery that powers it, and it can be paired with a computer to send and receive any digital data that you wish using this module as a plug-and-play transceiver.
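For a feel of what travels between a host app and a TNC, here is a small sketch of KISS framing, a byte-stuffing scheme many TNCs use on their host link. Whether Arrow itself speaks KISS is an assumption here, not something stated above, and the payload in the example is arbitrary bytes rather than a real AX.25 packet.

```python
# KISS special bytes: frame end, frame escape, and their transposed forms.
FEND, FESC, TFEND, TFESC = 0xC0, 0xDB, 0xDC, 0xDD


def kiss_encode(payload: bytes) -> bytes:
    """Wrap payload in a KISS data frame for port 0."""
    out = bytearray([FEND, 0x00])  # 0x00 = data frame on port 0
    for byte in payload:
        if byte == FEND:
            out += bytes([FESC, TFEND])
        elif byte == FESC:
            out += bytes([FESC, TFESC])
        else:
            out.append(byte)
    out.append(FEND)
    return bytes(out)


def kiss_decode(frame: bytes) -> bytes:
    """Recover the payload from a single complete KISS data frame."""
    if len(frame) < 3 or frame[0] != FEND or frame[-1] != FEND:
        raise ValueError("not a complete KISS frame")
    body = frame[2:-1]  # drop the leading FEND, command byte, trailing FEND
    out = bytearray()
    i = 0
    while i < len(body):
        byte = body[i]
        if byte == FESC:
            i += 1
            if i >= len(body):
                raise ValueError("dangling escape byte")
            if body[i] == TFEND:
                out.append(FEND)
            elif body[i] == TFESC:
                out.append(FESC)
            else:
                raise ValueError("invalid escape sequence")
        else:
            out.append(byte)
        i += 1
    return bytes(out)


if __name__ == "__main__":
    frame = kiss_encode(b"\xc0A\xdb")
    print(frame.hex(), kiss_decode(frame) == b"\xc0A\xdb")
```

The escaping is what lets the frame-end byte mark packet boundaries unambiguously, even when the same value appears inside the data.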

While the build still has a few limitations that [Tyler] notes, he hopes that the project will be a way to modernize the APRS protocol using methods for radio transmission that have been improved upon since APRS was first implemented. It should be able to interface easily into any existing ham radio setup, although even small balloon-lofted radio stations can make excellent use of APRS without any extra equipment. Don’t forget that you need a license to operate these in most places, though!