New Wearable Detects Imminent Vocal Fatigue

“The show must go on,” so they say. These days, whether you’re an opera singer, a teacher, or just someone with a lot of video meetings, you rely on your voice to work. But what if your voice is under threat? Work it too hard, or for too long, and you might find that it suddenly lets you down.

Researchers from Northwestern University have developed a new technology to guard against this eventuality. It’s the first wearable device that monitors vocal usage and calls for a time-out before damage occurs. The research has been published in the Proceedings of the National Academy of Sciences.

Continue reading “New Wearable Detects Imminent Vocal Fatigue”

Hackaday Prize 2023: A DIY Voice-Control Module

If science fiction taught us anything, it’s that voice control was going to be the human-machine interface of the future. [Dennis] has now whipped up a tutorial that lets you add a voice control module to any of your own projects.

The voice control module uses a Raspberry Pi 4 as the brains of the operation, paired with a Seeed Studio ReSpeaker 4-microphone array. The Pi provides a good amount of processing power to crunch through the audio, while the mic array captures high-quality audio from any direction, which is key to reliable performance. Rhasspy handles the software side, processing audio in a variety of languages to determine what the user is asking for. Based on the voice commands received, Rhasspy can then trigger just about anything you could possibly require, from sending MQTT smart home commands to running external programs.
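As a taste of what acting on a command looks like, here’s a minimal Python sketch that listens for a Rhasspy intent over MQTT, using the Hermes-protocol topics Rhasspy publishes to by default. The ChangeLightState intent, its state slot, and the broker address are placeholders for whatever you configure in your own profile:

```python
# Minimal sketch: react to a Rhasspy intent over MQTT (Hermes protocol).
# Assumes Rhasspy publishes to its default hermes/intent/<name> topics;
# the broker address and the "ChangeLightState" intent are placeholders.
import json
import paho.mqtt.client as mqtt

def on_connect(client, userdata, flags, rc):
    client.subscribe("hermes/intent/#")  # all recognized intents

def on_message(client, userdata, msg):
    payload = json.loads(msg.payload)
    intent = payload["intent"]["intentName"]
    if intent == "ChangeLightState":
        # Hermes slots arrive as a list of {slotName, value} objects
        slots = {s["slotName"]: s["value"]["value"]
                 for s in payload.get("slots", [])}
        print("Turn the light", slots.get("state", "?"))

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect("localhost", 1883)  # Rhasspy's MQTT broker
client.loop_forever()
```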

If you’ve always dreamed of whipping up your own version of Jarvis from Iron Man, or you just want a non-cloud solution to turn your lights on and off, [Dennis’s] tutorial is a great place to start. Video after the break.

Continue reading “Hackaday Prize 2023: A DIY Voice-Control Module”

Voice Without Sound

Voice recognition is becoming more and more common, but anyone who’s ever used a smart device can attest that they aren’t exactly foolproof. They can activate seemingly at random, fail to activate when called, or, most annoyingly, completely misunderstand voice commands. Thankfully, researchers from the University of Tokyo are looking to improve the performance of devices like these by attempting to use them without any spoken voice at all.

The project is called SottoVoce and uses an ultrasound imaging probe placed under the user’s jaw to detect internal movements in the speaker’s larynx. The imagery from the probe is fed into a series of neural networks, trained with hundreds of speech patterns from the researchers themselves. The neural networks then piece together the likely sounds being made and generate an audio waveform, which is played to an unmodified Alexa device. Obviously, a few improvements would need to be made to the ultrasonic imaging device to make this practical in real-world situations, but it is interesting from a research perspective nonetheless.
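To make the idea concrete, here’s a toy PyTorch sketch of that kind of learned mapping, from a stack of ultrasound frames to a chunk of audio waveform. Every shape and layer here is invented for illustration and bears no relation to the actual networks in the paper:

```python
# Toy sketch of the idea only: map a stack of ultrasound frames to a
# short audio waveform chunk. Layer sizes and tensor shapes are made up
# for illustration and are nothing like the paper's real networks.
import torch
import torch.nn as nn

class UltrasoundToAudio(nn.Module):
    def __init__(self, samples_out=1600):   # e.g. 100 ms at 16 kHz
        super().__init__()
        self.encoder = nn.Sequential(        # encode an 8-frame 64x64 stack
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.decoder = nn.Linear(32 * 16 * 16, samples_out)

    def forward(self, frames):               # frames: (batch, 8, 64, 64)
        return torch.tanh(self.decoder(self.encoder(frames)))

model = UltrasoundToAudio()
waveform = model(torch.randn(1, 8, 64, 64))  # -> (1, 1600) audio samples
```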

The research paper with all the details is also available (PDF warning). It’s an intriguing approach to improving voice recognition, especially in situations where the voice may be muffled, non-existent, or buried in background noise. Machine learning like this seems to be one of the more powerful tools for improving speech recognition, as we saw with this robot that can walk across town and order food for you using voice commands only.

Continue reading “Voice Without Sound”

The Voice Of ChatGPT Is Now On The Air

AIs can now apparently carry on a passable conversation, depending on what you classify as passable conversation. The quality of your local pub’s banter aside, an AI stuck in a text box doesn’t feel particularly human. An AI that holds a conversation aloud, though, is another thing entirely. [William Franzin] has whipped up just that on amateur radio.

The concept is simple, if a little convoluted in execution. A DSTAR digital voice transmission is received and transcoded to regular digital audio. The audio then goes through a voice recognition engine, and the resulting text is passed to ChatGPT as a prompt. The AI’s output is fed to a text-to-speech engine, which speaks the reply back over the airwaves in its own voice.
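Stripped of the radio plumbing, the middle of that chain is easy to experiment with on a desktop. Below is a hedged sketch using the speech_recognition, openai, and pyttsx3 Python packages; [William]’s actual setup differs, and the model name and the (pre-1.0) openai API style here are just one plausible choice:

```python
# Sketch of the recognize -> ChatGPT -> speak loop, minus the radio link.
# Assumes the speech_recognition, openai, and pyttsx3 packages; the model
# name and this (pre-1.0) openai API style are just one plausible choice.
import openai
import pyttsx3
import speech_recognition as sr

openai.api_key = "sk-..."  # your key here
recognizer = sr.Recognizer()
voice = pyttsx3.init()

with sr.Microphone() as mic:   # stand-in for the transcoded DSTAR audio
    audio = recognizer.listen(mic)
question = recognizer.recognize_google(audio)

reply = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": question}],
)
answer = reply.choices[0].message.content

voice.say(answer)              # in the real rig, this keys the transmitter
voice.runAndWait()
```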

[William] demonstrates the system, keying up a transmitter to ask the AI how to get an amateur radio licence. He gets a pretty comprehensive reply in return.

The result is that radio amateurs can call in to ChatGPT with questions, and receive actual spoken responses from the AI. We can imagine that within the next month, AIs will be chatting it up all over the airwaves with similar setups. After all, a few robots could only add more diversity to the already rich and varied ham radio community. Video after the break.

Continue reading “The Voice Of ChatGPT Is Now On The Air”

Translating And Broadcasting Spoken Morse Code

When the first radios and telegraph lines were put into service, Morse code was essentially the only way to communicate. The first transmitters had extremely inefficient designs by today’s standards, so this was more a practical limitation than a choice. As the technology evolved, there was less and less reason to use Morse to communicate, but plenty of amateur radio operators still use the mode, including [Kevin], aka [KB9RLW], who has built a circuit that translates spoken Morse code into a keyed Morse radio signal.

The circuit works by feeding the signal from a microphone into an Arduino. The Arduino listens for audio above a certain threshold and keys the radio whenever it detects a word being spoken. Radio operators say “dit” and “dah” for dots and dashes respectively, and the Arduino isn’t really translating the words so much as keying the transmitter for however long each word takes to say. The software for the Arduino is provided on the project’s GitHub page, and uses a number of approaches to keep the keyed signal as clean as possible.
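The core trick is simple envelope detection: key the transmitter down while the microphone level stays above a threshold, and key it up when the level drops. A rough Python analogue of that logic (the threshold, block size, and key_radio() stand-in are all placeholders; [Kevin]’s Arduino code is the real reference) might look like this:

```python
# Rough analogue of the keying logic: key the radio while microphone
# energy stays above a threshold. THRESHOLD, BLOCK, and the key_radio()
# stand-in are placeholders; [Kevin]'s Arduino sketch is the reference.
import numpy as np
import sounddevice as sd

THRESHOLD = 0.05   # tune to your microphone and voice level
BLOCK = 256        # samples per analysis block (~16 ms at 16 kHz)

def key_radio(down: bool):
    print("KEY DOWN" if down else "key up")  # replace with a GPIO/PTT line

keyed = False
def callback(indata, frames, time, status):
    global keyed
    loud = np.sqrt(np.mean(indata**2)) > THRESHOLD  # RMS envelope
    if loud != keyed:
        keyed = loud
        key_radio(keyed)

with sd.InputStream(channels=1, samplerate=16000,
                    blocksize=BLOCK, callback=callback):
    sd.sleep(60_000)  # listen for a minute
```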

[Kevin] mentions that this device could be used by anyone who wants to operate in this mode but has difficulty using a traditional Morse key, and who doesn’t want to retrain their brain on other available equipment like a puff straw or a foot key. The circuit is remarkably straightforward for what it does, and in the video below it seems [Kevin] is having a blast using it. If you’re still looking to learn to “speak” Morse code, though, take a look at this guide, which goes into detail about it.

Thanks to [Dragan] for the tip!

Continue reading “Translating And Broadcasting Spoken Morse Code”

Voice Command Made Mostly Easy

Speech commands are all the rage on everything from digital assistants to cars. Adding them to your own projects is a lot of work, right? Maybe not. [Electronoobs] shows a speech board that lets you easily integrate up to 255 voice commands via serial communications with a host computer. You can see his review in the video below.

He had actually used a similar board before, but that was a few years ago, and the new module has, of course, many new features. As of version 3.1, the board can handle 255 commands in a more flexible way than older versions.
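[Electronoobs] drives the board from a microcontroller, but a serial link this simple is easy to poke at from a PC as well. Here’s a hypothetical pyserial sketch, assuming the module reports each recognized command as a single-byte ID; check the module’s datasheet for the real framing:

```python
# Hypothetical host-side sketch: read recognized-command IDs from the
# module over serial. The port name, baud rate, and one-byte-ID protocol
# are assumptions; consult the module's datasheet for the real framing.
import serial

ACTIONS = {
    1: "lights on",
    2: "lights off",
    3: "fan on",
}

with serial.Serial("/dev/ttyUSB0", 9600, timeout=1) as port:
    while True:
        data = port.read(1)   # one byte per recognized command
        if data:
            cmd = data[0]
            print(f"Command {cmd}: {ACTIONS.get(cmd, 'unmapped')}")
```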

Continue reading “Voice Command Made Mostly Easy”

Making Linux Offline Voice Recognition Easier

For just about any task you care to name, a Linux-based desktop computer can get the job done using applications that rival or exceed those found on other platforms. However, that doesn’t mean everything is always easy to get working, and speech recognition is one of the trickier setups.

A project called Voice2JSON is trying to simplify the use of voice workflows. While it doesn’t provide the actual voice recognition, it does make it easier to get things going and then use speech in a natural way.

The software can integrate with several backends to do offline speech recognition, including CMU’s pocketsphinx, Dan Povey’s Kaldi, Mozilla’s DeepSpeech 0.9, and Kyoto University’s Julius. However, the code is more than just a thin wrapper around these tools. The fast training process produces both a speech recognizer and an intent recognizer. So not only does the system know you said “garage door”, it understands whether you want the garage door opened or closed.
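To give a flavor of the workflow, here’s a hedged Python sketch that shells out to the voice2json command-line tools to turn a WAV file into a recognized intent. The command names follow the project’s documentation, but treat the exact JSON fields as assumptions to verify against your own profile:

```python
# Sketch of the voice2json pipeline from Python: transcribe a WAV, then
# recognize the intent. Command names follow the project docs; treat the
# exact JSON fields as assumptions to check against your own profile.
import json
import subprocess

def intent_from_wav(path: str) -> dict:
    transcription = subprocess.run(
        ["voice2json", "transcribe-wav", path],
        capture_output=True, text=True, check=True,
    ).stdout
    recognized = subprocess.run(
        ["voice2json", "recognize-intent"],
        input=transcription, capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(recognized)

result = intent_from_wav("open_garage.wav")
print(result["intent"]["name"], result.get("slots", {}))
```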

Continue reading “Making Linux Offline Voice Recognition Easier”