
Turning GLaDOS Into Ted: A Tale Of A Talking Toy

What if your old, neglected toys could come to life, with a bit of sass? That’s exactly what [Binh] achieved when he transformed his sister’s worn-out teddy bear into ‘Ted’, an interactive talking plush with a personality of its own. The project pairs the GLaDOS Personality Core project, an homage to the snarky AI from the Portal series, with some clever microcontroller tinkering to give a childhood favorite a whole new attitude.

[Binh] started with the basics: a teddy bear already equipped with buttons and speakers, which he overhauled with an ESP32 microcontroller. The personality originated with GLaDOS, but [Binh] rewrote it to fit a cheeky teddy-bear tone. With a few tweaks to the Python-based fork, he added threads to handle touch-based interaction: the ESP32 detects where the bear is being touched and passes that input to the modified neural network, which then generates a response. The bear can, for instance, call you out for holding his paw too long, or sarcastically plead for mercy. ‘But Ted could do so much more!’ we hear you say. Maybe, but perhaps this is all an innocent bear with a personality should be capable of.
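For a feel of how the touch side of such a build might work, here is a minimal MicroPython sketch for the ESP32’s capacitive touch peripheral. To be clear, this is our own illustration, not [Binh]’s firmware: the pin choices, threshold, and respond() hook are all assumptions.

```python
# Illustrative MicroPython sketch (not [Binh]'s actual firmware).
# Pin numbers, the threshold, and respond() are stand-ins for this example.
import time
from machine import Pin, TouchPad

pads = {
    "paw": TouchPad(Pin(4)),      # ESP32 touch-capable pins
    "belly": TouchPad(Pin(15)),
}
THRESHOLD = 300                   # readings below this count as a touch; tune per build
held_since = {}

def respond(region, seconds):
    # In the real project, touch events feed the personality model;
    # here we just print what would be sent.
    print("touched:", region, "for", round(seconds, 1), "s")

while True:
    now = time.time()
    for region, pad in pads.items():
        touched = pad.read() < THRESHOLD
        if touched and region not in held_since:
            held_since[region] = now                           # touch started
        elif not touched and region in held_since:
            respond(region, now - held_since.pop(region))      # touch ended
    time.sleep_ms(50)
```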

Still, we can imagine future iterations featuring capacitive touch sensors or accelerometers to detect movement. The project is simple, but it showcases the potential of intelligent plush toys. It might raise some questions, too.

Continue reading “Turning GLaDOS Into Ted: A Tale Of A Talking Toy”

Hypersonic Speech Jammer Works At A Distance

Speech jammers were a meme a little while back. Feeding a slightly delayed copy of someone’s own voice back to their ears makes it nearly impossible for most people to keep talking, since our speech system runs on a continual feedback loop. [Benn Jordan] decided to rework that concept by replacing the headphones with a directed sound projector.

The key to the project is the use of hypersonic sound arrays. These use high-frequency sound, well beyond the range of human hearing, as a carrier for a lower-frequency audio signal. The audible signal modulates the ultrasonic carrier, and nonlinearities in the air itself demodulate it back into sound you can hear. The result is a highly directional audio beam, a kind of “sound laser” that can be pointed directly at a person so only they hear it, and becomes inaudible when aimed slightly away.
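As a concrete illustration of the trick, here is a short numpy sketch of the amplitude modulation a parametric array performs. The sample rate, 40 kHz carrier, and 440 Hz test tone are illustrative values of our choosing, not specifics from [Benn]’s hardware.

```python
# Sketch of the parametric-array principle: an audible signal
# amplitude-modulates an ultrasonic carrier, and the air's nonlinearity
# demodulates it back into audible sound along a narrow beam.
import numpy as np

fs = 192_000                                 # sample rate high enough for the carrier
t = np.arange(0, 0.1, 1 / fs)                # 100 ms of signal

audio = 0.5 * np.sin(2 * np.pi * 440 * t)    # stand-in for the voice signal
carrier = np.sin(2 * np.pi * 40_000 * t)     # 40 kHz ultrasonic carrier

# Classic amplitude modulation: bias the audio so the envelope stays positive.
transmitted = (1 + audio) * carrier
```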

The narrow beam allows the delayed voice signal to be fired at a person’s head with a relatively tight spatial spread. When an individual speaks into a microphone hooked up to the device, the delayed audio is sent through the hypersonic array back to the speaker’s ears, garbling their speech as their brain gets confused by the feedback.
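You can get a feel for the delayed-feedback effect with nothing more than a computer and headphones. Below is a rough Python stand-in for the jammer’s delay stage using the sounddevice package; the 200 ms delay and sample rate are arbitrary choices, and headphones are essential to avoid a howling feedback loop.

```python
# Toy delayed-auditory-feedback demo: play the microphone back ~200 ms late.
import numpy as np
import sounddevice as sd

fs = 48_000
delay_s = 0.2
buf = np.zeros((int(fs * delay_s), 1), dtype="float32")  # ring buffer of past samples
pos = 0

def callback(indata, outdata, frames, time, status):
    global pos
    for i in range(frames):
        outdata[i] = buf[pos]          # emit audio captured delay_s ago
        buf[pos] = indata[i]           # store the fresh sample
        pos = (pos + 1) % len(buf)

with sd.Stream(samplerate=fs, channels=1, callback=callback):
    input("Try reading aloud; press Enter to stop.")
```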

[Benn] demonstrated the device in public by offering random individuals $100 to read a paragraph out of a book. The speech jammer worked a treat, and [Benn] was able to keep his money… until one amazingly immune individual breezed through the test. Check out our prior coverage of speech jamming technology. Video after the break.

Continue reading “Hypersonic Speech Jammer Works At A Distance”

New Wearable Detects Imminent Vocal Fatigue

“The show must go on,” so they say. These days, whether you’re an opera singer, a teacher, or just someone with a lot of video meetings, you rely on your voice to work. But what if your voice is under threat? Work it too hard, or for too long, and you might find that it suddenly lets you down.

Researchers from Northwestern University have developed a new technology to guard against exactly this. It’s the first wearable device that monitors vocal usage and calls for a time-out before damage occurs. The research has been published in the Proceedings of the National Academy of Sciences.
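The paper describes the actual sensing hardware and dose model; purely to illustrate the “vocal budget” idea, here is a toy Python sketch in which the dose formula and threshold are entirely made-up numbers, not anything from the Northwestern work.

```python
# Toy illustration of a vocal-dose budget, not the model from the paper:
# integrate how long and how loudly someone has been phonating, and warn
# before an arbitrary daily threshold is crossed.
def vocal_dose(loudness_frames, frame_s=1.0, floor=0.1):
    """loudness_frames: per-second loudness estimates from a wearable, 0..1."""
    dose = 0.0
    for loudness in loudness_frames:
        if loudness > floor:             # only count actual phonation
            dose += loudness * frame_s   # louder talking costs more budget
    return dose

DAILY_BUDGET = 1800.0                    # arbitrary example threshold
readings = [0.0, 0.4, 0.6, 0.2, 0.0]     # stand-in sensor data
if vocal_dose(readings) > DAILY_BUDGET:
    print("Time to rest your voice!")
```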

Continue reading “New Wearable Detects Imminent Vocal Fatigue”

Voice Without Sound

Voice recognition is becoming more and more common, but anyone who’s ever used a smart device can attest that they aren’t exactly foolproof. They can activate seemingly at random, fail to activate when called, or, most annoyingly, completely fail to understand the voice commands. Thankfully, researchers from the University of Tokyo are looking to improve the performance of devices like these by attempting to use them without any spoken voice at all.

The project is called SottoVoce and uses an ultrasound imaging probe placed under the user’s jaw to detect internal movements of the speaker’s larynx. The images generated by the probe are fed into a series of neural networks trained with hundreds of speech patterns from the researchers themselves. The networks piece together the likely sounds being made and generate an audio waveform, which is played to an unmodified Alexa device. Obviously, a few improvements to the ultrasonic imaging device would be needed to make this usable in real-world situations, but it is interesting from a research perspective nonetheless.
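The paper spells out the real network design; just to show the general shape of such a pipeline (ultrasound frames in, acoustic features out, a vocoder afterwards), here is a schematic PyTorch sketch in which every layer size and dimension is our own assumption.

```python
# Schematic of a SottoVoce-style pipeline; all dimensions are assumptions.
import torch
import torch.nn as nn

class UltrasoundToAudio(nn.Module):
    def __init__(self):
        super().__init__()
        # Encode each 128x128 ultrasound frame into a feature vector...
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(256),
        )
        # ...then model how the frames evolve over time...
        self.temporal = nn.GRU(256, 256, batch_first=True)
        # ...and predict acoustic features (e.g. 80 mel bins) per frame.
        self.to_spectrum = nn.Linear(256, 80)

    def forward(self, frames):               # frames: (batch, time, 1, 128, 128)
        b, t = frames.shape[:2]
        feats = self.encoder(frames.reshape(b * t, 1, 128, 128))
        seq, _ = self.temporal(feats.reshape(b, t, -1))
        return self.to_spectrum(seq)         # a vocoder would turn this into audio
```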

The research paper with all the details is also available (PDF warning). It’s an intriguing approach to improving the performance and quality of voice recognition, especially in situations where the voice may be muffled, nonexistent, or buried under background noise. Machine learning like this seems to be one of the more powerful tools for improving speech recognition, as we saw with this robot that can walk across town and order food for you using voice commands only.

Continue reading “Voice Without Sound”

Classic 80s Text-To-Speech On Classic 80s Hardware

Those of us who were around in the late 70s and into the 80s might remember the Speak & Spell, a children’s toy with a remarkable text-to-speech synthesizer. While it sounds dated by today’s standards, it was revolutionary for its time, riding a wave of text-to-speech functionality that was starting to arrive on various computers of the era. Many machines used dedicated hardware to perform the speech synthesis, and some were powerful enough to do it in software, but others were not quite up to the task. The VIC-20 was one of the latter, but thanks to an ESP8266 it has retroactively gained this ability.

This project comes to us from [Jan Derogee], a connoisseur of this retrocomputer, and builds on work by [Earle F. Philhower], who ported the retro speech synthesis software known as SAM from assembly to C, making it possible to run on the ESP8266. Audio playback is handled on the I2S port, but some work was needed to get this running smoothly, since that port also handles communication with the VIC-20. Once this was sorted out, a patch was made so the computer’s own audio could be heard alongside the speech synthesizer’s. Finally, [Jan] designed a serial command interface that allows for control of the module.
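We haven’t dug into [Jan]’s protocol, so treat the following as purely hypothetical, but driving a module like this from a PC for testing could be as simple as a few lines of pyserial. The “SAY” command below is a stand-in of our invention, not the real syntax; check the project documentation for the actual commands.

```python
# Hypothetical host-side test of a serial speech module; the command
# string and baud rate are assumptions, not [Jan]'s actual protocol.
import serial  # pyserial

with serial.Serial("/dev/ttyUSB0", 9600, timeout=1) as port:
    port.write(b"SAY HELLO WORLD\r\n")   # ask SAM to speak the phrase
    print(port.readline())               # print any acknowledgement
```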

While not many of us have VIC-20s sitting at home, it’s still an interesting project that shows the broad capabilities of a small, inexpensive chip like the ESP8266, which would have carried a hefty price tag back in the 1980s. If you have other 80s hardware lying around waiting to be put to work, though, take a look at this project which brings new vocabulary words to that old classic Speak & Spell.

Continue reading “Classic 80s Text-To-Speech On Classic 80s Hardware”

Making Linux Offline Voice Recognition Easier

For just about any task you care to name, a Linux-based desktop computer can get the job done with applications that rival or exceed those found on other platforms. That doesn’t mean it’s always easy to get things working, however, and speech recognition remains one of the trickier setups.

A project called voice2json is trying to simplify voice-driven workflows. While it doesn’t provide the actual voice recognition, it does make it easier to get things going and then use speech in a natural way.

The software can integrate with several offline speech recognition backends, including CMU’s pocketsphinx, Dan Povey’s Kaldi, Mozilla’s DeepSpeech 0.9, and Kyoto University’s Julius. However, the code is more than a thin wrapper around these tools. Its fast training process produces both a speech recognizer and an intent recognizer, so the system doesn’t just hear the words “garage door”; it also understands whether you want the door opened or closed.
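The tool is driven from the command line, and its documentation shows the stages chained together with pipes. A small Python wrapper along those lines might look like the sketch below; the “ChangeGarageState” intent and “state” slot are whatever you happen to define in your profile’s sentences.ini, so treat those names as placeholders.

```python
# Minimal sketch: stream live transcriptions through voice2json's intent
# recognizer and act on the JSON events it emits, one per line.
import json
import subprocess

proc = subprocess.Popen(
    "voice2json transcribe-stream | voice2json recognize-intent",
    shell=True, stdout=subprocess.PIPE, text=True,
)

for line in proc.stdout:
    event = json.loads(line)
    intent = event.get("intent", {}).get("name")
    slots = event.get("slots", {})
    if intent == "ChangeGarageState":          # defined in sentences.ini
        print("Garage door command:", slots.get("state"))
```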

Continue reading “Making Linux Offline Voice Recognition Easier”

This Animatronic Mouth Mimics Speech With Servos

Of the 43 muscles that make up the human face, only a few are actually important to speaking. And yet replicating the movements of the mouth by mechanical means always seems to end up only partly convincing. Servos and linkages can only approximate the complex motions that the lips, cheeks, jaw, and tongue are capable of. Still, there are animatronics out there that make a good go at the job, and this somewhat creepy mechanical mouth is a fine example.

Why exactly [Will Cogley] felt the need to build a mechanical maw with terrifying and fairly realistic fangs is anyone’s guess. Recalling his lifelike disembodied animatronic heart build, it just seems like he pursues these builds for the challenge of it all. But if you thought the linkages of the heart were complex, wait till you see what’s needed to make this mouth move realistically. [Will] has stuffed this pie hole with nine servos, all working together to move the jaw up and down, push and pull the corners of the mouth, raise and lower the lips, and bounce the tongue around.

It all seems very complex, but [Will] explains that he actually simplified the mechanical design to concentrate on the software side, which is a text-to-speech movement translator. Text input is translated into phonemes, each of which corresponds to a mouth shape the servos can create. The result is pretty realistic, although somewhat disturbing, especially when the mouth is placed in an otherwise cuddly stuffed bear that serenades you from the nightstand; check out the second video below for that.
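To sketch the idea in a few lines of Python: each phoneme maps to a target pose for the servos, and speech becomes a timed sequence of poses. The pose table, angles, and three-servo simplification below are ours for illustration; the real build coordinates nine servos.

```python
# Toy phoneme-to-mouth-shape player; poses and timings are made-up examples.
import time

POSES = {                                   # per-phoneme servo targets (degrees)
    "AA": {"jaw": 70, "corners": 40, "tongue": 10},   # open "ah"
    "M":  {"jaw": 5,  "corners": 50, "tongue": 10},   # lips closed
    "EE": {"jaw": 25, "corners": 80, "tongue": 30},   # wide "ee"
}

def set_servo(name, angle):
    print(f"{name} -> {angle} deg")         # placeholder for a real servo driver

def speak(phonemes, frame_s=0.12):
    for ph in phonemes:
        for servo, angle in POSES.get(ph, POSES["M"]).items():
            set_servo(servo, angle)
        time.sleep(frame_s)                 # hold each pose about one phoneme long

speak(["M", "AA", "M", "EE"])               # very roughly, "mommy"
```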

[Will] has been doing a bang-up job on animatronics lately, from 3D-printed eyeballs to dexterous mechatronic hands. We’re looking forward to whatever he comes up with next — we think.

Continue reading “This Animatronic Mouth Mimics Speech With Servos”