[geyes30]’s Raspberry Pi project does one thing: it finds arbitrary text in the camera’s view and reads it out loud. Does it do so flawlessly? Not really. Was it at least effortless to put together? Also no, but it does wonderfully illustrate the process of gluing together different bits of functionality to make something new. Also, [geyes30]’s kids find it fascinating, and that’s a win all on its own.
The device is made from a Raspberry Pi and camera and works by sending a still image from the camera to an optical character recognition (OCR) program, which converts any visible text in the image to its ASCII representation. The recognized text is then piped to the espeak engine and spoken aloud. Getting all the tools to play nicely took a bit of work, but [geyes30] documented everything so well that even a novice should be able to get the project up and running in an afternoon.
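The OCR-to-speech half of that pipeline can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not [geyes30]’s actual script: it assumes the OCR program is Tesseract (the write-up above does not name it), that both the tesseract and espeak command-line tools are installed, and that the still image has already been saved to disk. The helper names clean_ocr_text and read_aloud are made up for this example.

```python
import subprocess


def clean_ocr_text(raw: str) -> str:
    """Collapse whitespace and replace non-printable characters with spaces."""
    printable = "".join(
        ch if ch.isprintable() or ch.isspace() else " " for ch in raw
    )
    return " ".join(printable.split())


def read_aloud(image_path: str) -> str:
    """Run OCR on a saved still image and speak whatever text was found."""
    # Assumption: Tesseract is the OCR program. Passing "stdout" as the
    # output base makes it print the recognized text instead of writing a file.
    ocr = subprocess.run(
        ["tesseract", image_path, "stdout"],
        capture_output=True, text=True, check=True,
    )
    text = clean_ocr_text(ocr.stdout)
    if text:
        # espeak reads the text from standard input when given --stdin.
        subprocess.run(["espeak", "--stdin"], input=text, text=True, check=True)
    return text
```

Capturing the still is left out on purpose, since the camera tool varies between Raspberry Pi OS releases; any command that writes an image file can feed read_aloud.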
Sometimes a function like text-to-speech is an end result in and of itself. This was also true of another similar project: Magic Mirror, whose purpose was to tirelessly indulge children’s curiosity about language.
Seeing other projects come to life and learning about new tools is a great way to get new ideas, and documenting them helps cross-pollinate among creative types. Did something inspire you recently, or have you documented your own project? We want to hear about it and so do others, so let us know via the tips line!
Continue reading “Raspberry Pi Reads What It Sees, Delights Children”
Those of us who were around in the late 70s and into the 80s might remember the Speak & Spell, a children’s toy with a remarkable text-to-speech synthesizer. While it sounds dated by today’s standards, it was revolutionary for the time and rode a wave of text-to-speech functionality that was starting to arrive on various computers of the era. Many of them used dedicated hardware to perform the speech synthesis. Some computers were powerful enough to do it in software, but others were not quite able. The VIC-20 was one of the latter, but thanks to an ESP8266 it has been retroactively given this function.
This project comes to us from [Jan Derogee], a connoisseur of this retrocomputer, and builds on the work of [Earle F. Philhower], who ported the retro speech synthesis software known as SAM from assembly to C, which made it possible to run on the ESP8266. Audio playback is handled on the I2S port, but it took some work to get this running smoothly since the port also handles communication with the VIC-20. Once this was sorted out, a patch was made so that the computer’s audio can be heard alongside the speech synthesizer’s. Finally, [Jan] designed a serial command interface that allows for control of the module.
While not many of us have VIC-20s sitting at home, it’s still an interesting project that shows the broad scope of a small and inexpensive chip like the ESP8266, whose capabilities would have carried a hefty price tag back in the 1980s. If you have other 80s hardware lying around waiting to be put to work, though, take a look at this project which brings new vocabulary words to that old classic Speak & Spell.
Continue reading “Classic 80s Text-To-Speech On Classic 80s Hardware”
We all need someone to talk to sometimes, and the pandemic has only made matters worse when it comes to the number of people living with anxiety and depression. Exchanging the simplest of pleasantries can make you feel whole again, but the masks make it hard to engage with strangers and judge their emotions, so your big trip to the grocery store can make you feel lonely in a crowd.
So you go back home, still feeling lonely, and maybe you turn on the TV. Watching people interact is probably the next best thing to actual interaction, and it might even make you laugh. But have you ever wished you could talk to the people on TV? With [aniketdhole]’s EMOJO chatbot, you’ll feel as though you’re among friends. And technically you are — all the dialogue is from the TV show Friends.
In Cast Away, Tom Hanks didn’t give that volleyball a frowny face, now did he? Nor does Wilson have a dopey grin. Instead, he wears a wry smile that suggests depth of character and a grasp of the dire situation at hand. But now we have emoji, and they do a pretty good job of conveying and evoking emotion. EMOJO is a visual chatbot that uses voice and emoji to make easy, two-way conversation to help chase the loneliness away. It uses a Raspberry Pi and a TFT display to take voice input from a Bluetooth headset, convert it to text, and then respond in kind with both voice and text. It was a finalist in the rethink displays round of the Hackaday Prize, and we can’t wait to see how its character develops. Be sure to check out the demo after the break.
Continue reading “EMOJO Chatbot Will Be There For You”
“Sorry. I had music playing. Would you say that again?” If we had a money-unit every time someone tried talking to us while we were wearing headphones, we could afford a super-nice pair. For an Embedded C class, [extremerockets] built Listen Up!, a cutoff switch that pauses your music when someone wants your attention.
The idea was born while sheltering in place with his daughter, who likes loud music. He does not want to holler to get her attention, but rather than deny her some auditory privacy, he built Listen Up! to sample the ambient noise level, listen for a sustained rise in amplitude, like speech, and send a pause signal to the phone. Someday, there may be an option to route the microphone’s audio into the headphones, but for now there is a text-to-speech module for verbalizing character strings. It might be a bit jarring to hear a call to dinner in the middle of a guitar riff, but we don’t like missing dinner either, so we’re with [extremerockets] on this one.
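To make “a sustained rise in amplitude” concrete, here is a minimal sketch of one way such a detector could work. It is not [extremerockets]’s firmware, which was written for an Embedded C class; the function name sustained_rise, the threshold ratio, the hold length, and the baseline smoothing factor are all assumptions picked for illustration.

```python
def sustained_rise(levels, ratio=2.0, hold=5, smoothing=0.05):
    """Return the index where a sustained rise is confirmed, or None.

    levels    -- sequence of ambient amplitude readings (any consistent unit)
    ratio     -- how far above the baseline a reading must be to count as loud
    hold      -- how many consecutive loud readings are required
    smoothing -- how quickly the baseline follows quiet readings
    """
    if not levels:
        return None
    baseline = float(levels[0])
    run = 0
    for i, level in enumerate(levels):
        if level > baseline * ratio:
            run += 1
            if run >= hold:
                return i  # loud for long enough: time to send the pause signal
        else:
            run = 0
            # Only quiet readings update the baseline, so the speech being
            # detected does not drag the threshold upward.
            baseline += smoothing * (level - baseline)
    return None
```

Requiring several loud readings in a row is what separates speech from a dropped fork: a short spike resets the counter before it reaches the hold length.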
We don’t really need lots of money to get fun headphones, and we are not afraid of making our own.
Back in the early 1980s, there was a certain fad in making your computer produce something resembling human speech. There were several hardware solutions to this, adding voices to everything from automated telephone systems to video game consoles, all the way to Steve Jobs using the gimmick to introduce the Macintosh to the world in 1984. In 1982, a software-based version of this synthesis was released for the Atari 8-bit line of computers, and ever since then [rossumur] has wondered whether or not it could run on the very constrained 2600.
Fast-forward 38 years, and he found that the answer is yes: it is indeed possible to port a semblance of the original 1982 Software Automatic Mouth (or SAM) to run entirely on the Atari 2600, without any additional hardware. To fit such a seemingly complicated piece of software into the paltry 128 bytes (yes, bytes) of RAM, [rossumur] uses an authoring tool to pre-calculate the allophones and stores only those in the ROM. This means the 2600 alone can’t convert text to phonemes, but the stored allophones, which the console converts into sound, are compact enough that about two minutes of speech can fit into one cartridge. As for why he went through the trouble, we quote the author himself: “Because creating digital swears with 1982 speech synthesis technology on a 1977 game console is exactly what we need right now.”
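The split between an offline authoring step and a playback step that only does table lookups can be illustrated with a toy example. Everything below is hypothetical: the allophone names, the one-byte-per-allophone encoding, and the function names are invented for this sketch, and [rossumur]’s real data format is not described here.

```python
# Hypothetical allophone table shared by the authoring tool and the player.
ALLOPHONES = ["PAUSE", "HH", "EH", "L", "OW", "W", "ER", "D"]
INDEX = {name: i for i, name in enumerate(ALLOPHONES)}


def author_phrase(names):
    """Offline step: encode a phrase as one table index per allophone."""
    return bytes(INDEX[name] for name in names)


def play_phrase(rom_bytes):
    """Console-side step: nothing but lookups, no text analysis at all."""
    return [ALLOPHONES[b] for b in rom_bytes]


rom = author_phrase(["HH", "EH", "L", "OW"])
print(len(rom), play_phrase(rom))
```

The expensive work, turning text into a sequence of sounds, happens once on a modern machine. The console-side routine needs no working memory beyond a pointer into the stored bytes, which is the general idea behind keeping the RAM footprint tiny.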
For this project, [rossumur] has written an incredibly interesting article on speech synthesis in order to explain the SAM engine used here. And this isn’t his first time on the website either, always cramming software where it shouldn’t fit, such as a “Netflix”-like streaming service, or 8-bit console emulators, both on nothing but an ESP32 microcontroller. Check this one out in action after the break.
Continue reading “38 Years Later, The Atari 2600 Learns To Speak”
Even in a world that is as currently far off the rails as this one is, we’re going to go out on a limb and say that this machine learning, servo-powered prayer bot is going to be the strangest thing you see today. We’re happy to be wrong about that, though, and if we are, please send links.
“The Prayer,” as [Diemut Strebe]’s work is called, may look strange, but it’s another in a string of pieces by various artists that explores just what it means to be human at a time when machines are blurring the line between them and us. The hardware is straightforward: a silicone rubber representation of a human nasopharyngeal cavity, servos for moving the lips, and a speaker to create the vocals. Those are generated by a machine-learning algorithm that was trained against the sacred texts of many of the world’s major religions, including the Christian Bible, the Koran, the Bhagavad Gita, Taoist texts, and the Book of Mormon. The algorithm analyzes the structure of sacred verses and generates random prayers and hymns that sound a lot like the real thing, voiced using Amazon Polly. That the lips move in synchrony with the ersatz devotions only adds to the otherworldliness of the piece. Watch it in action below.
We’ve featured several AI-based projects that poke at some interesting questions. This kinetic sculpture that uses machine learning to achieve balance comes to mind, while AI has even been employed in the search for spirits from the other side.
Continue reading “Silicone And AI Power This Prayerful Robotic Intercessor”
The more glass we punch with our fingertips, the more we miss fun physical interfaces like the rotary phone. Sure, they took forever to dial, and you did not want to be one of those kids stuck with one during the transition to DTMF, especially if you were trying to be the 9th caller to a radio station, but the solidly electromechanical experience of it all was just cool, okay? The sound and the heft made them seem so adult.
[Tal O] gets it. He’s all but finished bringing this old girl into the 21st century without giving anything away on her surface. Inside are some things you’d expect, like a SIM800 GSM module for the telephony part, and an ESP32 to count the pulses from the dialer and communicate between it and the GSM module. But it also has a few things we haven’t seen before. The entire journey is outlined in a five-part video series, and we’ve got part one dialed in for you after the break.
Although [Tal] got the ringer working to prove it could be done, he didn’t want to have a separate 12V circuit just to run the bells. Also, the bells and their electromagnets take up a lot of space, so he compromised with an MP3 of a rotary ringer. [Tal] also wanted a way to have dialed-number feedback without cutting up the phone to add a screen, so he found a text-to-speech library and made the phone speak each number aloud as soon as it’s dialed. It uses the same internal speaker as the ringer, but we think it would be neat if the feedback came through the handset speaker.
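For anyone curious how counting pulses from the dialer turns into digits, here is a minimal sketch of the decoding logic. It is not [Tal]’s ESP32 code. It assumes the common scheme in which a digit sends that many pulses and zero sends ten, that pulse timestamps arrive in milliseconds, and that a pause longer than 300 ms separates digits. The function name and the gap value are assumptions for illustration.

```python
def decode_pulses(timestamps_ms, digit_gap_ms=300):
    """Group dial pulses into digits using the pause between pulse trains."""
    digits = []
    count = 0
    previous = None
    for t in timestamps_ms:
        # A long pause means the dial returned to rest: close out the digit.
        if previous is not None and t - previous > digit_gap_ms:
            digits.append(count % 10)  # ten pulses encode the digit zero
            count = 0
        count += 1
        previous = t
    if count:
        digits.append(count % 10)
    return "".join(str(d) for d in digits)
```

Each decoded digit could then be handed to a text-to-speech routine as soon as its pulse train ends, which matches the dialed-number feedback described above.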
If [Tal] is looking for another modern convenience to add to this phone, how about speed dial?
Continue reading “Old Rotary Phone Gets Called Into Action”