“Glasses” That Transcribe Text To Audio

Glasses for the blind might sound like an odd idea, given the traditional purpose of glasses and the issue of vision impairment. However, eighth-grade student [Akhil Nagori] built these glasses with an alternate purpose in mind. They’re not really for seeing. Instead, they’re outfitted with hardware to capture text and read it aloud.

Yes, we’re talking about real-time text-to-audio transcription, built into a head-worn format. The hardware is pretty straightforward: a Raspberry Pi Zero 2W runs off a battery and is outfitted with the usual first-party camera. The camera is mounted on a set of eyeglass frames so that it points at whatever the wearer might be “looking” at. At the push of a button, the camera captures an image, and then passes it to an API which does the optical character recognition. The text can then be passed to a speech synthesizer so it can be read aloud to the wearer.
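
For a sense of how little glue code such a pipeline needs, here is a minimal sketch of the capture, OCR, and speak loop. The details are assumptions rather than [Akhil]’s actual build: picamera2 drives the camera, pytesseract does the OCR in place of the cloud API he used, espeak reads the result aloud, and the GPIO pin is made up.

```python
# Minimal sketch of the capture -> OCR -> speak loop. Assumptions, not details
# from the original build: picamera2 drives the camera, pytesseract does the
# OCR instead of the cloud API, and espeak reads the result aloud.
import subprocess
from signal import pause

from gpiozero import Button
from picamera2 import Picamera2
from PIL import Image
import pytesseract

button = Button(17)              # hypothetical GPIO pin for the shutter button
camera = Picamera2()
camera.start()

def capture_and_speak():
    camera.capture_file("frame.jpg")                           # grab a still
    text = pytesseract.image_to_string(Image.open("frame.jpg"))
    if text.strip():
        subprocess.run(["espeak", text])                       # read it aloud

button.when_pressed = capture_and_speak
pause()                          # wait for button presses forever
```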

It’s funny to think about how advanced this project really is. Jump back to the dawn of the microcomputer era, and such a device would have been a total flight of fancy—something a researcher might make a PhD and career out of. Indeed, OCR and speech synthesis alone were challenge enough. Today, you can stand on the shoulders of giants and include such mighty capability in a homebrewed device that cost less than $50 to assemble. It’s a neat project, too, and one that we’re sure taught [Akhil] many valuable skills along the way.

Continue reading ““Glasses” That Transcribe Text To Audio”

Speaking Computers From The 1970s

Talking computers are nothing special these days. But in the old days, a computer that could speak was quite the novelty. Many computers from the 1970s and 1980s used an AY-3-8910 chip, and [InazumaDenki] has been playing with one of these venerable chips. You can see (and hear) the results in the video below.

The chip can play back PCM samples, and there are different ways to store and play those sounds. The video shows how they compare and even looks at the output on an oscilloscope. The chip has three voices and was produced by General Instrument, the company that originally created the PIC microcontroller. It found its way into many classic arcade games, consoles, and home computers, including the Intellivision, Vectrex, MSX, and ZX Spectrum. Sound cards for the TRS-80 Color Computer and the Apple II used these chips, and the Atari ST used a variant from Yamaha, the YM2149F.
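
To get a feel for what the chip is doing, here is a conceptual sketch of one of its three tone voices. The AY-3-8910 divides its master clock by sixteen times a 12-bit period register to set a square wave’s frequency; the 1.7734 MHz clock below is an assumption (it is the ZX Spectrum 128’s), not necessarily what [InazumaDenki] used.

```python
# Conceptual sketch of one AY-3-8910 tone voice: a 12-bit period register
# divides the master clock by 16 * period to set the square-wave frequency.
# The 1.7734 MHz clock (ZX Spectrum 128) is an assumption, not from the video.
import numpy as np

MASTER_CLOCK = 1_773_400     # Hz
SAMPLE_RATE = 44_100

def ay_tone(period_register: int, seconds: float) -> np.ndarray:
    """Render one voice as a square wave, the way the chip's divider would."""
    freq = MASTER_CLOCK / (16 * period_register)
    t = np.arange(int(SAMPLE_RATE * seconds)) / SAMPLE_RATE
    return np.sign(np.sin(2 * np.pi * freq * t))    # crude square wave

# A period of 0x1AC (428) gives roughly 259 Hz, close to middle C.
samples = ay_tone(0x1AC, 1.0)
```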

There’s some code for an ATmega, and the video says it is part one, so we expect to see more videos on this chip soon.

General Instrument made other speech chips too, and some of them are still around in emulated form. In fact, you can emulate the AY-3-8910 with little more than a Raspberry Pi.

Continue reading “Speaking Computers From The 1970s”

RP2040 Emulator Brings The Voice Of The 80s Back To Life

You may not have heard, but there’s a chip shortage out there. And it’s not just the fancy new chips that are in short supply; the chips that were fancy and new back when you could still buy them from Radio Shack are getting hard to come by, too. For different reasons, of course, but it does pose a problem that requires a little hacking to fix.

The chip in question here is the General Instrument SP0256, a 1980s-era speech synthesizer chip that [Andrew Menadue] relies on. The LSI chip stored 59 unique allophones, or basic sounds the vocal tract is capable of, and synthesized speech by rapidly concatenating these sounds. The chip and its descendants made regular appearances in computers and games throughout the 80s, so chances are good you’ve heard it. If not, think WarGames (yes, we know that wasn’t actually a computerized voice) or [Stephen Hawking] and you’ll be pretty close.

[Andrew]’s need for such a chip stems from his attempts to give voice to his collection of Psion Organisers, another 80s relic that was one of the first pocket computers. Some time ago he built a speech board for the Psion based on the SP0256-AL2, but had to resort to building an emulator for the chip since none were to be had. The emulator uses an RP2040 and lives on a PCB that has the same footprint as the original chip, so it can just plug right in. He dug up WAV files of the allophones and translated those to sequences of bytes, allowing the RP2040 to output the correct sounds as they’re called for. Speaker problems notwithstanding, it sounds pretty good in the video below.
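
As a rough illustration of that WAV-to-bytes step (this is not [Andrew]’s actual tooling; the file names, sample format, and output are invented), the conversion might look something like this:

```python
# Rough sketch of turning allophone WAV files into byte tables an emulator
# can replay. File names, sample format, and output are illustrative only,
# not [Andrew]'s actual tooling.
import wave

ALLOPHONES = ["HH1", "EH", "LL", "AX", "OW"]   # spells a rough "hello"

def load_allophone(name: str) -> bytes:
    """Read one allophone recording as raw mono samples."""
    with wave.open(f"{name}.wav", "rb") as wav:
        return wav.readframes(wav.getnframes())

# Concatenate the allophone samples back-to-back, just as the original chip
# strings allophones together to build a word.
word = b"".join(load_allophone(name) for name in ALLOPHONES)

# Emit a C-style byte array that firmware (e.g. on the RP2040) could include.
print("const uint8_t hello[] = {")
print(", ".join(str(b) for b in word[:32]), "/* ... */")
print("};")
```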

We’ve featured a fair number of SP0256 projects before, on everything from Amstrad to Z80. We’ve also shown off a few of [Andrew]’s builds before, including this exploration of the voltage tolerance of the RP2040.

Continue reading “RP2040 Emulator Brings The Voice Of The 80s Back To Life”

Make Your ESP32 Talk Like It’s The 80s Again

80s-era electronic speech has a certain retro appeal, but it can also be a genuinely useful output method, since it can be implemented with very little hardware. [luc] demonstrates this with a talking thermometer project that needs no display and no special hardware to communicate temperatures to a user.

Back in the day, there were chips like the Votrax SC-01A that could play phonemes (distinct sounds that make up a language) on demand. These would be mixed and matched to create identifiable words, in that distinctly synthesized Speak & Spell manner that is so charming-slash-uncanny.

Software-only speech synthesis isn’t new, but it’s better now than it was in Atari’s day.

Nowadays, even hobbyist microcontrollers have more than enough processing power and memory to do a similar job entirely in software, which is exactly what [luc]’s talking thermometer project does. All this is done with the Talkie library, originally written for the Arduino and updated for the ESP32 and other microcontrollers. With it, one only needs headphones or a simple audio amplifier and speaker to output canned voice data from a project.

[luc] uses it to demonstrate how to communicate with a user hands-free, without needing a display. We also saw this output method in an electric unicycle with a talking speedometer (judged to better let the rider keep their eyes on the road, while also minimizing the parts count).

Would you like to listen to an authentic, somewhat-understandable 80s-era text-to-speech synthesizer? You’re in luck, because we can show you an authentic vintage MicroVox unit in action. Give it a listen, and compare it to a demo of the Talkie library in the video below.

Continue reading “Make Your ESP32 Talk Like It’s The 80s Again”

A small speaker with an LCD showing chatbot responses

AI-Powered Speaker Is A Chatbot You Can Actually Chat With

AI-powered chatbots are pretty cool, but most still require you to type your question on a keyboard and read an answer from a screen. It doesn’t have to be like that, of course: with a few standard tools, you can turn a chatbot into a machine that literally chats, as [Hoani Bryson] did. He decided to make a standalone voice-operated ChatGPT client that you can actually sit next to and have a conversation with.

The base of the project is a USB speaker, to which [Hoani] added a Raspberry Pi, a Teensy, a two-line LCD and a big red button. When you press the button, the Pi listens to your speech and converts it to text using the OpenAI voice transcription feature. It then sends the resulting text to ChatGPT through its API and waits for its response, which it turns into sound again through the eSpeak speech synthesizer. The LCD, driven by the Teensy, shows the current status of the machine and also provides live subtitles while the machine is talking.
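
[Hoani]’s code is written in Go, but the loop itself is simple enough to sketch in a few lines of Python. The model names, the arecord recording step, and the file paths below are assumptions for illustration, not details from the project.

```python
# Rough Python illustration of the listen -> transcribe -> chat -> speak loop.
# [Hoani]'s real implementation is in Go; model names, the arecord command,
# and file paths here are assumptions for the sketch.
import subprocess
from openai import OpenAI

client = OpenAI()   # reads OPENAI_API_KEY from the environment

def listen(seconds: int = 5, path: str = "question.wav") -> str:
    """Record from the microphone, then transcribe it with the OpenAI API."""
    subprocess.run(["arecord", "-d", str(seconds), "-f", "cd", path], check=True)
    with open(path, "rb") as audio:
        result = client.audio.transcriptions.create(model="whisper-1", file=audio)
    return result.text

def ask(question: str) -> str:
    """Send the transcribed question to the chat API and return the reply."""
    reply = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": question}],
    )
    return reply.choices[0].message.content

def speak(text: str) -> None:
    subprocess.run(["espeak", text])

while True:
    input("Press Enter (standing in for the big red button)...")
    speak(ask(listen()))
```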

To spice up the AI box’s appearance, [Hoani] also added an LED ring which shows a spectrogram of the audio being generated. This small addition really makes the thing come alive, turning it into what looks like a classic Sci-Fi movie prop. Except that this one’s real, of course – we are actually living in the future, with human-like AI all around us.

All code, mostly written in Go, is freely available on [Hoani]’s GitHub page. It also includes a separate audio processing library called toot that [Hoani] wrote to help him interface with the microphone and do spectral analysis. Anyone with basic electronic skills can now build their own AI companion and talk to it – something that ham radio operators have been doing for a while.

Continue reading “AI-Powered Speaker Is A Chatbot You Can Actually Chat With”

2022 Hackaday Prize: Talking Clock Built With Old-School Gear

Any smartphone or laptop could be a talking clock if you wished it so. However, we think this build from [Marek Więcek], which uses discrete vintage chips to get the job done the old-fashioned way, is more fun.

The work started when [Marek] was tinkering with a 65C02 CPU, giving it an EPROM, some RAM, and some logic ICs to create something akin to a modern microcontroller in functionality. It came to be known as the 6502 Retro Controller Board. Slowly, the project was expanded with various additional modules, in much the same way one might add shields to an Arduino.

In this case, [Marek] expanded the 6502-powered board with a series of 7-segment displays, along with an RTC to keep accurate time. A classic SP0256-AL2 speech synthesis chip was then added, allowing the system to not only show the time, but read it aloud, too. As a bonus, not only can it tell you the hour, minute, day, and date, but it will also read various science-fiction quotes on demand.
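
The speech side of a talking clock mostly boils down to mapping the RTC’s digits onto a sequence of pre-stored words. Here is a toy sketch of that mapping; the word list is illustrative, and [Marek]’s firmware of course does the equivalent in 6502 assembly with SP0256 allophone tables.

```python
# Toy sketch of turning an RTC reading into a word sequence for a speech chip.
# The word list is illustrative; [Marek]'s 6502 firmware does this with
# allophone tables for the SP0256-AL2 instead.
WORDS = ["zero", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = {2: "twenty", 3: "thirty", 4: "forty", 5: "fifty"}

def say_number(n: int) -> list[str]:
    if n < 20:
        return [WORDS[n]]
    tens, ones = divmod(n, 10)
    return [TENS[tens]] + ([WORDS[ones]] if ones else [])

def say_time(hour: int, minute: int) -> list[str]:
    phrase = say_number(hour)
    if minute == 0:
        phrase.append("o'clock")
    else:
        if minute < 10:
            phrase.append("oh")
        phrase += say_number(minute)
    return phrase

print(say_time(14, 35))   # ['fourteen', 'thirty', 'five']
```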

As with most 80s speech synths, the output is robotic and a little difficult to parse. However, that’s part of the charm, and it sets the build apart from today’s speaking virtual assistants.

Continue reading “2022 Hackaday Prize: Talking Clock Built With Old-School Gear”

Breadboard containing speech synthesis chip

RPi Python Library Has Retro Chiptunes And Speech Covered

The classic SP0256-AL2 speech chip has featured a few times on these pages, and if you’ve not seen the actual part before, you’ve almost certainly heard its audio output. The latest Python library from prolific retrocomputing enthusiast [Nick Bild] brings the joy of the old chip to the Raspberry Pi platform, with an extra trick: support for the venerable AY-3-8910 sound generator as well.

The SP0256-AL2 chip generates vaguely recognisable speech using the allophone system. Allophones are small chunks of speech audio, variations on the phonemes that form the basis of speech, which produce intelligible words when played back in sequence. The chip relies on an external device to feed it allophones as it needs them, and that is the job of [Nick]’s Gi-Pi library.
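
To give an idea of what such a driver has to do (this is not the Gi-Pi API, and the pin numbers and timing are made up), the handshake with the chip looks roughly like this:

```python
# Sketch of the handshake a driver like Gi-Pi performs when feeding an
# SP0256-AL2: put a 6-bit allophone code on A1-A6, pulse /ALD low to latch it,
# then wait for the chip before sending the next one. Pin numbers are made up,
# and this is not the Gi-Pi API, just the idea behind it.
import time
import gpiozero

ADDRESS_PINS = [gpiozero.OutputDevice(p) for p in (5, 6, 13, 19, 26, 21)]  # A1..A6
ALD = gpiozero.OutputDevice(20, initial_value=True)   # /ALD strobe, idles high
SBY = gpiozero.InputDevice(16)                        # SBY is high when the chip is idle

# Allophone codes from the standard SP0256-AL2 table: "hello" is HH1 EH LL AX OW.
PA1, HH1, EH, LL, AX, OW = 0x00, 0x1B, 0x07, 0x2D, 0x0F, 0x35

def say(allophones):
    for code in allophones:
        for bit, pin in enumerate(ADDRESS_PINS):      # present the 6-bit address
            pin.value = bool((code >> bit) & 1)
        ALD.off(); ALD.on()                           # brief low pulse latches it
        time.sleep(0.001)                             # give SBY a moment to drop
        while not SBY.value:                          # then wait out the allophone
            pass

say([HH1, EH, LL, AX, OW, PA1])
```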

This speech synthesis technology is based on linear predictive coding (LPC), which is used to implement a model of the human vocal tract. It is the same coding method used by the first generation of GSM digital mobile phones, in a system known as Full Rate. Both an LPC encoder and an LPC decoder are present in the handset. The LPC encoder takes audio in from the user, breaks it into the tiny constituent parts of speech, and then sends only a code representing each audio block rather than the audio itself, along with a few extra parameters to adjust the model at the receiving end. The decoding side is therefore not all that dissimilar to what the SP0256 and related devices are doing, except that you, the user, have to create the list of audio blocks up front and feed the chip at the rate it demands.
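
In equation form, an LPC decoder reconstructs each output sample as an excitation sample plus a weighted sum of the previous output samples, which is just an all-pole IIR filter. Here is a tiny sketch with arbitrary placeholder coefficients, only to show the shape of the computation:

```python
# Tiny sketch of LPC synthesis: each output sample is an excitation sample
# plus a weighted sum of the previous outputs, i.e. an all-pole IIR filter.
# The coefficients and the noise excitation below are arbitrary placeholders,
# only there to show the shape of the computation a decoder performs.
import numpy as np
from scipy.signal import lfilter

lpc_coeffs = np.array([1.8, -0.9])     # a_1, a_2 for a toy two-pole vocal tract
excitation = np.random.randn(8000)     # unvoiced (noise-like) excitation frame

# y[n] = excitation[n] + a_1 * y[n-1] + a_2 * y[n-2]
frame = lfilter([1.0], np.concatenate(([1.0], -lpc_coeffs)), excitation)
```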

Continue reading “RPi Python Library Has Retro Chiptunes And Speech Covered”