We can almost count on our eyesight to fail with age, maybe even past the point of correction. It’s a pretty big flaw if you ask us. So, how can a person with aging eyes hope to continue reading the printed word?
There are plenty of commercial document readers available that convert text to speech, but they’re expensive. Most require a smartphone and/or an internet connection. That might not be as big an issue for future generations of failing eyes, but we’re not there yet. In the meantime, we have small, cheap computers and plenty of open source software to turn them into document readers.
[rgrokett] built a Raspberry Pi text reader to help an aging parent maintain their independence. In the process, he made a good soup-to-nuts guide to building one. It couldn’t be easier to use: just place the document under the camera and push the button. A Python script makes the Pi take a picture of the text, uses Tesseract OCR to convert the image to plain text, and then runs the text through a speech synthesis engine which reads it aloud. The reader is on as long as it’s plugged in, so it’s ready to work at the push of a button. We can probably all appreciate such a low-hassle design. Be sure to check out the demo after the break.
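For a feel of how little glue the capture, OCR, and speech steps need, here is a minimal sketch. It is not [rgrokett]’s script: the choice of `raspistill` for the camera and `espeak` for the voice, the image path, and the function names are all assumptions made for illustration. The only Tesseract detail relied on is that passing `stdout` as the output base prints the recognized text.

```python
import subprocess


def capture_cmd(image_path):
    # Assumption: the legacy Raspberry Pi camera tool is installed.
    # Newer OS images ship a different camera CLI.
    return ["raspistill", "-o", image_path]


def ocr_cmd(image_path):
    # Using "stdout" as the output base makes tesseract print the text.
    return ["tesseract", image_path, "stdout"]


def speak_cmd(text):
    # Assumption: espeak stands in for whichever speech engine is used.
    return ["espeak", text]


def read_document(image_path="/tmp/page.jpg", run=subprocess.run):
    """Photograph a page, OCR it, speak it, and return the text."""
    run(capture_cmd(image_path), check=True)
    result = run(ocr_cmd(image_path), check=True,
                 capture_output=True, text=True)
    # Collapse line breaks so the voice does not pause at every line end.
    text = " ".join(result.stdout.split())
    if text:
        run(speak_cmd(text), check=True)
    return text
```

On the real build, `read_document()` would be called from a button handler. The `run` parameter exists only so the pipeline can be exercised without a camera attached.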
If you wanted to use this to read books, you’d still have to turn the pages yourself. Here’s a BrickPi reader that solves that one.
Continue reading “DIY Text-to-Speech With Raspberry Pi”
A diagnosis of amyotrophic lateral sclerosis, or ALS, is devastating. Outlier cases like [Stephen Hawking] notwithstanding, most ALS patients die within four years or so of their diagnosis, after having endured the progressive loss of muscle control that robs them of their ability to walk, to swallow, and even to speak.
Rather than see a friend’s father locked in by his ALS, [Ricardo Andere de Mello] decided to help out by building a one-finger interface to a [Hawking]-esque voice synthesizer on the cheap. Working mainly with what hardware he had on hand, his system lets his friend’s dad flick a finger to operate off-the-shelf assistive communication software running on a laptop. The sensor is an accelerometer velcroed to a fingertip; when a movement threshold is passed, an Arduino sends the laptop an F12 keypress, which is all that’s needed to operate the software. You can watch it in action in the video after the break.
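The detection logic is simple enough to sketch. The real thing runs on an Arduino, so the Python below is only an illustration, and the threshold value, the cooldown window, and the function names are our assumptions rather than anything from [Ricardo]’s code. The idea is to fire one "keypress" when the acceleration magnitude jumps by more than a threshold, then ignore a few samples so a single flick doesn’t register twice.

```python
import math


def magnitude(sample):
    """Length of an (x, y, z) acceleration vector."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)


def detect_flicks(samples, threshold=0.5, refractory=5):
    """Return the sample indices at which a keypress would be sent.

    threshold  -- hypothetical minimum jump in magnitude between samples
    refractory -- hypothetical number of samples to ignore after a hit
    """
    events = []
    cooldown = 0
    previous = None
    for i, sample in enumerate(samples):
        current = magnitude(sample)
        if cooldown > 0:
            cooldown -= 1
        elif previous is not None and abs(current - previous) > threshold:
            events.append(i)  # the Arduino would send F12 here
            cooldown = refractory
        previous = current
    return events
```

Without the cooldown, the finger returning to rest would count as a second movement, which is why the sketch includes one.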
Hats off to [Ricardo] for pitching in and making a difference without breaking the bank. This isn’t the first expedient speech synthesizer we’ve seen for ALS patients; this one does it with just three chips, including voice synthesis.
Continue reading “Quick Hack Helps ALS Patient Communicate”
[Monta Elkins] got it in his mind that he wanted to try out an old-style speech synthesizer with the SC-01 (or SC-01A) chip, one that uses phonemes to produce speech. After searching online he found a MicroVox text-to-speech synthesizer from the 1980s based around the chip, and after putting together a makeshift serial cable, he connected it up to an Arduino Uno and tried it out. It has that 8-bit artificial voice that many of us remember fondly and is fairly understandable.
The SC-01, and then the SC-01A, were made by Votrax International, Inc. In addition to the MicroVox, the chips were used in the Heath Hero robot, the VS-100 synthesizer add-on for TRS-80s, various arcade games such as Q*bert and Krull, and a variety of other products. The chip’s input selects which phonemes to play. Where it shines is in producing good transitions between them to come up with decent speech, much better than you’d get by just playing the phonemes one after the other.
The MicroVox has a 25-pin RS-232 serial port as well as a parallel port and a speaker jack. In addition to the SC-01A, it has a 6502 under the hood. [Monta] was lucky to also receive the manual, and what a manual it is! In addition to a list of the supported phonemes and words, it also contains the schematics, parts list and details for the serial port, which alone would make for fun reading. We really liked the taped-in note seen in this screenshot. It has a hand-written note that says “Factory Corrected 10/18/82”.
In the video below, [Monta] finds the datasheet for the serial port’s input buffer chip online and verifies the voltage levels. Next he opens up the case and uses DIP switches to set the baud rate, data bits, parity, stop bits and so on. After hooking up the speakers, putting together a makeshift cable for RX, TX and ground, and writing a little Arduino code, he sends it text and out comes the speech.
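Those DIP switch settings decide how many bits go down the wire for each character, and therefore how quickly text reaches the synthesizer. As a rough worked example, an asynchronous serial frame is one start bit, the data bits, an optional parity bit, and the stop bits. The settings in the sketch below are illustrative assumptions; the article doesn’t say which ones [Monta] chose.

```python
def frame_bits(data_bits=8, parity=False, stop_bits=1):
    """Bits on the wire per character: start + data + parity + stop."""
    return 1 + data_bits + (1 if parity else 0) + stop_bits


def transmit_seconds(text, baud, data_bits=8, parity=False, stop_bits=1):
    """Time to clock out a string at the given baud rate."""
    return len(text) * frame_bits(data_bits, parity, stop_bits) / baud


# Hypothetical example: "8N1" framing is 10 bits per character, so a
# 300 baud link moves at most 30 characters per second.
bits_8n1 = frame_bits(8, False, 1)
chars_per_second_at_300 = 300 / bits_8n1
```

Both ends have to agree on every one of these parameters, which is why a mismatch tends to produce garbage or silence.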
Continue reading “MicroVox Puts The 80’s Back Into Your Computer’s Voice”
In the movie 2001: A Space Odyssey, HAL 9000 — the neurotic computer — had a birthday in 1992 (for some reason, in the book it is 1997). In the late 1960s, that date sounded impossibly far away, but now it seems like a distant memory. The only thing is, we are only now starting to get computers with voice I/O that are practical and even they are a far cry from HAL.
[GeraldF6] built an Arduino-based clock. That’s nothing new, but thanks to a MOVI board (OK, shield), this clock has voice input and output as you can see in the video below. Unlike most modern speech-enabled devices, the MOVI board (and, thus, the clock) does not use an external server in the cloud or any remote processing at all. On the other hand, the speech quality isn’t what you might expect from any of the modern smartphone assistants that talk. We estimate it might be about 1/9 the power of the HAL 9000.
Continue reading “Arduino Clock Is HAL 1000”
They just don’t write promotional film scripts like they used to: “These men are design engineers. They are about to engage a new breed of computer, called Graphic 1, in a dialogue that will test the ingenuity of both men and machine.”
This video (embedded below) from Bell Labs in 1968 demonstrates the state of the art in “computer graphics” as the narrator calls it, with obvious quotation marks in his inflection. The movie ranges from circuit layout, to animations, to voice synthesis, hitting the high points of the technology at the time. The soundtrack, produced on their computers, naturally, is pure Jetsons.
The highlight is the computer singing “Daisy Bell” at 9:05, which inspired Stanley Kubrick to play a glitchy version of the track as Dave is pulling HAL 9000’s brains out, symbolically regressing through a history of computer voice synthesis which at that point in time was the present. (Whoa!)
Continue reading “Retrotechtacular: The Incredible Machine”
Speech synthesis is nothing new, but it has gotten better lately. It is about to get even better thanks to DeepMind’s WaveNet project. The Alphabet (or is it Google?) project uses neural networks to analyze audio data and it learns to speak by example. Unlike other text-to-speech systems, WaveNet creates sound one sample at a time and affords surprisingly human-sounding results.
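To make "one sample at a time" concrete, here is a toy sketch of that kind of generation loop: each new sample is predicted from a window of the samples before it, then appended and fed back in. This is not WaveNet. The predictor is a trivial stand-in for the neural network, and the function names and window size are our assumptions.

```python
def generate(predict_next, seed, n_samples, receptive_field):
    """Generate audio samples one at a time from previous output.

    predict_next    -- callable mapping a list of past samples to the next
    seed            -- non-empty list of starting samples
    receptive_field -- how many past samples the predictor gets to see
    """
    samples = list(seed)
    for _ in range(n_samples):
        context = samples[-receptive_field:]
        samples.append(predict_next(context))
    return samples[len(seed):]


def toy_predictor(context):
    # Stand-in for the trained network: half the mean of the context.
    return 0.5 * sum(context) / len(context)


print(generate(toy_predictor, [1.0], 3, receptive_field=2))
# [0.5, 0.375, 0.21875]
```

The loop also hints at why generation is slow: no sample can be computed until the one before it exists.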
Before you rush to comment “Not a hack!” you should know we are seeing projects pop up on GitHub that use the technology. For example, there is a concrete implementation by [ibab]. [Tomlepaine] has an optimized version. In addition to learning English, they successfully trained it for Mandarin and even to generate music. If you don’t want to build a system out yourself, the original paper has audio files (about midway down) comparing traditional parametric and concatenative voices with the WaveNet voices.
Another interesting project takes the reverse path: teaching WaveNet to convert speech to text. Before you get too excited, though, you might want to note this quote from the README file:
“We’ve trained this model on a single Titan X GPU during 30 hours until 20 epochs and the model stopped at 13.4 ctc loss. If you don’t have a Titan X GPU, reduce batch_size in the train.py file from 16 to 4.”
Last time we checked, you could get a Titan X for a little less than $2,000.
There is a multi-part lecture series on reinforcement learning (the foundation for DeepMind). If you want to tackle a project yourself, that might be a good starting point (the first part appears below).
Continue reading “Talking Neural Nets”
It isn’t easy to communicate when you have any form of speech impairment. In such cases, a speech-generating device (SGD) becomes essential to help you talk to the world. When coupled with other ailments that limit body movement, the problem becomes worse. How do you type on a keyboard when you can’t move your hands, and how do you talk when your voice box doesn’t work? Well-known scientist Stephen Hawking has been battling this since 1985. Back then, it took a lot of hardware to build a text entry interface and a text-to-speech device allowing him to communicate.
But [Marquis de Geek] did a quick hack using just a few parts to make a Voice Box that sounds like Stephen Hawking. Using an arcade push button to act as a single-button keyboard, an Arduino, a 74HC595 shift register, a 2-line LCD, and the SP0256 hooked to an audio amplifier / speaker, he built a stand-alone speech synthesizer which sounds just like the voice box that Stephen Hawking uses. Although Dr. Hawking’s speech hardware is quite complex, [Marquis de Geek]’s hack shows that it’s possible to get similar results using off-the-shelf parts for a low-cost solution.
There aren’t a lot of those SP0256-AL2 chips around. We found just a couple of retailers with small stock levels, so if you want to make one of these voice boxes, better grab those chips while they last. The character entry is not quick, requiring several button presses to get to the character you want to select. But it makes things easier for someone who cannot move their hands or use all fingers. A lot of kids grew up using Speak and Spell, but the hardware inside that box wasn’t the easiest to hack into. For a demo of [Marquis de Geek]’s homemade Hawking voice box, check the video below.
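To see why single-button entry is slow, consider one possible scheme, sketched below. It is a guess at the general approach rather than [Marquis de Geek]’s actual firmware: the short-press/long-press split, the alphabet order, and the function names are all assumptions. A short press steps to the next character and a long press selects it.

```python
import string

# Hypothetical character set: A to Z followed by a space.
ALPHABET = string.ascii_uppercase + " "


def decode_presses(presses, alphabet=ALPHABET):
    """Turn a sequence of "short"/"long" presses into text."""
    text = []
    index = 0
    for press in presses:
        if press == "short":
            index = (index + 1) % len(alphabet)  # step to next character
        elif press == "long":
            text.append(alphabet[index])  # select current character
            index = 0
        else:
            raise ValueError("unknown press type: %r" % (press,))
    return "".join(text)


def presses_for(text, alphabet=ALPHABET):
    """Presses needed to enter text under the scheme above."""
    presses = []
    for ch in text:
        presses.extend(["short"] * alphabet.index(ch))
        presses.append("long")
    return presses
```

Under this scheme, entering "HI" takes 17 presses, which gives a sense of the patience involved.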
Continue reading “Making A Homemade Stephen Hawking”