Numbers Station Simulator, Right In Your Browser

Do you find an odd comfort in the uncanny, regular intonations of a Numbers Station? Then check out [edent]’s numbers station project, which leverages the browser’s speech synthesis engine to deliver a ceaseless flow of (mostly) numbers, calmly intoned in various languages.

The project is an entry for the annual JavaScript Golfing Competition, in which participants aim to create a cool program in 1024 bytes or less. It cleverly relies on the Web Speech API to deliver the speaking parts, which helps keep the code size tiny. The only thing it’s missing is an occasional shadow of static drifting across the audio.

If you’re new to numbers stations, our own [Al Williams] is here to tell you all about them. But there’s no need to tune into an actual mysterious radio signal to experience weird numbers; just fire up [edent]’s project, put on some headphones, and relax if you can.

Text-to-Speech Model Can Do Music, Background Noises, And Sound Effects

Bark is a universal text-to-audio model that can not only create realistic speech but also incorporate music, background noises, and sound effects. It can even include non-speech sounds like laughter, sighs, throat clearings, and similar elements. But although it can deliver such complex results, it’s important to understand some of its peculiarities.

The model takes a prompt and generates the resulting sound from scratch. Results might sometimes be unexpected.

Bark is not a conventional text-to-speech program, and how it works has a lot more in common with large language model AI chatbots. This means that results can deviate from expectations, and outputs aren’t necessarily going to be studio-quality speech. As the project’s README points out, “(generated outputs can) be anything from perfect speech to multiple people arguing at a baseball game recorded with bad microphones.” That being said, there is some support for voice presets as a way to help guide the model with some consistency.

Bark was designed by a company called Suno for research purposes and is available under the MIT License. It can be installed and run locally, and has some demos available as well as an online implementation.
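For a sense of what running Bark locally might look like, here is a minimal sketch in Python. It assumes the `bark` package from the project's repository and SciPy are installed. The helper names `build_prompt` and `synthesize`, the output filename, and the specific preset string are illustrative choices, not part of the project. The bracketed cues (such as `[laughs]` or `[sighs]`) follow the convention described in the project's README, and as noted above, the model may or may not honor them on any given run.

```python
from typing import Optional, Sequence


def build_prompt(text: str, cues: Sequence[str] = ()) -> str:
    """Append bracketed non-speech cues, e.g. [laughs], to a text prompt."""
    tags = " ".join(f"[{cue}]" for cue in cues)
    return f"{text} {tags}".strip()


def synthesize(prompt: str, out_path: str, voice_preset: Optional[str] = None) -> None:
    """Generate audio for `prompt` with Bark and write it to a WAV file.

    `voice_preset` is passed as Bark's `history_prompt`; None lets the
    model pick a voice on its own.
    """
    # Imported lazily: Bark and SciPy are heavy, optional dependencies.
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()  # downloads and caches the model weights on first use
    audio = generate_audio(prompt, history_prompt=voice_preset)
    write_wav(out_path, SAMPLE_RATE, audio)
```

Calling `synthesize(build_prompt("Well, that did not go as planned.", ["sighs"]), "bark_demo.wav", voice_preset="v2/en_speaker_6")` would then write a short clip to disk. Expect the first run to take a while, since the model weights have to be downloaded.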

The ability to install and run Bark locally is promising territory for incorporating it into projects. And should you be more interested in speech-to-text, don’t forget about this plain C/C++ implementation of AI-powered speech recognition.

Raspberry Pi Want A Cracker?

If you watch the original Star Trek, you’ll notice that the computers on board the Enterprise don’t look much like our computers (unless you count the little 3.5-inch floppies that looked pretty close to the real thing). Then again, the Enterprise didn’t need keyboards and screens since the computers did a pretty good job of listening and speaking to humans.

We aren’t quite to the point where you can just ask the computer some fuzzy open-ended question like Captain Kirk did, but we do have things like Echo, Siri, and Google Now that do a fair job of listening to you and replying. In fact, Google provides an API that can do speech recognition and generation. [Giulio] used some common Python libraries to add speech I/O to a Raspberry Pi.
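As a rough illustration of what speech I/O in Python can look like, here is a sketch. The libraries are an assumption: it uses the `SpeechRecognition` package (which needs PyAudio for microphone access) and `gTTS`, which may not be the ones [Giulio] chose. The function names and the parrot-style reply are hypothetical, and both the recognition and synthesis calls need a network connection.

```python
def parrot_reply(heard: str) -> str:
    """Build the phrase to speak back: repeat the input, parrot style."""
    heard = heard.strip()
    if not heard:
        return "Squawk?"
    return f"{heard}! {heard}!"


def listen_once() -> str:
    """Record one phrase from the default microphone and return its text."""
    # Assumed library: SpeechRecognition (imported lazily, needs PyAudio).
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    try:
        # Sends the recording to Google's web speech recognizer.
        return recognizer.recognize_google(audio)
    except (sr.UnknownValueError, sr.RequestError):
        # Unintelligible audio or a failed request: report nothing heard.
        return ""


def speak_to_file(text: str, out_path: str) -> None:
    """Synthesize `text` to an MP3 file using gTTS (assumed library)."""
    from gtts import gTTS

    gTTS(text=text, lang="en").save(out_path)
```

A single round trip would be `speak_to_file(parrot_reply(listen_once()), "reply.mp3")`, after which the MP3 can be played with whatever audio player is installed on the Pi.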
