Whispers From The Void, Transcribed With AI

Screenshot of audio noise graph

‘Hearing voices’ doesn’t have to be worrisome, for instance when software-defined radio (SDR) is your hobby. Pulling voices from the ether and decoding them can take a good deal of time and attention, so [theckid] came up with a nifty solution: RadioTranscriptor. It’s a homebrew Python script that captures SDR audio and transcribes it using OpenAI’s Whisper model, running on your GPU if available. It’s lean and geeky, and helps you hear ‘the voice in the noise’ without actively listening to it yourself.

This tool goes beyond basic listening and recording. RadioTranscriptor combines SDR, voice activity detection (VAD), and deep learning. It resamples 48 kHz audio to 16 kHz, the sample rate Whisper works with, in real time. It keeps a rolling buffer and transcribes only the stretches where voice is actually detected. It continuously writes to a daily log, so you can comb through yesterday’s signal hauntings while new findings are being logged. It supports GPU acceleration with CUDA and falls back to the CPU when no GPU is available.
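To make that pipeline a little more concrete, here is a minimal sketch of the resample, gate, and buffer steps. It is not [theckid]’s code: the crude averaging resampler, the RMS energy gate standing in for a real VAD, and names such as `process`, `RollingBuffer`, and the `transcript_YYYY-MM-DD.log` file pattern are all assumptions made for illustration.

```python
import math
from collections import deque
from datetime import date

IN_RATE = 48_000              # sample rate coming from the audio source
OUT_RATE = 16_000             # sample rate handed to the speech model
FACTOR = IN_RATE // OUT_RATE  # 3 input samples per output sample


def downsample(samples, factor=FACTOR):
    """Crude decimation: average each group of `factor` samples.

    Assumption: a real script would use a proper polyphase resampler;
    averaging is only a rough low-pass filter, but it needs no libraries.
    """
    usable = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) / factor
            for i in range(0, usable, factor)]


def rms(frame):
    """Root-mean-square level of a frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


class RollingBuffer:
    """Holds at most `seconds` of audio; the oldest samples fall off first."""

    def __init__(self, seconds=5.0, rate=OUT_RATE):
        self.samples = deque(maxlen=int(seconds * rate))

    def extend(self, frame):
        self.samples.extend(frame)

    def drain(self):
        out = list(self.samples)
        self.samples.clear()
        return out


def log_name(day):
    """Hypothetical daily log file name, one file per calendar day."""
    return f"transcript_{day.isoformat()}.log"


def process(frames_48k, threshold=0.02, hang_frames=2):
    """Return 16 kHz segments that the energy gate considers 'voice'.

    A segment starts at the first loud frame and ends after `hang_frames`
    consecutive quiet frames. The energy gate is a stand-in for real VAD.
    """
    buf = RollingBuffer()
    segments = []
    active = False
    quiet = 0
    for frame in frames_48k:
        frame16 = downsample(frame)
        if rms(frame16) >= threshold:
            buf.extend(frame16)
            active = True
            quiet = 0
        elif active:
            buf.extend(frame16)   # keep a short tail of trailing silence
            quiet += 1
            if quiet >= hang_frames:
                segments.append(buf.drain())
                active = False
                quiet = 0
    if active:                    # flush a segment still open at the end
        segments.append(buf.drain())
    return segments
```

In a setup like the one described, each returned segment would then be passed to the Whisper model, and the resulting text appended to the file named by `log_name(date.today())`. The `threshold` and `hang_frames` values are arbitrary starting points, not tuned settings.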

It sure has its quirks, too, such as ghost logs and duplicate words, but it’s dead useful and hackable to your liking. Want to change the model, tweak the threshold, or add speaker detection? The code is there to fork and extend. And why not go the extra mile and turn it into art?
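As an example of the kind of small hack that could tame the duplicate-word quirk, here is a hedged sketch of a post-processing step that collapses immediately repeated words in a transcript line. The function `collapse_repeats` is hypothetical and not part of RadioTranscriptor.

```python
import re


def _key(word):
    """Normalise a word for comparison: drop punctuation, lowercase."""
    return re.sub(r"\W+", "", word).lower()


def collapse_repeats(text):
    """Collapse runs of the same word into a single occurrence.

    Only immediately adjacent repeats are removed; the first spelling wins.
    """
    out = []
    for word in text.split():
        if out and _key(word) and _key(word) == _key(out[-1]):
            continue  # same word as the previous one: skip it
        out.append(word)
    return " ".join(out)
```

Note that this also removes legitimate repetitions (“had had”), so it is best applied only to log lines you already suspect of stuttering.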

11 thoughts on “Whispers From The Void, Transcribed With AI”

  1. As far as I can tell, this is getting its input from your computer’s microphone / audio in, not an SDR. I guess you could use a virtual audio interface to get the audio from some other program that’s actually driving the SDR.

  2. One of the problems (or sources of great fun depending on your point of view) with AI is that it creates AP – artificial pareidolia – “the tendency for perception to impose a meaningful interpretation on a nebulous stimulus, usually visual, so that one detects an object, pattern, or meaning where there is none.”

    We’ve seen this in a lot of the online image creators from “deep dream” weirdness to various AI imagery, sound creation, and video. I can’t imagine what spurious correlations will be created out of “voices from the æther”, but I’d bet the video will be even better.

    1. Funny, so that’s not unlike us people. There is a lot of meaning being constructed by our minds just because we want to see meaning where it’s just entropy of some kind. Guess AI is the correct term then.

    2. Man I miss the psychedelic Deep Dream fractal dog faces all over everything. That was when genAI seemed interesting and fun, and not merely an artstation/danbooru knockoff slophose
