Synchronize Data With Audio From A $2 MP3 Player

June 13, 2016

Many of the hacks featured here are complex feats of ingenuity that you might expect to have emerged from a space-age laboratory rather than a hacker’s bench. Impressive stuff, but on the other side of the coin the essence of a good hack is often just a simple and elegant way of solving a technical problem using clever lateral thinking.

Take this project from [drtune]. He needed to synchronize some lighting to an audio stream from an MP3 player and wanted to store his lighting control data on the same SD card as his MP3 file. Sadly, his serial-controlled MP3 player module would only play audio data from the card and he couldn’t read a data file from it, so there seemed to be no easy way forward.

His solution was simple: realizing that the module has a stereo DAC but a mono amplifier, he encoded the data as an audio FSK stream similar to that used by modems back in the day, and applied it to one channel of his stereo MP3 file. He could then play the music from his first channel and digitize the FSK data on the other before applying it to a software modem to retrieve its information.
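As a rough illustration of the idea (not [drtune]'s actual code, which is not shown in the post), the Python sketch below builds an FSK data track and writes it to the right channel of a stereo WAV file, with the music on the left. The tone frequencies come from his comment further down the page: one cycle of 4000 Hz for a 1, one cycle of 2000 Hz for a 0, and a 1 kHz start cycle. The 48 kHz sample rate, the MSB-first bit order, the start cycle before every byte, and all function names are assumptions made for this example.

```python
import math
import struct
import wave

SAMPLE_RATE = 48000  # assumed; divides evenly by all three tone frequencies
FREQ_ONE = 4000      # Hz, one cycle encodes a '1'
FREQ_ZERO = 2000     # Hz, one cycle encodes a '0'
FREQ_START = 1000    # Hz, one cycle marks the start of a byte (assumed per byte)


def one_cycle(freq_hz, amplitude=0.8):
    """Return a single sine cycle at freq_hz as floats in [-1, 1]."""
    n = SAMPLE_RATE // freq_hz
    return [amplitude * math.sin(2 * math.pi * i / n) for i in range(n)]


def encode_fsk(data):
    """Encode bytes as a start cycle followed by eight data cycles, MSB first."""
    samples = []
    for byte in data:
        samples.extend(one_cycle(FREQ_START))
        for shift in range(7, -1, -1):
            freq = FREQ_ONE if (byte >> shift) & 1 else FREQ_ZERO
            samples.extend(one_cycle(freq))
    return samples


def write_stereo_wav(path, left, right):
    """Write two float sample lists as 16-bit stereo PCM.

    The shorter channel is padded with silence so both have equal length.
    """
    length = max(len(left), len(right))
    left = left + [0.0] * (length - len(left))
    right = right + [0.0] * (length - len(right))
    frames = bytearray()
    for l_sample, r_sample in zip(left, right):
        frames += struct.pack("<hh", int(l_sample * 32767), int(r_sample * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(frames))
```

Calling `write_stereo_wav("show.wav", music, encode_fsk(b"\x80\x80hello"))`, where `music` is a list of mono samples, would give a file ready to be run through an MP3 encoder.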

There was a small snag, though: the MP3 player summed both channels before supplying audio to its amplifier. This was not a huge problem to overcome. A bit of detective work in the device datasheet allowed him to identify the resistor network doing the mixing, and he removed the component for the data channel.

He’s posted full details of the system in the video below the break, complete with waveforms and gratuitous playback of audio FSK data.

This isn’t the first time we’ve featured audio FSK data here at Hackaday. We’ve covered its use to retrieve ROMs from 8-bit computers, seen it appearing as part of TV news helicopter coverage, and even seen an NSA Cray supercomputer used to decode it when used as a Star Trek sound effect.

32 thoughts on “Synchronize Data With Audio From A $2 MP3 Player”

Howard says:

June 13, 2016 at 8:43 am

Clever hacking! It reminds me of the cassette-based animation toys we made back in the 80s with the audio on one track and the animations in encoded tones on the other.

1. Dr.Tune says:
  
  June 13, 2016 at 9:21 am
  
  It took me right back to the days of the Commodore 64 when I wrote a cassette fast-loader that let you play Space Invaders while waiting for the game to load; it was used on quite a lot of Mastertronic games (mostly released in Europe) – that was nearly 30 years ago! :-) https://www.youtube.com/watch?v=M3NkuS6pOfA
  
  1. Greenaum says:
    
    June 13, 2016 at 6:25 pm
    
    YOU wrote Invade-a-load? That was famous! Millions of schoolchildren were amazed at it. In my case, the ZX Spectrum version. Or maybe that was Pac-Loader. In any case, genius, mate!
    
    1. Dr.Tune says:
      
      June 13, 2016 at 7:31 pm
      
      I did! :-) Yeah I worked for Mastertronic, wrote it in a few days back when I was a nipper, it got used on a lot of games – however I didn’t do any fancy loaders on the Spectrum; although a few people did manage it, it was much much harder to do – the spectrum was entirely software timing loops, whereas the C64 was nice enough to have hardware timers and the cassette trigger input was hooked up to an IRQ, so doing stuff while loading was cake.
      
      1. Greenaum says:
        
        June 15, 2016 at 10:45 am
        
        Ah, yeah that would make it easier. That and the C64’s tape rate being something like 300 baud, vs the Speccy’s 2000-something. All done directly by the CPU, just through a simple comparator or the like at an I/O address.
        
        I imagine the Speccy version was “do a little space invaders, check the tape, invade a little bit more, check the tape” in a loop. Same way Jet Set Willy did the sound, til Matthew Smith figured out how to do it better.
        
      2. Dr.Tune says:
        
        June 15, 2016 at 2:16 pm
        
        The C64 would run at ~2400baud (which was one of the reasons for doing a turbo loader) the hardware was quite capable but the ROM code was pretty weak. With Spectrum (+CPC, etc) tape loaders the tricky (well, highly tedious) part is that all your code paths that do the “something interesting” have to all take almost exactly the same number of cpu cycles so you can still do reasonably accurate software timing. It’s just a lot of drudgery. On the C64 people did the same effort so they could do “borderless” full screen images.
        
TDM says:

June 13, 2016 at 9:24 am

That was slick.

gregkennedy says:

June 13, 2016 at 9:30 am

Ah yes, like the “loader music” of yore. Back in the Atari 8-bitter days you had one tape track carrying the data, and the other track could carry a sweet analog audio tune.
https://soundcloud.com/jeff-gerstmann/atari-tape-music

1. Jim says:
  
  June 14, 2016 at 4:01 am
  
  A much underused feature of the Atari 8 bits. Aside from music to entertain while loading a game, with the tape motor control and digital sync, you could do really interesting stuff, especially with speech coming through the TV mixed with the computer audio. With the tape unit tucked away, less savvy bystanders wondered what fantastic futuristic programming they had just witnessed!
  
  1. gregkennedy says:
    
    June 14, 2016 at 12:05 pm
    
    There were “learn to speak (LANGUAGE)” tapes that used the computer to cue up recorded audio tracks, IIRC
    
Whatnot says:

June 13, 2016 at 9:59 am

Problem is that plenty of songs have certain instruments predominantly or even completely on one side, meaning that you can’t play regular MP3’s anymore on his hacked device since you miss half the audio information, you need to pre-process all files, or add a switch to re-enable the mixing circuit.

And talking of which, it’s odd how you see more and more standalone speakers and BT speakers that are mono, it’s like society is walking backwards in audio terms. And they even dare to ask full price for mono speakers, it’s peculiar.

So now I’m wondering: aren’t there modules like this for say $3 that are stereo? Surely those exist.
So I just quickly checked, found one that does MP3/WMA/WAV and takes USB input too, in stereo, and it has audio input and output and FM radio plus a display and a freaking remote control, all for €4.24, albeit without amplifier I think.

1. Pabluski says:
  
  June 13, 2016 at 10:12 am
  
  Stereo works best if you sit in the sweet spot. Otherwise mono is just fine and requires only one speaker.
  
2. Dr.Tune says:
  
  June 13, 2016 at 10:54 am
  
  Well, yeah you need to preprocess your MP3s otherwise how would you get the data track in there? If you don’t want the data track this hack is not relevant to you.
  
  SOX does it very nicely; you downmix the original stereo to mono (which does exactly what the unmodified player module does in hardware; just sums the channels) then splice in the data track. I do some other preprocessing on the audio as well because I’m using a crappy little speaker (2″ in the video but I’ll be using 3″ in the final project) which like all tiny speakers has poor bass response; so I remove a lot of the bass, compress the dynamic range a bit and normalize it, which improves the end result in my project.
  
  If you did want a DFPlayer with stereo output just use a $3 PAM amplifier module and hook to DACL+DACR; super simple. Mono is generally fine for this kind of small-speaker application though because even if you do have two speakers you rarely have them far enough apart to get a useful stereo image. It’s not about high fidelity, it’s about compact and cheap. The DFPlayer quality is fine if you listen on a decent pair of headphones of course; a small speaker is the weak link.
  
  1. Dr.Tune says:
    
    June 13, 2016 at 11:06 am
    
    Obviously what’s in the video is a 2″ speaker not mounted in any kind of enclosure, so it sounds awful – weak and tinny – in the end result everything (PCBs and battery etc) will be inside a 3″ inner diameter tube with a 3″ speaker mounted on one end; I’ll play a bit with the tube length and perhaps put a hole near the far end to turn it from a sealed to a ported enclosure; speaker design is an interesting thing.
    
    1. Galane says:
      
      June 13, 2016 at 8:32 pm
      
      Years ago when the Bose Wave was first released, I read a magazine article about it and Bose’s waveguide system. The article had a picture of a speaker with straight tubes on front and rear. The ratio was 4:1, or was it 3:1? One of those two. Try both!
      
      Bend the rear tube around so its opening is aligned with the opening of the tube on the front of the speaker and you have the equivalent of the single bass speaker in a Bose Wave.
      
      1. Dr.Tune says:
        
        June 13, 2016 at 9:12 pm
        
        Mmm it’s a world of fun investigating speaker design – especially ported enclosures. Looking at stuff like folded horns for bass, etc, there’s a lot of cleverness out there. My simplistic understanding – and possibly wrong – is that whatever you do you have peaks and troughs in the frequency response, and because my electronics (and batteries; likely 3 C cells in series) need to all go inside the same tube, that stuff is going to affect the air volume and hence resonance. I can’t really bend the tube (form-factor wise it’s pretty much got to be a single length of straight plastic pipe, likely 12-16″ long, 3″ inner diameter). I’ll try different lengths etc, ultimately I’m not about hi-fi although I’d like it to be a reasonably efficient speaker, especially above bass frequencies (say 300hz and up). I’ll put some pics on the project page when I get to that point (probably a few weeks).
        
3. aklsdjfkj says:
  
  June 13, 2016 at 10:57 am
  
  Great point about the instruments and vocals sometimes being split on stereo channels. I would assume that’s what he did before adding the light data on the mp3 file. Most of those standalone speakers (mp3/bluetooth) that I have seen are stereo, hook up headphones to their audio out jacks and they play in stereo – but their external speaker is a mono speaker since stereo would require people sit in a specific location. Mono is a much better and much cheaper solution.
  
  1. Whatnot says:
    
    June 13, 2016 at 11:57 am
    
    Obviously there is a sweet spot for stereo, but the general effect is certainly audible and gives depth to the sound over a much wider area than the sweet spot.
    
    1. Dr.Tune says:
      
      June 13, 2016 at 12:18 pm
      
      My particular project is designed to be shit – literally – it’s going inside porta-potties :-)
      
4. Galane says:
  
  June 13, 2016 at 8:28 pm
  
  My father, born in 1938, has never been a big fan of stereo sound. Seems to be a passing fad with all the mono Bluetooth speakers, and stereo ones with the speakers right next to each other.
  
ejonesss says:

June 13, 2016 at 10:28 am

Teddy Ruxpin did that: if you played a Teddy Ruxpin tape in a normal player, the story played on one speaker and the control signal could be heard on the other channel.

thirtyone says:

June 13, 2016 at 1:08 pm

VERY convenient and clever way to encode lighting data with audio! ++++++

Dan#1438459043 says:

June 13, 2016 at 2:09 pm

Ah very tricky, never let an unimaginative product design get in the way of a useful result.

Thanks for sharing, will we get to see more details when it is all done?

1. Dr.Tune says:
  
  June 13, 2016 at 4:47 pm
  
  Sure, I’ll post some code etc; have a backlog of “real” work to do right now :-)
  
  1. Jim says:
    
    September 14, 2017 at 5:12 pm
    
    Can you share the arduino sketch? How did you decode the FSK?
    
    1. Dr.Tune says:
      
      September 16, 2017 at 3:27 pm
      
      Hey there. This project basically ran out of time b/c it was for Burning Man and the synchronized audio track stuff had to get dropped, but I did the basic modem stuff in python on linux and then ran it back+forth through an mp3 encoder to make sure the data survived being perceptually encoded (which as you’d imagine is really mostly about not squeezing the mp3 data rate too low).
      Creating the FSK waveform is as simple as creating a WAV with a square wave (use a sine wave if you like) where the high/low periods vary according to the bits in your data stream. Demodulating FSK is very easy, you’re basically just looking at the ADC input sampling the audio channel, finding the zero crossing points (with some “minimum amplitude” requirement between subsequent crossings) and comparing the amount of time between those points against a threshold to decide if you read a ‘1’ or a ‘0’. (you can get fancier of course and encode at a higher bitrate if you want to using QPSK or whatever, but binary FSK is easy and reliable). You usually add a simple header preamble (e.g. 0x80,0x80 or whatever) so you can get in sync when reading, and perhaps a checksum suffix; you can imagine. An mp3 codec running at reasonable encoding bitrate is a piece of cake to use as a data channel compared to say audio tape, phone lines or especially radio.
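For readers who want to experiment, here is a minimal Python sketch of the zero-crossing demodulator described in the comment above. It is not Dr.Tune's code, which was never posted. It assumes a square wave with one full cycle per bit, 12 samples per cycle for a '1' and 24 for a '0' (4000 Hz and 2000 Hz at an assumed 48 kHz sample rate), and MSB-first bytes. The "minimum amplitude" requirement is implemented here as simple hysteresis, which is this example's own choice, and every name and constant is hypothetical.

```python
SAMPLES_ONE = 12     # samples per cycle for a '1' (assumed: 4000 Hz at 48 kHz)
SAMPLES_ZERO = 24    # samples per cycle for a '0' (assumed: 2000 Hz at 48 kHz)
THRESHOLD = (SAMPLES_ONE + SAMPLES_ZERO) / 2  # cycle lengths below this read as '1'
MIN_AMPLITUDE = 0.2  # signal must swing past +/- this level to count as a crossing


def to_bits(data):
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def from_bits(bits):
    """Pack a bit list back into bytes, dropping any incomplete final byte."""
    out = bytearray()
    for i in range(0, len(bits) - 7, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def modulate(bits, amplitude=0.8):
    """Square-wave FSK: one full cycle (high half, then low half) per bit."""
    samples = []
    for bit in bits:
        half = (SAMPLES_ONE if bit else SAMPLES_ZERO) // 2
        samples += [amplitude] * half + [-amplitude] * half
    # Trailing high half-cycle so the final bit is closed by a rising edge.
    samples += [amplitude] * (SAMPLES_ONE // 2)
    return samples


def demodulate(samples):
    """Recover bits by timing the gaps between rising zero crossings."""
    bits = []
    is_high = False
    last_rise = None
    for i, sample in enumerate(samples):
        if not is_high and sample > MIN_AMPLITUDE:
            is_high = True
            if last_rise is not None:
                bits.append(1 if (i - last_rise) < THRESHOLD else 0)
            last_rise = i
        elif is_high and sample < -MIN_AMPLITUDE:
            is_high = False
    return bits
```

A round trip such as `from_bits(demodulate(modulate(to_bits(b"\x80\x80Hi"))))` should return the original bytes. Locating the preamble in a longer stream and verifying a checksum are left out to keep the sketch short.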
      
NewCommentor1283 says:

June 13, 2016 at 4:09 pm

ahh now THAT is the kind of thing i come here to see/read

for 300 baud it works good, long as the tape is in good shape
can even be made wireless using old cordless phones, er, at least the analog ones …

1. Dr.Tune says:
  
  June 13, 2016 at 4:46 pm
  
  Not had time to really get stuck into it yet (have a ‘real’ job to do :-) but it works fine with each bit encoded as a single cycle of a 4000/2000hz (1/0) sine wave; with one 1khz ‘start bit’ – and could quite probably use a half-cycle rather than a whole one- so you’re talking around a couple of hundred bytes per second. Given that there’s no inherent frequency wobble (‘wow/flutter’) as with cassettes, could probably encode several bits at a time, e.g. 4 bits at once into different frequencies. Anyway, plenty of room for different modulation techniques if you need more data rate.
  You want your mp3 encoder to not use “joint stereo” or “intensity stereo” encoding mode or there may be some bleedthrough between music/data or some ‘random’ skewing of the data. Fortunately the LAME encoder (that SOX uses) is a very smart piece of software and seems to be automatically doing the right thing.
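The "couple of hundred bytes per second" estimate in the comment above can be checked with a little arithmetic. The Python sketch below uses the quoted tone frequencies and assumes, since the comment does not spell it out, that one 1 kHz start cycle precedes every 8-bit byte. The function names are made up for this example.

```python
# Tone frequencies quoted in the comment; one full cycle carries one bit.
FREQ_ONE_HZ = 4000.0
FREQ_ZERO_HZ = 2000.0
FREQ_START_HZ = 1000.0  # assumed here to precede every byte
BITS_PER_BYTE = 8


def byte_duration(byte):
    """Seconds needed to send one byte: a start cycle plus eight data cycles."""
    ones = bin(byte).count("1")
    zeros = BITS_PER_BYTE - ones
    return 1 / FREQ_START_HZ + ones / FREQ_ONE_HZ + zeros / FREQ_ZERO_HZ


def bytes_per_second(data):
    """Average throughput for a given payload."""
    total_seconds = sum(byte_duration(byte) for byte in data)
    return len(data) / total_seconds
```

Under these assumptions a byte of all ones takes 3 ms and a byte of all zeros takes 5 ms, so throughput lands between roughly 200 and 333 bytes per second depending on the data, which is in line with the estimate.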
  
erock2014 says:

June 15, 2016 at 10:08 am

This is a great hack. It’s the same thing Disney did with Audio Animatronics.

Alexei says:

June 15, 2016 at 10:36 am

Goldstar sold (probably, not sure if they were given away or sold) tape players bundled with an English language course. The player had an 8051 CPU together with an HD44780 LCD, and it would show “captions” encoded on one of the stereo tape tracks:
https://www.youtube.com/watch?v=UUxAdcdCazo

I had a play around with one of those some time ago. I figured out the data encoding and message structure, wrote some python to decipher original captions and make my own, but I never found a way to ‘hack’ the thing and reach the LCD’s CGRAM through tape messages, which sucks. Drawing stuff on LCD using audiocassettes would be fun, I guess.

1. Greenaum says:
  
  June 15, 2016 at 10:48 am
  
  You could convert Star Wars to the format! Would’ve been a great way of watching films portably, back in the 1980s. Well maybe not “great”.
  
  You could replace the 8051 if you really wanted to go for it, perhaps a bit pointless though, doesn’t have as much hack value if you’re having to replace hardware with modern stuff.
  
  1. Alexei says:
    
    June 15, 2016 at 11:12 am
    
    Heh, the scrolling intro sequence from Star Wars would fit nicely on 2×16 LCD, but the rest of the movie would be in very painful scroll-o-vision :)
    I did try to spam the thing with all sorts of different effects and speed settings and messages to maybe find some hidden or unused command (or a glitch) that would do exactly that, but there wasn’t any of that – it’s rock solid.
    What a tease to allow to print CG characters on screen, but completely deny to redefine them :P
    As for replacing 8051, I had thoughts about that, but then yep, it would be pointless and I don’t have a pin-compatible 8051 anyway.
    
    Dumping the firmware would be helpful, but those old factory-programmed (probably) 8051s don’t allow for that at all, I think. Don’t know much about ’em, my microcontroller tinkering began with a bunch of AVRs.
    
