Synchronize Data With Audio From A $2 MP3 Player

Many of the hacks featured here are complex feats of ingenuity that you might expect to have emerged from a space-age laboratory rather than a hacker’s bench. Impressive stuff, but on the other side of the coin the essence of a good hack is often just a simple and elegant way of solving a technical problem using clever lateral thinking.

Take this project from [drtune]: he needed to synchronize some lighting to an audio stream from an MP3 player, and wanted to store his lighting control data on the same SD card as his MP3 file. Sadly, his serial-controlled MP3 player module would only play audio data from the card and he couldn’t read a data file from it, so there seemed to be no easy way forward.

His solution was simple: realizing that the module has a stereo DAC but a mono amplifier, he encoded the data as an audio FSK stream similar to that used by modems back in the day, and applied it to one channel of his stereo MP3 file. He could then play the music from his first channel and digitize the FSK data on the other before applying it to a software modem to retrieve its information.

There was a small snag though: the MP3 player summed both channels before supplying audio to its amplifier. This was not a huge problem to overcome; a bit of detective work in the device datasheet allowed him to identify the resistor network doing the mixing, and he removed the component for the data channel.

He’s posted full details of the system in the video below the break, complete with waveforms and gratuitous playback of audio FSK data.

This isn’t the first time we’ve featured audio FSK data here at Hackaday. We’ve covered its use to retrieve ROMs from 8-bit computers, seen it appearing as part of TV news helicopter coverage, and even seen an NSA Cray supercomputer used to decode it when used as a Star Trek sound effect.

32 thoughts on “Synchronize Data With Audio From A $2 MP3 Player”

  1. Clever hacking! It reminds me of the cassette-based animation toys we made back in the 80s with the audio on one track and the animations in encoded tones on the other.

      1. YOU wrote Invade-a-load? That was famous! Millions of schoolchildren were amazed at it. In my case, the ZX Spectrum version. Or maybe that was Pac-Loader. In any case, genius, mate!

        1. I did! :-) Yeah I worked for Mastertronic, wrote it in a few days back when I was a nipper, it got used on a lot of games – however I didn’t do any fancy loaders on the Spectrum; although a few people did manage it, it was much much harder to do – the spectrum was entirely software timing loops, whereas the C64 was nice enough to have hardware timers and the cassette trigger input was hooked up to an IRQ, so doing stuff while loading was cake.

          1. Ah, yeah that would make it easier. That and the C64’s tape rate being something like 300 baud, vs the Speccy’s 2000-something. All done directly by the CPU, just through a simple comparator or the like at an I/O address.

            I imagine the Speccy version was “do a little space invaders, check the tape, invade a little bit more, check the tape” in a loop. Same way Jet Set Willy did the sound, til Matthew Smith figured out how to do it better.

          2. The C64 would run at ~2400baud (which was one of the reasons for doing a turbo loader) the hardware was quite capable but the ROM code was pretty weak. With Spectrum (+CPC, etc) tape loaders the tricky (well, highly tedious) part is that all your code paths that do the “something interesting” have to all take almost exactly the same number of cpu cycles so you can still do reasonably accurate software timing. It’s just a lot of drudgery. On the C64 people did the same effort so they could do “borderless” full screen images.

    1. A much underused feature of the Atari 8 bits. Aside from music to entertain while loading a game, with the tape motor control and digital sync, you could do really interesting stuff, especially with speech coming through the TV mixed with the computer audio. With the tape unit tucked away, less savvy bystanders wondered what fantastic futuristic programming they had just witnessed!

  2. Problem is that plenty of songs have certain instruments predominantly or even completely on one side, meaning that you can’t play regular MP3’s anymore on his hacked device since you miss half the audio information, you need to pre-process all files, or add a switch to re-enable the mixing circuit.

    And talking of which, it’s odd how you see more and more standalone speakers and BT speakers that are mono, it’s like society is walking backwards in audio terms. And they even dare to ask full price for mono speakers, it’s peculiar.

    So now I’m wondering: aren’t there modules like this for say $3 that are stereo? Surely those exist.
    So I just quickly checked, found one that does MP3/WMA/WAV and takes USB input too, in stereo, and it has audio input and output and FM radio plus a display and a freaking remote control, all for €4.24, albeit without amplifier I think.

    1. Well, yeah you need to preprocess your MP3s otherwise how would you get the data track in there? If you don’t want the data track this hack is not relevant to you.

      SOX does it very nicely; you downmix the original stereo to mono (which does exactly what the unmodified player module does in hardware; just sums the channels) then splice in the data track. I do some other preprocessing on the audio as well because I’m using a crappy little speaker (2″ in the video but I’ll be using 3″ in the final project) which, like all tiny speakers, has poor bass response; so I remove a lot of the bass, compress the dynamic range a bit and normalize it, which improves the end result in my project.

      If you did want a DFPlayer with stereo output just use a $3 PAM amplifier module and hook to DACL+DACR; super simple. Mono is generally fine for this kind of small-speaker application though because even if you do have two speakers you rarely have them far enough apart to get a useful stereo image. It’s not about high fidelity, it’s about compact and cheap. The DFPlayer quality is fine if you listen on a decent pair of headphones of course; a small speaker is the weak link.
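      The preprocessing described above (downmix the stereo music to mono, then put the data track on the other channel) can be sketched in plain Python. This is not the SoX command the author used, only an illustration of the same idea on 16-bit samples. The function names are hypothetical, the choice of left for music and right for data is an assumption, and the downmix halves the sum so it stays in 16-bit range.

```python
import array
import io
import sys
import wave


def downmix_to_mono(left, right):
    """Sum the two channels and halve the result to stay within 16-bit range."""
    return [(l + r) // 2 for l, r in zip(left, right)]


def splice_data_track(music, data):
    """Interleave mono music (left) with the data track (right).

    The shorter input is padded with silence. Channel assignment is an
    assumption; swap it to match whichever DAC output feeds the amplifier.
    """
    length = max(len(music), len(data))
    music = list(music) + [0] * (length - len(music))
    data = list(data) + [0] * (length - len(data))
    frames = array.array("h")
    for m, d in zip(music, data):
        frames.append(m)
        frames.append(d)
    return frames


def write_wav(frames, sample_rate=44100):
    """Return the interleaved frames as the bytes of a 16-bit stereo WAV file."""
    out = array.array("h", frames)
    if sys.byteorder == "big":
        out.byteswap()  # WAV sample data is little-endian
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(out.tobytes())
    return buf.getvalue()


if __name__ == "__main__":
    mono = downmix_to_mono([100, -100, 300], [300, -300, 100])
    wav_bytes = write_wav(splice_data_track(mono, [1, 2, 3, 4]))
    print(len(wav_bytes), "bytes of stereo WAV")
```

      The resulting WAV would then go through an MP3 encoder as a separate step.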

      1. Obviously what’s in the video is a 2″ speaker not mounted in any kind of enclosure, so it sounds awful – weak and tinny – in the end result everything (PCBs and battery etc) will be inside a 3″ inner diameter tube with a 3″ speaker mounted on one end; I’ll play a bit with the tube length and perhaps put a hole near the far end to turn it from a sealed to a ported enclosure; speaker design is an interesting thing.

        1. Years ago when the Bose Wave was first released, I read a magazine article about it and Bose’s waveguide system. The article had a picture of a speaker with straight tubes on front and rear. The ratio was 4:1, or was it 3:1? One of those two. Try both!

          Bend the rear tube around so its opening is aligned with the opening of the tube on the front of the speaker and you have the equivalent of the single bass speaker in a Bose Wave.

          1. Mmm it’s a world of fun investigating speaker design – especially ported enclosures. Looking at stuff like folded horns for bass, etc, there’s a lot of cleverness out there. My simplistic understanding – and possibly wrong – is that whatever you do you have peaks and troughs in the frequency response, and because my electronics (and batteries; likely 3 C cells in series) need to all go inside the same tube, that stuff is going to affect the air volume and hence resonance. I can’t really bend the tube (form-factor wise it’s pretty much got to be a single length of straight plastic pipe, likely 12-16″ long, 3″ inner diameter). I’ll try different lengths etc, ultimately I’m not about hi-fi although I’d like it to be a reasonably efficient speaker, especially above bass frequencies (say 300 Hz and up). I’ll put some pics on the project page when I get to that point (probably a few weeks).

    2. Great point about the instruments and vocals sometimes being split on stereo channels. I would assume that’s what he did before adding the light data on the mp3 file. Most of those standalone speakers (mp3/bluetooth) that I have seen are stereo, hook up headphones to their audio out jacks and they play in stereo – but their external speaker is a mono speaker since stereo would require people sit in a specific location. Mono is a much better and much cheaper solution.

    3. My father, born in 1938, has never been a big fan of stereo sound. Seems to be a passing fad with all the mono Bluetooth speakers, and stereo ones with the speakers right next to each other.

  3. Ah very tricky, never let an unimaginative product design get in the way of a useful result.

    Thanks for sharing, will we get to see more details when it is all done?

        1. Hey there. This project basically ran out of time b/c it was for Burning Man and the synchronized audio track stuff had to get dropped, but I did the basic modem stuff in python on linux and then ran it back+forth through an mp3 encoder to make sure the data survived being perceptually encoded (which as you’d imagine is really mostly about not squeezing the mp3 data rate too low).
          Creating the FSK waveform is as simple as creating a WAV with a square wave (use a sine wave if you like) where the high/low periods vary according to the bits in your data stream. Demodulating FSK is very easy: you’re basically just looking at the ADC input sampling the audio channel, finding the zero crossing points (with some “minimum amplitude” requirement between subsequent crossings) and comparing the amount of time between those points against a threshold to decide if you read a ‘1’ or a ‘0’. (You can get fancier of course and encode at a higher bitrate if you want to using QPSK or whatever, but binary FSK is easy and reliable.) You usually add a simple header preamble (e.g. 0x80,0x80 or whatever) so you can get in sync when reading, and perhaps a checksum suffix; you can imagine. An mp3 codec running at reasonable encoding bitrate is a piece of cake to use as a data channel compared to say audio tape, phone lines or especially radio.
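          As a rough illustration of the scheme described above, here is a minimal Python sketch, not the author's actual modem code. It assumes one sine cycle per bit at 4000 Hz for a '1' and 2000 Hz for a '0', a 44100 Hz sample rate, and a made-up amplitude floor of 0.2; all function names are hypothetical. The decoder times the gap between rising zero crossings and compares it with a threshold halfway between the two cycle lengths.

```python
import math

SAMPLE_RATE = 44100  # assumed; any rate well above the tone frequencies works
FREQ_ONE = 4000      # assumed tone for a '1' bit, in Hz
FREQ_ZERO = 2000     # assumed tone for a '0' bit, in Hz


def _cycle(freq):
    """One full sine cycle, rounded to a whole number of samples."""
    n = round(SAMPLE_RATE / freq)
    return [math.sin(2 * math.pi * k / n) for k in range(n)]


def bytes_to_bits(data):
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def bits_to_bytes(bits):
    """Pack bits (MSB first) into bytes, dropping any incomplete tail."""
    out = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def modulate(data):
    """Encode bytes as samples in [-1, 1], one sine cycle per bit."""
    samples = []
    for bit in bytes_to_bits(data):
        samples.extend(_cycle(FREQ_ONE if bit else FREQ_ZERO))
    # Extra cycle so the final data bit has a closing zero crossing.
    samples.extend(_cycle(FREQ_ZERO))
    return samples


def demodulate(samples, min_amplitude=0.2):
    """Recover bits by timing the gaps between rising zero crossings."""
    threshold = (SAMPLE_RATE / FREQ_ONE + SAMPLE_RATE / FREQ_ZERO) / 2
    bits = []
    last_cross = None
    peak = 0.0
    prev = 0.0
    for i, s in enumerate(samples):
        if prev <= 0.0 < s:  # rising zero crossing
            # Only count the interval if the signal was loud enough.
            if last_cross is not None and peak >= min_amplitude:
                bits.append(1 if (i - last_cross) < threshold else 0)
            last_cross = i
            peak = 0.0
        peak = max(peak, abs(s))
        prev = s
    return bits


if __name__ == "__main__":
    payload = bytes([0x80, 0x80]) + b"lights on"  # preamble, then data
    recovered = bits_to_bytes(demodulate(modulate(payload)))
    print(recovered == payload)
```

          A real decoder would also search for the preamble to find the byte boundary and verify a checksum; this sketch assumes decoding starts exactly at the first bit.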

  4. ahh now THAT is the kind of thing i come here to see/read

    for 300 baud it works good, long as the tape is in good shape
    can even be made wireless using old cordless phones, er, at least the analog ones …

    1. Not had time to really get stuck into it yet (have a ‘real’ job to do :-) but it works fine with each bit encoded as a single cycle of a 4000/2000 Hz (1/0) sine wave; with one 1 kHz ‘start bit’ – and could quite probably use a half-cycle rather than a whole one – so you’re talking around a couple of hundred bytes per second. Given that there’s no inherent frequency wobble (‘wow/flutter’) as with cassettes, could probably encode several bits at a time, e.g. 4 bits at once into different frequencies. Anyway, plenty of room for different modulation techniques if you need more data rate.
      You want your mp3 encoder to not use “joint stereo” or “intensity stereo” encoding mode or there may be some bleedthrough between music/data or some ‘random’ skewing of the data. Fortunately the LAME encoder (that SOX uses) is a very smart piece of software and seems to be automatically doing the right thing.
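      The "couple of hundred bytes per second" figure can be checked with a little arithmetic. The sketch below assumes one 1 kHz start-bit cycle per byte, which the comment does not actually specify, and the function name is hypothetical. Under that assumption the rate works out to roughly 200 bytes per second for all-zero data, about 333 for all-one data, and 250 for a balanced mix.

```python
def bytes_per_second(ones_in_byte, one_hz=4000, zero_hz=2000, start_hz=1000):
    """Throughput for bytes with a given number of '1' bits.

    Each bit is one full cycle of its tone, so it lasts 1/frequency seconds.
    Assumes one start-bit cycle precedes every byte (an assumption, not
    something the source confirms).
    """
    if not 0 <= ones_in_byte <= 8:
        raise ValueError("a byte has between 0 and 8 one-bits")
    zeros_in_byte = 8 - ones_in_byte
    seconds = 1 / start_hz + ones_in_byte / one_hz + zeros_in_byte / zero_hz
    return 1 / seconds


if __name__ == "__main__":
    for ones in (0, 4, 8):
        print(ones, "one-bits per byte:", round(bytes_per_second(ones)), "bytes/s")
```

      Halving every cycle, as the comment suggests might work, would double these figures.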

  5. Goldstar sold (probably, not sure if they were given away or sold) tape players bundled with an English language course. The player had an 8051 CPU together with an HD44780 LCD, and it would show “captions” encoded on one of the stereo tape tracks.

    I had a play around with one of those some time ago. I figured out the data encoding and message structure, wrote some python to decipher original captions and make my own, but I never found a way to ‘hack’ the thing and get at the LCD’s CGRAM through tape messages, which sucks. Drawing stuff on LCD using audiocassettes would be fun, I guess.

    1. You could convert Star Wars to the format! Would’ve been a great way of watching films portably, back in the 1980s. Well maybe not “great”.

      You could replace the 8051 if you really wanted to go for it, perhaps a bit pointless though, doesn’t have as much hack value if you’re having to replace hardware with modern stuff.

      1. Heh, the scrolling intro sequence from Star Wars would fit nicely on 2×16 LCD, but for the rest of the movie would be in very painful scroll-o-vision :)
        I did try to spam the thing with all sorts of different effects and speed settings and messages to maybe find some hidden or unused command (or a glitch) that would do exactly that, but there wasn’t any of that – it’s rock solid.
        What a tease to allow to print CG characters on screen, but completely deny to redefine them :P
        As for replacing 8051, I had thoughts about that, but then yep, it would be pointless and I don’t have a pin-compatible 8051 anyway.

        Dumping the firmware would be helpful, but those old factory-programmed (probably) 8051s don’t allow for that at all, I think. Don’t know much about ’em, my microcontroller tinkering began with a bunch of AVRs.
