Serial Connection Over Audio: Arduino Can Listen To UART

We’ve all been there: after assessing a problem and thinking about a solution, we rush to pursue the first one that comes to mind, only to find later that there was a vastly simpler alternative. Thankfully, developing an obscure solution, though sometimes frustrating at the time, does tend to make a good Hackaday post. This time it was [David Wehr] and AudioSerial: a simple way of outputting raw serial data over the audio port of an Android phone. Though [David] could have easily used USB OTG for this project, many microcontrollers don’t have the USB-to-TTL capabilities of his Arduino, so this wasn’t entirely in vain.

At first, it seemed like a simple task: any respectable phone’s DAC should have a sample rate of at least 44.1kHz. [David] used Oboe, a high performance C++ library for Android audio apps, to create the required waveform. Each 8-bit data chunk he sent can only take one of 256 values, so he pre-generated the waveform for every one of them. However, the DAC tried to be clever and do some interpolation with the signal, which is great for audio but not so much for digital waveforms. You can see the warped signal in blue compared to what it should be in orange. To fix this, he used an op-amp comparator to clean up the signal and boost it to the required voltage.
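To get a feel for the pre-generation step, here is a minimal Python sketch (not [David]'s actual Oboe/C++ code). It assumes 8N1 framing and a 9600 baud rate, neither of which is stated above, and the names `uart_frame_bits`, `frame_samples` and `WAVEFORMS` are made up for illustration.

```python
SAMPLE_RATE = 44_100  # Hz, the DAC sample rate mentioned above
BAUD_RATE = 9_600     # assumption: the post does not state the baud rate used


def uart_frame_bits(byte):
    """Line levels of one 8N1 frame: start bit (0), 8 data bits LSB first, stop bit (1)."""
    return [0] + [(byte >> i) & 1 for i in range(8)] + [1]


def frame_samples(byte, sample_rate=SAMPLE_RATE, baud=BAUD_RATE, high=1.0, low=-1.0):
    """Render one frame as audio samples, holding each bit level for one bit time."""
    bits = uart_frame_bits(byte)
    n_samples = round(len(bits) * sample_rate / baud)
    samples = []
    for n in range(n_samples):
        # Which bit does sample n fall into? Clamp to guard against rounding.
        bit_index = min(int(n * baud / sample_rate), len(bits) - 1)
        samples.append(high if bits[bit_index] else low)
    return samples


# One pre-generated waveform per possible byte value.
WAVEFORMS = [frame_samples(b) for b in range(256)]
```

At these assumed rates a bit lasts about 4.6 samples, so bit edges land on the nearest sample and carry a little timing jitter. Whether the levels need inverting depends on the comparator stage that follows.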

Prefer your Arduino connections wireless? Check out this smartphone-controlled periodic table of elements, or this wireless robotic hand.

19 thoughts on “Serial Connection Over Audio: Arduino Can Listen To UART”

    1. sure, but uart itself does not have any requirement to have a zero DC level (i.e. equal number of ones and zeros over time). So it would work for short messages, but not long ones. Since analog circuits are already used, a better method would be to send a sine with small amplitude for zero and a sine with larger amplitude for one. That way there is always zero dc.

      1. It can be done, if you don’t mind (at least) tenfold bit rate reduction.

        If you set the uart to 8,N,1, then you know that long strings of back-to-back characters will have a 10 between the characters, so you can use that to form tones and vice versa.

        E.g. series of 0x55s (it is LSB first, remember?) should produce continuous tone of frequency equal to half of bitrate, and series of 0xF0s should produce continuous tone of frequency equal to 1/10th of bitrate.

        Let’s say one character time of lower frequency is symbol “1” (idle, for proper synchronization of receiving UARTs) and one character time of higher frequency is “0”. Now you can have the communication session start with unspecified minimal number of ones as preamble, followed by perhaps a single zero, and then the data may begin. I think it would be a good thing to put some rule for bit-stuffing, to break long runs of high-frequency bits, because otherwise receiving UARTs may lose the count where the character (a “bit”) begins.

        Now if the audio inputs and outputs are band limited to max 20kHz, the maximum baud rate will be one tenth of double of that (aka one fifth), or 4kbps. One standard bit rate close to a match is 38400, which will produce tones at 3.84 kHz and 19.2 kHz, with symbol rate of 3840 bps. You can’t surf over that, but a chat, or some simple control, could work.
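The numbers in this comment can be checked with a short Python sketch. It assumes 8N1 framing with characters sent back to back; `uart_frame_bits` and `tone_frequency` are illustrative names, and counting rising edges per frame only gives the fundamental for evenly spaced patterns such as the two bytes used here.

```python
def uart_frame_bits(byte):
    """Line levels of one 8N1 frame: start bit (0), 8 data bits LSB first, stop bit (1)."""
    return [0] + [(byte >> i) & 1 for i in range(8)] + [1]


def tone_frequency(byte, baud):
    """Square-wave frequency produced by repeating `byte` back to back at `baud`."""
    bits = uart_frame_bits(byte)
    n = len(bits)
    # Count rising edges per frame, wrapping around into the next frame.
    cycles = sum(1 for i in range(n) if bits[i] == 0 and bits[(i + 1) % n] == 1)
    return cycles * baud / n


BAUD = 38_400
high_tone = tone_frequency(0x55, BAUD)  # alternating bits: half the bit rate
low_tone = tone_frequency(0xF0, BAUD)   # five lows then five highs: a tenth of the bit rate
symbol_rate = BAUD / 10                 # one 10-bit character per symbol

print(high_tone, low_tone, symbol_rate)
```

With 38400 baud this gives the 19.2 kHz and 3.84 kHz tones and the 3840 symbols per second quoted above.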

        1. Or you could limit the available symbols over UART to just those that have proper DC.
          Again, only for long messages without breaks in between. If you just want to send some command every now and then, it will work.
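As a rough sketch of what limiting the symbols would look like, the Python below lists the byte values whose 8N1 frame contains as many ones as zeros. The framing and the helper names are assumptions for illustration, and the balance only holds when frames are sent back to back, since an idle line sits high.

```python
def uart_frame_bits(byte):
    """Line levels of one 8N1 frame: start bit (0), 8 data bits LSB first, stop bit (1)."""
    return [0] + [(byte >> i) & 1 for i in range(8)] + [1]


def is_dc_balanced(byte):
    """True when the 10-bit frame has exactly five ones and five zeros."""
    bits = uart_frame_bits(byte)
    return 2 * sum(bits) == len(bits)


# The start and stop bits cancel each other out, so a frame is balanced
# exactly when four of its eight data bits are set.
BALANCED_BYTES = [b for b in range(256) if is_dc_balanced(b)]

print(len(BALANCED_BYTES))
```

That leaves 70 usable byte values out of 256, a little over six bits of payload per character.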

  1. I wouldn’t count on USB OTG support. I just finished working on a project with (cheap) mediatek/allwinner android tablets where they stripped out OTG support. That said my phone did have it and so did a tablet from Lenovo.

  2. i was playing with some modulation/demodulation techniques for CW and PSK and made pairs for both in js with the WebAudio API. Not perfect, code is messy, but it works. It can send and receive at about 200 bps in BPSK mode between 2 browsers, tested it with my android phone and firefox on desktop and works fine, even at close to inaudible frequencies.

    might need to enable Stereo Mix in windows to make it work between 2 browser tabs.

  3. The distorted waveform is very similar to floppy disk flux reversal signals. The MFM encoding makes sure the phase locked loop stays in sync while still being able to squeeze in a lot of data.
    Here’s a capture I did on the rigol DS1054z over ethernet into my application. The top two traces are the digital flux reversal signals after processing using my own software filters (first one) and the floppy drive own digital read output. The bottom is the analogue flux reversal signal directly captured from the drive head.
