ESP8266 As A Networked MP3 Decoder

Support libraries, good application notes, and worked examples from a manufacturer can really help speed us on our way in making cool stuff with new parts. Espressif Systems has been doing a good job with their ESP8266 product (of course, it doesn’t hurt that the thing makes a sub-$5 IOT device a reality). Only recently, though, have they started publishing completed, complex application examples. This demo, a networked MP3 webradio player, just popped up in Github, written by the man better known to us as Sprite_tm. We can’t wait to see more.

The MP3 decoder itself is a port of the MAD MP3 library, adapted for smaller amounts of SRAM and ported to the ESP8266. With a couple external parts, you can make an internet-connected device that you can point to any Icecast MP3 stream, for instance, and it’ll decode and play the resulting audio.

What external parts, you ask? First is something to do the digital-to-analog conversion. The application, as written, is built for an ES9023 DAC, but basically anything that speaks I2S should be workable with only a little bit of datasheet-poking and head-scratching. Of course, you could get rid of the nice-sounding DAC chip and output 5-bit PWM directly from the ESP8266, but aside from being a nice quick demo, it’s going to sound like crap.
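To see where the "5-bit" figure comes from, here is a minimal Python sketch of the idea Sprite_tm describes in the comments below: each 32-bit I2S slot is filled with PWM data rather than PCM, and since every bit carries the same weight, only the number of 1s matters. The function name `pwm_word` and the choice to pack the 1s into the low bits are our own assumptions for illustration; the actual firmware may arrange the bits differently.

```python
def pwm_word(sample: int) -> int:
    """Map a signed 16-bit PCM sample to a 32-bit word used as a PWM pattern.

    Hypothetical helper: only the count of 1 bits carries information,
    so the 16-bit sample collapses to 32 distinct duty cycles (5 bits).
    """
    if not -32768 <= sample <= 32767:
        raise ValueError("sample must fit in a signed 16-bit integer")
    level = (sample + 32768) >> 11   # keep the top 5 bits: 0..31
    return (1 << level) - 1          # a word with exactly `level` ones


if __name__ == "__main__":
    for s in (-32768, 0, 32767):
        word = pwm_word(s)
        print(f"{s:7d} -> {word:032b} ({bin(word).count('1')} ones)")
```

Eleven of the sixteen bits in every sample are thrown away, which is why the result works as a test mode but not as a hi-fi output.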

The other suggested external IC is an SPI RAM chip to allow for buffering of the incoming MP3 file. WiFi — and TCP networks in general — being what they are, you’re going to want to buffer the MP3 files to prevent glitching. As with the dedicated DAC, you could get away without it (and there are defines in the “playerconfig.h” file to do so) but you’ll probably regret it.
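For a rough feel of what the buffer buys you, this short Python sketch converts a buffer size and stream bitrate into seconds of audio. The numbers are illustrative assumptions only (a 128 KiB buffer and a 128 kbit/s stream); the article does not say which RAM chip or bitrate the demo targets.

```python
def buffer_seconds(buffer_bytes: int, bitrate_bps: int) -> float:
    """Seconds of compressed audio that fit in a buffer of the given size."""
    if buffer_bytes < 0 or bitrate_bps <= 0:
        raise ValueError("buffer size must be >= 0 and bitrate must be > 0")
    return buffer_bytes * 8 / bitrate_bps


if __name__ == "__main__":
    # Assumed values for illustration, not taken from the project.
    print(buffer_seconds(128 * 1024, 128_000))  # external RAM sized buffer
    print(buffer_seconds(4 * 1024, 128_000))    # a few KiB of on-chip RAM
```

Under those assumptions the external chip rides out a network stall of several seconds, while a few kilobytes of on-chip RAM covers only a fraction of a second.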

In sum, an ESP8266 chip, a cheap I2S DAC, and some external RAM and you’ve got a webradio player. OK, maybe we’d also add an amplifier chip, power supply, and a speaker. Hmmm…. and a display? Or leave it all configurable over WiFi? Point is, it’s a great worked code example, and a neat DIY device to show your friends.

The downsides? So far, only the mono version of the libMAD decoder / synth has been ported over to ESP8266. The github link is begging for a pull request, the unported code is just sitting there, and we think that someone should take up the task.

Other Resources

In our search for other code examples for the ESP8266, we stumbled on three repositories that appear to be official Espressif repositories on Github: espressif, EspressifSystems, and EspressifApp (for mobile apps that connect to the ESP8266). The official “Low Power Voltage Measurement” example looks like a great place to start, and it uses the current version of the SDK and toolchain.

There’s also an active forum, with their own community Github repository, with a few “Hello World” examples and a nice walkthrough of the toolchain.

And of course, we’ve reported on a few in the past. This application keeps track of battery levels, for instance. If you’ve got the time, have a look at all the posts tagged ESP8266 here on Hackaday.

You couldn’t possibly want more resources for getting started with your ESP8266 project. Oh wait, you want Arduino IDE support?

Thanks [Sprite_tm] for the tip.

43 thoughts on “ESP8266 As A Networked MP3 Decoder”

      1. Aye. Unfortunately, only mono audio can be made that way: you really need buffered output, and in this case I abuse the hardware I2S and DMA to do that, but that gives only a single output. To have two outputs, you’d have to do everything in software and I don’t know if that’d leave enough room to actually decode the MP3.

        1. Because instead of sending 32 bits of PCM data to the I2S DAC, I send 32 bits’ worth of PWM data. Because in that case every bit has the same weight, I can only send 2^5 values with unique ratios between 1s and 0s. Theoretically I could do more, but I don’t really have the memory to buffer that in this application.

          1. I need some buffer to multitask between decoding MP3 data, reading MP3 from the network and various network stack things. Theoretically, it would be possible to modify the PWM code to use a buffer like that, but the PWM mode basically was a quick hack to allow people to test this without waiting for the I2S chip to come in, so I didn’t want to spend too much time on it.

    1. The ESP actually has I2S, which in this case is basically SPI with a DMA engine to feed it. The DSD thing you mention seems to need a 1-bit output at 2.8MHz, plus some logic to calculate the output. Processor time I have in the ESP: the MP3 decoder actually uses the CPU at 80MHz while it can run at 160MHz. I fear that the bottleneck is going to be the output buffers: I only have enough to do 1.4MHz 1-bit output… otherwise, it may be worth looking at.

      1. A second-order sigma-delta 1-bit DAC at 1.4MHz should produce reasonable audio quality… and just uses integer addition and subtraction.

        Most commercial SD DACs are 5th order, but that wants a multiplier as well as figuring out the correct coefficients for the lowpass filter (while a 2nd order SDDAC is just a double integrator)

        1. I’ve implemented a 2nd-order delta-sigma DAC, and indeed, to my untrained ear at least it seems that the noise is more concentrated in the high frequency bands and should be easier to filter out using a lowpass filter. Check the github repo for the code if you want it. I’m sure a 5th order DAC is doable as well, but I don’t know if the analog qualities of the ESP’s output pin warrant implementing that.
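For readers wondering what a "double integrator" modulator looks like, here is a minimal Python sketch of a textbook second-order delta-sigma loop that uses only integer addition and subtraction. It is not the code from the repository: the function name, the sample-and-hold upsampling, and the default oversampling factor of 32 (roughly 1.4MHz at a 44.1kHz sample rate) are assumptions made for illustration.

```python
FULL_SCALE = 32768  # magnitude of the 1-bit feedback, matching 16-bit PCM


def delta_sigma_2nd_order(samples, oversample=32):
    """Turn signed 16-bit samples into a 1-bit stream (list of 0/1).

    Textbook structure: two cascaded integrators, each fed back with the
    previous output bit. Each input sample is simply held for
    `oversample` output bits (no interpolation filter).
    """
    int1 = 0       # first integrator state
    int2 = 0       # second integrator state
    feedback = 0   # previous output, as +/- FULL_SCALE (0 before the first bit)
    bits = []
    for x in samples:
        for _ in range(oversample):
            int1 += x - feedback
            int2 += int1 - feedback
            bit = 1 if int2 >= 0 else 0
            feedback = FULL_SCALE if bit else -FULL_SCALE
            bits.append(bit)
    return bits


if __name__ == "__main__":
    stream = delta_sigma_2nd_order([16384] * 64)
    print(sum(stream) / len(stream))  # fraction of 1s tracks the input level
```

The density of 1s follows the input level, while the quantization error is pushed toward high frequencies where an analog lowpass filter can remove it.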

      1. I think the cheap Chinese ones probably come in at less than the cost of a standalone DAC and SPI RAM chip, we just don’t really have any way to buy or develop for them unfortunately.

  1. Some 74’595s and a bunch of resistors and you have a cheap I2S DAC. Well, ok, not as cheap as the integrated stuff in high quantities, but at least very easy to source.

      1. Here where I live 74HC595 is 0.19€ + 5.60€ shipping in one shop and 0.20€ + 3.85€ shipping in another shop while Digikey wants to have 18€ shipping unless you order parts for >= 65€. I can’t find any other shop selling the AK4430.

    1. Can’t see a big problem – if the MP3 stack works, a slim SIP stack next and you’re virtually there.
      I’m sure a bodged SIP implementation for fun is enough, or trimmed PJSIP for those that want to use any PBX.

    1. Most of the MP3 patents have actually expired now because it’s so old. The standard itself dates back to 1993, and there have been software MP3 decoders that run on an ordinary PC in real time since 1995. (In a sane world, all of them should have expired at this point because it’s been 20 years, but the patent system is a little broken.)

      1. Are you based out of Shanghai, or does EspressIf support working remotely?

        The .a lib files in github all seem to be empty. Are these files not used or are they binary blobs that EspressIf provides for the tcp/ip stack, freertos, etc.?

        What are you working on now, either for the esp8266 web radio, or project-wise?

        1. I’ve used the Espressif job as an excuse to move to Shanghai. No idea on working remotely, I know they don’t mind it for small amounts of time but I’m unsure how good an idea it would be to never come in to the company ever.

          The .a files all have data in them (not in text format, so Github doesn’t show it in the preview). They indeed contain the SDK libraries.

    1. I made my own boombox which currently only has Bluetooth input; but I’d rather set the ESP8266 as something that can advertise itself as an audio sink in the wifi network, much like some current TVs will do. And then receive an audio stream from Spotify, Google Play Music, maybe even YouTube.

      Is this even feasible? Can anyone point me in the right direction as to which protocol/service the ESP8266 should use to announce itself on the network?

  2. Does the ESP8266 also output the clock signal for the I2S DAC? I think the ES9023 was mentioned in the README which requires a clock signal of N times the sampling frequency on its MCLK pin.

    1. Ah, now there’s a subtlety. As far as I can tell it doesn’t, but the ES9023 is unusual in that MCLK doesn’t have to be derived from the same clock as BCLK – it supports a special asynchronous mode where MCLK is supplied by a different clock source, and I think you can get modules with an ES9023 and an onboard clock. This is not true of DAC chips in general. Most expect MCLK and BCLK to come from the same clock source and MCLK to be an exact multiple of the sample rate.

  3. Indeed, you can attach an external oscillator to this DAC but it is hard to find a common frequency for 44.1 and 48 kHz, for example. I remember that a pic32 (mx220fb32 and other MCUs with an I2S interface are able to do this) was able to output the clock signal (N*fs where N=128,256,…). It would be really useful to have a clock output directly from the ESP8266. Maybe one of the PWM outputs can be used…
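A quick Python check of the arithmetic behind that "hard to find a common frequency" remark. The multiplier list extends the "N=128,256,…" from the comment to 512 and 1024, which is an assumption; check the DAC datasheet for the ratios it actually accepts.

```python
from math import gcd

MULTIPLIERS = (128, 256, 512, 1024)  # "N=128,256,…"; later values assumed


def mclk_options(fs: int) -> dict:
    """Candidate MCLK frequencies (Hz) for sample rate `fs`, keyed by N."""
    return {n: n * fs for n in MULTIPLIERS}


def shared_mclks(fs_a: int, fs_b: int) -> list:
    """MCLK frequencies that are a listed multiple of both sample rates."""
    return sorted(set(mclk_options(fs_a).values())
                  & set(mclk_options(fs_b).values()))


def lowest_common_multiple(a: int, b: int) -> int:
    return a * b // gcd(a, b)


if __name__ == "__main__":
    print(mclk_options(44_100)[256], mclk_options(48_000)[256])
    print(shared_mclks(44_100, 48_000))
    print(lowest_common_multiple(44_100, 48_000))
```

The smallest frequency that divides evenly into both rates is 7.056 MHz, which is 160 × 44.1 kHz but 147 × 48 kHz. Since 147 is odd, no power-of-two multiple of 128 lines up on both sides, so a synchronous design usually needs two oscillators or a PLL.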

  4. I have some questions:
    1. Has anyone else thought about sending PCM/DSD (1-bit sigma-delta modulated data) via wifi and using the ESP8266 as a gate driver for the H-bridge of a class D amplifier? (PCM to PWM or DSD low pass filter PWM)
    Why would or wouldn’t this work?
    2. Is MP3 necessary because the ESP8266’s data rate is too slow? I bet that 99% of the users would not hear the difference between a 1-bit delta modulated 48 kHz stream and a 16-bit PCM DAC.

    In most cases I don’t care about a full player inside the ESP8266, but rather a device to stream audio to and play it over wifi on a decent class D amplifier.

  6. Hi all ,
    I am working with the ESP8266. My idea is to send audio files through the ESP8266 to an MCU; in my case the MCU is a TM4C123.
    Can anyone help me with how to do this, please?
    Thanks in advance
    Rohit

  7. Hey, is there any way to compile the code with Arduino? I’m glad that I can manage Arduino on my Windows machine, but I don’t understand anything about the makefile etc. I already have my ESP8266 ESP-01 with the additional SPI RAM installed, and all the wiring is done; now I’m only lacking the software. I want to use the PWM channel as the sound output.
