Using Audio Hardware To Drive Neopixels Super Fast

Here’s the thing about running large strings of Neopixels—also known as WS2812 addressable LEDs. You need to truck out a ton of data, and fast. There are a dozen different libraries out there to drive them already, but [Zorxx] decided to strike out with a new technique—using I2S hardware to get the job done. 

Fast!

Microcontrollers traditionally use I2S interfaces to output digital audio. However, I2S also just happens to be perfect for driving tons of addressable LEDs. At the lowest level, I2S hardware is really just flipping a serial data line very quickly, with a clock line and a word select line for good measure. If, instead of sound, you pipe a data stream for addressable LEDs to the I2S hardware, it will clock that data out just the same!
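
As a rough illustration of what such a data stream might look like, here’s a minimal Python sketch. It assumes the common trick of spending three I2S bits on every LED bit (110 for a one, 100 for a zero), which may or may not be exactly what [Zorxx]’s code does. The function name `encode_ws2812` is made up for this example; the green-red-blue byte order and MSB-first bit order are how WS2812 parts expect their data.

```python
def encode_ws2812(pixels):
    """Expand (r, g, b) tuples into the raw bits an I2S data line would send.

    Assumption: three line bits per LED bit (1 -> 110, 0 -> 100).
    """
    ONE, ZERO = 0b110, 0b100
    bits = 0
    nbits = 0
    for r, g, b in pixels:
        for byte in (g, r, b):            # WS2812 wants green, then red, then blue
            for i in range(7, -1, -1):    # most significant bit first
                symbol = ONE if (byte >> i) & 1 else ZERO
                bits = (bits << 3) | symbol
                nbits += 3
    # 24 LED bits become 72 line bits, so every pixel is exactly 9 bytes
    return bits.to_bytes(nbits // 8, "big")


if __name__ == "__main__":
    # One pure-red pixel: green and blue bytes are all "zero" symbols
    print(encode_ws2812([(255, 0, 0)]).hex())
```

On real hardware you would still need to follow the pixel data with enough zero bytes to hold the line low for the latch period, and possibly reorder bytes to match how the I2S peripheral packs its sample words.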

[Zorxx] figured that with an ESP32 trucking out I2S data at a rate of 2.6 megabits per second, it would be possible to update a string of 256 pixels in just 7.3 milliseconds. In other words, you could have a 16 by 16 grid updating at over 130 frames per second. Step up to 512 LEDs, and you can still run at almost 70 fps.
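
Those numbers are easy to sanity-check. The sketch below assumes three I2S bits per LED bit and a 300 microsecond latch gap at the end of each frame; neither figure is stated in the write-up, so treat the defaults (and the function name `frame_time_s`) as illustrative rather than as [Zorxx]’s actual parameters.

```python
def frame_time_s(num_pixels,
                 i2s_bit_rate=2.6e6,        # I2S bit rate quoted in the article
                 i2s_bits_per_led_bit=3,    # assumption: 3 line bits per LED bit
                 latch_s=300e-6):           # assumption: end-of-frame latch gap
    """Estimate the time to push one full frame down a single string."""
    line_bits = num_pixels * 24 * i2s_bits_per_led_bit  # 24 color bits per pixel
    return line_bits / i2s_bit_rate + latch_s


if __name__ == "__main__":
    for n in (256, 512):
        t = frame_time_s(n)
        print(f"{n} pixels: {t * 1e3:.2f} ms per frame, {1 / t:.0f} fps")
```

With these assumptions the estimate lands in the same ballpark as the quoted figures. Older parts are often run with a shorter latch gap of around 50 microseconds, which shaves a little off each frame.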

There are some tricks to pulling this off, but it’s nothing you can’t figure out just by looking at the spec sheets for the WS2812B and the ESP32. Or, indeed, [Zorxx’s] helpful GitHub page. We’ve featured some other unorthodox methods of driving these LEDs before, too! Meanwhile, if you’ve got your own ideas on how to datablast at ever greater speeds, don’t hesitate to let us know!

17 thoughts on “Using Audio Hardware To Drive Neopixels Super Fast”

  1. Uhm … whut?
    WS2812’s are a fixed time protocol. Either they update or they don’t – there’s no slow or fast – so all the “fps” stuff is exactly the same with any other driver. Even my 6502 updates WS2812’s at the same speed.
    I2S is great to drive them with minimal CPU overhead – but it’ll never be “faster”. Or slower.

    1. Yeah, I call Shenanigans! There are other pixel technologies that allow for much higher data rates, in many cases by using a separate clock line. But the incredibly popular, dominant, and ubiquitous WS281x and its clones is certainly not one of them.

      1. Then you misunderstand WS2812’s. To update a string of WS2812 takes a certain amount of data transfer time and then a certain amount of wait time (both with notoriously tight timing requirements) before they see no more data is coming, and switch to their new data. But after that they sit idle. So with a certain number of neopixels there is a theoretical maximum of FPS you can achieve, but you can always go slower by updating just once every so often.

    2. The fixed time refers to a single LED, and there can be an arbitrary delay between updates of two adjacent LEDs. These delays add up and, together with the total number of LEDs, determine the actual fps. With the I2S circuit streaming the data (with minimal delays between LEDs), the CPU is free to prepare the buffer for the next frame, hence the delay between the last LED of the nth frame and the first of the (n+1)th is minimal too. So, indeed, it is not about the performance of the driver, but an automatic driver like I2S allows faster updates of a whole frame.

    1. Yep,

      IIRC this goes back to cnlohr, who implemented it for ESP8266 10 years ago.
      https://github.com/cnlohr/esp8266ws2812i2s
      I then ported it for NeoPixelBus
      https://github.com/Makuna/NeoPixelBus/wiki/ESP8266-NeoMethods

      I love the I²S device, because it’s a cheap DMA-to-serial engine and great for generating signals.
      The RMT is a bit more powerful, but requires more CPU-interaction to fill the buffers.
      Newer ESP32 also have DMA support for RMT – didn’t try this yet.

      1. cnlohr started this whole path to a solution for ESP8266, as bit-bang just doesn’t work well due to not being able to block WiFi ISRs. That led to the ESP32 solutions.

        NeoPixelBus also supports I2S sending with ESP32 and RMT, this is the library WLED uses.

        The ESP32 supports a parallel LCD driver feature using the I2S hardware, which was pulled out into its own peripheral with the ESP32S3, giving you x8/x16/x24 channels sent at the same time in hardware. Split your strip and truly gain a x8 speed increase.

  2. The WS2812 takes 24 * (1.25us+/-600ns) per pixel + 50us per frame. That’s it. No slower, no faster. The (most) difficult part is DMA. Maybe a little bit faster if you only need to populate the “start” of the frame (since the pixels are entered serially).

  3. I used the I2S hardware of the HLK RM04 (RT5350) as the brain to drive my 2000+ pixel ws2812 costume back in 2014.
    I remember it being tricky in that there was no mono mode, so I had to interleave chunks of bits as “left and right” channels to generate the proper constant stream

  4. shrug – I2S or SPI + DMA for driving neopixels has been a thing for a REALLY long time. Or use an RPI PIO plus DMA, and the second core to almost entirely offload everything to do with calculating colors and streaming out a bitstream

  5. Most modern RGB LEDs (see the 2813 series) have shorter 0 and 1 times compared to their predecessors, giving you a higher potential refresh rate. One common misconception is that each LED updates after its data has been sent to it, and that is not the case. As long as there is a data stream occurring along the string, NONE of the LEDs change state. You need a “latching period”, which varies depending on the exact model – but typically 300us of low will cause ALL of the LEDs in the string to change their states at the exact same time. If this were not the case, we would see “tearing” in a large panel of LEDs.

    With the ESP32-S2 and S3 you can drive multiple pin outputs (parallel) with I2S, so it would be possible to have an 8-pixel swath being updated at the exact same time. Your code would have to break the pixels down for the proper output, but THIS is a way that you can use I2S to dramatically increase the speed of a panel update, since you are updating 8x as many pixels as the common single output method.

  6. WS2812 addressable LEDs operate at a fixed clock speed. The only way to go faster is to drive multiple strings at the same time. That can be done easily with DMA and a timer on most microcontrollers. If the GPIO port is 16 bits, you can drive 16 strings at the same time with very little CPU usage.
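
Several commenters point to driving multiple strings in parallel as the real route to a faster panel refresh. The data shuffling that requires can be sketched in a few lines of Python. This is a hypothetical illustration, not code from any of the libraries mentioned: the function name `interleave_strips` and the three-slot encoding (all pins high, then the data bits, then all pins low) are assumptions, and a real driver would hand the resulting words to DMA feeding a parallel port.

```python
def interleave_strips(strips):
    """Turn per-strip byte buffers into a sequence of parallel port words.

    strips: list of equal-length bytes objects, one per output pin,
            already in the byte order the LEDs expect on the wire.
    Returns three port words per LED bit: all-high, data, all-low.
    """
    if not strips:
        return []
    length = len(strips[0])
    if any(len(s) != length for s in strips):
        raise ValueError("all strips must carry the same number of bytes")

    all_high = (1 << len(strips)) - 1     # one bit per pin, all set
    words = []
    for byte_idx in range(length):
        for bit in range(7, -1, -1):      # most significant bit first
            data = 0
            for pin, strip in enumerate(strips):
                data |= ((strip[byte_idx] >> bit) & 1) << pin
            # Every pin starts high; pins sending a 0 drop early, then all go low
            words += [all_high, data, 0]
    return words


if __name__ == "__main__":
    # Two strips of one byte each: pin 0 sends 0x80, pin 1 sends 0x00
    print(interleave_strips([b"\x80", b"\x00"])[:6])
```

Because every pin shares the same time slots, eight strings take no longer to send than one, which is where the claimed 8x gain would come from.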
