WS2811 can be addressed at 800kHz using an 8MHz clock


Timing is everything and that’s why most communication protocols require a very accurate clock source. The WS2811 LED strip controllers are no different. But [Danny] figured out a way to drive them reliably with an 8MHz clock source.

The WS2811 has become one of the most popular controllers for RGB pixels and strips alike. We’ve seen several hacks used to address them, including the 16MHz AVR technique that inspired [Danny] to take on this project. He planned to use that library, but the 25-day shipping time for a 16MHz crystal drove him to invent a way to use the internal oscillator instead.

The gist of the hack is that he wrote assembly code to handle pairs of binary bit values. With a code block for each of the four possible combinations in hand, he had to find a way to craft the conditional jumps to preserve accurate timing. After hitting the wall trying to solve this puzzle by hand, he wrote a C++ program to solve it for him. The proof is in this video, which shows one chip driving multiple Larson scanners on a single strip.
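
To get a feel for how tight this is: an 800kHz bit lasts 1.25µs, which is exactly 10 clock cycles at 8MHz, so each pair of bits has to take exactly 20 cycles, jumps included. The sketch below just works through that arithmetic. The high-time targets are assumptions based on commonly quoted WS2811 figures, not numbers from [Danny]'s write-up, and the function names are made up for illustration.

```python
# Cycle budget for bit-banging an 800kHz WS2811 stream from an 8MHz AVR.
# ASSUMPTION: the high-time targets below are commonly quoted WS2811
# figures for 800kHz mode; check them against your own datasheet.

F_CPU_HZ = 8_000_000      # internal oscillator
BIT_RATE_HZ = 800_000     # WS2811 high-speed mode
T0H_NS = 250              # assumed high time for a 0 bit
T1H_NS = 600              # assumed high time for a 1 bit


def cycles_per_bit(f_cpu_hz=F_CPU_HZ, bit_rate_hz=BIT_RATE_HZ):
    """Whole CPU cycles available per bit; raises if the clock does not divide evenly."""
    cycles, remainder = divmod(f_cpu_hz, bit_rate_hz)
    if remainder:
        raise ValueError("clock is not an integer multiple of the bit rate")
    return cycles


def high_cycles(high_ns, f_cpu_hz=F_CPU_HZ):
    """Nearest whole number of CPU cycles to hold the data line high."""
    return round(high_ns * f_cpu_hz / 1_000_000_000)


def waveform(bits, f_cpu_hz=F_CPU_HZ):
    """Per-cycle line levels ('1' high, '0' low) for a bit string such as '01'."""
    total = cycles_per_bit(f_cpu_hz)
    out = []
    for b in bits:
        high = high_cycles(T1H_NS if b == "1" else T0H_NS, f_cpu_hz)
        out.append("1" * high + "0" * (total - high))
    return "".join(out)


if __name__ == "__main__":
    print(cycles_per_bit())
    for pair in ("00", "01", "10", "11"):
        print(pair, waveform(pair))
```

Under these assumptions a 0 bit is high for 2 cycles and a 1 bit for 5, which leaves very few spare cycles for loading data, testing bits and branching. That is the puzzle the four hand-crafted code blocks have to solve.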

35 thoughts on “WS2811 can be addressed at 800kHz using an 8MHz clock”

  1. That’s a nice writeup.

    I’m guessing you probably didn’t realize Adafruit’s Neopixel library already supports WS2811 strips on both 8 & 16 MHz AVR-based Arduino? They use pretty much the same approach. But they didn’t write such a nice wiki page with the technical details….

    I also saw your page about parallel output. Just so you know, the OctoWS2811 library (which I wrote and was mentioned yesterday here on HaD) does this, with nearly zero CPU overhead, using the DMA controller on a 32-bit ARM part. OctoWS2811 uses the transposed format you described. My ARM-based code probably won’t help at all on AVR, since there’s no DMA controller… but I did write a nice Processing program to stream video simultaneously to any number of boards in that transposed format. Even if you don’t do streaming video, that code might help you get some data into the transposed format.

    1. Actually I believe I have beat both of you :)

      I have been working on a lws2812 library for the AVR for a while, after I was dissatisfied with the existing solutions. Most implementations rely on contrived C++ template libraries (fastspi lib) or on unrolled loops. Both lead to a large memory footprint. Furthermore, they often use instructions which are not available on all AVRs (like MUL or CBI/SBI).

      You can find my implementation here:

      Down to 8 MHz all routines are looped with a memory footprint below 50 bytes. As a special bonus, I managed to implement a 4 MHz version, which unfortunately has to rely on loop unrolling.

      Yesterday I also submitted ws2812 support to little-wire, which allows controlling WS2812 strips via USB on a humble ATtiny85.

      Experimental v1.2 here:

      1. I took a look at the ws2812_sendarray_mask function under the ws2812_8MHz ifdef. The byte timing looks correct to me and is indeed nice and compact. The while loop introduces inter-byte delays, though, and that is outside spec. As [Alan Burlison] wrote in his 16MHz write-up, that may work on some LED strips and fail on others.

        1. Hi,
          thank you for your feedback! According to my experiments, the critical part of the timing is between the rising and the falling edge. I assume what the controller does is trigger a monoflop at the rising edge and then sample the input after ~0.5µs. The timing between the falling and the rising edge triggers the reset condition, which seems to happen anywhere between 5µs and 50µs. So that means there is some tolerance in the timing of the latter part.

          It would be easily possible to achieve perfect timing for the 8MHz version with loop unrolling, like in the 4MHz version. But that comes at the expense of a lot of code space. So far I have not seen any issues, but I have only tested on short strips. I would be interested in feedback if there are any issues.

          Achieving perfect timing for the 4MHz version is near impossible since there is only one cycle per bit left – and at some point you have to do that pesky 2-cycle jump…

    2. I just took a better look at Neopixel. [Paint Your Dragon] managed to do the same thing in less code. He just unrolled the bit loop and that’s all. It doesn’t really need the extensive write-up, because it’s so simple.

      Double-darn you Lady Ada and your genius friends!

      (who would have thought that somebody could utter that sentence in the 21st century).

    3. I guess nobody ever told me this was supposed to be difficult. :-)

      I had my WS2811s working on an 8MHz ATtiny85 about an hour after my LED strip arrived. No assembly language needed apart from a NOP.

      I’m still using the same code today (although it now supports 16MHz as well…)

      Demo sketch:

      I wrote an assembly language version because I wanted to play back an 8kHz sound sample while updating the LEDs (a bit trickier at 8MHz).

      1. Your bit-timing seems alright, but there’s a huge gap between bits with the “b=b+b”, “if (*b&0x80)…” and “while (--i != 0)” and all. Maybe this works for your LEDs, but the timing is not to spec and this may (and I guess: will) fail for other instances.

        [Alan Burlison]’s featured write-up dedicates some text to why you’d want to stay in spec with the timings.

        1. AFAIK the datasheet doesn’t say anything about time between bits. The chip looks for a rising edge at the start of a bit and starts counting from there. Time since previous bit doesn’t seem critical. In testing I added a for() loop counting up to 7 between bits and the chip was still happy.

          I’ve used this code on long LED strips, some surface mount WS2811s, even a WS2811 in a DIP8 package. It’s always worked.

          1. The chip *has* to be able to detect a rising edge and use it to process a bit – it wouldn’t work otherwise because the first bit of the 24 _always_ arrives at an unknown time.

            If the chip has that ability then it doesn’t make much sense for it to suddenly apply strict timing to all the subsequent bits. The easiest thing to do is use the same logic as for the first bit.

  2. Neat way of graphing and developing the code, enjoyed seeing it. Though it’s 4x Larson scanners in the demo, with the strip in a semicircle it reminds me of Close Encounters.

      1. Actually, 20-25 days appears to be somewhat of a standard maximum delivery from a Hong Kong seller on ebay to mainland Europe. Normally the goods arrive a lot sooner, but on this occasion it actually took weeks for the oscillators to arrive. Which was a good thing because really, I just needed a good excuse.

        But now I have 10 unused oscillators lying around…

        1. 25 days? I ordered something from Digikey a couple of months ago and paid for FedEx; the damn thing spent a month in some sorting facility in Memphis. Back to Mouser, who at least will ship to New Zealand from Hong Kong: orders arrive in 2-3 days.

          1. Fungus: What you are doing in ws2812.h is the definition of loop unrolling. Your code is huge and may break with any change of compiler conditions. This is a pure hack.

            But oh well, look where we are… :)

          2. The only point I was trying to make is that driving a WS2811 with an 8MHz AVR chip isn’t particularly difficult. When it came up here as some sort of miracle it made me raise an eyebrow.

  3. I have driven the WS2812 with a UART at 4Mbps; you can also use SPI at 3.2Mbps.
    I drove the WS2812 from an 8-pin PIC. The PIC can hold a framebuffer of 68 LEDs in 24-bit colour or 100 LEDs in 16-bit colour with its 256 bytes of RAM; RAM is the only limit.
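
For anyone curious how the SPI trick mentioned in the last comment might work: at 3.2Mbps there are exactly four SPI bit times per 800kHz LED bit, so each LED bit can be expanded into a 4-bit pattern and clocked out by the SPI hardware. The sketch below is only an illustration of that idea. The two patterns (1000 for a 0, 1100 for a 1) and the function names are assumptions, not details taken from the commenter's code.

```python
# Expanding WS2812 data into an SPI bit stream clocked at 3.2Mbps.
# ASSUMPTION: each LED bit becomes 4 SPI bits, 0b1000 for a 0 and
# 0b1100 for a 1. These patterns are illustrative only.

SPI_ZERO = 0b1000   # assumed: line high for 1 of 4 SPI bit times
SPI_ONE = 0b1100    # assumed: line high for 2 of 4 SPI bit times


def encode_byte(value):
    """Return the 4 SPI bytes that carry one LED data byte, MSB first."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    out = bytearray()
    for shift in (6, 4, 2, 0):  # two LED bits go into each SPI byte
        hi = SPI_ONE if value & (2 << shift) else SPI_ZERO
        lo = SPI_ONE if value & (1 << shift) else SPI_ZERO
        out.append((hi << 4) | lo)
    return bytes(out)


def encode_frame(data):
    """Expand a whole buffer of LED bytes into SPI bytes."""
    return b"".join(encode_byte(b) for b in data)


if __name__ == "__main__":
    print(encode_byte(0xA5).hex())
    print(len(encode_frame(bytes(3))))  # SPI bytes needed for one 24-bit pixel
```

Note that expanding a whole frame up front quadruples its size, so on a part with 256 bytes of RAM the expansion would presumably have to happen one byte at a time while sending.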
