Running Way More LED Strips On A Raspberry Pi With DMA

October 12, 2020

The Raspberry Pi is a powerful computer in a compact form factor, making it highly useful for all manner of projects. However, it lacks some of the IO capabilities you might find on a common microcontroller. This is most apparent when it comes to running addressable LED strings. Normally, this is done using the Pi’s PWM or audio output, and is limited to just a couple of short strings. However, [Jeremy P Bentham] has found a way to leverage the Pi’s hardware to overcome these limitations.

The trick is using the Raspberry Pi’s little-documented Secondary Memory Interface. The SMI hardware allows the Pi to shift out data to 8 or 16 I/O pins in parallel using direct memory access (DMA), with fast and accurate timing. This makes it perfect for generating signals such as those used by WS2812B LEDs, also known as NeoPixels.

With [Jeremy]’s code and the right supporting hardware, it’s possible to run up to 16 LED strips of arbitrary length from the Raspberry Pi. [Jeremy] does a great job outlining how it all works, covering everything from the data format used by WS2812B LEDs to the way cache needs to be handled to avoid garbled data. The hack works on all Pis, from the humble Pi Zero to the powerful Pi 4. Thanks to using DMA, the technique doesn’t overload the CPU, so performance should be good across the board.
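To make the parallel-output idea concrete, here is a rough sketch of how pixel data for several strips could be packed into 16-bit words, one word per time slot, with bit n of each word driving strip n. This is an illustration, not [Jeremy]'s actual code: the function name `encode_parallel` is made up, and the encoding of three equal slots per LED bit (high, data, low) is an assumption.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]

def encode_parallel(strips: List[List[RGB]]) -> List[int]:
    """Pack pixel data for up to 16 strips into 16-bit output words.

    Bit n of every word drives strip n. Each WS2812B data bit is assumed
    to occupy three equal time slots: always high, the data bit, always low.
    """
    if not 1 <= len(strips) <= 16:
        raise ValueError("expected 1 to 16 strips")
    length = max(len(s) for s in strips)
    all_on = (1 << len(strips)) - 1  # first slot: every used line high
    words: List[int] = []
    for i in range(length):
        # WS2812B expects green, red, blue, most significant bit first.
        grb = []
        for s in strips:
            r, g, b = s[i] if i < len(s) else (0, 0, 0)  # pad short strips with black
            grb.append((g << 16) | (r << 8) | b)
        for bit in range(23, -1, -1):
            data = 0
            for ch, value in enumerate(grb):
                data |= ((value >> bit) & 1) << ch
            words.extend((all_on, data, 0))
    return words

# One red pixel on strip 0 and one blue pixel on strip 1.
words = encode_parallel([[(255, 0, 0)], [(0, 0, 255)]])
print(len(words))  # 72 words: 24 bits x 3 slots
```

A buffer like this is what a DMA engine would stream to the pins at a fixed rate; the CPU only has to fill it once per frame.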

Of course, there are other ways to drive a ton of LEDs; we’ve seen 20,000 running on an ESP32, for example.

[Thanks to Petiepooo for the tip!]

18 thoughts on “Running Way More LED Strips On A Raspberry Pi With DMA”

Uriel Guy says:

October 12, 2020 at 8:16 am

In case someone wants to run 27 parallel strips, I have a kernel module for Raspberry Pi Zero to do that. Although this is a much cleaner, less violent solution.
https://github.com/UrielGuy/raspi_ws2812

1. Petiepooo says:
  
  October 12, 2020 at 2:21 pm
  
  Freezing the kernel to twiddle GPIO pins works, but as you mention in your repo, chains would need to be short, and other tasks would suffer during the freeze.
  The hope with this is that a mid-level Raspberry Pi, like a 3B, can work all the pieces at once: run FPP to generate E1.31 while also driving up to 16 channels of 4 universes (680 RGB LEDs). Add a protoboard with a couple of SN74HCT245N buffers to protect GPIO and boost to 5V, and a power distribution network for powering the LEDs, and you’ve got all the makings of a show. There is no standalone controller for +100k pixels that could beat it on price.
  Mind you, there’s a little bit of work to be done yet. Someone needs to distill his research work and add the E1.31 receiver portion…
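For reference, the pixel figures above can be checked with a few lines of arithmetic. This is only a sketch: it assumes a 512-slot DMX/E1.31 universe and three slots (R, G, B) per pixel, and the name `pixel_budget` is made up for the example.

```python
DMX_SLOTS_PER_UNIVERSE = 512  # data slots in one DMX/E1.31 universe
SLOTS_PER_RGB_PIXEL = 3       # one slot each for red, green and blue

def pixel_budget(channels: int = 16, universes_per_channel: int = 4):
    """Return (pixels per universe, pixels per channel, total pixels)."""
    pixels_per_universe = DMX_SLOTS_PER_UNIVERSE // SLOTS_PER_RGB_PIXEL
    per_channel = universes_per_channel * pixels_per_universe
    return pixels_per_universe, per_channel, channels * per_channel

print(pixel_budget())  # (170, 680, 10880)
```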
  
2. Nick says:
  
  March 20, 2021 at 9:35 am
  
  Has anyone ported this kernel module to the RPi 2 and 3?
  
x86daddy says:

October 12, 2020 at 8:27 am

Does the existence of this DMA mode imply that hooking up an actual hard drive (not via USB) would be possible?

1. limroh says:
  
  October 12, 2020 at 8:50 am
  
  Seems someone was already working on that: https://github.com/fenlogic/IDE_trial
  
paulvdh says:

October 12, 2020 at 9:21 am

Gosh, someone managed to reverse engineer some obscure interface of this piece of fruit.

Beaglebone was doing this in 2013:
https://hackaday.com/2013/09/13/a-23-feet-tall-pyramid-with-0-31-mile-of-led-strips/

Beaglebone Black is getting a bit old and is slow by today’s standards, but with its PRUs it can still pack a punch in areas where it matters. Programming the PRUs was a complication in the beginning, but a few years ago (I think in 2017) GCC was ported to them, which should make development cycles easier.

Upon reading the above again, it seems the PRUs weren’t even used; instead, the Beaglebones got some help from some Teensys.

1. willmore says:
  
  October 12, 2020 at 9:47 am
  
  I was going to mention the history of the various Teensy boards to do a similar trick, but the BB seems to do it even better.
  
2. X says:
  
  October 12, 2020 at 12:28 pm
  
  The VAX and the PDP-11 had DRV11 high-speed parallel interfaces back in the 1970s.
  
3. Grawp says:
  
  October 13, 2020 at 1:58 am
  
  Could recommend some non-obscure board as a replacement for RPi with at least 4GBs RAM and roughly the same power in the same format?
  Regarding the power and RAM, I only know about x86_64 boards, and they won’t fit into the small RC helicopters and planes I’m interested in putting such a board into :(
  
  1. Grawp says:
    
    October 13, 2020 at 2:02 am
    
    Typo. Message should have started with “Could you …”.
    
  2. willmore says:
    
    October 13, 2020 at 10:00 am
    
    The Odroid boards are better built and well supported. The C4 looks to be what you want. If you want a more rugged machine with even more CPU, the N2 is a good choice.
    
    1. Grawp says:
      
      October 14, 2020 at 4:18 am
      
      Last time I had an Odroid board it didn’t have support in upstream kernel. I was locked to an archaic version :(
      
      1. willmore says:
        
        October 14, 2020 at 7:43 am
        
        Which board was that?
        
  3. Conor Stewart says:
    
    June 20, 2021 at 3:03 pm
    
    Is there a specific reason you want an SBC on the aircraft? You could have a microcontroller on the craft communicating with a ground station that has an SBC. You can still stream video and information back to do any kind of object detection, and just have a small, power-efficient microcontroller on board.
    
    Or if you need an SBC, I’m pretty sure Banana Pi or another such company makes something comparable to a Pi 4 but in a Pi Zero form factor, so that might be worth looking at.
    
Drone says:

October 12, 2020 at 8:03 pm

So I skimmed all four of [Jeremy P Bentham’s] posts on this RPi SMI/DMA topic. Nowhere, it seems, does he say how fast he can actually output bits of data, perhaps the most important piece of information on the subject. However, I did see on a GitHub page (link above in [limroh’s] post) that someone was partially successful in making an IDE interface using an RPi’s SMI port that worked at 44 Mbytes/sec.

The subject interests me because often you will see SoCs touting system clock speeds in the GHz range, so you would think you could get some pretty fast GPIO, but in the end you discover you can’t. It turns out that sometimes all the fast stuff on the chip is wrapped up by some proprietary bus architecture that acts as a bottleneck. Case in point is the ARM Advanced Microcontroller Bus Architecture (a.k.a. AMBA).[1] You would think you could just go to a datasheet or app note to know all about this, but nooooo, not with a Broadcom product, it’s a “secret”. (Broadcom makes the SoC in the RPi.)

1. ARM Advanced Microcontroller Bus Architecture (AMBA)

https://en.wikipedia.org/wiki/Advanced_Microcontroller_Bus_Architecture

1. Petiepooo says:
  
  October 13, 2020 at 6:31 am
  
  He talks about the timing of the WS2812 data stream in the linked post and how he’s able to send one pulse cycle every 1.2µs, either as 0.8µs on and 0.4µs off or vice versa. That’s slightly faster than the 1.25µs (800kHz) that is called for in the LED’s spec, as it means he’s running it at 833kHz. It may still be possible to adjust the SMI’s clock to meet the official spec, as I think there are still things to be discovered about the peripheral. However, the pulse lengths would not exactly match the spec, even though it appears to work reliably.
  What’s missing is how long the LED chains can be, as there must be limits to the size of the DMA transfers that can be set up, or to the maximum latency that still keeps the DMA engine fed if user space needs to service it more than once per data cycle. Ultimately, that may affect the max frame rate or chain length on less powerful Pis like the Zero, but I have high hopes for the 3 or 4.
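The timing figures above can be worked through in a few lines. This is a back-of-envelope sketch under stated assumptions: each bit is taken to be three 0.4µs slots (inferred from the 0.8/0.4 split), the reset gap is taken as 50µs (some newer WS2812B revisions reportedly need considerably longer), and DMA or memory limits are ignored entirely. The function names are made up for the example.

```python
SLOT_NS = 400        # assumed length of one output slot, in nanoseconds
SLOTS_PER_BIT = 3    # assumed: 0.8us + 0.4us split into three equal slots
BITS_PER_PIXEL = 24  # 8 bits each for green, red and blue
RESET_NS = 50_000    # assumed reset/latch gap between frames

def bit_rate_khz() -> float:
    """Bit rate implied by the slot timing."""
    return 1e6 / (SLOT_NS * SLOTS_PER_BIT)

def frame_time_us(pixels: int) -> float:
    """Time to send one frame to a chain of the given length."""
    return (pixels * BITS_PER_PIXEL * SLOT_NS * SLOTS_PER_BIT + RESET_NS) / 1000

def max_frames_per_second(pixels: int) -> float:
    """Upper bound on refresh rate set by the wire timing alone."""
    return 1e6 / frame_time_us(pixels)

print(round(bit_rate_khz(), 1))                # 833.3
print(frame_time_us(680))                      # 19634.0
print(round(max_frames_per_second(680), 1))    # 50.9
```

Because all strips are clocked out in parallel, the frame time depends on the length of the longest chain rather than on how many chains are attached.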
  
Sword says:

October 12, 2020 at 9:24 pm

One of the big issues with Ambilight Pi projects is the need to use an Arduino. I wonder if this can be combined with Ambilight capability for a simple project.

1. Petiepooo says:
  
  October 13, 2020 at 6:17 am
  
  Have you seen https://www.instructables.com/DIY-Ambilight-With-Raspberry-Pi-and-NO-Arduino-Wor/ ? It should also be possible to use the https://github.com/jgarff/rpi_ws281x library and the Pi’s PWM or PCM peripheral for non-SPI LED strips, provided Hyperion gains support for that.
  
