Audio Reactive LED Strips Are Hard

Back in 2017, Hackaday featured an audio reactive LED strip project from [Scott Lawson] that has, over the years, become an extremely popular choice for the party animals among us. We’re fascinated to read his retrospective analysis of the project, in which he looks in detail at how it works and explains why, for all its success, he’s still not satisfied with it.

Sound-to-light systems have been a staple of electronics for many decades, and have progressed from simple volume-based flashers and sequencers to complex DSP-driven affairs like this project. It’s particularly interesting to be reminded that designing such a system means interfacing with human perception rather than just making a pretty light show, and in that context it becomes more important to understand how humans perceive sound and light than to simply dump a visualization to the LEDs. We get an introduction to some of the techniques used in speech recognition, because our brains are optimized to recognize activity in the speech frequency range, and to how humans register light intensity.
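
On the light-intensity side, the standard trick is gamma correction: perceived brightness is a strongly nonlinear function of LED drive level, so linearly fading values look wrong until they are remapped. A minimal Python sketch, assuming an 8-bit LED driver (the gamma of 2.2 is a common rule of thumb, not a figure taken from the write-up):

    import numpy as np

    def gamma_correct(levels, gamma=2.2):
        """Remap linear 0-255 brightness so the steps look perceptually even."""
        normalized = np.asarray(levels, dtype=float) / 255.0
        return np.round(255.0 * normalized ** gamma).astype(np.uint8)

    # Half of the linear range only drives the LED at about 22% power,
    # because perceived brightness rises much faster than emitted power.
    print(gamma_correct([0, 128, 255]))   # -> [  0  56 255]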

For all this sophistication and the impressive results it delivers, though, he’s not ready to call it complete. Making it work well with all musical genres is a challenge, as is capturing that elusive human foot-tapping factor. He talks about training a neural network on accelerometer data from people listening to music, which can only be described as an exciting prospect. We genuinely look forward to seeing future versions of this project. Meanwhile, if you’re curious, you can head back to 2017 and see our original coverage.

16 thoughts on “Audio Reactive LED Strips Are Hard”

  1. I’ve usually been disappointed by automatic music reactive lights, but maybe there is hope for the near future. Have robust tempo detection and on-the-beat prediction become easier in the LLM age? Obviously, the processing would have to be done locally. My iPhone claims to have an “AI chip,” so the “act locally” part seems doable.

    1. I don’t think LLMs are particularly suited for realtime music analysis. A transformer can be trained on audio, but at inference time it likes to have the whole song as input. So it might be more useful for pre-generating a light show to go with a song than for instantly reacting to whatever is currently playing.

      But yes, cellphones have gotten good at low-power local sound processing. Think of the “wake word” use case: Your phone is probably locally processing every sound around you right now, listening for someone to say “Ok Google” or “Siri…”

      1. I wonder how many Europeans (and Russians/Chinese/Japanese and so forth) use Google/Siri et cetera.
        I see Yanks use it a lot, but never see Brits/EU people use it; then again, me not seeing it does not mean they don’t use it, of course.
        And if they do use it, do they use it on their phones, I wonder?

    2. Unfortunately, LLMs are bad at math. I’ve also had numerous failures when asking an LLM to name the tempo of a song: it gives incorrect answers, and there are computationally far more efficient ways to find the BPM anyway, as sketched below.
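
      One cheap classic is autocorrelating an onset-energy envelope. A minimal sketch, assuming mono float samples (the frame size and the 60-180 BPM search range are arbitrary choices):

        import numpy as np

        def estimate_bpm(samples, sample_rate=44100, frame=1024):
            """Rough tempo estimate from an onset-energy envelope."""
            # Energy per frame, keeping only increases as a crude onset signal.
            n = len(samples) // frame
            energy = np.square(samples[:n * frame]).reshape(n, frame).sum(axis=1)
            onsets = np.maximum(np.diff(energy), 0.0)
            # Autocorrelate and pick the strongest lag in the 60-180 BPM range.
            ac = np.correlate(onsets, onsets, mode="full")[len(onsets) - 1:]
            fps = sample_rate / frame                    # envelope frames per second
            lags = np.arange(int(fps * 60 / 180), int(fps * 60 / 60) + 1)
            return 60.0 * fps / lags[np.argmax(ac[lags])]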

      1. I think that if somebody says “neural network trained” in such a context they do NOT mean an LLM.
        So that leaves the question of whether a modern phone with AI enhancements is good at general neural networks. I would think so, theoretically; they use it for image manipulation too, not just language. But how easy is it for developers to interface with such hardware?

    1. The things that make music pleasing can be complemented by the things that make visuals pleasing.

      This is true even though they are talking to different senses, just like texture affects flavor. Get it wrong and they fight; get it right and they enhance each other. Music visualization, when done right, is a bridge between listening and seeing that can deepen connection, clarify structure, and invite new meaning.

      Have you ever watched a music video that really complemented the song? That’s just a bunch of tiny little lights getting brighter and darker at the right times… So, how few “pixels” can that get down to and still convey feeling?
      Concert lighting crews seem to be able to get it down to less than a dozen sometimes, and even then it still has a positive impact.

    2. If that were true, dancing wouldn’t ‘work’ as something watchable.
      Gyno row would be empty.

      IMHO it’s the geeks involved; have the project participants dance and tell me it’s better.

  2. I actually found a really good visualizer in the Clementine music player: the Psychadelic Visualizer. I had a look at the source code and implemented a general-purpose app for desktop audio that sends RGB color data to an LED strip and/or DMX lighting equipment.

    It works by doing an FFT on the audio in small chunks: low bass frequencies get summed into the red color channel, mids into green, and high-frequency content like claps or snares gets transformed into blue. It works amazingly well for EDM and other similar genres. I also added a lot of filters to adjust how harshly or quickly colors change, and other filters to generate more change from lighter music. The gist of the mapping is sketched below.

    Find it on my GitLab at https://gitlab.com/Romanizer/dmxvis (Windows serial port support works, but is not as low-latency as on Linux, so 144 Hz updates to the LED strip might not work properly, among other things). It’s also compatible with OpenRGB (it can sync to your RAM/GPU/mainboard lights).
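
    A minimal, hypothetical Python sketch of that FFT-to-RGB idea (not the actual dmxvis code; the band edges and the Hann window are assumptions):

      import numpy as np

      def chunk_to_rgb(chunk, sample_rate=44100):
          """Map one mono audio chunk to (R, G, B): bass->red, mids->green, highs->blue."""
          windowed = chunk * np.hanning(len(chunk))          # reduce spectral leakage
          spectrum = np.abs(np.fft.rfft(windowed))
          freqs = np.fft.rfftfreq(len(chunk), d=1.0 / sample_rate)
          bands = [(20, 250), (250, 2000), (2000, 12000)]    # rough bass/mid/high split
          levels = [spectrum[(freqs >= lo) & (freqs < hi)].sum() for lo, hi in bands]
          peak = max(levels) or 1.0                          # avoid divide-by-zero on silence
          return tuple(int(255 * v / peak) for v in levels)

    The “how harshly colors change” filters then boil down to something like exponential smoothing of each channel: color = alpha * new + (1 - alpha) * old.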
