Back in 2017, Hackaday featured an audio reactive LED strip project from [Scott Lawson] that has over the years become an extremely popular choice for the party animals among us. We’re fascinated to read his retrospective analysis of the project, in which he looks at how it works in detail and explains why, for all its success, he’s still not satisfied with it.
Sound-to-light systems have been a staple of electronics for many decades, and have progressed from simple volume-based flashers and sequencers to complex DSP-driven affairs like his project. It’s particularly interesting to be reminded that the problem faced by the designer of such a system involves interfacing with human perception rather than making a pretty light show, and in that context it becomes more important to understand how humans perceive sound and light than to simply dump a visualization to the LEDs. We receive an introduction to some of the techniques used in speech recognition, because our brains are optimized to recognize activity in the speech frequency range, and to how humans register light intensity.
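To make those two perceptual ideas a little more concrete, here is a minimal Python sketch. It is not taken from the project: it assumes the mel scale (a frequency warping widely used in speech processing) as the stand-in for "speech recognition techniques", and a simple power-law gamma curve with an assumed exponent of 2.2 for light intensity. All function names are made up for illustration.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (common O'Shaughnessy formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min: float, f_max: float, n_bands: int) -> list[float]:
    """Return n_bands + 1 band edges in Hz, evenly spaced on the mel scale.

    The bands come out narrow at low frequencies and wide at high ones,
    roughly matching how finely we tell pitches apart.
    """
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / n_bands
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 1)]


def gamma_correct(level: float, gamma: float = 2.2) -> int:
    """Map a perceived brightness in [0, 1] to an 8-bit LED value.

    gamma = 2.2 is an assumed, typical value, not a measured one.
    """
    level = min(max(level, 0.0), 1.0)  # clamp out-of-range input
    return round(255 * level ** gamma)


if __name__ == "__main__":
    # Hypothetical settings: 8 bands between 200 Hz and 8 kHz.
    for edge in mel_band_edges(200.0, 8000.0, 8):
        print(f"{edge:8.1f} Hz")
    print(gamma_correct(0.5))
```

The point of the sketch is only that both mappings are nonlinear: equal steps in perceived pitch or brightness do not correspond to equal steps in Hz or in PWM duty cycle, so a naive linear visualization tends to look wrong.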
For all this sophistication and the impressive results it produces, though, he’s not ready to call it complete. Making it work well with all musical genres is a challenge, as is that elusive human foot-tapping factor. He talks about using a neural network trained using accelerometer data from people listening to music, which can only be described as an exciting prospect. We genuinely look forward to seeing future versions of this project. Meanwhile, if you’re curious, you can head back to 2017 and see our original coverage.

So he gave up?
I’ve usually been disappointed by automatic music reactive lights, but maybe there is hope for the near future. Has robust tempo detection and on-the-beat prediction become easier in the LLM age? Obviously, the processing would have to be done locally. My iPhone claims to have an “AI chip,” so this “act locally” seems doable.
I don’t think LLMs are particularly suited for realtime music analysis. A transformer can be trained on audio, but at inference time it likes to have the whole song as input. So it might be more useful for pre-generating a light show to go with a song rather than instantly reacting to whatever is currently playing.
But yes, cellphones have gotten good at low-power local sound processing. Think of the “wake word” use case: Your phone is probably locally processing every sound around you right now, listening for someone to say “Ok Google” or “Siri…”
WLED on its own or with LedFX does a pretty good job in the audio reactive LED strip world.