[Stanislaw Pusep] has gifted us with the Pianolizer project – an easy-to-use toolkit for music exploration and visualization, an audio spectrum analyzer that helps you turn sounds into piano notes. You can run his toolkit on a variety of devices, from the Raspberry Pi and PCs to any browser-equipped device, smartphones included, and use its note output however your heart desires. To show off his toolkit in action, he set it up on a Raspberry Pi, with Python code taking the note data and sending color information to an LED strip, displaying the notes in real time as he plays them on a MIDI keyboard! He also created a browser version that you can use with a microphone input or an audio file of your choosing, so you only need to open a webpage to play with this toolkit’s capabilities.
He took the time to make sure you can build your own projects with this toolkit’s help, providing usage instructions with command-line and Python examples, and even sharing all the code used in the making of the demonstration video. Thanks to everything he’s shared, you can now add piano note recognition to any project of yours! Pianolizer is a self-contained library implemented in JavaScript and C++ (which in turn compiles to WebAssembly), and the examples show how it can be used from Python or other languages.
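For a taste of what the Python glue code in such a setup can look like, here is a hypothetical sketch of a note-to-color mapping like the one that could drive an LED strip from per-key levels. This is not the published demo code: the hue-by-pitch scheme and every name below are assumptions, the 88 magnitudes would come from Pianolizer’s output, and pushing the colors to actual hardware is left to whatever LED driver library you use.

```python
# Hypothetical note-to-LED-color mapping (not the code from the demo video).
# Input: 88 per-key magnitudes in the range 0..1; output: one RGB tuple per key.
import colorsys

NUM_KEYS = 88

def levels_to_colors(levels):
    """Map per-key magnitudes (0..1) to RGB tuples: hue follows pitch,
    brightness follows how loudly that key is currently sounding."""
    colors = []
    for key, level in enumerate(levels):
        hue = key / NUM_KEYS  # low notes toward red, high notes toward violet
        r, g, b = colorsys.hsv_to_rgb(hue, 1.0, max(0.0, min(1.0, level)))
        colors.append((int(r * 255), int(g * 255), int(b * 255)))
    return colors

if __name__ == "__main__":
    levels = [0.0] * NUM_KEYS
    levels[48] = 1.0                        # pretend only A4 (key index 48) is sounding
    print(levels_to_colors(levels)[48])     # a bright, fully saturated color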
[Stanislaw] also documented the principles behind the code, explaining how the note recognition does its magic in simple terms, yet giving many insights. We are used to the Fast Fourier Transform (FFT) being our go-to approach for spectral analysis, that is, recognizing different frequencies in a stream of data. However, a general-purpose FFT algorithm is not well suited for musical notes, since the intervals between note frequencies become wider as the frequency increases, and you need to do more work to distinguish the notes. In this toolkit, he used a Sliding Discrete Fourier Transform (SDFT) algorithm, and explains how he derived its parameters from musical note frequencies. At the end of the documentation, he also gives you plenty of useful references if you would like to explore this topic further!
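To make the idea a bit more concrete, here is a minimal, illustrative sliding-DFT sketch in Python. This is not Pianolizer’s implementation: the sample rate, the cycles-per-window choice, and all names are assumptions. The point is the O(1)-per-sample update and the per-note window length, which lets each analysis bin land right on a note’s frequency instead of on an evenly spaced FFT grid.

```python
# Minimal per-note sliding DFT sketch (not Pianolizer's actual code).
# Assumptions: 44.1 kHz sample rate, equal temperament with A4 = 440 Hz,
# and a fixed number of signal cycles per analysis window.
import numpy as np

FS = 44100        # sample rate, Hz
CYCLES = 20       # cycles of the note per window (more = narrower band, slower response)

def note_frequency(midi_note):
    """Equal-temperament frequency for a MIDI note number (A4 = 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((midi_note - 69) / 12.0)

class SlidingDFTBin:
    """Tracks the magnitude of one frequency with O(1) work per sample."""
    def __init__(self, freq, fs=FS, cycles=CYCLES):
        self.n = int(round(fs * cycles / freq))    # window length in samples
        self.k = cycles                            # bin index inside that window
        self.twiddle = np.exp(2j * np.pi * self.k / self.n)
        self.buffer = np.zeros(self.n)             # circular buffer of past samples
        self.pos = 0
        self.state = 0 + 0j                        # running DFT bin value

    def process(self, sample):
        oldest = self.buffer[self.pos]
        self.buffer[self.pos] = sample
        self.pos = (self.pos + 1) % self.n
        # Classic sliding-DFT update: add the new sample, drop the oldest,
        # then rotate by one bin's worth of phase.
        self.state = (self.state + sample - oldest) * self.twiddle
        return abs(self.state) / self.n            # normalized magnitude

# One bin per key of an 88-key piano (MIDI notes 21..108 = A0..C8).
bins = [SlidingDFTBin(note_frequency(m)) for m in range(21, 109)]

# Feed one second of a 440 Hz test tone and read the per-note levels
# from the most recent sample.
t = np.arange(FS) / FS
for sample in np.sin(2 * np.pi * 440.0 * t):
    levels = [b.process(sample) for b in bins]
print("loudest key (MIDI):", int(np.argmax(levels)) + 21)   # should print 69 (A4)
```

In pure Python this loop is far too slow for real-time use, which is presumably part of why the actual library does the heavy lifting in C++ compiled to WebAssembly.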
What are you going to build with this? Maybe, a box that records you playing the flute and instantly turns it into sheet music? Or, perhaps, an AI that continues the song for you when you stop?
Yissss… that’s all I need to turn my noodling on a kazoo into platinum-certified rock hits.
Don’t even need a kazoo! Can just hum your ideas :)
Now, this I will really dig into at some point. Besides #UnrealEngine, my other “soft spot” is Music-to-Light DIY. I have done plenty of the “usual” ones, and I’ve been thinking about a note-precise one for a long time, but I never found such projects in the past. Thank you for this!
I am seriously considering using the output of Pianolizer to generate terrain, and then, say, wrapping it into a tunnel/cave (left channel => left wall; right channel => right wall). I only have the skills to make it as a colorful wireframe, though…
Thanks, sliding DFT gives me something to think about.
Personally, when I have wanted to separate out the 88 keys on a piano, I just used bandpass IIR filters (equivalent to a two-stage RC filter, I think), one for each note. Roughly 88*5 float multiply-adds per sample. I’m not sure if that’s actually an efficient way to do it, I just know it was able to keep up with realtime on the 800MHz ARM in the phone I was using at the time.
I actually did use bandpass filters in an early prototype! Performance is very close to that of SDFT. In the end, I opted to use SDFT because I wanted to “get a better feel of it”.
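For anyone curious, a rough sketch of that per-note bandpass approach might look like the following. This is not either commenter’s code: it assumes RBJ-cookbook bandpass biquads, a 44.1 kHz sample rate, and an arbitrary Q, and all names are illustrative. Each biquad costs a handful of multiply-adds per sample, which matches the back-of-the-envelope 88*5 figure above.

```python
# Rough sketch of a per-note bandpass IIR filter bank (illustrative only).
import numpy as np

FS = 44100
Q = 30.0   # higher Q = narrower band around each note

def note_frequency(midi_note):
    """Equal-temperament frequency for a MIDI note number (A4 = 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((midi_note - 69) / 12.0)

def bandpass_coeffs(f0, q=Q, fs=FS):
    """RBJ 'constant 0 dB peak gain' bandpass biquad, normalized by a0."""
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = np.array([alpha, 0.0, -alpha]) / a0
    a = np.array([1.0, -2 * np.cos(w0), 1 - alpha]) / a0
    return b, a

class BiquadBank:
    """One bandpass biquad per piano key; O(keys) multiply-adds per sample."""
    def __init__(self, midi_notes=range(21, 109)):
        self.coeffs = [bandpass_coeffs(note_frequency(m)) for m in midi_notes]
        self.state = [np.zeros(4) for _ in self.coeffs]   # x1, x2, y1, y2

    def process(self, sample):
        outputs = []
        for (b, a), s in zip(self.coeffs, self.state):
            x1, x2, y1, y2 = s
            # Direct Form I biquad: y = b0*x + b1*x1 + b2*x2 - a1*y1 - a2*y2
            y = b[0] * sample + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
            s[:] = (sample, x1, y, y1)
            outputs.append(y)
        return outputs

# Quick check with 0.1 s of a 440 Hz tone: the A4 filter should dominate.
bank = BiquadBank()
t = np.arange(FS // 10) / FS
energy = np.zeros(88)
for sample in np.sin(2 * np.pi * 440.0 * t):
    energy += np.square(bank.process(sample))
print("loudest key (MIDI):", int(np.argmax(energy)) + 21)   # expect 69
```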
30 years ago, as a graduation project, I used a Winograd transform on an Intel 8031 (4 MHz clock?), programmed in PL/M, to send real-time audio to a MIDI keyboard. It worked well, not too many blinking LEDs. If I recall, I had to resort to some tricks to address the bass end of the spectrum… fun times.
nice project, nice post
Sounds awesome! Do you have the code published somewhere?
Glad that you mention the bass end. My “trick” is… rendering as a 61-key keyboard instead of an 88-key one :/
I prefer to think that it makes the code more readable while not taking much away from the functionality (it turns out most consumer-grade microphones struggle with the bass end of the spectrum, too).
There is something interesting with a thin LED strip in the video. I could not see shit.