A stereo setup assumes that the listener is physically located between the speakers; that’s how it can deliver sound equally from both sides. It’s also why the receiver has a “Balance” adjustment, so the listener can virtually move the center point of the audio by changing the relative volume of the speakers. You should set your speaker balance so that your normal sitting location is centered, but of course you might not always be in that same position every time you listen to music or watch something.
[Vije Miller] writes in with his unique solution to the problem of the roving listener. He’s come up with a system that can adjust the volume of his speakers without having to touch the receiver’s setup. In fact, he doesn’t have to touch anything. Configurable voice control software running on his computer issues the commands, and his little ESP8266-based devices do all the work.
Each speaker has its own device, which consists of a NodeMCU ESP8266 and an X9C104 digital potentiometer inside of a 3D printed case. The audio terminal block on the gadget allows him to connect it inline between the speaker and the receiver, giving [Vije] the ability to adjust the volume through software. The source code, which he’s posted on the Hackaday.io project page, uses a very simple REST-style API to change speaker volume based on HTTP requests which hit the ESP8266’s IP address.
The second part of the project is a computer running VoiceAttack, which lets [Vije] assign different actions based on what the software hears. When he says the appropriate command, the software goes through and fires off HTTP requests to the nodes in the system. Everything is currently set up for two speakers, but it shouldn’t be too difficult to expand to more speakers (or even rooms) with some adjustment to the software.
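To make the request flow a little more concrete, here is a minimal Python sketch of what the computer side could look like. It is not [Vije]'s code: the node addresses, the `/volume` route, the `level` parameter, and the 0 to 99 level range (chosen to match the 100 wiper positions of an X9C104) are all assumptions made for illustration, and the real project's API may look quite different.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical node addresses; substitute whatever IPs your ESP8266 nodes get.
NODES = {"left": "192.168.1.50", "right": "192.168.1.51"}


def balance_to_levels(balance, max_level=99):
    """Map a balance in [-1.0, 1.0] to per-speaker levels.

    -1.0 is fully left, 0.0 is centered, +1.0 is fully right.
    The speaker on the favored side stays at max_level; the other is attenuated.
    """
    balance = max(-1.0, min(1.0, balance))
    left = max_level if balance <= 0 else round(max_level * (1.0 - balance))
    right = max_level if balance >= 0 else round(max_level * (1.0 + balance))
    return {"left": left, "right": right}


def build_requests(balance):
    """Build one URL per node (route and parameter name are assumptions)."""
    levels = balance_to_levels(balance)
    return [
        f"http://{NODES[name]}/volume?{urlencode({'level': level})}"
        for name, level in levels.items()
    ]


def send(balance, timeout=2.0):
    """Fire the HTTP requests at each node in turn."""
    for url in build_requests(balance):
        with urlopen(url, timeout=timeout) as response:
            response.read()
```

A voice command such as "balance right" could then be mapped to something like `send(0.5)`. Adding more speakers or rooms would mostly be a matter of adding entries to `NODES` and deciding how each one should respond to a given balance.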
It’s not the first voice controlled speaker we’ve ever seen, but it does solve a very specific problem in a unique way. We’d be interested in seeing the next logical step, which would see this technology integrated into the speaker itself.









[tomek] was aware of this hip knowledge domain called Digital Signal Processing but hadn’t done any of it themselves. As with many algorithmic problems, the first step was to figure out the fastest way to bolt together a prototype to prove that a given technique worked. We were as surprised as [tomek] by how simple this turned out to be. Fundamentally it required a single function – cross-correlation – to measure the similarity of two data samples (audio files in this case). And it turns out that
At this point all that was left was packaging it all into a one-click tool to listen to the radio without loading an entire analysis package. Conveniently Octave is open source software, so [tomek] was able to dig through its sources until they found the bones of the critical xcorr() function. [tomek] adapted their code to pour the audio into a circular buffer in order to use an existing Java FFT library, and the magic was done. Piping the stream out of ffmpeg and into the ad detector yielded events when the given ad jingle samples were detected.
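For a feel of how the pieces fit together, here is a small Python sketch of the idea: samples flow into a circular buffer, and each full buffer is compared against a known jingle using normalized cross-correlation. This is an illustration rather than [tomek]'s implementation. It computes the correlation directly in the time domain instead of using an FFT as xcorr() does, and the function names and the `threshold` value are assumptions.

```python
import math
from collections import deque


def normalized_correlation(window, template):
    """Normalized cross-correlation of two equal-length sequences, in [-1, 1]."""
    mean_w = sum(window) / len(window)
    mean_t = sum(template) / len(template)
    num = sum((w - mean_w) * (t - mean_t) for w, t in zip(window, template))
    den = math.sqrt(
        sum((w - mean_w) ** 2 for w in window)
        * sum((t - mean_t) ** 2 for t in template)
    )
    # A silent (constant) window has zero variance; treat it as "no match".
    return num / den if den else 0.0


def detect(stream, template, threshold=0.9):
    """Yield the index of the last sample of each window that matches the template."""
    # deque with maxlen acts as a circular buffer: the oldest sample falls off.
    buffer = deque(maxlen=len(template))
    for index, sample in enumerate(stream):
        buffer.append(sample)
        if len(buffer) == len(template):
            if normalized_correlation(buffer, template) >= threshold:
                yield index
```

In a real tool the `stream` would be decoded audio samples (for example, piped out of ffmpeg) and the template would be a recorded jingle. At audio sample rates the per-sample comparison above gets expensive quickly, which is one reason an FFT-based approach is attractive.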


