Voice Controlled Stereo Balance With ESP8266

A stereo setup assumes that the listener is physically located between the speakers; that’s how it can deliver sound equally from both sides. It’s also why the receiver has a “Balance” adjustment, so the listener can virtually move the center point of the audio by changing the relative volume of the speakers. You should set your speaker balance so that your normal sitting location is centered, but of course you won’t always be in that same position when you listen to music or watch something.

[Vije Miller] writes in with his unique solution to the problem of the roving listener. He’s come up with a system that can adjust the volume of his speakers without having to touch the receiver’s setup; in fact, he doesn’t have to touch anything. By leveraging configurable voice control software running on his computer, his little ESP8266-based devices do all the work.

Each speaker has its own device which consists of a NodeMCU ESP8266 and X9C104 digital potentiometer inside of a 3D printed case. The audio terminal block on the gadget allows him to connect it inline between the speaker and the receiver, giving [Vije] the ability to adjust the volume through software. The source code, which he’s posted on the Hackaday.io project page, uses a very simple REST-style API to change speaker volume based on HTTP requests which hit the ESP8266’s IP address.
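To make the idea concrete, here is a rough sketch (in Python, for readability) of the logic a node like this needs: pull a volume out of the request path, map it onto the X9C104’s 100 wiper positions, and work out how many increment pulses to send. The `/volume/<n>` path, the function names, and the idea that a higher tap means louder are all assumptions for illustration; [Vije]’s actual firmware and URL scheme are on the project page.

```python
X9C_TAPS = 100  # the X9C104 wiper has 100 positions (0..99)


def parse_volume_request(path):
    """Extract a volume from a hypothetical path such as '/volume/40'.

    Returns an int, or None if the path does not match.
    """
    parts = path.strip("/").split("/")
    if len(parts) == 2 and parts[0] == "volume" and parts[1].isdigit():
        return int(parts[1])
    return None


def volume_to_tap(volume_percent):
    """Map a 0..100 % volume onto a wiper position 0..99 (clamped)."""
    clamped = max(0.0, min(100.0, float(volume_percent)))
    return round(clamped / 100.0 * (X9C_TAPS - 1))


def steps_to_target(current_tap, target_tap):
    """Return (direction, pulses) needed to move the wiper.

    The X9C104 is stepped one tap per INC pulse, with the U/D pin
    selecting the direction, so the node must track its current tap.
    """
    delta = target_tap - current_tap
    direction = "up" if delta >= 0 else "down"
    return direction, abs(delta)
```

On the real hardware, the `(direction, pulses)` pair would translate into setting the U/D pin and toggling INC that many times with CS held low.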

The second part of the project is a computer running VoiceAttack, which lets [Vije] assign different actions based on what the software hears. When he says the appropriate command, the software fires off HTTP requests to the nodes in the system. Everything is currently set up for two speakers, but it shouldn’t be too difficult to expand to more speakers (or even rooms) with some adjustment to the software.
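As a sketch of what a single voice command might trigger, the snippet below turns one balance value into a volume for each speaker and sends one HTTP request per node. The IP addresses, the `/volume/<n>` path, and the function names are hypothetical, and this is not how VoiceAttack itself is configured; it only illustrates the fan-out of requests.

```python
from urllib.request import urlopen

# Hypothetical node addresses; substitute the IPs of your own ESP8266 devices.
NODES = {"left": "192.168.1.50", "right": "192.168.1.51"}


def balance_to_volumes(balance, master=100):
    """Convert a balance of -1.0 (full left) .. +1.0 (full right)
    into per-speaker volumes by attenuating the far-side speaker."""
    balance = max(-1.0, min(1.0, balance))
    left = master * (1.0 - max(0.0, balance))
    right = master * (1.0 + min(0.0, balance))
    return {"left": round(left), "right": round(right)}


def build_urls(balance, master=100):
    """Build one request URL per node (assumed '/volume/<n>' path)."""
    volumes = balance_to_volumes(balance, master)
    return [f"http://{NODES[side]}/volume/{level}"
            for side, level in volumes.items()]


def send_balance(balance, master=100, timeout=2.0):
    """Fire the requests at every node in turn."""
    for url in build_urls(balance, master):
        with urlopen(url, timeout=timeout) as response:
            response.read()
```

Adding more speakers or rooms would then mostly be a matter of extending `NODES` and deciding how each extra node should respond to a given command.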

It’s not the first voice controlled speaker we’ve ever seen, but it does solve a very specific problem in a unique way. We’d be interested in seeing the next logical step, which would see this technology integrated into the speaker itself.

13 thoughts on “Voice Controlled Stereo Balance With ESP8266”

    1. Short version: when you go over to that one guy’s house, you know the one, the receiver is the thing he can never find the remote to and thus can’t turn on his TV.
      Long version.
      In the old old days a receiver was just a piece of Hi-Fi equipment that took various audio sources like your tape deck or turntable or its own built-in radio (thus the name), selected one, and output it to the amplifier, or directly to the speakers if it had an amp built in.
      Then the new old days happened and Surround Sound needed to be decoded from DVD players over S/PDIF and IEEE 1394 from set-top boxes, but folks still wanted to use their “classic” amplifiers, so yeah.
      Now it’s a mostly superfluous extra box on the home theater stack, since miniaturization has let us cram everything into the amplifier box. I think HDMI CEC was supposed to sort of prop it up as an essential central command unit, but it didn’t really turn out like that.

      TL;DR it’s a box with a bunch of extra input ports you hook up to your home theater amplifier.

  1. Use the speakers in reverse as microphones and have it set the balance by comparing the signals they receive. Find a seat, say “Balance”, it centers on your location, and waits for your “Play” command.

  2. Seems a good next step would be to replace the voice control with an RGBD sensor to “see” where the person(s) are within the room and adjust accordingly, something like an Xtion Pro Live or a repurposed old Xbox Kinect.
    I’ve seen this done with OpenNI/NiTE, but it would seem that software isn’t available anymore.
    Bonus feature that it would work with 4 channels as well as 2.

  3. If it’s a receiver, those digital pots would be in a line level path. Limiting audio at speaker level is a poor way to regulate output. I assumed that he had amplified speakers with a line level signal going to them, not a receiver. Since there are two separate units, I assume that the “pots” are power-wasting resistor controls like those made for Muzak sound in offices.

    If only I had looked first. The comments link takes you past the rest of the article. The printed plastic wedge shape will change when it heats up with watts of amplifier power dissipated into that 8-pin DIP chip, if it don’t go poof sooner. Those chips are just the thing for line level control of the amp. No need for wires all over the place.

    Interesting note in the first years of stereo, easy listening was big. I remember most of it was a duo-phonic mix. There was no center channel or stage of sound, just 2 places with mics at either end of the orchestra. A big hole in the middle! The first 2 albums by the Beatles weren’t much better and Capitol just made them mono. Then tri-phonic mixing and now there was at least a center source for vocals. Pan pots on channel mixing weren’t common till 1970 or so starting in Europe.

    1. 100% agree that resisting output after amplification is not ideal .. and admittedly this was more tailored toward the concept of positional adjustment (center surround sound) rather than optimal setup. Fortunately the speakers tested were inline and low power so there was a noticeable range w/o aggressive heat. Ideally .. individual output adjustments should begin at the receiver.

  4. It’s an appealing idea, so I had a look at the part. Unfortunately maximum wiper current for those digi-pots is 4.4 mA. Even with music crest factors and an efficient 8 ohm speaker, that would only equate to a peak sound output around 65-70 dB (SPL re 20 µPa @ 1 m), so would really only be useful if running them far past their spec.
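
For anyone who wants to run the numbers in that last comment, a back-of-the-envelope sketch follows. It takes the power at the current limit as I²R and adds 10·log10(P) to the speaker’s sensitivity at 1 W / 1 m. The sensitivity and crest-factor values are assumptions (the comment doesn’t state which ones were used), so treat the result as a ballpark only.

```python
import math


def power_at_limit_w(current_a, impedance_ohm):
    """Electrical power into the speaker at the given current: P = I^2 * R."""
    return current_a ** 2 * impedance_ohm


def estimated_spl_db(current_a, impedance_ohm,
                     sensitivity_db_1w_1m, crest_factor_db=0.0):
    """Estimate SPL at 1 m from a current limit.

    sensitivity_db_1w_1m: speaker sensitivity (dB SPL at 1 W, 1 m), assumed.
    crest_factor_db: optional allowance for musical peaks, assumed.
    """
    power_w = power_at_limit_w(current_a, impedance_ohm)
    return sensitivity_db_1w_1m + 10.0 * math.log10(power_w) + crest_factor_db


# Example: 4.4 mA through 8 ohms, with an assumed 95 dB speaker
# and an assumed 10 dB crest-factor allowance.
spl = estimated_spl_db(4.4e-3, 8.0, 95.0, crest_factor_db=10.0)
```

With a less efficient speaker the figure drops further, which only reinforces the commenter’s point about staying within the part’s spec.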
