Voice Controlled Stereo Balance With ESP8266

September 19, 2018

A stereo setup assumes that the listener is physically located between the speakers; that’s how it can deliver sound equally from both sides. It’s also why the receiver has a “Balance” adjustment, so the listener can virtually move the center point of the audio by changing the relative volume of the speakers. You should set your speaker balance so that your normal sitting location is centered, but of course you might not be in that same position every time you listen to music or watch something.

[Vije Miller] writes in with his unique solution to the problem of the roving listener. He’s come up with a system that can adjust the volume of his speakers without having to touch the receiver’s setup. In fact, he doesn’t have to touch anything. Configurable voice control software running on his computer issues the commands, and his little ESP8266-based devices do the rest.

Each speaker has its own device, which consists of a NodeMCU ESP8266 and an X9C104 digital potentiometer inside a 3D printed case. The audio terminal block on the gadget allows him to connect it inline between the speaker and the receiver, giving [Vije] the ability to adjust the volume through software. The source code, which he’s posted on the Hackaday.io project page, uses a very simple REST-style API to change speaker volume based on HTTP requests which hit the ESP8266’s IP address.
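As a rough illustration of what each node has to do once a volume request arrives, the sketch below converts a requested percentage into a direction and a pulse count for the X9C104’s up/down stepping interface. It is written in Python for readability and is not [Vije]’s firmware: the function names and the 0 to 100 percent scale are assumptions made for this example, while the 100 wiper positions are a property of the X9C family.

```python
from typing import NamedTuple

TAP_MIN = 0
TAP_MAX = 99  # X9C-series parts expose 100 wiper tap positions (0..99)


class WiperMove(NamedTuple):
    up: bool     # level for the U/D pin (True = step toward the high terminal)
    pulses: int  # number of INC pulses to send while CS is held low


def percent_to_tap(percent: float) -> int:
    """Map a 0-100 percent request (assumed scale) onto a tap position."""
    clamped = max(0.0, min(100.0, percent))
    return round(clamped / 100.0 * TAP_MAX)


def plan_wiper_move(current_tap: int, target_percent: float) -> WiperMove:
    """Work out which way, and how many steps, the wiper has to travel."""
    delta = percent_to_tap(target_percent) - current_tap
    return WiperMove(up=delta > 0, pulses=abs(delta))
```

Because the chip only accepts relative steps, the firmware has to remember (or re-home) the current tap position; here it is simply passed in as `current_tap`.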

The second part of the project is a computer running VoiceAttack, which lets [Vije] assign different actions based on what the software hears. When he says the appropriate command, the software fires off HTTP requests to the nodes in the system. Everything is currently set up for two speakers, but it shouldn’t be too difficult to expand to more speakers (or even rooms) with some adjustment to the software.
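The client side of such a system can be sketched in a few lines of Python. The example below turns a single balance value into one volume level per speaker and builds the matching request URLs. The node IP addresses, the `/volume?level=` path, and the linear balance mapping are all hypothetical and chosen only for illustration; the project’s real endpoints are defined in the source on the project page.

```python
from typing import Dict, List
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical node addresses; real values depend on the local network.
NODES = {"left": "192.168.1.50", "right": "192.168.1.51"}


def balance_to_levels(balance: float, max_level: int = 100) -> Dict[str, int]:
    """balance = -1.0 means the listener sits far left, +1.0 far right."""
    b = max(-1.0, min(1.0, balance))
    # Turn down the speaker nearer the listener; leave the far one at maximum.
    left = round(max_level * (1.0 - max(0.0, -b)))
    right = round(max_level * (1.0 - max(0.0, b)))
    return {"left": left, "right": right}


def build_requests(balance: float) -> List[str]:
    """Build one URL per node using the assumed /volume?level=N endpoint."""
    levels = balance_to_levels(balance)
    return [
        f"http://{NODES[side]}/volume?{urlencode({'level': level})}"
        for side, level in levels.items()
    ]


def send_all(urls: List[str], timeout_s: float = 2.0) -> None:
    """Fire each request in turn; needs real nodes listening on the network."""
    for url in urls:
        with urlopen(url, timeout=timeout_s) as response:
            response.read()
```

A voice command such as “balance right” would then map to something like `send_all(build_requests(0.25))`. Adding more speakers or rooms would mostly mean extending `NODES` and the mapping function.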

It’s not the first voice controlled speaker we’ve ever seen, but it does solve a very specific problem in a unique way. We’d be interested in seeing the next logical step: integrating this technology into the speaker itself.

13 thoughts on “Voice Controlled Stereo Balance With ESP8266”

walter says:

September 19, 2018 at 2:06 pm

What’s a “receiver”, Hackaday? Is it made of copy-pasted juice, or does it have some meaning I’m unaware of?

1. Echo_Hotel (@Echo_Hotel) says:
  
  September 20, 2018 at 4:19 am
  
  Short version: when you go over to that one guy’s house, you know the one, the receiver is the thing he can never find the remote to and thus can’t turn on his TV.
  Long version:
  In the old old days a receiver was just a piece of Hi-Fi equipment that took various audio sources like your tape deck, your turntable, or its own built-in radio (thus the name), selected one, and output it to the amplifier, or directly to the speakers if it had an amp built in.
  Then the new old days happened, and surround sound needed to be decoded from DVD players over S/PDIF and from set top boxes over IEEE 1394, but folks still wanted to use their “classic” amplifiers, so yeah.
  Now it’s a mostly superfluous extra box on the home theater stack, since miniaturization has let us cram everything into the amplifier box. I think HDMI CEC was supposed to sort of prop it up as an essential central command unit, but it didn’t really turn out like that.
  
  TL;DR it’s a box with a bunch of extra input ports you hook up to your home theater amplifier.
  
  1. Ren says:
    
    September 20, 2018 at 6:39 am
    
    In the old old days, if the Tuner (AM and FM radio box) and audio Amplifier were combined into the same box, instead of two boxes, it was called a Receiver.
    
Happosai says:

September 19, 2018 at 5:06 pm

https://en.wikipedia.org/wiki/AV_receiver

Piecutter says:

September 19, 2018 at 8:04 pm

Use the speakers in reverse as microphones and have it set the balance by comparing the signals they receive. Find a seat, say “Balance”, it centers on your location, and waits for your “Play” command.

1. Vije says:
  
  September 19, 2018 at 8:30 pm
  
  +1.5
  
2. Buddy Casino says:
  
  September 20, 2018 at 8:11 am
  
  do it, I double dare you
  
3. Elliot Williams says:
  
  September 21, 2018 at 5:27 am
  
  Throw in a little DSP, and you can have it delay one signal to get you in the stereo sweet spot while you’re at it.
  
Dissy says:

September 19, 2018 at 8:37 pm

Seems a good next step would be to replace the voice control with an RGBD sensor to “see” where the person(s) are within the room and adjust accordingly, something like an Xtion Pro Live or a repurposed old Xbox Kinect.
I’ve seen this done with OpenNI/Nite but it would seem that software isn’t available anymore.
Bonus feature that it would work with 4 channels as well as 2.

echodelta says:

September 19, 2018 at 9:00 pm

If it’s a receiver, those digital pots would be in a line-level path. Limiting audio at speaker level is a poor way to regulate output. I assumed that he had amplified speakers with a line-level signal going to them, not a receiver. Since there are two individual ones, I assume that the “pots” are power-wasting resistor controls made for muzak sound in offices.

If I would look first. Comments takes you past the rest of the article. The printed plastic wedge shape will change when it heats up with watts of amplifier power dissipated into that 8 pin dip chip, if it don’t go poof sooner. Those chips are just the thing for line level control of the amp. No need for wires all over the place.

Interesting note in the first years of stereo, easy listening was big. I remember most of it was a duo-phonic mix. There was no center channel or stage of sound, just 2 places with mics at either end of the orchestra. A big hole in the middle! The first 2 albums by the Beatles weren’t much better and Capitol just made them mono. Then tri-phonic mixing and now there was at least a center source for vocals. Pan pots on channel mixing weren’t common till 1970 or so starting in Europe.

1. Vije says:
  
  September 19, 2018 at 9:37 pm
  
  100% agree that resisting output after amplification is not ideal, and admittedly this was more tailored toward the concept of positional adjustment (center surround sound) rather than optimal setup. Fortunately the speakers tested were inline and low power, so there was a noticeable range without aggressive heat. Ideally, individual output adjustments should begin at the receiver.
  
  1. Martin says:
    
    September 20, 2018 at 2:23 am
    
    I can’t see how this should work: the X9C104 is a 100 kOhm E-POT (1 kOhm steps) with a max. wiper current of 4 mA. I just can’t imagine this thing in a speaker wire.
    
Rolinger says:

September 20, 2018 at 2:06 am

It’s an appealing idea, so I had a look at the part. Unfortunately maximum wiper current for those digi-pots is 4.4mA. Even with music crest factors and an efficient 8 ohm speaker, that would only equate to a peak sound output around 65-70dB (SPL re 20uPa @1m), so would really only be useful if running them far past their spec.
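The arithmetic behind an estimate like this can be checked with a short Python sketch. It assumes the full wiper current flows through the speaker, treats the speaker as a pure 8 ohm resistance, and uses the standard sensitivity relation (SPL at 1 m equals the 1 W sensitivity plus 10·log10 of the power in watts). The 100 dB sensitivity figure in the usage note is an assumption for illustration, not a value given in the comment.

```python
import math


def peak_power_w(i_peak_a: float, impedance_ohm: float) -> float:
    """Peak electrical power into a resistive load: P = I^2 * R."""
    return i_peak_a ** 2 * impedance_ohm


def peak_spl_db(i_peak_a: float, impedance_ohm: float,
                sensitivity_db_1w_1m: float) -> float:
    """Peak SPL at 1 m from the speaker's rated 1 W / 1 m sensitivity."""
    power = peak_power_w(i_peak_a, impedance_ohm)
    return sensitivity_db_1w_1m + 10.0 * math.log10(power)
```

At 4.4 mA into 8 ohms the peak power is roughly 0.15 mW, about 38 dB below 1 W. With an assumed sensitivity of 100 dB, `peak_spl_db(0.0044, 8.0, 100.0)` comes out near 62 dB; the result shifts one-for-one with the sensitivity figure, so under this simple model a 65-70 dB peak corresponds to a driver rated at roughly 103 to 108 dB.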
