Fail Of The Week: ESP Walkie, Not-So-Talkie

The ESP8266 has become a staple of projects in our community since it burst onto the scene a few years ago. The combination of a super-fast processor and wireless networking all on the same chip, sold in retail quantities for relative pennies, has been irresistible. So when [Petteri Aimonen] needed to make a wireless intercom system for cycling trips it seemed an obvious choice: push its internal ADC to sample at a high enough rate for audio, and stream the result over an ad-hoc Wi-Fi network.

The result was far from satisfactory: while early tests with a signal generator seemed good, in practice it was unusable. So much noise entered the signal path that the resulting audio was unintelligible. It seems that running a wireless network causes abrupt and very short spikes of power supply current that play havoc with audio ADCs.

He’s submitted it to us as a Fail Of The Week and he’s right, it is a fail. But in a way that’s an unfair description, because we can see there is the germ of a seriously good idea in there. Perhaps with an external ADC, or maybe with some as-yet-to-be-determined filtering scheme, an ESP8266 walkie-talkie is one of those ideas that should be taken to its conclusion. We hope he perseveres.

51 thoughts on “Fail Of The Week: ESP Walkie, Not-So-Talkie”

      1. Another thing that I notice is the op amp is connected directly into the ADC without any decoupling. Those ADCs aren’t rail to rail and the LM324 is a rather poor choice in this configuration.

        1. There is no problem with DC coupling analog signals into ADCs, as long as you can be certain that the voltage is always within the limits of the ADC’s input. Likewise, there is no problem with running an LM324 output into an ADC, as long as your output is limited to the proper range. However, once you put enough gain on a microphone to get it to the range the ADC expects, it’s difficult to guarantee this, due to offset voltage of the op-amp. Therefore it is almost always better to AC couple audio inputs.

          A bigger issue I see is that when using the ADC inputs on any microcontroller, the PCB layout must keep the analog signals away from all digital signals, and generally should use a separate ground, tied to the ground pin closest to the analog inputs (this is usually specified in the datasheet). Failure to do this may make a board unsuitable for analog inputs.

          It’s hard to tell from the audio sample, what the problem is, or how serious it is, since there doesn’t appear to be any voice during that sample. Does this mean the voice was altogether wiped out, or just that this is a sample of the noise, without any voice? It’s really not that useful to hear just the noise, since signal to noise ratio is much more significant than just the noise level.

          1. So one thing I would suspect can happen with that bias network on the LM324 is that you’re running extremely close to the minimum operating spec of that op amp. So when the ESP draws a current spike it’s going to cause a brown out. All things equal I’d have tried throwing a 1000uF low ESR cap on the ESP’s VCC. That’s how I use the ESP32 because it draws a really enormous startup spike.

          2. colin, do note that it is LMV324, which goes significantly closer to the rails and operates at a lower supply voltage than the LM324.

            But yeah, a big low-ESR electrolytic and a separate power rail for analog probably would help some. Though I did try having one board transmitting next to another board with only the mic amp, and it still picked up almost as much noise even though there was no electrical connection between the boards.

    1. Hm, I have 10µF soldered directly on the module, and 10µF on the PCB, both MLCC.

      I would assume that the module already has something like 100nF-1µF on it. My intuition also says that putting a small capacitor at the end of long wires (i.e. on the module pins) doesn’t make much sense.

      1. 10 uF, I’m guessing, means electrolytics. These are poor at handling even audio-range spikes. Do leave them there, but put some ceramics, in the 0.01 to 0.1 µF range, in parallel with them. I wouldn’t make any assumptions about what the board includes.

        1. You don’t have to assume anything about what’s inside the module; just read the documentation. That module is an ESP-12E:

          https://www.kloppenborg.net/images/blog/esp8266/esp8266-esp12e-specs.pdf (schematic is on the last page)
          That one doesn’t include the layout, but the -12E and -12F are nearly identical except for the antenna. The -12F’s layout is probably similar enough to see how the -12E is set up: https://www.elecrow.com/download/ESP-12F.pdf (page 7)

          Unfortunately, all of the chip’s power supply pins are tied together inside the module, with minimal decoupling between the pins. There’s also no analog ground, so both power and ground paths are going to be shared between analog signals and the power supplies. A big, fat, low ESR capacitor (e.g. aluminum polymer) might be some help, but I wouldn’t expect much. A separate LDO (with good PSRR in audio frequencies) for the mic and amplifier would do more.

          With an ADC, chip, and module all not designed for good analog performance, it’s probably not practical to get good audio quality out of this thing. I expect you’d need automatic gain control on the mic amplifier to keep the signal near the ADC’s maximum input range without clipping all the time. At that point, it’s easier to use an I2S or PDM microphone and forget the ESP8266’s ADC.

          And why in the world would 10uF mean electrolytic? That’s been solidly in ceramic territory for ages, down to 0402. Not that it will do as much as you might expect from the nameplate capacitance — modern MLCCs achieve high capacitance density like that at the cost of terrible DC bias effects, and they’re rated by their capacitance at zero bias.

          1. Very nice, I hadn’t seen that document!

            But yeah, it pretty much confirms my suspicions that it is not possible to push the ESP8266 internal peripherals this far. Of course the project should be quite simple with an I2S codec and I think I’ve seen a few that use it that way.

            Seeing cnlohr do such awesome things on ESP8266 without external chips motivated me to try also :)

    2. On looking at the schematic again I can see a number of ways to improve the design.

      Firstly about the spikes on the 3v3 rail.

      There are normally issues with current surges and voltage spikes and these represent two different problems.

      A current surge will drop the 3v3 rail and as a larger capacitor recharges the voltage will slowly increase back to the nominal voltage.

      This is not what is happening in your case as the voltage is going above the nominal voltage. In higher frequency circuits this often represents reflection due to short wavelengths (higher frequencies) but the frequency here seems to be around a third of a MHz so it’s more likely a result of delays in the voltage regulator responding to the voltage drop.

      A bit about capacitors first.

      Larger value capacitors have larger parasitic inductance which makes them poor at shunting spikes but good at storing energy for surge currents.

      Smaller value capacitors (especially ceramic disk) have lower parasitic inductance which makes them good at shunting spikes but poor at storing energy for surge currents.

      So in essence it is often best to have two caps for decoupling, like a 10uF/100uF in parallel with a 10nF/100nF (0.01uF/0.1uF). The higher the spike or transient frequency, the lower the value of the shunt (smaller) cap should be. Its AC shunt impedance [ohms] is 1 / ( 2 x Pi x frequency [Hz] x Capacitance [Farads] )

      I tend to use a tantalum for the higher value capacitor as they have a lower internal series resistance and a ceramic for the lower value capacitor.

      So I suggest putting a lower value capacitor in parallel with the ESP decoupling cap and the voltage regulators output filter cap.

      Large filter caps on the regulator output slow the regulator’s response and can cause overshoot. I would reduce the 10uF to somewhere between 0.1uF and 1uF – the datasheet will give a good option.

      Run 3 separate traces from the pad of the power regulator shunt cap to the ESP, Output amp and input amp. Do the same with ground unless you use a ground pour.

      Next – the mic amp LMV324

      The spec of the op-amp is min 2v7 supply rail and you are using a 3v3 rail with about 0v9 spikes so the rail is falling below the min supply. Hopefully the addition of shunt caps above will fix that problem.

      This chip has about 60dB of supply rail rejection which is good enough but …

      You have direct AC paths from the rail to the inputs through R4-R3-R? and R7 and that is defeating the inbuilt supply rejection.

      Take the 10uF cap across to the other side of R7 and add a shunt cap as well.

      All of the voice frequency low pass filtering (R4, R3, R?, C3, C?) needs to be completely separated from the 3v3 rail and placed into the negative feedback loop of the op-amp.

      The mic op-amp circuit needs to be completely re-done and there are lots of examples on the net. The way the positive input is so susceptible to rail noise is a problem.
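
The numbers quoted in the comment above can be checked with a few lines of Python. This is only a rough sketch: the function names are made up for illustration, the capacitor is treated as ideal (so the parasitic inductance that makes large caps poor at shunting spikes is deliberately ignored), and 333 kHz is simply "around a third of a MHz" taken at face value.

```python
import math

def shunt_impedance_ohms(freq_hz, cap_farads):
    """Impedance magnitude of an IDEAL capacitor: 1 / (2 * pi * f * C)."""
    return 1.0 / (2.0 * math.pi * freq_hz * cap_farads)

def rail_minimum_v(nominal_v, spike_v):
    """Lowest rail voltage if a spike of spike_v pulls the rail down."""
    return nominal_v - spike_v

def ripple_after_psrr_v(ripple_v, psrr_db):
    """Rail ripple as seen after a supply rejection of psrr_db decibels."""
    return ripple_v / (10 ** (psrr_db / 20.0))

SPIKE_FREQ_HZ = 333e3   # "around a third of a MHz" (from the comment)
RAIL_V = 3.3            # 3v3 rail (from the comment)
SPIKE_V = 0.9           # "about 0v9 spikes" (from the comment)
OPAMP_MIN_V = 2.7       # LMV324 minimum supply quoted in the comment
PSRR_DB = 60.0          # "about 60dB of supply rail rejection"

for label, cap in (("10 uF", 10e-6), ("0.1 uF", 0.1e-6), ("0.01 uF", 0.01e-6)):
    z = shunt_impedance_ohms(SPIKE_FREQ_HZ, cap)
    print(f"{label}: {z:.3f} ohm at {SPIKE_FREQ_HZ / 1e3:.0f} kHz (ideal cap)")

low = rail_minimum_v(RAIL_V, SPIKE_V)
print(f"rail dips to {low:.2f} V, below minimum: {low < OPAMP_MIN_V}")
print(f"0.9 V of ripple after 60 dB: {ripple_after_psrr_v(SPIKE_V, PSRR_DB) * 1e3:.2f} mV")
```

Note that the ideal formula always favours the larger capacitor; the case for the small ceramic in parallel rests entirely on the parasitic inductance that this sketch leaves out.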

    3. Dealing with RF noise can be quite complicated. Most filtering components have significant parasitic inductance, so remember that most high value caps (ceramic or not) don’t do a whole lot at these frequencies. Layout can also be quite critical, as 2.4GHz capacitively couples over very tiny capacitances.

      The thing to remember is that a lot of ESD diodes on most chip designs act as RF rectifiers, so even if you have a device that ostensibly does not work at high frequencies, these parasitic ESD diodes will likely turn that 2.4GHz RF into a much lower frequency that your IC is in fact sensitive to. Case in point – many older audio amplifiers have issues when a GSM phone is placed nearby, despite the fact that your phone clearly does not have a <20kHz antenna on it. Incidentally, there are many chips these days with input structures designed to be resistant to this sort of interference, because cellphone designs pretty much always have to deal with this. This is also why digital microphones have become so popular.

  1. This illustrates the difference between hacking electronics and designing electronic circuits. There are many ways for the noise to get into the audio not just via DACs and ADCs.
    A few pointers might help:
    First isolate and decouple all the supplies to audio circuits with a small inductor (guess) 2-10mH feeding a 10u Electrolytic in parallel with a 100n Ceramic Capacitor. Sometimes a small resistor will suffice instead of the inductor but at the expense of a voltage loss and therefore a lower potential volume.
    A couple of ceramic 100n capacitors close to the terminals of the ESP 8266 might also help stop the digital noise coming back into the power supply. I guess the ESP 8266 does not have a separate ground for DACs etc.
    It is also possible that the wires from the battery are radiating and feeding unwanted signal back into the circuit.
    PCB routing, track width and grounding can all have a profound effect on the circuit noise levels.
    It is also possible that the software timing is not consistent and that can induce noise, jitter, that is heard.
    Good luck.

    1. Agreed, even a 1 to 10 ohm resistor instead of the inductor will make a difference. There are however greater problems with the ESP module’s ADC: a shared ground pin, and no external voltage reference. When I did a commercial project with an ESP that required accurate ADC, I included a low pin count external microcontroller doing the ADC and as a bonus providing some dongle-like copy protection.
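
To get a feel for the supply filters suggested in the two comments above, here is a minimal sketch of the corner frequencies for ideal parts. The function names are invented for this example, the 5 mA load current is a purely hypothetical figure for the audio stage, and real inductors and capacitors will behave worse than these formulas at high frequency.

```python
import math

def lc_cutoff_hz(l_henries, c_farads):
    """Corner frequency of an ideal LC low-pass: 1 / (2 * pi * sqrt(L * C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henries * c_farads))

def rc_cutoff_hz(r_ohms, c_farads):
    """Corner frequency of an ideal RC low-pass: 1 / (2 * pi * R * C)."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)

def resistor_drop_v(r_ohms, load_amps):
    """DC voltage lost across the series resistor (Ohm's law)."""
    return r_ohms * load_amps

CAP_F = 10e-6 + 100e-9   # 10u electrolytic in parallel with 100n ceramic
LOAD_A = 5e-3            # HYPOTHETICAL audio-stage current, not from the thread

for l_h in (2e-3, 10e-3):            # the guessed 2-10 mH range
    print(f"L = {l_h * 1e3:.0f} mH: corner at {lc_cutoff_hz(l_h, CAP_F):.0f} Hz")

for r in (1.0, 10.0):                # the suggested 1 to 10 ohm range
    print(f"R = {r:.0f} ohm: corner at {rc_cutoff_hz(r, CAP_F):.0f} Hz, "
          f"drop {resistor_drop_v(r, LOAD_A) * 1e3:.0f} mV at 5 mA")
```

With these values the corners land in or just above the voice band, so such a filter would mostly help against the fast spikes rather than against noise at audio frequencies.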

  2. There are some WiFi walkies online, but none seem too good. I think the ESP is not that good of a choice for a walkie talkie because it consumes too much power in receive.
    There are some older designs using other radios, like this one with rfm12: https://www.youtube.com/watch?v=nJzeuvcNTME
    Not sure if there is some with the more modern RFM69, but that would be better. Heck, why not try a RFM95 with super nice 10mA only in receive or your typical NRF24 with PA and LNA.

  3. Yes, the ESP draws current spikes up to 0,5A. And I can not see a real groundplane in the PCB. In such a mixed analog and HF PCB a 2 sided PCB with one side as (mostly undisturbed) ground plane is the minimum requirement, better of course is a 4 layer design. In a 2 layer design you should do careful via stitching to get some GND. And small value decoupling capacitors, e.g. at the microphone and the microphone input. I normally used 8.2pF for DECT (1,9GHz) or WIFI (2,4GHz) frequencies.
    The ADC of the ESP has a range of 1V and the LM324 is not very good with voltages down to GND. So AC coupling should be much better.

    1. Could you point me to the measurements that show 0.5A? I’m confused as I don’t see any hard data anywhere. Everybody says that it’s impossible to power it from e.g. cp2102, but I can’t feel it.

      1. I have seen up to 200mA during transmit. The average current is quite low, with a big enough capacitor you will have no problems powering it. 0.5A is too much I think, this is more for a cellular modem.
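
How big is "a big enough capacitor"? A back-of-the-envelope sketch, assuming the worst case where the capacitor alone supplies the whole transmit burst (the regulator contributes nothing). The 200 mA and 0.5 A figures come from the comments above; the 1 ms burst length and 0.1 V allowed droop are hypothetical placeholders, as are the function names.

```python
def droop_v(burst_amps, burst_seconds, cap_farads):
    """Voltage droop when a capacitor alone supplies a constant current: dV = I * t / C."""
    return burst_amps * burst_seconds / cap_farads

def cap_needed_f(burst_amps, burst_seconds, max_droop_v):
    """Capacitance needed to keep the droop under max_droop_v: C = I * t / dV."""
    return burst_amps * burst_seconds / max_droop_v

BURST_S = 1e-3        # HYPOTHETICAL burst duration, not measured
MAX_DROOP_V = 0.1     # HYPOTHETICAL allowed droop

for amps in (0.2, 0.5):   # currents mentioned in the thread
    c = cap_needed_f(amps, BURST_S, MAX_DROOP_V)
    print(f"{amps:.1f} A burst: about {c * 1e6:.0f} uF for {MAX_DROOP_V} V droop")

# The 1000uF cap suggested earlier in the thread, under the same assumptions:
print(f"1000 uF at 0.2 A: {droop_v(0.2, BURST_S, 1000e-6):.2f} V droop")
```

In practice the regulator recovers part of the way during the burst, so these figures are pessimistic.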

  4. If you’re not dead set on the ESP you could implement a set of RDA1846S modules (or similar), they are already a walkie talkie in a box, all you need is audio amplification, some buttons and an ATTiny or something if you’re going to be fancy. It already has low power consumption and is really small. The range will be better as well if that even matters.

    1. Yes, you really need to use a mic array and some signal processing. I have tried helmet radios for my cycles before. On the plus side there are some nice chips for cell phones and cell phone like things that make it easy. The down side is they are tiny. The last one I looked at had at least 8 legs on it and was no bigger than a grain of rice.

  5. That was quite a neat project, pity it didn’t work out. I remember adding extra caps to a NodeMCU 1.0 board on the power pins of the 8266 to try and eliminate what I thought were power glitch issues. Though it ended up being a software issue rather than power.

  6. This is something I wanted to make for years but never managed to start. In my dreams I would use something like Codec2 to compress audio in order to limit bandwidth and gain distance, at the expense of some latency. I’d love to see some development in this field, keep up the good work!

        1. yeah, no… My intention would be to have several as network radios on wifi. Although it would be interesting to use directly in some sort of unconnected mode as well.
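
As a rough illustration of why a speech codec helps with bandwidth, the sketch below compares an uncompressed sample stream with a few Codec 2 bit rates. The 8 kHz sample rate is an assumption (it is what narrowband speech codecs typically expect), 10 bits is the resolution of the ESP8266's internal ADC, and the listed rates are the commonly published Codec 2 modes; check the Codec 2 project for the modes your build actually offers.

```python
def raw_bitrate_bps(sample_rate_hz, bits_per_sample):
    """Bit rate of an uncompressed mono sample stream."""
    return sample_rate_hz * bits_per_sample

def compression_ratio(raw_bps, codec_bps):
    """How many times smaller the coded stream is than the raw one."""
    return raw_bps / codec_bps

SAMPLE_RATE_HZ = 8000    # ASSUMED narrowband speech rate
ADC_BITS = 10            # ESP8266 internal ADC resolution
CODEC2_MODES_BPS = (3200, 2400, 1600, 1200, 700)   # commonly published modes

raw = raw_bitrate_bps(SAMPLE_RATE_HZ, ADC_BITS)
print(f"raw stream: {raw} bit/s")
for mode in CODEC2_MODES_BPS:
    print(f"Codec 2 at {mode} bit/s: {compression_ratio(raw, mode):.0f}x smaller")
```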

  7. This idea was already discussed on HAD, the bottom line is that you use an ESP32 because it has all of the required functions on the SOC. ADC DAC touch controls etc.

  8. OK a fail of the week. I’ll buy that. Let’s just throw out the “baby with the bath water” here and not try to make a “purse out of a sow’s ear”. You have to be an older American to understand those old adages.

    Let’s look to a unique communications system mostly only allowed in USA right now. DECT6.0. I diagrammed a portable DECT6.0 communications system using OTS pre-existing parts. It uses digital cordless phone intercoms. They use full duplex through the base unit and are selective calling. They have 984 foot range (300 meters). With the base unit the center biker can extend that to 1968 feet (600 meters). That’s more than a third of a mile range direct line of sight. It works on 1.9 GHz.

    Using solar cell, auxiliary battery, and bicycle generator you can extend battery life. You can use headset, speakerphone, or handset mode. The audio quality is superb. No outside aerial needed. You can buy a multiple DECT6.0 intercom system on eBay for under $100 or $10 at a thrift store. You do not need a POTS 9volt battery loop for them to work in intercom mode. But you definitely need the base unit as it controls the multiple full-duplex mode. All units can talk at the same time to everyone as it is not simplex mode.

    Here’s a diagram:
    http://www.upl.co/uploads/bikecommsystem1532397577.jpg

  9. From my foggy memory, the ESP8266 internal ADC is used (multiplexed) while transmitting (power monitoring or something), so sampling while transmitting will introduce noise and distortion; it works OK only when the transmitter is off.

  10. My experience using the ESP8266 is that it is really glitchy when it is sourcing or sinking current. Even just 5mA on an IO pin makes it reset randomly. You almost certainly need to take steps to pause the radio to use the ADC, so you can’t use them both continuously.

    So you could use it as a walkie-talkie without an external ADC if you buffer the results and transmit when you un-key. I don’t know what that is called… half half-duplex?
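
The buffer-then-transmit idea in the comment above could look roughly like this. It is a plain-Python simulation, not ESP8266 firmware: the class and method names are hypothetical, and the places where real code would pause and resume the radio are only marked with comments.

```python
class StoreAndForwardKey:
    """Record samples while keyed (radio paused), hand back packets on un-key."""

    def __init__(self, packet_size=256):
        self.packet_size = packet_size
        self.keyed = False
        self.buffer = []

    def key_down(self):
        # Hypothetical hook: real firmware would pause the radio here
        # so the ADC can sample without transmit noise.
        self.keyed = True
        self.buffer = []

    def add_sample(self, value):
        # Samples are only stored while the key is held.
        if self.keyed:
            self.buffer.append(value)

    def key_up(self):
        # Hypothetical hook: real firmware would resume the radio here
        # and send the returned packets.
        self.keyed = False
        packets = [
            self.buffer[i:i + self.packet_size]
            for i in range(0, len(self.buffer), self.packet_size)
        ]
        self.buffer = []
        return packets


key = StoreAndForwardKey(packet_size=4)
key.key_down()
for sample in range(10):      # stand-in for ADC readings
    key.add_sample(sample)
packets = key.key_up()
print(f"{len(packets)} packets ready to send")
```

The whole message has to fit in RAM before anything is sent, so on a small microcontroller the maximum message length would be limited to a short burst of speech.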

  11. I strongly recommend flashing the board with the firmware for ColorChord. You can look at an oscilloscope view. Can you post what you see noise-wise? When developing ColorChord, I found a very odd problem where the RF from the ESP was creating an AC transient at a very high frequency. Normally this would not be an issue at all, BUT there was some sort of nonlinear effect inside the microphone amp circuit which pushed a DC offset while the high frequency AC was present. There is actually a relatively simple solution: include a smaller capacitor, just a few picofarads (should try different values from 20-470pF), across both the microphone AND the ADC input, if decoupled.

    1. Hmm yeah, it very much seemed like the internal JFET inside the electret microphone was itself acting like a radio receiver. I did have 1nF cap across the mic, but perhaps I should try smaller values also. On the ADC input I also added 1nF soldered directly on the module, which did help some but not much.

  12. Hi there. Few people know of the Espressif proprietary protocols working on the WiFi interface of the ESP32. One of them is ESP-NOW – a very fast protocol that enables multipoint communication over WiFi, and ESP-TEL that enables two-way, multipoint conferencing voice connections – of course with use of I2S audio devices for multiple reasons. Here’s a clip illustrating the ESP-TEL protocol capabilities: https://www.youtube.com/watch?v=jVVwDic1aO4 And here’s a clip of an ESP-NOW mesh with 6 nodes example and here’s how to program multi node communication: https://www.youtube.com/watch?v=7FkmUFY7JDk. For a clever guy you really give up too quick – before exploring all the possibilities ;)
