ESP32, We Have Ways To Make You Talk

February 6, 2018

One of our favorite scenes from the [James Bond] franchise is the classic exchange between [Goldfinger] and [Bond]. [Connery] (the One True Bond) says, “You expect me to talk?” And the reply is, “No Mr. Bond, I expect you to die!” When it comes to the ESP32, though, apparently [XTronical] expects it to talk. He posted a library to simplify playing WAV files on the ESP32. There is also a video worth watching, below.

Actually, you might want to back up to his previous post where he connects a speaker via one of the digital to analog converters on the board. In that post, he just pushes out a few simple waveforms, but the hardware is the same setup he uses for playing the WAV files.
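
For a feel of what "a few simple waveforms" means in practice, here is a minimal sketch (not [XTronical]'s code) that builds one period of a sine wave as unsigned 8-bit samples, which is the form the ESP32's 8-bit DACs take. The function name makeSineTable and the table length are our own choices for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build one period of a sine wave as unsigned 8-bit samples (0..255),
// centred on 128. makeSineTable is a hypothetical helper name.
std::vector<std::uint8_t> makeSineTable(std::size_t length) {
    assert(length > 0);
    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<std::uint8_t> table(length);
    for (std::size_t n = 0; n < length; ++n) {
        const double s = std::sin(twoPi * static_cast<double>(n) /
                                  static_cast<double>(length));
        // Map -1..+1 onto 0..255 and round to the nearest integer.
        table[n] = static_cast<std::uint8_t>(std::lround(127.5 + 127.5 * s));
    }
    return table;
}
```

On the board, each value could then be pushed out with the Arduino core's dacWrite(), for example dacWrite(25, table[i]), assuming the speaker hangs off GPIO25. How fast you step through the table sets the pitch.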

By wrapping up the WAV code in a library, [XTronical] makes the actual playback simple. Here’s the core of his simple example:

void loop() {
  static uint32_t i = 0;              // simple counter to output
  if (ForceWithYou.Completed)         // if completed playing, play again
    DacAudio.PlayWav(&ForceWithYou);  // play the wav (pass the wav class object created at top of code)
  Serial.println(i);                  // print out the value of i
  i++;                                // increment the value of i
}

Not very hard, but, of course, the heavy lifting is hidden in the two objects DacAudio and ForceWithYou. The video explains how you can add more, but you can probably guess, too. The short version is he uses Audacity to prepare the WAV file and then a hex editor to get the bytes into an array. Since many of us use Linux or Cygwin, we might have been tempted to use od or hexdump, but however you do it, it has to wind up in an array.
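
If you would rather not hand-copy bytes out of a hex editor, the conversion is easy to script. The following is a minimal sketch, not anything from [XTronical]'s library: toCArray is a hypothetical helper that renders raw bytes as a C array definition, 12 values per line, similar in spirit to what xxd -i prints.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Render raw bytes as the text of a C array definition.
// toCArray and the array name passed in are illustrative, not library API.
std::string toCArray(const std::vector<std::uint8_t>& bytes,
                     const std::string& name) {
    assert(!name.empty());
    assert(!bytes.empty());  // a zero-length array is not valid C
    std::string out = "const unsigned char " + name + "[] = {";
    char buf[8];
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        // Start a new indented line every 12 values, otherwise add a space.
        out += (i % 12 == 0) ? "\n  " : " ";
        std::snprintf(buf, sizeof buf, "0x%02x",
                      static_cast<unsigned>(bytes[i]));
        out += buf;
        if (i + 1 < bytes.size()) out += ",";
    }
    out += "\n};\n";
    return out;
}
```

Read the WAV file into a std::vector<std::uint8_t> (std::ifstream in binary mode will do), pass it in with the array name you want, and paste the result into a header.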

If you want to experiment more with waveform generation, [Elliot Williams] did a good piece on that. You might also get some ideas from our signal generator.

24 thoughts on “ESP32, We Have Ways To Make You Talk”

Ostracus says:

February 6, 2018 at 4:12 pm

For some reason evil villains always want to tell the good guy their plans.

1. Olsen says:
  
  February 6, 2018 at 4:58 pm
  
  I would never do that.
  I guess that’s what makes them evil.
  
2. scoldog says:
  
  February 6, 2018 at 6:42 pm
  
  Reminds me of Frozone from the Incredibles
  
  “He’s got me right where he wants me, but what does he start doing? He starts monologuing!”
  
3. Whatnot says:
  
  February 8, 2018 at 1:30 am
  
  These villains go through a lot of effort to organize some evil plot and without appreciation it’ll just feel incomplete.
  The whole world of internet social media is built on the concept, so it’s pretty universal.
  
Olsen says:

February 6, 2018 at 5:45 pm

I don’t see where the wireless benefits of the ESP32 come in handy.
I’m guessing [XTronical] just had one lying about. I could be wrong.

1. Dielectric says:
  
  February 6, 2018 at 5:51 pm
  
  A talking wireless clock so you don’t need Nixie tubes. It’s the future! A reasonable library of numbers read by your favorite person might fit in internal flash, but I don’t know. Maybe you need to branch out to SPIFFS or something.
  
Fred says:

February 6, 2018 at 7:54 pm

Could the ESP32 be used to stream the array file and play it directly? I know this is wrapping it into a library, but that can’t possibly be the best implementation on a wireless platform.

1. lwatcdr says:
  
  February 6, 2018 at 8:14 pm
  
  maybe support for RTP streaming? SIP?
  
2. robotuprising says:
  
  February 6, 2018 at 11:11 pm
  
  The ESP32 has built-in hardware support for I2S, which you can use to output stereo sound directly using the internal DAC (2 channels), among a few other options. And this can all be done using DMA transfers.
  
  Combined with the Wifi-unit you can even use it as a webradio: https://github.com/Edzelf/ESP32-Radio
  
  1. steveway says:
    
    February 7, 2018 at 5:26 am
    
    That project uses an external decoder chip for the mp3 decoding.
    You can even go a step further and do it on the ESP32 itself like in this project:
    https://github.com/kodera2t/ESP32_OLED_webradio
    
3. herbert says:
  
  February 7, 2018 at 3:01 am
  
  use voip to make an acoustic modem setup
  
evilmadscience says:

February 6, 2018 at 8:34 pm

If you are going to use Audacity to “prepare” a WAV file you might as well use it to prepare a raw file and play it with NodeMCU. I built a grandfather clock chime with an 8266-01. It gets its time sync from my router and plays 1/4, 1/2, 3/4, and the full Westminster chime on the quarter, half, three-quarter, and hour, and bongs out the hours just like a grandfather clock.

1. Roger Schaefer says:
  
  February 15, 2018 at 10:53 am
  
  Care to share your code?
  
Mats Engstrom (@matseng) says:

February 6, 2018 at 11:05 pm

There’s no need to convert the audio file into an array to be put in a .h file. The toolchain and makefiles do that automatically. Just put a reference to the file in the “component.mk” like this:

COMPONENT_EMBED_FILES := sound.wav

and then declare an array in the source like this:

extern const uint8_t sound[] asm("_binary_sound_wav_start");

easy-peasy….

1. Roger Schaefer says:
  
  February 15, 2018 at 12:05 pm
  
  Not exactly easy-peasy. An example would be very helpful
  
Redhatter (VK4MSL) says:

February 7, 2018 at 2:28 am

Minor correction about the sample rate… it isn’t a measure of the number of “bytes”, it’s a measure of the number of samples… otherwise it’d be called a “byte rate”. We do sometimes measure quality in terms of data units per second; usually we call that the bit rate. Sample rate is just one variable that decides the bit rate for uncompressed audio.

Bit rate = Sample Rate × bits per sample × channels

So in the case of 8-bit mono audio, you indeed get sample rate = “byte rate”, but that won’t hold true if you, say, use 16-bit samples.

As for minimum sample rate, a big factor will be the highest significant frequency present in the wave form.

https://stuartl.longlandclan.id.au/blog/2015/01/03/a-horn-for-the-bicycle/

The bell sound effect there is one example where going to a 4kHz sample rate really is not sufficient, as the highest frequency component is about 2.2kHz. If you sample that at 4kHz, that component sits 200Hz above the 2kHz Nyquist limit and will alias down to 1.8kHz. Audacity will likely try to filter that component out to avoid the alias, which may completely change the sound you were wanting to reproduce.

It’s wise to look at the spectrum analysis of the waveform of interest before deciding on a sample rate, pick the highest significant frequency, double it, then add some more to give yourself some transition band.
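
The arithmetic above is easy to check in code. This is a minimal sketch with hypothetical function names (bitRate, aliasedFrequency, suggestedSampleRate). It assumes an ideal sampler with no anti-aliasing filter and treats the "add some more" margin as a fraction you pick yourself.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Bit rate of uncompressed PCM audio, in bits per second:
// sample rate x bits per sample x channels.
std::uint64_t bitRate(std::uint32_t sampleRateHz,
                      std::uint32_t bitsPerSample,
                      std::uint32_t channels) {
    return static_cast<std::uint64_t>(sampleRateHz) * bitsPerSample * channels;
}

// Frequency at which a pure tone shows up after sampling with no
// anti-aliasing filter: fold the tone back into the 0..fs/2 band.
double aliasedFrequency(double toneHz, double sampleRateHz) {
    assert(toneHz >= 0.0 && sampleRateHz > 0.0);
    const double folded = std::fmod(toneHz, sampleRateHz);  // 0 <= folded < fs
    return (folded > sampleRateHz / 2.0) ? sampleRateHz - folded : folded;
}

// Double the highest significant frequency, then add a margin
// (marginFraction = 0.25 means 25% extra) for the transition band.
double suggestedSampleRate(double highestHz, double marginFraction) {
    assert(highestHz > 0.0 && marginFraction >= 0.0);
    return 2.0 * highestHz * (1.0 + marginFraction);
}
```

With these, 8-bit mono at 8kHz comes out at 64,000 bits per second, a 2.2kHz tone sampled at 4kHz folds to 1.8kHz, and a 25% margin on a 2.2kHz component suggests sampling at 5.5kHz or more.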

RoboMonkey says:

February 7, 2018 at 4:30 am

Some places where arrays and streaming can be used effectively, true…but good write up either way.

Guess who’s going to implement an audio alarm into the refrigerator sensors in the home automation….HEHEHE The spousal unit will be quite upset. The food will be edible.

1. RoboMonkey says:
  
  February 7, 2018 at 4:31 am
  
  Hey, it took out my evil tags on the HEHEHE….rats.
  
Mazer says:

February 7, 2018 at 5:28 am

Or use a DFPlayer Mini ($5-$10). Play hundreds of MP3 files on a micro-sd card. So easy and cheap. Mono and/or stereo. It’s really small in size too.

1. XTronical says:
  
  August 19, 2018 at 11:25 am
  
  Hopefully, if you look at the latest version and what I use it for, you’ll see it was never designed to be just a simple digital sound player; that was just the start. See the latest videos on this project:
  
  DacAudio V4
  https://youtu.be/fRvavKaWKms
  
  Frogger on ESP32
  https://youtu.be/isvPum2VBW8
  
  It was initially designed to help me produce the sounds for writing games on the ESP32 which would not work with an MP3 player, not even sure if DMA access is an easy option for my needs.
  
Buddy Casino says:

February 7, 2018 at 5:41 am

Made the same project last Friday. Uses a SPIFFS image to play a random RAW file via I2S. It’s going to have a big red button, and it says “no” in various ways.

Whatnot says:

February 7, 2018 at 7:33 am

Sound through a filtered output is a standard module in NodeMCU isn’t it?
I’m not sure why this is portrayed as something needing new code.

SolderFluxxer says:

February 10, 2018 at 12:33 am

A library that not only makes the ESP32 talk, it also makes the Force be With You.

Y04NN says:

February 13, 2018 at 12:28 pm

On Linux, to convert any binary file to a C char array simply do:
xxd -i filename

