You know how when you’re working on a project, other side quests pop up left and right? You can choose to handle them briefly and summarily, or you can dive into them as projects in their own right. Well, Uri Shaked is the author of Wokwi, an online Arduino simulator that allows you to test out your code on emulated hardware. (It’s very, very cool.) Back in the day, Arduino meant AVR, and he put in some awesome effort on reverse engineering that chip in order to emulate it successfully. But “Arduino” means so much more than just AVR these days, so Uri had to tackle the STM32 ARM chips and even the recent RP2040.
Arduino runs on the ESP32, too, so Uri put on his reverse engineering hat (literally) and took aim at that chip as well. But the ESP32 is a ton more complicated than any of these other microcontrollers, being based not only on the slightly niche Xtensa chip, but also having onboard WiFi and its associated binary firmware. Reverse engineering the ESP32’s WiFi is the side-quest that Uri embarks on, totally crushes, and documents for us in this standout Remoticon 2021 talk.
Peeking and Poking
The ESP32 treats the WiFi as a memory-mapped peripheral, like you’re probably used to on microcontrollers. For GPIO pins, for instance, memory mapping means that you can write a 1 or 0 into a particular bit of memory, and it will turn an external LED on or off. Read from that memory location, and you can tell if someone is pushing a button. For WiFi, it’s basically the same thing, only it’s almost completely undocumented where the memory addresses are and what they’re for. Uri’s approach uses a debugger connected over JTAG to the physical hardware, a Ghidra plugin to help him work on the binaries, and his own ESP32 simulator to ferret all of this out.
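To make the memory-mapped idea concrete, here is a minimal Python sketch of how a simulator might route loads and stores to a peripheral. Everything in it is an assumption made for illustration: the `Bus` and `GpioPeripheral` classes, the register offsets, and the base address `0x4000_0000` are hypothetical and are not taken from Wokwi or from the ESP32 datasheet.

```python
class GpioPeripheral:
    """Toy GPIO block with one output register and one input register."""

    OUT_OFFSET = 0x0  # hypothetical: bit 0 drives an LED
    IN_OFFSET = 0x4   # hypothetical: bit 0 reflects a button

    def __init__(self):
        self.led_on = False
        self.button_pressed = False

    def write(self, offset, value):
        # Only the output register reacts to writes; others are ignored.
        if offset == self.OUT_OFFSET:
            self.led_on = bool(value & 1)

    def read(self, offset):
        if offset == self.IN_OFFSET:
            return 1 if self.button_pressed else 0
        if offset == self.OUT_OFFSET:
            return 1 if self.led_on else 0
        return 0


class Bus:
    """Routes 32-bit accesses to whichever peripheral owns the address."""

    def __init__(self):
        self._regions = []  # list of (base, size, peripheral)

    def map(self, base, size, peripheral):
        self._regions.append((base, size, peripheral))

    def _find(self, address):
        for base, size, peripheral in self._regions:
            if base <= address < base + size:
                return peripheral, address - base
        # An unmapped access is the simulator's version of "it crashed here".
        raise LookupError(f"unmapped address 0x{address:08x}")

    def write32(self, address, value):
        peripheral, offset = self._find(address)
        peripheral.write(offset, value & 0xFFFFFFFF)

    def read32(self, address):
        peripheral, offset = self._find(address)
        return peripheral.read(offset)


GPIO_BASE = 0x4000_0000  # hypothetical base address
gpio = GpioPeripheral()
bus = Bus()
bus.map(GPIO_BASE, 0x100, gpio)

# "Firmware" turns the LED on by storing 1 to the output register.
bus.write32(GPIO_BASE + GpioPeripheral.OUT_OFFSET, 1)

# Someone presses the button; the firmware sees it by loading the input register.
gpio.button_pressed = True
button_state = bus.read32(GPIO_BASE + GpioPeripheral.IN_OFFSET)
```

In this toy model, an access to an address that nothing is mapped to raises an error, which is roughly the situation the firmware finds itself in when the simulator has no WiFi hardware behind the addresses it touches.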
First off, he flashed one of the simple ESP-IDF WiFi “hello world” programs into his simulator, turned logging verbosity up to eleven, and ran it until it crashed. Which it did quickly, because his simulator didn’t have any of the WiFi hardware emulated yet. With GDB, the debugger, he could figure out which function in particular crashed. Then he took that function apart.
Straight off the bat, he got lucky. A function, helpfully called hal_mac_deinit(), didn’t seem to do much except write particular values to a fixed memory address, and wait for a particular response. He then programmed his simulator to give that response, which made the program crash a little bit further downstream. Success! What does the memory address in question map to? The datasheet says “Reserved” but it didn’t take too large a leap of faith to assume that it’s some kind of WiFi control register.
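The "write a value, wait for a response, then teach the simulator to give that response" step can be sketched in Python as follows. This is an illustrative model under stated assumptions, not Uri's code: the command value, the "done" bit, and both class and function names are made up, since the article does not say what hal_mac_deinit() actually writes or waits for.

```python
CMD_STOP = 0x2       # hypothetical command value written by the firmware
DONE_BIT = 1 << 31   # hypothetical flag the firmware polls for


class StubRegister:
    """Simulated control register that acknowledges a command when asked to."""

    def __init__(self, respond=True):
        self.respond = respond  # False models a simulator with no stub yet
        self.value = 0
        self.log = []           # record of accesses, handy when reverse engineering

    def write(self, value):
        self.log.append(("write", value))
        self.value = value
        # The stub: pretend the hardware finished the command immediately.
        if self.respond and value & CMD_STOP:
            self.value |= DONE_BIT

    def read(self):
        self.log.append(("read", self.value))
        return self.value


def firmware_deinit(reg, max_polls=1000):
    """Stand-in for the firmware routine: issue a command, then poll for done."""
    reg.write(CMD_STOP)
    for _ in range(max_polls):
        if reg.read() & DONE_BIT:
            return True   # firmware carries on to the next step
    return False          # firmware would hang or bail out here


without_stub = firmware_deinit(StubRegister(respond=False), max_polls=10)
with_stub = firmware_deinit(StubRegister(respond=True))
```

Without the stubbed response the polling loop never sees the flag it is waiting for; with it, the routine returns and the firmware gets a little further, which is the incremental progress described above.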
The rest of the talk has Uri explaining this repeated ping-ponging between a crashed program on his simulator, using Ghidra and GDB to figure out what the crashed code does, and then integrating the desired behavior into his simulator until that bit of code worked. What’s truly amazing is that this ends up with a simulation of how the ESP32’s WiFi works on the inside that’s so good that he can run Python MQTT libraries on the simulated device, and it works exactly as if it were running on the native hardware. Amazing!
This is a great talk, providing a high-level overview of reverse engineering using emulation as a key tool. It’s a great technique, and we’re stoked to have been able to look over Uri’s metaphorical shoulders. Check it out!
Also didn’t like the ALL CAPS headlines. Took me a while to figure out it is not Uniform Resource Identifier.
Yeah, also I don’t really like the “[Name Here] did something” style, but when it’s not consistently applied, things can get confusing.
Maybe somebody can name a vulnerability “URI Shake” or something just to thicken the soup a bit…
Uri rules! My head spins from just watching how fast he works. It was a pleasure to take his Pico class.
Foo to ESP makers for not publishing the hardware docs, though.
Hi! I’m part of the ‘ESP makers’. I kinda agree, but you have to understand: between people being able to abuse the radio (and institutions like the FCC frowning on anything that makes that easier) and the veil of secrecy and paranoia that seems to be always a part of every chip maker (don’t want designs to be stolen, or other mfgs finding out our secret sauce!), it’s kinda hard to convince colleagues that releasing this is a good idea. Speaking for at least my software colleagues: We certainly think that what Uri does is great and given that emulating the WiFi packet interface doesn’t give many pathways to abuse the hardware, we’re open to supporting that effort up to a point.
Thanks for your comment, it’s encouraging to hear that y’all recognize the issue.
I understand the manufacturers’ IP concerns, but in my opinion regulatory issues are generally overstated: the OEMs won’t mess with it because of civil and criminal liability, and individual hobbyists either ethically follow best practices or can bypass the regulations anyway by purchasing grey market gear.
Hopefully we’ll make progress on this! Maybe you could release the docs with the register descriptions, or at least the header files?
Regulatory concerns aren’t about liability, they’re entirely about the effort required in getting the certification in the first place. It costs tens or hundreds of thousands of dollars and hundreds of hours of engineer time, and is the difference between being allowed to sell your product into a vital market or not.
Therefore anything that can potentially prolong or derail that process is extremely hard to justify, even if from a rational perspective it’s not unreasonable – logic and reason rarely matter when it comes to dealing with laws and bureaucracy.
This is amazing and hopefully marks the first steps in truly opening up the software stack. With this work, and the emulator, the top layer and the bottom layer of those nasty proprietary blobs are now defined.
Hooray for this major goalpost. Hopefully 2022 will be the year of the open ESP.
I wonder how much this knowledge will apply to the Pine group’s mission of opening up the bufolou? WiFi chips.
So reverse engineered for his internal use and not publicly documented? Nothing to see here.
That is sadly true. Hopefully he will release some documentation.