Reverse engineering embedded device firmware

While not necessarily an easy thing to learn, the ability to reverse engineer embedded device firmware is an incredibly useful skill. Reverse engineering firmware lets you analyze a device for bugs and vulnerabilities, and gives you the opportunity to add features if you happen to be so inclined. When it comes to things such as jailbroken iPhones, Android phones, and Nooks, you can be sure that a close look at the firmware helped to move the process along.

[Craig] works with embedded systems quite frequently and put together a detailed walkthrough demonstrating how he reverse engineers device firmware. The subject of his hacking was a new firmware package he obtained for a Linksys WAG120N Wireless-N router.

His tutorial walks through some of the most common reverse engineering methods and tools, which allow him to slowly unravel the firmware’s secrets. When finished, he had a working copy of the router’s boot loader, kernel, and file system – all ready to be further analyzed. His writeup includes tons of additional details, so be sure to swing by his site if reverse engineering is something you are interested in.
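One common first step when unraveling an image like this is to scan it for the magic numbers of well-known formats. The snippet below is a minimal sketch of that idea, not [Craig]'s actual tooling: `scan_signatures` and the `SIGNATURES` table are names made up for illustration, and the blob is synthetic rather than real router firmware.

```python
# Magic numbers for a few formats commonly found in Linux-based firmware.
# The list is illustrative, not exhaustive.
SIGNATURES = {
    b"\x1f\x8b\x08": "gzip stream (deflate)",
    b"hsqs": "SquashFS, little-endian",
    b"sqsh": "SquashFS, big-endian",
    b"\x27\x05\x19\x56": "U-Boot uImage header",
}


def scan_signatures(data):
    """Return a sorted list of (offset, description) for every magic match."""
    hits = []
    for magic, name in SIGNATURES.items():
        start = 0
        while True:
            idx = data.find(magic, start)
            if idx == -1:
                break
            hits.append((idx, name))
            start = idx + 1
    return sorted(hits)


# Synthetic stand-in for a firmware image (not real firmware data).
blob = b"\x00" * 64 + b"\x27\x05\x19\x56" + b"\xff" * 60 + b"hsqs" + b"\x00" * 32
for offset, name in scan_signatures(blob):
    print(f"0x{offset:08X}  {name}")
```

Short signatures will match by chance in a large image, so each hit is only a candidate. It still has to be confirmed by parsing the header that follows or by trying to extract from that offset.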

17 thoughts on “Reverse engineering embedded device firmware”

  1. Quite obvious for people used to working with embedded devices. However, each person has their own tricks, so we can always learn something ;)

    Next step: how to extract the Linux kernel configuration and build a custom kernel for the device, and where to find possibly vulnerable executables and scripts in the file system (init.d, udev scripts, the wwwroot web-based configuration). :)
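For the kernel configuration part, here is a rough sketch of one approach. It assumes the kernel was built with CONFIG_IKCONFIG, in which case the configuration is stored gzip-compressed right after an `IKCFG_ST` marker, and it assumes the kernel image has already been decompressed. `extract_ikconfig` is a hypothetical helper name and the image is synthetic.

```python
import gzip
import zlib

MARKER = b"IKCFG_ST"  # start marker used by kernels built with CONFIG_IKCONFIG


def extract_ikconfig(image):
    """Return the embedded kernel config as text, or None if no marker is found."""
    idx = image.find(MARKER)
    if idx == -1:
        return None
    # wbits=31 tells zlib to expect a gzip header and trailer. A decompress
    # object stops at the end of the stream and ignores whatever follows it.
    decomp = zlib.decompressobj(wbits=31)
    raw = decomp.decompress(image[idx + len(MARKER):])
    return raw.decode("ascii", errors="replace")


# Synthetic stand-in for a decompressed kernel image.
image = (
    b"\x00" * 32
    + MARKER
    + gzip.compress(b"CONFIG_MIPS=y\nCONFIG_SQUASHFS=y\n")
    + b"IKCFG_ED"
    + b"\x00" * 32
)
print(extract_ikconfig(image))
```

If the marker is missing, the kernel was most likely built without the option, or the image still needs to be decompressed first.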

  2. When I heard “reverse engineering” and “firmware”, I was expecting something more interesting than just extracting the Linux file system from a firmware image.

    “Interesting” to me would include actual reverse engineering of machine code or discovering significant undocumented system design details or features.

    1. I agree. The article is informative to be sure – to me, (like anyone G.A.F.F.) hacking implies having the facility to modify the operation of the device in a way that is meaningful to the hacker. Without disassembling the code section, understanding the I/O map of the embedded device architecture, and knowing something useful about stored initialization variables, constants, interrupt vector usage, etc… it’s going to be tricky for a hacker to know how to alter this particular embedded router device. This router, being a Linux-based system, gives up its OS in the article, which was interesting to read. As for the functional operation and execution of the object code, … ???

  3. In realistic scenarios you usually end up reversing off dumps you get from shellcode injection or glitching, and these days the firmware is usually encrypted, so you have to get a complete structure (like ELF) and send it through a decryption routine or hardware oracle.

    Enabled OCD is something you only find on small toys and stuff nobody cared about.

  4. Writing a disassembler is much easier than writing an assembler or compiler. For a disassembler there’s nothing “high level” to unravel or parse. The only problems I run into are embedded jump tables, which my disassembler attempts to disassemble. I have an inline flag that jumps around that. Occasionally you have to make several passes to find all the tables, but once you have, the assembly language code just snaps into place.

    Another trick I used was to identify the compiler used to generate the code I’m disassembling. In one case the manufacturer made the compiler available online for free. That was awesome. I could now see exactly what machine language code was spit out for various common C language statements. For example, strncpy. That allowed me to substitute C statements in place of multiple assembly language statements, which made things a lot more readable.

    A third strategy I employ early on is to find the various data areas. This might be embedded tables, or bitmaps for fonts or images and icons. In most cases you have a pretty good idea of what you’re looking for, since you’ve likely seen the fonts, or icons, or images used in the application when it’s working. Often it only takes a small amount of experimentation to determine the row and column numbers, and then you can disassemble these icons and images as well. From that, you can examine your disassembled code for references to the icons and images and fonts, and that gives you a lot of insight into what that code does. Often there is a structure that precedes the data, and unraveling that is now much easier once you know what it’s referring to.

    Also, for an embedded micro, which has well defined I/O ports for various functions (serial, I2C, timers, etc) it’s helpful to generate symbolic references to these. That way your disassembled code can refer to loading a baud rate counter, as opposed to just some anonymous value into some anonymous register. Then, once again, working backward, you can gain insight based on what other routines might be calling those I/O related routines that you’ve disassembled.

    So I guess a common theme here is that disassembly is an iterative process, and in many cases once you find a clue, you work *backward* from that point as well as forward.
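The row-and-column experimentation described for font and icon data can be sketched roughly as follows. This assumes 1 bit per pixel, most significant bit first, and no padding between rows, none of which is guaranteed for a given device. `render_1bpp` is a made-up helper name and the glyph bytes are invented for the demo.

```python
def render_1bpp(data, bytes_per_row, on="#", off="."):
    """Render raw bytes as rows of 1-bit pixels, MSB first.

    `on` and `off` must be single characters. Trailing bytes that do not
    fill a complete row are ignored.
    """
    if bytes_per_row <= 0:
        raise ValueError("bytes_per_row must be positive")
    table = str.maketrans("01", off + on)
    rows = []
    for i in range(0, len(data) - bytes_per_row + 1, bytes_per_row):
        bits = "".join(f"{b:08b}" for b in data[i:i + bytes_per_row])
        rows.append(bits.translate(table))
    return rows


# Invented 8x8 glyph, standing in for bytes pulled out of a data area.
glyph = bytes([0x18, 0x24, 0x42, 0x7E, 0x42, 0x42, 0x42, 0x00])

# Try a couple of candidate widths; the right one makes the shape appear.
for width in (1, 2):
    print(f"bytes per row = {width}")
    print("\n".join(render_1bpp(glyph, width)))
    print()
```

At the wrong width the output looks like noise or a sheared pattern. At the right one the glyph becomes recognizable, which also reveals the row and column counts.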

  5. One more thing…

    Part of that iterative process is to give meaningful names to routines and functions that you’ve deciphered. Then, once you’ve run your disassembler again, your code now has symbolic names, which makes the code easier to understand. The larger the program you’re disassembling, the more important this is.

    Just another part of the iterative process.
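A minimal sketch of that naming pass, covering both deciphered routines and I/O registers, might look like this. The addresses, the names in `SYMBOLS`, and the listing format are all invented for illustration and do not correspond to any real chip or disassembler.

```python
import re

# Hypothetical symbol table, built up by hand as routines and registers
# are identified.
SYMBOLS = {
    0x4000C000: "UART0_BAUD_DIV",
    0x00001234: "init_serial",
}

HEX_LITERAL = re.compile(r"0x[0-9A-Fa-f]+")


def annotate(listing, symbols):
    """Replace every hex literal that matches a known address with its name."""
    def substitute(match):
        address = int(match.group(0), 16)
        return symbols.get(address, match.group(0))
    return HEX_LITERAL.sub(substitute, listing)


listing = "jal 0x1234\nsw $t0, 0x4000C000\naddiu $t1, $t1, 0x10"
print(annotate(listing, SYMBOLS))
```

Each time another routine is understood, it gets added to the table and the pass is run again. A plain text substitution like this will also rename a constant that merely happens to equal a known address, so a real tool would want to distinguish address operands from immediates.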

  6. There used to be an awesome program I used to extract graphics from files called byteraper. Basically, it was just a hex editor that displayed in RGB. You had a little window that you defined the size of data to look at and some settings for the encoding method. This was back in the DOS days, and I haven’t been able to get my copy working in Windows. Been looking for a similar tool ever since.
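The basic idea of viewing raw bytes as RGB pixels at a chosen width can be approximated in a few lines. This is only a sketch under assumptions, not a reimplementation of that program: it treats the data as packed 8-bit R, G, B triples and wraps it in a binary PPM header that most image viewers can open. `raw_to_ppm` is a hypothetical name.

```python
def raw_to_ppm(data, width):
    """Wrap raw bytes in a binary PPM (P6) header, 3 bytes per pixel.

    Bytes that do not fill a complete row are dropped.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    row_bytes = width * 3
    height = len(data) // row_bytes
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    return header + data[:height * row_bytes]


# Synthetic input; with a real file this would be a slice of its contents.
sample = bytes(range(25))
ppm = raw_to_ppm(sample, width=2)
print(ppm[:11])
```

Writing the result to a `.ppm` file and stepping through different widths and start offsets gives a crude version of the sliding window the comment describes.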
