We dropped in on [Charlie Miller]'s fuzzing seminar at the end of the day yesterday. Fuzzing has become a fairly popular topic in the last year; it essentially involves feeding a program garbage input in the hope that it will break. If the program can't handle the malformed data and fails in a non-graceful fashion, you may have found a potentially exploitable bug. Fuzzing is a fairly simple idea, but as Charlie points out, without some thought behind it, it's unlikely to be very productive.
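The core loop is simple enough to sketch in a few lines of Python. Everything here is a made-up stand-in, not anything Charlie showed: the target command and temp-file path are hypothetical, and the mutator just randomizes a handful of bytes in a seed file.

```python
import random
import subprocess

def mutate(data: bytes, n_flips: int = 16) -> bytes:
    """Return a copy of the seed input with a handful of bytes randomized."""
    buf = bytearray(data)
    for _ in range(n_flips):
        buf[random.randrange(len(buf))] = random.randrange(256)
    return bytes(buf)

def fuzz_once(target_cmd, seed, tmp_path="/tmp/fuzz_input"):
    """Run the target once on a mutated input; report whether it crashed.

    target_cmd is a hypothetical command list, e.g. ["pdfreader"].
    On POSIX, a negative return code means the process died from a signal
    (e.g. -11 for SIGSEGV) -- the kind of failure worth investigating.
    """
    data = mutate(seed)
    with open(tmp_path, "wb") as f:
        f.write(data)
    proc = subprocess.run(target_cmd + [tmp_path], capture_output=True)
    return proc.returncode < 0, data
```

A real fuzzer would also save every crashing input (and the RNG seed that produced it) so the failure can be reproduced later.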
Say you wanted to fuzz a PDF reader. You take a random good PDF file and use a fuzzing program to iterate through many mutations of that file. This raises the question of how long to fuzz something: is 24 hours enough? Charlie applies the principles of code coverage to determine exactly how much of the code his fuzzing is actually testing. He used the PNG library as an example. Mutating from a single random PNG only exercised a small percentage of the code. Studying the PNG spec, he found that there are 21 different chunk types possible in a PNG file, so he grabbed 1600 random PNGs and mutated off of those instead. This diverse set of seed files gave him far more code coverage because, together, the files feature almost all of the different chunk types. The guiding principle: if you never execute a line of code, you'll never find the bug in that line of code.
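The seed-diversity effect is easy to demonstrate with Python's tracing hooks. The chunk parser below is a toy stand-in for a PNG decoder (the chunk names mirror real PNG chunk types, but the parsing logic is invented for illustration): a single-chunk seed exercises only one branch, while a varied seed set covers them all.

```python
import sys

def parse(chunks):
    """Toy chunk-based parser standing in for a PNG decoder (hypothetical)."""
    for tag, payload in chunks:
        if tag == "IHDR":
            header = payload    # image header branch
        elif tag == "PLTE":
            palette = payload   # palette branch
        elif tag == "IDAT":
            data = payload      # image data branch
        elif tag == "tEXt":
            text = payload      # text metadata branch

def lines_executed(fn, *args):
    """Record which line numbers of parse() run during fn(*args)."""
    hit = set()
    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is parse.__code__:
            hit.add(frame.f_lineno)
        return tracer
    sys.settrace(tracer)
    fn(*args)
    sys.settrace(None)
    return hit

one_seed = [("IHDR", b"")]
many_seeds = [("IHDR", b""), ("PLTE", b""), ("IDAT", b""), ("tEXt", b"")]
# The single seed hits a strict subset of the lines the diverse seeds hit.
assert lines_executed(parse, one_seed) < lines_executed(parse, many_seeds)
```

This is the same measurement a coverage tool makes at scale: the lines your seeds never reach are lines your fuzzer can never find bugs in.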
This isn't just for programs where you have the source; you can also use code coverage tools like PaiMei with IDA Pro to determine where in a binary the specific code you're interested in lives. Then you can write smarter generators that hit more of that particular code.