The 1337 PNG Hashquine

A hashquine is a fun way to show off your crypto tricks: it’s a file that contains its own hash. In some file types it’s trivial: you just pick the hash to hit, and then put random data in a comment or other invisible field until you get a collision. A Python script that prints its own hash would be easy. But not every file type is so easy. Take PNG, for instance. These files are split into chunks of data, each chunk carries a CRC-32, and the compressed image data inside is additionally protected by an Adler-32 checksum. Make one change, and everything changes, in three places at once. Good luck finding that collision. So how exactly did [David Buchanan] generate that beautiful PNG, which does in fact md5sum to the value in the image? Very cleverly.
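To get a feel for the "three places at once" problem, here is a minimal Python sketch of how a PNG chunk is put together. The helper names `png_chunk` and `idat_for` are made up for illustration, and the four-byte "image" is a toy scanline rather than anything from [David]’s file.

```python
import struct
import zlib

def png_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)

def idat_for(raw_scanlines: bytes) -> bytes:
    """Wrap raw scanline bytes in a zlib stream, then in an IDAT chunk."""
    # The zlib stream ends with a big-endian Adler-32 of the uncompressed data.
    compressed = zlib.compress(raw_scanlines)
    return png_chunk(b"IDAT", compressed)

# Toy data: one filter byte followed by three grayscale pixels.
raw_a = bytes([0, 10, 20, 30])
raw_b = bytes([0, 10, 20, 31])  # a single pixel nudged by one

chunk_a = idat_for(raw_a)
chunk_b = idat_for(raw_b)

print("Adler-32:", hex(zlib.adler32(raw_a)), "vs", hex(zlib.adler32(raw_b)))
print("chunk CRC-32:", chunk_a[-4:].hex(), "vs", chunk_b[-4:].hex())
```

Nudging one pixel changes the Adler-32 inside the compressed stream, which changes the chunk data, which changes the chunk’s CRC-32, and of course the MD5 of the whole file.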

[Image: md5sum hashquine.png]

Thankfully [David] shared some of his tricks, and they’re pretty neat. The technique he details is a meet-in-the-middle hack, where 36 pairs of MD5 collision blocks are found, with the understanding that these 36 blocks will get added to the file. For each block, either A or B of the pair gets plugged in at that location, and the md5sum won’t change. That makes a total of 2^36 possible combinations of these blocks, which is more computation than was practical to brute-force for this particular hack. The solution is to pre-compute the CRC for every possible combination of the first 18 blocks, and store the results in a lookup table. The remaining 18 blocks are then run backwards from a target CRC value, and each result is checked against the lookup table. Find a hit, and you have found a series of blocks that matches both your target md5sum and your target CRC-32.
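The meet-in-the-middle idea can be sketched in a few lines of Python. This is a toy under loud assumptions, not [David]’s actual code: the "pairs" here are random 8-byte stand-ins rather than real MD5 collision blocks, there are only 16 of them instead of 36, and the target CRC is generated from a known secret choice so that a solution is guaranteed to exist. The names `crc_unwind`, `apply_choice`, and `meet_in_the_middle` are hypothetical.

```python
import itertools
import random
import zlib

def _make_table():
    """Standard reflected CRC-32 table (polynomial 0xEDB88320)."""
    table = []
    for i in range(256):
        c = i
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table

TABLE = _make_table()
# The top byte of each table entry is unique, which lets a CRC step be undone.
TOP_BYTE_TO_INDEX = {entry >> 24: i for i, entry in enumerate(TABLE)}

def crc_unwind(crc_after: int, block: bytes) -> int:
    """Inverse of zlib.crc32(block, crc_before): recover crc_before."""
    reg = crc_after ^ 0xFFFFFFFF
    for byte in reversed(block):
        idx = TOP_BYTE_TO_INDEX[reg >> 24]
        reg = (((reg ^ TABLE[idx]) << 8) & 0xFFFFFFFF) | (idx ^ byte)
    return reg ^ 0xFFFFFFFF

def apply_choice(pairs, choice, crc):
    """Run the CRC forward over the chosen member (0 = A, 1 = B) of each pair."""
    for pair, bit in zip(pairs, choice):
        crc = zlib.crc32(pair[bit], crc)
    return crc

def meet_in_the_middle(pairs, start_crc, target_crc):
    half = len(pairs) // 2
    front, back = pairs[:half], pairs[half:]
    # Forward pass: table of CRC after every A/B choice for the first half.
    forward = {}
    for choice in itertools.product((0, 1), repeat=half):
        forward[apply_choice(front, choice, start_crc)] = choice
    # Backward pass: unwind the target through every choice for the second half.
    for choice in itertools.product((0, 1), repeat=len(back)):
        crc = target_crc
        for pair, bit in zip(reversed(back), reversed(choice)):
            crc = crc_unwind(crc, pair[bit])
        if crc in forward:
            return forward[crc] + choice
    return None

rng = random.Random(1337)
def rand_block():
    return bytes(rng.getrandbits(8) for _ in range(8))

pairs = [(rand_block(), rand_block()) for _ in range(16)]  # stand-ins, not MD5 collisions
start_crc = zlib.crc32(b"everything before the blocks")
secret = tuple(rng.getrandbits(1) for _ in range(16))
target_crc = apply_choice(pairs, secret, start_crc)

found = meet_in_the_middle(pairs, start_crc, target_crc)
print("A/B choices hitting the target CRC:", found)
```

With 16 pairs, each pass tries only 2^8 combinations instead of 2^16 for a full brute force. Scaled up to 36 pairs, that is two passes of 2^18 instead of one of 2^36, at the cost of holding the lookup table in memory.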

Thanks to [Julian] for the tip! And as he described it, this hack is one that gets more impressive the more you think about it. Enjoy!

7 thoughts on “The 1337 PNG Hashquine”

    1. I disagree. As the article shows, MD5 still has a use as a basic CRC. 99.9% of users don’t do these sorts of silly things with their files, and 99.9% of applications wouldn’t even care if they did. Bottom line: it’s still a much better way to compare files than comparing the actual content of the files :)

    2. MD5 is insecure so you shouldn’t use it for security, but it has plenty of uses as a strong checksum or non-cryptographic hash. It’s like how Master padlocks are insecure, but are still useful to hold things shut and keep the passively curious out.

      I had a problem recently where testers were reporting issues with the latest build of a program that I was 100% confident had been fixed, because I couldn’t reproduce the issue. After a few of these incidents I added a line to calculate the MD5 of the current executable and print out the last 8 characters, and I started requiring the testers to provide that code along with their bug report. Sure enough, some of the testers were using old versions of the program because they were trying to overwrite the binaries and the overwrite would fail for one reason or another.

      I use MD5 often in microcontroller applications, too. Imagine doing SHA-256 on an 8-bit MCU.

  1. a fun thought experiment: if we can say in essence that this gorgeous ring is the visualization of md5, I wonder how higher forms of encryption would “look” taking this as a model? would it iterate with layers of similar rings stacked on one another? or form a 3d sphere with spikes (à la a cell or virus), or form shells like an electron cloud model?

    maybe the visualization of complex digital stuff amazes only me, fun write-up! :)
