Steganography involves hiding data in something else — for example, encoding data in a picture. [David Buchanan] used polyglot files not to hide data, but to send a large amount of data in a single Twitter post. We don’t think it quite qualifies as steganography because the image has a giant red UNZIP ME printed across it. But without it, you might not think to run a JPG image through your unzip program. If you did, though, you’d wind up with a bunch of RAR files that you could unrar and get the complete works of the Immortal Bard in a single Tweet. You can also find the source code — where else — on Twitter as another image.
What’s a polyglot file? It is a file that is valid in more than one format at once. JPEG images have an ICC (International Color Consortium) section that defines color profiles. While Twitter strips a lot of things out of images, it doesn’t take out the ICC section. That section is stored in chunks of up to 64 kB each and, up to a limit of about 16 MB total, it can contain almost anything.
The ZIP format is also very flexible. The pointer to the central directory is at the end of the file. Since that pointer can point anywhere, it is trivial to create a zip file with extraneous data just about anywhere in the file.
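To make that concrete, here is a minimal Python sketch (not [David Buchanan]'s code) that glues some non-ZIP bytes in front of a small archive and then reads it back. The helper names `make_zip` and `read_eocd` and the junk prefix are made up for illustration. It relies on the fact that Python's `zipfile` module locates the end-of-central-directory record from the end of the file and tolerates data prepended to the archive.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"  # marks the end-of-central-directory record


def make_zip(files):
    """Build a small ZIP archive in memory from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def read_eocd(blob):
    """Find the last EOCD record and return (position, cd_size, cd_offset)."""
    pos = blob.rfind(EOCD_SIGNATURE)
    if pos < 0:
        raise ValueError("no end-of-central-directory record found")
    # Fixed 22-byte layout: signature, disk numbers, entry counts,
    # central directory size, central directory offset, comment length.
    fields = struct.unpack("<4sHHHHIIH", blob[pos:pos + 22])
    cd_size, cd_offset = fields[5], fields[6]
    return pos, cd_size, cd_offset


# Hypothetical stand-in for image data: anything that is not a ZIP.
junk = b"\xff\xd8" + b"pretend this is image data " * 100
polyglot = junk + make_zip({"hello.txt": b"hello"})

# zipfile starts from the end of the file, so the prefix is ignored.
with zipfile.ZipFile(io.BytesIO(polyglot)) as zf:
    recovered = zf.read("hello.txt")

pos, cd_size, cd_offset = read_eocd(polyglot)
# The gap between where the directory is and where the record says it
# should be equals the number of extra bytes in front of the archive.
extra_bytes = pos - cd_size - cd_offset
```

In this sketch the offsets stored in the archive are left untouched and the reader compensates for the prefix. A stricter unzip program may want the offsets rewritten to match the final file.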
So the scheme is to break up the payload into 64 kB or smaller chunks using rar. Then replace the image’s ICC section with a zip of the rar files and point to the zip directory at the end of the file. A program reading image data will just see a garbage ICC section and some extra bytes at the end. A zip program will find the zip file and extract the rar files.
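As a rough illustration of why the chunks have to stay under 64 kB, the sketch below wraps an arbitrary payload in JPEG APP2 segments the way the ICC specification describes for embedded profiles: a marker, a two-byte length, the `ICC_PROFILE` identifier, a chunk number and a chunk count. The function name `icc_segments` is an assumption for this example, and this is not the original tool; it only shows the segment framing, not the zip and rar layout.

```python
import struct

APP2_MARKER = b"\xff\xe2"          # JPEG APP2 marker, used for ICC profiles
ICC_TAG = b"ICC_PROFILE\x00"       # 12-byte identifier, NUL terminated
# The 2-byte length field counts itself, the tag, and the two counter bytes.
MAX_CHUNK = 65535 - 2 - len(ICC_TAG) - 2


def icc_segments(payload):
    """Split payload into APP2 segments framed like ICC profile chunks."""
    chunks = [payload[i:i + MAX_CHUNK]
              for i in range(0, len(payload), MAX_CHUNK)]
    if len(chunks) > 255:
        # Chunk number and count are one byte each, hence the ~16 MB cap.
        raise ValueError("payload needs more than 255 chunks")
    segments = []
    for seq, chunk in enumerate(chunks, start=1):
        length = 2 + len(ICC_TAG) + 2 + len(chunk)
        header = (APP2_MARKER + struct.pack(">H", length)
                  + ICC_TAG + bytes([seq, len(chunks)]))
        segments.append(header + chunk)
    return segments


# Example: a 150,000-byte payload does not fit in one or two segments.
payload = b"A" * 150000
segments = icc_segments(payload)
```

Each segment carries 18 bytes of framing, so at most 65,519 bytes of payload fit in one, and 255 of them give the roughly 16 MB ceiling.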
Interestingly, even creating a thumbnail usually keeps the color profile data, so a Twitter thumbnail will still retain the payload. Of course, any service that strips out the ICC data will break the method. Some do and some don’t, and — of course — anyone could start removing the data at any time.
Most image-based steganography we’ve seen makes subtle changes to the image’s colors. We’ve also heard Morse code hidden in audio.
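For comparison, the color-tweaking approach usually means hiding bits in the least significant bit of each pixel value. The sketch below is a generic illustration working on raw bytes, not taken from any particular tool; the names `embed_lsb` and `extract_lsb` are invented here, and a real implementation would also need to decode and re-encode an actual image format losslessly.

```python
def embed_lsb(pixels, message):
    """Hide message bits in the lowest bit of successive pixel bytes."""
    bits = [(byte >> shift) & 1
            for byte in message
            for shift in range(7, -1, -1)]   # most significant bit first
    if len(bits) > len(pixels):
        raise ValueError("message too long for this cover data")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit       # overwrite only the lowest bit
    return bytes(out)


def extract_lsb(pixels, n_bytes):
    """Read n_bytes back out of the lowest bits of the pixel bytes."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (pixels[b * 8 + i] & 1)
        out.append(value)
    return bytes(out)


# Hypothetical cover data standing in for decoded pixel values.
cover = bytes(range(256))
stego = embed_lsb(cover, b"hi")
hidden = extract_lsb(stego, 2)
```

No pixel value changes by more than one, which is why the result is hard to spot by eye, and also why it does not survive lossy recompression the way the ICC trick can.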
10 thoughts on “Shakespeare In A Zip In A RAR, Hidden In An Image On Twitter”
This is old news, like 20 years old news, and it’s been used by lots of people to smuggle CP and other material on imageboards and other web forums that allow posting images.
Last year I gave a discount coupon as a birthday gift to my brother: zipped a Word document containing a video of Google TTS speaking the URL of a bit.ly link pointing to a pastebin page with the coupon code, and sent the coupon site by SMS.
I guess this year he will get a JPG…
You gave him a discount code for his birthday? And I thought I was cheap.
Not a discount code, it was a gift card. He got a PS4 game if I am not mistaken…
This technique was popular on imageboards a few years ago. The STL files for the Liberator pistol would get compressed and bundled into a PNG. All the instructions to get the files were written in the image.
Wouldn’t this tend to give garbage results on any web browser that actually uses ICC profiles? Or is some basic curves-and-primaries profile for the image squeezed into the start of the area?
It is just an extension, not a requirement of the spec, and is predicated on the more generalized tagging that is in the JPEG spec. Ignoring garbage is just about mandatory to cope with all the images that don’t include a profile, include older or vendor-specific versions, etc.
http://www.color.org/specification/ICC1v43_2010-12.pdf Appendix A, Page 85 for embedding ICC profiles in jpegs.
https://www.w3.org/Graphics/JPEG/itu-t81.pdf ITU’s publication of the JPEG spec. APP tags are the relevant bits.
When the Crowleyan Satanic way tattooed dude lived with me like 10 years back… he showed me a website with the four horsemen in skulls look that was just a bunch of books stored as, I think, JPEGs that you could download, rename to PDF, and read. He totally stressed two books, something to do with “Executive Double Talk” and “Cliche’s”. I guess that was his Iraqi Austrian trying to act American perspective on the U.S. to be able to do what you want. Learn the clichés and executive double talk and everyone will believe you or at least play along. Cartel strange.
My fantasy console emulator uses the same trick for its cartridge format. It searches for a central directory signature at the end of the file (like a ZIP). Image encoding can be JPG or PNG for the (fake) cartridge art or screenshots. I didn’t invent it, I stole the idea from some other retroprogramming forum many years ago (the nesdev forums, I think).
What do you get when you combine steganography with the Singularity?
I’ll be back in a GIF.